[UCIMC-Tech] Re: The Bike Project Web Down again

Joshua King josh at ucimc.org
Thu Jun 11 13:52:16 CDT 2009


Hi Barry,

The problem with the site was that we had a power flicker sometime last
evening, and the webserver doesn't start back up correctly on its own. It
wants you to hit F1 on boot, and because it is a very old Compaq ProLiant it
doesn't have BIOS settings to change that. I'm working on tracking down and
modifying the old floppy image you need to tweak those boot settings.

There is already a monitoring platform (Zabbix) set up on Chambana.net; it's
just that a lot of things (servers, IPs, etc.) have changed since it was first
installed. I just need to update and reconfigure it. Once I do that I'll
probably be looking for people to be on the notification list, if you're
interested.
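
Until the Zabbix configuration is updated, an outage window can be estimated
after the fact from the access logs, which is how the roughly 20-hour gap
quoted below was found. A minimal sketch in Python, assuming Apache-style
timestamps and an English locale for the month names; the function name
find_gaps, the one-hour threshold, and the two sample entries marked
hypothetical are made up for illustration:

```python
from datetime import datetime, timedelta

# Apache-style access-log timestamp, e.g. 10/Jun/2009:16:01:54
LOG_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S"


def find_gaps(timestamps, threshold=timedelta(hours=1)):
    """Return (start, end, duration) for every gap longer than threshold."""
    times = sorted(datetime.strptime(t, LOG_TIME_FORMAT) for t in timestamps)
    gaps = []
    # Compare each request time with the one that follows it.
    for earlier, later in zip(times, times[1:]):
        if later - earlier > threshold:
            gaps.append((earlier, later, later - earlier))
    return gaps


sample = [
    "10/Jun/2009:15:58:10",  # hypothetical hit before the outage
    "10/Jun/2009:16:01:54",  # last hit before the gap (from the thread)
    "11/Jun/2009:12:19:33",  # first hit after the gap (from the thread)
    "11/Jun/2009:12:21:02",  # hypothetical hit after the outage
]

for start, end, length in find_gaps(sample):
    print(start, "->", end, "gap of", length)
```

This only shows when requests stopped arriving, not why, so it complements
real monitoring rather than replacing it.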

Unfortunately, almost all of the things you mention exceed the tech budget
for the IMC. We're still pursuing switching network providers if we can do
it in a cost-effective manner. We're already in Comcast's business service
class, and can't afford more from them (and they don't offer anything higher
in a standard package). Co-location tends to be expensive. Here is where I am
with offsite failover:

Our offsite virtual server is low on space and extremely overloaded. Our
offsite dedicated server (hosted for free) was until recently on too narrow a
pipe to be of any use, but it just got moved to a fatter connection, so now we
can start using it for hosting.
I am in the process of upgrading it and setting up virtualized jails, so
that we can provide service failover offsite. This might not be
instantaneous and fully automatic in the case of some sites (circular
master-master offsite MySQL replication is _extremely_ problematic) but will
provide us with more options.

Besides that, I'm working on creating more failover options within the IMC
itself, which will partly be facilitated by the new gigabit switches that I'll
be installing in our server network this weekend. These will let us do
synchronous database queries and serve web files over NFS between servers
without bogging down the building's network, thus improving reliability.

Since this was a rather lengthy explanation, I hope you don't mind me CCing
it over to IMC-Tech for documentation purposes.

On Thu, Jun 11, 2009 at 1:14 PM, Barry Isralewitz <barryi at ks.uiuc.edu> wrote:

> Hi,
>
>
>   The site is back up now. Thanks, Josh!  (And thanks for mailing on
> this Todd.)
>
>   Short comments: 1) How about we install Argus or other monitoring
> software for chambana.net? I'll have time to help with setup after
> July 10. 2) What do you all think is the best overall strategy to improve
> site access reliability? Including but not limited to: change network
> provider, upgrade service class, move to server co-location, virtualize
> servers, implement cheap multi-site failover.
>
>
>  Details:
>  Judging only from thebikeproject.org access logs,
> there's a roughly 20-hour gap, between 10/Jun/2009:16:01:54 and
> 11/Jun/2009:12:19:33. Could have been less time, but not much less --
> thebikeproject.org gets hit many times per hour 'round the clock, thanks
> to crawler bots and insomniacs.
>
>  If this is Comcast's fault, we gotta complain and get something
> changed, or somehow switch services.  A services monitor can help
> clarify the problem -- I think chambana.net does not have one right now.
> I'm willing to help install Argus for chambana.net after July 10 --
> but would hand monitoring off to others / a shared list once it is at an
> acceptable level of alert issuing.  Would be good to have auto-detection
> and notification of problems as they arise, and to have good downtime
> records for the various services.
>
>
>                          Cheers,
>
>                          Barry
>
>
>
> On Thu, Jun 11, 2009 at 09:51:51AM -0500, Joshua King wrote:
> > Hi Todd,
> >
> > I'll be at the IMC this afternoon and can take a look.
> >
> > On Thu, Jun 11, 2009 at 8:34 AM, Spinner, David Todd
> > <spinner at illinois.edu>wrote:
> >
> > > >  Just to let everyone know, the Internet is working at TBP but it seems
> > > > the Bike Project Website is down again.  We couldn't use it last night
> > > > during open hours and I can't access it from work this morning.
> > >
> > >
> > >
> > > Todd
> > >
> > >
> > >
> > > D. Todd Spinner
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> >
> > --
> > Josh King
> > --
> > "I am an Anarchist not because I believe Anarchism is the final goal,
> > but because there is no such thing as a final goal." -Rudolf Rocker
>
> --
> Barry Isralewitz, Ph. D.
> Theoretical and Computational Biophysics Group
> 3043 Beckman, University of Illinois at Urbana-Champaign
> Office Phone: (217) 244-1612    Home Phone: (217) 337-6364
> email: barryi at ks.uiuc.edu   http://www.ks.uiuc.edu/~barryi
>



-- 
Josh King
--
"I am an Anarchist not because I believe Anarchism is the final goal,
but because there is no such thing as a final goal." -Rudolf Rocker