[Hcoop-discuss] Re: [Hcoop-announce] /var space problem fixed

ntk at hcoop.net ntk at hcoop.net
Thu Dec 15 08:01:15 EST 2005


> On Wed, 2005-12-14 at 19:28 -0800, Adam Chlipala wrote:
>> Debian's apt package cache was using 60% of available space on /var,
>> which maxed out that partition's space and broke a number of things.
>> I'm not sure when it first became full, but, among other things, it
>> looks like mail was being dropped silently during that time.  It's fixed
>> now.
>
> It doesn't appear that HCoop has a backup MX, is this true? Wasn't the
> plan to use Abulafia as the backup server?

I think we should have a cron script or daemon running on Abu that emails
admins whenever certain bad things happen--low/no disk space, critical
daemons dying, etc.  Then we should have another daemon running on Abu
that pings fyodor every now and then and emails admins if fyodor goes
down, stops serving web pages, etc.  This would not be hard to write; I
wrote similar Perl scripts for a cluster at my last job, and I could do
this over break.  It's bad when emails are dropped.  Perhaps at some point,
once we get another server or two, we should move mail services off of the
login box and have SMTP/POP/IMAP running on that server, possibly with the
mail spools mounted onto the login box(es) over the network.
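
To make the idea concrete, here is a minimal sketch of what such a check
could look like. It is in Python rather than Perl, and everything specific
in it is an assumption for illustration only: the 10% free-space threshold,
the example.org URL and addresses, and the presence of a local MTA on
localhost:25. It is not an actual HCoop script or configuration.

```python
import shutil
import smtplib
import urllib.request
from email.message import EmailMessage

# All values below are illustrative placeholders, not real HCoop settings.
DISK_PATHS = ["/var"]
MIN_FREE_FRACTION = 0.10
CHECK_URL = "http://fyodor.example.org/"
ADMIN_ADDR = "admins@example.org"
FROM_ADDR = "monitor@example.org"


def disk_problems(paths, min_free_fraction, usage=shutil.disk_usage):
    """Return one warning per path whose free space is below the threshold."""
    problems = []
    for path in paths:
        try:
            total, _used, free = usage(path)
        except OSError as exc:
            problems.append(f"{path}: cannot check disk usage ({exc})")
            continue
        if total > 0 and free / total < min_free_fraction:
            problems.append(f"{path}: only {free / total:.0%} free")
    return problems


def web_problems(url, timeout=10, opener=urllib.request.urlopen):
    """Return a warning if the URL cannot be fetched with HTTP status 200."""
    try:
        with opener(url, timeout=timeout) as resp:
            if resp.status != 200:
                return [f"{url}: HTTP status {resp.status}"]
    except OSError as exc:  # URLError and socket timeouts are OSError subclasses
        return [f"{url}: unreachable ({exc})"]
    return []


def build_alert(problems, to_addr, from_addr):
    """Build a plain-text email listing every detected problem."""
    msg = EmailMessage()
    msg["Subject"] = f"[monitor] {len(problems)} problem(s) detected"
    msg["From"] = from_addr
    msg["To"] = to_addr
    msg.set_content("\n".join(problems))
    return msg


def main():
    """Run all checks once and mail the admins only if something is wrong."""
    problems = disk_problems(DISK_PATHS, MIN_FREE_FRACTION)
    problems += web_problems(CHECK_URL)
    if problems:
        # Assumes an MTA is accepting mail on localhost:25.
        with smtplib.SMTP("localhost") as smtp:
            smtp.send_message(build_alert(problems, ADMIN_ADDR, FROM_ADDR))
```

Something like this would be run periodically from cron by calling main().
Note that the alert mail has to leave through a machine whose own mail
system still works, which is one reason to run the check from a second box
rather than from the one being watched.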

-ntk




