[Hcoop-discuss] Next hardware configuration/Email service
Adam Chlipala
adamc at hcoop.net
Wed Feb 1 10:25:17 EST 2006
Justin S. Leitgeb wrote:
>What benefit would we get from rsync'ing the IMAP files to the primary
>fileserver? Wouldn't we just want users to access them through the IMAP
>daemon? I think that for properly backing up the IMAP server we will
>have to stop it momentarily anyway, or pull some other kind of trick in
>order to get a decent snapshot. There are some threads I have browsed
>on the internet regarding this but I've never run an IMAP server myself
>so I can't say for sure.
>
>
Maybe we have different ideas of what a "proper" backup is. I think
it's OK for any IMAP files that are being modified to be lost or
corrupted in the backup images, assuming each message gets its own
file. (Most messages are only modified at receipt time, so we'll catch
them in the next backup.) We are currently backing up all mailboxes
on-disk with rsnapshot (a script based on rsync), and I'm not aware of
any IMAP-specific correctness problems with the scheme. Are you just
worried about performance?
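To illustrate the one-file-per-message argument, here is a minimal sketch of an incremental copy pass. It is not how rsnapshot or rsync work internally; the function name backup_mail_tree and the size-plus-mtime comparison are assumptions made for the example. The point is only that a message file that is missed or still changing during one pass gets picked up by the next one.

```python
import os
import shutil


def backup_mail_tree(src_root, dst_root):
    """Copy files under src_root whose size or mtime differs from the backup copy.

    Hypothetical helper for illustration only. Returns the relative paths
    copied in this pass.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            try:
                src_stat = os.stat(src)
            except FileNotFoundError:
                # Message was moved or deleted mid-walk; a later pass sees the new state.
                continue
            if os.path.exists(dst):
                dst_stat = os.stat(dst)
                unchanged = (
                    dst_stat.st_size == src_stat.st_size
                    and int(dst_stat.st_mtime) == int(src_stat.st_mtime)
                )
                if unchanged:
                    continue
            try:
                # copy2 preserves the mtime, so an unchanged file is skipped next time.
                shutil.copy2(src, dst)
            except FileNotFoundError:
                continue
            copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return copied
```

Running the function twice in a row on a quiet mailbox copies nothing the second time; a message delivered between passes shows up only in the later pass.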
Like I said before, things are easier if we only have to apply "heavy
duty" backup techniques to a single filesystem. This is why it's
preferable to back up IMAP files to the shared filesystem, if that's not
their primary home.
Karl Chen wrote:
>I think the first thing to decide is if we want a transparent
>distributed file system like AFS/Coda/etc, or an exposed system
>like you describe with rsync and synchronization commands.
>
>Adam, it sounds like you favor control and performance over
>synchronization and ease of use, which would mean rsync instead of
>AFS.
>
>Personally I think AFS would be easier and tweakable enough for
>performance concerns.
>
No, I'd say I favor ease of use. I suggested doing everything with a
shared filesystem, and it was Justin who said that he has some doubts
that we could achieve acceptable performance that way. It seems like a
good compromise is using an AFS work-alike as the _logical_ view of all
our files, while implementing our own rsync-based caching schemes for it
where appropriate. Perhaps AFS already provides super caching support,
such that rsync wouldn't be needed; I don't know enough about it to say.
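As a rough sketch of what a home-grown caching layer over the shared filesystem could look like (the function cached_path and the mtime-based staleness check are assumptions for illustration, not anything AFS or rsync provides), a front-end machine might keep a local copy of each file and refresh it only when the shared copy is newer:

```python
import os
import shutil


def cached_path(shared_root, cache_root, rel_path):
    """Return a local path for rel_path, refreshing it from the shared tree if stale.

    Hypothetical helper: the shared tree stays the logical home of the file,
    and the local cache is only a copy.
    """
    shared = os.path.join(shared_root, rel_path)
    local = os.path.join(cache_root, rel_path)
    shared_mtime = os.stat(shared).st_mtime
    # Refresh when there is no local copy or the shared copy is newer.
    if not os.path.exists(local) or os.stat(local).st_mtime < shared_mtime:
        os.makedirs(os.path.dirname(local), exist_ok=True)
        shutil.copy2(shared, local)  # copy2 carries the mtime over
    return local
```

A sketch like this still contacts the shared filesystem on every call to check the mtime, so it saves data transfer rather than round trips; a real scheme would also need some invalidation policy.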
Justin S. Leitgeb wrote:
>The point is that AFS alone is not going to deal with performance
>issues. Real-world web sites can stress even large machines by today's
>standards. I just think that we need to recognize this out front, and
>realize that we can't scrimp on the disk configuration of the fileserver
>-- RAID 10 may be something we should consider if we really want to go
>down that route, and it won't be cheap. We will also want loads of
>memory in the front-end servers so that they can cache rather than
>introducing additional overhead assembling AFS packets.
>
We could do this in a forward-looking way by putting the infrastructure
for this sort of organization in place without including much capacity
at first. Simple evidence that this would work initially is that we get
by fine with our current server specs, and in fact we are underutilizing
what we have. Maybe RAID 10 and RAM are not cheap, but we wouldn't need
much of either to start out with. We'd still be able to gain valuable
experience using them in our initially undemanding setting.
In any case, it's probably a good idea to pick out a few of the most
promising hardware configurations, price them out, and see what seems
worth the cost.