[Hcoop-discuss] Next hardware configuration/Email service

Sun Jan 29 12:44:48 EST 2006

Justin S. Leitgeb wrote:

>I guess the tough part is figuring out an architecture that is robust, 
>while still allowing us to provide "generic" hosting services.  What 
>kind of redundancy do you think is reasonable?  Obviously we could fill 
>up a rack pretty quickly depending on the redundancy that we want to 
>provide.  I know that you mentioned a robust file server and it seems 
>like this could help us out, but in practice I haven't seen web servers 
>hitting a network filesystem.  I'm worried about performance issues and 
>scaling here (I haven't tried this but there are some threads on the web 
>about performance and scaling with Apache over a network file system 
>that concerned me).  Might work with something like RAID 10, but I know 
>applications like IMAP work horribly over NFS and AFS.
>  
>
I'm generally ignorant about established practice here.  For systems 
like the cluster you've mentioned you administer, do all of the nodes 
just sync the relevant web files from a central source every day or 
something?

>An alternative would be to set up a couple of web nodes, and give users 
>an account on one or the other.  Both could have RAID 1, or no disk 
>redundancy.  We could replicate between the two (e.g., rsync) in order 
>to recover quickly in the event of a failure, and MySQL replication 
>could work from one to the other, adding cheap DB redundancy.  Later it 
>would be nice to have a database cluster, perhaps a 4 node MySQL 
>configuration that would be suitable for dynamic web-based applications, 
>but obviously this can wait.  Then it seems we would still need to have 
>a third system for services such as IMAP, etc., that we don't want to 
>give normal users access to.  Since this would be a possible single 
>point of failure, it should have a high level of RAID, power redundancy, 
>etc.  Perhaps RAID 10 on this system because IMAP is so I/O intensive.  
>A hardware firewall would be a nice addition as well.  This would be a 
>3-node starting configuration.
>
I think we should at all times have multiple backup images 
(daily/weekly/monthly/etc.) of all data that doesn't come directly from 
a software package, and these images should be stored in a way that 
makes data loss from random failure very unlikely.  If all that data 
lives on a central shared filesystem, then the goal is probably easier 
to achieve.  Also, our current access control schemes (w.r.t. domains 
and other resources) are based on filesystem permissions.  If services 
are spread across filesystems, then this scheme no longer works as well.
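
To make the daily/weekly/monthly idea concrete, here is a rough sketch of 
a retention rule that decides which dated backup images to keep.  The 
function names and the counts (7 daily, 4 weekly, 6 monthly) are 
placeholders I made up for illustration, not anything we have agreed on 
or have running.

```python
from datetime import date, timedelta


def newest_per_bucket(dates, bucket_key, limit):
    """Return the newest date from each of the `limit` most recent buckets."""
    chosen = {}
    for d in sorted(dates, reverse=True):  # newest first
        key = bucket_key(d)
        if key not in chosen:
            if len(chosen) == limit:
                break
            chosen[key] = d
    return set(chosen.values())


def images_to_keep(image_dates, daily=7, weekly=4, monthly=6):
    """Pick the image dates to retain; the default counts are assumptions."""
    dates = set(image_dates)
    # Daily: the newest `daily` images.
    keep = set(sorted(dates, reverse=True)[:daily])
    # Weekly: the newest image in each of the most recent ISO weeks.
    keep |= newest_per_bucket(dates, lambda d: tuple(d.isocalendar()[:2]), weekly)
    # Monthly: the newest image in each of the most recent calendar months.
    keep |= newest_per_bucket(dates, lambda d: (d.year, d.month), monthly)
    return keep


if __name__ == "__main__":
    # Sixty consecutive daily images ending on the date of this message.
    last = date(2006, 1, 29)
    images = [last - timedelta(days=n) for n in range(60)]
    for d in sorted(images_to_keep(images)):
        print(d.isoformat())
```

Anything not returned by images_to_keep() would be a candidate for 
deletion.  This only covers which images to retain; where they are 
stored so that a random failure doesn't take them out is a separate 
question.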

Data backups are the most important kind of redundancy that I'm thinking 
about.  Servers that are able to take over for each other in case of 
failure are the next level.

The other main advantage of having multiple servers is that it's easier 
to administer a server that doesn't need to have user resource limits in 
place.  For example, we have some problems now with admin-maintained 
services running into ulimits set up to prevent members from DoS'ing fyodor.
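
As an illustration of the distinction, here is a minimal sketch of 
applying limits only to processes started on behalf of members, leaving 
the launching (admin) process untouched.  It uses Python's standard 
resource module on a Unix system; run_member_command and the particular 
limit values are hypothetical, and this is not how the limits on fyodor 
are actually configured.

```python
import resource
import subprocess
import sys


def limited(cpu_seconds, max_open_files):
    """Build a hook that lowers limits in the child just before exec."""
    def apply():
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_NOFILE,
                           (max_open_files, max_open_files))
    return apply


def run_member_command(argv, cpu_seconds=10, max_open_files=64):
    """Run argv with per-process limits; the defaults are made-up values."""
    return subprocess.run(
        argv,
        preexec_fn=limited(cpu_seconds, max_open_files),
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # The child reports the soft open-file limit it was started with.
    probe = ("import resource; "
             "print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])")
    result = run_member_command([sys.executable, "-c", probe])
    print("child limit:", result.stdout.strip())
    print("parent limit:", resource.getrlimit(resource.RLIMIT_NOFILE)[0])
```

The limits here are set in the child only, so services started some 
other way are not affected by them.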