[Hcoop-discuss] Next hardware configuration/Email service

Justin S. Leitgeb leitgebj at hcoop.net
Sun Jan 29 17:18:37 EST 2006


Adam Chlipala wrote:

>Justin S. Leitgeb wrote:
>
>>I guess the tough part is figuring out an architecture that is robust, 
>>while still allowing us to provide "generic" hosting services.  What 
>>kind of redundancy do you think is reasonable?  Obviously we could fill 
>>up a rack pretty quickly depending on the redundancy that we want to 
>>provide.  I know that you mentioned a robust file server and it seems 
>>like this could help us out, but in practice I haven't seen web servers 
>>hitting a network filesystem.  I'm worried about performance issues and 
>>scaling here (I haven't tried this but there are some threads on the web 
>>about performance and scaling with Apache over a network file system 
>>that concerned me).  Might work with something like RAID 10, but I know 
>>applications like IMAP work horribly over NFS and AFS.
>
>I'm generally ignorant about established practice here.  For systems 
>like the cluster you've mentioned you administer, do all of the nodes 
>just sync the relevant web files from a central source every day or 
>something?

Yes.  Currently we use CVS to update all of the web servers daily, or 
manually when files need to change.  That was implemented before I 
started working on the clusters.  I think many shops use something like 
rsync to maintain the files on multiple hosts (driven by cron or custom 
scripts), and that would be a much better tool for the job.
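
To make that concrete, a sync script along these lines could run from 
cron on whichever machine holds the canonical copy of the web tree.  
The hostnames and paths are made up for illustration; the rsync flags 
are the standard ones for this kind of mirroring:

    #!/usr/bin/env python
    # Sketch of the rsync idea above: push the canonical web tree from
    # a master host to each web node.  Hostnames/paths are hypothetical.
    import subprocess
    import sys

    WEB_NODES = ["web1.example.org", "web2.example.org"]  # hypothetical
    SRC = "/srv/www/"   # canonical copy on this machine (hypothetical)
    DEST = "/srv/www/"  # same path on each node

    def push(node):
        # -a preserves permissions and timestamps, --delete removes
        # files gone from the source, -z compresses on the wire.
        return subprocess.call(
            ["rsync", "-az", "--delete", SRC, node + ":" + DEST])

    if __name__ == "__main__":
        failed = [n for n in WEB_NODES if push(n) != 0]
        if failed:
            sys.stderr.write("rsync failed for: " + ", ".join(failed) + "\n")
            sys.exit(1)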


>>An alternative would be to set up a couple of web nodes, and give users 
>>an account on one or the other.  Both could have RAID 1, or no disk 
>>redundancy.  We could replicate between the two (e.g., rsync) in order 
>>to recover quickly in the event of a failure, and MySQL replication 
>>could work from one to the other, adding cheap DB redundancy.  Later it 
>>would be nice to have a database cluster, perhaps a 4 node MySQL 
>>configuration that would be suitable for dynamic web-based applications, 
>>but obviously this can wait.  Then it seems we should still need to have 
>>a third system for services such as IMAP, etc., that we don't want to 
>>give normal users access to.  Since this would be a possible single 
>>point of failure, it should have a high level of RAID, power redundancy, 
>>etc.  Perhaps RAID 10 on this system because IMAP is so I/O intensive.  
>>A hardware firewall would be a nice addition as well.  This would be a 
>>3-node starting configuration.
>
>I think we should at any time have multiple backup images 
>(daily/weekly/monthly/etc.) of all data that doesn't come directly from 
>a software package, and these images should be stored in a way that 
>makes data loss from random failure very unlikely.  If all that data 
>lives on a central shared filesystem, then the goal is probably easier 
>to achieve.  Also, our current access control schemes (w.r.t. domains 
>and other resources) are based on filesystem permissions.  If services 
>are spread across filesystems, then this stops working as well.

I agree with this backup plan, and I'm not really against the idea of a 
shared filesystem - it would definitely make backups easier.  I just 
know that it won't work with IMAP, and I'm unsure how it will scale 
with our front-end web servers as our needs grow.  It seems we would 
have to buy a really expensive server (lots of SCSI disks) to handle 
the I/O as we add web servers.  It might make more sense to buy 
independent web servers and distribute accounts across them as our 
needs grow.  If we did that, we could keep adding web servers without 
I/O bottlenecks, until we can save up for a SAN, or a really fast NAS 
box with *lots* of disks, if we ever wanted to centralize things.  For 
now I just think it makes more economic sense, and would be more 
scalable, to look for:

1) one or two relatively small web servers (with or without RAID 1, 
mostly depending on whether we can keep load low enough to shift users 
from one web server to the other in the event of a failure).

2) a system for services like IMAP and sendmail that only a core group 
of admins needs shell access on - we could start on RAID 1, but I think 
the I/O would kill us here, too, as IMAP becomes more important.  IMAP 
servers have huge I/O needs, and RAID 10 would be nice eventually.

3) a system loaded with cheap SATA disks for syslog and backups?  (See 
the rotation sketch after this list.)
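
On the backup images (daily/weekly/monthly) you mentioned, one cheap 
way to get them is rsync's --link-dest hard-link snapshots - the trick 
rsnapshot uses - so unchanged files cost no extra disk.  A rough sketch 
of the daily tier, with hypothetical paths (a real setup would add 
weekly/monthly tiers the same way):

    # Sketch of hard-link snapshot rotation on the backup box.
    # Paths, hostname, and retention are hypothetical.
    import os
    import shutil
    import subprocess

    BACKUP_ROOT = "/backup/web1"        # on the cheap-SATA backup box
    SOURCE = "web1.example.org:/home/"  # hypothetical source tree
    KEEP = 7                            # daily images to keep

    def rotate():
        # daily.6 is dropped, daily.5 -> daily.6, ..., daily.0 -> daily.1
        oldest = os.path.join(BACKUP_ROOT, "daily.%d" % (KEEP - 1))
        if os.path.isdir(oldest):
            shutil.rmtree(oldest)
        for i in range(KEEP - 2, -1, -1):
            src = os.path.join(BACKUP_ROOT, "daily.%d" % i)
            if os.path.isdir(src):
                os.rename(src, os.path.join(BACKUP_ROOT, "daily.%d" % (i + 1)))

    def snapshot():
        # Files unchanged since daily.1 are hard-linked, not copied, so
        # each extra day only costs the space of what actually changed.
        subprocess.check_call([
            "rsync", "-a", "--delete",
            "--link-dest=" + os.path.join(BACKUP_ROOT, "daily.1"),
            SOURCE, os.path.join(BACKUP_ROOT, "daily.0")])

    if __name__ == "__main__":
        rotate()
        snapshot()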

What about using LDAP or an alternative for managing user accounts 
across servers?  I'm not familiar enough with the applications you've 
developed to know for sure how that would work out, but there are 
plenty of tools for account administration, and we could easily build 
something ourselves.  In the setup above, each user would only need 
accounts on two machines: the web host they're assigned to and the 
mail server for IMAP.
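
For the directory idea, the usual glue would be nss_ldap/pam_ldap on 
each machine pointing at one directory server, so accounts are defined 
once.  As a hedged sketch of the kind of lookup involved (using 
python-ldap; the server, base DN, and the idea of recording each 
member's assigned web node in a "host" attribute are assumptions for 
illustration, not anything we run today):

    import ldap  # python-ldap

    LDAP_URI = "ldap://ldap.example.org"     # hypothetical directory server
    BASE_DN = "ou=People,dc=example,dc=org"  # hypothetical base DN

    def web_host_for(username):
        """Return the web node a member is assigned to, or None."""
        conn = ldap.initialize(LDAP_URI)
        conn.simple_bind_s()  # anonymous bind; a read-only lookup
        results = conn.search_s(BASE_DN, ldap.SCOPE_SUBTREE,
                                "(uid=%s)" % username, ["host"])
        for _dn, attrs in results:
            if attrs.get("host"):
                return attrs["host"][0]  # comes back as a byte string
        return None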

Justin



