[HCoop-Misc] AFS on Linux

Christopher D. Clausen cclausen at hcoop.net
Wed Jan 24 10:10:20 EST 2007


Alok G. Singh <alephnull at airtelbroadband.in> wrote:
> On 23 Jan 2007, cclausen at hcoop.net wrote:
>
>> Yes, it is possible.  I'm not sure how AFS would work with Rocks
>> though.  It shouldn't be a problem as long as each machine has its
>> own /etc space for the AFS config files (some of them need to be
>> different on each machine.)
>
> Yes. The /etc space is distinct for each box.
>
>> Note that a compromise of any one machine would compromise ALL of
>> them though (which may already be the case.)
>
> It is indeed the case.
>
>> AFS servers generally want dedicated partitions for the server to
>> store data.  Also note that AFS does not export local file systems
>> like NFS.  You need to copy data into AFS and once copied it is only
>> accessible with an AFS client.
>
> That is fine. We only plan to use the partition for backup while we
> buy some tapes.
>
>> You might be better off using some block-level export scheme, like
>> iSCSI or nbd to share the space to a few machines that would act as
>> servers.  Of course, this isn't an efficient use of network I/O as
>> data would travel in and out of the server machine.
>
> To save some money we bought a smaller switch and only one NIC is
> being used per box. Since we run MPI code on the cluster, we would
> rather have the network as free as possible. Or at least, have some
> control over when the data transfer occurs.
>
>> If you can be more specific as to how you'd want to actually use the
>> disk space, I might be able to make a better suggestion.
>
> The intent is to combine all the small disks that are distributed
> across 16 machines into one large partition where about 1 TB of data
> can be stored. This data is not required on a day-to-day basis. In
> that sense it is just like a tape drive.
>
> Since it appears that AFS does not do this aggregation, is there
> something else that does? From my research, Coda doesn't either.

You could export every disk via nbd and aggregate them on one host, 
providing a single large partition.  Of course this master server would 
see high network usage and would probably interfere with your MPI jobs.  
You might be better off having a machine that is NOT in the cluster act 
as the storage head.
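
A rough sketch of the idea (the hostnames, port, and device names below 
are placeholders; substitute whatever your cluster actually uses).  Each 
node exports its spare partition with nbd-server, and the storage head 
strings the imports together with LVM:

    # On each compute node: export the spare partition over TCP
    nbd-server 2000 /dev/sdb1

    # On the storage head: import one nbd device per node
    nbd-client node1 2000 /dev/nbd0
    nbd-client node2 2000 /dev/nbd1
    # ...and so on, one /dev/nbdN per node

    # Concatenate the imports into one large logical volume
    pvcreate /dev/nbd0 /dev/nbd1
    vgcreate backupvg /dev/nbd0 /dev/nbd1
    lvcreate -l 100%FREE -n backup backupvg
    mkfs.ext3 /dev/backupvg/backup

Keep in mind that losing any single node (or its NIC) takes the whole 
aggregate volume offline, which is another reason not to trust it with 
anything irreplaceable.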

Do you actually need one large 1TB partition?  You can access all the 
space from AFS, but only in chunks the size of the partition on each 
disk.  It would be more efficient (IMHO) to have multiple clients 
writing data to multiple machines at the same time, since any 
aggregation technique will likely require a single host to perform ALL 
I/O.  If the intent is indeed to use the space like a tape drive, note 
that I don't know of any 1TB tapes.  With AFS you could access the 
space in chunks like /afs/cell/node1, /afs/cell/node2, etc.
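
Roughly, that layout would look like this (the cell, server, and volume 
names are made up; adjust them to your own cell):

    # Create one volume on each fileserver's /vicep partition
    vos create node1 /vicepa backup.node1
    vos create node2 /vicepa backup.node2

    # Mount the volumes side by side under the cell root
    fs mkmount /afs/cell/node1 backup.node1
    fs mkmount /afs/cell/node2 backup.node2

Clients can then write into several node volumes in parallel, which is 
where the efficiency I mentioned comes from.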

I'd suggest having a look at: http://code.google.com/p/hotcakes/
It seems to be exactly what you want, although I'm not sure I'd trust 
important data to it.  I'm also not sure how it works, and it might not 
yet scale to clusters, as that appears to be a feature they are still 
working on.

Most cluster filesystems won't help you either, as each node requires 
block-level access to the storage.  (You could provide that with iSCSI 
or nbd, but I suspect it would be a drain on your network bandwidth.)
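
For completeness, exporting a node's disk as an iSCSI block device 
looks roughly like this with the iSCSI Enterprise Target (the target 
name and device path are placeholders):

    # /etc/ietd.conf on the node exporting the disk
    Target iqn.2007-01.net.hcoop:node1.sdb1
        Lun 0 Path=/dev/sdb1,Type=fileio

    # On the machine consuming the disk (open-iscsi)
    iscsiadm -m discovery -t sendtargets -p node1
    iscsiadm -m node -T iqn.2007-01.net.hcoop:node1.sdb1 -p node1 --login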

<<CDC 





