Sunday, September 25, 2011

The Google File System

GFS is a large, reliable, distributed file system that runs on many commodity servers and commodity disks.  Google developed GFS to provide high aggregate throughput for data-intensive applications.  GFS also addresses the challenges of running a file system on commodity servers, which fail often.  The developers examined their application workloads and technology trends and co-designed the file system with its applications to achieve good performance.  GFS is built around revised assumptions about data access: frequent component failures, huge files, and workloads that mostly append to files rather than overwrite them.

GFS runs on many commodity servers, organized into a master server and chunkservers.  The master maintains the file system metadata and manages chunks, while the chunkservers store the chunk data.  A client first contacts the master to learn which chunkservers hold a given chunk, and it caches that metadata for a short time.  Chunk data is never cached by the client, since cache coherence is difficult in a distributed environment and applications mostly stream through the data once.  Chunks are 64MB and are lazily allocated to reduce internal fragmentation on the chunkservers.  Metadata is kept in the master's memory for fast access, and in practice it is only about 100MB in size.

Data flow and control flow are separated so the full network bandwidth can be used when transferring data.  Data is pipelined along a chain of chunkservers, while control messages for a write go through a primary chunkserver that orders mutations across replicas.  Lazy garbage collection cleans up deleted and orphaned chunks.  GFS provides high availability through fast recovery and replication of data.  In practice, most writes were appends rather than overwrites, as expected.  Write throughput was lower than read throughput because each write must reach three replicas, which increases network contention.  GFS works well in practice because the designers considered the new assumptions about data access patterns and developed the applications and the file system together.
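To make the client read path concrete, here is a minimal sketch in Go of the flow described above: the client asks the master for a chunk's locations, caches that metadata briefly, and then fetches data directly from a chunkserver.  The type and function names here are invented for illustration and are not the actual GFS interfaces.

```go
package main

import (
	"fmt"
	"time"
)

const chunkSize = 64 << 20 // 64 MB chunks, as in the paper

// ChunkLocation is what the master returns: a chunk handle plus the
// chunkservers holding replicas. (Hypothetical types, not the real API.)
type ChunkLocation struct {
	Handle  uint64
	Servers []string
	fetched time.Time // when the client obtained this entry
}

// Master maps (file, chunk index) -> chunk locations. In GFS this metadata
// lives in the master's memory; here it is just an in-process map.
type Master struct {
	chunks map[string][]ChunkLocation // filename -> chunks in order
}

func (m *Master) Lookup(file string, chunkIndex int) (ChunkLocation, error) {
	locs, ok := m.chunks[file]
	if !ok || chunkIndex >= len(locs) {
		return ChunkLocation{}, fmt.Errorf("no such chunk: %s[%d]", file, chunkIndex)
	}
	return locs[chunkIndex], nil
}

// Client caches chunk locations for a short time so repeated reads of the
// same region do not go back to the master. Chunk *data* is never cached.
type Client struct {
	master   *Master
	cacheTTL time.Duration
	cache    map[string]ChunkLocation // "file#index" -> location
}

func (c *Client) Read(file string, offset int64) ([]byte, error) {
	chunkIndex := int(offset / chunkSize)
	key := fmt.Sprintf("%s#%d", file, chunkIndex)

	loc, ok := c.cache[key]
	if !ok || time.Since(loc.fetched) > c.cacheTTL {
		var err error
		loc, err = c.master.Lookup(file, chunkIndex)
		if err != nil {
			return nil, err
		}
		loc.fetched = time.Now()
		c.cache[key] = loc
	}

	// Read directly from a replica; the master is not on the data path.
	return readFromChunkserver(loc.Servers[0], loc.Handle, offset%chunkSize)
}

// readFromChunkserver stands in for the RPC that fetches chunk data.
func readFromChunkserver(server string, handle uint64, offsetInChunk int64) ([]byte, error) {
	return []byte(fmt.Sprintf("data from %s, chunk %d @ %d", server, handle, offsetInChunk)), nil
}

func main() {
	m := &Master{chunks: map[string][]ChunkLocation{
		"/logs/crawl": {{Handle: 42, Servers: []string{"cs1", "cs2", "cs3"}}},
	}}
	c := &Client{master: m, cacheTTL: time.Minute, cache: map[string]ChunkLocation{}}

	data, err := c.Read("/logs/crawl", 1024)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))
}
```

The point of the sketch is the division of labor: the master only answers metadata lookups and never touches chunk data, which keeps it from becoming a bandwidth bottleneck.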

The future holds new and changing assumptions that will drive different design decisions for a large distributed file system.  More memory is becoming commonplace in commodity servers, so a future file system should take advantage of it.  In addition, SSDs are becoming popular and may replace disks in servers.  GFS optimizes for aggregate throughput and append workloads, but these hardware trends make lower latencies achievable.  Lower latency will matter for interactive workloads, so I think it will be interesting to see whether the new hardware enables interactive and overwrite workloads such as transactional database workloads.
