Monday, October 10, 2011

Dremel: Interactive Analysis of Web-Scale Datasets

Dremel is a system from Google: a highly scalable, interactive, ad hoc analysis system for read-only nested data.  It uses execution trees and columnar storage, and can run aggregations over trillion-row tables in a few seconds.  It scales to thousands of machines and supports petabytes of data.  Interactive response times are important for data exploration and debugging, but very large datasets are difficult to analyze interactively with MapReduce, so Dremel aims for low response times.  Dremel's sweet spot is aggregating over large-scale data; it is not a replacement for MapReduce.

Using a serving tree, partial aggregation of intermediate results can be performed as the data streams from the leaf nodes up to the root.  The data is stored in a column-wise fashion, which compresses well and is advantageous when queries project only a small subset of the columns.  Dremel works on the data in situ, so there is no loading time and other frameworks can access the same data.  Experiments show that with columnar storage, queries that access only a few columns can see orders-of-magnitude speedups, and MapReduce jobs can also benefit from the columnar format.  For large aggregations, having more levels in the serving tree can yield further speedups, since more partial aggregation of intermediate results is performed.  Stragglers can be a problem for interactivity, but the system can be configured to return an answer after reading less than 100% of the data.
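To make the serving-tree idea concrete, here is a minimal sketch in Python.  It is not Dremel's API; the function names, the fixed fanout, and the in-memory "partitions" are all assumptions for illustration.  Leaves scan their partition and emit a partial (sum, count) aggregate, interior levels merge partials, and the root combines only a handful of results, which is what keeps the final step cheap:

```python
# Illustrative sketch of serving-tree partial aggregation (not Dremel's API).
# Leaves scan partitions; each interior level merges partial (sum, count)
# pairs, so the root combines only a few aggregates instead of raw rows.

def leaf_aggregate(partition):
    """Leaf node: scan one data partition, emit a partial aggregate."""
    return (sum(partition), len(partition))

def merge(partials):
    """Interior/root node: combine a group of partial aggregates."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return (total, count)

def serving_tree_avg(partitions, fanout=2):
    """Compute an average by aggregating partials up a tree of the given fanout."""
    level = [leaf_aggregate(p) for p in partitions]
    while len(level) > 1:
        level = [merge(level[i:i + fanout])
                 for i in range(0, len(level), fanout)]
    total, count = level[0]
    return total / count

# Example: average over four partitions of a logical table.
partitions = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(serving_tree_avg(partitions))  # 4.5
```

Because (sum, count) pairs merge associatively, each level can run in parallel, and returning early with less than 100% of the leaves read (the straggler mitigation mentioned above) just means merging fewer partials.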

Dremel is a successful system in part because Google can afford to dedicate thousands of machines to it: high scalability and low latency are achieved by using more machines to process the data.  Most other companies will not be able to afford so many machines for their ad hoc processing system, so future systems will have to use fewer machines more efficiently while still providing interactivity.  This may mean the hardware will have to use SSDs, or that different compression or scanning techniques will be needed.  Future systems will also have to handle other use cases, such as streaming data and modifiable data.
