Monday, October 3, 2011

SCADS: Scale-independent storage for social computing applications

Internet services need to adapt to the load they receive from users.  That load can vary widely: it grows as a service gains popularity, and it dips naturally at certain times of day.  Beyond adapting to query load, a service must also handle growing data while still providing good response times.  SCADS addresses these issues with data scale independence, efficient scaling up and down, and machine learning to predict performance and resource requirements.  Data scale independence is a useful property because it lets the data grow without changes to the application.  SCADS also provides a performance-safe query language, which statically analyzes queries and prohibits any that may not require a constant amount of work per user.  In practice, every query must be answerable with lookups whose cost does not depend on the total size of the data, which can be achieved by creating indexes.
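To make the index point concrete, here is a minimal sketch in Python.  It is not the SCADS query language or API; the `TinyStore` class, the `by_town` index, and the `is_performance_safe` check are all hypothetical names invented for illustration.  It contrasts a scan, whose cost grows with the total number of users, with an index lookup that returns at most `limit` results, and shows a toy version of rejecting a query before it runs.

```python
from collections import defaultdict


class TinyStore:
    """Toy in-memory store with one secondary index (hypothetical, not SCADS)."""

    def __init__(self):
        self.users = {}                    # user_id -> record
        self.by_town = defaultdict(list)   # town -> [user_id], maintained on write

    def put(self, user_id, name, town):
        # Keep the index in sync with the base data on every write.
        old = self.users.get(user_id)
        if old is not None:
            self.by_town[old["town"]].remove(user_id)
        self.users[user_id] = {"name": name, "town": town}
        self.by_town[town].append(user_id)

    def scan_by_town(self, town):
        # Touches every record: work grows with the size of the data.
        return [uid for uid, rec in self.users.items() if rec["town"] == town]

    def lookup_by_town(self, town, limit):
        # One index lookup, then at most `limit` results: bounded work.
        return self.by_town.get(town, [])[:limit]


def is_performance_safe(query, indexed_fields):
    """Toy static check: accept only indexed lookups with a positive limit."""
    limit = query.get("limit")
    return (
        query.get("field") in indexed_fields
        and isinstance(limit, int)
        and limit > 0
    )
```

The trade-off is the usual one for indexes: writes do a little more work so that reads stay bounded no matter how large the data grows.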

SCADS adds features on top of large-scale data storage systems.  The key additions are the performance-safe query language and the use of machine learning to scale the cluster up and down automatically.  Automatic scaling will matter more as systems grow: manual scaling drives up maintenance costs, so accurate estimation of resource consumption and automatic scaling will be crucial.  Data-scale-independent queries are a useful feature, but they can be too restrictive, especially when larger analyses are required.  SCADS was built for interactive queries rather than large-scale analysis, but future systems will bridge the gap between OLTP and OLAP workloads.
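As a rough illustration of prediction-driven scaling, the sketch below fits a straight line to recent load samples and sizes the cluster for the predicted load.  This is only a stand-in: the post does not say which models SCADS uses, and the function names, the `headroom` parameter, and the per-server capacity figure are all assumptions made for the example.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # A single sample (or identical x values): assume a flat trend.
        return 0.0, mean_y
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def servers_needed(history, steps_ahead, capacity_per_server,
                   headroom=0.25, min_servers=1):
    """Predict load `steps_ahead` intervals out and size the cluster for it.

    history: recent load samples (e.g. requests/sec), oldest first.
    capacity_per_server: load one server can handle (an assumed constant).
    headroom: extra fraction of capacity kept as a safety margin.
    """
    xs = list(range(len(history)))
    slope, intercept = fit_line(xs, history)
    future_x = len(history) - 1 + steps_ahead
    predicted = max(0.0, slope * future_x + intercept)
    wanted = math.ceil(predicted * (1 + headroom) / capacity_per_server)
    return max(min_servers, wanted)
```

Calling `servers_needed` on each monitoring interval yields a target cluster size that rises ahead of growing load and falls back toward `min_servers` as load drops, which is the scale-up and scale-down behavior described above.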

