Clearspring’s Big Data Architecture, Part 3

We highly recommend making your way over to the Clearspring blog to read Part 3 of our Big Data Architecture blog series.

The latest post describes the distributed query system we use to quickly access terabytes of data spread across hundreds of machines. The query subsystem consists of two key components, QueryMaster and QuerySlave. A single cluster can have multiple QueryMasters, and each processing node in the cluster runs one or more QuerySlaves.
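To give a rough feel for how a master/slave query layout like this can work, here is a minimal Python sketch. It is not our actual implementation: it assumes a simple scatter-gather pattern in which a master fans a query out to every slave and merges the partial results. The class bodies, the `count_by` method, and the sample rows are all hypothetical and only borrow the component names from the post.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class QuerySlave:
    """Hypothetical worker that answers queries over the data on its own node."""

    def __init__(self, local_rows):
        self.local_rows = local_rows

    def count_by(self, key):
        # Partial aggregate computed only over this node's share of the data.
        return Counter(row[key] for row in self.local_rows)


class QueryMaster:
    """Hypothetical coordinator (assumed scatter-gather, not the real protocol)."""

    def __init__(self, slaves):
        self.slaves = slaves

    def count_by(self, key):
        # Scatter: send the same query to every slave in parallel.
        with ThreadPoolExecutor(max_workers=len(self.slaves)) as pool:
            partials = pool.map(lambda slave: slave.count_by(key), self.slaves)
            # Gather: merge the per-node partial counts into one result.
            total = Counter()
            for partial in partials:
                total.update(partial)
        return total


# Two slaves, each holding a slice of made-up sample data.
slaves = [
    QuerySlave([{"country": "US"}, {"country": "DE"}]),
    QuerySlave([{"country": "US"}, {"country": "FR"}]),
]
master = QueryMaster(slaves)
result = master.count_by("country")
```

Because each slave only scans its local slice, the master's work in this sketch is limited to merging small partial results rather than touching the raw data.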

Also, if you need to catch up, here are Parts 1 and 2.