A four-part series is currently running on the Clearspring blog about how our team processes tens of billions of unique new data points every day.
In storage terms, we ingest 4 to 5 TB of new data each day, and that could easily double in 12-18 months. That data must be processed as live streams, re-processed from the archive as new algorithms are developed, and queryable in flexible ways. To do all that, we employ an interesting mix of batch, live, and hybrid jobs.
So head on over to the Clearspring blog to learn all about the wizardry!