Tag: Hydra

How to Join Data in Hydra


Hydra is great not only for continuously processing data streams, such as web logs, but also for one-off jobs such as special data analysis, validation, and troubleshooting. Among these one-off use cases, one of the more interesting and complicated is joining data sets. In this post, I’ll use an example to demonstrate how to join two data sets.

Behind the Scenes of Our Scrolling Engagement Report

This morning, we released our 2014 Q2 Engagement Report analyzing scrolling behavior on content across the AddThis network. In this report, we break the data down broadly by time and operating system, but also go deeper into how users were referred to the page (e.g., through ad campaigns) and which AddThis tools the pages were using. Here I’ll describe the mechanics of how we created the report.

Can Social Buzz Predict Video Game Sales?


The video game industry has thrived even in the midst of a recession. Revenue is over $60 billion a year and is projected to reach $82 billion by 2017. Even so, marketing is critical to make games stand out in a crowded market. With the growth of social media, how well do searching and sharing reflect video game sales? Are there predictors of the strength of a release?

Hydra: Storage of Nested Hierarchical Data

Hydra is a distributed data processing and storage system developed at AddThis, which we recently released as open source. It ingests streams of data and builds hierarchical tree structures that are aggregates, summaries, or transformations of the data. Sibling nodes in the tree are stored in lexicographically sorted order. This ordering is often used explicitly by the person writing a query, or implicitly by the query system to optimize query execution.
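
To make the idea of sorted siblings concrete, here is a minimal Python sketch. It is not Hydra's API or storage format; the `Node` class, its methods, and the sample dates and domains are all hypothetical, chosen only to illustrate why keeping sibling keys in lexicographic order lets a query select a key range without scanning every sibling.

```python
from bisect import bisect_left, insort


class Node:
    """Hypothetical tree node whose children are kept in lexicographic key order."""

    def __init__(self, key):
        self.key = key
        self.count = 0        # simple aggregate: number of records seen
        self._keys = []       # child keys, always sorted
        self._children = {}   # child key -> Node

    def child(self, key):
        """Return the child with this key, creating it in sorted position."""
        node = self._children.get(key)
        if node is None:
            node = Node(key)
            self._children[key] = node
            insort(self._keys, key)
        return node

    def add(self, path):
        """Count one record at this node and at every node along the path."""
        node = self
        node.count += 1
        for key in path:
            node = node.child(key)
            node.count += 1

    def children_in_range(self, lo, hi):
        """Yield children with lo <= key < hi using binary search on the sorted keys."""
        start = bisect_left(self._keys, lo)
        end = bisect_left(self._keys, hi)
        for key in self._keys[start:end]:
            yield self._children[key]


# Made-up records, ingested out of order: (date, domain)
root = Node("root")
for date, domain in [
    ("2014-07-02", "example.com"),
    ("2014-06-30", "example.org"),
    ("2014-07-01", "example.com"),
    ("2014-07-01", "example.net"),
]:
    root.add([date, domain])

# Range query over the date level: everything in July 2014
july = [(n.key, n.count) for n in root.children_in_range("2014-07-01", "2014-08-01")]
print(july)
```

Because ISO-style date strings sort lexicographically in chronological order, the range query above only touches the July nodes, which is the kind of optimization a sorted sibling layout makes possible.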