At AddThis, we believe it’s important for owners and operators of websites to implement fundamental security practices for the collection and handling of their users’ online data. Here are some tips for making sure your site and your visitors’ information are kept secure.
Hydra is a distributed data processing and storage system developed at AddThis, which we recently released as open source. It ingests streams of data and builds hierarchical tree structures that are aggregates, summaries, or transformations of the data. Sibling nodes in the tree are stored in lexicographically sorted order. This ordering is often used explicitly by people writing queries, or implicitly by the query system to optimize query execution.
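To make the value of sorted siblings concrete, here is a minimal sketch of a tree whose children are kept in lexicographic order, so that a prefix query can jump to the first match with a binary search instead of scanning every sibling. This is an illustration only, not Hydra's implementation or API: `TreeNode`, `add_path`, `children_with_prefix`, and the sample records are all hypothetical.

```python
import bisect


class TreeNode:
    """Hypothetical node whose children are kept in lexicographic order by name."""

    def __init__(self, name):
        self.name = name
        self.hits = 0        # simple aggregate: number of records that passed through
        self._names = []     # child names, always sorted
        self._children = {}  # child name -> TreeNode

    def child(self, name):
        """Return the child called `name`, inserting it in sorted position if new."""
        node = self._children.get(name)
        if node is None:
            node = TreeNode(name)
            bisect.insort(self._names, name)
            self._children[name] = node
        return node

    def add_path(self, path):
        """Ingest one record, given as a sequence of keys, and bump the counters."""
        node = self
        node.hits += 1
        for key in path:
            node = node.child(key)
            node.hits += 1

    def children_with_prefix(self, prefix):
        """Binary-search to the first candidate, then stop at the first non-match."""
        start = bisect.bisect_left(self._names, prefix)
        matches = []
        for name in self._names[start:]:
            if not name.startswith(prefix):
                break  # sorted order guarantees no later sibling can match
            matches.append(self._children[name])
        return matches


# Assumed sample data: (date, domain) pairs.
root = TreeNode("root")
for record in [("2014-03-01", "example.com"), ("2014-03-02", "example.org"),
               ("2014-04-01", "example.com"), ("2014-03-01", "example.net")]:
    root.add_path(record)

march = [(n.name, n.hits) for n in root.children_with_prefix("2014-03")]
print(march)
```

Because the dates sort lexicographically, all of the March nodes sit next to each other, which is the kind of ordering a query author or a query planner can take advantage of.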
Now that Hydra is open source, we can start to talk about how to use it for common data processing tasks. In this post we will answer several questions about log files generated using Log-Synth. The Log-Synth files we will be using for this post have four columns:
Today we are happy to announce that Hydra, the core of our data processing platform, is now open source and available on GitHub. It’s freely available under the Apache License for anyone to use, and we look forward to seeing just what people do with it!
We had big plans to expand our server infrastructure this year. So we put together a rough plan that included capacity planning, hardware selection, hardware testing and validation, rollout, and finally, using the new hardware!
We periodically have to transfer files to a collection of machines in a cluster. Without a distributed filesystem, we rely on user-level processes to move these files to their target destinations. Previously we had been making a sequence of N rsync calls to populate a collection of N machines. When looking for a preexisting solution that would improve our workflow, I could not find one that met the following requirements:
It is a competitive advantage for websites to be fast and responsive, so we made performance a priority when building Smart Layers. Let’s take a look at some of the mobile and desktop performance best practices that help make Smart Layers blazing fast.
A concurrent data structure is “a particular way of storing and organizing data for access by multiple computing threads (or processes) on a computer.” In this blog entry, we’ll cover one of the hidden sides of concurrent data structures that is not well documented in the literature. We’ll look at insertion and deletion operations and compare the relative complexity of implementing them.
Let’s focus our attention on concurrent data structures in a shared-memory environment where multiple threads are concurrently reading and/or writing.