
All posts by Michael Spiegel

Announcing New Open Source Libraries: Cronus and Hermes


We are pleased to announce that we have open-sourced two additional AddThis libraries. Cronus is a lightweight cron Java library. Hermes is a programmable page speed measurement tool. The libraries have been released under the Apache License 2.0.

Cronus

Cronus accepts Vixie Cron syntax and schedules actions for execution. The Cronus implementation is relatively small, at under 5,000 lines of code. Much of its functionality relies on the excellent JSR 310 date and time API (java.time) in Java 8 to perform date/time calculations.
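To give a feel for how java.time helps with this kind of calculation, here is a minimal sketch of finding the next time a five-field cron expression fires. It is not the Cronus API: the class name CronSketch and its methods are hypothetical, only "*" and single integers are supported in each field, and day-of-month and day-of-week are simply ANDed together, which is a simplification of real Vixie Cron behavior.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration only; this is not how Cronus is implemented.
public class CronSketch {

    // A field matches if it is "*" or a single integer equal to the value.
    static boolean fieldMatches(String field, int value) {
        return field.equals("*") || Integer.parseInt(field) == value;
    }

    // Fields: minute hour day-of-month month day-of-week (0 = Sunday).
    static boolean matches(String expr, LocalDateTime t) {
        String[] f = expr.trim().split("\\s+");
        if (f.length != 5) {
            throw new IllegalArgumentException("expected 5 fields: " + expr);
        }
        // java.time numbers Monday..Sunday as 1..7; cron uses 0 for Sunday.
        int dayOfWeek = t.getDayOfWeek().getValue() % 7;
        return fieldMatches(f[0], t.getMinute())
            && fieldMatches(f[1], t.getHour())
            && fieldMatches(f[2], t.getDayOfMonth())
            && fieldMatches(f[3], t.getMonthValue())
            && fieldMatches(f[4], dayOfWeek);
    }

    // Brute-force search, one minute at a time, strictly after the given time.
    static LocalDateTime next(String expr, LocalDateTime after) {
        LocalDateTime t = after.truncatedTo(ChronoUnit.MINUTES).plusMinutes(1);
        LocalDateTime limit = t.plusYears(5);
        while (t.isBefore(limit)) {
            if (matches(expr, t)) {
                return t;
            }
            t = t.plusMinutes(1);
        }
        throw new IllegalArgumentException("no match within 5 years: " + expr);
    }

    public static void main(String[] args) {
        // Next 02:30 after noon on 1 January 2024.
        System.out.println(next("30 2 * * *", LocalDateTime.of(2024, 1, 1, 12, 0)));
    }
}
```

A real scheduler would jump directly to the next candidate field value rather than stepping minute by minute, and would also handle ranges, lists, step values, and time zones.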


5 Lessons Learned In Automated Browser Testing


We really love testing our tools and products to verify they’re working correctly and efficiently. We’ve been expanding our set of automated browser tests for the addthis.com website and our suite of publisher tools. We use the Selenium browser automation framework, and we are big fans of it. Our philosophy has been to automate the simple workflows and allow QA to focus on the more intricate workflows. Here are 5 guidelines we’ve learned for writing more effective browser automation tests.

Hydra: Storage of Nested Hierarchical Data

Hydra is a distributed data processing and storage system developed at AddThis, which we recently released as open source. It ingests streams of data and builds hierarchical tree structures that are aggregates, summaries, or transformations of the data. Sibling nodes in the tree are stored in lexicographically sorted order. This ordering is often used explicitly by the user when writing queries, or implicitly by the query system to optimize the execution of queries.
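As a rough illustration of why sorted siblings are useful, the sketch below keeps each node's children in a TreeMap so that a prefix query becomes a contiguous range scan instead of a full traversal. This is not Hydra's code or API: the SortedSiblingsSketch class, the Node type, the hits counter, and the sample paths are all hypothetical.

```java
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration only; this is not Hydra's data model or API.
public class SortedSiblingsSketch {

    // A tree node whose children are kept in lexicographic order by name.
    static final class Node {
        long hits;
        final NavigableMap<String, Node> children = new TreeMap<>();

        Node child(String name) {
            return children.computeIfAbsent(name, k -> new Node());
        }
    }

    // Ingest one record as a path, counting a hit at every level it touches.
    static void ingest(Node root, String... path) {
        Node node = root;
        node.hits++;
        for (String name : path) {
            node = node.child(name);
            node.hits++;
        }
    }

    // All children whose name starts with the prefix, found by a range scan.
    // Assumes names do not contain the character U+FFFF.
    static SortedMap<String, Node> withPrefix(Node parent, String prefix) {
        return parent.children.subMap(prefix, prefix + Character.MAX_VALUE);
    }

    public static void main(String[] args) {
        Node root = new Node();
        ingest(root, "2014-02-01", "example.com");
        ingest(root, "2014-01-02", "example.org");
        ingest(root, "2014-01-01", "example.com");

        // Only the January dates are visited, in sorted order.
        for (String day : withPrefix(root, "2014-01").keySet()) {
            System.out.println(day);
        }
    }
}
```

Because siblings are ordered, a query for a date range or a key prefix only touches the matching slice of children, which is the kind of optimization a query system can apply implicitly.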

Open-Sourcing Ssync: An Out-of-the-Box Distributed Rsync

We periodically have to transfer files to a collection of machines in a cluster. Without a distributed filesystem, we rely on user-level processes to move these files to their target destinations. Previously we had been making a sequence of N rsync calls to populate a collection of N machines. When looking for a preexisting solution that would improve our workflow, I could not find one that met the following requirements:


The Secret Life of Concurrent Data Structures

A concurrent data structure is “a particular way of storing and organizing data for access by multiple computing threads (or processes) on a computer.” In this blog entry, we’ll be covering one of the hidden sides of concurrent data structures that is not well documented in the literature. We’ll be looking at insertion and deletion operations, and comparing the relative complexity of implementing these two operations.

Let’s focus our attention on concurrent data structures in a shared-memory environment where multiple threads are concurrently reading and/or writing.