At AddThis we believe that we have the best geeks on the planet, and it’s our duty to share their knowledge and passion with the world. From time to time we hand the reins over to one of our teammates in a series we call Talk Nerdy to Me.
Principal Software Engineer Otto Barnes is here to explain how to use containers to more efficiently run Cassandra on OSX. Take it away, Otto.
We use Cassandra at a rather large scale because of its linear scalability and fault tolerance. Netflix published an interesting review of Cassandra performance on AWS that makes great reading: Revisiting 1 Million Writes per Second. We have multiple Cassandra clusters (some for testing, but most for production), and there are many scenarios where developers need to make changes that affect both their application and how it stores information in Cassandra (or really any kind of database). If you want to migrate to a new version, alter a table or schema, perform a backup-restore drill, etc., don’t do it in production (unless you’re this guy). Containers and your local development stack are your friends. In this article, I will describe how we, and you, can use Cassandra, aided by av-shell (a domain-specific shell) and Docker Beta, right on your MacBook.
Note: this is an update to a previous article with a similar name, Cassandra in a Container on OSX, which used Docker Toolbox rather than Docker Beta.
Getting Your Laptop Set Up
We are focusing on OSX only and will use a few tools that are freely available, but at the time of writing, you will need to sign up for Docker Beta and wait for early access to the installer.
1. Sign up for and install Docker Beta
2. Install Brew for easy installation of git and npm:
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew install git npm
3. Do a global install of av and the av-docker plugin:
npm install -g av-shell av-docker
4. Install har to retrieve code and tarballs oh so easily:
sh -c "$(curl -fsSL https://raw.githubusercontent.com/sio2boss/har/master/tools/install.sh)"
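Once the four steps above are done, a quick sanity check can confirm that everything landed on your PATH. This is a minimal sketch, assuming the installed binaries are named docker, git, npm, av and har; the check_tool helper is made up for this illustration and is not part of any of those tools.

```shell
#!/usr/bin/env bash
# Report whether each tool from the setup steps can be found on the PATH.
# Assumption: the binaries are named docker, git, npm, av and har.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
  fi
}

for tool in docker git npm av har; do
  check_tool "$tool"
done
```

Any line that starts with "missing" points at the step to revisit.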
Launching the Container
The workflow here is optimized by av-shell and the av-docker plugin. Several pieces of context are passed to multiple commands; av remembers these and thus drastically simplifies the steps. I invite you to compare against docker.com’s Build your own image instructions.
1. On GitHub, I created a simple repository that contains a few Cassandra containers, with av and av-docker configured to use those containers. The containers themselves are based solely on Ubuntu; they install Cassandra from deb packages and ensure the Cassandra service is bound to the external IP of the container. Official Cassandra containers found on DockerHub should be used in your own projects. Let’s use har to grab the repo.
2. Get the domain-specific shell up:
cd cassandra-container && av
3. You will now be at a different command prompt. To make sure that Docker Beta is set up properly, simply run:
In the output, you should see column headings but no containers running.
4. Now choose which container to work with:
Use the arrow keys to select cassandra-1.2
5. Now build the container:
This reads the Dockerfile, which is based on Ubuntu and has been set up to install Java and Cassandra.
6. Now start the container with:
This reads the Runfile, which holds the specifics for mapping OSX directories to the container.
7. Confirm the container is running:
Optionally, you can shell into the last started container with just:
8. In a new terminal, let’s log in to the running container:
Now you are ready to populate your schemas, make changes, restore from a snapshot, etc.
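For comparison, steps 3 through 8 correspond roughly to the plain Docker commands that tools like av-docker wrap. The sketch below is not the av workflow itself: the image name, container name, and build directory are hypothetical, and the volume mappings that the Runfile would supply are left out. Only standard docker subcommands (ps, build, run, exec) are used.

```shell
#!/usr/bin/env bash
# Plain-Docker sketch of the check / build / start / login steps.
# All names below are hypothetical placeholders, not taken from the repository.
DOCKER="${DOCKER:-docker}"         # override to dry-run, e.g. DOCKER=echo
IMAGE="cassandra-sketch:1.2"       # hypothetical image tag
NAME="cassandra-sketch"            # hypothetical container name
CONTEXT="./cassandra-1.2"          # assumed directory holding the Dockerfile

# List running containers (column headings only when nothing is running).
sketch_ps()    { "$DOCKER" ps; }

# Build the image from the Dockerfile in the build directory.
sketch_build() { "$DOCKER" build -t "$IMAGE" "$CONTEXT"; }

# Start the container in the background under a fixed name.
sketch_start() { "$DOCKER" run -d --name "$NAME" "$IMAGE"; }

# Open an interactive shell inside the running container.
sketch_shell() { "$DOCKER" exec -it "$NAME" bash; }
```

Sourcing the file and calling sketch_ps, sketch_build, sketch_start, sketch_ps and sketch_shell in that order mirrors the sequence above. Setting DOCKER=echo first prints each command instead of running it, which makes the amount of context av keeps track of for you easy to see.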
In the sister article (using Docker Toolbox) we described issues with matching usernames so that volumes could be exposed from the container to OSX. Here we don’t have that complexity because Docker has simplified the underlying architecture. For Cassandra 2.0 and later, external volumes do not work for storing sstables due to Cassandra’s use of hard links. You can see this in the Runfiles within the cassandra-containers.git repository, where the ‘-v’ flags are omitted. This limitation, however, is mitigated in Docker Beta, as it uses a variable-sized disk of up to 64 GB, which is 3x larger than the default for boot2docker. Also, for more information on how av uses docker under the hood, take a look at Cassandra in a Container – Under the Hood.
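To make the hard-link point concrete, here is a small illustration of what a hard link is. The file names are made up and are not real Cassandra paths; the snapshot directory only mimics the idea of keeping a second name for a data file. A hard link requires both names to live on the same filesystem, and the filesystem has to support the operation at all, which is presumably where volumes shared between OSX and the container fall short.

```shell
#!/usr/bin/env bash
# Illustration only: made-up file names in a scratch directory.
workdir="$(mktemp -d "${TMPDIR:-/tmp}/hardlink-demo.XXXXXX")"
echo "sstable bytes" > "$workdir/data.db"

# A hard link is a second name for the same file on the same filesystem.
mkdir "$workdir/snapshot"
ln "$workdir/data.db" "$workdir/snapshot/data.db"

# -ef is true when both names refer to the same device and inode.
if [ "$workdir/data.db" -ef "$workdir/snapshot/data.db" ]; then
  echo "both names point at the same file"
fi

# Removing the original name leaves the data reachable through the link.
rm "$workdir/data.db"
cat "$workdir/snapshot/data.db"
```

Running the same ln command against a directory where hard links are unsupported fails with an error instead, which is the kind of failure the Runfiles avoid by omitting the ‘-v’ flags.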