July 19, 2010

Adding Health Checks to Deckard from Chef.

Recently, we (at Cloudant) open sourced Deckard, an HTTP content check monitoring system based on CouchDB. One of the best bits about using Couch is that it gives you a REST API, and with Deckard that API can be used to add new health checks: a simple PUT adds a new URL to monitor.

At Cloudant we love Chef and use it for everything. Chef has things called resources and providers. Resources are abstractions that describe the state you want a machine to be in. Providers perform the actions described by a resource. A good example is the package resource: on CentOS it uses yum, while on Ubuntu it uses apt-get. The resource abstracts that away, letting the provider (and node) deal with the specifics of how to install the package. This keeps your recipes nice and DRY, since the same code installs packages on all sorts of platforms. There are resources and providers for everything from installing packages to one I wrote for executing Erlang code via erl_call.

One resource that works well with Deckard is the HTTP request resource. Using it makes it very easy to add health checks from your cookbooks. We use something like the following code to add checks to new nodes at Cloudant:
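A minimal sketch of such a recipe, not the exact code we run. The Couch host, the marker file path and the fields in the check document are assumptions for illustration and not Deckard's actual schema; only the `http_request` and `file` resources and the `not_if` guard are standard Chef.

```ruby
# Hypothetical values: adjust the Couch host and marker path for your setup.
couch_url = "http://couch.example.com:5984"
marker    = "/etc/chef/deckard_check_added"

# PUT a check document into the monitor_content_check database.
# The document ID and fields are illustrative, not Deckard's real schema.
http_request "add deckard health check" do
  url "#{couch_url}/monitor_content_check/#{node[:fqdn]}"
  action :put
  message({
    :url     => "http://#{node[:fqdn]}:5984/",
    :content => "couchdb"
  })
  # Skip the request once the marker file exists.
  not_if { ::File.exist?(marker) }
end

# Create the marker file so later Chef runs do not add the check again.
file marker do
  action :create
end
```

Because the `file` resource comes after the request, the marker is only written on a run where the PUT did not fail.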

This code will add the document describing the check to the monitor_content_check database and then create a file, so that we can use “not_if” and Chef won’t attempt to add the check twice. Pretty cool stuff, and even more reason that everything should have an API. Even cooler than this example would be to use Chef Search to do the same thing, but I’ll save that for another blog post.

June 4, 2010

Just Open Sourced: Gaff and Deckard

This post was stolen from my original post on the Cloudant blog.

Today we released two open source projects that have been in use internally at Cloudant for some time now: Gaff and Deckard.

All of our infrastructure is in the cloud, and as such we need a way for disparate systems to all request resources. This is where Gaff comes in. Gaff is a pubsub daemon for asynchronously talking to cloud APIs using AMQP. Currently it supports a subset of the Dynect (DNS), Slicehost and EC2 APIs and uses geemus' awesome fog Ruby library. The basic workflow for Gaff is to send JSON-RPC formatted messages to an AMQP exchange with a routing key corresponding to the API you are talking to. You could be sending these messages from a web application or another service. Each message gets routed to an API-specific queue, where it is picked up by Gaff and turned into the appropriate API call: starting, stopping or modifying your servers on EC2 or elsewhere.
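To make that workflow concrete, here is a rough sketch of building such a message in plain Ruby. The method name, parameters and routing key are made up for illustration and are not taken from Gaff itself; the publish step is left as a comment so the sketch needs nothing beyond the standard library.

```ruby
require "json"

# Build a JSON-RPC style request body. The method and params used below are
# hypothetical; check Gaff for the calls it actually understands.
def build_rpc_message(method, params, id = 1)
  JSON.generate("method" => method, "params" => params, "id" => id)
end

# Hypothetical routing key naming the API the message is meant for.
routing_key = "ec2"
payload = build_rpc_message("create_server", { "image_id" => "ami-12345678" })

# An AMQP client would now publish `payload` to the exchange using
# `routing_key`; that step is omitted to keep the sketch dependency-free.
puts "#{routing_key}: #{payload}"
```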

We have a lot of CouchDB instances to keep tabs on, and to do this we wrote Deckard. Deckard is an HTTP check monitoring system based on CouchDB. Yo dawg! What better than to monitor CouchDB with CouchDB (and some Ruby)? Deckard supports basic HTTP content checks, email alerts, SMS alerts (via email) for on-call rotations, basic maintenance scheduling, replication latency alerts (between two Couches) and even has EC2 Elastic IP support for failover between two EC2 instances. Best of all, since it’s based on Couch you get an API for free: just PUT a doc in the HTTP checks database and you get a new HTTP check the next time Deckard runs.
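As a sketch of what "just PUT a doc" can look like from Ruby, the snippet below builds the request with the standard library and stops short of sending it. The database name, document ID and document fields are placeholders rather than Deckard's real schema.

```ruby
require "json"
require "net/http"
require "uri"

# Build (but do not send) a PUT request that stores a check document.
# The database name and document fields are placeholders.
def build_check_request(couch_url, db, doc_id, doc)
  uri = URI.parse("#{couch_url}/#{db}/#{doc_id}")
  request = Net::HTTP::Put.new(uri.request_uri)
  request["Content-Type"] = "application/json"
  request.body = JSON.generate(doc)
  [uri, request]
end

uri, request = build_check_request(
  "http://localhost:5984", "http_checks", "example-check",
  { "url" => "http://example.com/", "content" => "Welcome" }
)

# Sending is one more line once a CouchDB is listening at the URL above:
#   Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
puts "#{request.method} #{uri} #{request.body}"
```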

Check out these and my other projects on GitHub and follow Cloudant and me on Twitter.

January 1, 2010

Fun with the CouchDB _changes feed and RabbitMQ.

I was recently introduced to yajl-ruby, Ruby bindings to the C-based YAJL JSON parsing/encoding library. Once I discovered that it can parse HTTP streams, it seemed like a perfect fit for use with CouchDB. A while back I wrote some code to push update notifications to RabbitMQ, and a commenter mentioned using the _changes feed instead. Combining the _changes feed and yajl-ruby’s HttpStream seemed like a good way to do it.

The _changes feed is a running list of all the documents that have changed in a database, listed in order by sequence number. This is similar to update notifications but gives more information, such as the document IDs, and is HTTP based (with multiple feed styles) rather than written to stdout. Additionally, you can create design document filters, which can be specified as a query parameter to give you only the parts of the feed you want. All in all, _changes is a pretty powerful feature.
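For illustration, a small helper that assembles a _changes URL. `feed`, `since` and `filter` are CouchDB query parameters; the host, database and filter names are placeholders.

```ruby
require "uri"

# Build a _changes URL. The filter value has the form "designdoc/filtername".
def changes_url(base, db, feed: "continuous", since: 0, filter: nil)
  query = { "feed" => feed, "since" => since }
  query["filter"] = filter if filter
  "#{base}/#{db}/_changes?#{URI.encode_www_form(query)}"
end

puts changes_url("http://localhost:5984", "test")
puts changes_url("http://localhost:5984", "test", since: 42, filter: "app/important")
```

The slash in the filter name is percent-encoded by `URI.encode_www_form`, which is fine because it is decoded again on the server side.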

Now for the fun stuff, the code. I used a few dependencies for this, chosen specifically to make it fast: EventMachine-based libraries for AMQP and HTTP requests. The first bit of code takes the _changes feed for the “test” database, parses the feed, uses the document ID to request that document and publishes it to the queue. One key item to note is that this code requires the latest yajl-ruby from GitHub to run properly. Additionally, this works nicely with feed=continuous, so it grabs the documents as they change without any need for polling.

Note that there is a variable for since. This allows you to start from a specific sequence number so you can skip over old changes.
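The parsing side of this can be sketched without any gems. The class below is an illustration, not the original code: it uses the standard JSON library instead of yajl-ruby's HttpStream, and the class name is made up. It accepts raw chunks from a feed=continuous response, yields each change and remembers the last sequence number so a restart can pass it back as since.

```ruby
require "json"

# Collects raw chunks from a feed=continuous response and yields one parsed
# change per complete line. Chunks may split a line at any point.
class ChangesBuffer
  attr_reader :last_seq

  def initialize
    @buffer = +""
    @last_seq = nil
  end

  def feed(chunk)
    @buffer << chunk
    # Consume every complete line; a trailing partial line stays buffered.
    while (newline = @buffer.index("\n"))
      line = @buffer.slice!(0..newline).strip
      next if line.empty? # blank lines are heartbeats

      change = JSON.parse(line)
      next unless change.key?("id") # skip lines that are not document changes

      @last_seq = change["seq"]
      yield change
    end
  end
end

buffer = ChangesBuffer.new
chunks = [
  %({"seq":1,"id":"doc-a","changes":[{"rev":"1-x"}]}\n{"seq":2,"id":"do),
  %(c-b","changes":[{"rev":"1-y"}]}\n\n)
]
chunks.each do |chunk|
  buffer.feed(chunk) { |change| puts "#{change["seq"]} #{change["id"]}" }
end
```

In a real subscriber the block would fetch the document by ID and publish it to the queue, and `buffer.last_seq` would be persisted somewhere for the next start.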

The next bit of code works from the other side of the queue. It subscribes to the queue, parses the JSON, performs some operations on it and puts the results back into another CouchDB database called “results”.
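A sketch of that consumer step, again using only the standard library and leaving out the queue subscription. The word count stands in for whatever real work you want to do, and the "body" field and the "-result" ID suffix are assumptions for the example.

```ruby
require "json"
require "net/http"

# Turn one queue message (a JSON document) into a PUT request for the
# "results" database. Assumes the document ID is URL-safe.
def result_request(payload)
  doc = JSON.parse(payload)
  result = {
    "source_id"  => doc["_id"],
    "word_count" => doc.fetch("body", "").split.size
  }
  request = Net::HTTP::Put.new("/results/#{doc["_id"]}-result")
  request["Content-Type"] = "application/json"
  request.body = JSON.generate(result)
  request
end

request = result_request('{"_id":"doc-a","body":"hello from the queue"}')
puts "#{request.method} #{request.path} #{request.body}"
```

The request would then be sent with `Net::HTTP.start(host, port) { |http| http.request(request) }` against the Couch holding the results database.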

What could it be used for? My first thought is some sort of parallel computation: boot up a few dozen EC2 nodes and start dumping data into CouchDB. Have all those nodes pop messages off the queue, process them and dump the results back into Couch. One could legitimately chain these together to process the results again. The queue ends up being a simple job management system, with the EC2 nodes popping new messages as they finish processing the previous ones. With a little bit of work, a few features and the right use case, I think this could be a pretty powerful system.

Check out the code and my other projects, and follow me on Twitter @williamsjoe.

[edit: made a slight improvement to changes_sub.rb on 20100107]

November 5, 2009

Baracus.

Just did my first official Cloudant blog post on a project I created called Baracus. It’s an httperf wrapper for benchmarking CouchDB. Check it out on GitHub.

September 19, 2009

Red Black Trees.

Been reading up on red-black trees, a kind of self-balancing binary search tree. Here are some resources I found interesting.