<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Joe's Blog! &#187; Dev</title>
	<atom:link href="http://www.joeandmotorboat.com/category/dev/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.joeandmotorboat.com</link>
	<description></description>
	<lastBuildDate>Wed, 28 Jul 2010 03:00:34 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>SurgeCon 2010</title>
		<link>http://www.joeandmotorboat.com/2010/07/27/surgecon-2010/</link>
		<comments>http://www.joeandmotorboat.com/2010/07/27/surgecon-2010/#comments</comments>
		<pubDate>Wed, 28 Jul 2010 03:00:34 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Dev]]></category>
		<category><![CDATA[Operations]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=1038</guid>
		<description><![CDATA[If you haven&#8217;t heard about Surge, it&#8217;s a new web operations conference presented by the smart folks at OmniTI. They have amassed a good list of speakers including guys like John Allspaw and Theo Schlossnagle. I also happen to have been invited to talk about the cloud, Cloudant and all sorts of good stuff.]]></description>
			<content:encoded><![CDATA[<p><img class="alignnone" title="surge" src="http://s.omniti.net/surge/i/present/logo-main.png" alt="" width="271" height="123" /></p>
<p>If you haven&#8217;t heard about <a href="http://omniti.com/surge/2010">Surge</a>, it&#8217;s a new web operations conference presented by the smart folks at OmniTI. They have amassed a good list of speakers including guys like John Allspaw and Theo Schlossnagle. I also happen to have been invited to talk about the cloud, <a href="https://cloudant.com/">Cloudant</a> and all sorts of good stuff. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2010/07/27/surgecon-2010/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Adding Health Checks to Deckard from Chef.</title>
		<link>http://www.joeandmotorboat.com/2010/07/19/adding-health-checks-to-deckard-from-chef/</link>
		<comments>http://www.joeandmotorboat.com/2010/07/19/adding-health-checks-to-deckard-from-chef/#comments</comments>
		<pubDate>Mon, 19 Jul 2010 20:52:09 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Dev]]></category>
		<category><![CDATA[Erlang]]></category>
		<category><![CDATA[Operations]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=1029</guid>
		<description><![CDATA[Recently, we (at Cloudant) open sourced Deckard, an HTTP content check monitoring system based on CouchDB. One of the best bits about using Couch is that it gives you a ReST API and with Deckard it can be used to add new health checks. Doing a simple PUT adds new URLs to monitor. At Cloudant we [...]]]></description>
			<content:encoded><![CDATA[<p>Recently, we (at <a href="https://cloudant.com/">Cloudant</a>) <a href="http://www.joeandmotorboat.com/2010/06/04/just-opensourced-gaff-and-deckard/">open sourced Deckard</a>, an HTTP content check monitoring system based on CouchDB. One of the best bits about using Couch is that it gives you a ReST API, and with Deckard that API can be used to add new health checks. Doing a simple PUT adds new URLs to monitor. At <a href="https://cloudant.com/">Cloudant</a> we love <a href="http://www.opscode.com/">Chef</a> and use it for everything. Chef has things called resources and providers. <a href="http://wiki.opscode.com/display/chef/Resources">Resources</a> are abstractions that describe the state you want a machine to be in. <a href="http://wiki.opscode.com/display/chef/Providers">Providers</a> perform the actions described by a resource. A good example is the <a href="http://wiki.opscode.com/display/chef/Resources#Resources-Package">package</a> resource: on CentOS it uses yum, while on Ubuntu it uses apt-get. The resource abstracts that away, letting the provider (and node) deal with the specifics of how to install the package. This keeps your recipes nice and DRY: you use the same code to install packages on all sorts of platforms. There are resources and providers for everything from installing packages to executing Erlang code via erl_call (one I wrote myself). One resource that works well with Deckard is the <a href="http://wiki.opscode.com/display/chef/Resources#Resources-HTTPRequest">HTTP request resource</a>; using it makes it very easy to add health checks from your cookbooks. We use something like the following code to add checks to new nodes at Cloudant:</p>
<p><script src="http://gist.github.com/481962.js"> </script></p>
<p>This code will add the document describing the check to the monitor_content_check database and then create a file so we can use &#8220;not_if&#8221; and Chef won&#8217;t attempt to add the check twice. Pretty cool stuff and even more reason that everything should have an API. Even cooler than this example would be to use Chef Search to do the same thing but I&#8217;ll save that for another blog post.</p>
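<p>Outside of Chef, the same idea is just a PUT of a JSON document to CouchDB. Here is a minimal sketch using only the Ruby standard library. The field names in the check document and the document ID are made up for illustration; Deckard's real schema may differ, so treat this as a sketch and not as Deckard's API.</p>

```ruby
require 'json'
require 'net/http'
require 'uri'

# Build (but do not send) a PUT request that stores a check document in CouchDB.
# The field names in `check` are hypothetical; Deckard's real schema may differ.
def build_check_request(couch_url, db, doc_id, check)
  uri = URI.parse("#{couch_url}/#{db}/#{doc_id}")
  request = Net::HTTP::Put.new(uri.request_uri)
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate(check)
  [uri, request]
end

# Hypothetical check document and document ID.
check = { 'url' => 'http://node1.example.com:5984/', 'content' => 'couchdb' }
uri, request = build_check_request('http://localhost:5984', 'monitor_content_check', 'node1', check)

puts "#{request.method} #{request.path}"

# To actually send it (requires a running CouchDB):
# Net::HTTP.start(uri.host, uri.port) { |http| puts http.request(request).code }
```

<p>Building the request separately from sending it makes it easy to inspect what would go over the wire before pointing it at a real database.</p>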
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2010/07/19/adding-health-checks-to-deckard-from-chef/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Just Opensourced: Gaff and Deckard</title>
		<link>http://www.joeandmotorboat.com/2010/06/04/just-opensourced-gaff-and-deckard/</link>
		<comments>http://www.joeandmotorboat.com/2010/06/04/just-opensourced-gaff-and-deckard/#comments</comments>
		<pubDate>Fri, 04 Jun 2010 21:18:50 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Dev]]></category>
		<category><![CDATA[Erlang]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=1021</guid>
		<description><![CDATA[This post was stolen from my original post on the Cloudant blog. Today we released two open source projects that have been in use internally at Cloudant for some time now, Gaff and Deckard. All of our infrastructure is in the cloud and as such we need a way for disparate systems to all request resources, [...]]]></description>
			<content:encoded><![CDATA[<p><em>This post was stolen from my original post on the </em><a href="http://blog.cloudant.com/just-opensourced-gaff-and-deckard"><em>Cloudant blog</em></a><em>.</em></p>
<p>Today we released two open source projects that have been in use internally at Cloudant for some time now, <a href="http://github.com/joewilliams/gaff">Gaff</a> and <a href="http://github.com/joewilliams/deckard">Deckard</a>.</p>
<p>All of our infrastructure is in the cloud, and as such we need a way for disparate systems to all request resources; this is where Gaff comes in. Gaff is a pubsub daemon for asynchronously talking to cloud APIs using AMQP. Currently it supports a subset of the Dynect (DNS), Slicehost and EC2 APIs and uses <a href="http://twitter.com/geemus">geemus</a>&#8217; awesome <a href="http://github.com/geemus/fog">fog</a> Ruby library. The basic workflow for Gaff is to send <a href="http://json-rpc.org/">JSON-RPC</a> formatted messages to an AMQP exchange with a routing key corresponding to the API you are talking to. You could be sending these messages from a web application or another service. Each message gets routed to an API-specific queue, picked up by Gaff and turned into the appropriate API call: starting, stopping or modifying your servers on EC2 or elsewhere.</p>
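<p>To make the workflow a little more concrete, here is a rough sketch of building such a message in Ruby. It assumes the JSON-RPC 1.0 request shape ("method", "params", "id"); the method name, parameters and routing key are invented for illustration and are not Gaff's actual API.</p>

```ruby
require 'json'

# Build a JSON-RPC 1.0 style request: "method", "params" and "id".
def json_rpc_message(method, params, id)
  JSON.generate('method' => method, 'params' => params, 'id' => id)
end

# Hypothetical method name, parameters and routing key, for illustration only.
routing_key = 'ec2'
payload = json_rpc_message('create_server', [{ 'image_id' => 'ami-12345678' }], 1)

puts "#{routing_key}: #{payload}"
# With an AMQP client you would publish `payload` to the exchange using `routing_key`.
```

<p>The publishing step is left as a comment because it depends on which AMQP client and exchange you use.</p>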
<p>We have a lot of CouchDB instances to keep tabs on; to do this we wrote Deckard. Deckard is an HTTP check monitoring system based on CouchDB. Yo dawg! What better than to monitor CouchDB with CouchDB (and some Ruby)? Deckard supports basic HTTP content checks, email alerts, SMS alerts (via email) for on-call rotations, basic maintenance scheduling, replication latency alerts (between two Couches) and even has EC2 Elastic IP support for failover between two EC2 instances. Best of all, since it&#8217;s based on Couch you get an API for free: just PUT a doc in the HTTP checks database and you get a new HTTP check the next time Deckard runs.</p>
<p><em>Check out these and my other projects on </em><a href="http://github.com/joewilliams"><em>GitHub</em></a><em> and follow </em><a href="http://twitter.com/cloudant"><em>Cloudant</em></a><em> and </em><a href="http://twitter.com/williamsjoe"><em>me</em></a><em> on Twitter.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2010/06/04/just-opensourced-gaff-and-deckard/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Availability, the Cloud and Everything</title>
		<link>http://www.joeandmotorboat.com/2010/05/31/availability-the-cloud-and-everything/</link>
		<comments>http://www.joeandmotorboat.com/2010/05/31/availability-the-cloud-and-everything/#comments</comments>
		<pubDate>Mon, 31 May 2010 17:37:42 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Dev]]></category>
		<category><![CDATA[Operations]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=1016</guid>
		<description><![CDATA[Finally posted my presentation at Erlang Factory, WTIA Cloud SIG and Seattle Scalability Meetup here on the blog. Availability, the Cloud and Everything View more presentations from logicalstack.]]></description>
			<content:encoded><![CDATA[<p>Finally posted my presentation at <a href="http://erlang-factory.com/conference/SFBay2010">Erlang Factory</a>, <a href="http://www.washingtontechnology.org/">WTIA Cloud SIG</a> and <a href="http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup/">Seattle Scalability Meetup</a> here on the blog.</p>
<div style="width:425px" id="__ss_3567217"><strong style="display:block;margin:12px 0 4px"><a href="http://www.slideshare.net/logicalstack/availability-the-cloud-and-everything" title="Availability, the Cloud and Everything">Availability, the Cloud and Everything</a></strong><object id="__sse3567217" width="425" height="355"><param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=availability-100326174512-phpapp02&#038;stripped_title=availability-the-cloud-and-everything" /><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><embed name="__sse3567217" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=availability-100326174512-phpapp02&#038;stripped_title=availability-the-cloud-and-everything" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="355"></embed></object>
<div style="padding:5px 0 12px">View more <a href="http://www.slideshare.net/">presentations</a> from <a href="http://www.slideshare.net/logicalstack">logicalstack</a>.</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2010/05/31/availability-the-cloud-and-everything/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Beyond BigData.</title>
		<link>http://www.joeandmotorboat.com/2010/05/31/beyond-bigdata/</link>
		<comments>http://www.joeandmotorboat.com/2010/05/31/beyond-bigdata/#comments</comments>
		<pubDate>Mon, 31 May 2010 16:54:23 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Dev]]></category>
		<category><![CDATA[Operations]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=979</guid>
		<description><![CDATA[BigData is a big deal. It&#8217;s changing how we look at data and analytics, but it isn&#8217;t the end. What are the enablers of BigData? First and foremost, cheap computing resources (CPU, disks, memory, bandwidth, etc) all thanks to Moore&#8217;s Law. Today even startups have the ability to afford huge amounts of computing power, the [...]]]></description>
			<content:encoded><![CDATA[<p>BigData is a big deal. It&#8217;s changing how we look at data and analytics, but it isn&#8217;t the end. What are the enablers of BigData? First and foremost, cheap computing resources (CPU, disks, memory, bandwidth, etc.), all thanks to <a href="http://en.wikipedia.org/wiki/Moore's_law">Moore&#8217;s Law</a>. Today even startups can afford huge amounts of computing power, the likes of which previously only the big boys could afford. Additionally, this has given rise to commodity hardware and cloud computing, which only furthers the proliferation of large amounts of cheap, quickly provisioned computing resources. Second, to apply all that power, we have open source data processing systems based on years of distributed systems research, like <a href="http://hadoop.apache.org/">Hadoop</a>, and many incarnations of <a href="http://en.wikipedia.org/wiki/Nosql">NoSQL</a>. The development of open source data processing systems has allowed the proliferation of systems that scale, which until recently only the highly capitalized could afford. These two things alone have allowed for the democratization of BigData. A guy in a garage can process terabytes of data with little more than a credit card and elbow grease.</p>
<p>With all these tools and recently acquired computing power, where are we going? Of course we can expect datasets to continue to grow, and the computational complexity of our data processing to increase, as well as compute power to continue to rise (GPGPUs, multicore and so on). In addition, I anticipate the emergence of something I&#8217;m calling <em>NewData</em>. NewData will build on what we currently have with BigData, but will include some trends just beginning to take off. First, the development of ubiquitous public APIs (<a href="http://stochasticresonance.wordpress.com/2009/04/01/meatcloud-manifesto/">Meatcloud Manifesto</a>). Public APIs have yet to proliferate to all online systems. As a consequence, there is still a lot of screen scraping going on. With easily query-able and parse-able datasets available through ubiquitous APIs, consuming the internet with machines becomes easier, making the application of BigData more powerful. <a href="http://developer.netflix.com/">Netflix</a> is a good example of this. Second and similarly enabling will be the development of standardized public datasets. Current datasets are generally hard to find and use; standardized dataset formats will make BigData analysis more productive, with less time wasted munging. <a href="http://www.data.gov/">Data.gov</a> is a start. These two developments are yet to be fully realized in current systems but will allow for the rise of NewData. As these developments begin to roll out we will begin to see changes to how our BigData systems look. NewData systems will be less concerned with how big the data is and what it looks like, and will instead emphasize deriving more information from the data. <a href="http://techcrunch.com/2010/03/16/big-data-freedom/">Bradford Cross gets this</a>, and as a result <a href="http://flightcaster.com/">FlightCaster</a> is an early example of what I mean by <em>NewData</em>.</p>
<blockquote><p>The scale of data and computations is an important issue, but the data age is less about the raw size of your data, and more about the cool stuff you can do with it.</p></blockquote>
<p>Asking the right questions of the data is important, especially if you&#8217;re trying to do cool stuff. The <a href="http://freakonomics.blogs.nytimes.com/">Freakonomics</a> guys proved this a few times over. NewData will be about creating value from data, and asking the right questions is worth as much as the answers. The key enablers of this will be using newfound APIs and datasets to combine data from disparate sources in ways that BigData couldn&#8217;t, asking questions that we wouldn&#8217;t have thought to ask of BigData. Where BigData was about a handful of datasets at most, NewData will be about dozens of datasets. The mashup is the cornerstone of NewData.</p>
<p>That being said, we will need new systems to process this data and enable us to ask these questions. NewData analysis will need inter-process communication and collaboration. Currently, systems like Hadoop process data by splitting the data up and processing chunks in parallel on hundreds to thousands of machines. Processes are isolated from the other processes. This will continue, but NewData will require more from these systems to ask deeper questions. Complex inter-process communication will be needed to ask these questions. Think of the simplicity of writing Map/Reduce jobs, the robustness of Hadoop, the workflow and dataflow of <a href="http://www.cascading.org/">Cascading</a> and <a href="http://research.microsoft.com/en-us/projects/dryadlinq/">DryadLINQ</a>, respectively, and the power of a message passing system like <a href="http://en.wikipedia.org/wiki/Message_Passing_Interface">MPI</a>. These jobs will likely include large in-memory collaborative computations across thousands of machines. Where data locality was key in BigData, both data and memory-locality (<a href="http://en.wikipedia.org/wiki/Non-Uniform_Memory_Access">NUMA/ccNUMA</a>) will be important in NewData.</p>
<p>It is clear that BigData still has some runway before NewData takes over. However, if the trends in the democratization of compute and processing continue (beyond Hadoop and EC2), and the opening of APIs and datasets proliferates online and off, NewData and its new questions, mashups, and systems are inevitable. Where having readily available compute resources and the software to use them defined BigData, NewData will be defined solely by asking the right questions, the algorithms to derive answers, and the systems used to produce them.</p>
<p><em>Thanks to <a href="http://twitter.com/mlmilleratmit">Mike Miller</a>, <a href="http://twitter.com/lusciouspear">Bradford Stephens</a> and my awesome wife <a href="http://twitter.com/xprimerw">Erin</a> for the help on this article.</em></p>
<p><strong><em>Follow me on <a href="http://twitter.com/williamsjoe">twitter</a>.<br />
</em></strong></p>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2010/05/31/beyond-bigdata/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fun with the CouchDB _changes feed and RabbitMQ.</title>
		<link>http://www.joeandmotorboat.com/2010/01/01/fun-with-the-couchdb-_changes-feed-and-rabbitmq/</link>
		<comments>http://www.joeandmotorboat.com/2010/01/01/fun-with-the-couchdb-_changes-feed-and-rabbitmq/#comments</comments>
		<pubDate>Fri, 01 Jan 2010 23:14:01 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Dev]]></category>
		<category><![CDATA[Erlang]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=962</guid>
		<description><![CDATA[I was recently introduced to yajl-ruby, ruby bindings to the C based yajl json parsing/encoding libraries. After discovering that it can parse HTTP streams it seemed like it would be a perfect fit for use with CouchDB. A while back I wrote some code to push update notifications to RabbitMQ and a commenter mentioned using [...]]]></description>
			<content:encoded><![CDATA[<p>I was recently <a href="http://ozmm.org/posts/2009_open_source_top_ten.html">introduced</a> to <a href="http://github.com/brianmario/yajl-ruby">yajl-ruby</a>, ruby bindings to the C based yajl json parsing/encoding libraries. After discovering that it can parse HTTP streams it seemed like it would be a perfect fit for use with <a href="http://couchdb.apache.org/">CouchDB</a>. A while back I wrote <a href="http://www.joeandmotorboat.com/2009/06/05/sending-couchdb-update-notifications-to-rabbitmq/">some code to push update notifications</a> to RabbitMQ and a commenter mentioned using the <a href="http://books.couchdb.org/relax/reference/change-notifications">_changes feed</a> instead. Combining the _changes feed and yajl-ruby&#8217;s HttpStream seemed like a good way to do it.</p>
<p>The _changes feed is a running list of all the documents that have changed in a database listed in order by sequence number. This is similar to update notifications but gives more information such as the document IDs and is HTTP based (with multiple feed styles) rather than stdout. Additionally you can create design document filters which can be specified as a query parameter to give you only the parts of the feed you want. All in all _changes is a pretty powerful feature.</p>
<p>Now for the fun stuff, the code. There are a few dependencies I used to do this, specifically focused on making it fast. As such I used EventMachine based libraries for <a href="http://github.com/tmm1/amqp/">AMQP</a> and <a href="http://github.com/igrigorik/em-http-request/">HTTP requests</a>. The first bit of code takes the _changes feed for the &#8220;test&#8221; database, parses the feed, uses the document ID to request that document and publish it to the queue. One key item to note is that this code <strong>requires the latest yajl-ruby</strong> from github to run properly. Additionally, this works nicely with <em>feed=continuous</em> so it grabs the documents as they are changed without a need for polling.</p>
<p><script src="http://gist.github.com/266991.js?file=changes_pub.rb"></script> Note that there is a variable for <em>since</em>, this allows you to start from a specific sequence number so you can skip over old changes.</p>
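<p>As a rough illustration of the <em>since</em> and <em>feed=continuous</em> parameters, here is a small standard-library sketch that builds a _changes URL and pulls the document ID out of one line of a continuous feed. This is not the gist above; the helper names are mine, and the sample line is a made-up example of the shape CouchDB emits.</p>

```ruby
require 'json'
require 'uri'

# Build a _changes URL; `since` skips changes up to that sequence number.
def changes_url(base, db, since: 0, filter: nil)
  params = { 'feed' => 'continuous', 'since' => since }
  params['filter'] = filter if filter
  "#{base}/#{db}/_changes?#{URI.encode_www_form(params)}"
end

# Each line of a continuous feed is one JSON object; return the document ID.
def doc_id_from_change(line)
  JSON.parse(line)['id']
end

url = changes_url('http://localhost:5984', 'test', since: 42)
sample = '{"seq":43,"id":"somedoc","changes":[{"rev":"1-abc"}]}'

puts url
puts doc_id_from_change(sample)
```

<p>Passing a <em>filter</em> (a design document filter name) adds it as one more query parameter.</p>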
<p>The next bit of code works from the other side of the queue. It subscribes to the queue, parses the JSON, performs some operations on it and puts the results back into another CouchDB database called &#8220;results&#8221;.  <script src="http://gist.github.com/266991.js?file=changes_sub.rb"></script></p>
<p>What could it be used for? My first thought is some sort of parallel computation: boot up a few dozen EC2 nodes and start dumping data into CouchDB. Have all those nodes pop messages off the queue, process them and dump the results back into Couch. One could even chain these together to process the results again. The queue ends up being a simple job management system, with the EC2 nodes popping new messages as they finish processing the previous ones. With a little bit of work, a few features and the right use case I think this could be a pretty powerful system.</p>
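<p>The queue-as-job-manager idea can be tried out in a single Ruby process before involving any real infrastructure. In this sketch a Queue stands in for RabbitMQ, threads stand in for EC2 nodes and a Hash stands in for the &#8220;results&#8221; database; the squaring step is only a placeholder for real processing.</p>

```ruby
# In-process stand-in for the pattern: a Queue plays the role of RabbitMQ,
# threads play the role of EC2 nodes, and a Hash plays the "results" database.
jobs = Queue.new
results = {}
lock = Mutex.new

(1..10).each { |n| jobs.push('id' => "doc#{n}", 'value' => n) }

workers = Array.new(3) do
  Thread.new do
    loop do
      begin
        job = jobs.pop(true) # non-blocking pop raises ThreadError when empty
      rescue ThreadError
        break
      end
      squared = job['value'] * job['value'] # placeholder "processing" step
      lock.synchronize { results[job['id']] = squared }
    end
  end
end

workers.each(&:join)
puts results.size
```

<p>Each worker simply takes the next job when it is done with the last one, which is all the scheduling the pattern needs.</p>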
<p>Check out the <a href="http://gist.github.com/266991">code</a>, <a href="http://github.com/joewilliams">my other projects</a> and follow me on twitter <a href="http://twitter.com/williamsjoe">@williamsjoe</a>.</p>
<p><em>[edit: made a slight improvement to changes_sub.rb on 20100107]</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2010/01/01/fun-with-the-couchdb-_changes-feed-and-rabbitmq/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Baracus.</title>
		<link>http://www.joeandmotorboat.com/2009/11/05/baracus/</link>
		<comments>http://www.joeandmotorboat.com/2009/11/05/baracus/#comments</comments>
		<pubDate>Thu, 05 Nov 2009 20:04:39 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Dev]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=949</guid>
		<description><![CDATA[Just did my first official Cloudant blog post on a project I created called Baracus. It&#8217;s an httperf wrapper for benchmarking CouchDB, check it out on github.]]></description>
			<content:encoded><![CDATA[<p>Just did my <a href="http://blog.cloudant.com/benchmarking-couchdb-with-baracus">first official Cloudant blog post</a> on a project I created called Baracus. It&#8217;s an <a href="http://www.hpl.hp.com/research/linux/httperf/">httperf</a> wrapper for benchmarking <a href="http://couchdb.apache.org/">CouchDB</a>, check it out on <a href="http://github.com/joewilliams/baracus">github</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2009/11/05/baracus/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Red Black Trees.</title>
		<link>http://www.joeandmotorboat.com/2009/09/19/red-black-trees/</link>
		<comments>http://www.joeandmotorboat.com/2009/09/19/red-black-trees/#comments</comments>
		<pubDate>Sat, 19 Sep 2009 16:34:16 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Dev]]></category>
		<category><![CDATA[Erlang]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=945</guid>
		<description><![CDATA[Been reading up on Red-black trees, a self-balancing binary tree. Here are some resources I found interesting. Multiple posts at Fuad AlTabba, with an erlang implementation. Ruby rbtree library (uses C). An implementation in Ruby. Trees in Erlang. Red-black trees in two hours, with a link to Chris Okasaki&#8217;s Red-Black Trees in a Functional Setting with implementation in [...]]]></description>
			<content:encoded><![CDATA[<p>Been reading up on <a href="http://en.wikipedia.org/wiki/Red_black_tree">Red-black trees</a>, a self-balancing binary tree. Here are some resources I found interesting.</p>
<ul>
<li>Multiple posts at <a href="http://www.altabba.org/">Fuad AlTabba</a>, with an <a href="http://www.cs.auckland.ac.nz/~fuad/rbtree.erl">erlang implementation</a>.</li>
<li>Ruby <a href="http://rubyforge.org/projects/rbtree/">rbtree library</a> (uses C).</li>
<li>An <a href="http://www.dmh2000.com/cjpr/RBRuby.html">implementation</a> in Ruby.</li>
<li><a href="http://mark.aufflick.com/blog/2007/11/30/trees-in-erlang">Trees in Erlang</a>.</li>
<li><a href="http://semanticvector.blogspot.com/2008/05/red-black-tree-in-2-hours.html">Red-black trees in two hours</a>, with a link to Chris Okasaki&#8217;s <a name="jfp99"><em>Red-Black Trees in a Functional Setting </em>with implementation in Haskell.</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2009/09/19/red-black-trees/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>HAProxy Stats Socket and fun with socat.</title>
		<link>http://www.joeandmotorboat.com/2009/08/20/haproxy-stats-socket-and-fun-with-socat/</link>
		<comments>http://www.joeandmotorboat.com/2009/08/20/haproxy-stats-socket-and-fun-with-socat/#comments</comments>
		<pubDate>Thu, 20 Aug 2009 22:22:40 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Dev]]></category>
		<category><![CDATA[Erlang]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=935</guid>
		<description><![CDATA[I&#8217;ve been debugging issues with HTTP, my backend servers and HAProxy. After a quick email to the HAProxy mailing list I found out about a configuration option stats socket PATH. This will create a socket you can send commands to and get more information out of HAProxy. To do this I just used some simple [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been debugging issues with HTTP, my backend servers and HAProxy. After a quick email to the HAProxy mailing list I found out about a configuration option <em>stats socket PATH</em>. This will create a socket you can send commands to and get more information out of HAProxy. To do this I just used some simple unix tools; the key one is <a href="http://www.dest-unreach.org/socat/">socat</a>. From the man page:</p>
<blockquote><p>
socat is a relay for bidirectional data transfer between two independent data channels. Each of these data channels may be a file, pipe, device (serial line etc. or a pseudo terminal), a socket (UNIX, IP4, IP6 &#8211; raw, UDP, TCP), an SSL socket, proxy CONNECT connection, a file descriptor (stdin etc.), the GNU line editor (readline), a program, or a combination of two of these. These modes include generation of &#8220;listening&#8221; sockets, named pipes, and pseudo terminals.
</p></blockquote>
<p>Here are a few examples of how to use the stats socket. First, you need to add <em>stats socket PATH</em> to your configuration and restart haproxy. You should then find a socket located at the path specified, I used <em>/tmp/haproxy</em>. Now you can send it commands to get more information and stats from HAProxy.<br />
<code><br />
echo "show stat" | socat unix-connect:/tmp/haproxy stdio<br />
</code></p>
<p>This will give you stats on all of your backends and frontends, some of the same stuff you see on the stats page enabled by the <em>stats uri</em> configuration. As an added bonus it&#8217;s all in CSV.<br />
<code><br />
echo "show errors" | socat unix-connect:/tmp/haproxy stdio<br />
</code></p>
<p><em>show errors</em> will give you a capture of last error on each backend/frontend.<br />
<code><br />
echo "show info" | socat unix-connect:/tmp/haproxy stdio<br />
</code></p>
<p>This will give you information about the running HAProxy process such as pid, uptime, etc.<br />
<code><br />
echo "show sess" | socat unix-connect:/tmp/haproxy stdio<br />
</code></p>
<p>This will dump (possibly huge) info about all known sessions.</p>
<p>For more details check out <a href="http://haproxy.1wt.eu/download/1.3/doc/configuration.txt">the docs</a> section 9 and <em>stats socket</em> in section 3.1.</p>
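<p>Since <em>show stat</em> returns CSV, it is easy to consume from a script. Here is a minimal Ruby sketch that turns the output into hashes keyed by column name. The sample data is trimmed and made up (real output has many more columns), and the helper name is my own.</p>

```ruby
require 'csv'

# Parse "show stat" output into an array of hashes keyed by column name.
# HAProxy prefixes the header line with "# ", which is stripped here.
def parse_show_stat(output)
  rows = CSV.parse(output.sub(/\A# /, ''))
  header = rows.shift
  rows.reject { |row| row.compact.empty? }.map { |row| Hash[header.zip(row)] }
end

# Trimmed, made-up sample; real output has many more columns.
sample = "# pxname,svname,scur,status\nwww,FRONTEND,3,OPEN\nwww,app1,1,UP\n"

parse_show_stat(sample).each do |row|
  puts "#{row['pxname']}/#{row['svname']}: #{row['status']} (#{row['scur']} sessions)"
end

# Against a live socket (path from the post), something like:
# require 'socket'
# UNIXSocket.open('/tmp/haproxy') { |s| s.puts 'show stat'; parse_show_stat(s.read) }
```

<p>The live-socket part is left commented out because it needs a running HAProxy with <em>stats socket</em> configured.</p>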
<p><strong>Bonus socat fun.</strong></p>
<p>socat is a more fully featured cousin of <a href="http://netcat.sourceforge.net/">netcat</a>. Both can be used in similar ways; one thing I use them for occasionally is debugging REST calls and the like. This was a real help when working with an API that didn&#8217;t have a library: I could test things out without needing to make erroneous calls to the API. In the simplest case you can have either of them listen on a port and output all the details of the request. To do this with socat run:</p>
<p><code>socat tcp-listen:8000 stdio</code></p>
<p>This will listen for connections on port 8000. Doing the same thing with netcat is easy as well:</p>
<p><code>netcat -l -p 8000</code></p>
<p>For instance you can see the output from creating a document in CouchDB.</p>
<p>In one terminal:<br />
<code><br />
$ irb<br />
irb(main):001:0> require 'rubygems'<br />
=> true<br />
irb(main):002:0> require 'rest_client'<br />
=> true<br />
irb(main):003:0> RestClient.put("http://localhost:8000/somedb/somedoc", "{\"somekey\": \"somevalue\"}", :content_type => "application/json")<br />
</code></p>
<p>In another run your mock server:<br />
<code><br />
$ socat tcp-listen:8000 stdio<br />
PUT /somedb/somedoc HTTP/1.1<br />
Accept: application/xml<br />
Content-Type: application/json<br />
Accept-Encoding: gzip, deflate<br />
Content-Length: 24<br />
Host: localhost:8000</p>
<p>{"somekey": "somevalue"}<br />
</code></p>
<p>Oh! By the way, if you install netcat from source, don&#8217;t compile with <em>-DGAPING_SECURITY_HOLE</em> unless you know what you are doing. <img src='http://www.joeandmotorboat.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2009/08/20/haproxy-stats-socket-and-fun-with-socat/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>tens3 : dead simple s3 backups</title>
		<link>http://www.joeandmotorboat.com/2009/07/29/tens3-dead-simple-s3-backups/</link>
		<comments>http://www.joeandmotorboat.com/2009/07/29/tens3-dead-simple-s3-backups/#comments</comments>
		<pubDate>Wed, 29 Jul 2009 16:16:53 +0000</pubDate>
		<dc:creator>joe</dc:creator>
				<category><![CDATA[Dev]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://www.joeandmotorboat.com/?p=933</guid>
		<description><![CDATA[I recently needed some simple scripts to back up files on various machines, stuff like configs and even some small CouchDB files. Not finding something already out there I put together tens3, two simple scripts to get and put files to Amazon S3. They provide the following: uses s3 to back up a directory of files (no [...]]]></description>
			<content:encoded><![CDATA[<p>I recently needed some simple scripts to back up files on various machines, stuff like configs and even some small CouchDB files. Not finding something already out there I put together <a href="http://github.com/joewilliams/tens3/tree/master">tens3</a>, two simple scripts to get and put files to Amazon S3. They provide the following:</p>
<ul>
<li>uses s3 to back up a directory of files (no subdirectories)</li>
<li>uses fadvise to be easy on filesystem caches and disks</li>
<li>purges files after X days</li>
<li>streams files rather than loading them entirely into memory</li>
</ul>
<p>They are very simple to use, just put a configuration file together:<br />
<code><br />
amazon_access_key_id: "someid"<br />
amazon_secret_access_key: "somekey"<br />
backup_dir: "/some/path/"<br />
purge_threshold: 3<br />
bucket_name: "somebucket"<br />
</code></p>
<p>Back up a directory of files:</p>
<p><code>$ ./tens3_put tens3.conf</code></p>
<p>Restore a file from a backup:</p>
<p><code>$ ./tens3_get tens3.conf date somefile ./somefile</code></p>
<p>The date is the date that the file was backed up in a YYYYMMDD format.</p>
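<p>For reference, here is a small Ruby sketch of the YYYYMMDD stamp and of a purge check based on <em>purge_threshold</em>. It is an illustration of the idea, not the code in tens3; in particular, whether a backup exactly <em>purge_threshold</em> days old is purged is an assumption on my part.</p>

```ruby
require 'date'

# Format a date the way the restore command expects it: YYYYMMDD.
def backup_stamp(date)
  date.strftime('%Y%m%d')
end

# True when a backup stamp is more than `purge_threshold` days older than `today`.
# (Assumption: a backup exactly `purge_threshold` days old is kept.)
def purge?(stamp, today, purge_threshold)
  (today - Date.strptime(stamp, '%Y%m%d')).to_i > purge_threshold
end

today = Date.new(2009, 7, 29)
puts backup_stamp(today)
puts purge?('20090725', today, 3)
puts purge?('20090727', today, 3)
```

<p>With a threshold of 3, the backup from four days earlier is flagged for purging and the one from two days earlier is kept.</p>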
<p>Enjoy and let me know if you find any bugs or want new features.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.joeandmotorboat.com/2009/07/29/tens3-dead-simple-s3-backups/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
