January 27, 2009

CouchDB Load Balancing and Replication using HAProxy.

Last night, I decided to dig into CouchDB a bit more than I have in the past and set up a simple load-balanced, replicated configuration using HAProxy. In the end it was a pretty easy feat and seems to work fairly well. Here’s what I had to do.

First, I set up three instances of CouchDB on the same machine, using different configuration files, PIDs and loopback addresses for each. This can certainly be exchanged for three different machines. If you run them on the same machine, make sure you adjust DbRootDir, BindAddress and LogFile in each configuration file, and use a command like the following to start things up. This makes sure the non-default configuration and PID location are used.

./couchdb -c SOME_PATH/couchdb2.ini -p SOME_PATH/couchdb2.pid

As you may already know, CouchDB has a nice web interface called Futon (http://HOSTNAME:5984/_utils/). Using Futon I created a database with the same name on all three instances. I then chose couchdb1 as my “master”, with couchdb2 and couchdb3 as “slaves”. I put master and slave in quotes because there isn’t this type of relationship in CouchDB as far as I can tell. All instances can replicate to each other as long as they can connect to each other, so master-slave replication is simply the type of configuration I am enforcing with HAProxy and my replication POST commands. More on these bits later. I then created a document on my master node and, using Futon’s replicator, replicated the changes to the other nodes. Next I wanted to find a way to automate or schedule this. You can initiate replication simply by sending a POST request to CouchDB, so I wrote a simple curl script to do just that.

First I created the replication POST body in a file:

{"source":"test_rep","target":"http://couchdb2:5984/test_rep"}

When run against the master this will replicate the master to couchdb2. I wrote a similar file for couchdb3 as well.

Then using curl I can send this body to the master:

curl -X POST --data @couchdb1_2_rep http://couchdb1:5984/_replicate
curl -X POST --data @couchdb1_3_rep http://couchdb1:5984/_replicate

After running these you should see some output that starts with {"ok":true,"session_id …}, which means things went well. You should also see some output in the logs on both instances. These commands can be put in a cron job to run at specific intervals to keep the slaves updated. You can also create a script and configure DbUpdateNotificationProcess to replicate after each update. The latter is probably a nicer solution, but cron and curl should get you started.
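The same two requests could also be wrapped in a small script instead of keeping a body file per slave. The following is only a rough Ruby sketch of that idea. The helper names (replication_body, replication_request, replicate) are made up for illustration and are not part of CouchDB, the Content-Type header is an assumption (the curl commands above do not set it explicitly), and the host names and the test_rep database are the ones used above.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: JSON body asking the receiving node to push
# its local database `db` to the same database on `target_host`.
def replication_body(db, target_host, port = 5984)
  JSON.generate('source' => db, 'target' => "http://#{target_host}:#{port}/#{db}")
end

# Hypothetical helper: build (but do not send) the POST for the
# master's _replicate URL. Returns the URI and the request object.
def replication_request(master_host, db, target_host, port = 5984)
  uri = URI("http://#{master_host}:#{port}/_replicate")
  request = Net::HTTP::Post.new(uri)
  request['Content-Type'] = 'application/json' # assumption, see note above
  request.body = replication_body(db, target_host, port)
  [uri, request]
end

# Hypothetical helper: send the request and return CouchDB's parsed reply.
def replicate(master_host, db, target_host)
  uri, request = replication_request(master_host, db, target_host)
  response = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(request) }
  JSON.parse(response.body)
end
```

Under those assumptions, calling replicate('couchdb1', 'test_rep', 'couchdb2') and then the same for couchdb3 from a script run by cron would do the same job as the two curl lines.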

I then moved on to setting up HAProxy to load balance between the nodes. Since I wanted a master-slave relationship between the nodes I needed to set HAProxy to only send POSTs, PUTs and DELETEs to the master and GET requests to the two slaves. After checking the docs and playing with a couple different ACL configurations I didn’t find a solution. I then contacted the mailing list for some advice and conveniently a solution was sent back to me quickly. They also told me about another piece of documentation I didn’t find initially. My configuration for HAProxy is pretty basic but it shows what needs to be done.

global
    maxconn 4096
    nbproc 2

defaults
    mode http
    clitimeout 150000
    srvtimeout 30000
    contimeout 4000
    balance roundrobin
    stats enable
    stats uri /haproxy?stats

frontend couchdb_lb
    bind localhost:8080

    acl master_methods method POST DELETE PUT
    use_backend master_backend if master_methods
    default_backend slave_backend

backend master_backend
    server couchdb1 couchdb1:5984 weight 1 maxconn 512 check

backend slave_backend
    server couchdb2 couchdb2:5984 weight 1 maxconn 512 check
    server couchdb3 couchdb3:5984 weight 1 maxconn 512 check

The part that enforces where the PUTs, DELETEs and POSTs go is the ACL definition and it basically says that if HAProxy receives a POST, DELETE or PUT then use the master node otherwise use a slave.
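To make that rule concrete, the decision the frontend makes can be mimicked in a few lines of Ruby. This only illustrates the logic and is not how HAProxy is implemented: backend_for and server_picker are made-up names, and the round-robin here ignores weights, health checks and connection limits.

```ruby
# Hypothetical helper: which backend the frontend rules above would choose.
# POST, DELETE and PUT match the master_methods ACL; anything else falls
# through to default_backend.
def backend_for(http_method)
  %w[POST DELETE PUT].include?(http_method) ? :master_backend : :slave_backend
end

# Hypothetical helper: returns a lambda that picks a server for each request,
# rotating through the slaves in plain round-robin order.
def server_picker(master = 'couchdb1', slaves = %w[couchdb2 couchdb3])
  rotation = slaves.cycle
  lambda do |http_method|
    backend_for(http_method) == :master_backend ? master : rotation.next
  end
end

pick = server_picker
%w[GET PUT GET GET DELETE].each do |http_method|
  puts "#{http_method} -> #{pick.call(http_method)}"
end
```

Note that under this rule any method not named in the ACL, HEAD for example, takes the default branch and ends up on a slave.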

Once done I started up HAProxy and tested it out and found that it worked out nicely with GETs going to the slaves in roundrobin fashion and PUTs, DELETEs and POSTs going to the master. I then made a slight change to my curl command from earlier to have the replication POSTs go through HAProxy just to make sure.

curl -X POST --data @couchdb1_2_rep http://localhost:8080/_replicate
curl -X POST --data @couchdb1_3_rep http://localhost:8080/_replicate

If things are working properly you should find that the replication POST commands only go to the master node and the GET commands go to the two slaves.

CouchDB is pretty easy to get going and fun to work with. Hopefully this will help you get going.

January 3, 2009

Nginx vs Yaws vs MochiWeb : Web Server Performance Deathmatch, Part 2 [Update x 2]

Update 1: Retest data (using different machine and Erlang kernel polling) added near bottom of post.

Update 2: More details and testing on the weird MochiWeb kernel polling results, bottom of post.

Almost a year ago I did some Apache and Nginx performance testing. Apparently I have the bug again and have done some performance testing on Nginx, Yaws and MochiWeb, the latter two being Erlang based. Again, deathmatch may be an overstatement, but this is my attempt at gleaning some interesting performance data from some high performance web servers. Also, I attempted to improve the graphs this time around since they were a bit hard to read the last time.

The Setup:

I was not able to use the same server and setup as the last time, so comparing between this and my last deathmatch probably isn’t very accurate. For this test I used an Intel Dual Core 2.2GHz, 4GB RAM machine running Ubuntu 8.10 (64bit) as the test server. Erlang (R12B-3), Yaws (1.77) and Nginx (0.6.32) are installed from the standard repository and MochiWeb from subversion (rev 88). All are using the default configurations outside of adjusting listening port numbers. The test is again against a basic robots.txt file. The tests were done using a consumer grade 100Mb switch and all tests originated from an old laptop I had lying around. I think that about covers the test bed; if you have any questions let me know.

For the tests I used autobench (httperf under the hood). Each test ran ten minutes apart, in the order MochiWeb, then Yaws and lastly Nginx, using the following command.

autobench --single_host --host1 HOST --port1 PORT --uri1 /robots.txt --low_rate 10 --high_rate 200 --rate_step 10 --num_call 10 --num_conn 5000 --timeout 5 --file SERVER-results-`date +%F-%H:%M:%S`.tsv
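One thing worth spelling out about that command is the load it actually asks for. Assuming autobench treats the rate options as connections per second and num_call as the number of requests sent on each connection (as httperf does), the demanded request rate at each step is the product of the two. A quick Ruby sketch of the arithmetic under that assumption, with demanded_request_rates being a made-up helper name:

```ruby
# Hypothetical helper: request rates (requests per second) demanded at each
# step, assuming rate = connections/second and num_call = requests/connection.
def demanded_request_rates(low_rate, high_rate, rate_step, num_call)
  (low_rate..high_rate).step(rate_step).map { |conn_rate| conn_rate * num_call }
end

# Values taken from the autobench command above.
rates = demanded_request_rates(10, 200, 10, 10)
puts "#{rates.length} steps, from #{rates.first} to #{rates.last} requests per second"
```

With the values used here that works out to 20 steps, from 100 up to 2000 requests per second, which lines up with the 2000 requests figure quoted in the kernel polling results further down.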

The Results:

There are a few results from httperf/autobench that I would like to show: errors, network I/O, reply rate (and its standard deviation) and response time.

[Graph: Nginx, Yaws and MochiWeb errors]

MochiWeb and Yaws both seem to be the most consistent here. Nginx had a couple of funky spikes; I do not know if this was an issue with Nginx or with my tests and/or test bed. Take from it what you will.

[Graph: MochiWeb, Yaws and Nginx network I/O]

Nginx seems to use a bit more network I/O consistently through the lower ranges of this test, and then again in some spikes. MochiWeb and Yaws seem to have some inconsistencies as well.

[Graph: MochiWeb, Yaws and Nginx reply rate]

The reply rate and network I/O graphs certainly seem to be tied, which would make sense. Edit: Average reply rate is average replies per second.

[Graph: MochiWeb, Yaws and Nginx reply rate standard deviation]

In the higher reaches of the tests Yaws seems to be most consistent.

[Graph: MochiWeb, Yaws and Nginx response time]

MochiWeb seems to have consistently the highest response times, while Nginx has the lowest. This also follows the data from the first deathmatch, where Nginx had consistently low response times against Apache. Edit: Response time is how quickly replies are sent, in milliseconds.

Next up are the system graphs: CPU usage (both cores combined), context switches, interrupts and load. To help read these, recall that each test ran ten minutes apart and that the order of the tests was MochiWeb, then Yaws and lastly Nginx. The data was gathered using sar at five minute intervals and graphed using ksar.

[Graph: Nginx, Yaws and MochiWeb CPU usage]

It seems Nginx is the clear winner here. Erlang’s kernel polling being disabled may be the explanation; a retest may be in order to see if it makes a difference.

[Graph: Nginx, Yaws and MochiWeb context switches]

MochiWeb and Nginx seem pretty even on context switches with Yaws a little higher. I suppose turning on kernel polling might make this a bit more even, since Erlang and Nginx both use epoll. This may also account for the CPU usage difference above.

[Graph: Nginx, Yaws and MochiWeb interrupts]

Interrupts are fairly even across all of them.

[Graph: Nginx, Yaws and MochiWeb load]

Again Nginx takes it, again likely due to kernel polling being disabled. That’s my best guess anywho.

The data I used to create the graphs is available here.

Let me know if you are interested in me retesting anything, I may try to enable kernel polling and try again if I get a chance.

Note that these are *my* experiences with each webserver; your testing and experiences may be different. As with most things there are pros, cons, trade-offs and pitfalls. The only way to find out what will work best for your environment is to test, test and test.

Update:

I performed the upper half of the tests again to see if there were any changes to the sporadic jumps in the HTTP performance graphs. In my initial retest using the old laptop I saw the same results. I then ran the tests from a VM (running Ubuntu 8.10 in a KVM VM) on my dual core machine and found that the results were much more even. Unfortunately it’s the same machine that the webservers are running on, but the results look much better. The first set uses the same setup as before, just adjusted to run only the top half of the test. The second is the same test but with kernel polling turned on in Erlang.

[Graph: Nginx, Yaws and MochiWeb reply rate]

All of them are very even and close, no real winners here.

[Graph: Nginx, Yaws and MochiWeb response time]

Looks like Nginx is the clear winner with Yaws next, followed by MochiWeb.

[Graph: Nginx, Yaws and MochiWeb CPU usage]

Pretty much the same as last time (likely a little higher across the board due to running the tests in a VM on the same machine). Note that Nginx is a system process, so for Yaws and MochiWeb follow the blue line and for Nginx follow the green.

[Graph: Nginx, Yaws and MochiWeb context switches]

About the same as before, other than being higher due to running a VM.

[Graph: Nginx, Yaws and MochiWeb load]

Pretty much the same as before again, Nginx seems the lowest.

Now for the tests with kernel polling enabled in Erlang (erl +K true).

[Graph: Nginx, Yaws and MochiWeb reply rate with kernel polling]

With kernel polling on, it looks like Yaws actually performs better in the reply rate test, with MochiWeb performing worse and Nginx in the middle.

[Graph: Nginx, Yaws and MochiWeb response time with kernel polling]

In the response time test a huge change is noted: MochiWeb goes from a response time of roughly 14 ms at 2000 requests to roughly 65 ms. Also of note, Yaws performs much better, matching or beating Nginx.

[Graph: Nginx, Yaws and MochiWeb CPU usage with kernel polling]

With kernel polling in the Erlang webservers Nginx still seems to come out on top for CPU usage.

[Graph: Nginx, Yaws and MochiWeb context switches with kernel polling]

Following the performance trend we saw above Yaws sees a drop in context switches and MochiWeb increases.

[Graph: Nginx, Yaws and MochiWeb load with kernel polling]

Load-wise things stay roughly the same with Nginx being the lowest.

While it certainly seems that the old laptop I did the original tests on is too slow or has a network issue, hopefully these new tests give us some more clarity. It seems that Yaws improves with kernel polling enabled and competes well with Nginx. MochiWeb, on the other hand, apparently has issues with kernel polling, and its performance actually degrades. If anyone has more info on the internals of MochiWeb and possible causes, I would certainly be interested.

If anyone would like the data from the second round of tests it is available here.

Update 2:

I did some more testing to see what the issue might be with MochiWeb, response times and kernel polling. I did a few tests with different versions of Erlang, with and without kernel polling, and testing from inside and outside a KVM VM. From what I can tell the issue seems to be isolated to testing from within a VM with MochiWeb and kernel polling. It seems sorta strange, but all my testing and retesting shows the same issue. Just to be clear on my setup, I am running httperf from within a VM against MochiWeb running outside the VM. Here is the latest round of testing to show this point.

[Graph: Nginx, Yaws and MochiWeb response time with kernel polling, tested from a KVM VM]

Even though the numbers are higher from within the VM without kernel polling, it certainly seems to be an issue with the combination of MochiWeb, KVM and kernel polling. Since I did not see the same spike from within a VM in the earlier tests with Yaws and kernel polling, I assume it is not an issue with Erlang or its kernel polling mechanism conflicting with KVM. I am not entirely sure what to make of this other than that MochiWeb, kernel polling and KVM don’t play well together, and that kernel polling actually helps MochiWeb significantly when KVM is not involved. If anyone has any ideas on why that may be, I am all ears.

October 13, 2008

SSH and Ruby

The last couple of days I have been a bit distracted from the Erlang stuff I have been doing lately and somehow ended up playing with Ruby and the Net::SSH library. It would work really well for running commands on a bunch of machines at once. Here’s some code I wrote and paraphrased from various sources.

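require 'rubygems'
require 'net/ssh'

username = "yourusername"
hostnames = ["node01", "node02"]
script = "date;uptime;"

# Run the script on each host in turn, printing whatever comes back.
hostnames.each do |hostname|
  Net::SSH.start(hostname, username) do |session|
    session.open_channel do |channel|
      # stdout of the remote command
      channel.on_data { |chan, output| puts output.inspect }
      # stderr (extended data) of the remote command
      channel.on_extended_data { |chan, type, output| print output }
      channel.exec script
    end
    # process events until the channel has finished
    session.loop
  end
end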

This will run the commands contained in the script variable on the hosts in the hostnames array as the specified user. As it stands it does not supply a password, so you’ll need keys set up. Adding your password is pretty simple; just check out the API here.
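If keys are not an option, a password can be passed through the options hash that Net::SSH.start accepts. The following is a minimal sketch along those lines rather than a tested recipe: ssh_options and run_on are hypothetical helper names, it assumes a net-ssh version that provides session.exec!, and the net/ssh library is only loaded when run_on is actually called.

```ruby
# Hypothetical helper: options for Net::SSH.start. With no password the hash
# is empty and net-ssh uses its default authentication methods, such as keys.
def ssh_options(password = nil)
  password ? { password: password } : {}
end

# Hypothetical helper: run a script on one host and return what it printed.
def run_on(hostname, username, script, password = nil)
  require 'net/ssh' # loaded here so ssh_options works without the gem
  output = nil
  Net::SSH.start(hostname, username, ssh_options(password)) do |session|
    # exec! blocks until the remote command has finished
    output = session.exec!(script)
  end
  output
end

# Example call (needs real hosts, so it is left commented out):
# %w[node01 node02].each do |host|
#   puts run_on(host, 'yourusername', 'date;uptime;', 'yourpassword')
# end
```

Hard-coding a password in a script is rarely a good idea, so reading it from a prompt or the environment would be a sensible next step.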

September 24, 2008

Ubuntu Ibex Alpha 6 Intel GigE Adapter Bug.

Don’t use the latest Ibex Alpha 6 if you run an Intel Gigabit ethernet card. There is currently a bug (here too) that will screw with the firmware and render the card inoperable by making the checksum fail. This applies to the e1000e driver but can cause issues if you have used e1000 in the past. The image download page says:

Due to an unresolved bug in the Linux kernel included in these images, they should not be used on Intel ethernet hardware supported by the e1000e driver (Intel GigE). Doing so may render your network hardware permanently inoperable.

Older Intel ethernet hardware which uses the e1000 driver is not affected by this; however, some hardware which used the e1000 driver in previous Ubuntu releases, such as hardware that uses a PCI Express bus, has been moved from e1000 to e1000e in the latest kernel releases. If in doubt, do not use these images, and subscribe to https://bugs.launchpad.net/ubuntu/+source/linux/+bug/263555 to be notified when the bug is fixed.

Yikes! Hope they have it fixed by the 10th.

Update: More info can be found in a discussion on the kernel mailing list.

Update #2: Looks like the main bug report is here. Looks like they are getting close to a resolution.

Update #3: Seems that a fix has been released and the final release of Ibex (8.10) will be out Oct 30th.

September 8, 2008

Disco.

Something I happened to see over here this weekend was Disco. It is a Map/Reduce framework written in Erlang. A user/implementer doesn’t need to know a lick of Erlang to get rolling; according to their site most folks use Python to write the actual jobs. If you ask me, a Map/Reduce framework built using Erlang makes a great amount of sense due to its message passing and lightweight processes.