January 3, 2009

Nginx vs Yaws vs MochiWeb : Web Server Performance Deathmatch, Part 2 [Update x 2]

Update 1: Retest data (using different machine and Erlang kernel polling) added near bottom of post.

Update 2: More details and testing on the weird MochiWeb kernel polling results, bottom of post.

Almost a year ago I did some Apache and Nginx performance testing. Apparently I have the bug again, and have done some performance testing on Nginx, Yaws and MochiWeb, the latter two being Erlang based. Again, deathmatch may be an overstatement, but this is my attempt at gleaning some interesting performance data from some high performance web servers. I also attempted to improve the graphs this time around, since they were a bit hard to read last time.

The Setup:

I was not able to use the same server and setup as the last time, so comparing between this and my last deathmatch probably isn’t very accurate. For this test I used an Intel Dual Core 2.2GHz, 4GB RAM machine running Ubuntu 8.10 (64bit) as the test server. Erlang (R12B-3), Yaws (1.77) and Nginx (0.6.32) are installed from the standard repository and MochiWeb from subversion (rev 88). All are using the default configurations outside of adjusting listening port numbers. The test is again against a basic robots.txt file. The tests were done using a consumer grade 100 Mbit switch and all tests originated from an old laptop I had lying around. I think that about covers the test bed; if you have any questions let me know.

For the tests I used autobench (httperf under the hood) with the following command, with each test run ten minutes apart. The tests were run in this order: MochiWeb, then Yaws, and lastly Nginx.

autobench --single_host --host1 HOST --port1 PORT --uri1 /robots.txt --low_rate 10 --high_rate 200 --rate_step 10 --num_call 10 --num_conn 5000 --timeout 5 --file SERVER-results-`date +%F-%H:%M:%S`.tsv
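
As a back-of-the-envelope check on what that command asks of the servers, here is a minimal Python sketch. It assumes autobench's rate options count new connections per second (so the demanded request rate is the connection rate times num_call) and that the range includes high_rate; the function name autobench_schedule is made up for illustration.

```python
def autobench_schedule(low_rate, high_rate, rate_step, num_call, num_conn):
    """Return one dict per test step describing the demanded load.

    Assumption: rates are connections per second and each connection
    issues num_call requests, as the option names suggest.
    """
    steps = []
    for conn_rate in range(low_rate, high_rate + 1, rate_step):
        steps.append({
            "conn_rate": conn_rate,                 # new connections per second
            "req_rate": conn_rate * num_call,       # demanded requests per second
            "total_requests": num_conn * num_call,  # requests sent during this step
            "duration_s": num_conn / conn_rate,     # rough time to open all connections
        })
    return steps


# Values taken from the autobench command above.
schedule = autobench_schedule(low_rate=10, high_rate=200, rate_step=10,
                              num_call=10, num_conn=5000)
print(len(schedule), schedule[0]["req_rate"], schedule[-1]["req_rate"])
```

Under those assumptions the run has 20 steps, from 100 up to 2000 requests per second, with 50,000 requests sent at each step.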

The Results:

There are a few results from httperf/autobench that I would like to show: errors, network I/O, reply rate (and its standard deviation) and response time.
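
For anyone wanting to pull those numbers out of a results file, a small Python sketch follows. The column names are an assumption about how autobench labels its TSV output (a metric prefix followed by the host name), the helper names are hypothetical, and the sample values are invented purely to make the example runnable.

```python
import csv
import io


def load_results(tsv_text):
    """Parse autobench-style TSV text into a list of rows with float values."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [{name: float(value) for name, value in row.items()} for row in reader]


def column(rows, prefix):
    """Return the values of the first column whose name starts with prefix."""
    name = next(n for n in rows[0] if n.startswith(prefix))
    return [row[name] for row in rows]


# Invented sample values for illustration only; not data from these tests.
# The header names are assumed, not copied from a real results file.
SAMPLE = (
    "dem_req_rate\tavg_rep_rate_HOST\tstddev_rep_rate_HOST\t"
    "resp_time_HOST\tnet_io_HOST\terrors_HOST\n"
    "100\t100.0\t0.0\t1.5\t30.2\t0\n"
    "200\t199.8\t0.4\t1.7\t60.1\t0\n"
)

rows = load_results(SAMPLE)
print(column(rows, "resp_time"))
```

Matching on a prefix rather than the full column name keeps the helper usable whatever host name ends up in the header.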

[Graph: errors for Nginx, Yaws and MochiWeb]

MochiWeb and Yaws both seem to be the most consistent here. Nginx had a couple of funky spikes; I do not know if this was an issue with Nginx or with my tests and/or test bed. Take from it what you will.

[Graph: network I/O for MochiWeb, Yaws and Nginx]

Nginx seems to use a bit more network I/O consistently through the lower ranges of this test and then again in some spikes. MochiWeb and Yaws seem to have some inconsistencies as well.

[Graph: reply rate for MochiWeb, Yaws and Nginx]

The reply rate and network I/O graphs certainly seem to be tied, which would make sense. Edit: Average reply rate is average replies per second.

[Graph: reply rate standard deviation for MochiWeb, Yaws and Nginx]

In the higher reaches of the tests Yaws seems to be most consistent.

[Graph: response time for MochiWeb, Yaws and Nginx]

MochiWeb seems to have consistently the highest response times, with Nginx having the lowest. This also follows the data from the first deathmatch, where Nginx had consistently low response times against Apache. Edit: Response time is how quickly replies are sent, in milliseconds.

Next up are the system graphs: CPU usage (both cores combined), context switches, interrupts and load. To help read these, please recall that each test ran ten minutes apart and the order of the tests was MochiWeb, then Yaws, and lastly Nginx. The data was gathered using sar at five minute intervals and graphed using ksar.

[Graph: CPU usage for Nginx, Yaws and MochiWeb]

It seems Nginx is the clear winner here. Kernel polling, which was not enabled in Erlang for these tests, may be the explanation; a retest may be in order to see if it makes a difference.

[Graph: context switches for Nginx, Yaws and MochiWeb]

MochiWeb and Nginx seem pretty even on context switches with Yaws a little higher. I suppose turning on kernel polling might make this a bit more even, since Erlang and Nginx both use epoll. This may also account for the CPU usage difference above.

[Graph: interrupts for Nginx, Yaws and MochiWeb]

Interrupts are fairly even across all of them.

[Graph: load for Nginx, Yaws and MochiWeb]

Again Nginx takes it, again likely due to kernel polling being disabled in Erlang. That’s my best guess anywho.

The data I used to create the graphs is available here.

Let me know if you are interested in me retesting anything. I may try to enable kernel polling and try again if I get a chance.

Note that these are *my* experiences with each webserver; your testing and experiences may be different. As with most things there are pros, cons, trade-offs and pitfalls. The only way to find out what will work best for your environment is to test, test and test.

Update:

I performed the upper half of the tests again to see if there were any changes to the sporadic jumps in the HTTP performance graphs. In my initial retest using the old laptop I saw the same results. I then ran the tests from a VM (running Ubuntu 8.10 in a KVM VM) on my dual core machine and found that the results were much more even. Unfortunately it’s the same machine that the webservers are running on, but the results look much better. The first set uses the same setup as before, adjusted to run only the top half of the test. The second is the same test but with kernel polling turned on in Erlang.

[Graph: reply rate for Nginx, Yaws and MochiWeb]

All of them are very even and close, no real winners here.

[Graph: response time for Nginx, Yaws and MochiWeb]

Looks like Nginx is the clear winner with Yaws next, followed by MochiWeb.

[Graph: CPU usage for Nginx, Yaws and MochiWeb]

Pretty much the same as last time (likely a little higher across the board due to running the tests in a VM on the same machine). Note that Nginx is a system process, so for Yaws and MochiWeb follow the blue line and for Nginx follow the green.

[Graph: context switches for Nginx, Yaws and MochiWeb]

About the same as before, other than being higher due to running a VM.

[Graph: load for Nginx, Yaws and MochiWeb]

Pretty much the same as before again, Nginx seems the lowest.

Now for the tests with kernel polling enabled in Erlang (erl +K true).

[Graph: reply rate with kernel polling for Nginx, Yaws and MochiWeb]

With kernel polling on, it looks like Yaws actually performs better in the reply rate test, with MochiWeb performing worse and Nginx in the middle.

[Graph: response time with kernel polling for Nginx, Yaws and MochiWeb]

In the response time test a huge change is noted: MochiWeb goes from a response time of roughly 14 ms at 2000 requests per second to roughly 65 ms. Yaws also performs much better, matching or beating Nginx.

[Graph: CPU usage with kernel polling for Nginx, Yaws and MochiWeb]

With kernel polling enabled in the Erlang webservers, Nginx still seems to come out on top for CPU usage.

[Graph: context switches with kernel polling for Nginx, Yaws and MochiWeb]

Following the performance trend we saw above, Yaws sees a drop in context switches and MochiWeb sees an increase.

[Graph: load with kernel polling for Nginx, Yaws and MochiWeb]

Load-wise things stay roughly the same with Nginx being the lowest.

While it certainly seems that the old laptop I did the original tests on is too slow or has a network issue, hopefully these new tests give us some more clarity. It seems that Yaws improves with kernel polling enabled and competes well with Nginx. MochiWeb, on the other hand, apparently has issues with kernel polling, and its performance actually degrades. If anyone has more info on the internals of MochiWeb and possible causes I would certainly be interested.

If anyone would like the data from the second round of tests it is available here.

Update 2:

I did some more testing to see what the issue might be with MochiWeb, response times and kernel polling. I did a few tests with different versions of Erlang, with and without kernel polling, and testing from within and outside a KVM VM. From what I can tell the issue seems to be isolated to testing from within a VM with MochiWeb and kernel polling. Seems to be sorta strange, but all my testing and retesting shows the same issue. Just to be clear on my setup, I am running httperf from within a VM against MochiWeb running outside the VM. Here is the latest round of testing to show this point.

[Graph: response time with kernel polling, tested from a KVM VM, for Nginx, Yaws and MochiWeb]

Even though the numbers are higher from within the VM without kernel polling, it certainly seems to be an issue with the combination of MochiWeb, KVM and kernel polling. Since I did not see the same spike from within a VM in the earlier tests with Yaws and kernel polling, I assume it is not an issue with Erlang or its kernel polling mechanism conflicting with KVM. I am not entirely sure what to make of this other than that MochiWeb, kernel polling and KVM don’t play well together, and that kernel polling actually helps MochiWeb significantly when KVM is not involved. If anyone has any ideas on why that may be I am all ears.

18 Comments

  1. Jay Phillips Jan 04, 2009 12:50 am

    It seems like in all cases the benchmarks start to wildly vary after 1100 requests. I wonder if this is an issue with httperf running on your laptop. I’d be interested to see httperf limitations factored out by running the tests against the web servers from many nodes on the network, each staying below 1100 requests each (assuming they have the exact hardware as your laptop; each node should probably stay under 500 just to be safe).

    Also, your graph titled “average reply rate” is very vague. The Y axis doesn’t mention its units and I can’t picture in my head how “reply rate” and “response time” differ, but their graphs apparently do, enormously so. Maybe you could clarify what “reply rate” is actually measuring?

  2. joe Jan 04, 2009 12:58 am

    Thanks for the comment Jay. I agree that all of them seem to start to jump after 1100. I am not sure what the issue would be or if there is one. In the future I may give a multi-node test a shot.

    Reply rate is replies per second and response time is how quickly responses get replied to. I edited the post to include this as well.

  3. Paul Keeble Jan 04, 2009 6:55 am

    The client machine (the laptop) is being maxed out in some way which is causing the consistent erratic behaviour above around 1100 req/s. It is really unlikely these tools start to break down at such low CPU utilisations. That invalidates all the results past that point.

  4. Tim Jan 04, 2009 10:33 am

    Could you benchmark these three web servers’ performance when serving a non-static page?

    E.g. – a PHP page that echoes “Hello World!” or the current date()

  5. joe Jan 04, 2009 12:43 pm

    @Paul, I see what you are saying and the laptop I used is a bit old/slow. I am fairly certain it’s actually the same laptop I used to perform my last deathmatch and didn’t see any issues. Regardless there is definitely a difference between this test and the last. Unfortunately, these are the only two machines I have at the moment so retesting with another is not possible.

    @Tim, I do not believe this is entirely possible as I do not believe MochiWeb will run PHP. Yaws has some PHP capabilities as does Nginx. Plus I am more interested in the raw speed and concurrency that the three could serve pages rather than run PHP or other code.

  6. Pichi Jan 04, 2009 6:19 pm

    @joe: you can simply write “Hello World!” in Erlang and compare MochiWeb and Yaws. I assumed that Yaws can serve static pages better than Mochi because it does some advanced caching for them, but I thought that Mochi could outperform Yaws on a very simple dynamic page. It will be interesting to see your results. To me it sounds good that Yaws and Mochi perform comparably to Nginx even for static pages.

  7. joe Jan 04, 2009 6:29 pm

    @pichi, I was hoping to steer away from that and focus simply on speed and concurrency for static pages. I am not positive a simple hello world type of a test would prove much since a real application has numerous other factors such as a database. These other factors will likely have more of an effect on performance than processing of compiled Erlang code.

  8. CanoeLabs » MochiWeb vs. Yaws vs. Nginx Jan 05, 2009 9:29 am
  9. Delano Mandelbaum Jan 05, 2009 7:04 pm

    Re: the issue with 1100 req/s, it’s a common problem for tests with very low response times. A machine has only around 60000 sockets available and there’s only 1 request per socket unless keep-alive is enabled. Once a socket is closed it generally remains in an unusable TIME_WAIT state for 60 seconds. After 60 seconds @ 1000 req/s, httperf needs to wait for sockets to be freed up. As you can imagine, things get wacky when that happens.

    If you get a chance to run the tests again, keep an eye on the number of sockets in TIME_WAIT state. You can use netstat ("netstat | grep TIME | wc -l") or read /proc/net/sockstat for that info.

  10. Andy Jan 05, 2009 11:40 pm

    Could you benchmark the dynamic page performances of the 3 servers?

    Yaws & MochiWeb are mainly used to, and are designed to, generate dynamic content. So a static content benchmark does not really touch on their main design goal.

    A benchmark with
    — Nginx-FastCGI-PHP (or Python, etc)
    — Yaws-Erlang
    — Mochiweb-Erlang
    comparison would be really informative.

  11. Bob Ippolito Jan 06, 2009 1:22 am

    “Since I did not see the same spike from within a VM in the earlier tests with Yaws and kernel polling I assume it is not an issue with Erlang or it’s kernel polling mechanism conflicting with KVM”

    I don’t think that conclusion is necessarily correct, because Yaws and MochiWeb use sockets very differently in Erlang. There could be a bug somewhere between Erlang and the Linux VM when using {active, false} sockets (which is what MochiWeb uses).

    We don’t use MochiWeb with VMs, so we haven’t run up against this problem, but if there’s a bug it’s higher up the chain.

  12. David N. Welton Jan 06, 2009 8:57 am

    I agree with Andy, above. Why bother with the Erlang servers if you’re not going to take advantage of what they do beyond nginx, which is spit out dynamic content? They do admirably compared to nginx at serving a static file, but dynamic content is a more difficult problem, and, under stress, is where Erlang is likely to really shine.

  13. joe Jan 06, 2009 11:11 am

    @bob, Thanks for chiming in and for the insight. I’ll see what I can do to nail down where the issue might be.

  14. Dev Blog AF83 » Blog Archive » Veille technologique : articles, IE6, Javascript, technologies, frameworks, performances, OpenID… Jan 06, 2009 12:16 pm
  15. Steve Vinoski Jan 11, 2009 4:03 pm

    You ought to pick up Yaws 1.78, released Friday, since it now has a sendfile driver that helps on the CPU usage front when static content is involved. See http://steve.vinoski.net/blog/2009/01/05/sendfile-for-yaws/ for some measurements.

  16. joe Jan 11, 2009 4:05 pm

    Thanks for the note Steve, I will check it out.

  17. Steve Davis Jan 26, 2009 6:36 am

    I wonder how inets httpd compares…

  18. links for 2009-01-27 Jan 27, 2009 7:32 pm
