If you have been around the Django community for a while, chances are you have heard of StaticGenerator, the excellent tool that speeds up your blog or content-heavy site by generating static HTML from it, which nginx (the web server/reverse proxy/mail proxy from Russia with love that needs no introduction today) can then serve directly. Recently more and more people have done the same thing, but instead of generating static files they put the generated content in memcached and let nginx serve it straight from there with its sweet memcached module. Which should you use: memcached or static files? With this post I’d like to address the benefits of each of these strategies for speeding up your dynamic webapp monster.
Benchmarks
Some people choose a strategy based on benchmark numbers alone, without any other considerations, so I’ll start by just stating that nginx serves static content about 2.7 times as fast as memcached content (roughly 170% faster). I’m sure some people will stop reading here and go with StaticGenerator.
I didn’t just make that number up, I used the excellent httperf with the following settings:
$ httperf --client=0/1 --server=foo.bar --port=80 --uri=/test/ --send-buffer=4096 --recv-buffer=16384 --num-conns=1 --num-calls=10000
With a static HTML file at /test/ I got this result:
Request rate: 6243.8 req/s (0.2 ms/req)
And with the exact same HTML in memcached I got this:
Request rate: 2285.5 req/s (0.4 ms/req)
These benchmarks were run with a “warm cache”: the HTML file was pregenerated and also put into memcached before the tests were run. So does this mean nginx is actually 2.7 times as fast at serving the same HTML from static files as from memcached? No. I don’t have a proper benchmark environment. These numbers just gave me a hint that nginx is faster at serving static content than at serving from memcached with its memcached module and, most importantly, they got me thinking about what was going on behind the curtains.
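For reference, here is the arithmetic behind the comparison, worked out from the two httperf request rates quoted above. It is only a sketch of the calculation; the variable names are mine, and the result says nothing more than the two measurements do.

```python
# Request rates reported by httperf above, in requests per second.
static_rate = 6243.8      # static HTML file served from disk
memcached_rate = 2285.5   # same HTML served through the memcached module

# "Times as fast" is the plain ratio of the two rates.
ratio = static_rate / memcached_rate

# "Percent faster" is the relative increase over the slower rate.
percent_faster = (ratio - 1.0) * 100.0

print(f"static files handle {ratio:.2f}x the request rate of memcached")
print(f"that is about {percent_faster:.0f}% faster")
```

Note the difference between the two ways of putting it: a ratio of about 2.7 means static files are roughly 173% faster, not 270% faster.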
Interpreting the benchmarks
I think one of the reasons the memcached solution doesn’t perform as well as static files is that on every single request nginx needs to open a new TCP socket, handshake with memcached, etc. Hold it right there, this is starting to sound like dialogue from 24 with Chloe O’Brian. I’m not familiar with the internals of the TCP protocol, but what I mean is that on every single request nginx has to connect to memcached; there is no keepalive or persistent connection between nginx and memcached. Running both memcached and nginx on localhost may not add any significant delay, but memcached is a completely separate program with its own processes that nginx needs to “connect to” and wait for an answer from, and it’s pretty clear to me that this adds some delay compared with just serving the file from local disk within the same process.
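To get a feel for what "a new connection on every request" costs, here is a minimal sketch in Python that times the two patterns against a throwaway local TCP server. Everything in it is an assumption made for illustration: the server is a stand-in and not memcached, the one-line request is made up and is not the memcached protocol, and the function names are mine. It says nothing about how nginx is implemented internally; it only shows the general difference between reconnecting each time and reusing one connection.

```python
import socket
import threading
import time

REQUEST = b"get /test/\n"   # made-up one-line request, NOT the memcached protocol
REPLY = b"OK\n"

def serve(listener):
    """Stand-in backend: answer every received line with REPLY, one client at a time."""
    while True:
        conn, _ = listener.accept()
        with conn, conn.makefile("rb") as lines:
            for _ in lines:              # loop ends when the client disconnects
                conn.sendall(REPLY)

def connect_per_request(addr, n):
    """Open a fresh TCP connection for every single request."""
    ok = 0
    for _ in range(n):
        with socket.create_connection(addr) as sock, sock.makefile("rb") as reader:
            sock.sendall(REQUEST)
            ok += reader.readline() == REPLY
    return ok

def persistent_connection(addr, n):
    """Send all requests over one connection that stays open."""
    ok = 0
    with socket.create_connection(addr) as sock, sock.makefile("rb") as reader:
        for _ in range(n):
            sock.sendall(REQUEST)
            ok += reader.readline() == REPLY
    return ok

def run_demo(n=200):
    """Return (replies, seconds) for each pattern against a local stand-in server."""
    listener = socket.create_server(("127.0.0.1", 0))   # port 0: the OS picks a free port
    threading.Thread(target=serve, args=(listener,), daemon=True).start()
    addr = listener.getsockname()
    t0 = time.perf_counter()
    ok_new = connect_per_request(addr, n)
    t1 = time.perf_counter()
    ok_keep = persistent_connection(addr, n)
    t2 = time.perf_counter()
    return ok_new, t1 - t0, ok_keep, t2 - t1

if __name__ == "__main__":
    ok_new, t_new, ok_keep, t_keep = run_demo()
    print(f"new connection per request: {ok_new} replies in {t_new:.4f}s")
    print(f"one persistent connection:  {ok_keep} replies in {t_keep:.4f}s")
```

The absolute timings depend entirely on the machine, so treat them as a hint in the same spirit as the httperf numbers above, not as a measurement of nginx or memcached.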
But doesn’t serving static files mean that nginx hits the disk on every request? Yes and no. All *BSD and Linux systems (I can’t speak for Windows) make excellent use of the filesystem cache. Frequently requested files are cached automatically by the OS, though the files are still stat’ed on every request, for example to see if they have changed. The above benchmarks were done on two laptops with 5400 rpm drives, one running nginx and memcached and one running httperf.
Conclusion
Whether you choose StaticGenerator or one of the memcached approaches, you will speed up your site significantly. If you want the fastest available option, use StaticGenerator with a ramdisk (mount some of your RAM as a directory). Generally I think it’s a cleaner concept to put generated content in memory instead of on disk: restart memcached and the cache is gone. Also, memcached doesn’t need to run on the same machine as nginx or the webapp, which makes flexible setups and scaling easier. Pushing static files around over the network with scp/rsync/nfs/<your preferred tool for moving data over network here> is messier, but the fact remains that static files are served faster.
Consider adding Expires headers of, say, 1 hour, so that clients browsing around the site don’t have to download the same HTML over and over again. But then clients won’t necessarily always see the latest content, you say? They may not, but you can adjust the Expires header depending on how often you post to your blog or update your content.
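If you would rather set these headers from the application side than in the web server, the values for a one-hour lifetime can be computed as in the sketch below. The function name `expiry_headers` is hypothetical and not part of Django or StaticGenerator; the code only builds the header values, and attaching them to a response is left to whatever framework you use. (nginx can also add them itself through its `expires` directive.)

```python
import time
from email.utils import formatdate

def expiry_headers(max_age_seconds=3600, now=None):
    """Build Cache-Control and Expires values for a given lifetime (hypothetical helper)."""
    if now is None:
        now = time.time()
    return {
        # Relative lifetime in seconds, understood by HTTP/1.1 clients.
        "Cache-Control": f"max-age={max_age_seconds}",
        # Absolute expiry time as an HTTP date in GMT.
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }

if __name__ == "__main__":
    for name, value in expiry_headers().items():
        print(f"{name}: {value}")
```

A shorter lifetime suits a site that is updated often; a longer one saves more requests on content that rarely changes.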
I use these strategies even on sites without any load. Why? When somebody actually visits the site, I want it to be served as fast as possible. Performance and usability go hand in hand; a fast experience is a good experience.
[...] StaticGeneratorMem for django 2009 March 23 by andreas As a result of the last post about serving static content vs memcached content with nginx I’m hereby introducing StaticGeneratorMem, a fork of the excellent StaticGenerator by Jared [...]
Hi, thanks for this nice post. I am looking at a number of options with Nginx for a blog I am working on, and this was definitely useful.
Out of curiosity: rather than letting nginx serve static files from disk, have you tried using its built-in caching and… letting it cache to a tmpfs/ramdisk location? I guess the difference would be even nicer. :D
The difference will probably be minimal if you use tmpfs/ramdisk. If an HTML file is read from disk multiple times by nginx, it’s automatically cached in RAM by the OS anyway :)