Survive Heavy Traffic With Your Webserver

Recently two of my articles reached the Digg frontpage on the same day. My web server isn't state of the art, and it had to handle gigantic amounts of traffic. Still, it served pages to visitors swiftly thanks to a lot of optimizations. This is how you can prevent heavy traffic from killing your server.

About this article

There are many things you can do to speed up your website. This article focuses on practical things that I used, without spending any money on additional hardware or commercial software.

In this article I assume that you're already familiar with system administration and hosting / creating websites. In examples I use Ubuntu, but if you use another distro, just make some minor adjustments (like package management) and it should work as well.

Beware: if you don't know what you're doing, you could seriously mess up your system.

Cache PHP output

Every time a request hits your server, PHP has to do a lot of processing: all of your code has to be compiled & executed for every single visit, even though the outcome of all this processing is identical for both visitor 21600 and 21601. So why not save the flat HTML generated for visitor 21600, and serve that to 21601 as well? This will free up resources on your web server and database server, because less PHP often means fewer database queries.

Now you could write such a system yourself, but there's a neat package in PEAR called Cache_Lite that can do this for us. The benefits:

  • it saves us the time of reinventing the wheel
  • it's been thoroughly tested
  • it's easy to implement
  • it's got some cool features like lifetime, read/write control, etc.

Installing is like taking candy from a baby. On Ubuntu I would:

$ sudo aptitude install php-pear
$ sudo pear install Cache_Lite

And we're ready to use one of our most important assets!

To learn exactly how to implement Cache_Lite into your code I've written another article called: Speedup your website with Cache_Lite.
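The idea itself (generate once, serve the saved copy until it expires) fits in a few lines. What follows is a minimal sketch in shell, not Cache_Lite's API: the function name cached_output, the cache directory and the lifetime in minutes are all hypothetical and only there to illustrate the principle.

```shell
# Minimal output cache: run an expensive command once, then serve the saved
# result until it is older than the given lifetime.
# Usage: cached_output <cache_dir> <key> <lifetime_minutes> <command...>
# (all names here are illustrative assumptions, not part of Cache_Lite)
cached_output() {
    cache_dir=$1; key=$2; lifetime=$3
    shift 3
    cache_file="$cache_dir/$key.cache"
    mkdir -p "$cache_dir"

    # find prints the file name only if the file exists and was modified
    # less than $lifetime minutes ago, i.e. the cached copy is still fresh.
    if [ -n "$(find "$cache_file" -mmin "-$lifetime" 2>/dev/null)" ]; then
        cat "$cache_file"
        return 0
    fi

    # Cache miss: generate the output into a temp file, move it into place
    # in one step, then serve it. A failed command leaves no cache file.
    tmp_file="$cache_file.$$"
    if "$@" > "$tmp_file"; then
        mv "$tmp_file" "$cache_file"
        cat "$cache_file"
    else
        rm -f "$tmp_file"
        return 1
    fi
}
```

Called as cached_output /tmp/cache frontpage 5 some_slow_command, the slow command only runs when no copy younger than 5 minutes exists. Cache_Lite applies the same idea inside PHP, with its own options for lifetime and cache directory.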

Create turbo charged storage

With the PHP caching mechanism in place, we take away a lot of stress from your CPU & RAM, but not from your disk. This can be solved by creating a storage device with your system's RAM, like this:

$ sudo mkdir -p /var/www/www.mysite.com/ramdrive
$ sudo mount -t tmpfs -o size=500M,mode=0744 tmpfs /var/www/www.mysite.com/ramdrive

Now the directory /var/www/www.mysite.com/ramdrive is not located on your disk, but in your system's memory. And that's about 30 times faster : ) So why not store your PHP cache files in this directory? You could even copy all static files (images, css, js) to this device to minimize disk IO. Two things to remember:

  • All files in your ramdrive are lost on reboot, so create a script to restore files from disk to RAM
  • The ramdrive itself is lost on reboot, but you can add an entry to /etc/fstab to prevent that

To learn exactly how to tackle the above, I've written another article called: Create turbocharged storage using tmpfs.
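As a rough sketch of both points: the fstab line in the comment mirrors the mount command used earlier, and restore_ramdrive is a hypothetical helper that refills the ramdrive from a master copy kept on disk. The function name and the master directory are assumptions for illustration, not taken from the other article.

```shell
# Assumed /etc/fstab line so the tmpfs mount itself comes back after a reboot.
# The options mirror the mount command; adjust mode (and uid/gid) so that the
# web server user can actually read and write the directory:
#   tmpfs /var/www/www.mysite.com/ramdrive tmpfs size=500M,mode=0744 0 0

# Copy the on-disk master copy into the ramdrive, keeping permissions and
# timestamps. Usage: restore_ramdrive <source_dir> <ramdrive_dir>
restore_ramdrive() {
    src=$1; dst=$2
    [ -d "$src" ] || { echo "missing source directory: $src" >&2; return 1; }
    mkdir -p "$dst" || return 1
    # The trailing "/." copies the contents of $src, including dotfiles.
    cp -a "$src/." "$dst/"
}
```

You would call it from a boot script after the mount is in place, for example restore_ramdrive /var/www/www.mysite.com/ramdrive_master /var/www/www.mysite.com/ramdrive (the ramdrive_master path is made up). Cache files don't need restoring; they are simply regenerated.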

Leave heavy processing to cronjobs

For example, I count the number of visits for every single article. But instead of updating a counter for an article on every visit (which involves row locking and a WHERE clause), I use simple and relatively cheap SQL INSERTs into a separate table.

The gathered data is processed every 5 minutes by a separate PHP script that's automatically run by my server. It counts the hits per article, then deletes the gathered data and updates the grand totals in a separate field in my article table. As a result, reading the hit count of an article takes no extra processing time or heavy queries.

If you want more in-depth information on writing cronjobs, I've written another article called: Schedule tasks on Linux using crontab.
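The roll-up step could look roughly like this. It is a sketch under assumptions, not the script I run: the tables article_hits (hit_id, article_id) and articles (article_id, hit_count) are hypothetical names, and the job is a shell function that prints SQL for the mysql client instead of a PHP script.

```shell
# Per visit, the page only runs a cheap insert such as:
#   INSERT INTO article_hits (article_id) VALUES (42);

# Print the SQL for one roll-up run. Table and column names are assumptions.
hit_rollup_sql() {
    cat <<'SQL'
-- Remember the newest raw hit, so rows inserted while this job runs are
-- kept for the next run instead of being deleted uncounted.
SELECT COALESCE(MAX(hit_id), 0) INTO @last_hit FROM article_hits;

-- Add the gathered hits to the grand totals.
UPDATE articles a
JOIN (
    SELECT article_id, COUNT(*) AS new_hits
    FROM article_hits
    WHERE hit_id <= @last_hit
    GROUP BY article_id
) h ON h.article_id = a.article_id
SET a.hit_count = a.hit_count + h.new_hits;

-- Throw away the raw rows that were just counted.
DELETE FROM article_hits WHERE hit_id <= @last_hit;
SQL
}
```

Saved in a script that ends with hit_rollup_sql | mysql mydb (database name assumed), a crontab line like the following would run it every 5 minutes; the path is hypothetical:

    */5 * * * * /usr/local/bin/hit_rollup.sh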

Optimize your database

Use the InnoDB storage engine

If you use MySQL, the default storage engine for tables is MyISAM (in versions before 5.5). That's not ideal for a high traffic website because MyISAM uses table level locking, which means that during an UPDATE, nobody can access any other record of the same table. It puts everyone on hold!

InnoDB, however, uses row level locking. Row level locking ensures that during an UPDATE, only that particular row is locked against changes by others until the locking transaction issues a COMMIT. The rest of the table stays available.

phpMyAdmin allows you to easily change the table type in the Operations tab. Though it never caused me any problems, it's wise to first create a backup of the table you're going to ALTER.
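If you prefer the command line, the same backup-then-ALTER routine can be sketched as below. The function name convert_to_innodb and the backup file name are made up for this example, and it assumes mysql and mysqldump can log in without prompting (for instance through ~/.my.cnf).

```shell
# Back up one table, then switch its storage engine to InnoDB.
# Usage: convert_to_innodb <database> <table>
# The dump is written to <table>-backup.sql in the current directory.
convert_to_innodb() {
    db=$1; table=$2
    backup="$table-backup.sql"
    # Only run the ALTER if the dump succeeded.
    mysqldump "$db" "$table" > "$backup" || return 1
    mysql -e "ALTER TABLE \`$table\` ENGINE=InnoDB;" "$db"
}
```

For example: convert_to_innodb blog blog_posts. The ALTER rebuilds the whole table, so on a big table it's best done outside peak hours.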

Use optimal field types

Wherever you can, make integer fields as small as possible. Not by changing the length, but by changing its actual integer type. The length is only used for padding.

So if you don't need negative numbers in a column, always make a field unsigned. That way you can store maximum values with minimum space (bytes). Also make sure foreign keys have matching field types, and place indexes on them. This will greatly speed up queries.

In phpMyAdmin there's a link called Propose Table Structure. Take a look sometime; it will try to tell you which fields can be optimized for your specific db layout.
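To make the trade-off concrete, here is a small sketch that prints how much each MySQL integer type costs and what it can hold. The byte sizes are MySQL's; the helper name int_ranges is just for illustration. Note that INT(3) and INT(11) both take 4 bytes, which is why the length doesn't help you here.

```shell
# Print storage size and maximum values for MySQL's integer types.
# BIGINT (8 bytes) is left out because its unsigned maximum does not fit
# in the shell's signed 64-bit arithmetic.
int_ranges() {
    for spec in TINYINT:1 SMALLINT:2 MEDIUMINT:3 INT:4; do
        name=${spec%%:*}
        bytes=${spec##*:}
        bits=$((bytes * 8))
        # Signed columns spend one bit on the sign; unsigned ones don't.
        signed_max=$(( (1 << (bits - 1)) - 1 ))
        unsigned_max=$(( (1 << bits) - 1 ))
        printf '%-9s %d byte(s)  signed max %-11d unsigned max %d\n' \
            "$name" "$bytes" "$signed_max" "$unsigned_max"
    done
}
```

Running int_ranges shows, for instance, that a counter which never exceeds 65535 fits in a 2-byte SMALLINT UNSIGNED, half the size of a plain INT.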

Queries

Never select more fields than strictly necessary. Sometimes when you're lazy you might do a:

SELECT * FROM `blog_posts`

even though a

SELECT `blog_post_id`,`title` FROM `blog_posts`

would suffice. Normally that's OK, but not when performance is your no.1 priority.

Tweak the MySQL config

Furthermore, there are quite a few things you can do to the my.cnf file, but I'll save that for another article as it's a bit out of this article's scope.

Save some bandwidth

Save some sockets first

Small optimizations make for big bandwidth savings when volumes are high. If traffic is a big issue, or you really need that extra server capacity, you could throw all CSS code into one big .css file. Do this with the JS code as well. This will save you some Apache sockets that other visitors can use for their requests. It will also give you better compression ratios, should you choose to use mod_deflate or compress your JavaScript with Dean Edwards' Packer.

I know what you're thinking. No, don't throw all the CSS and JS in the main page. You still really want this separation to:

  • make use of the visitor's browser cache. Once they've got your CSS, it won't be downloaded again
  • not pollute your HTML with that stuff
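Merging the files doesn't have to be manual work. A minimal sketch, assuming your stylesheets live together in one directory; the function name bundle_assets and the file names in the usage example are hypothetical:

```shell
# Concatenate all files with the given extension into one bundle.
# Usage: bundle_assets <source_dir> <extension> <output_file>
# Write the bundle outside <source_dir> (or give it another extension),
# otherwise the next run would include the old bundle in the new one.
bundle_assets() {
    src_dir=$1; ext=$2; out=$3
    : > "$out"                       # start with an empty bundle
    # The glob expands in sorted order, so prefix file names (01-reset.css,
    # 02-layout.css, ...) if the order of rules or scripts matters.
    for f in "$src_dir"/*."$ext"; do
        [ -f "$f" ] || continue      # no match: the pattern stays literal
        cat "$f" >> "$out"
        printf '\n' >> "$out"        # guard against files without a final newline
    done
}
```

For example: bundle_assets css css static/all.css, and the same with js for the scripts. Your pages then reference only the two bundles.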

And now some bandwidth ; )

  • Limit the number of images on your site
  • Compress your images
  • Eliminate unnecessary whitespace or even compress JS with tools available everywhere.
  • Apache can compress the output before it's sent back to the client through mod_deflate. This results in a smaller page being sent over the Internet at the expense of CPU cycles on the web server. For servers that can afford the CPU overhead, this is an excellent way of saving bandwidth. If CPU rather than bandwidth is your bottleneck, though, I would turn all compression off to save those extra cycles.
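To judge whether compression is worth the CPU time for your pages, you can estimate the savings up front. This sketch uses the gzip command as a stand-in for mod_deflate (both use the deflate algorithm, so the numbers should be roughly comparable, not identical); the function name gzip_savings is made up.

```shell
# Print original size, gzip-compressed size and the percentage saved.
# Usage: gzip_savings <file>
gzip_savings() {
    file=$1
    original=$(wc -c < "$file" | tr -d ' ')
    [ "$original" -gt 0 ] || { echo "$file is empty" >&2; return 1; }
    compressed=$(gzip -c "$file" | wc -c | tr -d ' ')
    saved=$(( (original - compressed) * 100 / original ))
    echo "$file: $original -> $compressed bytes (${saved}% saved)"
}
```

Run it against a saved copy of one of your pages, or against your CSS and JS files. Text usually shrinks a lot; images that are already compressed barely shrink at all, so there's little point in spending CPU on them.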

Store PHP sessions in your database

If you use PHP sessions to keep track of your logged in users, then you may want to have a look at PHP's function: session_set_save_handler. With this function you can overrule PHP's session handling system with your own class, and store sessions in a database table or in Memcached.

Now a key to success is to make this table's storage engine MEMORY (also known as HEAP). This stores all session information (which should be tiny variables) in the database server's RAM. It takes disk IO stress away from your web server and allows you to share sessions between multiple web servers in the future: if you're logged in on server A, you're also logged in on server B, which makes load balancing possible.
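Such a table could look roughly like this. The table name, column names and sizes are assumptions for illustration; the sketch prints the SQL so it can be piped into the mysql client.

```shell
# Print the SQL for a session table held in the database server's RAM.
# Table name, column names and sizes are illustrative assumptions.
session_table_sql() {
    cat <<'SQL'
CREATE TABLE sessions (
    session_id   VARCHAR(128)  NOT NULL PRIMARY KEY,
    -- MEMORY tables cannot hold TEXT or BLOB columns, hence VARCHAR.
    -- Rows are stored at fixed length, so keep this as small as you can.
    session_data VARCHAR(2048) NOT NULL,
    -- Unix timestamp of the last write, used to expire old sessions.
    last_access  INT UNSIGNED  NOT NULL
) ENGINE=MEMORY;
SQL
}
```

Create it with session_table_sql | mysql mydb (database name assumed). Keep in mind that a MEMORY table is emptied when the database server restarts, which logs everybody out.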

Sessions on tmpfs

If it's too much of a hassle to store sessions in a MEMORY database, storing session files on a ramdisk is also a good option to gain some performance. Just make /var/lib/php5 live in RAM. To learn exactly how to do this, I've written another article called: Create turbocharged storage using tmpfs.

Sessions in Memcached

I recently (22nd June 2008) found another (better) way to store sessions in a cluster-proof, resource-cheap way and dedicated a separate article to it called: Enhance PHP session management.

More tips

Some other things to google for if you want even more:

  • eAccelerator
  • memcached
  • tweak the Apache config
  • squid
  • turn off Apache logging
  • Add 'noatime' in /etc/fstab on your web and data drives to prevent disk writes on every read