Revisiting Faster PHP Sessions

Author: Kevin van Zonneveld (@kvz)
"Simplicity is prerequisite for reliability."
As our experience grows, we learn from past mistakes and discover what's truly important in reliable systems. When designing systems, simplicity is an often-heard mantra, yet it is applied far less than it is preached. I'm guilty of this too. I think it's mainly because engineers love to, well, engineer :) and will naturally try to outsmart problems by throwing more tech at them.
Article vs Article
In light of this, I'm revisiting my 2008 article Enhance PHP session management. That article explains how you can use a central memcache server to store sessions for performance & scalability purposes.
Having a shared anything when you can avoid it is asking for problems, and I was just throwing unneeded tech at this: network protocols, PECL modules, configuration. All of it vulnerable to bugs, maintenance, performance penalties, and outages.
Using my 2007 article Create turbocharged storage using tmpfs, we can undo some of this over-engineering and take a simpler approach to speeding up sessions in PHP.
We'll store them decentralized in memory by mounting a tmpfs RAM disk onto the existing /var/lib/php5 session directory on each of your application servers, which I'll call nodes from now on.
Make Session Dir Live in RAM
Add this to your /etc/fstab:
# Make PHP sessions live in RAM
tmpfs /var/lib/php5 tmpfs size=300M,atime 0 0
This makes sure the 300MB RAM device is available again after your next reboot.
300MB is a lot.
You can decrease it later on by changing the /etc/fstab entry and
executing mount -o remount /var/lib/php5
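For example, after lowering size= in /etc/fstab first (the 150M below is just an illustrative figure), this applies the new size without a reboot:
$ # Shrink the live tmpfs to match the new /etc/fstab entry
$ mount -o remount,size=150M /var/lib/php5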
Activate & Migrate Existing Sessions
Then execute:
$ # Create a temporary place for current sessions
$ mkdir -p /tmp/phpsessions/
$ # Move current sessions to it
$ mv /var/lib/php5/* /tmp/phpsessions/
$ # Activate our ramdisk
$ mount -a
$ # Move the current sessions back
$ mv /tmp/phpsessions/* /var/lib/php5/
$ # Remove the temporary placeholder
$ rmdir /tmp/phpsessions
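To confirm the ramdisk is active (the exact output will vary by distribution):
$ # Verify that /var/lib/php5 is now backed by tmpfs
$ mount | grep /var/lib/php5
$ df -h /var/lib/php5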
Advantages
What's nice about saving sessions in a tmpfs device compared with saving in memcache is:
- you can migrate to this solution without logging people out :)
- nothing needs to be installed
- instead of throwing errors, it degrades gracefully to plain disk storage if the mount fails
- you can restart/flush/upgrade any existing memcache instances without people losing sessions
- it uses the default /var/lib/php5 directory, so no .ini changes, and PHP's garbage collector will still purge old sessions
- it takes away a bottleneck & single point of failure in your architecture
- it's just a mountpoint, so existing monitoring tools will automatically trigger alerts when you need to allocate more space (see the sketch after this list)
- no locking issues with ajax calls (though I believe this was fixed in memcached-3.0.4beta)
- no protocol overhead
- less tech, so less prone to errors & bugs, easier upgrade process
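As a minimal sketch of that monitoring point (the 90% threshold is just an example; wire this into whatever alerting you already have):
$ # Warn when the session tmpfs is more than 90% full
$ USED=$(df /var/lib/php5 | awk 'NR==2 { sub("%","",$5); print $5 }')
$ [ "$USED" -gt 90 ] && echo "session tmpfs almost full: ${USED}%"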
Decentralizing
Now, this doesn't work in clusters without Sticky Sessions. But you've got to ask yourself: in huge clusters, do you really want Shared Sessions? The bigger the cluster, the more vulnerable you become, as shared sessions really only add a bottleneck & single point of failure to your architecture.
With decent loadbalancers like EC2's ELB, Pound, or HAProxy, it becomes child's play to implement Sticky Sessions so that people keep ending up on the node that has their session.
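For example, in HAProxy, cookie-based stickiness is a few lines of config (the backend name, node names, and addresses below are hypothetical):
# Hypothetical HAProxy backend with cookie-based sticky sessions
backend php_nodes
    balance roundrobin
    cookie SERVERID insert indirect nocache
    server node1 10.0.0.11:80 check cookie node1
    server node2 10.0.0.12:80 check cookie node2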
When you're designing to tolerate failure, this architecture is much more robust than depending on anything shared.
Yes, some people will be logged out when you shut down a node (vs everyone when your central session store goes down).
To counter this you could:
- drain a node's connections before you take it down for planned maintenance; this way nobody is affected
- rsync sessions between nodes if it's crucial that all sessions survive an outage (see the sketch below)
This could even be automated so that nodes can cover for each other. Whether it's worth the investment depends on your application. Are your nodes likely to go down completely? How many customers would get logged out? What kind of data is lost?
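As a minimal sketch of that automation (the standby-node hostname and the one-minute interval are assumptions, as is rsync over ssh with key-based auth):
# Hypothetical cron entry: push sessions to a standby node every minute
* * * * * rsync -a /var/lib/php5/ standby-node:/var/lib/php5/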
Even if your session store is clustered and uses persistent storage like Redis or MySQL (not the right tool for the job, people): network outages, maintenance, and misconfiguration can hurt you badly, logging out all customers or, worse, throwing errors throughout your platform.
Problems will be bigger and harder to solve.
Whereas if the RAM mountpoint fails, /var/lib/php5 just degrades gracefully to normal disk-based storage. That makes sessions slower on that one node, but you'll still be serving customers.
I welcome your thoughts on this!
Legacy Comments (11)
These comments were imported from the previous blog system (Disqus).
This is a great idea. What I like most is that you get rid of dependencies on other applications and that you don't have to change your ini settings for it. As you said, it would make it easier to maintain and to migrate. Excellent article, cheers.
Interesting idea. I'd like to see some benchmarks of the performance of this vs straight up disk access sessions.
I'd be surprised if there is a huge performance improvement over regular disk based sessions. Unless you're on something like Amazon's EBS, which isn't a real disk anyway.
A technique that I'm doing is using signed data in cookies as a session storage technique.
Pros:
1. no server side state
2. no disk access to check
3. No locking
Cons:
1. bigger cookies (but still < 100 bytes)
2. no server side invalidation of a session w/out some server side state. We use memcache for this, but in a simple way: does this key exist? OK, the cookie is valid.
For the application I'm working on we'll be seeing hundreds of req/second and keeping less state makes scaling out a lot easier.
Also each user is sending requests concurrently so locking *all* requests until the current one finishes really slows down performance. Instead, to prevent race conditions on data we let the database do atomic operations and always code knowing that data could have changed from the time you read to the time to write back to the db. Compare/Swap patterns are great here.
For security, so users can't forge their own cookies we:
a) sign them w/ sha1
b) expire them with a timestamp (int, embedded into data)
When the cookie comes in, we validate the data (json serialized), the signature and that the cookie has not expired.
If everything is OK we process the request and issue a *new* cookie.
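A minimal PHP sketch of the scheme this comment describes (the secret, payload layout, and helper names are illustrative, not the commenter's actual code):
<?php
// Illustrative only: HMAC-signed cookie with an embedded expiry timestamp.
const SECRET = 'replace-with-a-long-random-secret'; // assumption: kept server side

function issue_cookie(array $data, int $ttl): string
{
    $data['exp'] = time() + $ttl;                 // expire with a timestamp
    $payload = base64_encode(json_encode($data)); // json-serialized data
    $sig = hash_hmac('sha1', $payload, SECRET);   // sign with sha1 so users can't forge it
    return $payload . '.' . $sig;
}

function read_cookie(string $cookie): ?array
{
    $parts = explode('.', $cookie, 2);
    if (count($parts) !== 2) {
        return null;                              // malformed cookie
    }
    [$payload, $sig] = $parts;
    if (!hash_equals(hash_hmac('sha1', $payload, SECRET), $sig)) {
        return null;                              // forged or corrupted
    }
    $data = json_decode(base64_decode($payload), true);
    if (!is_array($data) || ($data['exp'] ?? 0) < time()) {
        return null;                              // expired
    }
    return $data;                                 // valid: process the request, then issue a new cookie
}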
When I tried to move my session store to the newly created tmpfs it ran out of space (inodes). To counter this I added "nr_inodes=400k" to the fstab entry.
Also, try to specify a good mode, so your webserver can read/write session files (something like mode=777 in your fstab entry, or less for more security).
First impression: the i/o wait on my server is significantly down!
Question: I use "3;/var/lib/php5" as session.save_path, so there's a directory structure of 3 levels deep. When my server is rebooted now, this directory structure has to be regenerated... Should I just abandon this approach and use 1 directory with 235K files?
We have a more techy solution - db with memcached.
Does any storage over 300M roll over to disk like "virtual memory"? If not, how do you handle too many/too big session files?
Hi,
I implemented this as we had a lot of issues with session_start taking up to 3 seconds, just because of opening files, and our page load times decreased drastically!
Since this was an absolutely fantastic result, we decided to save our cached objects (which until then were cached in a database) in RAM as well... translations are now saved as a serialized array in RAM, and still have a copy in the database.
page loads for some of our larger pages went from 7 seconds to <0.3 seconds.
So yes, our profiling shows that this is a GREAT result. We just put in 4 gigs of additional RAM, and off we go!
Thanks for a great post! This absolutely saves our day, and helps us maintain good structured code, as we didn't want to start compromising by taking shortcuts in our code "just for performance reasons", as we see happen so many times, unfortunately.
Well now, sticky sessions are a big nono, since those will hinder your very purpose for using a cluster.
If I have 3 servers, and I use sticky sessions and I do have a flash flood, I would get:
- webserver1 - 10% load
- webserver2 - 150% load
- webserver3 - 40% load
instead of using all three in an even way.
Maybe sticky would be nice for 10-20+ servers in a cluster, numbers that would let the statistics work and assure an almost even distribution.
But having imbalance in a small cluster is like not having the cluster at all. We do LB for performance _and_ failover, not only for failover.
nice work
AWESOME! Instant, noticeable improvement. And super useful for me as I have some servers with pretty slow Disk I/O.
You got a point about load balancers. At least for the presentation layer.
Still might need to cluster a central data layer.
Sessions in ramfs is a very, very bad idea.
1. The obvious: when the machine crashes all sessions are lost. While sometimes that is acceptable it's a definite no-go for ecommerce sites that share session id with cart id (common practice to reduce number of cookies).
2. Limited (compared to other storage solutions) capacity. With small sessions you'll end up with 1-4 KB session files. That means the 300MB you are using allows 75-300k concurrent sessions. Seems a lot til you remember that webcrawlers from Google, Yahoo, Bing, etc don't use cookies and each request equals a new session. With a large website, millions of requests per day are not uncommon. Boom, suddenly no new session can be created, 502.
3. As with any file storage you lose all the benefits of storing sessions in a relational database - the most basic being the ability to modify the session of a specific user (log him out, for example) - this way you can actually store permissions in the session and avoid querying permissions per request.
I went for a hybrid solution:
- sessions are stored both in database and key-store (APC/memcached/whatever I plug in depending on site size)
- database is checked only if key-store returns no match - if there is a match in db it's copied to key-store.
- database is written to only if session value is modified or if time since last persistent (db) store is longer than predefined interval
- most popular/aggressive crawlers are matched by user agent and no session is created for them
Result is fast. Not as fast as tmpfs, but close enough and it supports sharding out of the box (add modulo of request id to select memcached and/or db shard and you're done) without the need to sync files. No fancy clusters - just add new box if you need it and change divisor in config.
Garbage collection is dead simple - a cron job that deletes entries from the db that had no persistent store for a predefined interval - the key-store expires on its own.
You can kill any memcached instance - new one will load all the sessions from db (there will be a load spike, but each will be read just once).
It degrades gracefully if at least db works. If it doesn't you're fucked anyway ;-)
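A rough PHP sketch of the read path this comment describes (the table name, key prefix, and function name are hypothetical; $cache and $db are assumed to be configured elsewhere):
<?php
// Illustrative read path for the hybrid scheme above: key-store first,
// database as fallback, copying a db hit back into the key-store.
function load_session(string $id, Memcached $cache, PDO $db): ?string
{
    $data = $cache->get("sess:$id");
    if ($data !== false) {
        return $data;                      // fast path: key-store hit
    }
    $stmt = $db->prepare('SELECT data FROM sessions WHERE id = ?');
    $stmt->execute([$id]);
    $data = $stmt->fetchColumn();
    if ($data === false) {
        return null;                       // no session anywhere
    }
    $cache->set("sess:$id", $data);        // copy the db hit into the key-store
    return $data;
}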