20 08 2013
Your PHP Framework Choice doesn’t Matter
I’ve been wanting to write this post for a long time. It’s a subject that I’m passionate about and enjoy working on the most, and I believe it’s one of the more misunderstood aspects of PHP development.
I’m talking about the speed of PHP and more specifically, evaluating frameworks and tools based on “speed”.
If you have been in the PHP developer community for more than a few months, you will have seen at least a few discussions about what the fastest PHP framework is, as if this were one of the key metrics you should evaluate first when choosing a framework for your team. You may even be contemplating switching from your current framework because you heard of a new framework that is faster.
In the rest of this article, I’m going to do my best to show you why this is not the best line of thinking and provide alternate and, in my opinion, better metrics for evaluating tools.
Edit: This isn’t an anti-framework article. I’m firmly in the framework camp, because I know I can scale applications written on frameworks and get the benefits that they bring. The point of this article is that your particular choice of framework doesn’t matter, so you should use the one you are most comfortable with. Hopefully it takes care of boilerplate code in your application and allows you to test and refactor easily.
Where PHP Developers go Wrong
This doesn’t even include your “framework of choice” to complete your tool set, not to mention that these tools are constantly evolving.
It’s not surprising to me that as a result of this, deploying your finished PHP code is usually an afterthought.
I’ve coined a term to describe this (very important!) PHP developer role, the “frontend PHP developer”. The frontend developer implements APIs, writes great libraries, uses MVC frameworks and works on interfaces.
The other part of a cohesive unit is (surprise) what I call the “backend PHP developer”. The backend developer knows how PHP runs in a server environment, how to set up, run, manage and monitor clusters of Linux servers, how to deploy code with tools such as capistrano and jenkins and how to write infrastructure as code with tools like puppet. In my opinion they should also have a stake in developing some of the PHP application, usually the parts that interact with key backend services, whilst taking a back seat to the frontend developers. This gives them the proper insight into how to tune for best performance. The industry is now calling this role “devops”.
I believe that more PHP developers fit into the frontend role and may not have a good understanding of how their application actually runs on a server, so when things do go wrong they have to fall back onto the skills that they are very good at. For example, let’s say that your production application is very slow. Most programmers’ first thought would be that they need to increase the efficiency of their code and thus make it faster.
This is where the concept of “speed” comes in. Your average programmer understands that a loop that runs 1000 times is going to take more time than one that runs 10 times. Similarly they can profile their application in their development environment, see that a function is taking a certain amount of time and gain speed by improving it.
Programmers then start asking questions like “How fast is symfony2?” and “Is framework A faster than framework B?”, and thus this becomes the vital metric to evaluate tools and enhance the performance of applications.
The Hello World Framework Benchmark
I need to address another issue before we continue, the issue of the “Hello World” PHP framework benchmark. I only need to be brief here as this issue has been brought up and discussed many times, but I’ll say it again. These benchmarks are completely useless and a waste of your time. It doesn’t matter how fast “Hello World” is printed on the screen; I care about real applications and getting real work done. Real applications connect to databases, connect to key value stores like redis and write to disk, and all of these, by the way, are much more likely to cause concurrency and speed issues in your application than PHP itself.
An Important Point about Code Efficiency
Before we continue I need to clarify some of the things I just said. I’m not advocating for disregarding the efficiency of your code, I’m advocating for disregarding micro optimizations to your code or choosing tools purely based on speed.
For example, I use the Doctrine ORM extensively and I’ve heard people disregard it because it’s “slow”. What these people don’t realize is that all of the intensive operations in Doctrine can be cached in memory, making it extremely fast when you weigh the speed trade-offs against the benefits it brings in terms of testing and designing complex entity relationships.
I personally use Doctrine in an application that serves 5000 requests per second. With the right setup and tuning, you can also have very large gains in the performance of your applications at an order of magnitude lower cost.
Sound good? I will elaborate on these points in more detail later in the article.
Setting up a Test
When I decided to sit down and write this article, I thought about some good tests that would support my way of thinking and initially I was planning on creating a standard non-trivial application using a variety of top PHP frameworks. Unfortunately due to time constraints I have been unable to do this but I have come up with an existing solution.
I will take a wordpress installation, fill it with dummy content and use that as the basis of my argument that your framework doesn’t matter.
I’m not advocating for wordpress, and I do not use it other than for my blog. The reason for my selection of wordpress is that it is widely loathed by PHP developers who work with MVC frameworks and is generally seen as archaic by the very PHP developers I’m trying to reach.
For the record, I have no problem with wordpress, it serves a need and does its job well from the end users perspective, a viewpoint that we as developers neglect sometimes. All we see is spaghetti code and an unmaintainable mess, whilst your clients see an easy and intuitive interface.
For those of you who hate wordpress, you can breathe a sigh of relief because this article contains no wordpress code as the whole point of this article is that the framework doesn’t matter.
I want you to think about wordpress only in terms of a dynamic PHP application, it has a database, makes multiple queries per page and it shows fresh content as you update it in some admin interface.
For our purposes, that is all you need to be aware of.
I will be filling that wordpress site with dummy content generated by a plugin that can generate strings of varying length to simulate the dynamic site.
The following screenshots show the test environment as well as debug pages showing that these wordpress pages execute 20-30 SQL queries on an average page load.
The goal of this test is to show you just how far this standard PHP application can be scaled just by tuning the environment it’s hosted on and caching intensive operations.
My aim is to show you that servers are cheaper than man power and that in your team, you should select tools that work for you, tools that you are comfortable and productive in and tools that allow you to write testable, maintainable code as the first priority.
I believe that once you get to certain level of programming skill, you just “know” how intensive a certain operation could be and you incorporate that level of knowledge into every line of code that you write. This leads to more efficient code in any language or framework that you work with and makes optimizing easier later on.
The aim of this article is to show you that you should be writing efficient code, but disregarding “speed”, the latest fad and micro optimizations, and instead focusing on proper infrastructure and furthering your code base in ways that actually benefit it, such as adding new features and writing more tests.
I set up a VPS at linode on their smallest plan, which includes 1 GB of ram and what they call x1 priority on an 8 core CPU.
I would have preferred to do this test on a dedicated box with known CPU power and other resources because it can be hard to track down resource issues on a virtual server. For example, just what does x1 priority on a CPU mean? Unfortunately I did not have a dedicated box available for the test. But you should know that the performance outcomes of my tests would have been far greater on a dedicated machine. In fact I prefer to use dedicated servers for the core parts of my infrastructure and use cloud servers to supplement that dedicated core, which is known as a hybrid setup.
The VPS is running debian squeeze and a standard lamp stack was setup using PHP 5.4. I then ran some apache bench tests from my remote machine to target the homepage of wordpress.
Using apache bench (ab) in this way is not ideal due to the required data transfer; however, it’s not relevant in this case, as you’ll soon see.
The test was 1000 requests with 50 requests concurrently, using a standard lamp stack (apache2.2, php5.4 and mysql5.5) on a VPS with 1 GB of ram
The VPS was able to serve 10 requests per second but the load shot up to 37.46.
Load average on servers is not inherently bad, you just need to know how load is calculated and what resources are available to you, in general you should aim to keep the load under the amount of CPU cores that your server has. For example if you had a quad core machine, a load under 4 is good. Keep in mind your server isn’t just going to stop working once the load goes over this mark, but you may see things like slowdowns in response times eventually. It’s also good to look at how much of your CPU is idle in top, other key metrics include disk io, network io and ram usage, which won’t be discussed here.
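The rule of thumb above can be written down as a quick check. This is only a minimal sketch in Python (chosen purely for illustration, it is not part of the test setup), and the function names are hypothetical. It compares a load average against the number of CPU cores and treats anything above 1.0 per core as a warning sign, not a hard failure.

```python
import os


def load_per_core(load_average: float, cpu_cores: int) -> float:
    """Return the load average divided by the number of CPU cores."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return load_average / cpu_cores


def is_overloaded(load_average: float, cpu_cores: int) -> bool:
    """Rule of thumb only: a load above the core count deserves a closer look."""
    return load_per_core(load_average, cpu_cores) > 1.0


def current_load_per_core() -> float:
    """Read the 1-minute load average of this machine (Unix-like systems only)."""
    one_minute, _five_minutes, _fifteen_minutes = os.getloadavg()
    return load_per_core(one_minute, os.cpu_count() or 1)


# The quad core example from the text: a load under 4 is fine, over 4 is not.
print(is_overloaded(3.2, 4))
print(is_overloaded(5.5, 4))
```

Calling `current_load_per_core()` on a Linux server gives the same number you would work out by hand from `top`. It says nothing about disk io, network io or ram usage, which you still need to look at separately.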
Obviously a load of 37 on a small VPS which should have a load of 1-2 is a big problem and I’m going to set out and try to solve that problem.
The first thing I did was replace apache2 and mod_php with nginx and php-fpm without any changes to the default configuration for these services.
For the rest of the article, I’m going to focus on nginx/php-fpm. For those of you who use apache, I encourage you to look at apache 2.4 which has more effective ways to serve PHP similar to the way nginx does now and by doing so you will realize more performance.
My goal from here is to continue to tune the environment that this PHP application runs on and achieve a moderate requests/second that you may see on a mid level application without focusing on the application itself. One exception to this is caching, which we’ll talk about now.
If you use a framework, it probably comes with libraries to cache things, and wordpress in this case isn’t any different. It has something called the WP Object cache, which is an api that wordpress uses to store things like the results of queries. You can implement that api and store that data in any way that you need to, such as in memcached or redis, which is very similar to what you will find in modern frameworks. This will allow us to greatly reduce the number of SQL queries per page and improve our response times.
I’m not going to spend too much time here, other than to say that in WP, there are many object cache plugins available, I chose one that stores data in memcached. I didn’t choose a full page caching plugin because I wanted the target of this test to be dynamic, a page that is compiled every time from a variety of sources and makes connections to external services such as memcached.
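To make the idea concrete, here is a minimal sketch of the pattern that object caches like this follow: look in the cache first, and only run the expensive query on a miss. It is written in Python purely for illustration and is not the WP Object cache api. Every name in it (`CacheBackend`, `InMemoryBackend`, `ObjectCache`, `get_or_compute`) is hypothetical, and the in-memory dictionary stands in for a real store such as memcached or redis.

```python
from typing import Any, Callable, Dict, Optional, Protocol


class CacheBackend(Protocol):
    """Hypothetical storage interface; a memcached or redis client could sit behind it."""

    def get(self, key: str) -> Optional[Any]: ...

    def set(self, key: str, value: Any) -> None: ...


class InMemoryBackend:
    """Stand-in backend that keeps values in a plain dictionary."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        return self._store.get(key)

    def set(self, key: str, value: Any) -> None:
        self._store[key] = value


class ObjectCache:
    """Look in the cache first, fall back to the expensive call on a miss."""

    def __init__(self, backend: CacheBackend) -> None:
        self._backend = backend
        self.misses = 0

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        cached = self._backend.get(key)
        if cached is not None:
            return cached
        # Cache miss: run the expensive operation once and remember the result.
        # Simplification: a result of None is never cached in this sketch.
        self.misses += 1
        value = compute()
        self._backend.set(key, value)
        return value


query_count = 0


def fetch_recent_posts() -> list:
    """Pretend database query; counts how often it actually runs."""
    global query_count
    query_count += 1
    return ["post-1", "post-2", "post-3"]


cache = ObjectCache(InMemoryBackend())
for _ in range(5):
    posts = cache.get_or_compute("recent_posts", fetch_recent_posts)

print(f"5 page loads, {query_count} query executed")
```

Because `ObjectCache` only talks to the two-method backend interface, swapping the dictionary for a networked store means writing one small class and leaves the calling code alone. The sketch leaves out expiry and invalidation, which any real object cache has to handle.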
I also compiled and installed zend optimizer for PHP 5.4 and did some tuning of PHP FPM and nginx to effectively use the limited resources available on the VPS.
The test of 1000 requests with 50 of those concurrent produced a load of 2.97 and 24.20 requests per second. This means we have more than doubled the number of requests we can handle per second with minimal changes to our code on a tiny VPS that costs just $20 a month and, in the process of doing so, brought the load average down to a level that is suitable for a server of this size.
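For anyone who wants to check the comparison, the arithmetic is short enough to sketch. The figures are the ones reported above (10 requests per second at a load of 37.46 on the default lamp stack, then 24.20 requests per second at a load of 2.97 after the changes). The Python function names are hypothetical and only there for illustration.

```python
def throughput_gain(before_rps: float, after_rps: float) -> float:
    """How many times more requests per second the tuned setup serves."""
    return after_rps / before_rps


def load_reduction(before_load: float, after_load: float) -> float:
    """How many times lower the load average is after tuning."""
    return before_load / after_load


# Figures reported in the two test runs above.
LAMP_RPS, LAMP_LOAD = 10.0, 37.46
TUNED_RPS, TUNED_LOAD = 24.20, 2.97

gain = throughput_gain(LAMP_RPS, TUNED_RPS)
reduction = load_reduction(LAMP_LOAD, TUNED_LOAD)
print(f"throughput: {gain:.2f}x, load average: {reduction:.1f}x lower")
```

In other words, roughly 2.4 times the throughput at about one twelfth of the load, on the same $20 VPS.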
Server Cost vs Developer Cost
Squeezing performance out of a tiny $20 per month VPS isn’t going to do much to illustrate the point I am trying to make, for that we need a bigger test, but first we need to define a few things.
Let’s start off by collecting a few data points that will serve as the basis for my arguments. Let’s assume that the average developer salary is $75000 USD per year and that you have two developers on your team for a (surprise) grand total of $150000 per year in development fees or $12500 a month. Some of you make more than this, some less, but we need a figure and we’ll go with this. For simplicity’s sake, nobody pays any taxes.
Now that we have some dollar amounts, I went and upgraded the Linode 1 GB VPS to a Linode 16GB VPS for the additional cost of $300 per month bringing us to a total of $320 per month.
After the migration, I did some tuning of nginx, php-fpm and the kernel to take advantage of the new resources available and set up another test except this time, instead of 1000 requests and 50 concurrent users it was 5000 requests and 1000 concurrent users.
The results were a total time of 10.8 seconds, requests per second of 456.71 and a load average of 6.03.
The load average is a little higher than I would have liked, but this is due to the virtual environment and the host machine deciding which users can use what amount of resources at a certain time (the x1 or x8 priority that they advertise affects how much CPU you can use). However, in this case it isn’t too bad since this VPS can comfortably handle a load of around 8.00. This is where I would have liked a dedicated box for this test. Despite all of this, in approximately the same amount of time as our first test with apache on the small VPS that only served 10 requests per second, we have served 5000 requests with hundreds concurrently and every request was dynamic PHP.
The cost for this new found power was a measly three hundred dollars and I by no means pushed this VPS to its limits.
When we take into account the developer salaries from above, which come in at around $12500 a month, the $300 that we are paying for hosting doesn’t even register on the scale of the cost to employ said developers.
If we take that monthly amount and work out the daily cost, it’s just over $400 per day to have this development team so unless that development team can realize an order of magnitude gain in performance of their application in a day and a half, you should spend more time looking at your infrastructure than at your code.
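The numbers behind this comparison are easy to reproduce. The sketch below uses the salary, team size and hosting figures from this section. The 30 day month is my own simplifying assumption, and the Python names are hypothetical.

```python
DEVELOPER_SALARY_PER_YEAR = 75_000   # USD, figure assumed in this section
TEAM_SIZE = 2                        # developers
EXTRA_HOSTING_PER_MONTH = 300        # USD, cost of the VPS upgrade
DAYS_PER_MONTH = 30                  # assumption: calendar days, not working days


def team_cost_per_month(salary_per_year: float, team_size: int) -> float:
    """Monthly cost of the whole team."""
    return salary_per_year * team_size / 12


def team_cost_per_day(salary_per_year: float, team_size: int,
                      days_per_month: int = DAYS_PER_MONTH) -> float:
    """Daily cost of the whole team."""
    return team_cost_per_month(salary_per_year, team_size) / days_per_month


monthly = team_cost_per_month(DEVELOPER_SALARY_PER_YEAR, TEAM_SIZE)
daily = team_cost_per_day(DEVELOPER_SALARY_PER_YEAR, TEAM_SIZE)
single_developer_daily = team_cost_per_day(DEVELOPER_SALARY_PER_YEAR, 1)

print(f"team per month: ${monthly:,.2f}")
print(f"team per day:   ${daily:,.2f}")
print(f"hosting upgrade equals {EXTRA_HOSTING_PER_MONTH / daily:.2f} team-days")
print(f"or {EXTRA_HOSTING_PER_MONTH / single_developer_daily:.2f} days of one developer")
```

Under these assumptions the team costs about $416.67 per day, so the $300 upgrade is worth about 0.72 days of the whole team, or about 1.44 days of a single developer. Counting only working days would make each developer day more expensive and the hosting bill look even smaller by comparison.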
I also don’t believe I am doing the comparison justice by using virtual private servers, the gains are far greater when you use dedicated hardware with known resources such as number of CPU cores available and disk speed. If you’re having disk I/O issues, see if those issues still exist on a dedicated box with 4 SSD hard drives in a raid 10 array!
I hope that it is also clear that this virtual private server does not mimic a proper production environment for a non trivial application, the gains would be even greater with a cluster of servers that would allow us to split off the individual services and also load balance them.
What I wanted to show in this article is that your framework doesn’t matter, but what I really mean by that is that you should select the tools that you like, tools that are easy for you to work with and tools that are easy for your team to work with. You don’t need to jump on the bandwagon and framework hop the minute the next big thing comes out.
I’ve shown that you can get good performance out of any PHP application with the right tuning and right planning. The most important thing that you can do as a developer is implement caching in your application and make it easy to change the storage that your caching uses. This will ensure your application will scale well initially.
It’s also very important to develop a sense of what will slow your application down and avoid those bumps in the road whilst you are developing. It’s also important to realize when the time you are spending to optimize will not render the greatest performance rewards.
It’s also important to note that at some point, micro optimizations do matter, but that’s not until you’re approaching Facebook status, and by the time you get there, people are already working on solutions such as HHVM. Most of your applications can be scaled just fine with the right infrastructure.
And finally, get to know your infrastructure, don’t neglect the operations side of your application, chances are huge performance gains are waiting to be realized with the right team. It’s a different skill-set and you should aim for a mix of these skills in your team. If you outsource your operations to platform as a service providers, you may be just fine, but you are limited in your ability to understand what is truly going on.