Archive for the ‘Tuning’ tag
New Hosting Environment – Part 5 – Varnish
In this fifth and final installment I would like to discuss using Varnish to further improve the performance of your website. Varnish bills itself as a state-of-the-art, high-performance HTTP accelerator. What this actually means when you remove the buzzwords and replace them with English is that Varnish is a caching reverse proxy. When we think about caching web proxies we usually think of something that sits on a corporate network and caches incoming web content, so that sites that see a lot of traffic have most, if not all, of their content stored locally. A reverse proxy works the same way, but in the opposite direction: it sits in front of the web server rather than in front of the users.
When we run dynamic websites like WordPress, Joomla, Plone, or Drupal, each time a page is loaded the application code has to pull data from the database, render the page into static HTML, and serve that static page to your browser. Earlier we talked about a plugin for WordPress called WP-SuperCache, which caches these pre-rendered files and serves them from the file system. Not all content managers have a plugin like WP-SuperCache. Also, on every page load WP-SuperCache still has to check the database to determine whether the content it has cached is fresh or needs to be updated. Finally, WP-SuperCache must run on the same server that is running WordPress.
A typical dynamic web server will look something like this:
Enter Varnish. Varnish, like the caching web proxy at your office, is a proxy. It sits between my nginx web server and your browser. The difference is, while your proxy at work is configured to cache the content of many websites and serve that cached content to users within your office, Varnish is configured to cache the content of only one website. Varnish serves that cached content to everyone who attempts to visit the site.
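To make the idea concrete, here is a minimal sketch of what a caching reverse proxy does, written in Python rather than as a real Varnish configuration. The names TinyCache and render_page and the 120 second lifetime are assumptions made up for illustration; Varnish itself is far more sophisticated about what it caches and for how long.

```python
import time
from typing import Callable, Dict, Tuple


class TinyCache:
    """In-memory cache that sits in front of a slow backend, keyed by URL path."""

    def __init__(self, backend: Callable[[str], str], ttl_seconds: float = 120.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.backend = backend          # hypothetical stand-in for nginx + PHP
        self.ttl_seconds = ttl_seconds  # assumed freshness lifetime
        self.clock = clock
        self.store: Dict[str, Tuple[float, str]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> str:
        now = self.clock()
        cached = self.store.get(path)
        if cached is not None and now - cached[0] < self.ttl_seconds:
            self.hits += 1
            return cached[1]            # served without touching the backend
        self.misses += 1
        body = self.backend(path)       # only a miss reaches the backend
        self.store[path] = (now, body)
        return body


def render_page(path: str) -> str:
    """Pretend dynamic page render (database query plus templating)."""
    return f"<html><body>Rendered {path}</body></html>"


if __name__ == "__main__":
    cache = TinyCache(render_page)
    for _ in range(1000):
        cache.get("/")
    print(f"backend renders: {cache.misses}, cache hits: {cache.hits}")
```

A thousand visitors asking for the same page cost the backend a single render; everyone else is served from memory until the cached copy goes stale.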
A typical dynamic web server with Varnish will look something like this:
Varnish can also be made aware of more than one backend server, which means I could have a single Internet-facing Varnish server with two or more load-balanced web servers behind it.
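As a rough sketch of how one front end can spread requests across several backends, the Python below hands out backends in round-robin order. The class name and the web1/web2 addresses are hypothetical, and this is only an illustration of the idea, not how Varnish implements its own backend selection.

```python
import itertools
from typing import Iterator, List


class RoundRobinDirector:
    """Hands out backend addresses in a fixed rotation."""

    def __init__(self, backends: List[str]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle: Iterator[str] = itertools.cycle(backends)

    def pick(self) -> str:
        # Each call returns the next backend, wrapping around at the end.
        return next(self._cycle)


if __name__ == "__main__":
    # Hypothetical backend addresses, for illustration only.
    director = RoundRobinDirector(["web1:8080", "web2:8080"])
    for request_number in range(4):
        print(request_number, "->", director.pick())
```

Round-robin is the simplest possible policy; it ignores how busy or healthy each backend is, which a production setup would want to take into account.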
Two options for how this might be configured are laid out here:
Varnish is also able to add, remove, modify, or otherwise mangle the headers passed from the server to your browser. You can strip cookies, add Expires headers, and a whole host of other things. Expires headers can be used to make the client cache content such as images, style sheets, and JavaScript in the browser on the local machine. That is the fastest way to serve content, as it eliminates the network completely. Varnish really is a fantastic product for accelerating your dynamic website.
Because my connection at home is nowhere near fast enough to really benchmark my new server, especially with Varnish in place, I decided to run ApacheBench locally. To give you something to compare against I ran it against my single backend server, as well as against the Varnish front end.
Completed Requests: 17316 vs 50000, a 188.75% improvement
Requests Per Second: 34.63 vs 219.305, a 533.28% improvement
Time Per Request: 28.88 vs 4.39, an 84.80% improvement
New Hosting Environment – Part 4 – WP-Cache
Continuing my saga of trying to extract every ounce of performance out of my new hosting environment, it’s time to move on to caching. Up until this point, every time a page is loaded the PHP scripts have to query the database for the appropriate records and turn that information into static HTML that your browser can render. Computers are fast so this doesn’t take very long, as evidenced by previous benchmarks. However, you’ll notice that even with the APC OpCode Cache in place we’re still using more than 50% of the CPU during the benchmarking. I know this can be improved!
Enter the plugin for WordPress called WP-SuperCache. Because I’m running the nginx web server and not Apache, I can’t use the “super” portion of WP-SuperCache, which relies on the Apache module mod_rewrite. However, I can still use a static cache of pre-rendered .html files to improve site performance. Installation of WP-SuperCache is pretty straightforward. You can either upload the folder to your /wp-content/plugins/ folder, or you can use the FTP-based plugin installer within your WordPress Dashboard. I opted for the manual method, as I was able to just wget the file from the server, which has a lot more bandwidth than I have at home.
Once WP-SuperCache is installed you have to enable it through the dashboard. After the plugin is enabled you can adjust its settings. Because of my choice of web server I opted to set the plugin to Half-On, Don’t Cache pages for logged in users, and Cache Rebuild. I then benchmarked the site again from my home computer.
longhammer$ ab -c 25 -t 500 -r http://insa.w9zeb.org/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking insa.w9zeb.org (be patient)
Finished 4667 requests

Server Software: nginx/0.6.36
Server Hostname: insa.w9zeb.org
Server Port: 80

Document Path: /
Document Length: 37195 bytes

Concurrency Level: 25
Time taken for tests: 500.041 seconds
Complete requests: 4667
Failed requests: 8
(Connect: 8, Receive: 0, Length: 0, Exceptions: 0)
Write errors: 0
Total transferred: 174921511 bytes
HTML transferred: 173825455 bytes
Requests per second: 9.33 [#/sec] (mean)
Time per request: 2678.598 [ms] (mean)
Time per request: 107.144 [ms] (mean, across all concurrent requests)
Transfer rate: 341.62 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       40  785 5663.6    149   67346
Processing:   451 1753  675.3   1605    7765
Waiting:       49  175  126.6    156    1983
Total:        553 2538 5737.4   1767   70928
During the benchmark top reported the following server load
last pid: 7037; load averages: 0.00, 0.12, 0.20 up 0+00:48:03 21:17:42
42 processes: 1 running, 41 sleeping
CPU: 0.2% user, 0.0% nice, 1.1% system, 0.2% interrupt, 98.5% idle
Mem: 53M Active, 33M Inact, 57M Wired, 1444K Cache, 55M Buf, 848M Free
Swap: 1920M Total, 1920M Free
I’d like to highlight a few of the important numbers in comparison to the APC benchmark.
Completed Requests: 4578 vs 4667, a 1.94% improvement
Requests Per Second: 9.16 vs 9.33, a 1.86% improvement
Time Per Request: 109.218 vs 107.144, a 1.90% improvement
Transfer Rate: 333.78 KB/s vs 341.62 KB/s, a 2.35% improvement
CPU Utilization: 56.6% vs 0.2%, a 99.65% improvement
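The percentages in the comparison above are changes measured relative to the earlier (APC-only) run. A small Python sketch of that arithmetic follows; the function names are my own, and the higher_is_better flag is an assumption added so that metrics like time per request and CPU load, where smaller is better, still come out as positive improvements.

```python
def percent_change(before: float, after: float) -> float:
    """Change from 'before' to 'after', as a percentage of 'before'."""
    if before == 0:
        raise ValueError("'before' must be non-zero")
    return (after - before) / before * 100.0


def percent_improvement(before: float, after: float,
                        higher_is_better: bool = True) -> float:
    """Positive when the metric moved in the desirable direction."""
    change = percent_change(before, after)
    return change if higher_is_better else -change


if __name__ == "__main__":
    # Figures copied from the APC and WP-SuperCache benchmark runs.
    print(f"Completed requests: {percent_improvement(4578, 4667):.2f}%")
    print(f"Requests per second: {percent_improvement(9.16, 9.33):.2f}%")
    print(f"Time per request: {percent_improvement(109.218, 107.144, False):.2f}%")
    print(f"Transfer rate: {percent_improvement(333.78, 341.62):.2f}%")
    print(f"CPU utilization: {percent_improvement(56.6, 0.2, False):.2f}%")
```

Measuring against the earlier run keeps every comparison on the same footing, so a jump from 10 to 30 requests per second reads as a 200% improvement rather than 66%.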
Thanks to: Bolt of Blue for the image
New Hosting Environment – Part 3 – APC OpCode Cache
As I talked about in a previous post, PHP is an interpreted language. Every time a browser makes a request, the code has to be read, compiled, and executed on the fly. I’m running nginx as my base web server, which doesn’t natively support PHP, so I have to use FastCGI. In the previous post, if you could have seen the full output of top you would have seen four PHP-CGI processes adding up to that 75% CPU load.
Just like the vBulletin board tuning I did on Apache earlier I opted to install APC OpCode Cache on my server. This gives the advantage of pre-compiling the PHP into OpCode and then calling that from cache rather than recompiling every time a page loads.
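The idea behind an opcode cache can be sketched by analogy in Python, which also compiles source text into bytecode before running it. This is not how APC is implemented; the CodeCache class is hypothetical, and comparing the source text to detect changes is an assumption chosen to keep the example short.

```python
import types
from typing import Dict, Tuple


class CodeCache:
    """Keeps compiled code objects so unchanged scripts are compiled only once."""

    def __init__(self) -> None:
        self._cache: Dict[str, Tuple[str, types.CodeType]] = {}
        self.compilations = 0

    def load(self, name: str, source: str) -> types.CodeType:
        entry = self._cache.get(name)
        if entry is not None and entry[0] == source:
            return entry[1]                      # cache hit: skip compilation
        self.compilations += 1
        code = compile(source, name, "exec")     # the expensive step
        self._cache[name] = (source, code)
        return code


def run(cache: CodeCache, name: str, source: str) -> object:
    """Execute a script through the cache and return its 'result' variable."""
    namespace: Dict[str, object] = {}
    exec(cache.load(name, source), namespace)
    return namespace["result"]


if __name__ == "__main__":
    cache = CodeCache()
    script = "result = sum(range(10))"
    for _ in range(100):
        run(cache, "index", script)
    print(f"requests: 100, compilations: {cache.compilations}")
```

The script still runs on every request; only the repeated parse-and-compile work is saved, which is why the database and rendering costs remain after an opcode cache is installed.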
Installing APC on FreeBSD is a simple task. I used
[lars@insa]$ sudo portinstall -Rr www/pecl-apc
to install the FreeBSD Port for APC and added the following to my /usr/local/etc/php.ini file
extension=apc.so
apc.enabled = 1
apc.shm_segments = 1
apc.shm_size = 32
apc.filters=wp-cache-config
As I mentioned in previous posts I intend to benchmark my server configuration each step of the way. Below are the benchmarks run from my home against the remote server.
longhammer$ ab -c 25 -t 500 http://insa.w9zeb.org/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking insa.w9zeb.org (be patient)
Finished 4578 requests

Server Software: nginx/0.6.36
Server Hostname: insa.w9zeb.org
Server Port: 80

Document Path: /
Document Length: 37071 bytes

Concurrency Level: 25
Time taken for tests: 500.001 seconds
Complete requests: 4578
Failed requests: 5
(Connect: 5, Receive: 0, Length: 0, Exceptions: 0)
Write errors: 0
Total transferred: 170893494 bytes
HTML transferred: 169938150 bytes
Requests per second: 9.16 [#/sec] (mean)
Time per request: 2730.456 [ms] (mean)
Time per request: 109.218 [ms] (mean, across all concurrent requests)
Transfer rate: 333.78 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       47  736 5350.5    143   67352
Processing:   731 1915  673.9   1764   11314
Waiting:      106  285  162.9    239    2093
Total:        850 2651 5429.3   1925   71230

last pid: 903; load averages: 2.33, 1.79, 0.91 up 0+00:10:56 20:40:35
43 processes: 5 running, 38 sleeping
CPU: 56.6% user, 0.0% nice, 11.7% system, 0.6% interrupt, 31.2% idle
Mem: 52M Active, 18M Inact, 31M Wired, 64K Cache, 20M Buf, 892M Free
Swap: 1920M Total, 1920M Free
Completed Requests: 3542 vs 4578, a 29.25% improvement
Requests Per Second: 7.08 vs 9.16, a 29.38% improvement
Time Per Request: 141.169 vs 109.218, a 22.63% improvement
Transfer Rate: 257.99 KB/s vs 333.78 KB/s, a 29.38% improvement
CPU Utilization: 75.2% vs 56.6%, a 24.73% improvement
Thanks to: LiquidX for the image
New Hosting Environment – Part 1 – Technologies
I’m in the process of moving http://w9zeb.org to a new server. I’m moving out of a shared environment managed by H-Sphere and hosted on CentOS 5.2, and onto a FreeBSD 7.1-Release virtual machine hosted on VMware ESXi. It’s not that I was unhappy with my previous host; rather, I’m interested in tinkering with all of the pieces that deliver a website. I wanted to be able to tune MySQL, PHP, the file system, etc. and benchmark the effects of those changes without worrying about breaking the other 100 or so sites hosted under H-Sphere.
The technologies that I am planning on using for this site initially are as follows:
- FreeBSD 7.1-Release
- nginx 0.6.36
- varnish 2.0.4
- MySQL 5.1.33
- PHP 5.2.9
- FastCGI
I am intending to benchmark the performance of the server at each step along the way. Below is the first benchmark which is on a raw, virtually untuned server.
longhammer$ ab -c 25 -t 500 http://insa.w9zeb.org/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking insa.w9zeb.org (be patient)
Finished 3542 requests

Server Software: nginx/0.6.36
Server Hostname: insa.w9zeb.org
Server Port: 80

Document Path: /
Document Length: 37071 bytes

Concurrency Level: 25
Time taken for tests: 500.021 seconds
Complete requests: 3542
Failed requests: 5
(Connect: 5, Receive: 0, Length: 0, Exceptions: 0)
Write errors: 0
Total transferred: 132094346 bytes
HTML transferred: 131356986 bytes
Requests per second: 7.08 [#/sec] (mean)
Time per request: 3529.228 [ms] (mean)
Time per request: 141.169 [ms] (mean, across all concurrent requests)
Transfer rate: 257.99 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       40  445 4386.8     48   67208
Processing:   651 2982  579.3   2959    8071
Waiting:      342 2173  442.8   2228    3694
Total:        694 3427 4468.5   3015   73897
During this base benchmark top reported the following load on the server
last pid: 9201; load averages: 4.21, 3.84, 2.55 up 0+04:10:32 20:21:32
42 processes: 5 running, 37 sleeping
CPU: 75.2% user, 0.0% nice, 23.1% system, 1.7% interrupt, 0.0% idle
Mem: 66M Active, 33M Inact, 81M Wired, 1200K Cache, 92M Buf, 811M Free
Swap: 1920M Total, 1920M Free
As you can see the CPU is being hammered and our load averages are > 2.5 during the test. I know that with some tuning we can improve these numbers. Be sure to check back for followups on the performance tuning I am doing on the new server.
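The headline ab figures above are tied together by simple arithmetic, which is worth knowing when comparing runs. The Python sketch below recomputes them from the completed request count, the test duration, and the concurrency level; the AbRun class is a hypothetical helper of my own, not part of ApacheBench.

```python
from dataclasses import dataclass


@dataclass
class AbRun:
    """The three raw numbers from an ab run that the summary lines derive from."""
    complete_requests: int
    seconds: float
    concurrency: int

    @property
    def requests_per_second(self) -> float:
        return self.complete_requests / self.seconds

    @property
    def ms_per_request(self) -> float:
        # Mean time a single client waited for one request.
        return self.concurrency * self.seconds * 1000.0 / self.complete_requests

    @property
    def ms_per_request_overall(self) -> float:
        # Mean across all concurrent requests: wall-clock time per completed request.
        return self.seconds * 1000.0 / self.complete_requests


if __name__ == "__main__":
    # Numbers taken from the untuned baseline run.
    baseline = AbRun(complete_requests=3542, seconds=500.021, concurrency=25)
    print(f"Requests per second: {baseline.requests_per_second:.2f}")
    print(f"Time per request: {baseline.ms_per_request:.3f} ms")
    print(f"Time per request (overall): {baseline.ms_per_request_overall:.3f} ms")
```

The two "Time per request" lines differ by exactly the concurrency level, so with 25 simultaneous clients the per-client figure is 25 times the overall one.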
Thanks to: paradoxperfect for the image
Part 2: Tuning a 1&1 VPS to improve a vBulletin board.
In part 2 of tuning a 1&1 VPS to improve vBulletin I installed an Op-Code Cache utility. Because PHP is an interpreted language, plain mod_php has to read, compile, and execute the code on the fly each time a page loads. This doesn’t take very long on modern computers. However, the more traffic your site gets, the more even those milliseconds add up.
I looked at several Op-Code Cache utilities, and after some discussion with another tech decided not to use eAccelerator but rather APC, partly because of the simplicity of installation and partly because it’s slated to be included as a core part of PHP 6. I would like to thank @floris for this post on installing APC on CentOS.
Below are benchmarks run using ApacheBench:
ingo # ab -c 100 -t 5000 website.com/forums
APC disabled Performance:
Concurrency Level: 100
Time taken for tests: 53.368813 seconds
Complete requests: 50000
Failed requests: 0
Write errors: 0
Non-2xx responses: 50000
Total transferred: 28500000 bytes
HTML transferred: 15900000 bytes
Requests per second: 936.88 [#/sec] (mean)
Time per request: 106.738 [ms] (mean)
Time per request: 1.067 [ms] (mean, across all concurrent requests)
Transfer rate: 521.50 [Kbytes/sec] received
APC Enabled Performance:
Concurrency Level: 100
Time taken for tests: 33.140479 seconds
Complete requests: 50000
Failed requests: 0
Write errors: 0
Non-2xx responses: 50000
Total transferred: 28500000 bytes
HTML transferred: 15900000 bytes
Requests per second: 1508.73 [#/sec] (mean)
Time per request: 66.281 [ms] (mean)
Time per request: 0.663 [ms] (mean, across all concurrent requests)
Transfer rate: 839.82 [Kbytes/sec] received
As you can see, our requests per second went up by 61% (from 936.88 to 1508.73), which is a pretty big deal. Our mean time per request also dropped, from 106.738 ms to 66.281 ms. All in all, for a fairly simple installation this was a worthwhile improvement.
In Part 3, we’ll discuss some extremely minor MySQL tuning I did which made quite possibly the biggest difference of them all as far as performance.
Thanks to: i_am_sam_lee for the image
Part 1: Tuning a 1&1 VPS to improve a vBulletin board.
A friend of mine runs a moderately sized vBulletin web forum. By moderately sized I mean that in 12 months of being online it has gathered 2,851 members, 18,257 threads, and 206,904 posts, with an average number of online users that floats between 50 in the middle of the night and 200 during peak hours, transmitting a daily average of 7 GB of data.
About six months ago another forum member and I helped move the site off of “A Small Orange” shared hosting and on to a 1&1 Virtual Private Server. The performance boost was immediately apparent. Of course, like any web forum, we kept growing: from just over 1,000 members to 2,000 in four months, and another 800 two months after that. Performance is still better than it was on ASO, but it was starting to suffer.
I fired up YSlow and determined that there were a fair number of efficiencies that could be gained by tuning some settings in Apache, specifically by using mod_deflate and Expires headers. The settings I added to /etc/httpd/conf/httpd.conf are as follows.
# mod_deflate rules
# Netscape 4.x or IE 5.5/6.0
BrowserMatch ^Mozilla/4 no-gzip
# IE 5.5 and IE 6.0 have bugs! Ignore them until IE 7.0+
BrowserMatch \bMSIE\s7 !no-gzip
# IE 6.0 after SP2 has no gzip bugs!
BrowserMatch \bMSIE.*SV !no-gzip
# Sometimes Opera pretends to be IE with "Mozilla/4.0"
BrowserMatch \bOpera !no-gzip

# Compress text-based content types
AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE text/xml
AddOutputFilterByType DEFLATE application/xhtml+xml
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE text/javascript
AddOutputFilterByType DEFLATE application/xml
AddOutputFilterByType DEFLATE image/svg+xml
AddOutputFilterByType DEFLATE application/rss+xml
AddOutputFilterByType DEFLATE application/atom+xml
AddOutputFilterByType DEFLATE application/x-javascript
AddOutputFilterByType DEFLATE application/x-httpd-php
AddOutputFilterByType DEFLATE application/x-httpd-fastphp
AddOutputFilterByType DEFLATE application/x-httpd-eruby
AddOutputFilterByType DEFLATE text/html

# Tell intermediate caches that the response varies by browser
Header append Vary User-Agent
These settings alone lowered the daily average from 7 GB down to 3 GB! mod_deflate uses gzip to compress responses according to their content type. Some older browsers can’t handle gzipped data reliably, so the BrowserMatch rules at the top tell Apache not to use gzip for them, and then re-enable it for the browsers that are known to cope.
Next we turn on Expires headers. These add a header to all jpg, jpeg, gif, js, css, and png files saying that the content does not expire, and so need not be refreshed, until April of 2010. This should tell browsers like IE, Firefox, and Safari to keep the files in their local cache. Things like logos, navigation images, etc. don’t change often. Downloading them every time a page loads is wasteful.
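To get a feel for why compressing text pays off so well, here is a small Python sketch using the standard gzip module. The sample markup is made up for illustration, and random bytes are used as a stand-in for content that is already compressed, such as JPEG images; real savings depend entirely on the actual pages being served.

```python
import gzip
import os


def compression_ratio(payload: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    if not payload:
        raise ValueError("payload must not be empty")
    return len(gzip.compress(payload)) / len(payload)


# Hypothetical forum markup: repetitive text, much like real HTML.
HTML_SAMPLE = ('<div class="post"><p>Hello, forum!</p></div>\n' * 500).encode("utf-8")

# Random bytes stand in for data that is already compressed, such as a JPEG.
BINARY_SAMPLE = os.urandom(len(HTML_SAMPLE))

if __name__ == "__main__":
    print(f"HTML:   {compression_ratio(HTML_SAMPLE):.1%} of original size")
    print(f"binary: {compression_ratio(BINARY_SAMPLE):.1%} of original size")
```

Repetitive markup shrinks dramatically while the random data does not shrink at all, which is why the rules only list text-based content types and leave images alone.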
# Match on the file extension only, not anywhere in the file name
<FilesMatch "\.(jpg|jpeg|gif|js|css|png)$">
Header set Expires "Thu, 15 Apr 2010 20:00:00 GMT"
</FilesMatch>
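The Expires value has to be written in the HTTP date format, which is easy to get subtly wrong by hand. As a hedged sketch, the Python below produces a correctly formatted value a given number of days in the future; the function name and the 365 day default are my own assumptions, not anything Apache requires.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_header(now: datetime, days: int = 365) -> str:
    """Return an HTTP-date string 'days' after 'now', always expressed in GMT."""
    expiry = now.astimezone(timezone.utc) + timedelta(days=days)
    # usegmt=True emits the "GMT" suffix that HTTP expects instead of "+0000".
    return format_datetime(expiry, usegmt=True)


if __name__ == "__main__":
    print("Header set Expires", f'"{expires_header(datetime.now(timezone.utc))}"')
```

A fixed date like the one in the rule above eventually passes, at which point browsers stop caching the files, so the value needs to be refreshed from time to time.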
These are just two of the steps I’ve taken to improve performance of this vBulletin site. I will detail some of the additional changes in upcoming posts. If you have any questions feel free to comment. If you have suggestions on how I could improve these settings even further I’d love to hear from you as well.
Thanks to: Leff for the image

