Sleep Is For The Weak

A Caffeinated Ham Radio Geek’s Unix Musings

Archive for the ‘Tuning’ tag

New Hosting Environment – Part 5 – Varnish

with 2 comments

In this fifth and final installment I would like to discuss using Varnish to further improve the performance of your website.  Varnish is a state-of-the-art, high-performance HTTP accelerator.  What this actually means when you remove the buzzwords and replace them with English is that Varnish is a caching reverse proxy.  When we think about caching web proxies we usually think of something that sits on a corporate network and caches incoming web content so that sites that see a lot of traffic have most, if not all, of their content stored locally.  A reverse proxy works the same way but in the opposite direction: it sits in front of the web server rather than in front of the users.

When we run dynamic websites like WordPress, Joomla, Plone, or Drupal, each time a page is loaded the application code has to pull data from the database, render an HTML page, and serve that page to your browser.  Earlier we talked about a plugin for WordPress called WP-SuperCache which caches these pre-rendered pages and serves them from the file system.  Not all content managers have a plugin like WP-SuperCache.  Also, WP-SuperCache still has to check the database on every page load to determine whether the content it has cached is fresh or needs to be updated.  Finally, WP-SuperCache must run on the same server that is running WordPress.

A typical dynamic web server will look something like this:

Basic Dynamic Server Configuration

Enter Varnish.  Varnish, like the caching web proxy at your office, is a proxy.  It sits between my nginx web server and your browser.  The difference is, while your proxy at work is configured to cache the content of many websites and serve that cached content to users within your office, Varnish is configured to cache the content of only one website.  Varnish serves that cached content to everyone who attempts to visit the site.
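To make the idea concrete, here is a minimal sketch of what a caching reverse proxy does, written in Python.  It is a toy and not how Varnish is implemented: the class name, the 120-second TTL, and the `slow_backend` function are all made up for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class TinyReverseCache:
    """Toy stand-in for a caching reverse proxy (not Varnish itself)."""

    def __init__(self, backend: Callable[[str], str], ttl_seconds: float = 120.0):
        self.backend = backend          # hypothetical function that renders a page
        self.ttl_seconds = ttl_seconds  # how long a cached copy counts as fresh
        self.store: Dict[str, Tuple[float, str]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> str:
        now = time.monotonic()
        cached = self.store.get(path)
        if cached is not None and now - cached[0] < self.ttl_seconds:
            self.hits += 1
            return cached[1]            # served from memory, backend untouched
        self.misses += 1
        body = self.backend(path)       # only a miss reaches the web server
        self.store[path] = (now, body)
        return body


backend_calls = []


def slow_backend(path: str) -> str:
    # Pretend this is nginx + PHP + MySQL rendering the page.
    backend_calls.append(path)
    return f"<html>rendered {path}</html>"


proxy = TinyReverseCache(slow_backend)
for _ in range(1000):
    proxy.get("/")

print(len(backend_calls), proxy.hits, proxy.misses)  # prints: 1 999 1
```

A thousand visitors cost the backend a single page render, which is the whole point of putting a cache in front of the web server.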

A typical dynamic web server with Varnish will look something like this:

Cache Enabled Basic Server Configuration

Varnish can also be made aware of more than one backend server, which means I could have a single Internet-facing Varnish server with two or more load-balanced web servers behind it.
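As a rough illustration of spreading requests across backends, the sketch below hands out backends in round-robin order.  Round-robin is just one possible policy, assumed here for simplicity, and the backend addresses are hypothetical rather than taken from my setup.

```python
from collections import Counter
from itertools import cycle

# Hypothetical backend addresses; not taken from the real setup.
backends = ["10.0.0.11:8080", "10.0.0.12:8080"]
next_backend = cycle(backends)


def pick_backend() -> str:
    """Return the next backend in round-robin order."""
    return next(next_backend)


# Ten incoming requests get split evenly between the two web servers.
assigned = Counter(pick_backend() for _ in range(10))
print(assigned)
```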

Two options for how this might be configured are laid out here:

A Load Balanced configuration

Scaled, high performance, load balanced configuration

Varnish is also able to add, remove, modify, or otherwise mangle the headers passed from the server to your browser.  You can strip cookies, add Expires headers, and a whole host of other things.  Expires headers can be used to make the client cache content such as images, style sheets, and JavaScript in the browser on the local machine.  That is the fastest way to serve content, as it eliminates the network completely.  Varnish really is a fantastic product for accelerating your dynamic website.
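The kind of header rewriting described here can be sketched in a few lines of Python.  This is not Varnish configuration; the function name, the list of static suffixes, and the 30-day lifetime are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

STATIC_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".css", ".js")


def rewrite_headers(path: str, headers: dict, now: datetime, days: int = 30) -> dict:
    """Return a copy of the response headers, adjusted for static files."""
    out = dict(headers)
    if path.lower().endswith(STATIC_SUFFIXES):
        out.pop("Set-Cookie", None)  # cookies make a response look per-user
        out["Expires"] = format_datetime(now + timedelta(days=days), usegmt=True)
        out["Cache-Control"] = f"public, max-age={days * 86400}"
    return out


now = datetime(2009, 5, 2, 8, 0, tzinfo=timezone.utc)
static = rewrite_headers("/wp-content/logo.png", {"Set-Cookie": "sid=abc"}, now)
dynamic = rewrite_headers("/index.php", {"Set-Cookie": "sid=abc"}, now)
print(static)
print(dynamic)
```

Static files lose their cookie and gain a far-future expiry, while the dynamic page passes through untouched.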

Because my connection at home is nowhere near fast enough to really benchmark my new server, especially with Varnish in place, I decided to run ApacheBench locally.  To give you something to compare against I ran it against my single backend server, as well as against the Varnish front end.

I am only going to highlight a few of the important numbers in comparison to the base benchmark.
Completed Requests:  17316 vs 50000 a 188.75% improvement
Requests Per Second: 34.63 vs 219.305 a 533.28% improvement
Time Per Request:  28.88 vs 4.39 an 84.80% improvement
As I think you can see, Varnish makes a pretty impressive improvement across the board when it comes to performance.  These numbers cannot be compared to previous benchmarks, as they were run locally, eliminating any network latency between the benchmarking station and the server.  Down the road I would like to spin up two or three more VMs and play around with load-balanced Varnish servers using CARP on the front end, with two or more load-balanced web servers behind them.  However, for now I believe I have a setup that will suit my site’s needs long into the future.

Thank you for reading the five part series on my new hosting environment.  If you have any questions please feel free to contact me via the comments section.

Thanks to: pb031 for the image

Written by W9ZEB

May 2nd, 2009 at 8:00 am

New Hosting Environment – Part 4 – WP-Cache

without comments

Continuing my saga of trying to extract every ounce of performance out of my new hosting environment, it’s time to move on to caching.  Up until this point every time a page is loaded the PHP scripts have to query the database for the appropriate records, take the information from the database, and render the page into static HTML that your browser can render.  Computers are fast so this doesn’t take very long, as evidenced by previous benchmarks.  However you’ll notice that even with the APC OpCode Cache in place we’re still hammering the CPU for 50% of its capacity during the benchmarking.  I know this can be improved!

Enter the plugin for WordPress called WP-SuperCache.  Because I’m running the nginx web server and not Apache, I will be unable to use the “super” portion of WP-SuperCache, which works with the Apache module mod_rewrite.  However, I can still use a static cache of pre-rendered .html to improve site performance.  Installation of WP-SuperCache is pretty straightforward.  You can either upload the folder to your /wp-content/plugins/ folder, or you can use the FTP plugin system within your WordPress Dashboard.  I opted for the manual method as I was able to just wget the file from the server, which has a lot more bandwidth than I have at home.

Once WP-SuperCache is installed you have to enable it through the dashboard.  After the plugin is enabled you can adjust its settings.  Because of my selection in webservers I opted to set the plugin to Half-On, Don’t Cache pages for logged in users, and Cache Rebuild.  I then benchmarked the site again from my home computer.

longhammer$ ab -c 25 -t 500 -r http://insa.w9zeb.org/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking insa.w9zeb.org (be patient)
Finished 4667 requests

Server Software:        nginx/0.6.36
Server Hostname:        insa.w9zeb.org
Server Port:            80

Document Path:          /
Document Length:        37195 bytes

Concurrency Level:      25
Time taken for tests:   500.041 seconds
Complete requests:      4667
Failed requests:        8
(Connect: 8, Receive: 0, Length: 0, Exceptions: 0)
Write errors:           0
Total transferred:      174921511 bytes
HTML transferred:       173825455 bytes
Requests per second:    9.33 [#/sec] (mean)
Time per request:       2678.598 [ms] (mean)
Time per request:       107.144 [ms] (mean, across all concurrent requests)
Transfer rate:          341.62 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       40  785 5663.6    149   67346
Processing:   451 1753 675.3   1605    7765
Waiting:       49  175 126.6    156    1983
Total:        553 2538 5737.4   1767   70928
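For anyone wondering how the summary figures relate to each other, here is a small sketch that recomputes them from the raw counts in the output above.  The function name is my own; the formulas are simply the ones the reported numbers are consistent with.

```python
def ab_summary(complete: int, seconds: float, concurrency: int) -> dict:
    """Recompute ApacheBench's derived figures from its raw counts."""
    per_request_s = seconds / complete
    return {
        "requests_per_second": complete / seconds,
        # Mean time one client waits for its request to come back.
        "ms_per_request_mean": concurrency * per_request_s * 1000,
        # Mean time between completed requests across all clients.
        "ms_per_request_across_all": per_request_s * 1000,
    }


stats = ab_summary(complete=4667, seconds=500.041, concurrency=25)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

The two "Time per request" lines differ by exactly the concurrency level: 25 clients each waiting about 2.7 seconds works out to one finished request roughly every 107 ms.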

During the benchmark top reported the following server load

last pid:  7037;  load averages:  0.00,  0.12,  0.20    up 0+00:48:03  21:17:42
42 processes:  1 running, 41 sleeping
CPU:  0.2% user,  0.0% nice,  1.1% system,  0.2% interrupt, 98.5% idle
Mem: 53M Active, 33M Inact, 57M Wired, 1444K Cache, 55M Buf, 848M Free
Swap: 1920M Total, 1920M Free

I’d like to highlight a few of the important numbers in comparison to the APC benchmark.

Completed Requests:  4578 vs 4667 a 1.94% improvement
Requests Per Second: 9.16 vs 9.33 a 1.86% improvement
Time Per Request:  109.218 vs 107.144 a 1.90% improvement
Transfer Rate: 333.78 KB/s vs 341.62 KB/s which is a 2.35% improvement.
CPU Utilization: 56.6% vs 0.2% which is a 99.65% improvement
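The percentages above are changes relative to the earlier (APC) figure.  A short sketch of that arithmetic, using the numbers from the two benchmark runs, looks like this; for time per request and CPU a negative change is the improvement.

```python
def percent_change(before: float, after: float) -> float:
    """Change relative to the 'before' value, as a percentage."""
    return (after - before) / before * 100


rows = [
    ("Completed requests", 4578, 4667),
    ("Requests per second", 9.16, 9.33),
    ("Time per request (ms)", 109.218, 107.144),
    ("Transfer rate (KB/s)", 333.78, 341.62),
    ("CPU user (%)", 56.6, 0.2),
]
for label, before, after in rows:
    print(f"{label}: {percent_change(before, after):+.2f}%")
```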
In terms of pages served the change wasn’t all that dramatic, at least in part because we were already at or very near the capacity of my internet connection at home when we ran the APC OpCode Cache benchmarks.  What you will notice, however, is that the CPU utilization shows a dramatic change.  While the benchmark was running for this test, the top processes on the server were no longer the four PHP-CGI processes but nginx, and nginx had the CPU at just slightly above idle.  Usually when PHP/MySQL-based websites get Slashdotted or land on the front page of Digg, you will see MySQL connection failure messages.  These are sites that have not properly configured caching.  WP-SuperCache isn’t the best form of caching available to us, but it’s a quick and easy way to prevent site outages during peak traffic hours.
Next we’ll replace the WP-SuperCache with an honest to goodness web accelerator called Varnish.

Thanks to: Bolt of Blue for the image

Written by W9ZEB

April 30th, 2009 at 7:30 pm

New Hosting Environment – Part 3 – APC OpCode Cache

without comments

As I talked about in a previous post, PHP is an interpreted language.  The code has to be read, interpreted, and compiled on the fly each time a browser makes a request.  I’m running nginx as my base web server, which doesn’t natively support PHP, so I have to use FastCGI.  In the previous post, if you could have seen the full output of top, you would have seen four PHP-CGI processes adding up to that 75% CPU load.

Just like the vBulletin board tuning I did on Apache earlier I opted to install APC OpCode Cache on my server.  This gives the advantage of pre-compiling the PHP into OpCode and then calling that from cache rather than recompiling every time a page loads.
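The idea behind an opcode cache can be shown with a loose analogy in Python, which also compiles source to bytecode before running it.  This is not how APC works internally; the cache dictionary, the `load_compiled` helper, and the throwaway `page.py` script are all invented for the example.

```python
import os
import tempfile

_code_cache = {}               # path -> (mtime_ns, compiled code object)
counters = {"compiles": 0}


def load_compiled(path: str):
    """Return compiled code for path, recompiling only if the file changed."""
    mtime = os.stat(path).st_mtime_ns
    entry = _code_cache.get(path)
    if entry is not None and entry[0] == mtime:
        return entry[1]                      # cache hit: skip parse + compile
    with open(path, encoding="utf-8") as fh:
        code = compile(fh.read(), path, "exec")
    counters["compiles"] += 1
    _code_cache[path] = (mtime, code)
    return code


with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "page.py")
    with open(script, "w", encoding="utf-8") as fh:
        fh.write("result = 6 * 7\n")
    for _ in range(100):                     # 100 "page loads"
        scope = {}
        exec(load_compiled(script), scope)

print(counters["compiles"], scope["result"])  # prints: 1 42
```

The script runs a hundred times but is compiled only once, which is the work APC saves on every page load.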

Installing APC on FreeBSD is a simple task.  I used

[lars@insa]$ sudo portinstall -Rr www/pecl-apc

to install the FreeBSD Port for APC and added the following to my /usr/local/etc/php.ini file

extension=apc.so

apc.enabled = 1
apc.shm_segments = 1
apc.shm_size = 32
apc.filters=wp-cache-config

As I mentioned in previous posts I intend to benchmark my server configuration each step of the way.  Below are the benchmarks run from my home against the remote server.

longhammer$ ab -c 25 -t 500 http://insa.w9zeb.org/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking insa.w9zeb.org (be patient)
Finished 4578 requests

Server Software:        nginx/0.6.36
Server Hostname:        insa.w9zeb.org
Server Port:            80

Document Path:          /
Document Length:        37071 bytes

Concurrency Level:      25
Time taken for tests:   500.001 seconds
Complete requests:      4578
Failed requests:        5
(Connect: 5, Receive: 0, Length: 0, Exceptions: 0)
Write errors:           0
Total transferred:      170893494 bytes
HTML transferred:       169938150 bytes
Requests per second:    9.16 [#/sec] (mean)
Time per request:       2730.456 [ms] (mean)
Time per request:       109.218 [ms] (mean, across all concurrent requests)
Transfer rate:          333.78 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       47  736 5350.5    143   67352
Processing:   731 1915 673.9   1764   11314
Waiting:      106  285 162.9    239    2093
Total:        850 2651 5429.3   1925   71230

During the benchmark top reported the following server load

last pid:   903;  load averages:  2.33,  1.79,  0.91    up 0+00:10:56  20:40:35
43 processes:  5 running, 38 sleeping
CPU: 56.6% user,  0.0% nice, 11.7% system,  0.6% interrupt, 31.2% idle
Mem: 52M Active, 18M Inact, 31M Wired, 64K Cache, 20M Buf, 892M Free
Swap: 1920M Total, 1920M Free

I’d like to highlight a few of the important numbers in comparison to the base benchmark.

Completed Requests:  3542 vs 4578 a 29.25% improvement
Requests Per Second: 7.08 vs 9.16 a 29.38% improvement
Time Per Request:  141.169 vs 109.218 a 22.63% improvement
Transfer Rate: 257.99 KB/s vs 333.78 KB/s which is a 29.38% improvement.
CPU Utilization: 75.2% vs 56.6% which is a 24.73% improvement

I think it’s safe to say adding an OpCode cache can result in a pretty substantial improvement in the overall performance of your website.  It’s worth noting that 333.78 KB/s isn’t far from maxing out the incoming internet connection into my house.

In Part 4 I will be enabling WP-Cache and will also cover mounting the /var partition as “noatime”

Thanks to: LiquidX for the image

Written by W9ZEB

April 28th, 2009 at 7:30 pm

New Hosting Environment – Part 1 – Technologies

without comments

I’m in the process of moving http://w9zeb.org to a new server.  I’m moving out of a shared environment managed by H-Sphere, hosted on CentOS 5.2, and onto a FreeBSD 7.1-Release virtual machine hosted on VMware ESXi.  I wasn’t unhappy with my previous host; rather, I’m interested in tinkering with all of the pieces that deliver a website.  I wanted to be able to tune MySQL, PHP, the file system, etc. and benchmark the effects of those changes without worrying about breaking the other 100 or so sites hosted under H-Sphere.

The technologies that I am planning on using for this site initially are as follows:

  • FreeBSD 7.1-Release
  • nginx 0.6.36
  • varnish 2.0.4
  • MySQL 5.1.33
  • PHP 5.2.9
  • FastCGI

I am intending to benchmark the performance of the server at each step along the way.  Below is the first benchmark which is on a raw, virtually untuned server.

longhammer$ ab -c 25 -t 500 http://insa.w9zeb.org/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking insa.w9zeb.org (be patient)
Finished 3542 requests

Server Software:        nginx/0.6.36
Server Hostname:        insa.w9zeb.org
Server Port:            80

Document Path:          /
Document Length:        37071 bytes

Concurrency Level:      25
Time taken for tests:   500.021 seconds
Complete requests:      3542
Failed requests:        5
(Connect: 5, Receive: 0, Length: 0, Exceptions: 0)
Write errors:           0
Total transferred:      132094346 bytes
HTML transferred:       131356986 bytes
Requests per second:    7.08 [#/sec] (mean)
Time per request:       3529.228 [ms] (mean)
Time per request:       141.169 [ms] (mean, across all concurrent requests)
Transfer rate:          257.99 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       40  445 4386.8     48   67208
Processing:   651 2982 579.3   2959    8071
Waiting:      342 2173 442.8   2228    3694
Total:        694 3427 4468.5   3015   73897

During this base benchmark top reported the following load on the server

last pid:  9201;  load averages:  4.21,  3.84,  2.55    up 0+04:10:32  20:21:32
42 processes:  5 running, 37 sleeping
CPU: 75.2% user,  0.0% nice, 23.1% system,  1.7% interrupt,  0.0% idle
Mem: 66M Active, 33M Inact, 81M Wired, 1200K Cache, 92M Buf, 811M Free
Swap: 1920M Total, 1920M Free

As you can see the CPU is being hammered and our load averages are > 2.5 during the test.  I know that with some tuning we can improve these numbers.  Be sure to check back for followups on the performance tuning I am doing on the new server.

Thanks to: paradoxperfect for the image

Written by W9ZEB

April 24th, 2009 at 7:19 pm

Part 2: Tuning a 1&1 VPS to improve a vBulletin board.

without comments

In part 2 of tuning a 1&1 VPS to improve vBulletin I installed an Op-Code Cache utility. Because PHP is an interpreted language, using plain mod_php requires the code to be read, interpreted, and compiled on the fly each time a page loads. This doesn’t take very long on modern computers. However, the more traffic your site gets, the more those milliseconds add up.

I looked at several Op-Code Cache utilities, and after some discussion with another tech decided not to use eAccelerator but rather APC, in part because of the simplicity of installation and in part because it’s slated to be included as a core part of PHP 6. I would like to thank @floris for this post on installing APC on CentOS

Below are benchmarks run using ApacheBench:

ingo # ab -c 100 -t 5000 website.com/forums

APC disabled Performance:

Concurrency Level: 100
Time taken for tests: 53.368813 seconds
Complete requests: 50000
Failed requests: 0
Write errors: 0
Non-2xx responses: 50000
Total transferred: 28500000 bytes
HTML transferred: 15900000 bytes
Requests per second: 936.88 [#/sec] (mean)
Time per request: 106.738 [ms] (mean)
Time per request: 1.067 [ms] (mean, across all concurrent requests)
Transfer rate: 521.50 [Kbytes/sec] received

APC Enabled Performance:

Concurrency Level: 100
Time taken for tests: 33.140479 seconds
Complete requests: 50000
Failed requests: 0
Write errors: 0
Non-2xx responses: 50000
Total transferred: 28500000 bytes
HTML transferred: 15900000 bytes
Requests per second: 1508.73 [#/sec] (mean)
Time per request: 66.281 [ms] (mean)
Time per request: 0.663 [ms] (mean, across all concurrent requests)
Transfer rate: 839.82 [Kbytes/sec] received

As you can see, our requests per second went up by 61% (936.88 to 1508.73), which is a pretty big deal. Our mean time per request also dropped, from 106.738 ms to 66.281 ms. All in all, for a fairly simple installation this was a worthwhile improvement.

In Part 3, we’ll discuss some extremely minor MySQL tuning I did which made quite possibly the biggest difference of them all as far as performance.

Thanks to: i_am_sam_lee for the image

Written by W9ZEB

February 14th, 2009 at 9:35 pm

Part 1: Tuning a 1&1 VPS to improve a vBulletin board.

with one comment

A friend of mine runs a moderately sized vBulletin web forum. By moderately sized I mean that in 12 months of being online it has gained 2,851 members, 18,257 threads, and 206,904 posts, with an average number of online users that floats between 50 in the middle of the night and 200 during peak hours, transmitting a daily average of 7 GB of data.

About six months ago another forum member and I helped move the site off of “A Small Orange” shared hosting and on to a 1&1 Virtual Private Server. The performance boost was immediately apparent. Of course, like any web forum, we grew: from just over 1000 members to 2000 members in 4 months, and two months later another 800 members. Performance is still better than it was on ASO, but it is starting to suffer.

I fired up YSlow and determined that there were a fair number of efficiencies that could be gained by tuning some settings in Apache, specifically by using mod_deflate and Expires headers. The settings I added to /etc/httpd/conf/httpd.conf are as follows.

#Mod_Deflate Rules
# Netscape 4.x or IE 5.5/6.0
BrowserMatch ^Mozilla/4 no-gzip
# IE 5.5 and IE 6.0 have bugs! Ignore them until IE 7.0+
BrowserMatch \bMSIE\s7 !no-gzip
# IE 6.0 after SP2 has no gzip bugs!
BrowserMatch \bMSIE.*SV !no-gzip
# Sometimes Opera pretends to be IE with "Mozilla/4.0"
BrowserMatch \bOpera !no-gzip

AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE text/xml
AddOutputFilterByType DEFLATE application/xhtml+xml
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE text/javascript
AddOutputFilterByType DEFLATE application/xml
AddOutputFilterByType DEFLATE image/svg+xml
AddOutputFilterByType DEFLATE application/rss+xml
AddOutputFilterByType DEFLATE application/atom+xml
AddOutputFilterByType DEFLATE application/x-javascript
AddOutputFilterByType DEFLATE application/x-httpd-php
AddOutputFilterByType DEFLATE application/x-httpd-fastphp
AddOutputFilterByType DEFLATE application/x-httpd-eruby
AddOutputFilterByType DEFLATE text/html
Header append Vary User-Agent

These settings alone lowered the daily average from 7 GB down to 3 GB! mod_deflate uses gzip to compress responses based on their content type. Older browsers aren’t able to read gzipped data. The first few rules at the top tell Apache not to use gzip if the browser matches.
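To get a feel for why compressing text pays off, here is a small Python sketch using the standard gzip module.  The markup is made up and far more repetitive than a real forum page, so the ratio it prints is much better than what a live site would see.

```python
import gzip

# Made-up, repetitive markup standing in for a forum page.
page = ("<tr><td class='post'>Hello from the forum</td></tr>\n" * 500).encode("utf-8")
compressed = gzip.compress(page)

ratio = len(compressed) / len(page)
print(f"{len(page)} bytes -> {len(compressed)} bytes ({ratio:.1%} of original)")

# The browser gets back exactly the bytes the server started with.
restored = gzip.decompress(compressed)
```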

Next we turn on Expires headers. This adds a header to all jpg, jpeg, gif, js, css, and png files that says the content expires and should be refreshed in April of 2010. This will hopefully tell browsers like IE, Firefox, and Safari to cache the files in their local store. Things like logos, navigation images, etc. don’t change often. Downloading them every time a page loads is wasteful.

<FilesMatch "\.(jpg|jpeg|gif|js|css|png)$">
Header set Expires "Thu, 15 Apr 2010 20:00:00 GMT"
</FilesMatch>
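How much a FilesMatch-style pattern catches depends on whether it is anchored to the file extension.  The sketch below checks that with Python's re module, assuming it behaves like Apache's regex engine for a pattern this simple; the file names are hypothetical.

```python
import re

loose = re.compile(r"(jpg|jpeg|gif|js|css|png)")
anchored = re.compile(r"\.(jpg|jpeg|gif|js|css|png)$")

names = ["logo.png", "style.css", "jsmith.php", "cssguide.html"]
results = {
    name: (bool(loose.search(name)), bool(anchored.search(name)))
    for name in names
}
for name, (is_loose, is_anchored) in results.items():
    print(f"{name}: loose={is_loose} anchored={is_anchored}")
```

An unanchored pattern matches any name that merely contains "js" or "css", so dynamic pages like jsmith.php would pick up the far-future Expires header too.  Anchoring to a literal dot and the end of the name limits it to real static files.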

These are just two of the steps I’ve taken to improve performance of this vBulletin site. I will detail some of the additional changes in upcoming posts. If you have any questions feel free to comment. If you have suggestions on how I could improve these settings even further I’d love to hear from you as well.

Thanks to: Leff for the image

Written by W9ZEB

February 11th, 2009 at 11:50 pm