A Day in the Life of #Apache
Tips for Making Your Server Run Faster
by Rich Bowen, coauthor of Apache Cookbook
Editor's note: After a brief summer hiatus that included a trip to Portland, Oregon for OSCON 2004, Rich Bowen is back this month with his latest column based on his conversations on the IRC channel #apache. Want to know how to make your web site faster? Rich has some tips to enhance your server's performance.
#apache is an IRC channel that runs on the irc.freenode.net IRC
network. To join this channel, you need to install an IRC client (XChat, mIRC,
and BitchX are popular clients) and enter the following commands:
/server irc.freenode.net
/join #apache
Day Eight
First, a note. I'm writing this at OSCON 2004. It is Friday
morning, and this article was due on Tuesday evening. So, many thanks to my editor
for her understanding, and to paraphrase Douglas Adams, here are some words of wisdom from fajita:
<DrBacchus> fajita: deadlines
<fajita> The great thing about deadlines is the whooshing sound they make as they fly past.
Today we're talking about the rather common question that comes up a couple of
times every week:
<Quixote> How do I make my server run faster?
As you might imagine, the answers to this vary greatly, and primarily depend
on the type of content you have on your web site, and in what ways
you have already reconfigured your server. So we'll approach this question from
a variety of different angles. Also, before we get started, you might be interested to
know that there's a document on the Apache site that addresses this question,
somewhat. That document may be found at httpd.apache.org/docs-2.0/misc/perf-tuning.html.
We'll start with another question.
<Quixote> How do I measure performance?
<DrBacchus> fajita: benchmarking tools
<fajita> Some available benchmarking tools include ab, flood, jmeter, daiquiri, siege
Measuring performance is a tricky business, because you're seldom measuring
exactly the thing that you want to measure. You typically want to know how a web site
will perform in real-world scenarios. Most measuring tools try to simulate these
scenarios to some degree, but it's never quite the same thing. And as long as
you understand that it's not the same thing, these tools can be valuable in
making your web site faster.
The tools that fajita recommends are all freely available, and have various
degrees of functionality. I will not spend this article telling you about all of
them, although we will look at ab a little, because it comes with
Apache, and is a convenient starting place for performance testing.
- Flood
- JMeter
- Daiquiri
- Siege
Now that that's out of the way, we'll talk briefly about using ab.
ab is a very simple-minded benchmarking tool. Don't be fooled
into thinking that it simulates reality in any meaningful way. However, it's
still valuable for checking whether your performance changes have actually made a
difference. ab requests a URL multiple times, and then reports a
few statistics about those transactions. In the following example, -n sets the
total number of requests and -c sets how many of them are made concurrently:
ab -n 1000 -c 10 http://localhost/index.html
Here's some partial output from that command:
Concurrency Level: 10
Time taken for tests: 2.303 seconds
Complete requests: 1000
Failed requests: 0
Broken pipe errors: 0
Total transferred: 253260 bytes
HTML transferred: 12060 bytes
Requests per second: 434.22 [#/sec] (mean)
Time per request: 23.03 [ms] (mean)
Time per request: 2.30 [ms] (mean, across all concurrent requests)
Transfer rate: 109.97 [Kbytes/sec] received
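As a rough check on how to read these numbers, the rate figures can be rederived from the first few lines of the output. The arithmetic below is a sketch based on my understanding of how ab reports its means, using only the values shown above (1000 requests, concurrency 10, 2.303 seconds):

```latex
\begin{align*}
\text{Requests per second} &= \frac{1000}{2.303\ \text{s}} \approx 434.22 \\
\text{Time per request (mean)} &= \frac{10 \times 2.303\ \text{s}}{1000} = 23.03\ \text{ms} \\
\text{Time per request (across all concurrent requests)} &= \frac{2.303\ \text{s}}{1000} \approx 2.30\ \text{ms}
\end{align*}
```

In other words, the first "Time per request" figure is roughly how long each of the 10 concurrent clients waited for a response, while the second is the average time the server spent per request overall.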
The information provided in this output allows for some basic measurement of
the speed of a particular resource, so that you can observe any changes that
occur when you modify your server configuration. Now that you have a way to
verify that my recommendations are actually doing something, let's get on with
the tips.
DNS
Avoid DNS whenever possible. It is slow, and it is out of your control. DNS
lookups take as long as they take. So you want to avoid forcing Apache to do
them.
There are two particular places where this might come up: access control and
logging.
If you are using Allow from or Deny from lines in
your configuration to do access control based on the address of the client, try
to use the IP address, rather than the hostname, of the client you want to
permit or deny. If you use a hostname, Apache will have to do a reverse DNS lookup
on the client's address, followed by a forward lookup to confirm the result, in order
to determine if the client in question is from that hostname.
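Here is a minimal sketch of address-based access control, using the 1.3/2.0-style Order, Allow, and Deny directives. The directory path and the network range are hypothetical placeholders, not values from any real configuration:

```apache
# Hypothetical directory; substitute your own path
<Directory "/usr/local/apache/htdocs/private">
    Order deny,allow
    Deny from all
    # Matching on an address range requires no DNS lookup
    Allow from 192.168.1.0/24
    # Avoid the hostname form, which forces DNS lookups:
    # Allow from example.com
</Directory>
```

The commented-out line shows the hostname form that this section recommends against.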
One option in your logfile configuration is the directive
HostnameLookups. If it's set to Off, which is the default,
Apache will log the IP address of the client. If it's set to On, it will
instead look up and log the hostname.
Don't do that.
Causing Apache to do a DNS lookup for every client access will slow down
performance significantly, and will also cause the number of Apache child
processes to grow, as various processes are using their time to do DNS
lookups.
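In httpd.conf, the safe setting looks like this. It is already the default, so the line mainly serves to make the intent explicit:

```apache
# Log client IP addresses; do no reverse DNS lookup per request
HostnameLookups Off
```

If you need hostnames in your reports, one option is to resolve them after the fact, away from the live server. The logresolve program that ships with Apache is intended for that kind of post-processing of log files.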
.htaccess Files
I've discussed .htaccess files before,
and briefly touched on the performance aspects of using them. The short form
here is that you should avoid using .htaccess files whenever
possible. They are a huge performance drain.
The reason for this is two-fold.
First, there's the fact that Apache has to look in the .htaccess
file every single time a resource is requested from the directory in question.
.htaccess files are not cached, and changes to them take effect
immediately. So Apache has to check for that file every time. Meaning that
you're opening that .htaccess file, reading it in, and parsing the
contents, with every single request.
But, wait, there's more! Because .htaccess files apply to
subdirectories, Apache will have to check the directory above, and perhaps the
one above that, and so on, until it reaches a directory where
.htaccess files are not permitted. This means that every resource
requested from that directory generates two or three or four (etc.) file system accesses,
even if there aren't any .htaccess files in those directories --
Apache still has to look.
The moral here is to set AllowOverride None wherever
possible, and for places that you really need .htaccess files, turn
the feature on only for that directory. Or, better yet, put directives in
httpd.conf, where they belong. (Yes, there are times when
.htaccess files are useful. I just think it's less often than some
folks seem to think.)
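One way to arrange this in httpd.conf is sketched below. The second directory path is a hypothetical example, and AuthConfig is just one of the override classes you might choose to permit there:

```apache
# Turn off .htaccess processing everywhere by default
<Directory />
    AllowOverride None
</Directory>

# Hypothetical directory that really does need .htaccess files
<Directory "/usr/local/apache/htdocs/userdocs">
    # Permit only authentication directives in .htaccess here
    AllowOverride AuthConfig
</Directory>
```

With this layout, Apache only looks for .htaccess files when serving requests from within the second directory.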
Content Negotiation
Content negotiation is a feature that uses the user's browser preferences to
determine what variant of a resource (e.g., which of several languages) is served up.
While this is a wonderful feature, it comes with a pretty large performance
price. Don't use it unless you need it. In practical terms, this means removing
MultiViews from the Options lines in your
config files if you're not using it.
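As a sketch, MultiViews can be switched off for a directory tree like this. The path is a hypothetical document root; use whichever directories currently enable the option in your own configuration:

```apache
# Hypothetical document root; substitute your own path
<Directory "/usr/local/apache/htdocs">
    # Remove MultiViews while leaving other inherited options alone
    Options -MultiViews
</Directory>
```

If an Options line lists its values explicitly, simply deleting MultiViews from that list has the same effect.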
Other Resources
This list is not complete by any means, and there's more thorough
documentation on the Apache web site. (See the 1.3 and 2.0 documentation.) But
these are the places where the largest number of mistakes are made, so they
are a pretty good place to start.
Other things that you might look at are mod_deflate (mod_gzip if you're on 1.3)
and mod_file_cache (mod_mmap_static if you're on 1.3).
And, if you're doing Perl CGI programs, make sure you take a look at mod_perl.
And be sure to drop by #apache with any further questions.
Rich Bowen is a member of the Apache Software Foundation, working primarily on the documentation for the Apache Web Server. DrBacchus, Rich's handle on IRC, can be found on the web at www.drbacchus.com/journal.
In November 2003, O'Reilly Media, Inc. released Apache Cookbook.
Sample Chapter 9, "Error Handling," is available
free online.