Apache Web Server by babbian


									APACHE Web Server

A web server is a file server that serves files, typically in HTML format, over a specific
port. The browser client interprets these HTML files to draw the screen on the remote
client system. Content served can be static web pages or dynamic content generated by
embedded scripting (PHP, ASP), Common Gateway Interface (CGI) scripts or binary
programs, or a combination technology (Java, Flash).

A browser client's GET or POST request carries two parts of addressing data: the URL
and the URI. As used here, the URL (Uniform Resource Locator) identifies the webserver
itself. The URI (Uniform Resource Identifier) identifies the local target resource stored
on the server, along with optional input data for the target resource. How this optional
data is passed depends on the HTTP method used: GET appends it to the request target
as a query string, while POST sends it in the request body.
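
The split between server, target resource, and optional input data can be seen by taking a
URL apart. The following is a minimal sketch using Python's standard urllib.parse module;
the script path and query parameters are hypothetical and only serve as an illustration.

```python
from urllib.parse import urlsplit, parse_qs

def split_request_target(url):
    """Split a URL into the server part, the resource part, and GET input data."""
    parts = urlsplit(url)
    return {
        "server": parts.netloc,          # which webserver to contact
        "resource": parts.path,          # local target resource on that server
        "input": parse_qs(parts.query),  # optional data passed with a GET request
    }

# Hypothetical URL used only for illustration.
info = split_request_target("http://www.my-site.com/cgi-bin/search.cgi?q=apache&page=2")
print(info)
```

With a POST request the same input data would travel in the request body instead of the
query string, so it would not appear in the URL at all.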

Installing Apache

Apache is one of the most popular open source webservers. A Windows port is also
available as an alternative to IIS. Under most Red Hat and Fedora Linux releases it is
distributed as a package whose filename starts with "httpd" followed by a version number.
On Debian/Ubuntu, apt-get installs packages named "apache" plus a version number (for
example, apache2).

Installing Apache components under RedHat:

yum install httpd httpd-tools httpd-devel system-config-httpd

After installation, use the chkconfig command to configure Apache to start at boot:
chkconfig httpd on

Use the httpd init script in the /etc/init.d directory to start, stop, and restart Apache after
booting:
service httpd start
service httpd stop
service httpd restart

You can test whether the Apache process is running with
pgrep httpd
You should get a response consisting of plain process ID numbers.

It is not recommended to run Apache services as an XINETD application for
performance reasons.

Configuring Apache
The configuration file used by Apache is /etc/httpd/conf/httpd.conf. Apache must be
restarted for changes to this configuration file to take effect. This file is a series of
directives: global, per site, and per container. Some directives use HTML-style <tag>
</tag> delimiters. If all virtual sites share a single IP address, all site directives need to
remain in the main httpd.conf file. The directives indicate server/site attributes, file
locations, loadable modules, and access control lists.

Files in the /etc/httpd/conf.d directory are read and automatically appended to the
configuration in the httpd.conf file every time Apache is restarted. This is usually done
for servers supporting multiple sites on multiple IP addresses. Create one configuration
file in this directory per Web site per dedicated IP address with its own set of
NameVirtualHost, <VirtualHost> and <Directory> directives; then remove the
corresponding directives from the main httpd.conf file (if applicable). The files located in
the /etc/httpd/conf.d directory don't need special names beyond the .conf suffix matched
by the stock Include conf.d/*.conf line, and you don't have to refer to them individually in
the httpd.conf file; they are appended automatically. By convention, the config file is
named after the unqualified hostname of the corresponding site.

Web Contents

All the statements that define the features of each web site are grouped together inside
their own <VirtualHost> section, or container, in the httpd.conf file. The most commonly
used statements, or directives, inside a <VirtualHost> container are:

      ServerName: Defines the name of the website managed by the <VirtualHost>
       container. This is needed in named virtual hosting only.
      ServerRoot: Defines the directory where server configuration information is found.
      DocumentRoot: Defines the directory in which the web pages for the site can be
       found.
By default, Apache expects to find all its web page files in the /var/www/html/ directory,
set by a generic DocumentRoot statement at the beginning of httpd.conf. Apache searches
the DocumentRoot directory for an index, or home, page named index.html. For example,
with a ServerName of www.my-site.com and a DocumentRoot directory of
/home/www/site1/, the first page Apache displays for the website is
/home/www/site1/index.html. Apache does not recognize the index page names common
on Windows-based systems (index.htm, default.html or default.htm) unless they are listed
in the DirectoryIndex directive. So a link to index.html is sometimes required for content
generated by packages using this convention.
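
The index page lookup described above can be sketched as a simple ordered search. This is
an illustration of the idea only, not Apache's own code; the function name find_index_page
and the throwaway directory are assumptions made for the example.

```python
import os
import tempfile

def find_index_page(directory, directory_index=("index.html",)):
    """Return the first existing index file, trying the names in the listed order."""
    for name in directory_index:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None

# Demonstration in a temporary directory holding only a Windows-style index page.
with tempfile.TemporaryDirectory() as docroot:
    open(os.path.join(docroot, "default.htm"), "w").close()
    # With only index.html in the list, the page is not found.
    default_lookup = find_index_page(docroot)
    # Listing the extra names, as DirectoryIndex would, finds it.
    extended_lookup = find_index_page(docroot, ("index.html", "index.htm", "default.htm"))
    found_name = os.path.basename(extended_lookup)
```

The order of the names matters: the first name in the list that exists in the directory wins.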

File Security

Apache will display Web page files as long as the files are world readable and the
directories that hold them are world readable and executable (chmod 755). Make sure all
the files and subdirectories in your DocumentRoot have the correct permissions. It is a
good idea to have the files owned by a nonprivileged user so that Web developers can
update the files using FTP or SCP without requiring the root password.

    1. Create a user with a home directory of /home/www. useradd -g users www
    2. Recursively change the file ownership of the /home/www directory
       and all its subdirectories. chown -R www:users /home/www
    3. Change the permissions on the /home/www directory to 755, which allows all
       users, including Apache's httpd daemon, to read the files inside. chmod 755
       /home/www

Use FTP or SCP to transfer new files to your web server as this new user. This will make
all the transferred files automatically have the correct ownership. "403 Forbidden"
indicates incorrect permissions on files or directories under DocumentRoot.
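
The permission requirements above can be checked before a "403 Forbidden" ever reaches a
browser. The following is a rough sketch, assuming a POSIX system; the helper name
permission_problems is made up for this example, and it only inspects the "other" permission
bits, not ownership or SELinux context.

```python
import os
import stat
import tempfile

def permission_problems(docroot):
    """List paths under docroot that 'other' users (such as httpd) cannot read or enter."""
    problems = []
    for current, dirnames, filenames in os.walk(docroot):
        mode = os.stat(current).st_mode
        # Directories need both the read and the execute (search) bit for others.
        if not (mode & stat.S_IROTH and mode & stat.S_IXOTH):
            problems.append(current)
        for name in filenames:
            path = os.path.join(current, name)
            # Files only need the read bit for others.
            if not os.stat(path).st_mode & stat.S_IROTH:
                problems.append(path)
    return problems

# Demonstration in a temporary directory standing in for a DocumentRoot.
with tempfile.TemporaryDirectory() as docroot:
    os.chmod(docroot, 0o755)
    page = os.path.join(docroot, "index.html")
    open(page, "w").close()
    os.chmod(page, 0o600)              # owner-only: Apache would answer 403
    before = permission_problems(docroot)
    os.chmod(page, 0o644)              # world readable: problem cleared
    after = permission_problems(docroot)
```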

Virtual Hosting and DNS

HTTP 1.0

Apache webservers are usually known by a www A entry in the DNS zone. The default
arrangement under HTTP 1.0 is one IP address per <VirtualHost> container of content to
be served, because an HTTP 1.0 request does not have to name the site it wants.

EXAMPLE 1: Apache listens on all interfaces and gives the same content for any IP
address. The <VirtualHost *> directive enforces a single <VirtualHost> container per IP
address, ignoring any ServerName directives you may use inside it.
<VirtualHost *>
  DocumentRoot /home/www/site1
</VirtualHost>

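EXAMPLE 2: Apache listens on all interfaces, but gives different content for two specific
addresses (shown here as the placeholders IP-address-2 and IP-address-3). Web surfers get
the site1 content if they try to access the web server on any of its other IP addresses:
<VirtualHost *>
  DocumentRoot /home/www/site1
</VirtualHost>

<VirtualHost IP-address-2>
  DocumentRoot /home/www/site2
</VirtualHost>

<VirtualHost IP-address-3>
  DocumentRoot /home/www/site3
</VirtualHost>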

HTTP 1.1

Under HTTP 1.1, the Host request header allows more than one site to be served per IP
address by using the NameVirtualHost directive in the /etc/httpd/conf/httpd.conf file.
The DocumentRoot directive defines the directory that contains the index page for that
site. The <VirtualHost> containers tell Apache where it should look for the Web pages
used on each Web site, and the ServerName directive identifies the site each container
serves. You must specify the IP address to which each <VirtualHost> container applies
and the primary Web site domain name for that IP address with the ServerName directive.
You can list secondary domain names to serve the same content using the ServerAlias
directive.

Apache searches for a perfect match of NameVirtualHost, <VirtualHost>, and
ServerName. If there is no match, Apache uses the first <VirtualHost> in the list that
matches the target IP address. A <VirtualHost *> statement indicates it should be used for
all other (non-matched) Web queries. A <VirtualHost> with a specific IP address always
gets higher priority than a <VirtualHost *> covering the same IP address, even if the
ServerName directive doesn't match. As a result, always place <VirtualHost *>
statements at the beginning of the list to cover any other addresses your server may have.
You can also have multiple NameVirtualHost directives, each with a single IP address, in
cases where your Web server has more than one IP address.
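
The matching rules in the paragraph above can be modelled in a few lines. This is a
simplified sketch of the rules as this document states them, not Apache's actual matching
code; the VHost class, the select_vhost function and the 192.0.2.x documentation addresses
are assumptions made for illustration, and only IPv4 addresses and plain hostnames are
handled.

```python
from dataclasses import dataclass

@dataclass
class VHost:
    address: str           # "*" or a literal IP address
    server_name: str = ""  # ServerName, empty for a default container
    aliases: tuple = ()    # ServerAlias entries
    docroot: str = ""

def select_vhost(vhosts, local_ip, host_header):
    """Pick a container: specific-IP containers first, wildcard containers second."""
    host = host_header.split(":")[0].lower()
    specific = [v for v in vhosts if v.address == local_ip]
    wildcard = [v for v in vhosts if v.address == "*"]
    for candidates in (specific, wildcard):
        if not candidates:
            continue
        for v in candidates:
            names = {n.lower() for n in (v.server_name, *v.aliases) if n}
            if host in names:
                return v          # perfect match of address and name
        return candidates[0]      # no name matched: first container for this address
    return None

hosts = [
    VHost("*", docroot="/home/www/default"),
    VHost("192.0.2.10", "www.my-site.com", ("my-site.com",), "/home/www/site1"),
    VHost("192.0.2.10", "www.another-site.com", (), "/home/www/site2"),
]
named = select_vhost(hosts, "192.0.2.10", "www.another-site.com")
fallback = select_vhost(hosts, "192.0.2.10", "unknown.example")
other_ip = select_vhost(hosts, "192.0.2.99", "www.my-site.com")
```

Note how the request for an unknown name on 192.0.2.10 still lands on site1 rather than on
the wildcard container, which is the priority rule described above.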

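Example: a server is configured to provide content for two named sites on one IP address
(shown here as the placeholder IP-address):

NameVirtualHost IP-address

<VirtualHost *>
  Default Directives. (In other words, not site #1 or site #2)
</VirtualHost>

<VirtualHost IP-address>
  ServerName www.my-site.com
  Directives for site #1
</VirtualHost>

<VirtualHost IP-address>
  ServerName www.another-site.com
  Directives for site #2
</VirtualHost>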

With SSL

If you installed Apache with support for secure HTTPS/SSL, always direct the SSL
request to a specific IP address. Virtual host wild cards don't work because Apache SSL
module demands at least one explicit <VirtualHost> directive for IP-based virtual
hosting. When you use wild cards, Apache interprets it as an overlap of name-based and
IP-based <VirtualHost> directives and gives error messages because it can't make up its
mind about which method to use:
Starting httpd: [Sat Oct 12 21:21:49 2002] [error] VirtualHost _default_:443 -- mixing * ports and non-* ports with a
NameVirtualHost address is not supported, proceeding with undefined results

If you try to load any Web page on your web server, you'll see the error:
Bad request!
Your browser (or proxy) sent a request that this server could not understand.
If you think this is a server error, please contact the webmaster

Don't use virtual hosting statements with wild cards except for the very first
<VirtualHost> directive, which defines the web pages to be displayed when matches to the
other <VirtualHost> directives cannot be found. The site that also runs on SSL gets an
explicit address (shown here as the placeholder IP-address):
NameVirtualHost *
<VirtualHost *>
  Directives for other sites
</VirtualHost>

<VirtualHost IP-address>
  Directives for site that also runs on SSL
</VirtualHost>

Compressing / Compacting Web Pages

Apache has the ability to dynamically compress static Web pages into gzip or deflate
format and then send the result to the browser using the mod_deflate.so loadable module
(see the Apache documentation for the current directives implementing this module).
Most Web browsers support this format, transparently uncompressing the data and
presenting it on the screen. Some high-traffic commercial websites don't use this format
because compression can be very CPU intensive. Instead, their webservers have SSL
encryption and compression performed by built-in hardware modules or by outboard
network devices, taking the CPU load off of the web server.
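
To get a feel for what gzip compression does to a typical page, the effect can be reproduced
outside Apache. The following is a small sketch using Python's standard gzip module, not
mod_deflate itself; the page content is invented and deliberately repetitive, so real pages
will compress less dramatically.

```python
import gzip

# Invented, repetitive HTML standing in for a static web page.
page = (
    "<html><body>"
    + "<p>Apache serves this paragraph.</p>" * 200
    + "</body></html>"
).encode("utf-8")

compressed = gzip.compress(page)        # what the server would send
restored = gzip.decompress(compressed)  # what the browser does transparently
ratio = len(compressed) / len(page)

print(f"{len(page)} bytes -> {len(compressed)} bytes ({ratio:.1%} of original)")
```

The saving in bytes on the wire is paid for in CPU time on every request, which is the
trade-off discussed above.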

Apache Website Security

Behind A NAT Firewall

If your web server sits behind a NAT firewall (public->private NAT), you may want to
have the server respond on both public and private IP addresses. Apache allows you to
specify multiple IP addresses in the <VirtualHost> statement to serve the same content
on both IP addresses (shown here as the placeholders public-IP-address and
private-IP-address):

<VirtualHost public-IP-address private-IP-address>
  DocumentRoot /www/server1
  ServerName www.my-site.com
  ServerAlias bigboy www.my-site-192-168-1-100.com
</VirtualHost>

In addition to running the latest Apache code, review Internet information on the latest
Apache security patches and procedures, proper application coding techniques and code
isolation (SELinux CGI). Apache security is set up on a per-directory basis.

Disable Directory Listings

Include an index.html page in each subdirectory under your DocumentRoot directory;
otherwise Apache will provide a directory listing of all the files in that subdirectory
unless you disable the directory listing by using the -Indexes option in the <Directory>
directive for the DocumentRoot like this:
<Directory "/home/www/*">
 Options MultiViews -Indexes SymLinksIfOwnerMatch IncludesNoExec
</Directory>
Users attempting to access the nonexistent index page will instead get a "403 Forbidden"
message.

Access Control Lists (By Hosts)

Apache can restrict site access by host or network, much like TCP Wrappers under
XINETD. This is usually applied per site DocumentRoot directory as follows:

<Directory "/var/www/site1">
  Options Indexes FollowSymLinks
  # Evaluate the Allow rules first, then the Deny rules
  Order allow,deny
  # Can also be a subnet or domain
  Allow from all
  # Can also be a subnet or domain
  Deny from none
  # Permits use of user level security (.htaccess files)
  AllowOverride All
</Directory>

Password Protected Web Pages

Web pages served from the main directory and subdirectories of DocumentRoot can be
password protected using the command "htpasswd". This command creates a
user/password file similar to /etc/passwd. One password file can be used for each
directory to be protected in the site, all the way up to DocumentRoot. The password file
can have any name; by convention ".htpasswd" is used. The password file SHOULD
NOT be placed in a directory path under DocumentRoot, where it might be exposed to a
browser. There are two ways to specify a password file location: by placing a file called
.htaccess in the directory to be protected, or by specifying AuthUserFile in a Directory or
Location directive within the site directives.

     1) Use Apache's htpasswd password utility to create a username/password
        combination in the .htpasswd file (web server, not system passwd file) for Web
        page access. Specify the location of the password file (/etc/httpd/conf is good)
        and, if it doesn't yet exist, include a -c, or create, switch on the command line to
        create it. Any directory will do as long as it is away from the DocumentRoot tree,
        where Web users could possibly view it.
     htpasswd -c /etc/httpd/conf/.htpasswd peter         # -c for the first time only
     htpasswd /etc/httpd/conf/.htpasswd paul             # each successive user

          You will be prompted to supply a password as in the passwd command.

     2) Make the .htpasswd file readable by all users: chmod 644 /etc/httpd/conf/.htpasswd
     3) Create a .htaccess file in the directory to which you want password control with
        these entries.
AuthUserFile /etc/httpd/conf/.htpasswd
AuthGroupFile /dev/null
AuthName EnterPassword
AuthType Basic
require valid-user
or as follows in the site directives:

<Directory "/var/www/site1">
  # Permits use of user level security (.htaccess files)
  AllowOverride All
  AuthUserFile /etc/httpd/conf/.htpasswd
  # Or a separate file in /etc/group format
  AuthGroupFile /dev/null
  # Name of the security "realm" displayed on the login box
  AuthName "EnterPassword"
  # Digest authentication is not always supported; protect Basic with SSL instead
  AuthType Basic
  # Admit any user listed in the .htpasswd file
  Require valid-user
</Directory>

The .htaccess file password protects the directory and all its subdirectories.
AuthUserFile tells Apache to use the .htpasswd file. The require valid-user statement
gives access to every user in the .htpasswd file. If you want only user peter to have
access, replace this line with require user peter. AuthType Basic instructs Apache to
accept basic unencrypted passwords from the remote users' Web browser.

     4) Make the .htaccess file readable by all users: chmod 644 .htaccess
     5) Make sure your /etc/httpd/conf/httpd.conf file has an AllowOverride statement in a
        <Directory> directive for any directory requiring password authorization.
<Directory /home/www/*>
  AllowOverride AuthConfig
</Directory>
     6) Make sure that you have a <VirtualHost> directive that defines access to
        /home/www or another directory higher up in the tree.
<VirtualHost *>
  DocumentRoot /home/www
</VirtualHost>

     7) You can combine host ACLs and password protection under specific Directory or
        Location site directives using the "Satisfy any" or "Satisfy all" parameter:

<Directory "/var/www/site1">
  Options Indexes FollowSymLinks
  # Evaluate the Allow rules first, then the Deny rules
  Order allow,deny
  # Can also be a subnet or domain
  Allow from all
  # Can also be a subnet or domain
  Deny from none
  # Permits use of user level security (.htaccess files)
  AllowOverride All
  AuthUserFile /etc/httpd/conf/.htpasswd
  # Or a separate file in /etc/group format
  AuthGroupFile /dev/null
  # Name of the security "realm" displayed on the login box
  AuthName "EnterPassword"
  # Digest authentication is not always supported; protect Basic with SSL instead
  AuthType Basic
  # Admit any user listed in the .htpasswd file
  Require valid-user
  # Either the host ACL or a valid password is enough
  Satisfy any
</Directory>
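
The remark above that AuthType Basic accepts unencrypted passwords is easy to demonstrate:
the browser only Base64-encodes the username and password, and anyone who sees the header
can reverse it. The following is a minimal sketch using Python's standard base64 module; the
function names are made up for this example and the password is a dummy value.

```python
import base64

def basic_auth_header(username, password):
    """Build the header a browser sends for Basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Authorization: Basic {token}"

def decode_basic_auth(header):
    """Recover the username and password from such a header; no key is needed."""
    token = header.split("Basic ", 1)[1]
    username, password = base64.b64decode(token).decode("utf-8").split(":", 1)
    return username, password

header = basic_auth_header("peter", "not-a-real-password")
recovered = decode_basic_auth(header)
print(header)
print(recovered)
```

This is why the configuration comments suggest running password protected pages over SSL.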

Troubleshooting Apache

Testing Basic HTTP Connectivity
Use TELNET to connect to port 80 (HTTP), or the specified Listen port, on the server
named in the desired URL. Failure to connect indicates a connectivity issue (network or
ACL), an incorrect Listen port, or that the service is not started.
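
The same connectivity test can be scripted when TELNET is not installed. The following is a
minimal sketch using Python's standard socket module; the function name port_is_open is an
assumption, and the demonstration uses a throwaway local listener rather than a real web
server so that it is self-contained.

```python
import socket

def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False

# Self-contained demonstration: a temporary listener stands in for httpd.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))          # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]

open_result = port_is_open("127.0.0.1", demo_port)
listener.close()                          # "service not started"
closed_result = port_is_open("127.0.0.1", demo_port, timeout=1.0)
```

Against a real server the call would look like port_is_open("www.my-site.com", 80).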

Basic HTTP Status Codes

The status codes you will see most often in the access_log file are:
      HTTP_Code                                 Description

           200         Successful request

           304         Successful request, but the web page requested hasn't been
                       modified since the current version in the remote web
                       browser's cache. This means the web page will not be sent to
                       the remote browser, it will just use its cached version instead.
                       Frequently occurs when a surfer is browsing back and forth
                       on a site.

           401         Unauthorized access. Someone entered an incorrect username
                       / password on a password protected page.

           403         Forbidden. File permissions prevent Apache from reading
                       the file. Often occurs when the web page file is owned by
                       user "root" even though it has universal read access.

           404         Not found. Page requested doesn't exist.

           500         Internal server error. Frequently generated by CGI scripts that
                       fail due to bad syntax. Check your error_log file for further
                       details on the script's error message.
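
When scanning logs it can help to translate a numeric code into its standard reason phrase
and general class. The following is a small sketch using Python's standard http.HTTPStatus
enumeration; the describe function and its class labels are made up for this example.

```python
from http import HTTPStatus

def describe(code):
    """Return the code with its standard reason phrase and general class."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"
    kinds = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    kind = kinds.get(code // 100, "other")
    return f"{code} {phrase} ({kind})"

# The codes listed in the table above.
summary = [describe(code) for code in (200, 304, 401, 403, 404, 500)]
for line in summary:
    print(line)
```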

Missing Web Pages – 404 Messages

The default action in Apache for missing web pages is to display a generic "404 File Not
Found" message. You can tell Apache to display a predefined HTML file whenever a web
surfer attempts to access a non-index page that doesn't exist by placing this statement in
the httpd.conf file:

ErrorDocument 404 /missing.htm

Then put a file with the name missing.htm in each DocumentRoot directory.

Browser 403 Forbidden Messages
Browser 403 Forbidden messages are usually caused by file permissions or security
context issues, as with SELinux.
A sure sign of problems related to security context is "avc: denied" messages in your
/var/log/messages log file:
Nov 21 20:41:23 bigboy kernel: audit(1101098483.897:0): avc: denied { getattr } for
pid=1377 exe=/usr/sbin/httpd path=/home/www/index.html dev=hda5 ino=12
scontext=root:system_r:httpd_t tcontext=root:object_r:home_root_t tclass=file

Only The Default Apache Page Appears

When only the default Apache page appears, there are two main causes. The first is the
lack of an index.html file in your Web site's DocumentRoot directory. The second cause
is usually related to an incorrect security context for the Web page's file.

Server Name Errors

All ServerName directives must list a domain that is resolvable in DNS, or else you'll get
an error similar to these when starting httpd.
Starting httpd: httpd: Could not determine the server's fully qualified domain name, using for ServerName

Starting httpd: [Wed Feb 04 21:18:16 2004] [error] (EAI 2)Name or service not known: Failed to resolve server name for (check DNS) -- or specify an explicit ServerName

You can avoid this by adding a default generic ServerName directive at the top of the
httpd.conf file that references localhost instead of the default new.host.name:80.
#ServerName new.host.name:80
ServerName localhost

The Apache Status Log Files

/var/log/httpd/access_log is updated after every HTTP query and is a good source of
general purpose information about your website. There is a fixed formatting style with
each entry being separated by spaces or quotation marks.

Apache Log File Format
         FieldNumber   Description                                 Separator

              1        IP address of the remote web surfer         Spaces

              2        Time stamp                                  Square brackets []

              3        HTTP query including the web page           Quotes ""

              4        HTTP result code                            Spaces

              5        The amount of data in bytes sent to the     Spaces
                       remote web browser

              6        The web page that contained the link to     Quotes ""
                       the page served

              7        The version of the web browser used to      Quotes ""
                       get the page

The HTTP status code can provide some insight into the types of operations surfers are
attempting and may help to isolate problems with your pages rather than with the
operation of Apache itself. For example, 404 errors are generated when someone tries to
access a web page that doesn't exist anymore. This could be caused by incorrect URL
links in other pages on your site.
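
The seven fields in the table above can be pulled out of an access_log line with a regular
expression, which makes it straightforward to hunt for broken links. The following is a
sketch that assumes the common "combined" log layout, where two extra fields (usually
dashes) sit between the IP address and the time stamp; the sample lines, addresses and
browser name are invented for illustration.

```python
import re
from collections import Counter

LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ '        # field 1, then two fields not listed in the table
    r'\[(?P<time>[^\]]+)\] '        # field 2: time stamp in square brackets
    r'"(?P<request>[^"]*)" '        # field 3: HTTP query in quotes
    r'(?P<status>\d{3}) '           # field 4: HTTP result code
    r'(?P<size>\S+) '               # field 5: bytes sent
    r'"(?P<referer>[^"]*)" '        # field 6: page that contained the link
    r'"(?P<agent>[^"]*)"$'          # field 7: browser version
)

def parse_line(line):
    """Return a dict of the seven fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None

# Invented sample lines in the assumed layout.
sample_lines = [
    '192.0.2.7 - - [12/Oct/2002:21:21:49 -0400] "GET /index.html HTTP/1.1" '
    '200 2326 "-" "ExampleBrowser/1.0"',
    '192.0.2.7 - - [12/Oct/2002:21:22:03 -0400] "GET /old-page.html HTTP/1.1" '
    '404 299 "http://www.my-site.com/index.html" "ExampleBrowser/1.0"',
]

entries = [parse_line(line) for line in sample_lines]
status_counts = Counter(e["status"] for e in entries if e)
# For each 404, field 6 shows which page holds the incorrect link.
broken_links = [(e["referer"], e["request"]) for e in entries if e and e["status"] == "404"]
```

If your LogFormat directive differs from the assumed layout, the pattern needs to be
adjusted to match.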

The /var/log/httpd/error_log file is a good source for error information. Unlike the
/var/log/httpd/access_log file, it has no standardized formatting. Typical errors that
you'll find here are HTTP queries for files that don't exist or forbidden requests for
directory listings. The file also includes Apache startup errors, which can be very
useful. The /var/log/httpd/error_log file is also where CGI script errors are
written. CGI scripts often fail with a blank screen on your browser; the
/var/log/httpd/error_log file most likely lists the cause of the problem.
