Friday, December 28, 2007

Web server tutorial

Web server tutorial - Part 1

In client-server communication, the server process creates a socket and waits on it, and the client process creates a socket of its own and uses it to reach the server.
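The exchange described above can be sketched in a few lines of Python. This is a minimal illustration, not how a real Web server is written: the loopback address, the use of port 0 (which asks the operating system for any free port), and the function names serve_once and demo are all assumptions made for the example.

```python
import socket
import threading


def serve_once(server_sock):
    """Accept one client, echo back what it sends, then close the connection."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


def demo():
    # Server side: create a socket, bind it to an address and port, and listen.
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.settimeout(5)           # do not wait forever for a client
    server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]

    worker = threading.Thread(target=serve_once, args=(server_sock,))
    worker.start()

    # Client side: create a socket and connect to the server's address and port.
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client_sock:
        client_sock.sendall(b"hello")
        reply = client_sock.recv(1024)

    worker.join()
    server_sock.close()
    return reply


if __name__ == "__main__":
    print(demo())
```

The server and the client each hold their own socket; the port number is the only thing the client needs to know in advance, apart from the server's address.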

Socket

A socket is fundamentally nothing but an end point of communication. It can be of two types: physical socket and logical socket. Logical sockets are created by the operating system through its system calls. For client-server access, a socket needs three things in order to provide a service or to ask for one:
1) Service name (example: telnet)
2) Protocol (TCP, a stream protocol)
3) Port number (23)
The service uses a protocol, and the protocol uses a port number, to provide the service at the server end and to reach it from the client end. Ultimately, the port number is what ties a client to the right server. The protocols supported by Linux are listed in /etc/protocols, and the services can be seen in /etc/services.
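To make the service/protocol/port relationship concrete, here is a small sketch that reads lines written in the /etc/services layout (name, then port/protocol, then optional aliases). The sample text is a hand-written excerpt for illustration, not a copy of any particular system's file, and parse_services is a hypothetical helper name. On a real system, Python's socket.getservbyname performs the same lookup against the system's own database.

```python
SAMPLE_SERVICES = """\
# service-name  port/protocol  [aliases]
ftp-data        20/tcp
ftp             21/tcp
telnet          23/tcp
http            80/tcp          www www-http
"""


def parse_services(text):
    """Map (service name, protocol) to a port number, including aliases."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        name, port_proto = fields[0], fields[1]
        aliases = fields[2:]
        port, proto = port_proto.split("/")
        for service in [name] + aliases:
            table[(service, proto)] = int(port)
    return table


if __name__ == "__main__":
    services = parse_services(SAMPLE_SERVICES)
    print(services[("telnet", "tcp")])
    print(services[("www", "tcp")])
```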

Let's take a few more examples before starting with the Web server.
* The telnet service uses the TCP/IP protocol and communicates through port 23.
* The ftp service uses the TCP/IP protocol and communicates through ports 20 and 21.
* The www service uses the HTTP protocol, which runs over TCP/IP, and communicates through port 80.

Web communication

Web communication involves a browser as the client process and a Web server as the server process. What actually happens when a user types http://www.yahoo.com? The browser extracts the protocol, "http", from the URL and asks the local operating system to connect to the destination machine. The operating system splits the request into packets using its network layer software, and the request travels with an HTTP header on top. This enables the remote machine to hand the request over to its Web server. Why is that needed? Because many servers can run on the same machine, so requests have to be matched to the right service, and the protocol is the first thing that tells the services apart.

But what about telnet and ftp, which use the same protocol but have different server processes? They are distinguished by their port numbers: services may share a protocol, but not a port number. Once the request is packaged, the operating system passes the data from RAM to the network interface card, which sends it to the nearest gateway; from there it is forwarded to the server machine.
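As a rough sketch of how a client turns a URL into a protocol, a host and a port, the snippet below uses Python's urllib.parse. The DEFAULT_PORTS table and the connection_target function are assumptions for illustration; a real browser knows many more schemes.

```python
from urllib.parse import urlsplit

# Default ports for a few schemes; an illustrative table, not a complete list.
DEFAULT_PORTS = {"http": 80, "ftp": 21, "telnet": 23}


def connection_target(url):
    """Return the (scheme, host, port) a client would connect to for this URL."""
    parts = urlsplit(url)
    # An explicit port in the URL wins; otherwise fall back to the scheme's default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.scheme, parts.hostname, port


if __name__ == "__main__":
    print(connection_target("http://www.yahoo.com"))
    print(connection_target("http://www.yahoo.com:8080/index.html"))
```

The scheme selects the protocol, and the port (given or default) selects the server process on the remote machine.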

At the server end, the network card signals the operating system that data has arrived: an HTTP request wrapped in TCP/IP headers. The operating system uses the destination port to find the Web server process on that machine, hands the data over to it, and then turns its attention to other processes.

Before a server-side program works on the request, the raw data goes through a filtering step on the Web server, which separates the headers from the data the program actually needs. The mechanism used for this is called the Common Gateway Interface (CGI): the Web server stores the different parts of the request in separate environment variables. The filtering is needed because headers always arrive attached to the data, along with information the program may not need.
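A minimal sketch of a CGI-style program shows how the environment variables are used. REQUEST_METHOD and QUERY_STRING are standard CGI variable names; the handle_request function and the "name" query parameter are made up for this example.

```python
import os
from urllib.parse import parse_qs


def handle_request(environ):
    """Build a CGI response from the variables the Web server has set."""
    method = environ.get("REQUEST_METHOD", "GET")
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    body = f"Hello, {name}! (method: {method})\n"
    # A CGI response is a set of headers, a blank line, then the body.
    return "Content-Type: text/plain\r\n\r\n" + body


if __name__ == "__main__":
    # When a Web server runs this as a CGI program, os.environ holds the request.
    print(handle_request(os.environ), end="")
```

The program never sees the raw request; it only reads the variables the server has already filled in.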

Apache as Web server

Setup:
The Web server is meant for hosting Websites. There are three ways a Website can be hosted:
1) default directory hosting
2) virtual directory hosting
3) virtual domain hosting

First we have to configure DNS. Then we configure the following file (Red Hat 6.2): /etc/httpd/conf/httpd.conf. Whether Apache runs on Windows or Linux, its main configuration file is httpd.conf; /etc/httpd/conf/httpd.conf is where it lives on the system used here.

The root directory of the Web server is /etc/httpd, which is divided into three parts:
1) /etc/httpd/conf (where the configuration files stay)
2) /etc/httpd/logs (where the Web server logs and site access logs stay)
3) /etc/httpd/modules (where the modules stay; these let server-side programmers work in the languages the Web server supports)

Let's open the file /etc/httpd/conf/httpd.conf and take a detailed look at the directives it uses.

httpd.conf - Apache HTTP server configuration file
(Based upon the NCSA server configuration files originally by Rob McCool.)

This is the main Apache server configuration file. It contains the configuration directives that give the server its instructions.

Note: See http://www.Apache.org/docs for detailed information about the directives. Do not simply read the instructions in here without understanding what they do. They're here as hints or reminders. If you are unsure consult the online docs.

After this file (httpd.conf) is processed, the server will look for and process the following files (they are checked only in 6.1; in 6.2 they are not):
/usr/conf/srm.conf
and then
/usr/conf/access.conf
unless you have overridden these with ResourceConfig and/or AccessConfig directives here.

Directives

The configuration directives are grouped into three basic sections:
1. Directives that control the operation of the Apache server process as a whole (the 'global environment').
2. Directives that define the parameters of the 'main' or 'default' server, which responds to requests that aren't handled by a virtual host. These directives also provide default values for the settings of all virtual hosts.
3. Settings for virtual hosts, which allow Web requests to be sent to different IP addresses or hostnames and have them handled by the same Apache server process.

Section 1: Global Environment

The directives in this section affect the overall operation of Apache, such as the number of concurrent requests it can handle or where it can find its configuration files.

ServerType: ServerType is either inetd or standalone. Inetd mode is only supported on Unix platforms.

ServerRoot: The top of the directory tree under which the server's configuration, error, and log files are kept.

NOTE: If you intend to place this on an NFS (or otherwise network) mounted filesystem then please read the LockFile documentation (available at http://www.Apache.org/docs/mod/core.html#lockfile); you will save yourself a lot of trouble. Do not add a slash at the end of the directory path.
ServerRoot "/etc/httpd"

LockFile: The LockFile directive sets the path to the lockfile used when Apache is compiled with either
USE_FCNTL_SERIALIZED_ACCEPT or
USE_FLOCK_SERIALIZED_ACCEPT.

This directive should normally be left at its default value. The main reason for changing it is if the logs directory is NFS mounted, since the lockfile must be stored on a local disk. The PID of the main server process is automatically appended to the filename.
LockFile /var/lock/httpd.lock

PidFile: The file in which the server should record its process identification number when it starts.
PidFile /var/run/httpd.pid

ScoreBoardFile: File used to store internal server process information. Not all architectures require this. But if yours does (you'll know because this file will be created when you run Apache) then you must ensure that no two invocations of Apache share the same scoreboard file.
ScoreBoardFile /var/run/httpd.scoreboard

In the standard configuration, the server will process this file, srm.conf, and access.conf in that order. The latter two files are now distributed empty, as it is recommended that all directives be kept in a single file for simplicity. The commented-out values below are the built-in defaults. You can have the server ignore these files altogether by using "/dev/null" (for Unix) or "nul" (for Win32) for the arguments to the directives.

ResourceConfig conf/srm.conf
AccessConfig conf/access.conf

Timeout: The number of seconds before receives and sends time out.
Timeout 300

KeepAlive: Whether or not to allow persistent connections (more than one request per connection). Set to "Off" to deactivate. Here we keep it on:
KeepAlive On

MaxKeepAliveRequests: The maximum number of requests to be allowed during a persistent connection. Set to 0 to allow an unlimited amount. We recommend you leave this number high, for maximum performance.
MaxKeepAliveRequests 100

KeepAliveTimeout: Number of seconds to wait for the next request from the same client on the same connection.
KeepAliveTimeout 15
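Directives like the ones above are plain "Name value" lines, so they are easy to read programmatically. The following is a minimal sketch, not Apache's own parser: it handles only simple one-line directives, keeps just the last value when a directive repeats, and ignores container sections such as virtual hosts. The sample text and the parse_directives name are assumptions for the example.

```python
import shlex

SAMPLE_CONF = '''\
# excerpt using the values shown above
ServerRoot "/etc/httpd"
Timeout 300
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 15
'''


def parse_directives(text):
    """Return {directive name: [arguments]} for simple one-line directives."""
    directives = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # shlex.split removes the quotes around arguments such as "/etc/httpd".
        name, *args = shlex.split(line)
        directives[name] = args
    return directives


if __name__ == "__main__":
    for name, args in parse_directives(SAMPLE_CONF).items():
        print(name, args)
```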

Server-pool size regulation: Rather than making you guess how many server processes you need, Apache dynamically adapts to the load it sees. That is, it tries to maintain enough server processes to handle the current load, plus a few spare servers to handle transient load spikes (e.g., multiple simultaneous requests from a single Netscape browser).

It does this by periodically checking how many servers are waiting for a request. If there are fewer than MinSpareServers, it creates a new spare. If there are more than MaxSpareServers, some of the spares die off. The default values are probably OK for most sites.
MinSpareServers 5
MaxSpareServers 20
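The spare-server check can be pictured with a small function. This is a simplified model under stated assumptions, not Apache's actual code: it returns the whole gap in one step, whereas the real server closes the gap gradually over several periodic checks. The adjust_spares name is hypothetical; the defaults are the values shown above.

```python
def adjust_spares(idle, min_spare=5, max_spare=20):
    """Return how many idle servers to add (positive) or stop (negative)."""
    if idle < min_spare:
        return min_spare - idle  # too few spares: start more
    if idle > max_spare:
        return max_spare - idle  # too many spares: let some die off
    return 0                     # within bounds: nothing to do


if __name__ == "__main__":
    for idle in (2, 10, 25):
        print(idle, adjust_spares(idle))
```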

StartServers: The number of servers to start initially. It should be a reasonable ballpark figure.
StartServers 8

MaxClients: The limit on the total number of servers running, that is, on the number of clients who can connect simultaneously. If this limit is ever reached, clients will be 'locked out', so it should not be set too low. It is intended mainly as a brake to keep a runaway server from taking the system with it as it spirals down.
MaxClients 150

MaxRequestsPerChild: The number of requests each child process is allowed to process before the child dies. The child will exit so as to avoid problems after prolonged use when Apache (and maybe the libraries it uses) leak memory or other resources. On most systems, this isn't really needed, but a few (such as Solaris) do have notable leaks in the libraries. For these platforms, set to something like 10000 or so; a setting of 0 means unlimited.

NOTE: This value does not include keepalive requests after the initial request per connection. For example, if a child process handles an initial request and 10 subsequent "keptalive" requests, it would only count as 1 request towards this limit.
MaxRequestsPerChild 100
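The counting rule in the note can be written out as a short sketch. Both function names are hypothetical, and the model only restates the rule described above: each connection counts once, no matter how many keep-alive requests follow on it, and a limit of 0 means the child never exits for this reason.

```python
def counted_requests(requests_per_connection):
    """Count one request per connection, ignoring keep-alive follow-ups."""
    return sum(1 for n in requests_per_connection if n > 0)


def child_should_exit(counted, max_requests_per_child):
    """Decide whether a child has reached MaxRequestsPerChild (0 = unlimited)."""
    if max_requests_per_child == 0:
        return False
    return counted >= max_requests_per_child


if __name__ == "__main__":
    # One connection carrying an initial request plus 10 keep-alive requests.
    print(counted_requests([11]))
    print(child_should_exit(100, 100))
```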

Listen: Allows you to bind Apache to specific IP addresses and/or ports, in addition to the default. See also the VirtualHost directive.
Listen 3000
Listen 12.34.56.78:80
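The two forms of the Listen argument shown above, a bare port and an address:port pair, can be told apart with a few lines of code. This is an illustrative sketch covering only those two forms; parse_listen is a made-up helper, and None is used here to mean "all addresses".

```python
def parse_listen(value):
    """Split a Listen argument into (address, port); address None means all."""
    if ":" in value:
        address, port = value.rsplit(":", 1)
        return address, int(port)
    return None, int(value)


if __name__ == "__main__":
    print(parse_listen("3000"))
    print(parse_listen("12.34.56.78:80"))
```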

BindAddress: You can support virtual hosts with this option. This directive is used to tell the server which IP address to listen to. It can either contain "*", an IP address, or a fully qualified Internet domain name.
BindAddress *

Well, that's all for now. In the second part we shall look into Dynamic Shared Object (DSO) support and also the 'Main' server configuration.

source: http://www.freeos.com
