Title:
Load Balancing And Yahoo!





Keywords:
internet, load balancing, yahoo, seo



Article Body:
A high-volume site like Yahoo! knows that the quality of service a web server provides to end users depends
mainly on network-transfer speed and server response time. Network-transfer speed is governed by the
bandwidth of the Internet link, while server response time depends on resources such as CPU speed, the
amount of RAM and I/O performance. Once these resources are exhausted under heavy traffic, the web server
can no longer respond promptly.


Load Balancing


A site that struggles to handle high volumes of incoming traffic can upgrade its existing machines, either
by installing more RAM or by replacing the CPU with a faster one. Faster or dedicated SCSI controllers and
disks with shorter access times can also help. On the software side, operating system parameters and web
server settings can be tuned for better performance.


An alternative approach is to improve performance by increasing the number of web servers. This approach
distributes traffic across a cluster of back-end web servers that need not be large-scale machines. Web
server scalability is achieved by adding servers, so that the load is shared among a larger server cluster.


This is what load balancing is all about. It involves tuning a computer system, network or disk subsystem
so that data and processing are spread more evenly across the available resources. Load balancing
distributes processing and communications activity evenly across a computer network so that no single
device is overwhelmed. Busy websites usually use two or more web servers in a load balancing scheme, so
that when one server is overwhelmed with requests, traffic is forwarded to another server with more
capacity.


There are two likely reasons why a company would want to load balance traffic across firewalls. One is
purely technical and the other is centered on winning business. The technical aspect should be addressed
as soon as funds and the environment allow.


When only one web server responds to all incoming HTTP requests for a website, it may not be able to keep
up, especially once the website has gained popularity. Web pages load slowly and some users have to wait
for their requests to be processed. As traffic and connections keep growing, a point can be reached where
upgrading the server hardware is no longer cost effective.


Yahoo! was granted a patent, filed in 1999, on coordinating information among multiple servers that share
information, as well as servers that may cache some of that information. Load balancing devices are
becoming very common in supporting high-traffic websites. These devices evolve as websites grow in size,
complexity and traffic flow.


The presence of multiple web servers in a server group requires that HTTP traffic be evenly distributed
among the servers. These servers should appear as a single web server to the web client. The load balancer
simply intercepts each request and redirects it to an available server in the server cluster.
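
The interception step can be pictured with a short Python sketch. It is an illustration only, not Yahoo!'s
implementation: the class name FrontEndBalancer, the server names and the policy of sending each request to
the available server with the fewest requests in flight are all assumptions made for this example.

```python
class FrontEndBalancer:
    """Hypothetical front end that hides a server cluster behind one entry point."""

    def __init__(self, servers):
        # Number of requests currently being processed by each back-end server.
        self.in_flight = {server: 0 for server in servers}
        # Servers currently believed to be reachable.
        self.available = set(self.in_flight)

    def mark_down(self, server):
        """Stop routing to a server, for example after a failed health check."""
        self.available.discard(server)

    def mark_up(self, server):
        """Resume routing to a known server."""
        if server in self.in_flight:
            self.available.add(server)

    def route(self):
        """Pick an available server for the next request and record the assignment."""
        candidates = [s for s in self.in_flight if s in self.available]
        if not candidates:
            raise RuntimeError("no available back-end server")
        # Assumed policy: choose the available server with the fewest requests in flight.
        server = min(candidates, key=self.in_flight.get)
        self.in_flight[server] += 1
        return server

    def finish(self, server):
        """Record that a request assigned to this server has completed."""
        if self.in_flight[server] > 0:
            self.in_flight[server] -= 1


balancer = FrontEndBalancer(["web1", "web2", "web3"])
balancer.mark_down("web2")  # web2 is skipped until it is marked up again
print([balancer.route() for _ in range(4)])
```

The web client only ever talks to the balancer, so servers can be taken out of the cluster or returned to
it without the client noticing.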


Methods of Load Balancing


Load balancing can be achieved in a number of ways. The choice depends on individual requirements,
available features, complexity of implementation and cost. A company has to assess its own circumstances
to decide which option works best.


Round Robin DNS Load Balancing is one of the earliest load balancing techniques to be adopted. The built-in
round robin feature of the BIND DNS server cycles through the IP addresses corresponding to a group of
servers in a cluster. It is a simple, inexpensive method that is very easy to implement. Its downside is
that the DNS server has no knowledge of server availability and may therefore keep pointing to an
unavailable server. It can differentiate by IP address but not by server port. There is also the
possibility that the IP address is cached by other name servers, which results in requests not being sent
to the load balancing DNS server.
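
The availability problem is easy to reproduce in a small simulation. The Python sketch below is not BIND
itself; it only imitates the cyclic reordering of address records. The class name RoundRobinDNS and the
192.0.2.x documentation addresses are hypothetical, and clients are assumed to use the first address in
each answer.

```python
from collections import deque


class RoundRobinDNS:
    """Toy model of a name server that rotates its address records on every query."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def resolve(self):
        """Return the records in their current order, then rotate them by one."""
        answer = list(self._records)
        self._records.rotate(-1)  # the first record moves to the end
        return answer


dns = RoundRobinDNS(["192.0.2.10", "192.0.2.11", "192.0.2.12"])
down = {"192.0.2.11"}  # this server has failed, but the name server does not know

for _ in range(4):
    first = dns.resolve()[0]
    status = "DOWN" if first in down else "up"
    print(first, status)
```

Every third query in this run is still answered with the failed server's address first, because nothing in
the rotation takes server health into account.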


In Hardware Load Balancing, hardware load balancers route TCP/IP packets to the various servers in a
cluster. This method is said to provide a powerful topology with high availability. It uses a circuit-level
network gateway to route traffic. Its one downside is a higher cost compared to other methods.


The most commonly used method is Software Load Balancing. Load balancers often come as an integrated
component of expensive web server and application server software packages. This method is more
configurable to specific requirements and can incorporate intelligent routing based on multiple input
parameters. Additional hardware is needed to isolate the load balancers.


Algorithms for Server Load Balancing


When each HTTP request is assigned to a server picked at random from the group of servers, this is called
random allocation. On average each server gets its share of the load, but at any given moment one server
may be assigned more requests than the others. The method is very easy to implement, but the risk of
overloading one server while under-utilizing another is significant.
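
A minimal Python sketch of random allocation follows. The function name random_allocation, the server
names and the request count are assumptions chosen for illustration; the optional rng parameter only exists
so that a run can be repeated with a fixed seed.

```python
import random
from collections import Counter


def random_allocation(servers, num_requests, rng=None):
    """Assign each request to a randomly chosen server and count the assignments."""
    if rng is None:
        rng = random.Random()
    return Counter(rng.choice(servers) for _ in range(num_requests))


counts = random_allocation(["web1", "web2", "web3"], 30)
for server, assigned in sorted(counts.items()):
    print(server, assigned)
```

With only a few requests the counts are often visibly uneven, and they tend to even out only as the number
of requests grows. That is the imbalance risk described above.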


With round-robin allocation, the IP sprayer assigns requests to a list of servers on a rotating basis. The
first request goes to a randomly picked server in the group, so that first requests do not all go to the
same server when more than one IP sprayer is involved. Subsequent requests are redirected in circular
order. The server that has just been assigned a request moves to the end of the list, which ensures that
all servers receive an equal number of requests. The allocation is more orderly than random allocation,
but it may not be enough when requests differ in processing overhead or when the servers in a group differ
in specification.
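
The rotation just described can be sketched in Python as follows. The class name RoundRobinSprayer and the
server names are hypothetical, and the sketch ignores networking entirely; it only shows the order in which
servers are chosen.

```python
import random
from collections import deque


class RoundRobinSprayer:
    """Hand out servers in circular order, starting from a randomly chosen one."""

    def __init__(self, servers, rng=None):
        if not servers:
            raise ValueError("at least one server is required")
        if rng is None:
            rng = random.Random()
        self._queue = deque(servers)
        # Random starting point, so several sprayers do not all begin with the same server.
        self._queue.rotate(-rng.randrange(len(self._queue)))

    def assign(self):
        """Return the server at the head of the list and move it to the end."""
        server = self._queue.popleft()
        self._queue.append(server)
        return server


sprayer = RoundRobinSprayer(["web1", "web2", "web3"])
print([sprayer.assign() for _ in range(6)])
```

Every server is chosen exactly once per cycle, regardless of how long its requests take or how powerful it
is.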


Weighted round-robin allocation addresses this shortcoming of plain round-robin. A server that can handle
twice as much load as another is given a weight of two, which means that the IP sprayer assigns two
requests to the more powerful server for every one request assigned to the weaker one. This accounts for
the capacity of the servers in the group. However, it does not consider more advanced load balancing
requirements such as the processing time of individual requests. An efficient load balancer should be
capable of intelligent monitoring that helps it direct each request to the server best able to handle it.
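
One simple way to realize the weighting is to repeat each server in the rotation as many times as its
weight. The Python sketch below takes that approach; it is only one of several possible schedules, and the
function name weighted_round_robin and the server names are assumptions.

```python
from itertools import cycle


def weighted_round_robin(weights):
    """Return an endless iterator in which each server appears `weight` times per cycle."""
    schedule = [server for server, weight in weights.items() for _ in range(weight)]
    if not schedule:
        raise ValueError("at least one server needs a positive weight")
    return cycle(schedule)


# The powerful server has weight 2, the weaker one weight 1.
picker = weighted_round_robin({"powerful": 2, "weak": 1})
print([next(picker) for _ in range(6)])
# prints ['powerful', 'powerful', 'weak', 'powerful', 'powerful', 'weak']
```

The weights are fixed in advance, so this schedule still cannot react to how long individual requests
actually take.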





				
DOCUMENT INFO
Shared By:
Categories:
Tags:
Stats:
views:3
posted:7/23/2011
language:English
pages:3