Load Balancing Web Proxy Service
Long Le & Dorian Miller
December 18, 2000
Distributed Systems, UNC comp 243

Table of contents

1 System services
2 System structure
2.1 Process Group theory
3 Maintained shared state and load balancing algorithm
4 System protocols
4.1 Surge/ADNS protocol
4.2 Proxy/ADNS protocol
5 Fault tolerance and availability
5.1 Handled errors
5.2 Errors not handled
6 Implementation techniques
7 Demonstration

Table of figures

Figure 1: System services
Figure 2: System processes
Figure 3: Protocol overview
Figure 4: Multicast protocol
Figure 5: Surge/ADNS protocol
Figure 6: Proxy/ADNS protocol

1 System services
The purpose of the "Web Proxy Load Balancing service" is to reduce the latency a web browser experiences when retrieving information from the internet. Web browsers connect to web proxy caches, which speed up retrieval by storing data closer to the web clients instead of fetching it from the original web server each time. The delivered software is a distributed system that gives web browsers reliable, ongoing access to the web caches. In addition, the system balances the load of web requests among the proxies.

Figure 1 below summarizes how the system functions. Surge simulates multiple web clients, issuing requests such as those Netscape or Internet Explorer would issue. Surge first queries the alternate domain name server (ADNS) for the IP address and port number of the web proxy that it should use to retrieve web pages from the internet. Similarly, a real DNS directs a web client to a web server. Arrow "1" indicates Surge's request and arrow "2" indicates the ADNS redirection. The web proxy completes the web request by retrieving the web page from a web server, as indicated by arrow "3", and returning it to Surge.

Failure of the web server is outside the scope of the project. The system is robust enough to recover from failure of Surge, the web proxies, and the ADNS. The proxies automatically restart the ADNS when it fails.
[Figure 1 diagram: Surge (web client), the alternate DNS (ADNS), the web proxy caches, and the web server www.unc.edu, with a region labeled "Project scope".]

Figure 1: System services

The "Web Proxy Load Balancing service" meets the definition of a distributed system. The ADNS, Surge, and web proxy processes run independently on separate machines, as described in the "System structure" section. The processes communicate continuously through several different protocols, as described in the "System protocols" section. The "Fault tolerance and availability" section describes how the ADNS and proxies provide an ongoing, reliable service.

2 System structure
This section gives an overview of the independently functioning system processes, which are summarized in the table and figure below. The processes can be started on any UNIX machine. The system demonstration runs on the independent Solaris machines eagle, swift, and capefear.

The ADNS is the cornerstone of the system: it connects the Surge web clients to the web proxies. There is only one instance of the ADNS, whereas there are several instances of the proxies and web clients. All processes are started manually to initialize the system. Surge and the proxies have a mechanism to wait until an ADNS starts, which completes the system initialization.

Although there is only one instance of the ADNS, it is the most robust process: a proxy will restart the ADNS when it is down. Losing a proxy process is tolerable because the proxies are redundant, although the web access load on the remaining proxies will increase. Failure of Surge is acceptable because it affects only the individual user of that web browser and no other web clients. The web server that the proxy interfaces to is assumed to always be running, as its functioning is outside the scope of the project.

The limit on the number of processes the system can handle has not been fully explored. However, a minimum number of proxies should always be able to register with the ADNS. Surge clients, on the other hand, should be refused when the ADNS has reached its limit.

[Figure 2 diagram: the ADNS, the Surge web browser, the Proxy, and the WebServer, with each connection between them labeled "protocol".]

Figure 2: System processes

Table 1: System processes

Process    Started/restarted    # of processes
Surge      Manual/manual        Many
Proxy      Manual/manual        1-5
ADNS       Manual/automatic     1

2.1 Process Group theory

The ADNS, the proxies, and the Surge processes function as a process group. The ADNS is the equivalent of the Group Membership Service, which maintains state and communicates with the proxies. Surge is an outside process that contacts the Group Membership Service to find the proxy location information. The ADNS detects that a proxy has failed when the socket to that proxy breaks, and it then removes the proxy from the membership list (more detail in the "Fault tolerance and availability" section).
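The membership bookkeeping described above can be illustrated with a minimal Python sketch. It is not the project's code: the function name prune_failed, the dictionary layout, and the example address are assumptions made for illustration, and a local socket pair stands in for a real proxy connection.

```python
import select
import socket


def prune_failed(members):
    """Remove members whose connection has closed.

    `members` maps a connected socket to that proxy's (ip, port) location.
    Returns the list of locations that were removed.
    """
    removed = []
    if not members:
        return removed
    # Timeout 0: only look at sockets that are readable right now.
    readable, _, _ = select.select(list(members), [], [], 0)
    for sock in readable:
        try:
            data = sock.recv(4096)
        except ConnectionError:
            data = b""
        if data == b"":
            # An empty read means the peer closed the connection or died.
            removed.append(members.pop(sock))
            sock.close()
        # Non-empty data would be a message from the proxy; a fuller
        # version would parse it here instead of discarding it.
    return removed


# A local socket pair stands in for the TCP connection to one proxy.
adns_side, proxy_side = socket.socketpair()
membership = {adns_side: ("192.0.2.10", 8080)}  # hypothetical location
proxy_side.close()                               # simulate a proxy crash
print(prune_failed(membership))
```

After the simulated crash the membership dictionary is empty, so the failed proxy can no longer be handed out to a web client.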

3 Maintained shared state and load balancing algorithm
The system state is maintained in the ADNS so that it can connect the Surge web clients to the least utilized web proxy. To do so, the ADNS maintains each proxy's location (IP address and port). When a proxy fails, the ADNS detects the failure as soon as the socket connection breaks and removes the proxy from its list of available proxies. When the ADNS fails, the proxy information is restored when the proxies reconnect to the restarted ADNS.

In addition, the ADNS maintains the current load of the proxies so that it can identify the least utilized proxy to refer to a web client. The load measurement is the number of web requests a proxy receives in a 30-second interval. At the end of every interval the proxy reports its load to the ADNS and resets its counter to accumulate the load for the next interval. Between reports, the ADNS also updates its own copy of the proxy load: each time the ADNS refers a web client to a proxy, for example proxy A, it increments proxy A's stored load by one. This increment was found by trial and error to improve the load balance. For example, if three proxies reported equal load, the ADNS would otherwise always refer clients to the first proxy (it picks the first proxy with the lowest load). Incrementing the load value stored by the ADNS means that equally loaded proxies are referred to in round-robin fashion. A new proxy registering with the ADNS has a load of zero, so the ADNS refers web clients to it until its load reaches the level of the other proxies.

The current load-balancing algorithm can be expanded to include different heuristics that represent the load more accurately. Measurements of CPU usage, file size, and response time of the proxy would take into account the different resources allocated to each proxy. These extensions could easily be incorporated into the existing design: the proxy would have to track the new measurements and send them to the ADNS, and the ADNS could use the information to weight its decision about which proxy to refer to the web client.
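The referral rule of this section (pick the first proxy with the lowest stored load, then increment that load by one) can be sketched in Python as follows. This is an illustrative sketch rather than the project's implementation; the class name ProxyTable, its method names, and the example locations are assumptions.

```python
class ProxyTable:
    """ADNS-side view of the proxies: location -> current load estimate."""

    def __init__(self):
        self.load = {}  # (ip, port) -> load; dicts keep insertion order

    def register(self, location):
        # A newly registered proxy starts with a load of zero.
        self.load[location] = 0

    def remove(self, location):
        # Called when the socket to a proxy breaks.
        self.load.pop(location, None)

    def report(self, location, requests_in_interval):
        # Periodic report from the proxy replaces the stored estimate.
        self.load[location] = requests_in_interval

    def refer(self):
        """Return the least loaded proxy and charge it for one referral."""
        if not self.load:
            return None  # no proxy registered yet; the client polls again
        location = min(self.load, key=self.load.get)  # first lowest wins ties
        self.load[location] += 1
        return location


table = ProxyTable()
for name in ("A", "B", "C"):
    table.register((name, 3128))   # hypothetical locations
    table.report((name, 3128), 5)  # all three report equal load

print([table.refer()[0] for _ in range(4)])  # ['A', 'B', 'C', 'A']
```

Without the increment in refer, all four referrals in this example would go to proxy A until its next load report arrived.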
This distributed system would be more complex if there were tighter timing requirements or a more dependent sequence of events. As it is, Surge simply queries the ADNS for a proxy and keeps querying until a proxy has registered with the ADNS.

4 System Protocols
Protocols are used to communicate among the different processes. A message either reports state information or queries for information. The HTTP protocol between Surge, the proxy, and the web server already existed at the start of the project. The protocols between the ADNS and Surge and between the ADNS and the proxy were implemented for this project and are the focus of this discussion. Figure 3 summarizes the protocols. In addition, a multicast protocol is used: network multicasting is the mechanism by which the proxy and Surge locate the ADNS. All processes agree on one multicast IP address and port number at which to meet. The multicast protocols are shown in Figure 4.

[Figure 3 diagram: the ADNS, the Surge web browser, the Proxy, and the WebServer. Connection labels: DNS protocol, Proxy protocol, and http 1.0 protocol (twice).]

Figure 3: Protocol overview

[Figure 4 diagram: the ADNS, the Surge web browser, the Proxy, and the WebServer. Connection labels: Web-client multicast, Proxy multicast, and http 1.0 protocol (twice).]

Figure 4: Multicast protocol
Multicast used by Surge and proxies to find the ADNS (double arrow).
Multicast used to inform other proxies that a new ADNS has been restarted (loop arrow).

4.1 Surge/ADNS protocol

The protocol between Surge and the ADNS is divided into two parts. The first is a multicast interaction in which Surge queries for the ADNS's location. Once the ADNS has been located, a DNS protocol is used for Surge to query which proxy to use. Surge polls the ADNS until it receives a valid proxy entry; putting this responsibility on Surge reduces the ADNS's complexity, because the ADNS does not need to keep track of the Surge clients that have outstanding requests. If the ADNS fails, Surge recognizes this and waits until the ADNS has restarted. This is achieved by having the ADNS send out a multicast message to notify the web clients after it restarts. Figure 5 summarizes this protocol.
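The multicast location step can be sketched in Python as follows. The sketch is hedged in several ways: the report states only that messages are marked CLIENT or ADNS and that the ADNS message carries an IP address and port number, so the text encoding used here, the port number 5007, and all function names are assumptions. The group address 224.1.2.3 is the example address given in this report.

```python
import socket
import struct
import time

GROUP = "224.1.2.3"  # example multicast address from the report
PORT = 5007          # assumed; the report does not state the port


def build_request():
    return b"CLIENT"


def build_response(ip, port):
    return f"ADNS {ip} {port}".encode("ascii")


def parse_message(data):
    """Return ("CLIENT", None) or ("ADNS", (ip, port)); raise ValueError otherwise."""
    fields = data.decode("ascii").split()
    if fields == ["CLIENT"]:
        return "CLIENT", None
    if len(fields) == 3 and fields[0] == "ADNS":
        return "ADNS", (fields[1], int(fields[2]))
    raise ValueError("unknown multicast message")


def open_receiver(group=GROUP, port=PORT):
    """Create a UDP socket that has joined the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    mreq = struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock


def locate_adns(group=GROUP, port=PORT, retry_seconds=2.0):
    """Block until an ADNS announces its location, re-asking periodically."""
    sock = open_receiver(group, port)
    try:
        while True:
            sock.sendto(build_request(), (group, port))
            deadline = time.monotonic() + retry_seconds
            while True:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break  # nobody answered in time; send the request again
                sock.settimeout(remaining)
                try:
                    data, _ = sock.recvfrom(1024)
                except socket.timeout:
                    break
                try:
                    role, location = parse_message(data)
                except ValueError:
                    continue  # ignore unrelated traffic on the group
                if role == "ADNS":
                    return location
                # CLIENT messages come from other clients (or ourselves).
    finally:
        sock.close()


print(parse_message(build_response("192.0.2.10", 4000)))
```

Only the message helpers run when the block is executed; locate_adns needs a network on which multicast is enabled and an ADNS that answers on the same group.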

[Figure 5 diagram: message sequence between Surge and the ADNS. Surge sends a find multicast and blocks until the multicast response; the ADNS replies with a location multicast. Surge then loops over DNS requests, and the ADNS returns the appropriate DNS response. Surge starts over in case of failure.]

Figure 5: Surge/ADNS protocol

The multicast protocol is very simple. At compile time Surge and the ADNS agree on a multicast IP address (e.g. 224.1.2.3) and port number. Surge sends a request message. The ADNS sends a response when it starts, and also whenever it is active and receives a request from Surge. The message itself is very simple: Surge and the ADNS mark their messages as coming from CLIENT or ADNS respectively, so that other Surge processes listening on the same multicast address can tell the two apart. The ADNS message carries the ADNS's current location (IP address and port number). A good resource for learning about multicasting is:
http://notch.mathstat.muohio.edu/html/Multicast/Multicast-HOWTO.html

The interaction between Surge and the ADNS is modeled after the interaction between a web browser and a real DNS: the DNS resolves the server name the web client requests to an IP address. An extension of this project would be to integrate a commercial web browser with the "Web Proxy Load Balancing system".

Although a DNS protocol is implemented between Surge and the ADNS, there are some minor differences. In the existing DNS protocol only the IP address of the requested web server is returned; the web browser connects to the web server on default port 80 or on another user-specified port. In this system, the DNS protocol has been extended to respond with a proxy's IP address and port number. Otherwise the protocol information and format are identical. The following web page documents the DNS protocol and was used to implement it:
http://www.freesoft.org/CIE/RFC/1035/40.htm

DNS messages are usually transmitted using UDP but can also be sent over TCP. The current system uses TCP/IP for guaranteed message delivery. The DNS protocol implementation was validated by connecting to the department DNS.

4.2 Proxy/ADNS protocol

The proxy/ADNS protocol has three parts. The first multicast protocol is the same as the multicast between Surge and the ADNS; the proxy uses it to find out where the ADNS is. The second multicast protocol is used in the process of restarting the ADNS and is described further in the "Fault tolerance and availability" section. Under normal operation the proxy protocol is used to inform the ADNS of the proxy's location and web load. The protocols used in normal operation are illustrated in the following figure:

[Figure 6 diagram: message sequence between the Proxy and the ADNS. The Proxy sends a find multicast and blocks until the response; the ADNS, once started, returns its location (IP and port) in a location multicast. In normal behavior the Proxy then reports its load. The proxy entry is canceled in the ADNS when the proxy goes down.]

Figure 6: Proxy/ADNS protocol

The normal proxy/ADNS protocol runs over a TCP/IP connection. Each packet contains a 4-byte message length, a 4-byte message type, a 4-byte IP address, a 4-byte port number, and a 4-byte load value. The message length and message type are used to distinguish the message from other message types. The ADNS stores the proxy location and load information.
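The five 4-byte fields listed above can be packed and unpacked as shown in the following Python sketch. The field order follows the description in this section; network byte order, a length field that counts the whole 20-byte packet, and the type value 1 are assumptions, as are the function names.

```python
import socket
import struct

# length, type, IPv4 address, port, load: five 4-byte fields.
# "!" selects network byte order without padding (an assumption).
REPORT_FORMAT = "!II4sII"
REPORT_SIZE = struct.calcsize(REPORT_FORMAT)  # 20 bytes
MSG_LOAD_REPORT = 1  # hypothetical type code


def pack_report(ip, port, load):
    """Build one load report as sent by a proxy."""
    return struct.pack(REPORT_FORMAT, REPORT_SIZE, MSG_LOAD_REPORT,
                       socket.inet_aton(ip), port, load)


def unpack_report(data):
    """Parse a load report on the ADNS side; return (ip, port, load)."""
    length, msg_type, raw_ip, port, load = struct.unpack(REPORT_FORMAT, data)
    if length != REPORT_SIZE or msg_type != MSG_LOAD_REPORT:
        raise ValueError("not a load report")
    return socket.inet_ntoa(raw_ip), port, load


packet = pack_report("192.0.2.10", 3128, 42)
print(len(packet), unpack_report(packet))  # 20 ('192.0.2.10', 3128, 42)
```

Because every report has the same fixed size, the receiver can read exactly 20 bytes from the TCP stream per report and does not need any other framing.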

5 Fault tolerance and availability

Handling and recovering from faults allows this distributed system to offer an ongoing, reliable service. The error cases that are handled and those that are not handled are described in the next two sections.

5.1 Handled errors

All processes can experience a halting failure and the system will be able to recover. However, at least one proxy is needed to restart the ADNS. A Surge failure is not critical, as it is equivalent to an individual's web browser crashing; the ADNS continues to function and waits for new requests from Surge. Likewise, it is acceptable if a proxy fails during a transaction with Surge. Completing the request is simply a matter of Surge retrying it and is an application detail.

The more interesting fault recovery is between the proxy and the ADNS. Failure of a proxy is detected immediately by the ADNS through the broken socket connection. The ADNS then removes the proxy's entry from the list of available proxies, which ensures that the ADNS does not redirect Surge to an invalid proxy.

Recovering from a failed ADNS is related to Fischer, Lynch, and Paterson's statement that there is "... no deterministic algorithm that solves consensus in an asynchronous system and tolerates even a single halting failure." The ADNS and proxies are asynchronous because they have no common clock. When the ADNS experiences a halting failure, the proxies have to restart it. However, it is open which proxy should restart the ADNS, because the proxies are of equal priority.

The current implementation uses time-outs. The proxies detect the failure when the socket connection to the ADNS breaks. As each proxy recognizes the fault, it backs off for a random amount of time. One of the proxies finishes its waiting period first and sends a multicast message to the other proxies to inform them that a new ADNS will be started (see the "Implementation techniques" section for details). The multicast message also contains the IP address and port number of the ADNS's new location. The other proxies receive the multicast and back off again for several seconds to allow enough time for the ADNS to be initialized.

This technique is very successful in practice, although in theory the scheme fails when two proxies coincidentally decide to restart the ADNS at the same time. The system was not designed to have two ADNSs, and duplication would lead to problems. On the other hand, the back-off timers are spread so far apart that the chance of overlap is slim. With this technique every proxy is equally likely to restart the ADNS.

Another solution would favor one proxy over another, which might be preferable if a certain machine should restart the ADNS. Each proxy's back-off time would be set when the proxy starts. The back-off time must be large enough to allow all proxies to detect the ADNS failure and to prevent two proxies from restarting the ADNS. This mechanism would be easy to implement in the existing system.

Even more involved would be for the proxies to use a leader election algorithm to choose the proxy that restarts the ADNS. A simple web search reveals several leader election algorithms. The proxies do not initially know about each other but could communicate through multicast messages.

5.2 Errors not handled

Some system faults are not handled, either because the cases are impossible to test or because the failures are particular to network programming rather than distributed systems. The implementation assumes that the network does not fail, for example through a network partition, link failure, or omission failure. In addition, it is assumed that the multicast address is used solely by this system's processes and does not conflict with other processes.

The system is not currently designed to handle two ADNSs, but this could be a possible extension. Currently two ADNSs would cause confusion because Surge web clients and proxies would register arbitrarily with one or the other ADNS. Proxies currently would not recognize the other ADNS and would restart a new ADNS. A solution would be to use a leader election algorithm to choose between multiple ADNSs, as suggested for the proxies restarting the ADNS.

It would be an interesting extension for the ADNS to restart proxies if fewer than a minimum number of proxies exist. The mechanism to restart the proxies would be the same as the one the proxies use to restart the ADNS.

Processes that are hanging or slow are not dealt with. Currently the only failures detected are halting failures, which are detected by socket connections breaking. Slow processes could be dealt with by retrying requests to them.

6 Implementation techniques
Certain programming techniques are very useful in developing distributed systems. Some of the mechanisms used here are simple solutions to problems that could be solved in a more robust way.

Modifying Surge to work with this system was straightforward. Surge has one centralized function that creates a new socket for every web transaction. Code only needed to be added before the socket creation to find out which proxy to connect to.

Restarting a process in UNIX is done through a system call that executes a remote shell with the desired command. The restarted process runs independently but is still a child process of the proxy that restarted the ADNS, which means that the ADNS fails again when that particular proxy dies. Creating an independent daemon process could circumvent this. The proxies have a common respawn.conf file, which lists the possible servers and port numbers the new ADNS can be restarted on.

Concurrent programming is necessary to handle multiple requests from different processes. The proxy implements concurrency by forking child processes. The ADNS handles requests concurrently with the select command, which reports when data has arrived at any of its sockets.

A process halting failure was very easy to detect through a broken socket. A broken socket can be identified with the select command or from the return value or signal of a read or write on the failed socket.

Network multicast is a flexible mechanism for coordinating processes that are independent and do not know each other's location. Network multicast is limited to a local network; however, a "Group Message Layer Service" would accomplish the same, as is done in "Process Groups".
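The "Handled errors" section refers to this section for details of the ADNS restart. As a rough illustration of the random back-off that decides which proxy restarts the ADNS, the following Python sketch draws one waiting time per proxy and reports which proxy's timer expires first. It is a simulation under stated assumptions, not the project's code: the function names, the 60-second maximum, and the notify_window parameter (the time the winner's multicast needs to reach the other proxies) are all hypothetical. The proxy names are the demonstration machines.

```python
import random


def draw_backoffs(proxies, max_backoff, rng):
    """Each proxy independently picks a random waiting time in seconds."""
    return {name: rng.uniform(0.0, max_backoff) for name in proxies}


def restart_race(backoffs, notify_window):
    """Return (winner, rivals).

    The winner is the proxy whose timer expires first. Rivals are proxies
    whose timers expire less than `notify_window` seconds later, before the
    winner's multicast could stop them; a non-empty list corresponds to the
    theoretical failure case of two ADNSs being started.
    """
    ordered = sorted(backoffs.items(), key=lambda item: item[1])
    winner, shortest = ordered[0]
    rivals = [name for name, delay in ordered[1:]
              if delay - shortest < notify_window]
    return winner, rivals


backoffs = draw_backoffs(["eagle", "swift", "capefear"],
                         max_backoff=60.0, rng=random.Random(243))
winner, rivals = restart_race(backoffs, notify_window=0.5)
print(winner, "restarts the ADNS; possible duplicate restarts:", rivals)
```

Spreading the timers over a long interval relative to notify_window makes an empty rivals list the common case, which matches the observation that overlap is unlikely in practice.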

7 Demonstration
The system can be demonstrated by systematically adding processes as summarized in the table. Surge can simulate many web clients by being started with four clients each having ten threads.

Table 2: System demonstration

Processes involved: Surge/ADNS
Description: Start Surge and observe how it waits for an ADNS to start. When the ADNS starts, Surge will be polling the ADNS for a proxy.

Processes involved: Proxy/ADNS
Description: Starting the proxy without an ADNS will show that the proxy waits for an ADNS. Once an ADNS is started, proxies can be added and removed, and they should register with the ADNS.

Processes involved: Surge/Proxy/ADNS
Description: Involving all processes will show the normal system operation. Watching the output of Surge and the proxies should show that the load is balanced. Adding and removing proxies should demonstrate normal operations and show how the load-balancing algorithm works. Killing Surge should not interrupt the proxies or the ADNS.

Processes involved: Surge/Proxy/ADNS, kill ADNS
Description: Killing the ADNS demonstrates the robustness of the system. After some time the proxies will restart the ADNS. A "ps -aux" command on the target machine will show that a new ADNS has been started. When the ADNS starts again, the system will continue working and Surge will make requests to the proxies (see the output on the screens).
