Towards a Simulation Framework for the Evaluation of Distributed Applications Executing in the Internet Domain

Phillip M. Dickens
Department of Computer Science
Illinois Institute of Technology
Chicago, Illinois

Abstract

We are developing detailed simulation models to study, evaluate, and predict the performance of complex distributed applications executing in the Internet domain. Our approach is to model each component of the distributed application in as much detail as required, and to execute the model efficiently by partitioning it across multiple processors and using scalable parallel simulation techniques to coordinate the simulation activity. The interactions of system components are modeled by the interactions between individual parallel simulations. Thus the component simulations cooperate and coordinate their simulation activity and, taken together, provide detailed composite models of overall system-level behavior.

1 Introduction

In this paper, we present the framework through which multiple parallel simulations interact to produce models of applications executing in the Internet domain. We describe a new two-level synchronization mechanism that logically separates the internal synchronization protocol (that is, internal to a particular parallel simulation) from the external synchronization protocol (i.e. the mechanism that synchronizes the activity between parallel simulations). To demonstrate the applicability of our approach, we discuss how one important application, Internet telephony call-signaling protocols, fits into our modeling framework. This paper is of interest to the Internet community for two reasons. First, it outlines the approach that we feel is necessary to develop and efficiently execute detailed simulation models of distributed applications executing in the Internet domain. Second, it provides insight into how one complex distributed application, the simulation framework itself, is designed to achieve scalability while coordinating its simulation activity across the global Internet.

2 Parallel Discrete Event Simulation

In a parallel discrete event simulation (PDES), a simulation model is decomposed into a set of sub-models, each of which is capable of execution on a separate processor. These sub-models are generally termed logical processes or LPs, and the LPs communicate through time-stamped messages termed events. An event represents a change to the state of the system being modeled, and the timestamp represents the virtual (or simulation) time at which the event occurs. Each LP maintains its own event list, as well as its own virtual clock that represents the simulation time up to which the sub-model has been executed.

The fundamental problem in a parallel discrete event simulation is to maintain causality constraints, which essentially means that all causes of a given event are executed before the event itself is executed. That is, for a PDES to be correct, each LP must process all of its events in non-decreasing timestamp order.

The synchronization of PDES is a well-studied problem in the parallel simulation community, and synchronization techniques broadly fall into two categories. In a conservative approach (e.g. [1]), an LP is not allowed to process an event with timestamp t if it is possible to receive an event with a timestamp less than t at any point in the future. In the optimistic approach (e.g. TimeWarp [2]), an LP is allowed to process any event it receives, and any out-of-order processing is corrected through a state saving and rollback mechanism.

Our approach is based on conservative synchronization techniques because such techniques have been shown to scale well both analytically and empirically. It has been well established that for conservative techniques to perform well, the simulation model must possess good look-ahead characteristics, defined as the ability of an LP to predict its future behavior. Thus an important aspect of developing this simulation framework is the exploitation of the look-ahead characteristics available within the simulation model.

These ideas become somewhat more complex when multiple parallel simulations cooperate to model the behavior of a distributed application. In such a federation of parallel simulations, each individual simulation has two streams of events that must be merged and executed in the correct timestamp order. One event stream consists of the events representing the component being modeled by that particular simulation. These are the internal events, which have to do solely with the particular system component being modeled. The other event stream consists of the events that are injected into the simulation from other simulation federates, and represents the interaction of the various system components. These events are termed external events. For such a federation of parallel simulations to be correct, each cooperating parallel simulation must merge these two event streams and process all of the events in the composite event stream in the correct timestamp order.

3 Approach

The distributed application is modeled as a set of clusters that communicate and interact strictly through the Internet. There are no restrictions on the types of computational activity that can fill the role of a cluster in this model, and thus individual clusters can represent supercomputers, workstation/PC clusters communicating over a LAN, remote databases, remote parallel file systems, and other complex computational systems. In this approach, each of the clusters and the model of the underlying communication network (e.g. some portion of the global Internet) is modeled by a separate parallel simulation utilizing as much computational power as needed to model the component at the required level of detail. The look-ahead in this model is derived from the fact that the clusters communicate solely through the Internet, guaranteeing that a non-trivial amount of simulation time elapses between the virtual time at which a cluster places an event on the Internet and the virtual time at which the event actually impacts the simulation activity of the recipient. This look-ahead information provides the underpinnings for the development of scalable federated simulations modeling the interaction of these system components.

This simple model, shown in Figure 1, captures many important distributed applications quite well. It captures high-performance computing on the computational grid (using a system such as wide-area MPICH-G2 [3] or Globus [4]), where the clusters would represent high-performance computational resources that communicate and synchronize their activity across the global Internet. Internet service protocols, such as Internet telephony call signaling protocols, are also captured. In this case, the clusters would represent Intranet domains, where call signaling between Intranets would be routed over the Internet. Distributed collaboration using the Access Grid is also captured. In this case, the clusters would represent nodes on the Access Grid that communicate strictly across the Internet.

3.1 External Synchronization

In this framework, each parallel simulation has two synchronization mechanisms: one ensuring the proper sequencing of internal events and the other ensuring the proper merging of the two event streams. The internal PDES synchronization algorithm is chosen based on the characteristics of the model being simulated. The key factor of this approach is the aggressive exploitation of look-ahead information from within each of the simulation federates to increase the level of concurrency within the global simulation system. The external synchronization protocol is as follows.

Associated with each participating simulation is a local reservation agent. This agent is responsible for extracting, maintaining, and updating the look-ahead information related to its interactions with other simulation federates. The simulation associated with Internet connectivity derives this information based on the minimum delay between the current location of a given event and its arrival at the destination simulation. This information is updated as the event moves through the simulated network. In the case of the simulations modeling the clusters, the particular look-ahead mechanism is model dependent.

To track this look-ahead information, each of the local agents maintains a reservation board that lists all known external events (i.e. events destined for an external simulation) and the minimum time at which each such event can exit the simulation. This reservation board is updated as events move through the local simulation, with new events being added to the board as they are created and old events being deleted as they exit the simulation. It is important to note that the local agent does not have to track events that it knows will not exit the simulation.

The local agent distills this information from the reservation board into a small table (the Minimum Reservation Time Table or MRTT) with a single entry for each external simulation. The entry for a given simulation reflects the minimum time at which any event destined for that federate can exit the local simulation. In the absence of reservation information for a given external simulation, a standing reservation is assumed with reservation time equal to the sum of the current virtual time and the minimum delay required to create a new event and have that event exit the simulation.

4 An Example Application: Internet Telephony

Consider Internet call signaling protocols as an example application for the simulation framework. Call signaling and user feature deployment are inexorably moving to the Internet as the network of choice, primarily due to the ease and rapidity of developing and implementing user features in this environment. In Internet telephony, general-purpose computers are used for the call controller (often termed a “soft switch”), and user features may well be implemented and deployed as Java applets. As the call setup phase progresses, user feature applets may be spawned for both the caller and callee and, once spawned, communicate with each other and the Internet call controller using the Internet as the transmission medium.

Internet telephony offers the promise of easily developed personalized user features, which, in the unregulated world of IP, will result in significant numbers of parties writing their own personalized features. Such separately developed and completely independent user features will significantly increase the feature-interaction problem, where feature applets interact with each other in inconsistent, incomplete, or un-implementable ways. The problem of user feature-interaction is quite difficult, and the detailed simulation models we are developing, in which such interactions can be studied and evaluated, will be of significant value.

This simulation problem fits very well within the framework we are developing. That is, this application can be modeled as a set of clusters that communicate strictly across the Internet. The clusters represent large Intranets, where local telephony traffic stays within the given cluster, and telephony traffic between clusters is routed through the Internet. The local events of the individual clusters represent the computational activity (including Intranet telephony) that remains strictly within the cluster. The external events of the clusters represent Internet telephony activity between clusters. The internal events of the Internet simulation have to do with the generation of realistic Internet traffic with which the Internet telephony messages must contend, and the external events are those placed on the Internet by one of the clusters.

5 Related Work

The most closely related projects include the DoD High Level Architecture [5], the federation of parallel and sequential simulations undertaken at the Georgia Institute of Technology [6], and the Scalable Simulation Framework (SSF, [7]). The most important distinction between our approach and these related efforts is that we are exploiting look-ahead information from within the individual simulations to increase the level of concurrency within the global federation. Other approaches do not attempt to exploit such information in their synchronization algorithm.

6 Conclusions and Future Work

In this paper, we have discussed our ongoing research efforts to model complex applications executing in the Internet domain. The primary task ahead of us is to complete the implementation of the simulation framework and gather experimental results related to the performance and scalability of the framework itself.

References:

[1] Nicol, D. The Cost of Conservative Synchronization in Parallel Discrete Event Simulation, Journal of the ACM, Vol. 40, No. 7, April 1993, pp. 304-333.

[2] Jefferson, D. Virtual Time, ACM Transactions on Programming Languages and Systems, 1985, Vol. 7, No. 3.

[3] URL: http://www.globus.org/about/news/MPICH-G2.html

[4] URL: http://www.globus.org

[5] Dahmann, J., Fujimoto, R. and R. Weatherly. The Department of Defense High Level Architecture, Proceedings of the 1997 Winter Simulation Conference.

[6] Riley, G., Fujimoto, R., and M. Ammar. A Generic Framework for Parallelization of Network Simulations, MASCOTS, 1999.

[7] URL: http://www.ssfnet.org

[Figure: Cluster 0, Cluster 1, Cluster 2, and Cluster 3, each connected to an Internet Cloud.]

Figure 1. This figure represents the basic system model where computational clusters communicate over the Internet.
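As an illustration of the external synchronization protocol of Section 3.1, the following is a minimal sketch, not the framework's implementation. It shows a local reservation agent that keeps a reservation board and distills it into an MRTT, and a federate that executes its merged internal and external events in non-decreasing timestamp order up to a safe time. All names (`Event`, `ReservationAgent`, `Federate`, `min_exit_delay`, `advance`) are hypothetical. Two points are assumptions rather than statements from the text: the standing reservation is applied only when the board holds no entry for a destination, and the receiving federate is assumed to use the smallest reservation time promised to it by the other federates as its safe time.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Event:
    timestamp: float                      # virtual time at which the event occurs
    name: str = field(compare=False)      # label only; ignored when ordering


class ReservationAgent:
    """Hypothetical local reservation agent for one federate."""

    def __init__(self, external_federates, min_exit_delay):
        self.external_federates = list(external_federates)
        # Assumed model constant: least virtual time needed to create a new
        # event and have that event exit the local simulation.
        self.min_exit_delay = min_exit_delay
        # Reservation board: event id -> (destination, earliest exit time).
        self.board = {}

    def post(self, event_id, destination, earliest_exit):
        """Add an external event, or update it as it moves through the model."""
        self.board[event_id] = (destination, earliest_exit)

    def remove(self, event_id):
        """Delete an event once it has exited the local simulation."""
        self.board.pop(event_id, None)

    def mrtt(self, current_virtual_time):
        """Distill the board into one minimum reservation time per federate."""
        standing = current_virtual_time + self.min_exit_delay
        table = {}
        for dest in self.external_federates:
            times = [t for d, t in self.board.values() if d == dest]
            # Assumption: the standing reservation is used only when the
            # board holds nothing for this destination.
            table[dest] = min(times) if times else standing
        return table


class Federate:
    """Hypothetical federate that merges internal and external events."""

    def __init__(self):
        self.clock = 0.0
        self.pending = []      # one heap holding both event streams
        self.processed = []    # names of executed events, in execution order

    def schedule(self, event):
        heapq.heappush(self.pending, event)

    def advance(self, safe_time):
        """Execute pending events with timestamp <= safe_time, in order."""
        while self.pending and self.pending[0].timestamp <= safe_time:
            event = heapq.heappop(self.pending)
            self.clock = event.timestamp
            self.processed.append(event.name)
        return self.clock
```

In this sketch a federate would call `advance` with the minimum of the MRTT entries that the other federates report for it, so that no external event with a smaller timestamp can still arrive. Keeping both event streams in a single heap is one simple way to obtain the composite timestamp order; a real federate would delegate internal events to its own PDES synchronization algorithm.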