

Towards a Simulation Framework for the Evaluation of Distributed Applications Executing in the Internet Domain
Phillip M. Dickens
Department of Computer Science
Illinois Institute of Technology
Chicago, Illinois

Abstract

We are developing detailed simulation models to study, evaluate, and predict the performance of complex distributed applications executing in the Internet domain. Our approach is to model each component of the distributed application in as much detail as required, and to efficiently execute the model by partitioning it across multiple processors and using scalable parallel simulation techniques to coordinate the simulation activity. The interactions of system components are modeled by the interactions between individual parallel simulations. Thus the component simulations cooperate and coordinate their simulation activity, and, when taken together, provide detailed composite models of overall system-level behavior.

1 Introduction

In this paper, we present the framework through which multiple parallel simulations interact to produce models of applications executing in the Internet domain. We describe a new two-level synchronization mechanism that logically separates the internal synchronization protocol (that is, internal to a particular parallel simulation) from the external synchronization protocol (i.e. the mechanism that synchronizes the activity between parallel simulations). To demonstrate the applicability of our approach, we discuss how one important application, that being Internet telephony call-signaling protocols, fits into our modeling framework. This paper is of interest to the Internet community for two reasons. First, it outlines the approach that we feel is necessary to develop and efficiently execute detailed simulation models of distributed applications executing in the Internet domain. Second, it provides insight into how one complex distributed application, that being the simulation framework itself, is designed to achieve scalability while coordinating its simulation activity across the global Internet.

2 Parallel Discrete Event Simulation

In a parallel discrete event simulation (PDES), a simulation model is decomposed into a set of sub-models, each of which is capable of execution on a separate processor. These sub-models are generally termed logical processes or LPs, and the LPs communicate through time-stamped messages termed events. An event represents a change to the state of the system being modeled, and the timestamp represents the virtual (or simulation) time at which the event occurs. Each LP maintains its own event list, as well as its own virtual clock that represents the simulation time up to which the sub-model has been executed.

The fundamental problem in a parallel discrete event simulation is to maintain causality constraints, which essentially means that all causes of a given event are executed before the event itself is executed. That is, for a PDES to be correct, each LP must process all of its events in non-decreasing timestamp order.

The synchronization of PDES is a well-studied problem in the parallel simulation community, and synchronization techniques broadly fall into two categories.
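As a rough illustration of the logical-process structure described in this section, the following Python sketch models a single LP with its own event list and virtual clock. The class, method, and field names (`Event`, `LogicalProcess`, `schedule`, `step`, `payload`) are hypothetical and chosen for illustration only; this is not the framework's implementation, and it deliberately ignores how LPs are synchronized with one another.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Event:
    timestamp: float                     # virtual time at which the event occurs
    payload: str = field(compare=False)  # hypothetical label for the state change


class LogicalProcess:
    """One sub-model with its own event list and virtual clock (illustrative)."""

    def __init__(self, name):
        self.name = name
        self.clock = 0.0       # simulation time up to which this LP has executed
        self.event_list = []   # min-heap ordered by event timestamp

    def schedule(self, event):
        # An event stamped earlier than the local clock would have to be
        # processed out of order, which breaks the causality constraint.
        if event.timestamp < self.clock:
            raise ValueError("causality violation: event lies in the LP's past")
        heapq.heappush(self.event_list, event)

    def step(self):
        # Process the pending event with the smallest timestamp and
        # advance the virtual clock to that timestamp.
        event = heapq.heappop(self.event_list)
        self.clock = event.timestamp
        return event


lp = LogicalProcess("lp0")
for t, label in [(5.0, "c"), (1.0, "a"), (3.0, "b")]:
    lp.schedule(Event(t, label))
order = [lp.step().payload for _ in range(3)]
```

Events are processed in non-decreasing timestamp order regardless of the order in which they were scheduled, and a later attempt to schedule an event before the current clock value is rejected.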
In a conservative approach (e.g. [1]) an LP is not allowed to process an event with timestamp t if it is possible to receive an event with a timestamp less than t at any point in the future. In the optimistic approach (e.g. TimeWarp [2]), an LP is allowed to process any event it receives, and any out-of-order processing is corrected through a state saving and rollback mechanism.

Our approach is based on conservative synchronization techniques due to the fact that such techniques have been shown to scale well both analytically and empirically. It has been well established that in order for conservative techniques to perform well the simulation model must possess good look-ahead characteristics, defined as the ability of an LP to predict its future behavior. Thus an important aspect of developing this simulation framework is the exploitation of the look-ahead characteristics available within the simulation model.

These ideas become somewhat more complex when multiple parallel simulations cooperate to model the behavior of a distributed application. In such a federation of parallel simulations, each individual simulation has two streams of events that must be merged and executed in the correct timestamp order. One event stream consists of the events representing the component being modeled by that particular simulation. This event stream represents internal events, or those events having to do solely with the particular system component being modeled. The other event stream consists of the events that are injected into the simulation from other simulation federates, and represents the interaction of the various system components. These events are termed external events. Each cooperating parallel simulation must merge these two event streams such that all of the events in the composite event stream are executed in the correct timestamp order. For such a federation of parallel simulations to be correct, each parallel simulation must process all of its events in the composite event stream in the correct timestamp order.

3 Approach

The distributed application is modeled as a set of clusters that communicate and interact strictly through the Internet. There are no restrictions on the types of computational activity that can fill the role of a cluster in this model, and thus individual clusters can represent supercomputers, workstation/PC clusters communicating over a LAN, remote databases, remote parallel file systems, and other complex computational systems. In this approach, each of the clusters and the model of the underlying communication network (e.g. some portion of the global Internet) are modeled by a separate parallel simulation utilizing as much computational power as needed to model the component at the required level of detail. The look-ahead in this model is derived from the fact that the clusters communicate solely through the Internet, guaranteeing that a non-trivial amount of simulation time elapses between the virtual time at which a cluster places an event on the Internet, and the virtual time at which the event actually impacts the simulation activity of the recipient. This look-ahead information provides the underpinnings for the development of scalable federated simulations modeling the interaction of these system components.

This simple model, shown in Figure 1, captures many important distributed applications quite well. It captures high-performance computing on the computational grid (using a system such as wide-area MPICH-G2 [3] or Globus [4]) where the clusters would represent high-performance computational resources that communicate and synchronize their activity across the global Internet. Internet service protocols, such as Internet telephony call signaling protocols, are also captured. In this case, the clusters would represent Intranet domains, where call signaling between Intranets would be routed over the Internet. Distributed collaboration using the Access Grid is also captured. In this case, the clusters would represent nodes on the

Access Grid that communicate strictly across the Internet.

3.1 External Synchronization

In this framework, each parallel simulation has two synchronization mechanisms: one ensuring the proper sequencing of internal events and the other ensuring the proper merging of the two event streams. The internal PDES synchronization algorithm is chosen based on the characteristics of the model being simulated. The key factor of this approach is the aggressive exploitation of look-ahead information from within each of the simulation federates to increase the level of concurrency within the global simulation system. The external synchronization protocol is as follows.

Associated with each participating simulation is a local reservation agent. This agent is responsible for extracting, maintaining, and updating the look-ahead information related to its interactions with other simulation federates. The simulation associated with Internet connectivity derives this information based on the minimum delay between the current location of a given event and its arrival at the destination simulation. This information is updated as the event moves through the simulated network. In the case of the simulations modeling the clusters, the particular look-ahead mechanism is model dependent.

To track this look-ahead information, each of the local agents maintains a reservation board that lists all known external events (i.e. events destined for an external simulation), and the minimum time at which each such event can exit the simulation. This reservation board is updated as events move through the local simulation, new events being added to the board as they are created, and old events being deleted as they exit the simulation. It is important to note that the local agent does not have to track events that it knows will not exit the simulation.

The local agent distills this information from the reservation board into a small table (the Minimum Reservation Time Table or MRTT) with a single entry for each external simulation. The entry for a given simulation reflects the minimum time at which any event destined for that federate can exit the local simulation. In the absence of reservation information for a given external simulation a standing reservation is assumed with reservation time equal to the sum of the current virtual time and the minimum delay required to create a new event and have that event exit the simulation.

4 An Example Application: Internet Telephony

Consider Internet call signaling protocols as an example application for the simulation framework. Call signaling and user feature deployment are inexorably moving to the Internet as the network of choice, primarily due to the ease and rapidity of developing and implementing user features in this environment. In Internet telephony, general-purpose computers are used for the call controller (often termed a “soft switch”), and user features may well be implemented and deployed as Java applets. As the call setup phase progresses, user feature applets may be spawned for both the caller and callee, and once spawned, communicate with each other and the Internet call controller using the Internet as the transmission medium.

Internet telephony offers the promise of easily developed personalized user features, which, in the unregulated world of IP, will result in significant numbers of parties writing their own personalized features. Such separately developed and completely independent user features will significantly increase the feature-interaction problem, where feature applets interact with each other in inconsistent, incomplete, or unimplementable ways. The problem of user feature-interaction is quite difficult, and the detailed simulation models we are developing where such interactions can be

studied and evaluated will be of significant value.

Consider how this simulation problem fits very well within the framework we are developing. That is, this application can be modeled as a set of clusters that communicate strictly across the Internet. The clusters represent large Intranets, where local telephony traffic stays within the given cluster, and telephony traffic between clusters is routed through the Internet. The local events of the individual clusters represent the computational activity (including Intranet telephony) that remains strictly within the cluster. The external events of the clusters represent Internet telephony activity between clusters. The internal events of the Internet simulation have to do with the generation of realistic Internet traffic with which the Internet telephony messages must contend, and the external events are those placed on the Internet by one of the clusters.

5 Related Work

The most closely related projects include the DoD High Level Architecture [5], the federation of parallel and sequential simulations undertaken at the Georgia Institute of Technology [6], and the Scalable Simulation Framework (SSF, [7]). The most important distinction between our approach and these related efforts is that we are exploiting look-ahead information from within the individual simulations to increase the level of concurrency within the global federation. Other approaches do not attempt to exploit such information in their synchronization algorithm.

6 Conclusions and Future Work

In this paper, we have discussed our ongoing research efforts to model complex applications executing in the Internet domain. The primary task ahead of us is to complete the implementation of the simulation framework and gather experimental results related to the performance and scalability of the framework itself.

References:

[1] Nicol, D. The Cost of Conservative Synchronization in Parallel Discrete Event Simulation, Journal of the ACM, Vol. 40, No. 7, April 1993, pp. 304-333.

[2] Jefferson, D. Virtual Time, ACM Transactions on Programming Languages and Systems, 1985, Vol 7, No. 3.

[3] URL: MPICH-G2.html

[4] URL:

[5] Dahmann, J., Fujimoto, R. and R. Weatherly. The Department of Defense High Level Architecture, Proceedings of the 1997 Winter Simulation Conference.

[6] Riley, G., Fujimoto, R., and M. Ammar. A Generic Framework for Parallelization of Network Simulations, MASCOTS,

[7] URL:

[Figure 1 diagram: numbered clusters connected through a central Internet cloud.]

Figure 1. This figure represents the basic system model where computational clusters communicate over the Internet.
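The reservation board and Minimum Reservation Time Table (MRTT) of Section 3.1 can be illustrated with a minimal Python sketch. All names here (`ReservationAgent`, `add_event`, `remove_event`, `mrtt`, `min_exit_delay`) are hypothetical, and the sketch follows the textual description only: one board entry per known external event, one MRTT entry per external simulation, and a standing reservation of current virtual time plus minimum exit delay when nothing is known for a destination. It is not the framework's actual implementation.

```python
class ReservationAgent:
    """Illustrative local reservation agent for one simulation federate."""

    def __init__(self, externals, min_exit_delay):
        self.externals = list(externals)      # names of the external simulations
        # Assumed constant: minimum delay to create a new event and
        # have that event exit the local simulation.
        self.min_exit_delay = min_exit_delay
        self.board = {}  # event id -> (destination, earliest exit time)

    def add_event(self, event_id, destination, earliest_exit):
        # Called when an external event is created, and again whenever
        # its earliest possible exit time is updated.
        self.board[event_id] = (destination, earliest_exit)

    def remove_event(self, event_id):
        # Called when the event has exited the local simulation.
        self.board.pop(event_id, None)

    def mrtt(self, now):
        # Start every destination at the standing reservation.
        standing = now + self.min_exit_delay
        table = {dest: standing for dest in self.externals}
        # Where reservation information exists, the entry is the minimum
        # exit time over the known events for that destination.
        known = {}
        for dest, exit_time in self.board.values():
            known[dest] = min(exit_time, known.get(dest, exit_time))
        table.update(known)
        return table


agent = ReservationAgent(["cluster_A", "cluster_B"], min_exit_delay=2.0)
agent.add_event("e1", "cluster_A", 15.0)
agent.add_event("e2", "cluster_A", 13.0)
table = agent.mrtt(now=10.0)
```

In this usage, `cluster_A` receives the smaller of its two known exit times, while `cluster_B`, for which no reservation information exists, falls back to the standing reservation.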

