Trace Generation to Simulate Large Scale Distributed Applications
Olivier Dalle, Emilio P. Mancini Mar. 8th, 2012
Outline
• Introduction
• The trace collection
• The hierarchical architecture
• The components
• An example
• Conclusion
O.Dalle, E.P. Mancini - Trace generation for large scale distributed applications Mar. 8th, 2012 - 2
Introduction
• Most distributed systems, such as Grids, offer massively parallel but loosely coupled resources: an accurate model of the application can help the scheduling decisions
• Simulators of parallel and distributed applications need an accurate model of application behavior, but the size of the traces for long-running parallel applications tends to explode
Introduction
• One solution is to buffer the data locally and gather them after the end of the program (post-mortem), but this approach has scalability issues
• We need to minimize the perturbation: the instrumentation competes with the application for the system’s resources
Introduction
• A distributed application is composed of a set of cooperating tasks
• The connections between them are in general not homogeneous
• Networks may present some hierarchy, e.g. fat trees, multi-switch hops ...
• Can we exploit that hierarchy for trace generation and instrumentation?
The Trace Collection: a Simplified Schema
[Figure: four nodes (Node 1 to Node 4), each running an application with its own collector / post processor (node, main, or local), connected through one gateway / switch above two gateways / switches. The nodes contain multi-core CPUs and GPUs.]
The classical computational cluster execution model:
• Several tasks on several nodes (e.g., MPI)
The Trace Collection: a Simplified Schema
[Figure: the same four-node schema, with a collector / post processor on each node below the gateway / switch hierarchy.]
We need to measure some parameters on each task, collect the local data, and gather them.
The Trace Collection: a Simplified Schema
[Figure: the same four-node schema, annotated with the two notes below.]
• In a Grid it is common to have a low-quality connecting link between the V.O. sites
• In HPC the bandwidth of the upper levels is shared between more hosts than that of the lower levels
We gather the data hierarchically, using local collectors, possibly applying local decimation or pre-processing. We exploit the locality principle.
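The hierarchical gathering described above can be sketched as follows. This is only an illustration under assumptions: the `Collector` class, its `keep_every` decimation factor, and the two-level tree are hypothetical names chosen here, not part of the presented system.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Collector:
    """Hypothetical collector node; names and fields are illustrative only."""
    name: str
    keep_every: int = 1                      # local decimation factor (assumption)
    parent: Optional["Collector"] = None
    buffer: List[float] = field(default_factory=list)

    def receive(self, samples: List[float]) -> None:
        # Buffer samples coming from sensors or from lower-level collectors.
        self.buffer.extend(samples)

    def flush(self) -> List[float]:
        # Decimate locally, then forward only the reduced data to the parent.
        reduced = self.buffer[::self.keep_every]
        self.buffer = []
        if self.parent is not None:
            self.parent.receive(reduced)
        return reduced


# Two node collectors feed one local collector, which feeds the main collector.
main = Collector("main")
local = Collector("local", keep_every=2, parent=main)
node1 = Collector("node1", keep_every=2, parent=local)
node2 = Collector("node2", keep_every=2, parent=local)

node1.receive([float(i) for i in range(8)])
node2.receive([float(i) for i in range(100, 108)])
node1.flush()
node2.flush()
local.flush()
print(main.buffer)
```

Each level keeps one sample out of two, so only a quarter of the raw samples cross the top link in this sketch; the most aggressive reduction happens closest to where the data is produced.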
The Trace Collection
[Figure: sequence diagram between the Simulator, the Management Unit, the Application, the Sensors, and the Collectors.]
Messages in the diagram:
a. Starts
b. Starts with instrumentation
c. Estimate overhead
d. Event
e. Event’s data
f. Post processing
g. Propagation data
h. Gathers
i. Post processing
j. Traces
Phases:
1. Infrastructure initialization
2. Execution with instrumentation
   a. Environment update (e.g., LD_PRELOAD)
   b. Middleware launcher (e.g., mpiexec, qsub …)
3. Data collection
   a. Overhead estimation
   b. Events’ measurement
4. Processing and propagation
   a. Decimation
   b. Compression
   c. Buffering
   d. …
5. Trace generation
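As an illustration of the environment update named in the slide, the sketch below prepares a child-process environment with `LD_PRELOAD` set. The slide only names `LD_PRELOAD` as an example; the helper names `preload_env` and `launch` and the library path are hypothetical.

```python
import os
import subprocess
from typing import Dict, List, Optional


def preload_env(sensor_lib: str,
                base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with sensor_lib first in LD_PRELOAD."""
    env = dict(os.environ) if base is None else dict(base)
    previous = env.get("LD_PRELOAD", "")
    # The dynamic linker accepts a colon-separated list of libraries.
    env["LD_PRELOAD"] = f"{sensor_lib}:{previous}" if previous else sensor_lib
    return env


def launch(command: List[str], sensor_lib: str) -> int:
    """Run command with the sensor library preloaded; return its exit code."""
    return subprocess.run(command, env=preload_env(sensor_lib)).returncode


# Hypothetical path, used only to show the resulting variable.
env = preload_env("/opt/dt/libsensor.so", base={"PATH": "/usr/bin"})
print(env["LD_PRELOAD"])
```

Preloading lets the sensor intercept library calls without recompiling the application, which is presumably why the slide lists it before the middleware launcher.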
The architecture
[Figure: architecture diagram. Labels: Simulator; Management unit (Launching unit, Traces management unit, Storage unit, Analysis); Collectors hierarchy; Trace files; and, on each operating system, Application, Sensor, Client/Server Collector, Buffer, Post processor.]
The sensors
The sensors:
• Instrument the application’s tasks
• Compute the instrumentation’s overhead
• Collect the raw data
• Send them to the first-level collectors
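One way a sensor could compute its own overhead is to time a no-op call with and without the probe. The sketch below assumes this calibration approach; the `instrument` wrapper, the `events` list, and `estimate_overhead` are hypothetical names, and the slides do not state how the real sensors measure overhead.

```python
import time
from typing import Callable, List, Tuple

events: List[Tuple[str, float]] = []   # raw (function name, duration) records


def instrument(func: Callable) -> Callable:
    """Wrap func so that every call appends a timing record to events."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        events.append((func.__name__, time.perf_counter() - start))
        return result
    return wrapper


def estimate_overhead(repeats: int = 10000) -> float:
    """Estimate the per-call cost of the probe, in seconds."""
    def noop() -> None:
        return None

    probed = instrument(noop)
    mark = len(events)

    start = time.perf_counter()
    for _ in range(repeats):
        noop()
    plain = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(repeats):
        probed()
    wrapped = time.perf_counter() - start

    del events[mark:]                  # discard the calibration records
    return max(0.0, (wrapped - plain) / repeats)


overhead = estimate_overhead()
print(f"estimated probe overhead: {overhead * 1e6:.2f} us per call")
```

The estimate can then be reported alongside the raw data, so that later stages can subtract it from the measured durations.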
The sensors
• We assume the system to be heterogeneous
• Every sensor makes an overhead analysis
• Then it propagates the information to the management unit
The collectors
• The collectors gather data from sensors and from other collectors
• Buffer incoming data
• Process collected data before sending them to upper levels
  • Decimation
  • Compression
  • …
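The buffering and compression steps listed above might look like the following sketch. The `BufferingCollector` class, its `capacity` threshold, and the JSON plus zlib encoding are assumptions made for illustration; the slides do not specify the record format or the compression scheme.

```python
import json
import zlib
from typing import List


class BufferingCollector:
    """Hypothetical collector that batches records and compresses each batch."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.pending: List[dict] = []
        self.sent: List[bytes] = []    # stands in for the link to the upper level

    def receive(self, record: dict) -> None:
        # Buffer the record; forward a batch once the buffer is full.
        self.pending.append(record)
        if len(self.pending) >= self.capacity:
            self.flush()

    def flush(self) -> None:
        # Compress the pending batch and hand it to the upper level.
        if not self.pending:
            return
        raw = json.dumps(self.pending).encode("utf-8")
        self.sent.append(zlib.compress(raw))
        self.pending = []


def unpack(payload: bytes) -> List[dict]:
    """Inverse of the batch encoding, as the upper level would apply it."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))


collector = BufferingCollector(capacity=4)
for i in range(10):
    collector.receive({"task": i % 2, "event": "fwrite", "bytes": 1024})
collector.flush()                      # forward the last partial batch
print(len(collector.sent), "batches forwarded")
```

Batching trades latency for fewer and smaller messages on the shared upper links.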
The Management Unit
• Launches the collector daemons
• Launches the application
• Gathers the data from the top collector
• Converts and stores the data in the required format
• Managed with scripts or a graphical interface
An Example of Data Collection
• We want to analyze the I/O of a parallel synthetic benchmark
• We want to check the overhead
• The benchmark is an MPI application of n tasks
• Every task runs on a different node and writes random data to the local file system
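A single task of such a benchmark could be approximated as below. This is a stand-in written for illustration: it runs one task without MPI, and the function name `run_task`, the block sizes, and the repeat count are assumptions rather than details of the authors' benchmark.

```python
import os
import tempfile
import time
from typing import Dict, List


def run_task(sizes: List[int], repeats: int = 5) -> Dict[int, float]:
    """Write random blocks of each size to a local temporary file.

    Returns the mean time per write, in seconds, keyed by block size.
    """
    results: Dict[int, float] = {}
    with tempfile.TemporaryFile() as out:
        for size in sizes:
            block = os.urandom(size)       # random payload, generated untimed
            start = time.perf_counter()
            for _ in range(repeats):
                out.write(block)
            out.flush()
            results[size] = (time.perf_counter() - start) / repeats
    return results


timings = run_task([1024, 2048, 4096])
for size, seconds in timings.items():
    print(f"{size:6d} bytes: {seconds * 1e3:.4f} ms per write")
```

Running the same loop with and without the sensor attached gives the two curves needed to check the overhead.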
An Example of Data Collection
We use the management unit to:
1. Create a hierarchical schema
2. Create the MPI launch scripts
3. Launch the collectors and the instrumented application
4. Collect the results

Hierarchical schema:

    <?xml version="1.0" encoding="UTF-8"?>
    <dtracer>
      <collector host="127.0.0.1" desc="Local">
        <collector host="192.168.56.101" desc="Node 1">
          <task cmd="hostname"/>
        </collector>
        <collector host="192.168.56.102" desc="Node 2">
          <task cmd="hostname"/>
        </collector>
      </collector>
    </dtracer>

Launch command:

    $MPIDIR/mpiexec.hydra \
      -env LD_PRELOAD \
      … $DTDIR/libdt_sensor.so \
      $HOME/bench/bench

[Figure: collector hierarchy between the management unit (with simulator, storage, and analysis) and the application: one main collector, two local collectors, four node collectors, and four sensors, launched through mpiexec or qsub.]
[Figure: time (ms) of fwrite and instrumented fwrite versus write size (Bytes).]
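To show how a hierarchical schema of this kind can be consumed, the sketch below walks the nested `collector` elements with Python's standard `xml.etree.ElementTree`. The element and attribute names are taken from the slide; the `walk` helper and the way the result is represented are assumptions, not the tool's actual loader.

```python
import xml.etree.ElementTree as ET
from typing import Iterator, List, Tuple

CONFIG = """<?xml version="1.0" encoding="UTF-8"?>
<dtracer>
  <collector host="127.0.0.1" desc="Local">
    <collector host="192.168.56.101" desc="Node 1">
      <task cmd="hostname"/>
    </collector>
    <collector host="192.168.56.102" desc="Node 2">
      <task cmd="hostname"/>
    </collector>
  </collector>
</dtracer>
"""

Entry = Tuple[int, str, str, List[str]]


def walk(node: ET.Element, depth: int = 0) -> Iterator[Entry]:
    """Yield (depth, host, description, task commands) for each collector."""
    for child in node.findall("collector"):
        tasks = [task.get("cmd", "") for task in child.findall("task")]
        yield depth, child.get("host", ""), child.get("desc", ""), tasks
        yield from walk(child, depth + 1)


root = ET.fromstring(CONFIG.encode("utf-8"))
hierarchy = list(walk(root))
for depth, host, desc, tasks in hierarchy:
    print("  " * depth + f"{desc} ({host}) tasks={tasks}")
```

The nesting depth of each element gives the level of the collector in the tree, which is the information a launcher would need to start the daemons top-down.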
Conclusion
Collecting large traces in distributed systems may perturb the application’s execution.
We presented a system that efficiently collects traces at run-time or post-mortem.
We use a hierarchical schema matching the network links’ capacity, with distributed buffering and processing.
Future improvements will include the automatic discovery of the network topology.
Thank you