DICE_perfSONAR
perfSONAR Update
Eric Boyd
Joe Metzger
Nicolas Simar
Martin Swany
Agenda
• perfSONAR Overview
• perfSONAR Development Status and Plans
• perfSONAR Demos
• perfSONAR Deployment Status and Plans
Vision: Performance Information is …
• Available
– People can find it (Discovery)
– “Community of trust” allows access across administrative
domain boundaries (AA)
• Ubiquitous
– Widely deployed (Paths of interest covered)
– Reliable (Consistently configured correctly)
• Valuable
– Actionable (Analysis suggests course of action)
– Automatable (Applications act on data)
• Easy to produce
– Extensible data-sharing toolkit
Getting There: Build & Empower the Community
Decouple the Problem Space:
– Analysis and Visualization
– Performance Data Sharing
– Performance Data Generation
Grow the Footprint:
– Clean APIs and protocols between each layer
– Widespread deployment of measurement infrastructure
– Widespread deployment of common performance measurement tools
(Diagram: a layered stack of Analysis & Visualization, Measurement Infrastructure, and Performance Tools, with an API between layers.)
What is perfSONAR?
• Performance Middleware
– perfSONAR is an international consortium led
by ESnet, GÉANT2, Internet2, and RNP
– perfSONAR is a set of protocol standards for
sharing data between measurement and
monitoring systems
– perfSONAR is a set of open source web
services that can be mixed-and-matched and
extended to create a performance monitoring
framework
perfSONAR Design Goals
• Standards-based
• Modular
• Decentralized
• Locally controlled
• Open Source
• Extensible
• Applicable to multiple generations of network
monitoring systems
• Grows “beyond our control”
• Customized for individual science disciplines
perfSONAR Integrates
• Network measurement tools
• Network measurement archives
• Discovery
• Authentication
• Data manipulation
• Resource protection
• Topology
perfSONAR Credits
• perfSONAR is a joint effort:
– ESnet
– GÉANT2 JRA1
– Internet2
– RNP
• ESnet includes:
– ESnet/LBL staff
– Fermilab
• Internet2 includes:
– University of Delaware
– Georgia Tech
– SLAC
– Internet2 staff
• GÉANT2 JRA1 includes:
– Arnes
– Belnet
– Carnet
– Cesnet
– CYNet
– DANTE
– DFN
– FCCN
– GRNet
– GARR
– ISTF
– PSNC
– Nordunet (Uninett)
– Renater
– RedIRIS
– Surfnet
– SWITCH
perfSONAR Development Process
• Loosely coordinated development of web services
– Each web service “owned” by 1 or 2 developers
• Core set of services released as a joint package
– Interoperability testing within the core
– Interoperability testing with “common” UIs
• Reference implementation in Java; some services in Perl
• Common development resources (e.g. Bugzilla, mailing
lists, SVN, Wiki)
• Steering committee
• Regular email discussions and conference calls
• Quarterly face-to-face meetings
perfSONAR release 1.1 (orange signifies new)
• Production release of core services package v1.1 is planned for
February, 2007
– Single domain LS solution (PSNC)
– RRD MA (PSNC)
– SQL MA (PSNC)
– BWCTL MP (DFN)
– SSH/Telnet MP (Belnet)
• Recommended visualization to make use of those services
– perfSONAR UI (ISTF)
– CNM (DFN)
• Quality improvements
– Bug fixes
– Documentation
– Functional testing
– Installation
Ongoing and Planned Development Work
• Authentication
– Semantics defined (MACE, JRA5)
• Authorization
– Discussion has just begun (RedIRIS)
• Discovery
– Single LS released (PSNC)
– Multi-LS developed (UDel), in testing (UDel/PSNC)
• Data manipulation
– Anomaly detection service started (UDel)
– NOC Analysis tools under development (SLAC)
• Network measurement archives
– Hades MA (DFN)
– OWAMP MA started (GaTech)
– RRD MA (PSNC, flow: CARNET)
– SQL MA (PSNC, L2 status: PSNC)
• Network measurement tools
– ABW (CESNET)
– BWCTL MP (DFN)
– BWCTL becomes a MA/MP (Internet2)
– Ciena MP (UDel)
– CLI MP (RNP)
– L2 status MP (DFN/JRA4)
– Netflow subscription MP (Surfnet)
– SSH/Telnet MP (Belnet)
– TCMP (Arnes)
– Traceroute MP started (GaTech)
• Topology
– TopS under development (RedIRIS)
– cNIS under development (SA3)
– Extension of Indiana NOC DB (Internet2)
– Unified Information Service started (UDel)
Visualisation – Status Update
• Allows diversity on the measurement layer and on the
visualization layer:
– BWCTL webpage (DFN)
– CNM (DFN) – Top 10, dashboard.
– ICE/NeTraMet (RNP)
– JRA4 E2E L2 visualisation (DFN)
– Looking glass (BELNET)
– NEMO (UNINETT)
– perfsonarUI (ISTF)
– VisualperfSONAR (CARNET)
perfSONAR Demos
• Visual perfSONAR
– https://noc-mon.srce.hr/visual_perf/
• perfSONAR UI
– http://wiki.perfsonar.net/jra1-wiki/index.php/PerfsonarUI
perfSONAR Adoption
• R&E Networks
– Internet2
– ESnet
– GÉANT2
– European NRENs
– RNP
• Targeted Application Communities (2007)
– LHC
– GLORIAD Distributed Virtual NOC
– Teragrid
• Distributed Development
– Individual projects (10 before first release) write components that integrate into the overall framework
– Individual communities (5 before first release) write their own analysis and visualization software
perfSONAR Deployment Status
GÉANT2 Deployment Status
• 2 LS
• 15 MA (Renater and GARR)
• Hades (a.k.a. IP Performance Metrics) MA
– 1 service + 22 measurement nodes
• 4 BWCTL / OWAMP MP
• 1 Telnet / SSH MP
• Also RNP (Brazil NREN), MREN (Montenegrin
NREN), SEEREN2
Internet2 Deployment Status
• Focus is on development of services for Internet2's new network and integration with the Indiana NOC
• Submitting a proposal to NSF for additional funding
• Target: July 1, 2007, as the new Internet2 network goes operational
– OWAMP MA
– BWCTL MA/MP
– IU-based Topology Service
– Multi-LS
– NOC Alarm Transformation Service
ESnet Deployment Status
• RRDMA to export our link utilization statistics
• SNMP based link status polling system
– link status for LHCOPN and Service trial circuits
• E2E-MON MP from DFN to export this status
• Deploying active latency and bandwidth monitoring probes
around the network
– but have not integrated this with perfSONAR yet
GÉANT2 Transition to Service: Multi-Domain Monitoring (MDM) Service
(Diagram: users reach the service through their own visualisation or the GN2 visualisation, over perfSONAR SOAP XML + JRA5 AA. Domains A, B, and C each run a BWCTL MP, an OWD MA, and a Lookup service.)
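The per-domain layout on this slide (a BWCTL MP, an OWD MA, and a Lookup service in each of Domains A, B, and C) can be pictured with a small in-memory sketch. This is only an illustration of the discovery idea, not the perfSONAR Lookup Service protocol: the `LookupService` class, the `discover` helper, and the `example.org` URLs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LookupService:
    """Toy per-domain registry; the real Lookup Service is a web service."""
    domain: str
    services: list = field(default_factory=list)

    def register(self, service_type, url):
        # Record one service offered by this domain.
        self.services.append({"domain": self.domain, "type": service_type, "url": url})

    def query(self, service_type):
        # Return every registered service of the requested type.
        return [s for s in self.services if s["type"] == service_type]


def discover(lookups, service_type):
    """Ask every domain's lookup for one service type and merge the answers."""
    found = []
    for lookup in lookups:
        found.extend(lookup.query(service_type))
    return found


lookups = []
for name in ("A", "B", "C"):
    ls = LookupService(domain=f"Domain {name}")
    # Placeholder URLs under example.org; these are not real endpoints.
    ls.register("BWCTL MP", f"https://domain-{name.lower()}.example.org/bwctl-mp")
    ls.register("OWD MA", f"https://domain-{name.lower()}.example.org/owd-ma")
    lookups.append(ls)

owd_archives = discover(lookups, "OWD MA")
for entry in owd_archives:
    print(entry["domain"], entry["url"])
```

A visualisation tool would start from a query like this to find which archives exist in each domain before asking any of them for data.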
Multi-Domain Monitoring
Service
• User: a role, meaning a group of people making use of an MDM Service.
– There may be several categories of users having different needs.
• E2E really means Edge to Edge, not End to End (unless end
institutions buy into it).
– Must go as close as possible to the end-institution – regional and
metropolitan networks should also be involved.
• An NREN has two roles:
– Data supplier.
– Data user.
Multi-Domain Monitoring
Service
• Multi-Domain Monitoring Service
– Access to a set of monitoring functionalities (e.g. accessing metrics or performing tests) offered to a group of users, accessible directly through an XML SOAP interface (perfSONAR protocol) or through visualisation tools.
– Based on an underlying set of perfSONAR web-services.
• perfSONAR web-service
– Web service (providing data or allowing an action to be performed) using the NM-WG XML format. The perfSONAR web-services are the basic building blocks of an MDM service.
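To make "XML SOAP interface" concrete, here is a minimal sketch of wrapping an NM-WG-style message in a SOAP 1.1 envelope. It is not a validated perfSONAR request. The NM-WG namespace URI, the `SetupDataRequest` message type, the element layout, and the ids are assumptions made for illustration; the `build_request` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
NMWG_NS = "http://ggf.org/ns/nmwg/base/2.0/"           # assumed NM-WG base namespace


def build_request(message_type, subject_text):
    """Wrap a minimal NM-WG-style message in a SOAP envelope; return it as a string."""
    ET.register_namespace("soap", SOAP_NS)
    ET.register_namespace("nmwg", NMWG_NS)

    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    message = ET.SubElement(body, f"{{{NMWG_NS}}}message", {"type": message_type})

    # Metadata describes what is being asked about; the data element is the
    # placeholder a service would fill in. Ids and layout are illustrative.
    metadata = ET.SubElement(message, f"{{{NMWG_NS}}}metadata", {"id": "meta1"})
    subject = ET.SubElement(metadata, f"{{{NMWG_NS}}}subject", {"id": "subj1"})
    subject.text = subject_text
    ET.SubElement(message, f"{{{NMWG_NS}}}data", {"id": "data1", "metadataIdRef": "meta1"})

    return ET.tostring(envelope, encoding="unicode")


request_xml = build_request("SetupDataRequest", "link utilisation, Domain A")
print(request_xml)
```

The sketch only builds the document. Sending it to a service, and the AA step shown on the earlier diagram, are left out.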
Users Segmentation
Tailored
Project
Advance Service Project SLA Added
User group and their Monotoring Data Trouble- Service
Trouble- Health Trouble- Verificat Value
Usage. shooting Health
shooting Check shooting ion Function
check
al
PERT Yes Yes Yes
NOC Yes Yes Yes
Layer2 Project [optional] [optional] Yes Yes
Layer3 Project [optional] [optional] Yes Yes
PIP Project [optional] [optional] Yes Yes
NREN non technical Staff Yes Yes Yes
End-User [optional]
Network Researcher
Security
MDM Service Support
• Infrastructure to support the perfSONAR web-services and the
visualisation tools used by the MDM will be set-up.
– For the deployers: installation, configuration, incident, monitoring.
– For users: installation, utilisation.
(Diagram: Users (NOC, PERT, Projects), Deployers (RENs), and the ISS with its User Service Desk and Deployer Service Desk, connected by SLAs(*).)
(*) Don’t get scared here!
MDM Service Support
• Level1 – Service Desk (ISS)
– Helps to install and configure the tools, runs reachability tests, helps with usability, tracks RFEs, forwards problems to the proper person, logs requests, updates the documentation, and tracks bugs. This is a central function (rotating member or group of people - ownership).
• Level2 – Administrator (RENs or FMS)
– Administrators of the machines where the services are installed. The function lies within the providers. They are in charge of the security of the services, and of their availability (up) and reachability (no firewall, etc). The services should be available 24/7.
• Level3 – Developers (3 years subcontract).
– The JRA1 developers who have built the services. They are in charge of implementing new features, fixing bugs, and answering the queries forwarded by Level1.
• The three levels of support will be available to both the users and the
deployers.
MDM Service Support – Fully Managed Service (FMS)
• A turnkey service could be provided for the web-services of an MDM service or part of it.
– HW bought.
– Web-services installed, monitored and managed on behalf of the REN.
– Level2 Service Support provided.
– REN would still have to do a little bit.
• Physical installation.
• GPS antenna coordination.
• Provide the data.
• Train its staff.
– Saves 40% of the installation effort and all the support time once installed.
Going Operational
• Pre-roll Out – define and set up the support structure: now – March 07.
• Pilot – April 07 – August 07 – 5 RENs + GÉANT2
– For NOC and PERT (no AA)
– Understand the issues of going operational.
– Validate the support structure, get feedback for next phase.
– Release in January, deployment training in February.
– Test the Fully Managed Service.
• Prototype – October 07 – February 08 – 11 RENs + GÉANT2
– For NOC, PERT and a limited number of projects.
– Verify the MDM SLA.
– Dedicated support team.
– Verify how to provide the service to external parties.
• Operation – April 08
– More RENs, closer to end-institution.
– More projects supported.
Pilot - Objectives
• Pilot
– April 07 – August 07
– 5 RENs + GÉANT2
– For NOC and PERT (no AA)
• Objectives
– Understand the issues of going operational.
– Validate the support structure set-up.
– Get staff trained, raise awareness, provide feedback.
– Use a trustworthy platform.
– Get feedback for next phase.
Pilot - Functionalities
Metric | Time | Location
L3 link utilisation, L3 link capacity | Latest, historical | Backbone and access links.
Domain link L2 status | Last 5 min, historical (*) | Only for LHC circuits.
show commands (**) | On-demand | All backbone switches/routers.
TCP/UDP throughput (***) | On-demand, historical | From three sources connected to important network node.
OWD, jitter, OWPL and traceroute | Historical | From the same sources as for the throughput metric.
(*) Historical is currently not required by the Circuit E2E monitoring tool.
(**) Commands are chosen from a list of pre-defined commands for all the backbone routers.
(***) UDP throughput tests will be restricted to PERT, to discover dropped packets when a problem is difficult to solve.
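The two restrictions in the footnotes (show commands limited to a pre-defined list, UDP throughput tests limited to PERT) amount to a small access policy. The sketch below is one hypothetical way to express it. The command names, the `is_request_allowed` function, and the test-type labels are assumptions, since the slides do not enumerate the command list or describe an implementation.

```python
# Hypothetical pre-defined command list; the slides do not enumerate it.
ALLOWED_SHOW_COMMANDS = {"show interfaces", "show ip route"}

# The pilot is offered to NOC and PERT users.
PILOT_GROUPS = {"NOC", "PERT"}


def is_request_allowed(user_group, test_type, command=None):
    """Return True if a pilot request respects the footnoted restrictions."""
    if user_group not in PILOT_GROUPS:
        return False
    if test_type == "show":
        # (**) only commands from the pre-defined list may be run
        return command in ALLOWED_SHOW_COMMANDS
    if test_type == "udp_throughput":
        # (***) UDP throughput tests are restricted to PERT
        return user_group == "PERT"
    if test_type == "tcp_throughput":
        return True
    # Unknown test types are refused in this sketch (an assumption).
    return False


print(is_request_allowed("NOC", "udp_throughput"))
print(is_request_allowed("PERT", "udp_throughput"))
print(is_request_allowed("NOC", "show", "show interfaces"))
```

Keeping the command list as data rather than code would let each domain publish its own pre-defined set.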
Pilot - Portfolio
Web-service | Metric | Time | Available through (JRA1 visualisations: CNM, perfsonarUI, Looking glass; JRA4 L2 visualisation; XML access)
RRD MA or SQL MA | L3 link utilisation | Historical | Yes Yes Yes
RRD MA or SQL MA | L3 link capacity | Historical | Yes Yes Yes
L2 status MP (*) | L2 circuit status | Latest | Yes Yes
SQL MA (*) | L2 circuit status | Historical | Yes Yes
Hades MA | OWD, IPDV, OWPL | Historical | Yes Yes Yes
Hades MA | traceroute | Historical | Yes
Telnet/SSH MP | Delay RTT | On-demand | Yes Yes
Telnet/SSH MP | show command | On-demand | Yes Yes
Telnet/SSH MP | Traceroute | On-demand | Yes Yes
BWCTL MP | Achievable throughput (TCP) | On-demand | Yes Yes
BWCTL MP | UDP throughput | On-demand | Yes Yes
Lookup Service | Service discovery | | Yes Yes Yes
(*) L2 status MP or SQL MA
More Information
• http://www.perfsonar.net/