DICE_perfSONAR by xiangpeng


									perfSONAR Update

 Eric Boyd
 Joe Metzger
 Nicolas Simar
 Martin Swany
•   perfSONAR Overview
•   perfSONAR Development Status and Plans
•   perfSONAR Demos
•   perfSONAR Deployment Status and Plans
Vision: Performance
  Information is …
 • Available
    – People can find it (Discovery)
    – “Community of trust” allows access across administrative
      domain boundaries (AA)
 • Ubiquitous
    – Widely deployed (Paths of interest covered)
    – Reliable (Consistently configured correctly)
 • Valuable
    – Actionable (Analysis suggests course of action)
    – Automatable (Applications act on data)
 • Easy to produce
    – Extensible data-sharing toolkit
Getting There: Build &
 Empower the Community
Decouple the Problem Space:
   – Analysis and Visualization
   – Performance Data Sharing
   – Performance Data
Grow the Footprint:
   – Clean APIs and protocols between each layer
   – Widespread deployment of measurement infrastructure
   – Widespread deployment of common performance measurement tools
[Diagram: layered stacks of Analysis & Visualization, API, Measurement Infrastructure, and Performance Tools]
What is perfSONAR?
• Performance Middleware
  – perfSONAR is an international consortium led
    by ESnet, GÉANT2, Internet2, and RNP
  – perfSONAR is a set of protocol standards for
    sharing data between measurement and
    monitoring systems
  – perfSONAR is a set of open source web
    services that can be mixed-and-matched and
    extended to create a performance monitoring
perfSONAR Design Goals
• Standards-based
• Modular
• Decentralized
• Locally controlled
• Open Source
• Extensible
• Applicable to multiple generations of network
  monitoring systems
• Grows “beyond our control”
• Customized for individual science disciplines
perfSONAR Integrates
  •   Network measurement tools
  •   Network measurement archives
  •   Discovery
  •   Authentication
  •   Data manipulation
  •   Resource protection
  •   Topology
perfSONAR Credits
 •   perfSONAR is a joint effort:
      – ESnet
      – GÉANT2 JRA1
      – Internet2
      – RNP
 •   ESnet includes:
      – ESnet/LBL staff
      – Fermilab
 •   Internet2 includes:
      – University of Delaware
      – Georgia Tech
      – SLAC
      – Internet2 staff
 •   GÉANT2 JRA1 includes:
      – Arnes
      – Belnet
      – Carnet
      – Cesnet
      – CYNet
      – DANTE
      – DFN
      – FCCN
      – GRNet
      – GARR
      – ISTF
      – PSNC
      – Nordunet (Uninett)
      – Renater
      – RedIRIS
      – Surfnet
      – SWITCH
perfSONAR Development Process
• Loosely coordinated development of web services
    – Each web service “owned” by 1 or 2 developers
• Core set of services released as a joint package
    – Interoperability testing within the core
    – Interoperability testing with “common” UIs
• Reference implementation in Java; some services in Perl
• Common development resources (e.g. Bugzilla, mailing
  lists, SVN, Wiki)
• Steering committee
• Regular email discussions and conference calls
• Quarterly face-to-face meetings
perfSONAR release 1.1 (orange signifies new)
• Production release of core services package v1.1 is planned for
  February, 2007
   – Single domain LS solution (PSNC)
   – RRD MA (PSNC)
   – SQL MA (PSNC)
   – SSH/Telnet MP (Belnet)
• Recommended visualization to make use of those services
   – perfSONAR UI (ISTF)
   – CNM (DFN)
• Quality improvements
   – Bug fixes
   – Documentation
   – Functional testing
   – Installation
Ongoing and Planned Development Work
• Authentication
   – Semantics defined (MACE, JRA5)
• Authorization
   – Discussion has just begun (RedIRIS)
• Discovery
   – Single LS released (PSNC)
   – Multi-LS developed (UDel), in testing (UDel/PSNC)
• Data manipulation
   – Anomaly detection service started (UDel)
   – NOC Analysis tools under development (SLAC)
• Network measurement archives
   – Hades MA (DFN)
   – OWAMP MA started (GaTech)
   – RRD MA (PSNC, flow: CARNET)
   – SQL MA (PSNC, L2 status: PSNC)
• Network measurement tools
   – ABW (CESNET)
   – BWCTL MP (DFN)
   – BWCTL becomes a MA/MP (Internet2)
   – Ciena MP (UDel)
   – CLI MP (RNP)
   – L2 status MP (DFN/JRA4)
   – Netflow subscription MP (Surfnet)
   – SSH/Telnet MP (Belnet)
   – TCMP (Arnes)
   – Traceroute MP started (GaTech)
• Topology
   – TopS under development (RedIRIS)
   – cNIS under development (SA3)
   – Extension of Indiana NOC DB (Internet2)
   – Unified Information Service started (UDel)
Visualisation – Status Update
• Allows diversity on the measurement layer and on the
  visualization layer:
   – BWCTL webpage (DFN)
   – CNM (DFN) – Top 10, dashboard.
   – ICE/NeTraMet (RNP)
   – JRA4 E2E L2 visualisation (DFN)
   – Looking glass (BELNET)
   – perfsonarUI (ISTF)
   – VisualperfSONAR (CARNET)
perfSONAR Demos
• Visual perfSONAR
   – https://noc-mon.srce.hr/visual_perf/
• perfSONAR UI
   – http://wiki.perfsonar.net/jra1-
perfSONAR Adoption
• R&E Networks
   – Internet2
   – ESnet
   – GÉANT2
   – European NRENs
   – RNP
• Targeted Application Communities (2007)
   – LHC
   – GLORIAD Distributed Virtual NOC
   – Teragrid
• Distributed Development
   – Individual projects (10 before first release) write components
     that integrate into the overall framework
   – Individual communities (5 before first release) write their own
     analysis and visualization software
perfSONAR Deployment Status
GÉANT2 Deployment Status

• 2 LS
• 15 MA (Renater and GARR)
• Hades (a.k.a. IP Performance Metrics) MA
   – 1 service + 22 measurement nodes
• 1 Telnet / SSH MP

• Also RNP (Brazil NREN), MREN (Montenegrin
Internet2 Deployment Status
• Focus is on development of services for Internet2 new
  network and integration with Indiana NOC
• Submitting a proposal to NSF for additional funding
• Target: July 1, 2007 as new Internet2 network goes
   – IU-based Topology Service
   – Multi-LS
   – NOC Alarm Transformation Service
ESnet Deployment Status
• RRDMA to export our link utilization statistics
• SNMP based link status polling system
   – link status for LHCOPN and Service trial circuits
• E2E-MON MP from DFN to export this status
• Deploying active latency and bandwidth monitoring probes
  around the network
   – but have not integrated this with perfSONAR yet
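
Link utilisation statistics of the kind an RRD MA exports are commonly derived from interface octet counters polled over SNMP. The following is a minimal sketch of that arithmetic, not a description of the ESnet implementation; the function name, the sample values, and the 64-bit counter width are assumptions made for illustration.

```python
def link_utilisation(octets_t0, octets_t1, seconds, capacity_bps, counter_bits=64):
    """Return average link utilisation (0.0 to 1.0) between two counter samples.

    Assumes an SNMP-style, monotonically increasing octet counter that wraps
    at 2**counter_bits (hypothetical inputs, for illustration only).
    """
    if seconds <= 0:
        raise ValueError("sampling interval must be positive")
    # Modular subtraction handles a single counter wrap between samples.
    delta_octets = (octets_t1 - octets_t0) % (2 ** counter_bits)
    bits_per_second = delta_octets * 8 / seconds
    return bits_per_second / capacity_bps


# Example: 37.5 GB transferred in 300 s on a 10 Gb/s link.
print(link_utilisation(0, 37_500_000_000, 300, 10_000_000_000))
```

The modular subtraction only covers one wrap per polling interval, which is why the polling period has to be short relative to the link speed when 32-bit counters are used.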
GÉANT2 Transition to Service:
 Multi-Domain Monitoring (MDM) Service



                perfSONAR SOAP XML + JRA5 AA

         BWCTL MP            BWCTL MP      BWCTL MP
          OWD MA              OWD MA        OWD MA
           Lookup              Lookup        Lookup

         Domain A            Domain B       Domain C
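
One way to read the diagram: each domain runs its own Lookup service where its MPs and MAs register, and a client asks each domain's Lookup service to discover what is available there. The following toy, in-memory sketch illustrates only that idea. The class names, the `find` and `discover` functions, and the example URLs are hypothetical and do not reflect the actual perfSONAR Lookup Service protocol.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Service:
    kind: str   # e.g. "BWCTL MP" or "OWD MA", as in the diagram
    url: str    # hypothetical access point


@dataclass
class LookupService:
    domain: str
    _registry: list = field(default_factory=list)

    def register(self, service):
        """Record a service offered by this domain."""
        self._registry.append(service)

    def find(self, kind):
        """Return all registered services of the given kind."""
        return [s for s in self._registry if s.kind == kind]


def discover(lookups, kind):
    """Ask every domain's Lookup service for services of one kind."""
    return {ls.domain: ls.find(kind) for ls in lookups}


# Three domains, each registering one MP and one MA with its own Lookup.
lookups = []
for name in ("A", "B", "C"):
    ls = LookupService(domain=f"Domain {name}")
    ls.register(Service("BWCTL MP", f"https://mp.domain-{name.lower()}.example/bwctl"))
    ls.register(Service("OWD MA", f"https://ma.domain-{name.lower()}.example/owd"))
    lookups.append(ls)

for domain, services in discover(lookups, "OWD MA").items():
    print(domain, [s.url for s in services])
```

Keeping one registry per domain mirrors the locally controlled, decentralized design goals: no domain needs to hand its service list to a central authority.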
Multi-Domain Monitoring
• User: a role, i.e. a group of people making use of an MDM Service.
   – There may be several categories of users with different needs.

• E2E really means Edge to Edge, not End to End (unless end
  institutions buy into it).
   – Must go as close as possible to the end-institution – regional and
      metropolitan networks should also be involved.

• An NREN has two roles:
   – Data supplier.
   – Data user.
Multi-Domain Monitoring
• Multi-Domain Monitoring Service
   – Access to a set of monitoring functionalities (e.g. accessing metrics
     or performing tests) offered to a group of users, accessible directly
     through an XML SOAP interface (perfSONAR protocol) or through
     visualisation tools.
   – Based on an underlying set of perfSONAR web-services.

• perfSONAR web-service
   – Web service (providing data or allowing an action to be performed)
     using the NM-WG XML format. The perfSONAR web-services are the basic
     building blocks of an MDM service.
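
To make the "XML SOAP interface" concrete, the sketch below wraps an NM-WG-style message in a SOAP 1.1 envelope using only the Python standard library. It is illustrative only: the NM-WG namespace URI, the `SetupDataRequest` message type, the `utilization` event type string, and the metadata/data layout are assumptions here and should be checked against the NM-WG schema and the perfSONAR protocol documents before use.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope
NMWG_NS = "http://ggf.org/ns/nmwg/base/2.0/"            # assumed NM-WG base namespace


def build_request(message_type, event_type, message_id="msg1"):
    """Wrap an NM-WG-style message in a SOAP envelope (illustrative only)."""
    ET.register_namespace("soap", SOAP_NS)
    ET.register_namespace("nmwg", NMWG_NS)

    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    message = ET.SubElement(body, f"{{{NMWG_NS}}}message",
                            {"type": message_type, "id": message_id})
    # Metadata describes what is being asked for; the data element refers
    # back to the metadata block through metadataIdRef.
    metadata = ET.SubElement(message, f"{{{NMWG_NS}}}metadata", {"id": "meta1"})
    ET.SubElement(metadata, f"{{{NMWG_NS}}}eventType").text = event_type
    ET.SubElement(message, f"{{{NMWG_NS}}}data",
                  {"id": "data1", "metadataIdRef": "meta1"})
    return ET.tostring(envelope, encoding="unicode")


print(build_request("SetupDataRequest", "utilization"))
```

A visualisation tool and a script using direct XML access would both send a message of this general shape, which is what lets the measurement and visualisation layers evolve independently.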
Users Segmentation

User group and their        Advanced   Trouble-   Service   Project     Service     SLA            Added
monitoring data usage       Trouble-   shooting   Health    Trouble-    Health      Verification   Value
                            shooting              Check     shooting                               Function

PERT                        Yes        Yes        Yes       -           -           -              -
NOC                         -          Yes        Yes       -           -           Yes            -
Layer2 Project              -          -          -         [optional]  [optional]  Yes            Yes
Layer3 Project              -          -          -         [optional]  [optional]  Yes            Yes
PIP Project                 -          -          -         [optional]  [optional]  Yes            Yes
NREN non technical Staff    -          -          Yes       -           Yes         Yes            -
End-User                    -          -          -         -           -           -              [optional]
Network Researcher          -          -          -         -           -           -              -
MDM Service Support
• Infrastructure to support the perfSONAR web-services and the
  visualisation tools used by the MDM will be set up.
   – For the deployers: installation, configuration, incident, monitoring.
   – For users: installation, utilisation.

[Diagram: Service, User Service Desk, and Users (NOC, PERT, ...), with two links labelled SLA(*)]
(*) Don’t get scared here!
MDM Service Support
•   Level1 – Service Desk (ISS)
     – Help to install, configure the tools, run reachability tests, help on usability,
       track the RFE, forward problem to proper person, log the requests, update
       the documentation, track bugs. This is a central function (rotating member
       or group of people - ownership).
•   Level2 – Administrator (RENs or FMS)
     – Administrator of the machines where the services are installed. The
       function lies within the providers. They are in charge of taking care of the
       security of the services, of their availability (up) and reachability (no
       firewall, etc). The service should be available 24/7.
•   Level3 – Developers (3 years subcontract).
     – The JRA1 developers who have built the services. They are in charge of
       implementing new features, fixing bugs, and answering the queries
       forwarded by Level1.
•   The three levels of support will be available to both the users and the
MDM Service Support – Fully
 Managed Service (FMS)
• A turnkey service could be provided for the web-services of an
  MDM service or part of it.
   – HW bought.
   – Web-services installed, monitored and managed on behalf of the
   – Level2 Service Support provided.
   – REN would still have to do a little bit.
        • Physical installation.
        • GPS antenna coordination.
        • Provide the data.
        • Train its staff.
   – Save 40% of the installation effort and all the support time once
Going Operational
• Pre-roll Out – define and set-up support structure now – March 07.
• Pilot – April 07 – August 07 – 5 RENs + GÉANT2
   –   For NOC and PERT (no AA)
   –   Understand the issues of going operational.
   –   Validate the support structure, get feedback for next phase.
   –   Release in January, deployment training in February.
   –   Test the Fully Managed Service.
• Prototype – October 07 – February 08 – 11 RENs + GÉANT2
   –   For NOC, PERT and a limited number of projects.
   –   Verify the MDM SLA.
   –   Dedicated support team.
   –   Verify how to provide the service to external parties.
• Operation – April 08
   – More RENs, closer to end-institution.
   – More projects supported.
Pilot - Objectives
• Pilot
   – April 07 – August 07
   – 5 RENs + GÉANT2
   – For NOC and PERT (no AA)

• Objectives
   –   Understand the issues of going operational.
   –   Validate the support structure set-up.
   –   Get staff trained, raise awareness, provide feedback.
   –   Use a trustable platform.
   –   Get feedback for next phase.
 Pilot - Functionalities

Metric                              Time                        Location
L3 link utilisation, L3 link        Latest, historical          Backbone and access links.
Domain link L2 status.              Last 5min, historical (*)   Only for LHC circuits.
show commands                       On-demand                   All backbone
TCP/UDP throughput (***)            On-demand, historical       From three sources connected to an important network node.
OWD, jitter, OWPL and traceroute    Historical                  From the same sources as for the throughput metric.

(*) Historical data is currently not required by the Circuit E2E monitoring tool.
(**) Commands are taken from a pre-defined list, for all the backbone routers.
(***) UDP throughput tests will be restricted to PERT, to discover dropped packets when a problem is difficult to solve.
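
The OWD, jitter (IPDV) and OWPL metrics listed for the pilot can all be derived from per-packet send and receive timestamps. Below is a minimal sketch of those definitions, not the Hades implementation. It assumes sender and receiver clocks are synchronised (for example via GPS), takes IPDV as the delay difference between consecutive received packets, and uses a hypothetical function name and made-up sample data.

```python
def one_way_metrics(sent, received):
    """Compute OWD, IPDV and OWPL from per-packet timestamps.

    sent:     {sequence_number: send_time_in_seconds}
    received: {sequence_number: receive_time_in_seconds}; lost packets absent
    Assumes sender and receiver clocks are synchronised.
    """
    # One-way delay for every packet that arrived.
    delays = {seq: received[seq] - sent[seq] for seq in sent if seq in received}
    ordered = [delays[seq] for seq in sorted(delays)]
    # IPDV: difference between one-way delays of consecutive received packets.
    ipdv = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    # OWPL: fraction of sent packets that never arrived.
    loss = 1 - len(delays) / len(sent) if sent else 0.0
    return {"owd": ordered, "ipdv": ipdv, "owpl": loss}


# Made-up sample: four packets sent 100 ms apart, packet 3 lost.
sent = {1: 0.000, 2: 0.100, 3: 0.200, 4: 0.300}
received = {1: 0.020, 2: 0.125, 4: 0.318}
print(one_way_metrics(sent, received))
```

Because one-way delay depends on comparing timestamps from two different hosts, clock synchronisation error feeds directly into the OWD values, while IPDV is largely insensitive to a constant clock offset.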
  Pilot - Portfolio

Visualisations /  Metric                       Time        perfsonarUI  CNM  Looking glass  L2 visualisation  XML access
Web-service
                  L3 link utilisation          Historical  Yes          Yes  -              -                 Yes
                  L3 link capacity             Historical  Yes          Yes  -              -                 Yes
L2 status MP(*)   L2 circuit status            Latest      -            -    -              Yes               Yes
SQL MA (*)        L2 circuit status            Historical  -            -    -              Yes               Yes
Hades MA          OWD, IPDV, OWPL              Historical  Yes          Yes  -              -                 Yes
                  traceroute                   Historical  -            -    -              -                 Yes
                  Delay RTT                    On-demand   -            -    Yes            -                 Yes
Telnet/SSH MP     show command                 On-demand   -            -    Yes            -                 Yes
                  Traceroute                   On-demand   -            -    Yes            -                 Yes
                  Achievable throughput (TCP)  On-demand   Yes          -    -              -                 Yes
                  UDP throughput               On-demand   Yes          -    -              -                 Yes
Lookup Service    Service discovery                        Yes          Yes  -              -                 Yes

(*) L2 status MP or SQL MA
More Information
• http://www.perfsonar.net/
