Data Transfer Efficiency
- leave no byte unchurned
Jens Jensen
Rutherford Appleton Laboratory
GridPP26, U Sussex, March 2011
             Background

• GridPP’s data grid
  – Distributed Storage Elements
  – Data movers (FTS, PhEDEx et al)
  – Catalogues (usu. replica)
• e-Infrastructure (aka cyberinfrastructure)
• (Presentation at ISGC)
           The Data Grid

• WLCG is primarily a data grid
  – Computation can (in principle) be redone
• Jobs go to where data is
  – Moving a job is quicker than moving data
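
A rough back-of-envelope sketch of why jobs go to the data. All figures here (dataset size, job size, link speed) are illustrative assumptions, not numbers from the talk; the function name is likewise made up for this example.

```python
def transfer_seconds(size_bytes: float, link_bits_per_s: float,
                     efficiency: float = 1.0) -> float:
    """Seconds needed to move size_bytes over a link.

    efficiency is the fraction of nominal bandwidth actually achieved
    (an assumed knob for protocol overhead, contention, etc.).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return size_bytes * 8 / (link_bits_per_s * efficiency)


# Hypothetical figures, chosen only to show the orders of magnitude.
DATASET_BYTES = 1e12      # assumed 1 TB input dataset
JOB_BYTES = 10e6          # assumed 10 MB job (executable + config)
LINK_BITS_PER_S = 1e9     # assumed 1 Gbit/s WAN link

data_time = transfer_seconds(DATASET_BYTES, LINK_BITS_PER_S)
job_time = transfer_seconds(JOB_BYTES, LINK_BITS_PER_S)

print(f"move data: {data_time:.0f} s, move job: {job_time:.2f} s")
```

Under these assumptions the dataset takes hours and the job a fraction of a second; the real ratio depends on the site, the link and the workload.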
Premature Optimisation is the Root of All Evil
Postmature non-optimisation is the root of some evil
• The role of infrastructure code
  – Scientist as a programmer
  – “Bad” code moves up the stack?
  – “Bad” code improves over time?
• Doofers stay in prod’n
  Efficiencaciousness Goals

Service               People
• Availability        • (Effective) support
• Performance         • Training
• Grows as needed     • Expertise
• Robust (no SPoF?)   • Availability of…
             Approaches
• Philosophy
  – Get it done – WLCG
  – Get it done right – EGI?
  – Do It Perfectly The First Time…
• Evolutionary (control system) vs
  revolutionary
  – Proactive vs reactive
  Efficiencaciousness Issues

• Failures
  – Sites – BDII, network
  – Elements – storage
  – Components – disk servers
• Timeouts
• DDoS
  Efficiencaciousness Issues
• Overall effort
  – Funded, contributed, external
• Availability of expertise
  – Single Point of Knowledge
• Decoherence
• 2nd Law of Thermodynamics
• Learning from incidents
  Efficiencaciousness Issues
• Primary communication
  – Sites
  – Users: large VOs, small VOs, single users
  – PMB
• Secondary
  – WLCG
  – NGS
  Efficiencaciousness Issues
• Sites
  – There Is Always A Bottleneck Somewhere
  – Site dependent
  – Usage dependent
• Information
  – Freshness
  – Accuracy (“spped is substute fo accurcy”)
  Efficiencaciousness Issues

• Usage patterns
   – Cf. Wahid’s talk yesterday
   – WAN vs LAN (WN) traffic
• Technology
  – In the narrow sense (drives, controllers)
  – And the wider sense: dist’d filesystems
• Support: Upstream (EGI), Fabric
  Efficiencaciousness Issues
• Overheads
  – Complexity of use of stack (see next)
  – Infrastructure is complex
  – But Complexity Has To Go Somewhere
• Time-to-production
  – Testing, troubleshooting, monitoring,
    tweaking, tuning

Expt           • DDM et al
Data movers    • FTS
               • Catalogues
Data control   • SRM
               • SE GRIS
Transport      • WAN: GridFTP
               • LAN: RFIO, DCAP, …
Network        • Routers, switches, firewalls, OPN
Fabric         • HDD, SSD, tapes
               • Network cards, disk/RAID controllers

With apologies to the OSI stack
Particular Pain Point Principle

PROGRESS
      Progressing Forward

• What is progress
• How to measure progress
          The Good News

• We’ve come a long way
• Don’t think there is a skills gap
  – But some SPoKs
           Graeme’s talk

• “Get the best out of what we can afford
  to buy”
• Proactive sites better
• Standards are good
       E[GM]I involvement

• EMI data roadmap
  – Support for dCache, DPM, StoRM
  – Support for standards (NFS4, CDMI)
• But then
  – StoRM=INFN, dCache=DESY,
    DPM=CERN
         The Cloud View

• Supplement resources with on-demand
• Agile
• CDMI is superset of SRM
  – But using ReST+JSON, not SOAP
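
To make the ReST+JSON point concrete, here is a minimal sketch of what a CDMI-style "create data object" request could look like. It only builds the request and does not send it. The media type, version header and JSON field names are as recalled from the SNIA CDMI specification and should be checked against the version you target; the endpoint URL, version string and helper name are assumptions for illustration.

```python
import json

# Assumed specification version; check what your server actually supports.
CDMI_VERSION = "1.0.2"


def build_cdmi_put(base_url, path, value, mimetype="text/plain", metadata=None):
    """Build (method, url, headers, body) for a CDMI-style data object PUT.

    Hypothetical helper: nothing is sent over the network.
    """
    headers = {
        "Content-Type": "application/cdmi-object",
        "Accept": "application/cdmi-object",
        "X-CDMI-Specification-Version": CDMI_VERSION,
    }
    # The object is described by a plain JSON document rather than a SOAP envelope.
    body = json.dumps({
        "mimetype": mimetype,
        "metadata": metadata or {},
        "value": value,
    }).encode("utf-8")
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return "PUT", url, headers, body


# Hypothetical endpoint, for illustration only.
method, url, headers, body = build_cdmi_put(
    "https://cdmi.example.org/", "/container/hello.txt", "hello grid")
print(method, url)
print(body.decode("utf-8"))
```

The whole exchange is an ordinary HTTP verb plus a small JSON document, which any HTTP client can produce.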
        (Open) Standards

• Standards promote interoperation and
  stability
• Interoperation
• Multiple (independent) implementations
  – Both Java and (C or C++)
 The Case for Non-HEP Data

• Benefit from non-HEP data
  – Outreachy stuff
  – Benefit to society (eg saving lives)
• NGI interop (at compute)
• Others…
SUMMARY
  Efficiencaciousness Goals

Service               People
• Availability        • (Effective) support
• Performance         • Training
• Grows as needed     • Expertise
• Robust (no SPoF?)   • Availability of…

				