Berkeley RAD Lab
Technical Approach

Armando Fox, Randy Katz,
Michael Jordan, Dave Patterson,
Scott Shenker, Ion Stoica
October 2005
                             1
RAD Lab
The 5-Year Vision:
A single person can go from vision to a next-generation IT service ("the Fortune 1 million")
   - E.g., over a long holiday weekend in 1995, Pierre Omidyar created eBay v1.0

The Challenges:
   • Develop the new service
   • Assess: measure, test, and debug the new service in a realistic distributed environment
   • Deploy: scale up a new, geographically distributed service
   • Operate a service that could quickly scale to millions of users

The Vehicle:
An interdisciplinary center creates the core technical competency to demonstrate 10X to 100X improvements
   • Researchers are leaders in machine learning, networking, and systems
   • Industrial participants: leading companies in HW, systems SW, and online services
   • Called "RAD Lab" for Reliable, Adaptive, Distributed systems
                                                                              2
RAD Lab
[Figure: a pedestal labeled Cap, Dado, and Base. A dado is the section of a pedestal between cap and base.]

The Science:
Both shorter-term and longer-term solutions
   Develop using primitives  functions (MapReduce), services (Craigslist)
   Assess/debug using deterministic replay and finding new metrics
   Deploy using “Internet-in-a-Box” via FPGAs under failure/slowdown workloads
   Operate using Statistical Learning Theory-friendly, Control Theory-friendly
    software architectures and visualization tools


Added Value to Industrial Participants:
   Working with leading people and companies from different industries on
    long-range, pre-competitive technology
   Training of dozens of future leaders of IT in multiple disciplines,
    and their recruitment by industrial participants
   Working with researchers with successful track record of rapid transfer of
    new technology

                                                                                 3
Steps vs. Process
   • Steps: traditional, static handoff model, N groups
       Develop → Assess → Deploy → Operate (one-way sequence of handoffs)
   • Process: support DADO evolution, 1 group
       Develop → Assess → Deploy → Operate → Develop … (continuous cycle)
                                                          4
DADO - Develop
   • Create abstractions, primitives, & a toolkit for large-scale systems that make it easy to invent/deploy functions like MapReduce (a word-count sketch follows this slide)
       - For example, Distributed Hash Tables (OpenDHT), rendezvous-based communication (Internet Indirection Infrastructure), weak-semantics tuple spaces
       - Already setting the trend for IETF standards
   [Figure: system stack - Application / Higher Functions (MapReduce) / Middleware (J2EE) / Libraries / Compilers/Debuggers / Operating System / Virtual Machine / Hardware]
                                                                           5
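To make the "higher functions" layer concrete, here is a minimal, single-process sketch of the MapReduce programming model. This is our own illustration, not RAD Lab code; names such as map_phase and reduce_phase are hypothetical.

    # Minimal MapReduce-style word count, run in a single process.
    from collections import defaultdict

    def map_phase(documents, map_fn):
        # Apply the user's map function to every record, yielding (key, value) pairs.
        pairs = []
        for doc in documents:
            pairs.extend(map_fn(doc))
        return pairs

    def reduce_phase(pairs, reduce_fn):
        # Group intermediate values by key, then apply the user's reduce function.
        groups = defaultdict(list)
        for key, value in pairs:
            groups[key].append(value)
        return {key: reduce_fn(key, values) for key, values in groups.items()}

    def word_count_map(doc):
        return [(word, 1) for word in doc.split()]

    def word_count_reduce(word, counts):
        return sum(counts)

    docs = ["the rad lab vision", "the fortune 1 million"]
    print(reduce_phase(map_phase(docs, word_count_map), word_count_reduce))
    # {'the': 2, 'rad': 1, 'lab': 1, 'vision': 1, 'fortune': 1, '1': 1, 'million': 1}

In a real toolkit the map and reduce phases would be distributed across many nodes; the point here is only the shape of the programming interface the Develop toolkit should make easy.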
DADO - Develop
   • The opportunity of middleware: middleware is becoming the dominant way to deploy commercial networked applications
       - Innovate below the abstraction
       - Unmodified/proprietary apps deployed on improved middleware
       - We put instrumentation and recovery support in middleware
           · Pinpoint (diagnosis) and Microreboot (fast recovery) added to a J2EE server
       - Good news: middleware imposes design constraints on applications that help recovery, e.g., separation of state from app logic (see the sketch after this slide)

                                                                           6
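As a rough illustration of why separating state from application logic helps microreboot-style recovery, here is a hedged Python sketch. It is our own illustration, not the actual Pinpoint/Microreboot code; the StateStore and CartComponent names are hypothetical.

    # Because session state lives in a separate store rather than inside the
    # component, the component can be "microrebooted" (discarded and rebuilt)
    # without losing user state.
    class StateStore:
        def __init__(self):
            self._sessions = {}           # session state kept outside the component

        def get(self, session_id):
            return self._sessions.setdefault(session_id, {"items": []})

    class CartComponent:
        def __init__(self, store):
            self.store = store            # the component holds no durable state of its own

        def add_item(self, session_id, item):
            self.store.get(session_id)["items"].append(item)

    def microreboot(component_cls, store):
        # Fast recovery: throw the (possibly wedged) component away and rebuild it.
        return component_cls(store)

    store = StateStore()
    cart = CartComponent(store)
    cart.add_item("s1", "book")
    cart = microreboot(CartComponent, store)   # component restarted in isolation
    print(store.get("s1")["items"])            # ['book'] -- state survived the reboot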
DADO - Develop
   • First test: how easy is it to build MapReduce?
   • Build simple versions of current-generation Internet apps and scale them up
       - Auctions, Craigslist, Email, Sales, Free DB, …

   • Test the RAD vision and system in new courses and evolve the system based on feedback
       - 2007, 2008, 2009: students from CS, SIMS, MBA programs
       - Operate good services from classes afterwards at partners' sites?

   • A future Sergey Brin, Larry Page, Eric Brewer, or Pierre Omidyar in one of these classes?
                                                                           7
DADO - Assess
   • "We improve what we can measure"
       - Inspection boxes give visibility into networks, which are usually data-poor
       - Servers are data-rich, but the data is often discarded

   • Statistical and Machine Learning to the rescue; it works well when
       - You have lots of raw data
       - You have reason to believe the raw data is related to some high-level effect you're interested in
       - You don't have a model of what that relationship is

   • Note: SLT advances → fast analysis
                                                                      8
DADO - Assess
   • Example: Statistical Debugging (a scoring sketch follows this slide)
       - Instrument programs to add predicates (assertions) via the compiler
       - Sparsely sample (~1%), recording predicate true/false outcomes and crashes
       - Collect the information over the Internet
       - Learn a statistical classifier based on successful and failed runs, using predicate (feature) selection + clustering methods to pinpoint the bugs

   • Found bugs in several open-source programs (moss, ccrypt, bc, rhythmbox, exif)
To learn more, see "Scalable Statistical Bug Isolation," B. Liblit, M. Naik, A. X. Zheng, A. Aiken, and M. I. Jordan, PLDI, 2005.
                                                                         9
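A minimal sketch of the scoring idea behind statistical debugging, paraphrasing the Liblit et al. approach with hypothetical counts: for each predicate P, compare how predictive of failure "P is true" is against the baseline failure rate of runs that merely reach P.

    # Sketch of the Increase(P) score used to rank predicates (illustrative only).
    # For each predicate P we know, across sampled runs:
    #   observed_fail / observed_succ: runs where P's site was reached
    #   true_fail / true_succ:         runs where P was actually true
    def increase_score(observed_fail, observed_succ, true_fail, true_succ):
        context = observed_fail / (observed_fail + observed_succ)   # Context(P)
        failure = true_fail / (true_fail + true_succ)               # Fail(P)
        return failure - context                                    # Increase(P)

    # Hypothetical counts for two predicates gathered from crash reports:
    predicates = {
        "files[filesindex].language > 16": (200, 800, 90, 10),
        "token_index > 500":               (200, 800, 25, 75),
    }
    ranked = sorted(predicates.items(),
                    key=lambda kv: increase_score(*kv[1]), reverse=True)
    for name, counts in ranked:
        print(f"{increase_score(*counts):+.2f}  {name}")
    # The predicate whose truth sharply raises the failure probability ranks first.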
DADO - Assess (MOSS results)
[The "Bug Thermometer" column of the original table was graphical and is omitted here; columns #1-#9 give counts of failing runs for each of the 9 reinserted bugs.]
                Predicate                                            Line of Code    #1  #2 #3 #4    #5  #6 #7 #8 #9
                files[filesindex].language > 16                     5869         0   0  28  54 1585 0   0  0  68
                ((*(fi + i)))->this.last_line == 1                  5442        774  0  17   0   0   0 18 0    2
                token_index > 500                                   4325         31  0  16 711   0   0  0  0  47
                (p + passage_index___0)->last_token <= filesbase    5289         28  2 508 0     0   0  1  0  29
                __result___430 == 0 is TRUE                         5789         16  0   0   9  19 291 0   0  13
                config.match_comment is TRUE                        1994        791  2  23   1   0   5 11 0   41
                i___0 == yy_last_accepting_state                    5300         55  0  21   0   0   3  7  0 769
                f <f                                                4497         3  144 2    2   0   0  0  0   5
                files[fileid].size < token_index                    4850         31  0  10 633   0   0  0  0  40
                passage_index___0 == 293                            5313         27  3   8   0   0   0  2  0 366
                ((*(fi + i)))->other.last_line == yyleng            5444        776  0  16   0   0   0 18 0    1
                min_index == 64                                     5302         24  1   7   0   0   1  1  0 249
                ((*(fi + i)))->this.last_line == yy_start           5442        771  0  18   0   0   0 19 0    0
                (passages + i)->fileid == 52                        4576         24  0 477 14   24   0  1  0  14
                passage_index___0 == 25                             5313         60  5  27   0   0   4 10 0 962
                strcmp > 0                                          4389         0   0  28  54 1584 0   0  0  68
                i > 500                                             4865         32  2  18 853 54    0  0  0  53
                token_sequence[token_index].val >= 100              4322       1250 3   28  38   0  15 19 0   65
                i == 50                                             5252         27  0  11   0   0   1  4  0 463
                passage_index___0 == 19                             5313         59  5  28   0   0   4 10 0 958
                bytes <= filesbase                                  4481         1   0  19   0   0   0  0  0   1




 • Reinserted 9 known bugs into the MOSS program to see if the technique can find them
 • Statistical debugging points out 7 of the 9 (1 bug never occurred in the sampled runs)
                                                                                                           10
DADO - Assess
   • Distributed debugging is very hard
       - Services are required to be up 24x7
       - Even rare failures are catastrophic
           · Very hard to reproduce in the lab

   • Provide continuous logging and checkpointing
       - Reproducible behavior via replay (see the sketch after this slide)
           · Leverage RAMP and Iboxes for deterministic replay (interrupt at clock cycle 100M …) and "what-if" scenario analysis

   • Create an "open source" failure and slowdown repository (e.g., Ebates, Windows Minidump) with sanitized information (+ tools to sanitize) and workloads, so other researchers can help
                                                                                11
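A hedged sketch of the record/replay idea behind deterministic replay; this is our illustration, not the RAMP/Ibox mechanism itself. Log every nondeterministic input during the recording run, then feed the same log back during replay so execution can be stopped at exactly the same point.

    # Record/replay sketch: the only nondeterminism here is a random "request",
    # so logging it is enough to make the run reproducible.
    import random

    def handle(request):
        return request * 2                      # deterministic service logic

    def record_run(num_events, log):
        results = []
        for _ in range(num_events):
            request = random.randint(0, 99)     # nondeterministic input ...
            log.append(request)                 # ... captured in the log
            results.append(handle(request))
        return results

    def replay_run(log, stop_at=None):
        results = []
        for step, request in enumerate(log):    # same inputs, same order
            if stop_at is not None and step == stop_at:
                break                           # "interrupt at event N" for debugging
            results.append(handle(request))
        return results

    log = []
    original = record_run(5, log)
    assert replay_run(log) == original          # deterministic replay reproduces the run
    print(replay_run(log, stop_at=3))           # stop mid-run for "what-if" inspection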
DADO - Deploy
   • How can academics experiment with systems of 1000+ nodes?
   • RAMP (Research Accelerator for Multiple Processors) for parallel HW & SW research
       - A single FPGA holds ~25 CPUs + caches in 2005
       - ~$100k = ~4 FPGAs/board, ~4 DIMMs/FPGA, 10-20 boards + a low-cost storage server over Ethernet
         → ~1000 CPUs, 256 MB DRAM/CPU, 20 GB disk storage/CPU (see the arithmetic check after this slide)
       - Pros: free "IP" (opencores.org), large scale, low purchase cost, low operation cost, easy to change, easy to trace, reproducible behavior, real ISA and OS, grows with Moore's Law (2X CPUs, clock / 1.5 yrs)
       - Cons: slow clock rate (100-200 MHz vs. 2-4 GHz)
                                                                           12
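A quick check of the scaling arithmetic behind the ~1000-CPU figure, using only the counts quoted on the slide:

    # Back-of-the-envelope check of the RAMP configuration quoted on the slide.
    cpus_per_fpga = 25        # slide: one FPGA holds ~25 CPUs in 2005
    fpgas_per_board = 4       # slide: ~4 FPGAs per board
    boards = 10               # slide: 10-20 boards; take the low end
    dram_per_cpu_mb = 256     # slide: 256 MB DRAM per CPU

    total_cpus = cpus_per_fpga * fpgas_per_board * boards
    total_dram_gb = total_cpus * dram_per_cpu_mb / 1024

    print(total_cpus)         # 1000 CPUs
    print(total_dram_gb)      # 250.0 GB of aggregate DRAM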
Why Is RAMP Attractive for Research?
Priorities for a research parallel computer:
  1a. Cost of purchase
  1b. Cost of ownership (staff to administer it)
  1c. Scalability/Reality (1000 nodes, "real" SW)
  4. Observability (measure, trace everything)
  5. Reproducibility (to debug, run experiments)
  6. Flexibility (change for different experiments)
  7. Credibility (results are believable for tech. transfer)
  8. Performance
Note: for a commercial parallel computer, Performance is priority #1
                                                               13
Why Is RAMP Attractive for Research?
SMP, Cluster, Simulator vs. RAMP
                      SMP         Cluster      Simulator    RAMP
Cost (1 CPU)          F ($40k)    B ($2k)      A+ ($0k)     A ($0.1k)
Cost of ownership     A           D            A            A
Scalability (1000)    C           A            A            A
Observability         D           C            A+           A+
Reproducibility       B           D            A+           A+
Community             D           A            A            A
Flexibility           D           C            A+           A+
Credibility           A+          A+           F            A
Perform. (clock)      A (2 GHz)   A (3 GHz)    F (0 GHz)    C (0.2 GHz)
GPA                   C           B-           B            A-
                                                                   14
DADO - Deploy
   • Re-engineer RAMP to act like a 1000+ node distributed system under realistic failure and slowdown workloads
       - The same HW emulates data centers as well as wide-area systems
       - Embed the Emulab and ModelNet emulation testbeds
       - Have synthetic time, checkpoint/restart, clock-cycle-accurate reproducibility, dedicated use of a large system, the ability to trace anything, …
       - Researchers should be able to develop in an environment similar to the one that led to innovations like MapReduce

   • Failure data collection from PlanetLab et al. => failure and slowdown workloads
                                                                        15
 DADO - Operate
• Idea: when a site misbehaves, users notice and change their behavior; use this as a "failure detector"
• Approach: combine visualization with Statistical Learning Theory (SLT, aka machine learning) analysis so operators see anomalies too
• Experiment: does the distribution of hits to various pages match the "historical" distribution? (See the sketch after this slide.)
    - Each minute, compare the hit counts of the top N pages to the hit counts over the last 6 hours using Bayesian networks and a χ² test, on real Ebates data
  To learn more, see "Combining Visualization and Statistical Analysis to Improve Operator Confidence and Efficiency for Failure Detection and Localization," In Proc. 2nd IEEE Int'l Conf. on Autonomic Computing, June 2005, by Peter Bodik, Greg Friedman, Lukas Biewald, Helen Levine (Ebates.com), George Candea, Kayur Patel, Gilman Tolle, Jon Hui, Armando Fox, Michael I. Jordan, David Patterson.
                                                                                                               16
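A minimal sketch of the page-hit comparison, using made-up counts; the paper's actual pipeline also uses Bayesian networks. Compare the current minute's hit counts for the top pages against the historical mix with a χ² goodness-of-fit test and flag a low p-value as an anomaly.

    # Chi-square check of current page-hit counts against the historical mix.
    from scipy.stats import chisquare

    historical = {"home": 6000, "search": 3000, "checkout": 1000}   # last 6 hours
    current    = {"home":   55, "search":   30, "checkout":   65}   # this minute

    pages = list(historical)
    observed = [current[p] for p in pages]
    total_hist = sum(historical.values())
    total_obs = sum(observed)
    # Scale the historical mix to the same total so it can serve as expected counts.
    expected = [historical[p] / total_hist * total_obs for p in pages]

    stat, p_value = chisquare(f_obs=observed, f_exp=expected)
    if p_value < 0.01:
        print(f"anomaly: user behavior shifted (chi2={stat:.1f}, p={p_value:.2g})")
    else:
        print("page-hit distribution looks like the historical one")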
[Figure: heat map of hit counts for the top 40 pages over time (5-minute intervals), real Ebates data.]
Visualization of user behavior is completely different from the usual approach, which animates the system architecture.
Seeing is believing: win trust in SLT by leveraging operator expertise and human visual pattern recognition.
                                                17
DADO - Operate
   • Maintaining Quality of Service in the presence of DDoS attacks, flash crowds, ... is critical
   • Key observation: many network service failures are attributed to unexpected traffic patterns
   • Key approach: identify and protect "good" traffic, discard "bad" traffic (see the sketch after this slide)
   • Create "Inspection-and-Action" Boxes (Iboxes)
     - Deep multiprotocol packet inspection
     - Exploit SLT to discover a model of "normal" traffic + anomaly detection
     - Mark and annotate packets to add info / prioritize and throttle
     - Evolve the network architecture, e.g., to include an annotation layer
                                                                         18
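A toy sketch of the identify-and-protect idea; this is our illustration and has no relation to the actual Ibox design. Score each traffic source against a model of normal rates learned from history, then annotate and prioritize or throttle it accordingly.

    # Toy "Inspection-and-Action" sketch: learn what normal per-source request
    # rates look like, then mark traffic from sources that deviate sharply.
    from statistics import mean, stdev

    def learn_normal_model(history_rates):
        # history_rates: observed requests/sec values under normal load
        return mean(history_rates), stdev(history_rates)

    def classify(rate, model, k=3.0):
        mu, sigma = model
        return "throttle" if rate > mu + k * sigma else "protect"

    model = learn_normal_model([12, 15, 11, 14, 13, 16, 12, 15])
    flows = {"10.0.0.1": 14, "10.0.0.2": 250, "10.0.0.3": 13}

    for src, rate in flows.items():
        action = classify(rate, model)
        # An Ibox would annotate the packets with this decision and act on it.
        print(f"{src}: {rate} req/s -> {action}")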
RAD Lab Opportunity:
New Research Model
   • Chance to partner with the top university in computer systems on the "Next Great Thing"
       - The National Academy of Engineering mentions Berkeley in 7 of 19 $1B+ industries that came from IT research
           · NAE mentions Berkeley 7 times, Stanford 5 times, MIT 5, CMU 3
           · Timesharing (SDS 940), Client-Server Computing (BSD Unix), Graphics, Entertainment, Internet, LANs, Workstations, GUI, VLSI Design (Spice) [ECAD $5B?/yr], RISC [$10B?/yr], Relational DB (Ingres/Postgres) [RDB $15B?/yr], Parallel DB, Data Mining, Parallel Computing, RAID [$15B?/yr], Portable Communication (BWRC), WWW, Speech Recognition, Broadband
       - Berkeley is one of the top suppliers of systems students to industry and academia
       - US News & World Report ranking of CS Systems universities: 1 Berkeley, 2 CMU, 2 MIT, 4 Stanford, 5 Washington
       - For example: Quanta (Taiwan PC laptop clone manufacturer) funds MIT CSAIL at $4M/year for 5 years to reinvent the PC, April 2005 ("Tparty")
   • The RAID project (4 faculty, 20 grads, 10 undergrads) helped create a $15B industry, but would not be fundable today at DARPA or NSF
                                                                              19
RAID Alumni 10 Years Later
   • Industry managers: AT&T, HP, IBM, Microsoft, Sun, …
   • Founders of startups: Electric Cloud, Panasas, VMware, …
   • Professors: CMU, Stanford, Michigan, Arizona, UCSC
                                                                20
Founding the RAD Lab; Start 12/1
   • $2.5M / yr (1/2 of BWRC): 70% industry, 20% state, 10% federal gov't
       - 25 grad students + 15 undergrads + 6 faculty + 2 staff
   • Looking for 3 to 4 founding companies to fund ≈ 3-5 years @ a cost of $0.5M / year
       - Follow SNRC (Stanford Network Research Center), BWRC (Berkeley Wireless Research Center)

   • Feedback on forming the consortium?
     - Prefer founding partners' technology in prototypes
     - Designate employees to act as consultants
     - Head start for participants on research results
     - Put IP in the public domain so partners are not sued
   • Press release of founding RAD Lab partners December 1
   • Mid-project review after 3 years by founding partners
                                                                                      21
Critical Mass vs. Spreading $ Thinly
   • It seems safer to spend $50k at 10 universities vs. $500k at 1 university
   • But you still get diversity and a portfolio effect across fields with critical mass
     - N faculty and grad students in N areas
     - Improved student-to-student and student-to-prof training across fields

   • But critical mass on a coherent systems project has a much greater chance of technical success (in our experience)
       - E.g., BSD Unix, Ingres, Postgres, RISC, RAID, NOW all had critical mass
   • But there is less management overhead in the critical-mass model
       - Less industry time for interaction/travel, less $ for faculty support
     - More students supported with less hassle in the critical-mass model
   • But participants get much more influence on directions
     - "$50k is like a cousin; $500k is like a spouse"
     - Preference to the partner's technology

                                                                                    22
RAD Lab Model
                                          Foundation Member    Affiliate Member
 Preference to partner's technology             √
 Partner employees advise design teams          √
 Attend two 3-day reviews                       √                    √
 6-month delay on review presentations          √                    √
 Annual contribution                          ≥$500k               ≥$50k
                                            (5 students)         (1/2 student)
 (Contribution counts towards CITRIS donation)
 (Contribution → matching state $ via MICRO, UC Discovery)
                                                       23
   RAD Lab: Interdisciplinary Center for
   Reliable, Adaptive, Distributed Systems
     Capability (desired): 1 person can invent & run the next-gen IT service
     Develop using primitives to enable functions (MapReduce), services (Craigslist)
     Assess using deterministic replay and statistical debugging
     Deploy via "Internet-in-a-Box" FPGAs
     Operate SLT-friendly, Control Theory-friendly architectures and operator-centric visualization and analysis tools
     Base Technology: server hardware, system software, middleware, networking

• Working with different industries on long-range, pre-competitive technology
• Training of dozens of future leaders of IT, plus their recruitment
• Working with researchers with track records of successful technology transfer
                                                                                24
Backup Slides




                25
  References
To learn more, see

  •    Peter Bodik, Greg Friedman, Lukas Biewald, Helen Levine (Ebates.com), George Candea, Kayur Patel, Gilman Tolle, Jon Hui, Armando Fox, Michael I. Jordan, and David Patterson, "Combining Visualization and Statistical Analysis to Improve Operator Confidence and Efficiency for Failure Detection and Localization," Proc. 2nd IEEE Int'l Conf. on Autonomic Computing, June 2005.
  •    George Candea, Shinichi Kawamoto, Yuichi Fujiki, Greg Friedman, and Armando Fox, "Microreboot -- A Technique for Cheap Recovery," Proc. 6th Symp. on Operating Systems Design and Implementation (OSDI), San Francisco, CA, Dec. 2004.
  •    Mike Y. Chen, Anthony Accardi, Emre Kiciman, Jim Lloyd, Dave Patterson, Armando Fox, and Eric Brewer, "Path-Based Failure and Evolution Management," Proc. 1st USENIX/ACM Symp. on Networked Systems Design and Implementation (NSDI '04), San Francisco, CA, March 2004.
  •    Ben Liblit, Mayur Naik, Alice X. Zheng, Alex Aiken, and Michael I. Jordan, "Scalable Statistical Bug Isolation," PLDI, 2005.

                                                                                                  26
Sustaining the Innovation/Training
Engine in the 21st Century
   • Replicate research centers based primarily on industrial funding to expand the IT market and to train the next generation of IT leaders
       - Berkeley Wireless Research Center (BWRC): 50 grad students, 30 undergrads @ $5M per year
       - Stanford Network Research Center (SNRC): 50 grad students @ $5M per year
       - MIT Tparty: $4M per year (100% of $ from Quanta)
       - Industry largely funds it
           · N companies, where N is 5?
       - Exciting, long-term technical vision
           · Demonstrated by prototype(s)
                                                          27
State of Research Funding Today
   • Most industry research is shorter term
   • DARPA is exiting long-term (exp.) IT research
       - '03-'05 IPTO BAAs: 9 AI, 2 classified, 1 SW radio, 1 sensor net, 1 reliability; all have 12 to 18 month "go/no go" milestones
       - Academic-led funding reduced 50% (so far) from 2001 to 2004
       - Faculty ≈ consultants in consortia led by defense contractors; grants ≈ support for 1-2 students (~ NSF funding level)
   • NSF is swamped with proposals, conservative
       - 2000 to 6500 proposals in 5 years
           · IT has the lowest acceptance rate at NSF (between 8% and 16%)
       - "Ambitious proposal" is a negative review
       - Even with NSF funding, the proposal is reduced to stretch NSF $, e.g., got 3 x 1/3 faculty, 6 grad students, 0 staff, 3 years
   • (To learn more, see www.cra.org/research)
                                                                          28
RAD Lab Timeline
   • 2005: Launch RAD Lab
   • 2006: Collect workloads, Internet-in-a-Box
   • 2007: SLT/CT distributed architectures, Iboxes, annotation layer, class testing
   • 2008: Development toolkit 1.0, tuple space, class testing; mid-project review
   • 2009: RAD Lab software suite 1.0, class testing
   • 2010: End-of-project party
                                                  29
   Guide to Visualization
      • Multiple interesting & useful predicate metrics
      • Graphical representation helps reveal trends
           - How much P being true increases the probability of failure
[Figure: for each predicate P, a bar plotted against log(number of runs in which P was observed); the bar is divided into Context(P) ("fails whether or not P is true"), Increase(P) with its error bound, and S(P) ("succeeds despite P being true"). The metric definitions follow this slide.]
                                                                 30
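For reference, the predicate metrics shown in the bar are defined in the Liblit et al. PLDI 2005 paper (cited on the References slide) roughly as follows, in our paraphrase; F(P) and S(P) count failing and successful runs in which P was true, and "P observed" means runs in which P's site was reached at all:

\[
\mathrm{Fail}(P) = \frac{F(P)}{S(P) + F(P)}, \qquad
\mathrm{Context}(P) = \frac{F(P\ \mathrm{observed})}{S(P\ \mathrm{observed}) + F(P\ \mathrm{observed})}, \qquad
\mathrm{Increase}(P) = \mathrm{Fail}(P) - \mathrm{Context}(P)
\]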

				