Validating the Intel® Pentium® 4 Microprocessor

Bob Bentley
Intel Corporation
June 20, 2001
Pentium® 4 Development Timeline
 Structural RTL coding start: 2H’96
    First cluster models released: late ’96
    First full-chip model released: 1Q’97

 Structural RTL coding complete: Q2’98
     “All bugs coded for the first time!”
 Structural RTL under full ECO control: Q2’99
 Structural RTL frozen: Q3’99
 A-0 tapeout: December ’99
 First packaged parts available: January 2000
 First samples shipped to customers: Q1’00
 Production ship qualification granted: October 2000
Pentium® 4 Validation Staffing
Pre-silicon validation environment
 SRTL validation is MUCH slower than real silicon
     Typical full-chip simulation with checkers ran at 3-5 Hz on a Pentium III machine
     We used a compute farm containing thousands of machines running 24/7 to get ~6 billion cycles/week
     ALL the SRTL simulation cycles we recorded amounted to less than 2 minutes on a single 1 GHz system (see the arithmetic sketch below)
 But pre-silicon validation has some advantages
     Fine-grained (cycle-by-cycle) checking
    Visibility of internal state (e.g. caches)
    APIs to allow event injection

 No amount of dynamic validation is enough to
  exhaustively test a complex microprocessor
     A single dyadic extended-precision FP instruction has O(10^50) combinations
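
To put those numbers in perspective, the arithmetic sketch below works through both claims in plain Python. The 4 Hz simulation rate is an assumed value inside the 3-5 Hz range quoted above, and the 120-billion-cycle total is simply what "less than 2 minutes at 1 GHz" implies; neither is an official project figure.

    # Arithmetic sketch -- illustrative only.  The 4 Hz simulation rate is an
    # assumption within the 3-5 Hz range quoted above, not an exact project figure.

    SIM_RATE_HZ      = 4                 # assumed full-chip simulation speed
    SECONDS_PER_WEEK = 7 * 24 * 3600

    # Machines needed to reach ~6 billion simulated cycles per week
    cycles_per_week = 6e9
    machines = cycles_per_week / (SIM_RATE_HZ * SECONDS_PER_WEEK)
    print(f"machines for 6B cycles/week at {SIM_RATE_HZ} Hz: ~{machines:,.0f}")
    # -> roughly 2,500 machines, i.e. "thousands of machines running 24/7"

    # The same week of farm output, replayed at silicon speed
    print(f"6B cycles at 1 GHz: {cycles_per_week / 1e9:.0f} seconds")
    # "less than 2 minutes at 1 GHz" bounds the project's total recorded cycles:
    print(f"2 minutes at 1 GHz = {120 * 1e9:.2e} cycles")

    # Input space of one dyadic extended-precision FP instruction:
    # two 80-bit operands alone give 2**160 combinations
    print(f"2**160 = {2**160:.2e}")
    # -> ~1.5e48; rounding modes and precision control push the figure
    #    toward the O(10^50) quoted above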
Pentium® 4 Formal Verification
 First large-scale effort (~60 person years) at Intel to
  apply formal verification techniques to CPU design
     Applying FV to a moving target is a big challenge!
 Mostly model checking, with some recent work using
  theorem proving to connect FP proofs to IEEE 754
 More than 10,000 proofs in key areas:
    FP Execution units
    Instruction decode
    Out-of-order control mechanisms

 Found ~20 “high quality” bugs that would have been
  hard to detect by dynamic testing
 No silicon bugs found to date in areas proved by FV
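
For a sense of what "connecting FP proofs to IEEE 754" obligates, the correctness statement is that each operation return the exact mathematical result, correctly rounded. The formal proofs establish this for all operand values; the sketch below (a Python illustration using exact rational arithmetic, not the project's actual proof tooling) can only spot-check it for double-precision addition.

    # Spot-check of the IEEE 754 correctness statement for addition: the
    # computed sum must be the representable double nearest the exact sum.
    # (Exact ties, which IEEE 754 breaks toward even, are not distinguished
    # by this simple nearest-value check.)  A proof covers ALL inputs;
    # this script merely samples a few -- illustration only.

    import math
    import random
    from fractions import Fraction

    def is_correctly_rounded(a: float, b: float) -> bool:
        exact  = Fraction(a) + Fraction(b)       # exact rational sum
        result = a + b                           # what the FPU returns
        err = abs(Fraction(result) - exact)
        # No adjacent double may be strictly closer to the exact sum.
        for neighbor in (math.nextafter(result, math.inf),
                         math.nextafter(result, -math.inf)):
            if abs(Fraction(neighbor) - exact) < err:
                return False
        return True

    random.seed(0)
    for _ in range(10_000):
        x = random.uniform(-1e300, 1e300)
        y = random.uniform(-1e300, 1e300)
        assert is_correctly_rounded(x, y)
    print("all sampled additions were correctly rounded")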
Cluster Test Environments
 Cluster test environments were developed for each of
  the Pentium 4 hardware clusters plus microcode
 A BIG win overall, even though it took a lot of work to
  develop and maintain them
     Almost 60% of all the bugs found by dynamic testing were caught at CTE level
     Moved bug detection upstream – earlier detection is less costly and less disruptive
     Decoupled validation of different parts of the chip – bugs in one area didn’t block progress elsewhere
     Provided much better controllability than the full-chip model, especially for “downstream” microarchitecture pipeline stages
 CTEs helped to maintain a healthy full-chip model
Power Reduction Validation
 Power consumption was a big concern for Pentium 4
     Need to stay within the cost-effective thermal envelope for desktop systems at 1.5+ GHz
 Extensive clock gating in every part of the design
 Mounted a focused effort to validate that:
     Committed features were implemented as per plan
     Functional correctness was maintained in the face of clock gating (see the sketch below)
     Changes to the design did not impact power savings

 ~12 person years of effort, 5 heads at peak
 Fully functional on A-step silicon, measured savings
  of ~20W achieved for typical workloads
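
A common way to check the functional-correctness goal is to run identical stimulus with clock gating enabled and disabled and require identical architectural results. The toy below illustrates only that idea; the model, the instruction format, and the toggle-count power proxy are all invented for this sketch, not taken from the project.

    # Toy gated-vs-ungated comparison -- every name here is invented.
    # Clock gating may change activity (a crude power proxy) but must
    # never change architecturally visible results.

    class ToyCore:
        def __init__(self, clock_gating: bool):
            self.clock_gating = clock_gating
            self.toggles = 0                 # stand-in for switching power
            self.regs = list(range(8))

        def run(self, program):
            for op, dst, a, b in program:
                if op == "add":
                    self.regs[dst] = (self.regs[a] + self.regs[b]) & 0xFFFFFFFF
                    self.toggles += 2
                elif op == "nop":
                    # an ungated design still clocks the idle adder
                    self.toggles += 0 if self.clock_gating else 2

        def architectural_state(self):
            return tuple(self.regs)

    def check_gating_equivalence(tests):
        for program in tests:
            gated, ungated = ToyCore(True), ToyCore(False)
            gated.run(program)
            ungated.run(program)
            assert gated.architectural_state() == ungated.architectural_state()
            assert gated.toggles <= ungated.toggles   # gating should not add activity
        print("gated and ungated runs match architecturally")

    check_gating_equivalence([
        [("add", 0, 1, 2), ("nop", 0, 0, 0), ("add", 3, 0, 0)],
    ])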
Coverage
 Testing without coverage feedback is like driving a
  car with a blindfold on
    You may think you know where you are going, but
     where you end up is not where you meant to go
 We made extensive use of directed random test
  generators, with coverage feedback to tell us what
  we were, and were not, hitting
     Intuition is a poor guide, especially for a complex microarchitecture like the Pentium 4 microprocessor
 The purpose of coverage is not necessarily to hit
  100% of the identified conditions
     Rather, it is a way of directing future testing to the places it is most needed
    It helps to avoid the trap of “spinning your wheels” and
     testing the same areas over and over again
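
A minimal sketch of the coverage-feedback loop these bullets describe: generate directed-random tests, measure what they hit, and re-bias generation toward the conditions still unhit. The coverage points, generator, and biasing scheme below are invented stand-ins, not the project's actual tools.

    # Coverage-directed random testing -- invented stand-ins throughout.
    import random

    COVERAGE_POINTS = ["load_hit", "load_miss", "store_forward", "replay"]

    def generate_test(bias):
        """Directed-random: pick a target scenario with weight-proportional probability."""
        total = sum(bias.values())
        r = random.uniform(0, total)
        for point, weight in bias.items():
            r -= weight
            if r <= 0:
                return point
        return point                      # guard against rounding

    def run_and_measure(test):
        """Stand-in for running the test and collecting coverage."""
        hit = {test}
        if random.random() < 0.3:         # tests sometimes hit extra events
            hit.add(random.choice(COVERAGE_POINTS))
        return hit

    random.seed(1)
    bias = {p: 1.0 for p in COVERAGE_POINTS}
    covered = set()
    for _ in range(200):
        covered |= run_and_measure(generate_test(bias))
        for point in COVERAGE_POINTS:     # steer effort toward what is still unhit
            bias[point] = 0.1 if point in covered else 1.0

    print(f"covered {len(covered)}/{len(COVERAGE_POINTS)}: {sorted(covered)}")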
SRTL Model Release Process
 Integrating an SRTL model for a design as complex
  as the Pentium 4 is a real challenge
         1.5 million lines of SRTL code
         “Massively parallel” development
 We put together a code release process based on
  our experience from previous projects:
   i.     Build and test a graft cluster model
   ii.    Turn in changes for inclusion in the next cluster
          model release, along with other designers’ changes
   iii.   Build and test a graft full-chip model
   iv.    Turn in changes for inclusion in the next full-chip
          model release, along with changes for other clusters
 Although this may seem overly bureaucratic, it kept
  the full-chip model healthy even in the face of major
  waves of new functionality
Feature Pioneering
 Sometimes, we had to make exceptions to the
  general integration methodology
     Some features required simultaneous turn-ins from multiple sources (almost always including microcode)
 In these cases, we adopted a scheme called “feature
  pioneering”
     A feature owner created a prototype model containing all the first-cut changes
    A feature validator put together a set of broad-brush
     tests designed to exercise the basic functionality
    The two of them sat together and rapidly iterated
     through test and fix cycles until the feature was healthy
     enough to be released to a wider audience
How do you know you’re done?
 Short answer: you are never done
    There are always more tests that can be run, more
     coverage that can be obtained, …
 More useful answer: you are done when you have
  exhausted the usefulness of the SRTL model as a
  vehicle for finding bugs
 A good first-order approximation is to track:
    New bug rate
    Cycles run
    Coverage

 If you are trying really hard to find bugs, and not
  finding anything significant, and you’ve covered the
  whole target space, then it may be time to tape out
Pre-silicon Bug Rate
[Chart: pre-silicon bugs found per week, work weeks 40'98 through 51'99]

Pre-silicon validation cycles
[Chart: full-chip simulation cycles per week (millions), work weeks 40'98 through 51'99]
Unit-Level Coverage
Pre-silicon Bug Causes

				