CPSC 321 Computer Architecture - PowerPoint

					         CPSC 321
     Computer Architecture
                   Spring 2008

                    Lecture 1
Introduction and Five Components of a Computer
                      Manhee Lee

       Adapted from CS 152 Spring 2002 UC Berkeley
                  Copyright (C) 2001 UCB
           Course Information
• Instructor
  – Manhee Lee manhee@cs.tamu.edu
     • HRBB Rm 336, tel: 845-1865
     • Office Hours: MWF 9~10AM

• TAs
  – Minseon Ahn msahn@cs.tamu.edu
  – Lei Wang wanglei@cs.tamu.edu
   Course Information [contd…]
• Grading: Projects, Labs, Exams
   – Labs              25%
   – Mid Term          25%
   – Finals            25%
   – Projects          25%
   – Class Participation (extra)       5%

• Labs
   – MIPS (Assembly Programming), Verilog (HDL)

• Projects
   – Project 1: MIPS
   – Projects 2 & 3: Verilog (Datapath Design)
  Course Information [contd…]
• Books
  – Computer Organization and Design: The Hardware/Software
    Interface, Third Edition,
    David A. Patterson and John L. Hennessy, Morgan
    Kaufmann Publishers.


  – Digital Design, (optional)
    M. Morris Mano, 3rd Edition, Prentice Hall

  – Check the course webpage for other materials and links
  Course Information [contd…]
• Course Webpage
  – http://students.cs.tamu.edu/manhee/cpsc321

• CS Accounts
  – Use your CS accounts for turnin

• Course announcements will be made on
  course website and/or Neo email
            Course Contents
•   Organization of a computer
•   Assembly language
•   Design of a computer
•   Verilog
•   Future architectures
How does the course fit into the
        curriculum?

  CPSC 483 Computer Sys Design            CPSC 4xx
  ECEN 449 Micropro. Sys. Design         compiler OS




                   CPSC 321
             Computer Architecture


  ELEN 220 Intro to                ELEN 248 Intro to
    Digital Design                 DGTL Sys Design
         Things to Learn Today
• Computer architecture
• Instruction Set Architecture (ISA)
  – Load-store ISA
• Process technology – 65nm, 45nm
• Moore’s law
• Performance
  – Response time, Throughput
• Clock cycles per instruction
• Amdahl’s Law
 Computer Architecture - Definition
• Computer Architecture = ISA + MO

• Instruction Set Architecture
  – What the executable can “see” as underlying hardware
  – Logical View



• Machine Organization
  – How the hardware implements the ISA
  – Physical View
           ISA: Critical Interface

software



                  instruction set



hardware
What is Computer Architecture ?

    Applications
                   Operating
                     System
                                Instruction set
       Compiler      Firmware    architecture
  Instr. Set Proc. I/O system

     Datapath & Control
                                   Machine
       Digital Design
                                 organization
        Circuit Design
            Layout



Many levels of abstraction
     Types of Internal Storage
• Stack, Accumulator, A Set of Registers
            Basic ISA Classes
• Stack
   – Push A, Push B, Add, Pop C
• Accumulator
   – Load A, Add B, Store C
• Register-memory
   – Load R1 A, Add R3 R1 B, Store R3 C
• Register-register/load-store
   – Load R1 A, Load R2 B, Add R3 R1 R2, Store R3 C
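
The four instruction sequences above all compute C = A + B. A minimal sketch of a toy interpreter makes the differences concrete. The opcode names (`push`, `add_stack`, `load_acc`, `add_rm`, `add_rr`, and so on) and the `run` function are hypothetical and chosen only for illustration; they do not correspond to any real ISA.

```python
def run(program, memory):
    """Execute a list of (opcode, *operands) tuples against a copy of memory."""
    mem = dict(memory)
    stack, acc, regs = [], 0, {}
    for op, *args in program:
        if op == "push":            # stack: push a memory operand
            stack.append(mem[args[0]])
        elif op == "pop":           # stack: pop the top of stack into memory
            mem[args[0]] = stack.pop()
        elif op == "add_stack":     # stack: operands are implicit
            stack.append(stack.pop() + stack.pop())
        elif op == "load_acc":      # accumulator: one implicit register
            acc = mem[args[0]]
        elif op == "add_acc":
            acc += mem[args[0]]
        elif op == "store_acc":
            mem[args[0]] = acc
        elif op == "load":          # load <reg> <addr>
            regs[args[0]] = mem[args[1]]
        elif op == "store":         # store <reg> <addr>
            mem[args[1]] = regs[args[0]]
        elif op == "add_rm":        # register-memory: one operand from memory
            regs[args[0]] = regs[args[1]] + mem[args[2]]
        elif op == "add_rr":        # load-store: ALU operands are registers only
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return mem


PROGRAMS = {
    "stack": [("push", "A"), ("push", "B"), ("add_stack",), ("pop", "C")],
    "accumulator": [("load_acc", "A"), ("add_acc", "B"), ("store_acc", "C")],
    "register-memory": [("load", "R1", "A"), ("add_rm", "R3", "R1", "B"),
                        ("store", "R3", "C")],
    "load-store": [("load", "R1", "A"), ("load", "R2", "B"),
                   ("add_rr", "R3", "R1", "R2"), ("store", "R3", "C")],
}

for name, program in PROGRAMS.items():
    result = run(program, {"A": 2, "B": 3})
    print(name, len(program), "instructions, C =", result["C"])
```

In the load-store version only `load` and `store` touch memory, which is the property that gives the class its name.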
             The Big Picture

            Processor
                                 Input
             Control
                        Memory


            Datapath
                                 Output




Since 1946 all computers have had 5 components!!!
           Technology Trends
• Processor
   – logic capacity: about 30% per year
   – clock rate:     about 20% per year
• Memory
   – DRAM capacity: about 60% per year (4x every 3 years)
   – Memory speed: about 10% per year
   – Cost per bit: improves about 25% per year
• Disk
   – capacity: about 60% per year
   – Total use of data: 100% per 9 months!
• Network Bandwidth
   – Bandwidth increasing more than 100% per year!
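
The annual rates above compound, which is where "4x every 3 years" comes from. A minimal sketch of the arithmetic follows; the function names are made up for this example.

```python
import math


def compound_growth(annual_rate, years):
    """Total growth factor after `years` at a fixed annual growth rate."""
    return (1.0 + annual_rate) ** years


def doubling_time_years(annual_rate):
    """Years needed to double at a fixed annual growth rate."""
    return math.log(2) / math.log(1.0 + annual_rate)


# DRAM capacity: 60% per year compounds to roughly 4x over 3 years.
print(compound_growth(0.60, 3))
# Memory speed: 10% per year grows far more slowly over the same period.
print(compound_growth(0.10, 3))
# Growth of 100% per year means doubling every year.
print(doubling_time_years(1.00))
```

Comparing the first two numbers shows why memory capacity and memory speed drift apart over time.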
                Technology Trends
    DRAM chip capacity

       Year      Size
       1980      64 Kb
       1983      256 Kb
       1986      1 Mb
       1989      4 Mb
       1992      16 Mb
       1996      64 Mb
       1999      256 Mb
       2002      1 Gb

    [Figure: Microprocessor Logic Density. Transistors per chip (log scale, 1,000 to 100,000,000) versus year (1965 to 2005). Labeled processors: i4004, i8086, SU MIPS, i80286, R3010, i80386, i80486, R4400, Pentium, R10000. Legend: i80x86, M68K, MIPS, Alpha.]


°   In ~1985 the single-chip processor (32-bit) and the single-board computer emerged

°   In the 2002+ timeframe, these may well look like mainframes compared to a single-chip
    computer (maybe 2 chips)
            Technology Trends
Smaller feature sizes: higher speed, density

•   65nm
     Intel Pentium 4 (Cedar Mill) – 2006-01-16
     Intel Pentium D 900-series – 2006-01-16
     Intel Celeron D (Cedar Mill cores) – 2006-05-28
     Intel Core – 2006-01-05
     Intel Core 2 – 2006-07-27
     Intel Xeon (Sossaman) – 2006-03-14
     AMD Athlon 64 series (starting from Lima) – 2007-02-20
     IBM's Cell Processor - Playstation 3
     Microsoft Xbox 360 "Falcon" CPU - 2007-09
•   45nm
     Intel 5400-series Xeon(R) platform - 2007-11
     AMD: 2008
•   32nm
     2009-2010, Intel vs. AMD+IBM
•   22nm
     2011-2012
•   16nm
     ~2013


          ECE/CS 752; copyright J. E. Smith, 2002 (Univ. of Wisconsin)
          Wikipedia.com
       Technology Trends




Moore’s law: number of transistors doubles every 24 months
                      Wikipedia.com
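
As a rough illustration of the doubling rule stated above, the sketch below projects a transistor count forward in time. The starting count of 1,000 is a made-up number, not a figure from the slides.

```python
def transistors_after(initial, months, doubling_months=24):
    """Project a transistor count assuming it doubles every `doubling_months`."""
    return initial * 2 ** (months / doubling_months)


# Hypothetical chip with 1,000 transistors: 4 years is two doublings.
print(transistors_after(1000, 48))
# Doubling every 24 months is about 41% growth per year.
print(transistors_after(1.0, 12))
```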
The Role of Performance
        Performance Metrics
• Response Time
  – Delay between the start and end of a task


• Throughput
  – Number of tasks completed per unit time
                Question
• Suppose that we replace the processor in
  a computer by a faster model
  – Does this improve the response time?
  – How about the throughput?
                Question
• Suppose we add an additional processor
  to a system that uses the processors for
  multiple tasks simultaneously.
  – Does this improve the response time?
  – Does this improve the throughput?
      Measuring Performance
• Wall-clock time or Total Execution Time
  – System performance
• CPU Time (CPU execution time)
  – User CPU Time ~ CPU performance
  – System CPU Time

• Try using time command on UNIX system
    Real
    User
    Sys
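
The same three quantities can be observed from inside a program. The sketch below is one possible way to do it in Python, using `time.perf_counter` for wall-clock time and `os.times` for user and system CPU time. The `measure` and `sample_work` functions are hypothetical helpers written for this example.

```python
import os
import time


def measure(work):
    """Return real (wall-clock), user CPU, and system CPU seconds for work()."""
    wall_start = time.perf_counter()
    cpu_start = os.times()
    work()
    cpu_end = os.times()
    wall_end = time.perf_counter()
    return {
        "real": wall_end - wall_start,
        "user": cpu_end.user - cpu_start.user,
        "sys": cpu_end.system - cpu_start.system,
    }


def sample_work():
    # Some computation (consumes CPU time) ...
    sum(i * i for i in range(200_000))
    # ... followed by waiting (adds to wall-clock time, not CPU time).
    time.sleep(0.05)


timings = measure(sample_work)
for label in ("real", "user", "sys"):
    print(f"{label:5s}{timings[label]:.3f}s")
```

The sleep shows up in the real time but not in the user or sys time, which is the distinction between system performance and CPU performance.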
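               Performance

          (Absolute) Performance

     Performance_X = 1 / Execution time_X

           Relative Performance
     "X is n times faster than Y" means

     n = Performance_X / Performance_Y
       = Execution time_Y / Execution time_X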
      Performance Example
• System A
  – CPU execution time: 2sec
• System B
  – CPU execution time: 4sec
• System A is __ times faster than System B
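
One way to check the answer, assuming the usual textbook definition that performance is the reciprocal of execution time (the function name is made up for this sketch):

```python
def times_faster(exec_time_x, exec_time_y):
    """How many times faster X is than Y, given their execution times."""
    performance_x = 1.0 / exec_time_x
    performance_y = 1.0 / exec_time_y
    return performance_x / performance_y


# System A takes 2 s, System B takes 4 s.
print(times_faster(2.0, 4.0))  # 2.0
```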
         CPU Performance
• CPU Execution time for a program
  = CPU Clock Cycles for a program
    X Clock Cycle time

 = CPU Clock Cycles for a program /
    Clock rate
   CPU Performance Example
• A program runs in 10 secs on computer A
  with a 4 GHz clock. We try to build
  computer B, that runs this program in 6
  secs where computer B requires 1.2 times
  as many clock cycles as computer A for
  this program. What clock rate should we
  target?
   CPU Performance Example
• Computer A
  – CPU execution time: 10sec
  – Clock rate: 4GHz
  – Clock cycles:
• Computer B
  – CPU execution time: 6sec
  – Clock rate:
  – Clock cycles:
• B’s clock cycles/A’s clock cycles = 1.2
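
The blanks can be filled in with the relation CPU time = clock cycles / clock rate. The following is a sketch of that arithmetic; the variable names are chosen for this example.

```python
# Computer A: 10 s at 4 GHz.
time_a = 10.0            # seconds
clock_rate_a = 4e9       # Hz
cycles_a = time_a * clock_rate_a     # cycles = time x clock rate

# Computer B needs 1.2 times as many cycles and must finish in 6 s.
cycles_b = 1.2 * cycles_a
time_b = 6.0             # seconds
clock_rate_b = cycles_b / time_b     # clock rate = cycles / time

print(f"A clock cycles: {cycles_a:.3g}")
print(f"B clock cycles: {cycles_b:.3g}")
print(f"B clock rate:   {clock_rate_b / 1e9:.1f} GHz")
```

Under these numbers, computer B needs roughly an 8 GHz clock, twice that of A, to gain its shorter run time despite executing more cycles.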
           CPU Performance
• CPI
  = CPU clock cycles / instruction count
• CPU clock cycles
  = instruction count X CPI
• CPU time
  = instruction count X CPI X Clock cycle time


             inst count         Cycle time



                          CPI
                 CPI Example
• Computer A
  –   Cycles: 10G
  –   Instructions: 2G
  –   CPI:
  –   CPU execution time:
• Computer B
  –   CPI: 2
  –   Clock rate: 1GHz
  –   Instructions: 1G
  –   CPU execution time:
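
A sketch of the arithmetic for this example is below. The slide gives no clock rate for computer A, so its execution time is left as a function of an assumed clock rate; the 2 GHz value used here is purely hypothetical.

```python
def cpi(cycles, instructions):
    """Average clock cycles per instruction."""
    return cycles / instructions


def cpu_time(instructions, cpi_value, clock_rate_hz):
    """CPU time = instruction count x CPI x clock cycle time."""
    return instructions * cpi_value / clock_rate_hz


# Computer A: 10G cycles, 2G instructions.
cpi_a = cpi(10e9, 2e9)
print("A CPI:", cpi_a)
# Hypothetical clock rate for A (not given on the slide).
print("A time at an assumed 2 GHz:", cpu_time(2e9, cpi_a, 2e9), "s")

# Computer B: CPI 2, 1 GHz clock, 1G instructions.
time_b = cpu_time(1e9, 2, 1e9)
print("B time:", time_b, "s")
```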
              Cycles Per Instruction
                  (Throughput)
 “Average Cycles per Instruction”
           CPI = (CPU Time * Clock Rate) / Instruction Count
               = Cycles / Instruction Count

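$$\text{CPU time} = \text{Cycle Time} \times \sum_{j=1}^{n} \text{CPI}_j \times I_j$$

$$\text{CPI} = \sum_{j=1}^{n} \text{CPI}_j \times F_j \qquad \text{where } F_j = \frac{I_j}{\text{Instruction Count}}$$

$F_j$ is the “Instruction Frequency” of instruction class $j$.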
               CPI Example
• Function A
  – CPI: 1
  – Instruction: 3M
• Function B
  – CPI: 3
  – Instruction: 1M
• CPI:
• Ex P.252
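
The overall CPI is the instruction-count-weighted average of the per-function CPIs. A minimal sketch, with a made-up function name:

```python
def overall_cpi(classes):
    """Weighted-average CPI from (cpi, instruction_count) pairs."""
    total_instructions = sum(count for _, count in classes)
    total_cycles = sum(cpi * count for cpi, count in classes)
    return total_cycles / total_instructions


# Function A: CPI 1, 3M instructions; Function B: CPI 3, 1M instructions.
print(overall_cpi([(1, 3e6), (3, 1e6)]))  # 1.5
```

Function B contributes only a quarter of the instructions but half of the cycles.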
                Amdahl’s Law
• Pitfall: Expecting the improvement of one aspect of a
  machine to increase performance by an amount
  proportional to the size of improvement
        Amdahl’s Law [contd…]
• A program runs in 100 seconds on a machine, with multiply
  operations responsible for 80 seconds of this time. How much do I
  have to improve the speed of multiplication if I want my program to
  run five times faster ?

• Execution Time After improvement =
  (exec time affected by improvement/amount of improvement) + exec
  time unaffected
   exec time after improvement = (80 seconds / n) + (100 – 80 seconds)

   We want performance to be 5 times faster =>
   20 seconds = (80 seconds / n) + 20 seconds

   0 = 80 / n, which no finite n can satisfy!
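
The multiplication example can be explored numerically. The sketch below uses hypothetical function names and returns `None` when the target speedup cannot be reached by improving the affected part alone.

```python
def time_after_improvement(total, affected, n):
    """Execution time when the affected part becomes n times faster."""
    return affected / n + (total - affected)


def required_improvement(total, affected, target_speedup):
    """Factor n needed for the whole program to run target_speedup times faster.

    Returns None if no finite n is enough.
    """
    target_time = total / target_speedup
    unaffected = total - affected
    if target_time <= unaffected:
        return None
    return affected / (target_time - unaffected)


# 100 s total, 80 s of multiplication.
print(required_improvement(100, 80, 4))   # 16.0: multiply must be 16x faster
print(required_improvement(100, 80, 5))   # None: 5x overall is unreachable
print(time_after_improvement(100, 80, 1e9))  # approaches the 20 s floor
```

The unaffected 20 seconds caps the overall speedup below 5, no matter how fast multiplication becomes.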
     Amdahl’s Law [contd…]
• Opportunity for improvement is affected by
  how much time the event consumes
• Make the common case fast
 MIPS as a performance measure
• MIPS = Instruction count
         / (Execution time X 10^6)

Ex P. 269
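
A short sketch of the MIPS formula follows. The sample values (1G instructions, 2 seconds, 1 GHz clock, CPI of 2) are illustrative assumptions, and the function names are made up for this example.

```python
def mips(instruction_count, execution_time_s):
    """Million instructions per second."""
    return instruction_count / (execution_time_s * 1e6)


def mips_from_clock(clock_rate_hz, cpi):
    """Equivalent form: MIPS = clock rate / (CPI x 10^6)."""
    return clock_rate_hz / (cpi * 1e6)


# Illustrative machine: 1G instructions in 2 s, 1 GHz clock, CPI of 2.
print(mips(1e9, 2.0))            # 500.0
print(mips_from_clock(1e9, 2))   # 500.0
```

Because MIPS ignores how much work each instruction does, two machines with different instruction sets can have the same MIPS rating and different execution times.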
         Things to Learn Today
• Computer architecture
• Instruction Set Architecture (ISA)
  – Load-store ISA
• Process technology – 65nm, 45nm
• Moore’s law
• Performance
  – Response time, Throughput
• Clock cycles per instruction
• Amdahl’s Law