DSP Processor




Behdad Hosseini, 781413112
University of Isfahan
April, May 2002

Dedicated to Z. Haghshenas

Introduction

- Digital signals & systems
- DSP (Digital Signal Processing)
- Digital Signal Processors (DSPs) vs. General-Purpose Processors (GPPs)

DSP Features

- High-speed DSP computations
    - Specialized instruction set
    - High-performance repetitive numeric calculations
    - Fast and efficient memory accesses
- Special mechanisms for real-time I/O
- Low power consumption
- Low cost in comparison with GPPs

DSP General Applications

- Digital cellular phones          - Voice mail
- Satellite communications         - Digital cameras
- Seismic analysis                 - Navigation equipment
- Vehicle collision avoidance      - Modems (POTS, ISDN, cable, ...)
- Secure communications            - Audio production
- Voice over Internet              - Noise cancellation
- Tapeless answering machines      - Videoconferencing
- Motor control                    - Medical ultrasound
- Sonar                            - Music synthesis, effects
                                   - Radar

DSP μP Applications

- Speech and audio compression
- Filtering
- Modulation and demodulation
- Error correction coding and decoding
- Audio processing (e.g., surround sound, noise reduction, equalization, sample rate conversion, echo cancellation)
- Signaling (e.g., DTMF detection)
- Speech recognition
- Signal synthesis (e.g., music, speech synthesis)

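As one illustration of the signaling item above (DTMF detection), the following is a minimal sketch using the Goertzel algorithm, a common way to measure signal power at a few known frequencies. The slides do not name a detection method, so the choice of Goertzel, the 8 kHz sample rate, the 205-sample block length, and all function names here are assumptions made for illustration.

```python
import math

# Standard DTMF keypad frequencies in Hz (rows and columns).
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, sample_rate, target_hz):
    """Return the squared magnitude of the DFT bin nearest target_hz."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)          # nearest DFT bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2            # one multiply-accumulate per sample
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def detect_dtmf(samples, sample_rate=8000):
    """Pick the strongest row and column tone and map the pair to a key.

    No silence or noise threshold is applied; this sketch assumes the
    block really contains one DTMF digit.
    """
    row = max(range(4), key=lambda i: goertzel_power(samples, sample_rate, ROW_HZ[i]))
    col = max(range(4), key=lambda i: goertzel_power(samples, sample_rate, COL_HZ[i]))
    return KEYPAD[row][col]


def dtmf_tone(key, n=205, sample_rate=8000):
    """Synthesize n samples of the two-tone signal for `key` (demo helper)."""
    for r, keys in enumerate(KEYPAD):
        if key in keys:
            f1, f2 = ROW_HZ[r], COL_HZ[keys.index(key)]
            break
    else:
        raise ValueError(f"not a DTMF key: {key!r}")
    return [math.sin(2 * math.pi * f1 * t / sample_rate)
            + math.sin(2 * math.pi * f2 * t / sample_rate)
            for t in range(n)]
```

The inner loop is a single multiply-accumulate per sample, which is the kind of repetitive numeric work these processors are built to run quickly.
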
DSP Characteristics

1.   Data path & internal ALU architecture
2.   Specialized instruction set
3.   External memory architecture
4.   Specialized addressing modes
5.   Specialized execution control
6.   Specialized peripherals for DSP

Data Path

DSPs:
- Perform all key arithmetic operations in 1 cycle
- Hardware support for managing numeric fidelity:
    - Shifters
    - Guard bits
    - Saturation

GPPs:
- Multiplies often take >1 cycle
- Shifts often take >1 cycle
- Other operations (e.g., saturation, rounding) typically take multiple cycles

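The numeric-fidelity features listed under Data Path (shifters, guard bits, saturation) can be sketched in software. The example below is only an illustration under stated assumptions: Q15 operands, a 40-bit accumulator (32 bits plus 8 guard bits), and round-to-nearest on the final store. None of these widths come from the slides, and the function names are hypothetical.

```python
# Sketch of a fixed-point multiply-accumulate (MAC) with guard bits and saturation.
# Assumed formats: Q15 inputs, 40-bit accumulator (32 bits + 8 guard bits).

Q15_MIN, Q15_MAX = -32768, 32767
ACC_BITS = 40


def saturate(value, lo, hi):
    """Clamp value to [lo, hi] instead of letting it wrap around."""
    return max(lo, min(hi, value))


def to_q15(x):
    """Convert a float in [-1, 1) to a Q15 integer."""
    return saturate(int(round(x * 32768)), Q15_MIN, Q15_MAX)


def mac_q15(samples, coeffs):
    """Sum Q15 x Q15 products in a wide accumulator, then store as Q15."""
    acc_min = -(1 << (ACC_BITS - 1))
    acc_max = (1 << (ACC_BITS - 1)) - 1
    acc = 0
    for x, c in zip(samples, coeffs):
        # Each product is in Q30; the guard bits absorb growth of the running sum.
        acc = saturate(acc + x * c, acc_min, acc_max)
    # Shifter step: move Q30 back to Q15 with rounding, then saturate to 16 bits.
    rounded = (acc + (1 << 14)) >> 15
    return saturate(rounded, Q15_MIN, Q15_MAX)
```

On a DSP the multiply, accumulate, shift, and saturation typically happen in hardware within a single cycle; a GPP without such support would spend several instructions on the same steps.
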
DSP Data Path Example

[Figure: A representative conventional fixed-point DSP processor data path (from the Motorola DSP560xx, a 24-bit fixed-point processor family)]

Instruction Set

DSPs:
- Specialized, complex instructions
- Multiple operations per instruction (e.g., using VLIW)

GPPs:
- General-purpose instructions
- Typically only one operation per instruction

VLIW

Very long instruction word (VLIW) architectures are garnering increased attention for DSP applications.

Major features:
- Multiple independent operations per cycle
- Packed into a single large "instruction" or "packet"
- More regular, orthogonal, RISC-like operations
- Large, uniform register sets

Memory Architecture

DSPs:
- Harvard architecture
- 2-4 memory accesses/cycle
- No caches; on-chip SRAM instead

GPPs:
- Von Neumann architecture
- Typically 1 access/cycle
- May use caches

Von Neumann Architecture

[Figure: The Von Neumann memory architecture, common among microcontrollers. Since there is only one data bus, operands cannot be loaded while instructions are fetched, creating a bottleneck that slows the execution of DSP algorithms.]

Harvard Architecture

[Figure: A Harvard architecture, common to many DSP processors. The processor can simultaneously access the two memory banks using two independent sets of buses, allowing operands to be loaded while fetching instructions.]

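The bottleneck described for the Von Neumann and Harvard architectures can be made concrete with a toy cycle count. The model below is a deliberate simplification and an assumption of this sketch, not a description of any real processor: each FIR filter tap needs three memory accesses (one instruction fetch, one sample read, one coefficient read), and memory bandwidth is the only limit on speed.

```python
import math


def fir_cycles(taps, accesses_per_cycle):
    """Cycles for one FIR output when memory bandwidth is the only limit.

    Toy model: every tap costs three memory accesses (instruction,
    sample, coefficient), and a tap cannot finish in a fraction of a cycle.
    """
    accesses_per_tap = 3
    cycles_per_tap = math.ceil(accesses_per_tap / accesses_per_cycle)
    return taps * cycles_per_tap


# A 64-tap filter with a single shared bus versus three accesses per cycle.
von_neumann = fir_cycles(taps=64, accesses_per_cycle=1)   # 192 cycles
harvard = fir_cycles(taps=64, accesses_per_cycle=3)       # 64 cycles
```

Under these assumptions the single-bus machine needs three times as many cycles per output, which is the gap that multiple memory banks and buses are meant to close.
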
Addressing Modes

DSPs:
- Dedicated address generation units
- Specialized addressing modes, e.g.:
    - Auto-increment
    - Modulo (circular)
    - Bit-reversed (for FFT)

GPPs:
- Often, no separate address generation unit
- General-purpose addressing modes
- Good immediate data support

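The modulo (circular) and bit-reversed addressing modes listed above are easiest to see as address sequences. The sketch below emulates in software what a DSP address generation unit would produce in hardware; the function names and the buffer sizes in the usage lines are illustrative assumptions.

```python
def modulo_addresses(start, step, length, count):
    """Yield `count` addresses that wrap around inside a buffer of `length`."""
    addr = start
    for _ in range(count):
        yield addr
        addr = (addr + step) % length   # wrap instead of running off the end


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index` (FFT input/output ordering)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


# Walking an 8-entry circular buffer from address 6 wraps back to 0.
circular = list(modulo_addresses(start=6, step=1, length=8, count=4))

# Access order for an 8-point radix-2 FFT (3 address bits).
fft_order = [bit_reverse(i, 3) for i in range(8)]
```

Here `circular` is `[6, 7, 0, 1]` and `fft_order` is `[0, 4, 2, 6, 1, 5, 3, 7]`. A GPP would compute these addresses with extra instructions, while a DSP produces them alongside the arithmetic at no extra cycle cost.
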
Execution Control

- Hardware support for fast looping
- "Fast interrupts" for I/O handling
- Real-time debugging support

Peripherals

- Host ports
- Bit I/O ports
- On-chip DMA controller
- Clock generators
- Synchronous serial ports
- Parallel ports
- Timers
- On-chip A/D, D/A converters

DSP Classifications (1)

- By arithmetic format
    - Fixed-point
    - Floating-point
    - Block floating-point
- By data width
    - Typical fixed-point DSPs: 16-bit
    - Typical floating-point DSPs: 32-bit
- By memory organization
- By multiprocessor support

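Of the arithmetic formats listed above, block floating-point is the least self-explanatory: a whole block of values shares one exponent, and each value keeps only an integer mantissa. The following is a minimal sketch of that idea, assuming 16-bit mantissas and a power-of-two exponent taken from the largest value in the block. The function names and the encoding details are assumptions for illustration, not a particular processor's format.

```python
import math


def block_float_encode(values, mantissa_bits=16):
    """Encode floats as integer mantissas that share one block exponent."""
    peak = max(abs(v) for v in values)
    if peak == 0.0:
        return [0] * len(values), 0
    # frexp writes peak as m * 2**exponent with 0.5 <= m < 1.
    _, exponent = math.frexp(peak)
    scale = 2.0 ** (mantissa_bits - 1 - exponent)
    limit = (1 << (mantissa_bits - 1)) - 1
    # Scale every value by the same factor, round, and clamp to the mantissa range.
    mantissas = [max(-limit - 1, min(limit, round(v * scale))) for v in values]
    return mantissas, exponent


def block_float_decode(mantissas, exponent, mantissa_bits=16):
    """Recover approximate floats from mantissas and the shared exponent."""
    scale = 2.0 ** (mantissa_bits - 1 - exponent)
    return [m / scale for m in mantissas]
```

For example, `block_float_encode([3.0, 1.5, -0.75])` yields the mantissas `[24576, 12288, -6144]` with a shared exponent of 2. Small values in a block with one large peak lose precision, which is the trade-off against true floating-point.
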
DSP Classifications (2)

- By speed
    - Millions of instructions per second (MIPS)
    - A basic operation (e.g., MAC)
    - A basic algorithm (e.g., FFT, FIR or IIR filter)
    - Benchmark programs
- By power consumption
    - Operating voltage
    - Sleep or idle mode
    - Programmable clock dividers
    - Peripheral control

DSP Evolution

- First generation (TI TMS32010)
- Second generation (Motorola DSP56001, AT&T DSP16A, Analog Devices ADSP-2100, TI TMS320C50)
- Third generation (Motorola DSP56301, TI TMS320C541, TI TMS320C80, Motorola MC68356)
- Fourth generation (TI TMS320C6201, Intel Pentium MMX)

First Generation (1982)

- 16-bit fixed-point
- Harvard architecture
- Accumulator
- Specialized instruction set
- 390 ns MAC time (228 ns today)

Second Generation (1987)

- 24-bit data, instructions
- 3 memory spaces (X, Y, P)
- Parallel moves
- Single- and multi-instruction hardware loops
- Modulo addressing
- 75 ns MAC (21 ns today)

Third Generation (1995)

- Enhanced conventional DSP architectures
- 3.0 or 3.3 volts
- More on-chip memory
- Application-specific function units in data path or as co-processors
- More sophisticated debugging and application development tools
- DSP cores (Pine & Oak from DSP Group, cDSP from TI)
- 20 ns MAC (10 ns today)

Fourth Generation (1998)

- Very high clock speeds and superscalar architectures
- VLIW-like architectures achieve top performance via high parallelism and increased clock speeds
- 3 ns MAC throughput
- Expensive, power-hungry

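The MAC times quoted for the four generations (390 ns, 75 ns, 20 ns, and 3 ns at introduction) are easier to compare when converted into throughput and into the time for a small filter. The sketch below does that arithmetic; the 64-tap filter length is an assumed example, and the estimate ignores everything except the MACs themselves.

```python
# MAC time at introduction for each generation, in nanoseconds (from the slides).
MAC_TIME_NS = {
    "first (1982)": 390,
    "second (1987)": 75,
    "third (1995)": 20,
    "fourth (1998)": 3,
}


def macs_per_second(mac_time_ns):
    """Peak multiply-accumulates per second for a given MAC time."""
    return 1e9 / mac_time_ns


def fir_time_us(taps, mac_time_ns):
    """Microseconds per output sample of a FIR filter, counting one MAC per tap."""
    return taps * mac_time_ns / 1000.0


for name, ns in MAC_TIME_NS.items():
    print(f"{name}: {macs_per_second(ns) / 1e6:.1f} M MAC/s, "
          f"64-tap FIR in {fir_time_us(64, ns):.3f} us")

# Ratio between the first and fourth generation MAC times.
speedup = MAC_TIME_NS["first (1982)"] / MAC_TIME_NS["fourth (1998)"]
```

By this measure a 64-tap filter output drops from about 25 microseconds on a first-generation part to about 0.2 microseconds on a fourth-generation part, a speedup of 130x.
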
DSP Evolution Chart

[Figure: DSP evolution chart]

DSP Performance Chart

[Figure: Execution times for a 256-point complex FFT, in microseconds]

Role of GPPs (1)

- Added capabilities:
    - Add single-instruction, multiple-data (SIMD) instruction set extensions (e.g., Pentium MMX)
    - Integrate a fixed-point DSP processor-like data path and related resources with an existing μC/μP core (e.g., Hitachi SH-DSP)
    - Add a DSP co-processor to an existing μC/μP core (e.g., ARM Piccolo)
    - Create an all-new, hybrid architecture (e.g., Siemens TriCore)

Role of GPPs (2)

- Capabilities that help with DSP tasks:
    - Very high clock rates (500-1000 MHz)
    - Superscalar ("multi-issue") architectures
    - Single-cycle multiplication and arithmetic operations
    - Good memory bandwidth
    - Branch prediction
    - In some cases, single-instruction, multiple-data (SIMD) operations
    - Caching & pipelining

Conclusion

- DSP processor performance has increased by a factor of about 150x over the past 15 years (~40%/year)
- Processor architectures for DSP will be increasingly specialized for applications, especially communications applications
- General-purpose processors will become viable for many DSP applications
- Users of processors for DSP will have an expanding array of choices
- Selecting processors requires a careful, application-specific analysis

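The first conclusion pairs two figures, a 150x increase over 15 years and roughly 40% per year. A short check shows how they relate, assuming steady compound growth (an assumption of this sketch; real progress came in uneven steps).

```python
def annual_growth(total_factor, years):
    """Compound annual growth rate that yields total_factor after `years`."""
    return total_factor ** (1.0 / years) - 1.0


rate = annual_growth(150, 15)
print(f"{rate:.1%} per year")            # just under 40% per year

# Compounding that rate for 15 years recovers the 150x factor.
recovered = (1.0 + rate) ** 15
```

The yearly rate works out to just under 40%, in line with the figure on the slide.
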
Web Links & Information

- http://www.bdti.com
- http://www.eg3.com/dsp

- Buyer's Guide to DSP Processors, Berkeley, California: Berkeley Design Technology, Inc., 1994, 1995, 1997, 1999.
- Phil Lapsley, Jeff Bier, Amit Shoham, and Edward A. Lee, DSP Processor Fundamentals: Architectures and Features, Berkeley, California: Berkeley Design Technology, Inc., 1996.
- Will Strauss, DSP Strategies 2002, Tempe, Arizona: Forward Concepts, 1999.