lec14-chameleon

					   Energy Management for
 Dynamically Reconfigurable
Heterogeneous Mobile Systems




 Paul J.M. Havinga, Lodewijk T. Smit, Gerard J.M.
      Smit, Martinus Bos, Paul M. Heysters




  VLSI Algorithmic Design Automation Lab.
Abstract

   Dynamically reconfigurable systems offer the potential for
    realising efficient systems as well as providing adaptability to
    changing system requirements.
   The research performed in the CHAMELEON project aims at
    designing such a heterogeneous reconfigurable mobile system.
   The two main motivations for the system are 1) to have an
    energy-efficient system, while 2) achieving an adequate
    Quality of Service for applications.




Scenario




Introduction

   Integration of personal computing and communication devices
       pager, cellular phone, digital camera, video game, etc.
   Multimedia-oriented user interface (terminal)
       speech recognition, video, audio
   Interaction with and reaction to the environment
       access to services and resources
       uses and reacts to location information
   Excellent security and high reliability
   Resource poor (small size, light weight, low power, ...)



Chameleon




Mobile multimedia systems

   Design a battery-powered personal mobile computing device
    that has multimedia functionality and can operate in a dynamic
    environment.

   A possible solution is a mobile device with a
    reconfigurable architecture, so that it can adapt its operation to
    the current environment and operating conditions.




Why reconfigurability in mobile systems

   Mobile devices operate in a dynamically changing environment
    and must be able to adapt to the new environment.

   Flexibility to handle a variety of multimedia services and
    standards.

   Adaptability to accommodate environment, required level of
    security, and available resources.




The spectrum of solutions

   [Figure: the spectrum of solutions for an application, from flexibility
    (general-purpose processor, e.g. Pentium) through reconfigurable
    architecture to efficiency (application-specific modules, ASIC)]




Approaches

   Partitioning on heterogeneous architectures

       Dedicated hardware (ASIC)

       FPGA (reconfigurable computing)

       Software (general purpose processor)

       Server in the wired network




Reconfigurable computing

   Reconfigurable computing systems combine programmable
    hardware with programmable processors.

   Many reconfigurable computing systems are based on FPGAs.
    However, these systems have a number of limitations.
       Limited functionality
       Gate capacity
       Configuration speed
       Memory structures and interface
       Tools



System modelling of reconfiguration

   Our goal:
       Minimize energy consumption given a required QoS
       Adaptable and flexible for a dynamic mobile environment

   Philosophy
       Operations on data should be done where it is most energy efficient and
        where the required communication is minimized

   Types of reconfigurable system architectures
       Reconfigurable logic
       Reconfigurable data path (processors)
       Reconfigurable data streams
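
   The philosophy above (do the work where computation plus communication
   costs the least energy) can be sketched as a simple comparison. The
   platform names follow the slides, but every energy figure below is an
   invented placeholder, not a measurement from the CHAMELEON project.

```python
# Hypothetical per-task energy figures in millijoules, as seen by the mobile.
# These numbers are illustrative assumptions only.
CANDIDATES = {
    "general purpose processor": {"compute_mj": 9.0, "comm_mj": 0.5},
    "reconfigurable logic":      {"compute_mj": 2.0, "comm_mj": 1.0},
    # Offloading: the mobile spends no compute energy, only radio energy.
    "server in the wired network": {"compute_mj": 0.0, "comm_mj": 6.0},
}

def total_energy(cost):
    """Energy for one task = computation energy + communication energy."""
    return cost["compute_mj"] + cost["comm_mj"]

def cheapest_location(candidates):
    """Return the name of the candidate with the lowest total energy."""
    return min(candidates, key=lambda name: total_energy(candidates[name]))

best = cheapest_location(CANDIDATES)
print(best, total_energy(CANDIDATES[best]))
```

   With these made-up figures the reconfigurable logic wins; with a cheaper
   radio link the wired server could win instead, which is why the choice
   has to be re-evaluated as the environment changes.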


Energy management

   Be energy aware at all levels of your system (QoS)
       technology, system architecture, operating system, applications
   React to the environment: adaptability
   Do just enough and not too much for a given task (QoS)
       do not optimize for the ‘worst case’ but for the ‘current case’
   Schedule communication (use locality of reference)
   Do the tasks on the most energy-efficient platform, in the most energy-efficient way
       Reconfigurable architectures
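
   A minimal sketch of "current case, not worst case": the work done per
   task (here, decoder iterations) follows the channel quality observed
   right now. The thresholds and iteration counts are hypothetical values
   chosen for illustration; they do not come from the experiments in
   these slides.

```python
# Hypothetical lookup: (minimum channel quality in dB, iterations that suffice).
# Sorted from best channel to worst. All values are invented for illustration.
ITERATIONS_NEEDED = [(10.0, 1), (5.0, 2), (0.0, 4)]
WORST_CASE_ITERATIONS = 8  # what a worst-case design would always run

def iterations_for(snr_db):
    """Pick the smallest iteration count assumed sufficient for this channel."""
    for threshold_db, iterations in ITERATIONS_NEEDED:
        if snr_db >= threshold_db:
            return iterations
    return WORST_CASE_ITERATIONS

for snr in (12.0, 6.0, 0.0, -3.0):
    print(snr, iterations_for(snr))
```

   Only the poorest channel falls back to the worst-case setting; in every
   other case the device does less work and so spends less energy.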




Granularity of programming model

   A program is typically a set of instructions that dynamically
    modify the behavior of statically connected modules such
    as memories, registers and datapaths.
   Field-programmable devices (FPGAs) can give orders-of-magnitude
    performance improvements for specific
    computational kernels over traditional computers by providing
    programmability at the gate level.
   In the CHAMELEON project we use the term reconfigurability
    to denote all possible changes to solve a problem at various
    levels of the system architecture.



Granularity levels of reconfigurable computing

   [Figure: four granularity levels of reconfiguration]
       a) reconfigurable logic: programmable logic blocks and programmable interconnect
       b) reconfigurable data-path: mux, add, multiply and register units
       c) reconfigurable data-streams: Device 1 to Device N and a CPU connected through a switch
       d) dynamic load balancing: mobile, base station and external servers




CHAMELEON project (Overview)

   Algorithm – the most suitable algorithm(s) to execute the
    requested service(s)
   Partitioning – an appropriate partitioning of the algorithms over
    the different heterogeneous parallel processing units.
   Parameters – the optimal parameters for execution of the
    algorithms.
   Power state – the suitable energy state of the different
    hardware components.




CHAMELEON System model

   Communicating processes (threads)
   Using channels
   Multiple implementations with different characteristics, e.g.
       general purpose processor
       reconfigurable processor
       ASIC
   For fine to coarse grain granularities
       Hierarchical

   [Figure: graph of communicating processes P1 to P5]
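
   The model of communicating processes connected by channels can be
   sketched with threads and FIFO queues. This is an illustrative
   assumption about how such a model might look in software; the helper
   names (`process`, `run_pipeline`) and the two stand-in stages are
   hypothetical and not part of the CHAMELEON system.

```python
import queue
import threading

def process(func, inbox, outbox):
    """Start a thread that applies func to each item from inbox and
    forwards the result to outbox. None is used as an end-of-stream marker."""
    def run():
        while True:
            item = inbox.get()
            if item is None:        # upstream finished: pass the marker on
                outbox.put(None)
                return
            outbox.put(func(item))
    thread = threading.Thread(target=run)
    thread.start()
    return thread

def run_pipeline(samples):
    """Push samples through two processes linked by three channels."""
    c1, c2, c3 = queue.Queue(), queue.Queue(), queue.Queue()
    threads = [
        process(lambda x: x * 2, c1, c2),   # stand-in for process P1
        process(lambda x: x + 1, c2, c3),   # stand-in for process P2
    ]
    for sample in samples:
        c1.put(sample)
    c1.put(None)
    results = []
    while (item := c3.get()) is not None:
        results.append(item)
    for thread in threads:
        thread.join()
    return results

print(run_pipeline([1, 2, 3]))
```

   Because each process only sees its channels, the function behind a
   process could be swapped for a different implementation (general
   purpose processor, reconfigurable processor, ASIC driver) without
   touching its neighbours.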



Operating system support

   Mapping phase – partitioning of the system specification

   Constellation phase – loading the programs, setting up
    communication channels

   Execution phase – executing the programs

   Termination – partial reconfiguration
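
   The four phases can be read as a lifecycle that the operating system
   walks through for each application. The sketch below is a toy
   illustration under stated assumptions: round-robin mapping, one channel
   between consecutive tasks, and releasing the units at termination are
   all placeholders, and the task and unit names are hypothetical.

```python
from enum import Enum

class Phase(Enum):
    MAPPING = "mapping"
    CONSTELLATION = "constellation"
    EXECUTION = "execution"
    TERMINATION = "termination"

def run_lifecycle(tasks, units):
    """Walk through the four phases and return a log of (phase, result)."""
    log = []
    # Mapping: partition the tasks over the units (placeholder: round-robin).
    mapping = {task: units[i % len(units)] for i, task in enumerate(tasks)}
    log.append((Phase.MAPPING, mapping))
    # Constellation: set up one channel between each pair of consecutive tasks.
    channels = list(zip(tasks, tasks[1:]))
    log.append((Phase.CONSTELLATION, channels))
    # Execution: run each task on the unit it was mapped to (simulated).
    executed = [f"{task}@{mapping[task]}" for task in tasks]
    log.append((Phase.EXECUTION, executed))
    # Termination: release the units so they can be reconfigured.
    released = sorted(set(mapping.values()))
    log.append((Phase.TERMINATION, released))
    return log

for phase, result in run_lifecycle(["rake", "turbo"], ["cpu", "fpga"]):
    print(phase.value, result)
```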




Target CHAMELEON architecture

 Low power, reconfigurable accelerator for an application
 specific domain




Reconfigurable dataflow: the Octopus switch
   Communication switch
       Surrounded by several
        autonomous modules




   Connection centric
       the system is decomposed into application-specific modules
       the CPU is out of the data path
       data traffic is reduced
       a certain QoS is associated with each connection
       devices are more intelligent
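
   A connection-centric switch can be pictured as a table of connections,
   each carrying its own QoS, with data forwarded module to module and the
   CPU involved only in setting the connections up. The class below is a
   hypothetical sketch of that idea; its names and QoS fields are
   assumptions and do not describe the actual Octopus switch interface.

```python
class Switch:
    """Toy connection-centric switch: one QoS record per connection."""

    def __init__(self):
        self.connections = {}   # source module -> (destination module, qos)
        self.delivered = {}     # destination module -> list of payloads

    def connect(self, source, destination, qos):
        """Set up a connection (the only step a CPU would be involved in)."""
        self.connections[source] = (destination, qos)

    def forward(self, source, payload):
        """Deliver a payload directly to the connected module."""
        destination, qos = self.connections[source]
        self.delivered.setdefault(destination, []).append(payload)
        return destination, qos

switch = Switch()
switch.connect("camera", "display", {"bandwidth_kbps": 512})
print(switch.forward("camera", b"frame-0"))
```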


Turbo code


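     Pros and cons
     Convolutional codes have a somewhat more complex structure than block codes,
      which makes them harder to analyze, but their error-correcting capability is excellent.
     Structural characteristics
     They are linear codes: the sum of two codewords is another codeword, and a shift
      register serves as memory.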
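
   The two structural properties on this slide (shift-register memory and
   linearity) can be checked on a small example. The sketch assumes a
   plain feedforward rate-1/2 convolutional encoder with constraint length
   3 and generators 7 and 5 (octal), starting from the all-zero state;
   this is a common textbook choice, not the recursive systematic
   component encoder a real turbo code would use.

```python
def conv_encode(bits, generators=(0b111, 0b101), constraint_length=3):
    """Rate-1/len(generators) feedforward convolutional encoder.
    The shift register `state` is the encoder's memory."""
    mask = (1 << constraint_length) - 1
    state = 0
    coded = []
    for bit in bits:
        state = ((state << 1) | bit) & mask            # shift the new bit in
        for g in generators:
            coded.append(bin(state & g).count("1") % 2)  # parity of tapped bits
    return coded

def xor(x, y):
    """Bitwise sum over GF(2) of two equal-length bit lists."""
    return [p ^ q for p, q in zip(x, y)]

a = [1, 0, 1, 1, 0]
b = [0, 1, 1, 0, 1]
# Linearity: encoding the sum of two messages gives the sum of their codewords.
linear = conv_encode(xor(a, b)) == xor(conv_encode(a), conv_encode(b))
print(linear)
```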




Turbo Decoder Algorithm
    MAP (maximum a posteriori)
        Log-MAP: very high complexity
        Max-Log-MAP: high complexity

    SOVA (Soft-Output Viterbi Algorithm)
        Low performance
        Low complexity
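
   The complexity gap between Log-MAP and Max-Log-MAP comes largely from
   one operation: Log-MAP evaluates log(exp(a) + exp(b)) exactly (the
   Jacobian logarithm), while Max-Log-MAP keeps only max(a, b) and drops
   the correction term. A minimal sketch of the two operations, with
   function names chosen here for illustration:

```python
import math

def max_star(a, b):
    """Exact Jacobian logarithm log(exp(a) + exp(b)), as used by Log-MAP."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def max_log(a, b):
    """Max-Log-MAP approximation: keep the maximum, drop the correction."""
    return max(a, b)

a, b = 1.0, 2.0
exact = max_star(a, b)
approx = max_log(a, b)
error = exact - approx   # the dropped correction term, at most log(2)
print(exact, approx, error)
```

   The dropped term is largest when the two inputs are equal and vanishes
   as they move apart, which is why the approximation trades a modest
   loss in decoding performance for simpler arithmetic.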




Experiment Result

   [Figure: performance comparison of MAP and SOVA as a function of Eb/No]

3G wireless link




System Overview




Rake Receiver




Turbo Decoder




 SOVA Decoder


   [Figure: SOVA turbo decoder block diagram. Inputs: Lcy1, Lcy2, Lcy3.
    Blocks: SOVA Decoder1, Interleaver, SOVA Decoder2, Deinterleaver.
    Signals: L1(u'), Le1(u'), L2(u'). Output: decoded data u'.]

Costs




Too many parameters




Plot - rake




Plot - Turbo




Conclusion




Complex QPSK

   [Figure: complex QPSK block diagram. Inputs: YI1/YQ1, YI2/YQ2, YI3/YQ3.
    Blocks: Walsh function, QOFsign, Walsh rot (enable), rotate by 90 degrees
    when enabled (output Qin + j Iin), PN I, PN Q, baseband filters for I and Q,
    CQPSK 2, CQPSK 3. Outputs: f1, f2, f3 (transmitted over two antennas).]
Discussion: Experiment Result

   Relation Between Finger and Iterations at 0dB

       Cost    1Finger    2Finger    3Finger
        500    1          0.97       0.96
       1000    0.84       0.78       0.91
       1500    0.7        0.54       0.58
       2000    0.64       0.46       0.49
       2500    0.63       0.42       0.48
       3000    0.6        0.4        0.43
       3500    0.58       0.38       0.41
       4000    0.56       0.37       0.4

   Relation Between Finger and Iterations at 10dB

       Cost    1Finger    2Finger    3Finger
        500    1          1          1
       1000    0.63       0.52       0.48
       1500    0.43       0.36       0.37
       2000    0.34       0.32       0.31
       2500    0.3        0.31       0.29
       3000    0.29       0.3        0.28
       3500    0.28       0.29       0.28
       4000    0.27       0.29       0.28




Conclusion

   Reconfigurable systems are suitable for the dynamic
    application and communication environment of wireless
    multimedia devices.

   A hierarchical system model is used in which Quality of
    Service and energy consumption play a crucial role.

   This model is used to dynamically partition tasks of an
    application.


