Author manuscript, published in "48th IEEE Conference on Decision and Control (CDC'09), Shanghai : China (2009)"
hal-00404053, version 1 - 6 Apr 2011

Energy Consumption Reduction with Low Computational Needs
in Multicore Systems with Energy-Performance Tradeoff

Sylvain Durand and Nicolas Marchand

S. Durand is with NeCS Project-Team, INRIA - GIPSA-lab - CNRS, Grenoble, France, sylvain.durand@inrialpes.fr
N. Marchand is with NeCS Project-Team, INRIA - GIPSA-lab - CNRS, Grenoble, France, nicolas.marchand@gipsa-lab.inpg.fr

Abstract: A two-voltage-level electronic device is interesting because the clock frequency and the supply voltage level can be reduced (respecting certain rules) in order to decrease the energy consumption. We proposed in a previous paper a robust control architecture to deal with this power-performance tradeoff, and we are now interested in extending this principle to several devices which work together, since they are all supplied with the same voltage and clock frequency. Thus, an intuitive multicore control strategy, which duplicates the whole monocore architecture as many times as there are devices, is compared with a second strategy where the duplication is reduced as much as possible. It appears that the proposal clearly lowers the control computational needs while achieving the same reduction of the energy consumption.

I. INTRODUCTION

An energy-performance tradeoff is required in many embedded electronic systems. Three power consumption sources exist in CMOS circuits [4], which can be sorted into a dynamic consumption from the switching of electrical gates and a static consumption from short-circuit and leakage currents:

$$P = P_{\text{switching}} + P_{\text{short circuit}} + P_{\text{leakage}}$$
$$P = K_{dyn} f_{clk} V_{dd}^2 + K_{sc} f_{clk} V_{dd} + K_{leak} V_{dd} \qquad (1)$$

It appears that the consumption can be reduced by decreasing $V_{dd}$, i.e. the supply voltage, or $f_{clk}$, i.e. the clock frequency. However, decreasing only the frequency decreases the power consumption but results in a slower running task, so the total energy consumption remains unchanged [12]. The voltage hence has to be reduced in order to decrease the energy consumption. Furthermore, the supply voltage is the dominant term, especially because the dynamic power is the most important part in (1). In other words, decreasing the voltage will almost quadratically decrease the energy consumption. Unfortunately, this drop will decrease the computational speed (because of the propagation delay of transistors), and controlling the supply voltage is hence a power-delay tradeoff: the power consumption decreases while the delay increases. That is why the supply voltage and the clock frequency have to be controlled together to guarantee the critical path (the longest electrical path a signal can travel to go from one point of the circuit to another). Clearly, it is required to decrease the clock frequency before decreasing the supply voltage and, conversely, to increase the supply voltage before increasing the clock frequency.

A good energy-performance tradeoff can be achieved using an approach commonly used in embedded systems: Dynamic Voltage and Frequency Scaling (DVFS). This method consists in adapting the voltage and the frequency to the computational load and leads to an important energy consumption reduction (depending on the application) [10]. Furthermore, it seems that most applications could run with a reduced voltage [2], [3]. Thus, several behaviors are known to minimize the energy consumption. Firstly, each task has to be considered independently and its execution time has to fit with the deadline. Moreover, selecting some suitable voltage levels leads to a drastic energy reduction even if the number of levels is very small [7]. The supply voltage has to be reduced as much as possible and the clock frequency adapted to the computational load to minimize the energy consumption [11].

Based on these different rules, we proposed in [5] a robust strategy to control the clock frequency and the supply voltage level of an electronic device. The proposal minimizes the energy consumption while guaranteeing a good computational performance. We are now interested in extending this principle to several devices which work together (within the same voltage and frequency domain) but where each device has to deal with a different load. In the following section, we first recall the monocore system architecture and summarize its control strategy. In section III, the multicore architecture is then presented and two control strategies are detailed: a first intuitive one which duplicates the monocore principle as many times as there are devices, and a second strategy which considerably reduces the computational needs. Finally, the two controllers are compared in section IV in terms of energy consumption and control computational needs.

II. MONOCORE SYSTEM PRINCIPLE

The system architecture with only one device to control is shown in Figure 1.

[Figure 1: block diagram. The Controller receives $ref$ and the measured speed $\omega$; it sends the frequency $f$ to the Oscillator and $V_{level}$ to the Vdd-hopping, which supply $f_{clk}$ and $V_{dd}$ to the Device; the Device outputs $\omega$.]
Fig. 1. Monocore system architecture

The Device is the system to control. It usually runs at nominal supply voltage and constant clock frequency but
these quantities will now dynamically vary in order to reduce the energy consumption. This is made possible by introducing a closed loop with a controller that monitors the activity of the device (its computational speed $\omega$) and adapts the supply voltage and the clock frequency to the computational load $ref$ provided by the operating system for each task.

The Oscillator and the Vdd-hopping are the two actuators used in some DVFS systems. They respectively provide the clock frequency and the supply voltage to the device:
   • The oscillator could be a ring oscillator [6].
   • The Vdd-hopping principle is described in [1]. Two voltage levels are available ($V_{low}$ and $V_{high}$) and either one can be reached (with a certain transition time and dynamics that depend upon the internal controller of the Vdd-hopping) according to the $V_{level}$ input signal: $V_{level} = level_{low}$ requests the low voltage and $V_{level} = level_{high}$ requests the high voltage.

The Controller has to provide the control signals to the actuators. It can be divided into two parts, as depicted in Figure 2:
   • The computational speed controller (CSC) provides the computational speed set point $\omega_{sp}$. For each task $T_i$ the operating system provides some task information: the computational load (i.e. the number of instructions $C_i$) and the time before which the task has to be executed (i.e. the deadline $N_i$). From this information, a fast predictive control law calculates the best speed set point in order to minimize the penalizing high-voltage running time (and so the energy consumption) while guaranteeing the computational performance.
   • Then the frequency and voltage level controller (FVC) fits the measured speed $\omega$ to the desired one $\omega_{sp}$ by adapting the frequency $f$ and the voltage level $V_{level}$.

[Figure 2: block diagram. The computational speed controller receives the task information ($C_i$) and the measured speed $\omega$ and outputs $\omega_{sp}$; the frequency and voltage level controller receives $\omega_{sp}$ and $\omega$ and outputs $f$ and $V_{level}$ to the monocore system $\Sigma$, which outputs $\omega$.]
Fig. 2. Monocore controller architecture: a computational speed controller (CSC) plus a frequency and voltage level controller (FVC)

The whole monocore controller (CSC + FVC) leads to a robust control (see [5] for further details): for a given test bench, the device runs at the penalizing high supply voltage only during 30% of the time and an energy consumption reduction of about 20% is achieved. We propose next to adapt this principle to a system with several devices.

III. MULTICORE SYSTEM PRINCIPLE

The system architecture with several devices to control is shown in Figure 3. In fact this system is not so different from the monocore one (presented in section II and shown in Figure 1). Indeed the bases remain the same, with a controller which sends the frequency $f$ and the voltage level $V_{level}$ to the actuators, i.e. a ring oscillator and a Vdd-hopping which respectively provide the clock frequency and the supply voltage to the electronic devices. The main difference is that there are now N devices to control, which means as many references $ref^N$ given by the operating system (the number of instructions and the deadline for each task) and as many measured computational speeds $\omega^N$ as devices. Therefore the controller has to control the whole system, but the devices do not work independently since they are all supplied with the same voltage $V_{dd}$ and the same clock frequency $f_{clk}$. The only allowed degree of freedom is to trigger a device with a ratio of the clock $f_{clk}$, because in practice it is possible to add one or two NOPs (i.e. No OPeration) between each instruction so that the device runs two or three times slower. For this reason, the energy controller now also has to provide the frequency ratios $\rho^N$.

[Figure 3: block diagram. The Controller receives $ref^N$ and $\omega^N$; it sends $V_{level}$ (and the frequency) to the Actuators, which supply $f_{clk}$ and $V_{dd}$ to Device 1 ... Device n, and it sends one frequency ratio $\rho^j$ to each device; the devices output their speeds $\omega^N$.]
Fig. 3. Multicore system architecture

Notations: $\rho^j$ (lower case index) denotes the signal $\rho$ of the device j, whereas $\rho^N$ (upper case index) means that there are N signals $\rho$, one for each device.

In the two following subsections we detail two control strategies: a first intuitive one which duplicates the monocore principle as many times as there are devices, and a second one which tries to minimize the computational needs of the controller.

A. Multicore control based on full duplication of the monocore control strategy

A first way to control several devices is to duplicate the monocore control strategy (detailed in section II) as many times as there are devices. The resulting multicore architecture is presented in Figure 4 and can be divided into three steps:
   1) First, the computational speed controller (CSC) calculates the speed set points $\omega_{sp}^N$ for all the devices. The set point $\omega_{sp}^j$ is independently calculated for each device j, using the task information given by the operating system (number of instructions and deadline) and the measured speed $\omega^j$.
   2) Then the frequency and voltage level controller (FVC) independently calculates the frequencies $f^N$ and the voltage levels $V_{level}^N$, as usually required to control a single device.
   3) Finally a frequency ratio controller compares the calculated frequencies $f^N$ to deduce the critical device c, i.e. the device which needs the maximal frequency to fit with its load. Thus the frequency $f$ and the voltage level $V_{level}$ sent to the actuators are those of the critical device, i.e. $f^c$ and $V_{level}^c$, and the frequency ratios $\rho^N$ are obtained as the ratio between the frequency of the current device $f^j$ and that of the critical device $f^c$.
[Figure 4: block diagram. N duplicated CSCs receive the task information and the measured speeds $\omega^N$ and output $\omega_{sp}^N$; N duplicated FVCs output $f_{cor}^N$ and $V_{level}^N$; the frequency ratio controller outputs $f_{cor}$, $V_{level}$ and $\rho^N$ to the multicore system $\Sigma^N$, which outputs $\omega^N$.]
Fig. 4. Multicore control architecture based on full duplication of the monocore control strategy: the computational speed controller (CSC) and the frequency and voltage level controller (FVC) are duplicated as many times as there are devices, and a frequency ratio controller calculates the critical frequency and voltage level to deduce the frequency ratios $\rho^N$

This intuitive strategy guarantees that the tasks are correctly performed for all devices because each device is independently controlled using the monocore strategy. Indeed, the monocore strategy works for one device and we base the frequency and voltage level decision on the critical one, i.e. the device which has to treat the task with the highest computational needs. Thus, all the non-critical tasks are executed with the critical voltage level and a frequency lower than or equal to the critical frequency. Moreover, a non-critical device can become the critical one when its task requires more and more computation.

An improvement can be made for the non-critical devices. If a device runs at the high voltage level, then it is forced to the maximal frequency in order to run for the shortest possible time at the penalizing high supply voltage (see [5] for further details). A non-critical device, which a priori could run at $V_{low}$, will hence have its frequency forced anyway when the critical device needs to run at $V_{high}$. For this reason, we propose to force only the frequency of the critical device. However, in practice the critical device is not known yet when the frequencies are calculated, i.e. in step 2, because the frequency ratio controller determines it in step 3. Fortunately, a solution consists in using the device which was critical during the previous sampling period, under the assumption that the critical device does not often change.

This intuitive duplication of the whole monocore principle reduces the energy consumption of several devices working together while guaranteeing their computational performance. Nevertheless, a consequence is that the control computational needs are multiplied by the number of devices and the number of variables seriously increases too. That is why we propose next to duplicate only some parts of the monocore control strategy.

B. Multicore control based on partial duplication of the monocore control strategy

This second strategy tries to minimize the control computational needs by not blindly duplicating the whole monocore control strategy. The frequency ratios $\rho^N$ need to be calculated, so some parts necessarily have to be duplicated in order to obtain the N signals. The aim is to repeat as little code as possible. The best solution would be to use the references $ref^N$ (given by the operating system) to deduce the ratios without duplicating any part of the monocore strategy, but these signals are not relevant enough. Indeed, the critical task cannot be known only from the number of instructions and the deadline, because the computational load which was already executed is necessary too. Therefore we propose to duplicate the computational speed controller (which seems to have to be repeated anyway). Thus the multicore architecture of Figure 5 is proposed:
   1) First, the computational speed controller (CSC) provides the speed set points $\omega_{sp}^N$, from which the frequency ratios $\rho^N$ can be obtained since they provide information on the remaining computational load.
   2) Then the frequency ratio controller compares all the speed set points $\omega_{sp}^N$ to deduce the critical task c, i.e. the task which needs the maximal speed to fit with its deadline. Thus the speed set point $\omega_{sp}$ and the measured speed $\omega$ sent to the FVC are those of the critical task, i.e. $\omega_{sp}^c$ and $\omega^c$, and the frequency ratios $\rho^N$ are obtained as the ratio between the speed set point of the current device $\omega_{sp}^j$ and that of the critical task $\omega_{sp}^c$.
   3) Finally the frequency and voltage level controller (FVC) calculates the frequency $f$ and the voltage level $V_{level}$ to send to the actuators only for the critical device, i.e. the device which has to compute the critical task.

With this proposal, only the CSC is repeated and no longer the FVC. We thus expect a reduction of the control computational needs without impacting the gain on the energy consumption.

[Figure 5: block diagram. N duplicated CSCs receive $C_i^N$, $N_i^N$ and $\omega^N$ and output $\omega_{sp}^N$; the frequency ratio controller outputs the critical $\omega_{sp}$ and $\omega$ to a single FVC and the ratios $\rho^N$ to the multicore system $\Sigma^N$; the FVC outputs $f_{cor}$ and $V_{level}$.]
Fig. 5. Multicore control architecture based on partial duplication of the monocore control strategy: only the computational speed controller (CSC) is duplicated as many times as there are devices, and then a frequency ratio controller calculates the critical speed set point which will be used by the frequency and voltage level controller (FVC) and deduces the frequency ratios $\rho^N$

Though the devices are not all independently controlled using the monocore strategy, the computational performance is still guaranteed for each device. Indeed, with this second architecture the monocore control strategy only guarantees that the critical task will fit with its deadline, since the monocore control strategy is only applied to the critical device. The frequency ratios for the non-critical devices are then calculated from the computational load of the task of each device, which is finally adjusted thanks to the CSC. Thus all the non-critical tasks are executed by their deadline anyway, or a task becomes the critical one when its computational needs become the most important.

C. Discrete values of the frequency ratios

One could note that the control algorithms proposed in both previous subsections were developed with ideal continuous frequency ratios $\rho^N$. However, as explained in the introduction of the multicore principle, some devices can be triggered with a ratio of the clock frequency $f_{clk}$ by adding NOPs between instructions so that the device runs slower. This is why the frequency ratios can only take a
                                       discrete value which corresponds to the number of NOPs, i.e.
                                       $\varrho = \{1; \tfrac{1}{2}; \tfrac{1}{3}\}$ for 0, 1 or 2 NOPs respectively added between
                                       each instruction (note that the discrete frequency ratios will
                                       be called $\varrho$ and the continuous ones $\rho$).
                                          In order to implement this behavior, we first have to
                                       calculate the continuous ratios ρN (i.e. $\rho_j = f_j^{cor} / f_c^{cor}$ for
                                       the multicore control strategy based on full duplication of
                                       the monocore control strategy and $\rho_j = \omega_j^{sp} / \omega_c^{sp}$ for the
                                       one based on partial duplication). Then, for each device j,
                                       the discrete frequency ratio $\varrho_j$ is chosen as the smallest
                                       allowed value that is not lower than the continuous ratio
                                       $\rho_j$, as depicted by the algorithm below:

                                       \[
                                       \varrho_j =
                                       \begin{cases}
                                       1 & \text{if } \tfrac{1}{2} < \rho_j \le 1 \\
                                       \tfrac{1}{2} & \text{if } \tfrac{1}{3} < \rho_j \le \tfrac{1}{2} \\
                                       \tfrac{1}{3} & \text{if } 0 < \rho_j \le \tfrac{1}{3} \\
                                       0 & \text{otherwise}
                                       \end{cases}
                                       \tag{2}
                                       \]
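
As an illustration, the selection rule (2) amounts to rounding each continuous ratio up to the nearest allowed value. The following Python code is a minimal sketch under that reading, not the implementation used for the simulations: the names `ALLOWED_RATIOS`, `discretize_ratio` and `nops_between_instructions` are hypothetical, and the link between a ratio 1/(k+1) and k NOPs is taken from the description above.

```python
from fractions import Fraction

# Allowed discrete ratios, from slowest to fastest: 2, 1 and 0 NOPs respectively.
ALLOWED_RATIOS = (Fraction(1, 3), Fraction(1, 2), Fraction(1, 1))


def discretize_ratio(rho: float) -> Fraction:
    """Return the smallest allowed ratio that is not lower than rho, as in (2).

    Values outside (0, 1] fall into the "otherwise" branch and give 0.
    """
    if not 0 < rho <= 1:
        return Fraction(0)
    for ratio in ALLOWED_RATIOS:
        if rho <= ratio:
            return ratio
    return Fraction(0)  # not reached: rho <= 1 always matches the last ratio


def nops_between_instructions(ratio: Fraction) -> int:
    """Number of NOPs inserted between instructions for a ratio 1/(k+1)."""
    if ratio <= 0:
        raise ValueError("a zero ratio does not correspond to a NOP count")
    return int(1 / ratio) - 1
```

For instance, a continuous ratio of 0.4 lies between 1/3 and 1/2, so it is rounded up to 1/2, which corresponds to one NOP between instructions. Rounding up rather than to the nearest value is what keeps each non-critical device at least as fast as required.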

                                         This discrete ratio behavior would lead to a less energy
                                       efficient system because the frequencies of the non-critical
hal-00404053, version 1 - 6 Apr 2011

                                       devices will be higher than required, because of (2),
                                       contrary to the continuous case where these frequencies
                                       correspond exactly to the desired ones. Moreover, the control
                                       computational needs would increase slightly because of the
                                       added code required to calculate the discrete ratios.

                                       Fig. 6. References used for the simulations: the number of instructions, the
                                       deadline and the laxity (the remaining available time to execute the task)
                                       for each device

                                                 IV. PERFORMANCE EVALUATION
                                          This section presents some simulation results. The benchmark
                                       test is the same for all the simulations, where four devices,
                                       each with a different reference (number of instructions
                                       and deadline, shown in Figure 6), have to be controlled:
                                       device 1 → three tasks to execute: the first task starts with
                                            5 instructions to do in 0.5µs, then a 75 instruction task
                                            has to be executed in 2.5µs and the last one has to
                                            compute 10 instructions in 1µs.
                                       device 2 → three tasks also: a 15 instruction task to execute
                                            in 1.25µs, a task with 50 instructions to do in 2.25µs
                                            and then 5 instructions to execute in 0.5µs.
                                       device 3 → a single task of 40 instructions to do in 4µs.
                                       device 4 → three tasks again: 10 instructions to compute in
                                            0.75µs, a task with 20 instructions to do in 0.75µs and
                                            a last 40 instruction task to execute in 2.5µs.

                                          First, the simulation results for both control strategies
                                       (with ideal continuous frequency ratios) are shown in
                                       Figures 7 and 8. The top plots show the average speed set point
                                       (as a guideline), the speed set point ωsp, the measured speed ω
                                       and the critical speed ω c (as a guideline) for each device. One
                                       can verify that ω = ω c when the device is the critical one
                                       (highlighted by the gray areas on the plots). Moreover, the supply
                                       voltage Vdd (which is the same for all the devices because
                                       of the multicore architecture) is shown on the bottom plot.
                                       Note that the calculated frequency f or the clock frequency
                                       fclk and the voltage level Vlevel are not plotted because
                                       they do not provide relevant information: the frequencies are
                                       proportional to the speed and the level can be deduced from
                                       the voltage.

                                       Fig. 7. Simulation results of the multicore controller based on full
                                       duplication of the monocore control strategy (with ideal continuous frequency
                                       ratios): energy consumption of 3.976 · 10^-5 J and computational needs
                                       of 5.8 · 10^5 flops, that is 82.2% of energy consumption and 94% of
                                       computational needs compared to a controller without DVS

                                       Fig. 8. Simulation results of the multicore controller based on partial
                                       duplication of the monocore control strategy (with ideal continuous
                                       frequency ratios): energy consumption of 3.98 · 10^-5 J and computational
                                       needs of 3.8 · 10^5 flops, that is 82.4% of energy consumption and 62% of
                                       computational needs compared to a controller without DVS

                                          The results are quantified in terms of energy consumption
                                       and computational needs:
                                       Energy consumption of the system: The energy consumption
                                            is calculated in order to have an idea of the
                                            reduction achieved thanks to our proposal. Thus, the
                                            relation (1) is used and a ratio of this power consumption
                                            is added due to the Vdd-hopping principle: 20% more
                                            during the voltage transition time and 3% more during
                                            the steady state [8]. Finally, an integration over the
                                            whole running time gives the total energy consumption.
                                       Computational needs of the controller: The control laws
                                            are compared in terms of computational needs, i.e. the
                                            number of instructions required to calculate the
                                            computational speed set points, the frequencies, the voltage
                                            levels and the frequency ratios. To do that, we use the
                                            Lightspeed Matlab toolbox proposed by T. Minka [9],
                                            which provides a number of flops for each instruction.
                                          Moreover, the strategies are compared with a system
                                       using the intuitive control strategy (by duplicating the whole
                                       monocore control strategy) but without Dynamic Voltage
                                       Scaling (DVS): in this case the measured speed tracks the
                                       average speed set point and the supply voltage is fixed to the
                                       penalizing high voltage, i.e. Vlevel = levelhigh.
                                          In both cases, the system runs during more than 50% of
                                       the simulation time at low voltage and a reduction of the
                                       energy consumption of about 20% is achieved in comparison
                                       with a system without DVS. The differences between the two

                                       control strategies appear during the voltage transitions and come
                                       from the choice of the critical device:
                                          A) For the multicore control strategy based on full
                                       duplication of the monocore control strategy, one can see in
                                       Figure 7 that the measured speed ω is continuous for all the
                                       devices. This is because the ratios ρN are obtained from the
                                       frequencies f N independently calculated for each device.
                                          B) For the strategy based on partial duplication, one can
                                       see in Figure 8 a discontinuity of the measured speed ω as
                                       soon as the critical device changes, such as at time 2.35µs
                                       on device 2. Indeed, the frequency ratios ρN are obtained
                                       from the speed set points ωsp, which are switching variables
                                       due to their construction (see [5] for further details). Thus
                                       the speed set point value of a device could suddenly change
                                       and so can the ratios. Nevertheless, the critical frequency,
                                       and so the critical speed, remains continuous.
                                          While the energy consumption is very similar for both
                                       strategies, the computational needs are considerably reduced
                                       for the second one, with a drop of 35% of the number of
                                       flops. For this reason, it would be the strategy to use.
                                          Finally we propose to compare the simulation results of the
                                       control strategy with low computational cost, on the one hand
                                       when the frequency ratios are the ideal continuous variables
                                       ρN and on the other hand when they are the discrete variables
                                       described by the algorithm (2). One can immediately
                                       remark that the results, respectively shown in Figures 8
                                       and 9, are quite similar. The main difference is that the
                                       measured speed ω does not track the speed set point ωsp in
                                       the discrete case as well as in the continuous case. However,
                                       the algorithm ensures that the speed will never be lower
                                       than the desired one and so the computational load will be
                                       correctly computed. This is why this principle is interesting
                                       since it only leads to an increase of less than 1% of the
                                       energy consumption and 11% of the control computational
                                       needs in comparison with the continuous frequency ratio case
                                       (which could not be implemented in practice anyway).

                                       Fig. 9. Simulation results of the multicore controller based on partial
                                       duplication of the monocore control strategy (with discrete frequency ratios):
                                       energy consumption of 4 · 10^-5 J and computational needs of 4.3 · 10^5 flops,
                                       that is an increase of 1% of energy consumption and 11% of computational
                                       needs compared to the controller with ideal continuous frequency ratios

                                              V. CONCLUSIONS AND FUTURE WORKS
                                          This paper proposes architectures to control several
                                       devices which work together since they are all supplied with
                                       the same voltage Vdd and the same clock frequency fclk (or a
                                       ratio of this clock). The multicore control strategies are based
                                       on the monocore control strategy depicted in [5], where a fast
                                       predictive control technique gives a computational speed set
                                       point to track in order to minimize the energy consumption
                                       while guaranteeing the computational performance.
                                          While the first multicore control strategy intuitively
                                       duplicates the whole monocore architecture for each device,
                                       the second strategy, which is the contribution of this paper,
                                       tries to minimize the duplication as much as possible in order
                                       to decrease the control computational needs. Both architectures
                                       lead to a similar energy saving (compared to a system without
                                       DVS mechanism) but an important reduction of the number of
                                       flops is achieved with the second one. We finally propose to
                                       use discrete frequency ratios, which are the only way to
                                       implement our controller in practice.
                                          The next step in this research is to test these control
                                       strategies in practice.

                                                     VI. ACKNOWLEDGMENTS
                                          This research has been supported by the NeCS Project-Team
                                       (INRIA, GIPSA-lab, CNRS) in the ARAVIS project context.
                                       The ARAVIS project is a Minalogic project gathering
                                       STMicroelectronics with academic partners of different fields,
                                       namely TIMA and CEA-LETI for micro-electronics and
                                       INRIA for operating system and control. The aim of the
                                       project is to overcome the barrier of subscale technologies
                                       (45nm and smaller).

                                                          REFERENCES
                                        [1] C. Albea, C. Canudas de Wit, and F. Gordillo. Control and stability
                                            analysis for the vdd-hopping mechanism. In Proceedings of the IEEE
                                            Conference on Control and Applications, 2009.
                                        [2] T. Burd and R. Brodersen. Processor design for portable systems. In
                                            The Journal of VLSI Signal Processing, volume 13, pages 203–221,
                                        [3] T. Burd, T. Pering, A. Stratakos, and R. Brodersen. A dynamic
                                            voltage scaled microprocessor system. In IEEE International
                                            Solid-State Circuits Conference Digest of Technical Papers, volume 35,
                                            pages 1571–1580, 2000.
                                        [4] A. Chandrakasan and R. Brodersen. Minimizing power consumption
                                            in digital cmos circuits. In Proceedings of the IEEE, volume 83, pages
                                            498–523, 1995.
                                        [5] S. Durand and N. Marchand. Fast predictive control of micro
                                            controller's energy-performance tradeoff. In Proceedings of the 3rd IEEE
                                            Multi-conference on Systems and Control - 18th IEEE International
                                            Conference on Control Applications, 2009.
                                        [6] S. Fairbanks and S. Moore. Analog micropipeline rings for high
                                            precision timing. In Proceedings of the International Symposium on
                                            Advanced Research in Asynchronous Circuits and Systems, pages
                                            41–50, 2004.
                                        [7] T. Ishihara and H. Yasuura. Voltage scheduling problem for
                                            dynamically variable voltage processors. In Proceedings of the
                                            International Symposium on Low Power Electronics and Design,
                                            pages 197–202, 1998.
                                        [8] S. Miermont, P. Vivet, and M. Renaudin. A power supply selector
                                            for energy- and area-efficient local dynamic voltage scaling. In
                                            PATMOS'07: 17th International Workshop on Power and Timing
                                            Modeling, Optimization and Simulation, pages 556–565, 2007.
                                        [9] T. Minka. The lightspeed matlab toolbox v2.2.
                                            http://research.microsoft.com/~minka/software/lightspeed/.
                                       [10] T. Pering, T. Burd, and R. Brodersen. Voltage scheduling in the
                                            lparm microprocessor system. In Proceedings of the International
                                            Symposium on Low Power Electronics and Design (ISLPED), pages
                                            96–101, 2000.
                                       [11] J. Pouwelse, K. Langendoen, and H. Sips. Dynamic voltage scaling
                                            on a low-power microprocessor. In Proceedings of the 7th Annual
                                            International Conference on Mobile Computing and Networking, pages
                                            251–259, 2001.
                                       [12] A. Varma, B. Ganesh, M. Sen, S. Choudhury, L. Srinivasan, and
                                            J. Bruce. A control-theoretic approach to dynamic voltage scheduling.
                                            In Proceedings of the International Conference on Compilers,
                                            Architecture and Synthesis for Embedded Systems, pages 255–266, 2003.
