Fault-Detection Capability Analysis of a Hardware-Scheduler IP-Core in Electromagnetic Interference Environment

J. Tarrillo (1), L. Bolzani (1), F. Vargas (1), E. Gatti (2), F. Hernandez (3), L. Fraigi (2)
(1) Electrical Engineering Dept., Catholic University – PUCRS. Porto Alegre, Brazil.
(2) Inst. Nacional de Tecnologia Industrial (INTI). Buenos Aires, Argentina.
(3) Universidad ORT. Montevideo, Uruguay.
email@example.com

Motivation

Nowadays, safety-critical embedded systems support real-time (RT) applications that have to respect strict timing constraints. They have to provide logically and temporally correct results!

The high complexity of these systems requires the adoption of Real-Time Operating Systems (RTOS) that manage the task switching process, concurrency between tasks, memory, and time, as well as interrupts.

Understanding the Problem …

The increasing hostility of the electromagnetic environment, caused by the widespread adoption of electronics and in particular of wireless technologies, represents a huge challenge for the reliability of RT embedded systems.

Electromagnetic interference (EMI) may induce Power Supply Disturbances (PSD) that can generate transient faults.

These faults can affect not only the applications running on embedded systems but also the RTOS executing the application code, by causing scheduling dysfunctions that could lead to incorrect system behavior.

Understanding the Problem …

Several solutions have been proposed. However, they provide fault tolerance only at the application level and do NOT consider faults affecting the RTOS that propagate to application tasks.

E.g., about 34% of the faults injected in the processor's registers led to scheduling dysfunctions:
- 44% of these dysfunctions led to system crashes,
- 34% caused RT problems, and
- 22% generated incorrect outputs.

If not detected at the RTOS level, these faults escape detection by conventional (app-level) techniques as well (they propagate to system outputs)!

Goal

In this context, we propose a Hardware-based Scheduler (Hw-S) IP core to improve the robustness of embedded systems based on RTOS. The Hw-S targets faults that are NOT detected by the native structures present in the RTOS kernel.

Summary

1. The Proposed Approach
2. Practical Experiments
3. Discussion: The Benefits
4. Conclusions

1. The Proposed Approach

The Hw-S is attached to the embedded system and observes two kinds of information:
- Events: tick, interrupts, ... (reference for switching the task context).
- Memory addresses accessed by the processor.

The Hw-S identifies the task currently under execution and correlates it with the information stored in an Address Table generated during the compilation process.

[Figure: Block diagram of the target embedded system]

The Hw-S is composed of three blocks and raises an Error Indication to the system level:
- One block is in charge of identifying the task under execution, based on the addresses accessed by the CPU and on the information stored in an Address Table generated during the compilation process.
- One block, based on the tick and on any other event (interrupts), is in charge of defining the Time Limit (tl) for the processor to execute each task, as well as detecting the events that can possibly interrupt the task in execution.
- One block implements the scheduling algorithm based on the RTOS kernel and provides fault detection according to:
  - the task in execution,
  - the analysis of the tl, and
  - the events (interrupts) that can influence the RT system.

[Figure: Block diagram of the Hw-S]

[Figure: Context Switch and Time Limit. Labels: External Event; Current task retirement into the execution queue; Next task recovery from the execution queue; Time for Context Switching (Δ time, proportional to the number and complexity of resources used by the RTOS); Time Limit for Switching Context]

Regarding the fault detection capability, the Hw-S targets two types of faults:
- Sequence error (E_seq): detected at the end of the Time Limit, tl, when the current task is not the one expected according to the task execution flow.
- Time error (E_time): detected when a task switch takes place between two consecutive context switching events (e.g., two consecutive ticks), thus violating the time constraints associated with the real-time system.

2. Practical Experiments

Case study: Von Neumann 32-bit RISC Plasma microprocessor running an RTOS (opencores.org). Plasma's instruction set is compatible with the MIPS architecture.

We developed and validated three benchmarks that exploit different services offered by the Plasma's RTOS:
- BM1: Tasks T1, T2 and T3 access and update the values of three different global variables (Variable 1, Variable 2 and Variable 3).
- BM2: Tasks T1 and T2 communicate by message queue (QM). T1 sends a value to the queue and T2 reads this value. Task T3 writes a value into a global variable (Variable 3).
- BM3: Tasks T1, T2 and T3 access a global variable which is protected by a mutual exclusion semaphore (MUTEX).

[Figure: Block diagram of the test board (top and bottom sides): FPGA0 and FPGA1, each with its own SRAM and Flash (32-bit and 8-bit buses), 8051 microcontroller, RS232 ports, temperature sensor, CLK, glue logic, and separate power supplies. Test board designed for IEC 62132-2 and IEC 61000-4-29 electromagnetic susceptibility analysis]

Test Conditions:
- GTEM cell
- Frequency range: 150 kHz to 3 GHz
- Field range: 10 to 200 V/m
- Signal modulation: AM 80%
- Total exposure time: 27 hours

[Figure: Fault injection environment. Labels: Test Host Computer; RF Noise Generator and Amplifier; Power-Supply Noise Generator Board; Test Board and Shielding Box]

Injected noise at the FPGA power bus (conducted EMI): voltage dips of 4.2%, from 1.2 volts to 1.15 volts.

Summary of the obtained results

Number of erroneous outputs observed per benchmark after 27 hours: 65

Benchmark   RTOS detection [%]   Hw-S detection [%]   RTOS/Hw-S latency [clock cycles]   FPGA configuration lost [%]
BM1         33.8                 100.0                1523                               7.7
BM2         43.1                 100.0                498                                1.5
BM3         1.5                  100.0                810                                -

- Minimum fault latency
- Highest fault detection
- Coverage of faults that propagated to outputs

After inspection …

Errors detected by the Hw-S:
- Time_Errors: the CPU switched to another task between two consecutive ticks.
- Sequence_Errors: the CPU executed an unexpected task from the Task Execution Queue.

Errors reported by the RTOS:
- The RTOS lost the information associated with the "next thread", preventing the CPU from switching to the next task in the execution queue.
- The RTOS lost "semaphore information", preventing the CPU from continuing the proper execution of the tasks.

Migrate to HW the weakest reliability points of the RTOS.

[Figure: Percentage of E_seq and E_time detected by the Hw-S]
[Figure: Percentage of assert() sent by the RTOS]

4. Final Conclusions

- We presented a Hardware-based Scheduler (Hw-S) IP core to improve the robustness of embedded systems based on RTOS.
- The Hw-S targets scheduling dysfunctions that could lead to incorrect system behavior.
- These faults are NOT detected by the native structures present in the RTOS kernel.
- The IP core is attached to the processor bus to monitor the task execution flow.
- Practical experiments indicate that the technique is effective in increasing the fault detection coverage provided by the RTOS-native structures.
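Illustrative sketch of the E_seq and E_time checks

The two checks from Section 1 (Sequence error and Time error) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware implementation: the layout of the address table, the fixed round-robin task order, the treatment of the tick as the only legitimate switching event, and all names (AddressTable, HwSchedulerModel, observe) are hypothetical. The context-switch time Δ is ignored.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class AddressTable:
    """Maps code address ranges to task names (hypothetical layout)."""
    ranges: list  # sorted list of (start, end_exclusive, task_name)

    def task_at(self, address):
        starts = [r[0] for r in self.ranges]
        i = bisect_right(starts, address) - 1
        if i >= 0 and address < self.ranges[i][1]:
            return self.ranges[i][2]
        return None  # address belongs to no known task


@dataclass
class HwSchedulerModel:
    """Software model of the two checks; assumes a round-robin order."""
    table: AddressTable
    order: list              # expected task sequence (assumption)
    current: str = None
    slot: int = 0
    errors: list = field(default_factory=list)

    def __post_init__(self):
        self.current = self.order[0]

    def observe(self, address, tick=False):
        """Feed one observed address; tick=True marks the end of tl."""
        task = self.table.task_at(address)
        if tick:
            # End of the time limit: the next task in the order is expected.
            self.slot = (self.slot + 1) % len(self.order)
            expected = self.order[self.slot]
            if task != expected:
                self.errors.append(("E_seq", expected, task))
            self.current = task if task is not None else expected
        elif task is not None and task != self.current:
            # Task changed between two consecutive ticks.
            self.errors.append(("E_time", self.current, task))
            self.current = task


if __name__ == "__main__":
    table = AddressTable([(0x100, 0x200, "T1"),
                          (0x200, 0x300, "T2"),
                          (0x300, 0x400, "T3")])
    hw = HwSchedulerModel(table, ["T1", "T2", "T3"])
    hw.observe(0x104)             # T1 running, as expected
    hw.observe(0x204, tick=True)  # tick: switch to T2, as expected
    hw.observe(0x304)             # T3 runs without a tick: time error
    hw.observe(0x104, tick=True)  # tick: T3 expected, T1 seen: sequence error
    print(hw.errors)
```

In this model an address stream is replayed through observe(); a hardware version would instead snoop the processor bus and the tick signal, as the block diagrams describe.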