Soft Error Derating Computation in Sequential Circuits
Hossein Asadi and Mehdi Tahoori
Department of Electrical & Computer Engineering, Northeastern University

Objective
- Estimating the probability of system failure
  - Due to single event transients (SETs or SEUs)
    - In logic gates
  - System failure
    - Sequential logic circuits
    - Probability of SETs captured in system bistables

Outline
- Introduction: Background and Previous Work
- Error Event Propagation
- Algorithm
- Validation and Experimental Results
- Conclusions

Soft Errors
- Major impact of radiation
  - Single event upsets (aka SEUs, soft errors)
  - Electron-hole pairs created
  - Absorbed by diffusion
  - Logic value changed (charge dependent): 0 → 1 or 1 → 0
- Sources of radiation
  - Alpha particles
    - Packaging material
  - Neutrons from cosmic rays
    - Increases with altitude

Soft Error Rate
- System failure rate due to SEUs
  - Error at system outputs
- Error rate of node ni (gate output) = Nominal FIT × Logic Derating × Timing Derating
  - Nominal FIT: Occurrence rate of SEUs at node ni causing a glitch
  - Logic Derating: Propagation of error from node ni to system bistables or outputs
  - Timing Derating: Propagated transient captured in system bistables

Previous Work
- Fault Injection
  - Random vector simulation
  - Random transient injection
  - Dynamic (timing accurate) simulation
  - [Maheshwari DFT’03, Mohanram ITC’03, Mohanram DFT’03, Nguyen IRPS’03, Shivakumar DSN’02, Zhou ICCAD’04]
- Problems
  - Time-consuming / Incomplete / Expensive
  - Poor soft error diagnosability/debug

Previous Work
- Binary decision diagrams (BDDs) [Zhang SELSE’05, ISQED’06]
- Complete BDD
  - Exact soft error rate (SER)
  - Exponential run-time with circuit size
  - Infeasible for even medium-size circuits
- BDD with Circuit Partitioning
  - Faster than fault injection
  - Still time-consuming for large circuits
  - Less accurate than fault injection

Our Approach
- Analytical method
  - No need for vector simulation
- Using well-known concepts for SER
  estimation
  - Signal probability (SP)
    - Effective techniques available for SP computation
    - Can be reused from power estimation phase
  - Static timing analysis
    - Very fast
    - Input vector independent

Glitch Propagation: Simple Path
- Duration and time of propagated glitch
  - Depend on propagation and transition delays along the path
- Glitch propagation probability
  - Depends on signal probabilities of off-path signals & type of gates
- Error propagation probability (EPP)
  - Propagation probability (PP)
  - Latching Probability (LP)
[Figure: glitch propagating from the SEU site to a flip-flop (FF). Labels: gates A, D, E; off-path signals B and C with SP_B = 0.2 and SP_C = 0.4; glitch widths w, w' and times t, t'.]

Latching Probability
- $LP = (S + H + W) / T$
  - S, H: Setup and Hold time
  - W: glitch width
  - T: clock period
- overlap window = W + S + H
[Figure: timing diagram of the overlap window (S, H, W) within the clock period T.]

Static Propagation Probability
- An approach presented in [Asadi DATE’05, Asadi ISCAS’05]
- Using off-path signal probabilities
- Propagation rules
  - On-path gates: $P_a(U_i) + P_{\bar a}(U_i) + P_1(U_i) + P_0(U_i) = 1$
  - Off-path gates: $P_1(U_i) + P_0(U_i) = 1$
- Gate rules
  - AND
    - $P_1(out) = \prod_{i=1}^{n} P_1(X_i)$
    - $P_a(out) = \prod_{i=1}^{n} [P_1(X_i) + P_a(X_i)] - P_1(out)$
    - $P_{\bar a}(out) = \prod_{i=1}^{n} [P_1(X_i) + P_{\bar a}(X_i)] - P_1(out)$
    - $P_0(out) = 1 - [P_1(out) + P_a(out) + P_{\bar a}(out)]$

Static Propagation Probability (Cont.)
- Gate rules
  - OR
    - $P_0(out) = \prod_{i=1}^{n} P_0(X_i)$
    - $P_a(out) = \prod_{i=1}^{n} [P_0(X_i) + P_a(X_i)] - P_0(out)$
    - $P_{\bar a}(out) = \prod_{i=1}^{n} [P_0(X_i) + P_{\bar a}(X_i)] - P_0(out)$
    - $P_1(out) = 1 - [P_0(out) + P_a(out) + P_{\bar a}(out)]$
  - NOT
    - $P_0(out) = P_1(in)$
    - $P_a(out) = P_{\bar a}(in)$
    - $P_{\bar a}(out) = P_a(in)$
    - $P_1(out) = P_0(in)$

Reconvergent Paths
- Error propagated to at least two inputs of a gate
- Propagated waveforms
  - Multiple waveforms, not simple glitches
- EPP from node i to flip-flop FF_j: obtained from PP_i × LP_i over all propagated waveforms i
[Figure: SEU propagating over reconvergent paths 1 and 2 (with inverted waveforms) to a flip-flop (FF).]

Approach
- Find all possible propagated waveforms
  - Enhanced static timing analysis
  - Record all possible transitions at each reachable gate
    - Due to glitch at error site
- How?
  - Glitch of width w, starting at time t
    - Represented by two events: (a,t), (ā,t+w)
    - For both positive and negative glitches
  - Inject two events (a,t), (ā,t+w) at error site
  - Find all events at the outputs of all on-path gates

Approach (cont.)
- Find the probability of each event
  - Using error propagation rules
  - For output of each on-path gate
  - For each time with an active event
    - Calculate Pa, Pā, P1, and P0
- Too many events per gate?
  - Maximum size of event list: S298: 13, S386: 9, S526: 13
  - Not an issue!

Approach (cont.)
- What do we have?
  - At the input of each reachable flip-flop
    - Series of timed a, ā events with probabilities
- What do we need?
  - All possible propagated waveforms
    - With propagation probabilities
- How?
  - Valid waveform: a series of aā (āa) events
    - With an equal number of a and ā
    - E.g. (ā,a) or (a,ā,a,ā)
    - What about (a,ā,ā,a,ā), (a,a,ā)?
  - Need to generate all valid waveforms from event list

Algorithm: Overall
START
1. List(Gi) ← Extract on-path gates reachable from Gi
2. List(Gi) ← Sort List(Gi) based on distance from Gi
3. Event_List(Gi) ← Add_Event(a, time=t)
4. Event_List(Gi) ← Add_Event(ā, time=t+w)
5. Is there any gate (Gj) in List(Gi) left?
   - Yes: Propagate events through gate Gj, then repeat this check
   - No: Compute timing-logic derating for propagated events

Algorithm: Events Propagation
1. Is there any gate (Gj) in List(Gi) left?
   - No: Compute timing-logic derating for propagated events
   - Yes: continue with the inputs of Gj
2. Is there any input (k) of Gj left?
   - Yes: Event_List(Gj) ← Add_event_list(k), then repeat this check
   - No: continue with the events of Gj
3. Is there any event (E) in Event_list(Gj) left?
   - Yes: Apply_propagation_rules(E), then repeat this check
   - No: return to the gate check

Algorithm: Derating Computation
1. TD(Gi) = 1
2. Is there any flip-flop (FFj) in List(Gi) left?
   - No: TD(Gi) = 1 - TD(Gi), then END
   - Yes: TD(Gi→FFj) = 0, then continue with the waveforms at FFj
3. Is there any valid waveform (pk) in Event_list(FFj) left?
   - Yes:
     - PPk = Propagation_probability(pk)
     - LPk = Latching_probability(pk)
     - TD(Gi→FFj) = TD(Gi→FFj) + PPk × LPk, then repeat this check
   - No: TD(Gi) = TD(Gi) × (1 - TD(Gi→FFj)), then return to the flip-flop check

Example
[Figure: example circuit with nodes A to H and a flip-flop (FF); an SEU glitch of width 1 is injected at A. The overbars distinguishing a from ā in the event annotations below are not recoverable from the extracted text.]
- Signal probabilities: SP_A = 0.3, SP_B = 0.2, SP_C = 0.3, SP_D = 0.1, SP_E = 0.7, SP_F = 0.7, SP_G = 0.5, SP_H = 0.7
- Delays: Delay(D) = 5, Delay(E) = 3, Delay(G) = 5, Delay(H) = 8
- Event annotations
  - T=0: P(A) = 1(a)
  - T=1: P(A) = 1(a)
  - T=3: P(A) = 1(a)
  - T=4: P(A) = 1(a)
  - T=5: P(D) = 0.2(a) + 0.8(0)
  - T=6: P(D) = 0.2(a) + 0.8(0)
  - T=8: P(G) = 0.7(a) + 0.3(0)
  - T=9: P(G) = 0.7(a) + 0.3(0)
  - T=13: P(H) = 0.28(0) + 0.07(a) + 0.65(1)
  - T=14: P(H) = 0.28(0) + 0.07(a) + 0.65(1)
  - T=16: P(H) = 0.168(0) + 0.532(a) + 0.3(1)
  - T=17: P(H) = 0.168(0) + 0.392(a) + 0.042(a) + 0.398(1)
- Propagated waveforms at the flip-flop input
  - Transitions at 13 and 14: prob = 0.07 × 0.07 × 0.468 × 0.566 = 0.0013
  - Transitions at 16 and 17: prob = 0.93 × 0.93 × 0.532 × 0.392 = 0.1804
  - Transitions at 13 and 16: prob = 0.07 × 0.93 × 0.532 × 0.566 = 0.0196
  - Transitions at 14 and 17: prob = 0.93 × 0.07 × 0.468 × 0.392 = 0.0119

Validation: Monte-Carlo Simulation
- Inject glitches at the outputs of random gates
  - At random times
- Perform timing-accurate simulation
  - Identify whether the error is captured in a flip-flop
  - Compute soft error rate
- Simulation termination
  - Computed value reaches confidence interval
    - For example 3% margin (97% accuracy)
  - Simulation doesn’t converge after N iterations
    - Too time-consuming

Results: Simulation Time
- Setup: DELL Precision 450 equipped with 2 GB main memory
- Our approach: 4-5 orders of magnitude faster than MC-sim

Results: Accuracy
- Within 2% of MC simulation
- Very small dependency on signal probability accuracy

Conclusions
- Analytical approach to compute timing-logic derating
  - The probability that SETs (or glitches) are latched in FFs
- Based on
  - Enhanced static timing analysis
  - Signal probability
- Time complexity
  - Polynomial time complexity
  - Very fast: 10 min for largest ISCAS89
    circuits
- Accuracy
  - 98% accurate, on average, for ISCAS89 circuits
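
The gate propagation rules, the latching probability LP = (S+H+W)/T, and the derating accumulation from the slides can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `SigProb`, `off_path`, `and_gate`, `or_gate`, `not_gate`, `latching_probability`, and `timing_logic_derating` are hypothetical, the clamp of LP to 1 is an added assumption, and waveform enumeration by enhanced static timing analysis is not covered (the waveform probabilities are taken as given inputs).

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class SigProb:
    """Four-valued state of a signal: erroneous a, erroneous a-bar, logic 1, logic 0."""
    pa: float = 0.0
    pa_bar: float = 0.0
    p1: float = 0.0
    p0: float = 0.0


def off_path(sp: float) -> SigProb:
    """Off-path signal: only P1 and P0 are non-zero (P1 + P0 = 1)."""
    return SigProb(p1=sp, p0=1.0 - sp)


def and_gate(inputs: list[SigProb]) -> SigProb:
    """AND rule: products over P1 (+ Pa or Pa-bar), P0 as the remainder."""
    p1 = prod(x.p1 for x in inputs)
    pa = prod(x.p1 + x.pa for x in inputs) - p1
    pa_bar = prod(x.p1 + x.pa_bar for x in inputs) - p1
    return SigProb(pa, pa_bar, p1, 1.0 - (p1 + pa + pa_bar))


def or_gate(inputs: list[SigProb]) -> SigProb:
    """OR rule: products over P0 (+ Pa or Pa-bar), P1 as the remainder."""
    p0 = prod(x.p0 for x in inputs)
    pa = prod(x.p0 + x.pa for x in inputs) - p0
    pa_bar = prod(x.p0 + x.pa_bar for x in inputs) - p0
    return SigProb(pa, pa_bar, 1.0 - (p0 + pa + pa_bar), p0)


def not_gate(x: SigProb) -> SigProb:
    """NOT rule: swap a with a-bar and 1 with 0."""
    return SigProb(pa=x.pa_bar, pa_bar=x.pa, p1=x.p0, p0=x.p1)


def latching_probability(w: float, setup: float, hold: float, period: float) -> float:
    """LP = (S + H + W) / T; clamping to 1.0 is an assumption, not from the slides."""
    return min(1.0, (setup + hold + w) / period)


def timing_logic_derating(waveforms_per_ff: list[list[tuple[float, float]]]) -> float:
    """Follow the derating flowchart: sum PP*LP per flip-flop, then 1 - prod(1 - TD_ff).

    waveforms_per_ff holds, for each reachable flip-flop, the (PP, LP) pairs of
    its valid waveforms (hypothetical input format).
    """
    not_latched = 1.0
    for waveforms in waveforms_per_ff:
        td_ff = sum(pp * lp for pp, lp in waveforms)
        not_latched *= 1.0 - td_ff
    return 1.0 - not_latched


# Error site carries the event a with probability 1; the off-path input has SP = 0.2.
error_site = SigProb(pa=1.0)
d = and_gate([error_site, off_path(0.2)])
print(d)
print(timing_logic_derating([[(d.pa, latching_probability(1.0, 0.5, 0.5, 10.0))]]))
```

With an AND gate and an off-path signal probability of 0.2, the sketch yields Pa = 0.2 and P0 = 0.8 at the gate output, which agrees with the P(D) annotation in the Example slide.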