NAND flash memory is a better storage solution than a hard drive, and its advantage is clearest in low-capacity applications of no more than 4 GB. As people continue to seek products with lower power consumption, lighter weight and better performance, NAND is proving attractive.
Error Correction for Multi-level NAND Flash Memory Using Reed-Solomon Codes

Bainan Chen, Xinmiao Zhang (Case Western Reserve University)
Zhongfeng Wang (Broadcom Corporation)
VLSI System Architecture Laboratory

Error Correction for Multi-level NAND Flash Memory

- Multi-level flash memory
  - Multiple bits are stored in each memory cell
  - Increased storage density, but decreased reliability
- NAND flash memory: large page size, e.g. 8K bits
- Prior error correction schemes for flash memory
  - Hamming code: single-bit cell flash memory
  - BCH codes: multi-level NAND flash memory
- Reed-Solomon (RS) codes

Outline

- Gray bit-mapping scheme for multi-level flash memory
- Performance of BCH and RS codes for multi-level flash memory
- Hardware complexity and throughput analysis for BCH and RS decoders
- Conclusions

Flash Memory Model and Gray Mapping

Cell threshold voltage distribution model for 2-bit/cell multi-level flash memories (G. Atwood et al., "Intel StrataFlash memory technology overview," Intel Technology Journal, 1997):

Level             3      2      1      0
Direct mapping    11     10     01     00
Gray mapping      00     10     11     01
Mean (V)          0      3.25   4.55   6.5
Deviation (V)     4σ     σ      σ      2σ

Performance of BCH and RS codes

[Figure: performance curves for the BCH and RS codes; the plot itself was not recovered. Gains annotated on the figure: 0.2 dB and 0.02 dB.]

Hardware Complexity and Throughput Analysis for BCH and RS Decoders

Hard-decision Decoding for BCH and RS Codes

Decoding steps using the Berlekamp-Massey algorithm:
- Syndrome computation
- Key equation solver
- Chien search for the calculation of error locations
- Forney's algorithm for the calculation of error magnitudes

Case study: (828, 820) RS code over GF(2^10) vs. (8248, 8192) BCH code over GF(2^14). Both are t = 4 error-correcting codes.

Syndrome Computation

RS codes                    BCH codes (p-parallel)
2t constant multipliers     pt constant multipliers
2t adders                   pt adders
                            t squarers

Code                        Gate count    Latency
(828, 820) RS               116 XOR       828 clks
(8248, 8192) BCH (p = 8)    339 XOR       1031 clks

Key Equation Solver

Key equation: (1 + S(x))Λ(x) = Ω(x) mod x^(2t+1)

[Figure: block diagram of the key equation solver with a control unit, a processing element (PE) and rows of delay registers (D); not recovered.]

RS codes
- 3t+1 registers in each row
- (3t+1) · 2t = 6t^2 + 2t clock cycles

BCH codes
- 2t registers in each row
- 2t × t = 2t^2 clock cycles

Code                 Gate count                     Latency
(828, 820) RS        212 XOR + 200 AND + 30 MUX     104 clks
(8248, 8192) BCH     456 XOR + 392 AND + 42 MUX     32 clks

Chien Search and Forney's Algorithm

- Error locations: roots of [formula not recovered]
- Error magnitudes: [formula not recovered]

RS codes                    BCH codes (p-parallel)
2t constant multipliers     pt constant multipliers
2t-1 adders                 p(t-1) adders
1 multiplier
1 inverter

Code                        Gate count                                       Latency
(828, 820) RS               351 XOR + 270 AND + 36 OR + 5 NOT + 90 MUX       1024 clks
(8248, 8192) BCH (p = 8)    727 XOR + 104 OR + 56 MUX                        1031 clks

Decoding Latency of RS and BCH Codes

Clock cycles                                    RS (828, 820)    BCH (8248, 8192)
Syndrome computation                            828              1031
Key equation solver                             104              32
Chien search and error magnitude computation    1024             1031
Overall latency                                 1024             1063

Complexity and Throughput Comparisons

                    (828, 820) RS decoder                      (8248, 8192) BCH decoder
Total area          679 XOR + 470 AND + 45 OR + 5 NOT          1522 XOR + 392 AND + 104 OR
                    + 120 MUX + 534 registers                  + 98 MUX + 750 registers
Critical path       6 XOR + 1 AND + 1 MUX                      15 XOR + 1 AND + 1 MUX
Decoding latency    1024 clock cycles                          1063 clock cycles

The RS decoder can achieve 121% higher throughput than the BCH decoder with 66% of the area.

Why is the RS decoder faster and smaller?

- RS codes are constructed over finite fields of lower order
  - Finite field multipliers are smaller and have a shorter critical path
- Syndrome computation involves fewer coefficients
- Chien search needs to be carried out over a smaller number of finite field elements

Summary

- The Gray bit-mapping scheme leads to extra coding gain without any overhead
- RS codes with similar code length and rate can achieve better error-correcting performance than BCH codes in multi-level flash memory
- RS decoders with similar code length and rate have lower complexity than BCH decoders
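
Several of the numbers on the slides can be cross-checked with a few lines of arithmetic. The Python sketch below is not from the presentation. It counts how many bits differ between neighbouring threshold levels under the direct and Gray mappings, evaluates the two key-equation-solver latency formulas for t = 4, and estimates the relative throughput of the two decoders. The function names are hypothetical, and the throughput estimate rests on two assumptions that the slides do not state: that every gate on the critical path contributes one unit of delay, and that throughput is measured in information bits per codeword.

```python
# Bit labels per threshold-voltage level, copied from the Gray-mapping slide.
DIRECT = {3: "11", 2: "10", 1: "01", 0: "00"}
GRAY = {3: "00", 2: "10", 1: "11", 0: "01"}


def adjacent_bit_flips(mapping):
    """Number of differing bits between the labels of neighbouring levels."""
    levels = sorted(mapping)
    return [
        sum(a != b for a, b in zip(mapping[lo], mapping[hi]))
        for lo, hi in zip(levels, levels[1:])
    ]


def kes_cycles_rs(t):
    """Key equation solver latency quoted for RS codes: (3t + 1) * 2t."""
    return (3 * t + 1) * 2 * t


def kes_cycles_bch(t):
    """Key equation solver latency quoted for BCH codes: 2t * t."""
    return 2 * t * t


def throughput(info_bits, latency_clks, critical_path_gates):
    """Bits per unit time, ASSUMING one unit of delay per critical-path gate."""
    return info_bits / (latency_clks * critical_path_gates)


if __name__ == "__main__":
    print(adjacent_bit_flips(DIRECT))  # [1, 2, 1]
    print(adjacent_bit_flips(GRAY))    # [1, 1, 1]
    print(kes_cycles_rs(4), kes_cycles_bch(4))  # 104 32

    # RS: 820 ten-bit symbols, 1024 clks, 6 XOR + 1 AND + 1 MUX = 8 gates.
    rs = throughput(820 * 10, 1024, 8)
    # BCH: 8192 bits, 1063 clks, 15 XOR + 1 AND + 1 MUX = 17 gates.
    bch = throughput(8192, 1063, 17)
    print(f"RS throughput gain: {100 * (rs / bch - 1):.0f}%")  # 121%
```

Under the direct mapping, a shift between levels 1 and 2 corrupts both stored bits, while the Gray mapping limits every neighbouring-level error to a single bit. With the unit-delay assumption, the throughput ratio comes out at roughly 2.21, which is consistent with the 121% figure quoted on the comparison slide. A different weighting of gate delays would shift that estimate.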