									Error Correction for Multi-level NAND Flash
   Memory Using Reed-Solomon Codes

            Bainan Chen, Xinmiao Zhang
             Case Western Reserve University

                  Zhongfeng Wang
                  Broadcom Corporation
Error Correction for Multi-level NAND Flash Memory

     Multi-level flash memory
       Multiple bits are stored in each memory cell
       Increased storage density, but decreased reliability
       NAND flash memory – large page size, e.g. 8K bits

     Prior error correction schemes for flash memory
       Hamming code – single-bit cell flash memory
       BCH codes – multi-level NAND flash memory


               Reed-Solomon (RS) codes

                     VLSI System Architecture Laboratory      Page 2
                     Outline
Gray bit-mapping scheme for multi-level flash memory
Performance of BCH and RS codes for multi-level flash memory
Hardware complexity and throughput analysis for BCH and RS decoders
Conclusions




    Flash Memory Model and Gray Mapping
Cell threshold voltage distribution model
 for 2-bit/cell multi-level flash memories




[1] G. Atwood et al., “Intel StrataFlash memory technology overview,” Intel Technology Journal, 1997


Level                    3        2        1        0
Direct mapping [1]      11       10       01       00
Gray mapping            00       10       11       01
Mean (V)                 0     3.25     4.55      6.5
Deviation (V)           4σ        σ        σ       2σ
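
A small sketch of why the Gray mapping helps, using the two mappings from the table above. When a cell is misread as a neighboring level, the Gray mapping flips exactly one bit, while the direct mapping can flip two. The function and variable names are illustrative and do not come from the slides.

```python
# Levels in the order they appear in the table (neighbors are adjacent entries).
LEVELS = [3, 2, 1, 0]

# Bit patterns copied from the table rows.
DIRECT = {3: 0b11, 2: 0b10, 1: 0b01, 0: 0b00}
GRAY = {3: 0b00, 2: 0b10, 1: 0b11, 0: 0b01}


def adjacent_bit_errors(mapping):
    """Bit errors caused by misreading each level as its neighbor."""
    return [
        bin(mapping[a] ^ mapping[b]).count("1")
        for a, b in zip(LEVELS, LEVELS[1:])
    ]


direct_errors = adjacent_bit_errors(DIRECT)  # level 2 <-> 1 flips both bits
gray_errors = adjacent_bit_errors(GRAY)      # every neighbor pair flips one bit
print("direct:", direct_errors)
print("gray:  ", gray_errors)
```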

Performance of BCH and RS codes




[Figure: error-rate curves for BCH and RS codes; the plot is annotated with gaps of 0.2 dB and 0.02 dB]




Hardware Complexity and Throughput
 Analysis for BCH and RS Decoders
Hard-decision Decoding for BCH and RS Codes
 Decoding steps using the Berlekamp-Massey algorithm
    Syndrome computation
    Key equation solver
    Chien search for the calculation of error locations
    Forney’s algorithm for the calculation of error magnitudes

 Case study
    (828, 820) RS code over GF(2^10) vs. (8248, 8192) BCH code over GF(2^14)
    Both are t = 4 error-correcting codes
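
As a quick sanity check of the case study, the sketch below derives t, the length in bits, and the code rate of each code from its (n, k) and field size. It assumes the usual relations n - k = 2t symbols for RS codes and n - k = m·t bits for this BCH code (in general m·t is only an upper bound on BCH parity bits). The function names are illustrative.

```python
def rs_params(n, k, m):
    """Parameters of an (n, k) RS code with m-bit symbols."""
    return {
        "t": (n - k) // 2,       # correctable symbol errors
        "codeword_bits": n * m,
        "info_bits": k * m,
        "rate": k / n,
    }


def bch_params(n, k, m):
    """Parameters of a binary (n, k) BCH code over GF(2^m), assuming n - k = m*t."""
    return {
        "t": (n - k) // m,       # correctable bit errors
        "codeword_bits": n,
        "info_bits": k,
        "rate": k / n,
    }


rs = rs_params(828, 820, 10)
bch = bch_params(8248, 8192, 14)
print(rs)
print(bch)
```

Both codes protect roughly 8K bits of data with t = 4 at a rate of about 0.99, which is what makes the comparison meaningful.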




          Syndrome Computation


      RS codes
          2t constant multipliers
          2t adders
      BCH codes (p-parallel)
          pt constant multipliers
          pt adders
          t squarers


                Code                       Gate Count             Latency
           (828, 820) RS                     116 XOR             828 clks
     (8248, 8192) BCH (p = 8)                339 XOR             1031 clks
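
A minimal software sketch of serial RS syndrome computation, to show where the "2t constant multipliers, 2t adders" and the 828-clock latency come from: each of the 2t syndromes is updated with one constant multiplication and one addition per received symbol. The primitive polynomial x^10 + x^3 + 1 and the convention S_i = r(α^i), i = 1..2t, are assumptions; the slides do not state either.

```python
M = 10
PRIM_POLY = 0b10000001001  # x^10 + x^3 + 1 (assumed, not given in the slides)
FIELD_SIZE = 1 << M
ALPHA = 0b10               # the field element "x"


def gf_mul(a, b):
    """Multiply two GF(2^M) elements (shift-and-add with modular reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & FIELD_SIZE:
            a ^= PRIM_POLY
    return result


def gf_pow(a, e):
    """Raise a to the non-negative integer power e by repeated multiplication."""
    result = 1
    for _ in range(e):
        result = gf_mul(result, a)
    return result


def syndromes(received, t):
    """Return [S_1, ..., S_2t] with S_i = r(alpha^i), using Horner's rule.

    received[0] is the highest-degree symbol; one symbol is consumed per
    "clock", so a serial design needs len(received) clock cycles.
    """
    result = []
    for i in range(1, 2 * t + 1):
        alpha_i = gf_pow(ALPHA, i)
        acc = 0
        for symbol in received:
            acc = gf_mul(acc, alpha_i) ^ symbol  # one constant multiply, one add
        result.append(acc)
    return result


n, t = 828, 4
clean = syndromes([0] * n, t)      # the all-zero word is a codeword

corrupted = [0] * n
corrupted[n - 1 - 5] = 7           # error value 7 at the degree-5 position
noisy = syndromes(corrupted, t)
```

In hardware the 2t syndrome units run side by side rather than in a loop. The p-parallel BCH design instead consumes p bits per clock, which matches the 8248 / 8 = 1031 clocks in the table.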

                   Key Equation Solver
       Key equation

          (1 + S(x))Λ(x) = Ω(x) mod x^(2t+1)

       RS codes
          3t+1 registers in each row
          (3t+1)·2t = 6t^2 + 2t clock cycles
       BCH codes
          2t registers in each row
          2t×t = 2t^2 clock cycles

       [Figure: key equation solver architecture with a CONTROL block, a row of processing elements (PE), and delay registers (D), 3t+1 per row]




          Code                         Gate Count                                Latency
       (828, 820) RS     212XOR+200AND+30MUX                                     104 clks
     (8248, 8192) BCH    456XOR+392AND+42MUX                                     32 clks
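
The latencies in the table follow directly from the clock-cycle formulas on this slide with t = 4. A short check (function names are illustrative):

```python
def kes_cycles_rs(t):
    """Key equation solver latency for RS codes: (3t+1) * 2t = 6t^2 + 2t."""
    return (3 * t + 1) * 2 * t


def kes_cycles_bch(t):
    """Key equation solver latency for BCH codes: 2t * t = 2t^2."""
    return 2 * t * t


t = 4
print("RS :", kes_cycles_rs(t), "clks")
print("BCH:", kes_cycles_bch(t), "clks")
```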

       Chien Search and Forney’s Algorithm
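Error locations: roots of the error locator polynomial Λ(x)

Error magnitudes (needed for RS codes only): Forney’s algorithm, using Ω(x) and Λ(x)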
     RS codes
         2t constant multipliers
         2t-1 adders
         1 multiplier
         1 inverter
     BCH codes (p-parallel)
         pt constant multipliers
         p(t-1) adders

         Code                                   Gate Count         Latency
     (828, 820) RS          351XOR+270AND+36OR+5NOT+90MUX          1024 clks
(8248, 8192) BCH (p=8)                727XOR+104OR+56MUX           1031 clks
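
A toy sketch of the Chien search idea: evaluate Λ(x) at α^(-j) for every position j and report the positions where the result is zero. To keep it short it uses GF(2^4) with the primitive polynomial x^4 + x + 1 rather than the GF(2^10) and GF(2^14) fields of the case study, and it evaluates the polynomial directly instead of updating each term with a constant multiplier per clock as the hardware does. All names are illustrative.

```python
M = 4
PRIM_POLY = 0b10011          # x^4 + x + 1, a small field chosen for illustration
FIELD_SIZE = 1 << M
N = FIELD_SIZE - 1           # number of nonzero field elements


def gf_mul(a, b):
    """Multiply two GF(2^M) elements."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & FIELD_SIZE:
            a ^= PRIM_POLY
    return result


# EXP[k] = alpha^k for k = 0 .. N-1, with alpha = x.
EXP = [1] * N
for k in range(1, N):
    EXP[k] = gf_mul(EXP[k - 1], 0b10)


def poly_eval(coeffs, x):
    """Evaluate a polynomial at x; coeffs[k] is the coefficient of x^k."""
    acc = 0
    for c in reversed(coeffs):
        acc = gf_mul(acc, x) ^ c
    return acc


def locator_from_positions(positions):
    """Build Lambda(x) as the product of (1 + alpha^j x) over error positions j."""
    poly = [1]
    for j in positions:
        nxt = poly + [0]
        for k, c in enumerate(poly):
            nxt[k + 1] ^= gf_mul(c, EXP[j])
        poly = nxt
    return poly


def chien_search(locator):
    """Return every position j with Lambda(alpha^-j) == 0."""
    return [j for j in range(N) if poly_eval(locator, EXP[(-j) % N]) == 0]


found = chien_search(locator_from_positions([2, 5]))
print(found)
```

The search visits every candidate position once, so its cost grows with the number of field elements or code positions that must be tested.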


Decoding Latency of RS and BCH Codes
  Clock cycles                                    RS (828, 820)     BCH (8248, 8192)
  Syndrome computation                                 828                1031
  Key equation solver                                  104                  32
  Chien search and error magnitude computation        1024                1031
  Overall latency                                     1024                1063




 Complexity and Throughput Comparisons

                   (828, 820) RS decoder                  (8248, 8192) BCH decoder
Total area         679 XOR + 470 AND + 45 OR + 5 NOT      1522 XOR + 392 AND + 104 OR
                   + 120 MUX + 534 registers              + 98 MUX + 750 registers
Critical path      6 XOR + 1 AND + 1 MUX                  15 XOR + 1 AND + 1 MUX
Decoding latency   1024 clock cycles                      1063 clock cycles



The RS decoder can achieve 121% higher throughput
than the BCH decoder with 66% of the area
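
One way to reproduce the 121% figure from the table, under two assumptions that are not stated on the slide: every gate on the critical path has the same delay, and throughput is counted as information bits decoded per unit time. The function name is illustrative.

```python
def throughput(info_bits, latency_cycles, critical_path_gates):
    """Information bits per gate delay: bits / (cycles * clock period in gate delays)."""
    return info_bits / (latency_cycles * critical_path_gates)


# (828, 820) RS code with 10-bit symbols; critical path 6 XOR + 1 AND + 1 MUX.
rs_tp = throughput(820 * 10, 1024, 6 + 1 + 1)

# (8248, 8192) BCH code; critical path 15 XOR + 1 AND + 1 MUX.
bch_tp = throughput(8192, 1063, 15 + 1 + 1)

gain_percent = (rs_tp / bch_tp - 1) * 100
print(f"RS throughput gain over BCH: {gain_percent:.1f}%")
```

Under these assumptions nearly all of the gain comes from the shorter critical path (8 vs. 17 gate delays); the latency difference contributes only a few percent.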


Why is the RS decoder faster and smaller?

  RS codes are constructed over finite fields of lower order

    Finite field multipliers are smaller and have a shorter
    critical path
    Syndrome computation involves fewer coefficients
    Chien search needs to be carried out over a smaller
    number of finite field elements




                  Summary
The Gray bit-mapping scheme leads to extra coding gain without any overhead
RS codes with similar code length and rate can achieve better error-correcting performance than BCH codes in multi-level flash memory
RS decoders with similar code length and rate have lower complexity than BCH decoders



