Coding Theory
Nikki Mark
April 28th, 2006

What is Coding Theory?
Algebraic codes are now used in CD players, fax machines, modems, and bar code scanners. Algebraic coding has nothing to do with secret codes. The goal of coding theory is to devise message encoding and decoding methods that are reliable, efficient, and reasonably easy to implement.

Why do we need Coding Theory?
To motivate this theory, we imagine wanting to send one of two signals to a spaceship on Mars:
• 0 to orbit
• 1 to land
But some sort of interference (noise) can cause an incorrect message to be received.

Adding Redundancy to our Messages
To decrease the effects of noise, we add redundancy to our messages. First method: repeat each digit multiple times, here five times. The computer is then programmed to take any five-digit message received and decode it by majority rule.

Majority Rule
So, if we send 00000 and the computer receives any of the following, it will still be decoded as 0:
00000   11000
10000   10100
01000   10010
00010   10001
00001   etc.
Notice that for the computer to decode incorrectly, at least three errors must be made.

Independent Errors
Using the five-fold repetition, and assuming the errors happen independently, three or more errors are less likely to occur than two or fewer. Decoding to the most likely sent message in this way is called maximum likelihood decoding.

More complicated numbers
We are going to send a sequence of 0s and 1s of length 500. Assume the probability of an error is .01 for any particular digit, and that errors happen independently.
No redundancy: $P(\text{error free}) = (.99)^{500} \approx .0066$

“Three fold Repetition Scheme”
Sending each digit three times and decoding each block of three digits by majority rule, the probability increases. Suppose 1 is sent. It will be decoded as 0 iff the block received is 001, 010, 100, or 000. The probability that this occurs is:
$P(\text{error}) = (.01)^2 [3(.99) + .01] \approx .000298$
$\Rightarrow P(\text{error free for one digit}) \ge .9997$
$\Rightarrow P(\text{error free message}) \ge (.9997)^{500} \approx .86$

A Purdy Picture
[Figure: on Earth, the original message 0 goes through the encoder to become the encoded message 00000 and is sent by the transmitter; noise alters it in transit; the spacecraft receives 10001, and its decoder recovers the decoded message 0.]

Why don’t we use this?
Repetition codes have the advantage of simplicity, both for encoding and decoding. But they are too inefficient! In a five-fold repetition code, 80% of all transmitted information is redundant. Can we do better? Yes!

What is a linear code?
Def: An (n,k) linear code over a finite field F is a k-dimensional subspace V of the vector space $F^n = F \oplus F \oplus \cdots \oplus F$ (n copies) over F. The members of V are called the code words. When F is $\mathbb{Z}_2$, the code is called binary.

A Hamming (7,4) code
An (n,k) Hamming code encodes a message of k digits into a code word of n digits. The 16 possible messages:
0000   1010   0011   1111
0001   1100   1110   0010
1001   1101   0100   0110
1011   1000   0101   0111

Encoding with a Generating Matrix
A generating matrix is a k x n matrix consisting of the k x k identity matrix adjoined with a k x (n - k) matrix. Our generating matrix:
$$G = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{bmatrix}$$

Encoding a message
For each message v, compute the product vG, with all arithmetic done mod 2. Let v = [ 0 1 1 0 ]:
$$vG = \begin{bmatrix} 0 & 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 1 & 0 & 0 & 1 & 1 \end{bmatrix}$$
This is the sum of rows 2 and 3 of G.

Encoding those messages
Message   Codeword      Message   Codeword
0000      0000000       0110      0110011
0001      0001111       0101      0101010
0010      0010110       0011      0011001
0100      0100101       1110      1110000
1000      1000011       1101      1101001
1010      1010101       1011      1011010
1100      1100110       0111      0111100
1001      1001100       1111      1111111

Hamming Distance
Def: The Hamming distance between two vectors of a vector space is the number of components in which they differ, denoted d(u,v).
[Photo: Mr. Hamming]

Hamming Distance
Ex. 1: The Hamming distance between
v = [1011010]
u = [0111100]
is d(u,v) = 4.
Notice: d(u,v) = d(v,u).

Hamming weight of a Vector
Def: The Hamming weight of a vector is the number of nonzero components of the vector, denoted wt(u).
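
The encoding step and the distance and weight definitions can be checked with a few lines of Python. This is only a minimal sketch: the helper names `encode`, `hamming_distance`, `hamming_weight`, and `bits` are made up for illustration and do not come from the slides; the matrix G is the generating matrix given earlier.

```python
# Generating matrix G = [I_4 | A] of the (7,4) Hamming code from the slides.
G = [
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]


def bits(text):
    """Turn a string such as '0110' into a list of ints (hypothetical helper)."""
    return [int(ch) for ch in text]


def encode(message, generator=G):
    """Return the code word vG, with all arithmetic done mod 2."""
    columns = zip(*generator)
    return [sum(m * g for m, g in zip(message, col)) % 2 for col in columns]


def hamming_distance(u, v):
    """Number of components in which u and v differ."""
    return sum(a != b for a, b in zip(u, v))


def hamming_weight(u):
    """Number of nonzero components of u."""
    return sum(1 for a in u if a != 0)


print(encode(bits("0110")))                                # [0, 1, 1, 0, 0, 1, 1]
print(hamming_distance(bits("1011010"), bits("0111100")))  # 4
print(hamming_weight(bits("1011010")))                     # 4
```

Storing vectors as plain lists of 0s and 1s keeps the mod 2 arithmetic explicit; a bit-packed integer representation would be faster but would hide the link to the matrix product vG.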
Hamming weight of a code
Def: The Hamming weight of a linear code is the minimum weight of any nonzero vector in the code.

Hamming Weight
Ex. 2: The Hamming weights of
v = [1011010]
u = [0111100]
w = [0100101]
are: wt(v) = 4, wt(u) = 4, wt(w) = 3.

Theorem 31.1 ~ Properties of Hamming Distance and Weight
For any vectors u, v, and w,
1) d(u,v) = wt(u - v).
2) d(u,v) ≤ d(u,w) + d(w,v).
Proof: 1) Observe that both d(u,v) and wt(u - v) equal the number of positions in which u and v differ.

Proving Part 2
2) d(u,v) ≤ d(u,w) + d(w,v).
Proof: 2) Note that if u and v differ in the ith position and u and w agree in the ith position, then w and v differ in the ith position. So every position counted by d(u,v) is also counted by d(u,w) or by d(w,v). Refer to Example 2.

Example of Proof
v = [1011010]
u = [0111100]
w = [0100101]
d(u,v) = 4, d(u,w) = 3, d(w,v) = 7.
And: d(u,v) ≤ d(u,w) + d(w,v), since 4 ≤ 3 + 7.

Theorem 31.2 ~ Correcting Capability of a Linear Code
If the Hamming weight of a linear code is at least 2t + 1, then the code can correct any t or fewer errors. Alternatively, the same code can detect any 2t or fewer errors.

Example: Using our original code, the Hamming weight is 3 = 2(1) + 1. Thus, it can correct 1 error, OR it can detect 2 errors.

Why OR?
We’ll show this by example. We have our (7,4) Hamming code, and the weight is 3 = 2(1) + 1. Say we receive the word 0001010. It could have had one error, in which case 0101010 was intended, and we can correct that single error.

Why OR, cont.
It could instead have had two errors, in which case 0000000, 0001111, or 1011010 could have been intended. What do we do? If we chose detection, we would note that too many possibilities exist, and we would ask for retransmission. If we chose correction, we could make a mistake by assuming that the intended code word was 0101010.

Moral of the story
We must be careful about detecting and correcting errors. It is safest to ask for retransmission if possible, to eliminate any guess-work.
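
The “Why OR?” example can be reproduced by brute force. The sketch below assumes the (7,4) generating matrix from the encoding slides; the names `codewords`, `distance`, and `by_distance` are illustrative only. It lists all 16 code words, confirms that the Hamming weight of the code is 3, and groups the code words by their distance from the received word 0001010.

```python
from itertools import product

# Generating matrix of the (7,4) Hamming code used throughout the slides.
G = [
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]


def encode(message):
    """Return the code word vG with arithmetic mod 2."""
    return [sum(m * g for m, g in zip(message, col)) % 2 for col in zip(*G)]


def distance(u, v):
    """Hamming distance between two equal-length vectors."""
    return sum(a != b for a, b in zip(u, v))


# All 16 code words, one per 4-digit message.
codewords = [encode(m) for m in product([0, 1], repeat=4)]

# Hamming weight of the code: minimum weight over the nonzero code words.
code_weight = min(sum(c) for c in codewords if any(c))
print(code_weight)  # 3

# Group code words by distance from the received word 0001010.
received = [0, 0, 0, 1, 0, 1, 0]
by_distance = {}
for c in codewords:
    by_distance.setdefault(distance(received, c), []).append("".join(map(str, c)))

print(by_distance[1])          # ['0101010']
print(sorted(by_distance[2]))  # ['0000000', '0001111', '1011010']
```

One code word sits at distance 1 and three sit at distance 2, which is why a decoder that commits to correcting one error cannot also flag every two-error pattern.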
A better check
If we write the Hamming weight of a linear code in the form 2t + s + 1, we can correct any t errors AND detect any t + s errors.

An example for ya
Let’s assume some linear code has a Hamming weight of 5.
• Well, 5 = 2(1) + 2 + 1, so we can correct any 1 error and detect up to 3 errors.
• Or, 5 = 2(0) + 4 + 1, so we can detect up to 4 errors.
• Or, 5 = 2(2) + 0 + 1, so we can correct any 2 or fewer errors.

Let’s Play a Game!
Yay for Math Horizons! This uses the fact that each error is uniquely determined: two different code words differ in at least three positions.

Creating the Parity-Check Matrix
How to make H, the parity-check matrix, from G, the generating matrix: write $G = [\, I_k \mid A \,]$, compute $-A$, and set
$$H = \begin{bmatrix} -A \\ I_{n-k} \end{bmatrix}$$

Homework #1
From your textbook: Page 540, numbers 4, 5, and 12. Not too bad, we promise!

Decoding the Code Words
Four types of decoding methods for linear codes:
• Nearest-neighbor
• Parity-Check Matrix
• Coset Decoding
• Same Coset-Same Syndrome

The Steps of Nearest Neighbor Decoding
Compare the received word v with all possible code words. Look for the code word v' at the smallest distance from v. If v' is unique, assume v' was sent. If v' is not unique, ask for retransmission.

Nearest Neighbor Example
Ex. 3: We receive 0001110. This is not in our original list of code words. Comparing 0001110 to each code word, we see that it differs from 0001111 in only one position and from all others in more than one. We assume 0001111 was intended and decode it to its original message, 0001.

How to Use the Parity-Check Matrix to Decode
For any received word w, compute wH.
• If wH is the zero vector, assume that no error was made.
• If there is exactly one instance of a nonzero element s ∈ F and a row i of H such that wH is s times row i, assume that the sent word was w - (0 … s … 0), where s occurs in the ith component. If there is more than one such instance, do not decode.
• If wH does not fit into either category, we know that at least two errors occurred in transmission, and we do not decode.

Parity-Check Decoding Example
Using
$$G = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{bmatrix}$$
create
$$H = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Example cont.
We receive w = 0001110. Compute wH: we get 001, which is the 7th (last) row of our parity-check matrix H. Thus we assume there is a problem in position 7. Compute 0001110 - 0000001 = 0001111, which is the intended code word and is decoded to the message 0001.

Theorem 31.3 ~ Parity-Check Matrix Decoding
Parity-check matrix decoding will correct any single error iff the rows of the parity-check matrix are nonzero and no one row is a scalar multiple of any other row.

Example of Thm. 31.3
$$A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
Good! The rows are pairwise linearly independent: none is zero, and none is a scalar multiple of another.
$$B = \begin{bmatrix} 2 & 1 \\ 2 & 2 \\ 1 & 2 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}$$
Bad! Row 1 and row 3 are NOT linearly independent (reading the entries in $\mathbb{Z}_3$, row 3 = 2 · row 1).

How to Use Coset Decoding
To use coset decoding, one must create a “standard array” for the linear code. Once the standard array is constructed, find the received word in the array and identify the column in which it lies. Decode the received word as the code word at the top of that column, which equals the received word minus the “coset leader” of its row.

How to create the dreaded “Standard Array”
Use C, the group of code words, as the first row, so that each code word heads a column. Next, look at all the vectors not yet in the array and choose one with minimum weight. This becomes the first “coset leader”; place it at the beginning of the second row.

Constructing the Standard Array
C = {0000000, 0001111, 0010110, 0011001}
0000000   0001111   0010110   0011001
1000000
The vector 1000000 is a remaining vector of least weight, so it becomes the first coset leader.
Continuing the Standard Array
Next, add the coset leader to each member of C, the column headings, to fill in the rest of its row. Of all the vectors not yet in the array, find one with least weight; this becomes the coset leader for the third row. Continue until all possible vectors are placed in the standard array.

Standard Array
0000000   0001111   0010110   0011001
1000000   1001111   1010110   1011001
0100000
The vector 0100000 is a remaining vector of least weight, so it becomes the next coset leader.

The Complete Standard Array
We can’t show the whole array, since it would have 32 rows, but here’s as much as we need to decode 0001110:
0000000   0001111   0010110   0011001
1000000   1001111   1010110   1011001
0100000   0101111   0110110   0111001
0010000   0011111   0000110   0001001
0001000   0000111   0011110   0010001
0000100   0001011   0010010   0011101
0000010   0001101   0010100   0011011
0000001   0001110   0010111   0011000
1100000   1101111   1110110   1111001
…

Decoding with the Standard Array
We received 0001110. In the array above, it lies in the column headed by 0001111 and in the row whose coset leader is 0000001.

Final Decoding
So, we found 0001110 in the second column, under 0001111, with a coset leader of 0000001. We assume the column heading is the intended code word: 0001111. Equivalently, we could add the coset leader 0000001 to the received word 0001110 to get 0001111.

Theorem 31.4 ~ Coset Decoding is Nearest-Neighbor Decoding
In coset decoding, a received word w is decoded as a code word c such that d(w,c) is minimal.
Proof: Let C be a linear code, and let w be any received word. Suppose that v is the coset leader for the coset w + C. Thus, w + C = v + C.

Proof Continued
Thus, w = v + c for some c in C, and w is decoded as c. Now, if c' is any code word, then w - c' lies in w + C = v + C, so wt(w - c') ≥ wt(v), since v was chosen because wt(v) is minimal among the members of v + C. Therefore,
d(w,c') = wt(w - c') ≥ wt(v) = wt(w - c) = d(w,c).
Thus, w is decoded as a code word c such that d(w,c) is minimal.

The Syndrome
Def: If an (n,k) linear code over F has parity-check matrix H, then, for any vector u in $F^n$, the vector uH is called the syndrome of u.

How to use Syndrome Decoding
• Calculate wH, the syndrome of w.
• Find the coset leader v such that wH = vH.
• Assume that the vector sent was w - v.

Syndrome Decoding Example
Let w = 0001110, and let H be our parity-check matrix:
$$H = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
Thus, wH = 001. The syndromes of the coset leaders are:
[1000000]H = 011
[0100000]H = 101
[0010000]H = 110
[0001000]H = 111
[0000100]H = 100
[0000010]H = 010
[0000001]H = 001
…
Thus, v = 0000001.

Syndrome Decoding Continued
So, we have w = 0001110 and v = 0000001. Finally, all we have to do is compute w - v = 0001111, which was the intended code word.

Theorem 31.5 ~ Same Coset - Same Syndrome
Let C be an (n,k) linear code over F with a parity-check matrix H. Then two vectors of $F^n$ are in the same coset of C iff they have the same syndrome.
Proof: Two vectors u and v are in the same coset of C iff u - v is in C, since C is a subgroup of $F^n$. Note that vH = 0 iff v is in C. Thus, u and v are in the same coset iff 0 = (u - v)H = uH - vH, that is, iff uH = vH.

Other Codes
So, we only touched on linear codes, and more specifically, the linear Hamming codes. There are many others!
• Reed-Solomon Codes
• Self-Dual Codes
• Golay Codes
• Cyclic Codes
• Quadratic Residue Codes
• Bose-Chaudhuri-Hocquenghem (B.C.H.) Codes
• Etc.

Any Questions?
Homework: Problems from the chapter, page 540: 17 (and complete the standard array given to you) and 29. Use the hint in the back of the book for 29. Sorry, we had to have some Abstract in there for ya!
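
Looking back at the parity-check and syndrome decoding slides, the whole procedure for the (7,4) Hamming code fits in a short Python sketch. It builds H from G = [I_k | A] (over Z_2, -A equals A), tabulates the syndrome of each coset leader, and decodes the received word 0001110. The names `syndrome`, `leader_for`, and `decode` are illustrative choices, not part of the original slides, and the sketch assumes that the coset leaders are the zero vector and the seven vectors of weight one, which holds for this particular code.

```python
# Generating matrix G = [I_4 | A] of the (7,4) Hamming code from the slides.
G = [
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]
K, N = 4, 7

# Parity-check matrix H: the block -A stacked on top of I_{n-k}.
# Over Z_2 we have -A = A, so the rows of A can be reused directly.
A = [row[K:] for row in G]
identity = [[1 if i == j else 0 for j in range(N - K)] for i in range(N - K)]
H = A + identity  # 7 rows, 3 columns


def syndrome(w):
    """Return wH (mod 2) as a tuple, so it can be used as a dictionary key."""
    return tuple(sum(w[i] * H[i][j] for i in range(N)) % 2 for j in range(N - K))


# Assumed coset leaders: the zero vector and the seven weight-one vectors.
leaders = [[0] * N] + [[1 if i == p else 0 for i in range(N)] for p in range(N)]
leader_for = {syndrome(v): v for v in leaders}


def decode(w):
    """Syndrome decoding: return (assumed code word w - v, its 4-digit message)."""
    v = leader_for[syndrome(w)]
    codeword = [(a - b) % 2 for a, b in zip(w, v)]
    return codeword, codeword[:K]  # G = [I | A], so the message is the first K digits


received = [0, 0, 0, 1, 1, 1, 0]
print(syndrome(received))  # (0, 0, 1)
print(decode(received))    # ([0, 0, 0, 1, 1, 1, 1], [0, 0, 0, 1])
```

The lookup table has only 2^(n-k) = 8 entries, one per coset, whereas the full standard array lists all 2^7 = 128 vectors; this is the practical payoff of Theorem 31.5.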
