# Quantum Channels and their capacities

Graeme Smith, IBM Research

## Information Theory

- "A Mathematical Theory of Communication", C. E. Shannon, 1948
- Lies at the intersection of Electrical Engineering, Mathematics, and Computer Science
- Concerns the reliable and efficient storage and transmission of information.
## Information Theory: Some Hits

- Channel coding: low-density parity check codes (cell phones), Reed-Solomon codes (Voyager)
- Source coding: Lempel-Ziv compression (gunzip, winzip, etc.)
## Quantum Information Theory

When we include quantum mechanics (which was there all along!) things get much more interesting!

Secure communication, entanglement-enhanced communication, sending quantum information, …

Capacity, error correction, compression, entropy, …
## Outline

- Lecture 1: Classical Information Theory
- Lecture 2: Quantum Channels and their many capacities
- Lecture 3: LOCC, Gaussian noise, etc.
- Lecture 4: LOCC, Gaussian noise, etc.
## Example: Flipping a Biased Coin

Let's say we flip $n$ coins. They're independent and identically distributed (i.i.d.):

$$\Pr(X_i = 0) = 1 - p \qquad \Pr(X_i = 1) = p$$

$$\Pr(X_i = x_i, X_j = x_j) = \Pr(X_i = x_i)\,\Pr(X_j = x_j)$$

Outcome: $x_1 x_2 \dots x_n$.

Q: How many 1's am I likely to get?

A: Around $pn$, and with very high probability between $(p - \delta)n$ and $(p + \delta)n$.
## Shannon Entropy

Flip $n$ i.i.d. coins, $\Pr(X_i = 0) = 1 - p$, $\Pr(X_i = 1) = p$.
Outcome: $x_1 \dots x_n$.

W.h.p. we get $\approx pn$ 1's, but how many such configurations are there? There are

$$\binom{n}{pn} = \frac{n!}{(pn)!\,((1-p)n)!}$$

such strings. Using $\log n! = n \log n - n + O(\log n)$, we get

$$\log \binom{n}{pn} \approx n \log n - n - pn \log(pn) + pn - (1-p)n \log((1-p)n) + (1-p)n = n H(p),$$

where $H(p) = -p \log p - (1-p) \log(1-p)$.

So now, if I want to transmit $x_1 \dots x_n$, I can just check which typical sequence it is and report that! This maps $n$ bits to $nH(p)$ bits.

Similarly for a larger alphabet: $H(X) = \sum_x -p(x) \log p(x)$.
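As a quick numerical sanity check (my own sketch, not part of the lecture), we can compare the exact count $\log_2 \binom{n}{pn}$ against the entropy approximation $nH(p)$ and watch the two converge as $n$ grows:

```python
# Check that log2 C(n, pn) ≈ n·H(p): the number of typical strings
# matches the entropy formula ever more closely as n grows.
from math import comb, log2

def H(p):
    """Binary Shannon entropy in bits."""
    return -p * log2(p) - (1 - p) * log2(1 - p)

p = 0.25
for n in (100, 1000, 10000):
    k = int(p * n)                 # number of 1's in a typical string
    exact = log2(comb(n, k))       # log of the exact binomial count
    approx = n * H(p)              # Stirling / entropy approximation
    print(n, round(exact, 1), round(approx, 1), round(exact / approx, 4))
```

The $O(\log n)$ correction in Stirling's formula is why the exact count sits slightly below $nH(p)$.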
## Shannon Entropy

Flip $n$ i.i.d. coins, $\Pr(X_i = x_i) = p(x_i)$.
The string $x_1 \dots x_n$ has probability $P(x_1 \dots x_n) = p(x_1) \cdots p(x_n)$, so

$$\log P(x_1 \dots x_n) = \sum_{i=1}^{n} \log p(x_i).$$

Typically, $-\frac{1}{n} \log P(x_1 \dots x_n) \approx -\langle \log p(x) \rangle = H(X)$.

So typical sequences have $P(x_1 \dots x_n) \sim 2^{-n(H(X) \pm \delta)}$: more or less a uniform distribution over the typical sequences.
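This concentration is easy to see empirically. The sketch below (my own, with an example alphabet not from the slides) draws a long i.i.d. string and checks that the per-letter surprise $-\frac{1}{n}\log_2 P(x_1 \dots x_n)$ lands right at $H(X)$:

```python
# For i.i.d. draws, −(1/n) log2 P(x1…xn) concentrates at H(X).
import random
from math import log2

random.seed(0)
probs = {"a": 0.5, "b": 0.25, "c": 0.25}        # small example alphabet
H = sum(-q * log2(q) for q in probs.values())   # H(X) = 1.5 bits here

n = 100_000
letters, weights = zip(*probs.items())
sample = random.choices(letters, weights=weights, k=n)
surprise = sum(-log2(probs[x]) for x in sample) / n
print(f"H(X) = {H}, empirical -(1/n) log P = {surprise:.4f}")
```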
## Correlations and Mutual Information

$H(X)$: bits of information transmitted per letter from ensemble $X$.

Noisy channel $\mathcal{N}$ with transition probabilities $p(y|x)$: $X \to \mathcal{N} \to Y$.

$(X, Y)$ are correlated. How much does $Y$ tell us about $X$? Equivalently: after I get $Y$, how much more do you need to tell me so that I know $X$?

Given $y$, update expectations: $p(x|y) = \frac{p(y|x)\,p(x)}{p(y)}$.

Only $H(X|Y) = \langle -\log p(x|y) \rangle$ bits per letter are needed.

One can calculate $H(X|Y) = \langle -\log \frac{p(x,y)}{p(y)} \rangle = H(X,Y) - H(Y)$.

Savings: $H(X) - H(X|Y) = H(X) + H(Y) - H(X,Y) =: I(X;Y)$.
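The identity $I(X;Y) = H(X) + H(Y) - H(X,Y)$ can be computed directly from a joint distribution. A minimal sketch (function names are mine), checked on a uniform input through a binary symmetric channel:

```python
# Compute I(X;Y) = H(X) + H(Y) − H(X,Y) from a joint distribution p(x, y).
from math import log2

def entropy(dist):
    """Shannon entropy (bits) of a dict of probabilities."""
    return sum(-q * log2(q) for q in dist.values() if q > 0)

def mutual_information(joint):
    """I(X;Y) from a dict {(x, y): p(x, y)}."""
    px, py = {}, {}
    for (x, y), q in joint.items():
        px[x] = px.get(x, 0) + q
        py[y] = py.get(y, 0) + q
    return entropy(px) + entropy(py) - entropy(joint)

# Uniform input through a binary symmetric channel with flip probability 0.1:
p = 0.1
joint = {(0, 0): 0.5 * (1 - p), (0, 1): 0.5 * p,
         (1, 0): 0.5 * p,       (1, 1): 0.5 * (1 - p)}
mi = mutual_information(joint)
print(mi)   # 1 − H(0.1) ≈ 0.531
```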
## Channel Capacity

Given $n$ uses of a channel $\mathcal{N}$, encode a message $m \in \{1, \dots, M\}$ to a codeword $x^n = (x_1(m), \dots, x_n(m))$.

At the output of the channel, use $y^n = (y_1, \dots, y_n)$ to make a guess $m'$.

The rate of the code is $(1/n) \log M$.

The capacity of the channel, $C(\mathcal{N})$, is defined as the maximum rate you can get with vanishing error probability as $n \to \infty$.
## Binary Symmetric Channel

$X \to \mathcal{N} \to Y$

$$p(0|0) = 1 - p \qquad p(1|0) = p$$
$$p(0|1) = p \qquad p(1|1) = 1 - p$$
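A toy simulation of this channel (my own sketch, not part of the slides): each bit is flipped independently with probability $p$, and the empirical flip rate comes out close to $p$:

```python
# Simulate a binary symmetric channel: each bit flips with probability p.
import random

random.seed(1)

def bsc(bits, p):
    """Send a bit string through a BSC with flip probability p."""
    return [b ^ (random.random() < p) for b in bits]

n, p = 100_000, 0.1
x = [random.getrandbits(1) for _ in range(n)]
y = bsc(x, p)
flips = sum(a != b for a, b in zip(x, y)) / n
print(f"empirical flip rate = {flips:.4f}")   # close to p = 0.1
```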
## Capacity of the Binary Symmetric Channel

There are $2^n$ possible outputs. Each input string $x_m^n = (x_{m1}, \dots, x_{mn})$ gets mapped to one of $\approx 2^{nH(p)}$ different outputs (the typical error patterns).

If these sets overlap for different inputs, the decoder will be confused. So we need

$$M \cdot 2^{nH(p)} \le 2^n, \quad \text{which implies} \quad (1/n) \log M \le 1 - H(p).$$

This is an upper bound on the capacity.
## Direct Coding Theorem: Achievability of $1 - H(p)$

1. Choose $2^{nR}$ codewords randomly according to $X^n$ (each bit 50/50).
2. Send $x_m^n \to y^n$. To decode, look at all strings within $n(p + \delta)$ bit-flips of $y^n$, a set of $\approx 2^{n(H(p)+\delta)}$ strings. If this set contains exactly one codeword, report it.

The decoding sphere is big enough that w.h.p. the correct codeword $x_m^n$ is in there.

So the only source of error is if two codewords are in there. What are the chances of that?
## Direct Coding Theorem: Achievability of $1 - H(p)$ (continued)

A transmitted codeword gets mapped into a decoding sphere of size $2^{n(H(p)+\delta)}$, out of $2^n$ strings total. If the code is chosen randomly, what's the chance of another codeword landing in this sphere?

If I choose one more word, the chance is $\frac{2^{n(H(p)+\delta)}}{2^n}$.

Choose $2^{nR}$ more, and the chance is $\frac{2^{n(H(p)+R+\delta)}}{2^n}$.

If $R < 1 - H(p) - \delta$, this $\to 0$ as $n \to \infty$.

So the average probability of decoding error (averaged over codebook choice and codeword) is small. As a result, there must be some codebook with low probability of error (averaged over codewords).
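A numeric illustration of this union bound (my own, with example parameters): the collision probability $2^{n(H(p)+R+\delta)}/2^n$ shrinks exponentially when $R$ is below $1 - H(p) - \delta$ and blows up when it is above:

```python
# Evaluate the exponent of the collision bound 2^{n(H(p)+R+δ)} / 2^n
# for a rate below the threshold 1 − H(p) − δ and a rate above it.
from math import log2

def H(p):
    return -p * log2(p) - (1 - p) * log2(1 - p)

p, delta = 0.1, 0.01              # threshold rate: 1 − H(0.1) − δ ≈ 0.521
exponents = {}
for R in (0.40, 0.60):            # one rate below the threshold, one above
    for n in (100, 500, 1000):
        exponents[(R, n)] = n * (H(p) + R + delta) - n   # log2 of the bound
        print(f"R={R}, n={n}: bound = 2^{exponents[(R, n)]:.0f}")
```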
## Low Worst-Case Probability of Error

Showed: there's a code with rate $R$ such that

$$\frac{1}{2^{nR}} \sum_{i=1}^{2^{nR}} P_i < \epsilon.$$

Let $N_{2\epsilon} = \#\{i \mid P_i > 2\epsilon\}$. Then

$$\frac{2\epsilon\, N_{2\epsilon}}{2^{nR}} \le \frac{1}{2^{nR}} \sum_{i \,:\, P_i > 2\epsilon} P_i \le \frac{1}{2^{nR}} \sum_{i=1}^{2^{nR}} P_i < \epsilon,$$

so $N_{2\epsilon} < 2^{nR - 1}$. Throw these codewords away.

This gives a code with rate $R - 1/n$ and $P_i < 2\epsilon$ for all $i$.
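This expurgation step is just Markov's inequality, and it is easy to sanity-check numerically (my own sketch, with hypothetical error probabilities): if the average of the $P_i$ is below $\epsilon$, at most half of them can exceed $2\epsilon$:

```python
# Markov's inequality behind expurgation: mean below ε implies
# fewer than half the codewords have error probability above 2ε.
import random

random.seed(2)
eps = 0.01
# Hypothetical per-codeword error probabilities with mean eps/2 (< eps):
P = [random.expovariate(1) * eps / 2 for _ in range(10_000)]
avg = sum(P) / len(P)
bad = sum(Pi > 2 * eps for Pi in P)   # codewords with error above 2ε
print(avg < eps, bad < len(P) // 2)   # True True
```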
## Coding Theorem: General Case

$X \to \mathcal{N} \to Y$

- There are $2^{nH(X)}$ typical $x^n$.
- For a typical $y^n$, there are $\approx 2^{n(H(X|Y)+\delta)}$ candidate $x^n$'s in the decoding sphere. If there is a unique candidate codeword, report it.
- Each sphere of size $2^{n(H(X|Y)+\delta)}$ contains a fraction $\frac{2^{n(H(X|Y)+\delta)}}{2^{nH(X)}}$ of the typical inputs.
- If we choose $2^{nR}$ codewords, the probability of accidentally falling into the wrong sphere is $\frac{2^{nR}}{2^{n(H(X) - H(X|Y) - \delta)}} = \frac{2^{nR}}{2^{n(I(X;Y) - \delta)}}$.
- This is okay as long as $R < I(X;Y)$.
## Capacity for a General Channel

- We showed that for any input distribution $p(x)$, given $p(y|x)$, we can approach rate $R = I(X;Y)$. By picking the best $X$, we can achieve $C(\mathcal{N}) = \max_X I(X;Y)$. This is called the "direct" part of the capacity theorem.
- In fact, you can't do any better. Proving that there's no way to beat $\max_X I(X;Y)$ is called the "converse".
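For a binary-input channel, the optimization $\max_X I(X;Y)$ can be brute-forced by grid search over the single input parameter $q = \Pr(X = 1)$. A minimal sketch (my own; for symmetric channels like the BSC the maximum sits at $q = 1/2$, recovering $C = 1 - H(p)$):

```python
# Approximate C = max_X I(X;Y) for a binary-input channel by a grid
# search over the input distribution q = Pr(X = 1).
from math import log2

def mutual_info(q, channel):
    """I(X;Y) for input Pr(X=1)=q and transition rows channel[x][y] = p(y|x)."""
    def H(dist):
        return sum(-v * log2(v) for v in dist if v > 0)
    joint = [px * channel[x][y]
             for x, px in enumerate((1 - q, q)) for y in (0, 1)]
    py = [joint[0] + joint[2], joint[1] + joint[3]]
    return H([1 - q, q]) + H(py) - H(joint)

bsc = [[0.9, 0.1], [0.1, 0.9]]   # BSC with flip probability 0.1
C = max(mutual_info(q / 1000, bsc) for q in range(1, 1000))
print(f"C = {C:.4f}")            # 1 − H(0.1) ≈ 0.5310, attained at q = 1/2
```

For larger alphabets the standard tool is the Blahut-Arimoto algorithm rather than a grid search.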
## Converse

For homework, you'll prove two great facts:

1. $H(X,Y) \le H(X) + H(Y)$
2. $H(Y_1 Y_2 | X_1 X_2) \le H(Y_1|X_1) + H(Y_2|X_2)$

We're going to use the first one to prove that no good code can transmit at a rate better than $\max_X I(X;Y)$.
## Converse

- Choose some code with $2^{nR}$ strings of $n$ letters.
- Let $X^n$ be the uniform distribution on the codewords (codeword $i$ with probability $1/2^{nR}$).
- Because the channel uses are independent, we get $p(y_1 \dots y_n | x_1 \dots x_n) = p(y_1|x_1) \cdots p(y_n|x_n)$.
- As a result, the conditional entropy satisfies $H(Y^n|X^n) = \langle -\log p(y_1 \dots y_n | x_1 \dots x_n) \rangle = \sum_i \langle -\log p(y_i|x_i) \rangle = \sum_i H(Y_i|X_i)$.
- Now, $I(Y^n;X^n) = H(Y^n) - H(Y^n|X^n) \le \sum_i H(Y_i) - H(Y_i|X_i) = \sum_i I(Y_i;X_i) \le n \max_X I(X;Y)$.
- Furthermore, $I(Y^n;X^n) = H(X^n) - H(X^n|Y^n) = nR - H(X^n|Y^n) \le n \max_X I(X;Y)$.
- Finally, since $Y^n$ must be sufficient to decode $X^n$, we must have $(1/n) H(X^n|Y^n) \to 0$, which means $R \le \max_X I(X;Y)$.
## Recap

- Information theory is concerned with the efficient and reliable transmission and storage of data.
- We focused on the problem of communication in the presence of noise, which requires error-correcting codes.
- Capacity is the maximum rate of communication for a noisy channel, given by $\max_X I(X;Y)$.
- There are two steps to the capacity proof: (1) Direct part: show by a randomized argument that there exist codes with low probability of error; the hard part is the error estimates. (2) Converse: show there are no codes that do better, using entropy inequalities.
## Coming Up

Quantum channels and their various capacities.

- Homework is due at the beginning of next class.
- Fact II is called Jensen's inequality.
- No partial credit.
