```
Analysis of Random Measurements
An introduction to the non-asymptotic theory of random matrices

Roman Vershynin

Department of Mathematics
University of California, Davis

IPAM Short Course, 2007
Goals:
Short course in the non-asymptotic theory of random matrices.
Methods of geometric functional analysis.
Focus on techniques.
Sources:
Handouts: use as a bibliography guide
My webpage: copy of these slides; notes for my UCD course on
Non-asymptotic theory of random matrices (Winter 2007)
Lecture 1. The sparse reconstruction problem and random matrices

1   The sparse reconstruction problem

2   Random matrices: asymptotic and non-asymptotic theories

3   The ε-net method
The sparse reconstruction problem
Unknown signal: x ∈ R^d or C^d.
Sparse: |supp(x)| ≤ n, and n ≪ d.
Want to reconstruct x from few (say, N) linear measurements.
Measurements are given by a linear measurement operator

Φ : R^d → R^N;       Measurements = Φx.

Trivial to construct N = d measurements (identity operator).
We hope to achieve N ≪ d, because x is sparse:
its effective dimension n ≪ its nominal dimension d.
Lower bound: N ≥ n. Hopefully, we can make N closer to n than to d:

N ∼ n log d

is everybody’s dream (sometimes achieved).
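As a concrete (hypothetical) instance of this setup, the following numpy sketch builds an n-sparse signal and takes N ∼ n log d Gaussian measurements; the sizes and the constant 4 are illustrative choices, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n = 1000, 5                # nominal dimension and sparsity (illustrative)
N = int(4 * n * np.log(d))    # N ~ n log d measurements; the constant 4 is arbitrary

# n-sparse signal: n nonzero entries in random positions
x = np.zeros(d)
support = rng.choice(d, size=n, replace=False)
x[support] = rng.standard_normal(n)

# Gaussian measurement matrix, normalized by 1/sqrt(N)
Phi = rng.standard_normal((N, d)) / np.sqrt(N)
b = Phi @ x                   # the N linear measurements

print(N, N < d)               # far fewer measurements than the dimension d
```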
Measurement matrices
Signal x ∈ R^d, which is sparse: |supp(x)| ≤ n, and n ≪ d.
Reconstruct x from linear measurements Φx, where Φ : R^d → R^N.
Hope to take few measurements: n ≤ N ≪ d.
Φ realizes dimension reduction (from d to N).
We identify Φ with the N × d measurement matrix.
Restricted Isometries
To reconstruct any n-sparse x from Φx, the measurement map Φ
has to be one-to-one on n-sparse vectors.
Candes-Tao (2004): a slightly stronger condition yields a
reconstruction algorithm. A quantitative (rather than qualitative)
condition: Φ has to be an almost isometry on the sparse vectors.
Restricted Isometries
Theorem (Candes-Tao, 2004)
Suppose the measurement map Φ is a restricted isometry:

0.8 ‖x‖ ≤ ‖Φx‖ ≤ 1.2 ‖x‖           for all 3n-sparse vectors x.

Then x can be reconstructed from the measurements b = Φx as the
solution to the convex optimization problem

min ‖x‖_1   subject to Φx = b.

Meaning of 0.8 and 1.2: close to 1 (1 would be exact isometry).
Proof: nontrivial but elementary.
Other applications of restricted isometries:
Vector quantization [Lyubarskii-V., 2006]
Invertibility conjectures on random matrices [Rudelson-V., 2007].
See Von Neumann Symposium talk.
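The ℓ1-minimization in the theorem can be solved directly as a linear program. Here is a small sketch using scipy; the sizes are illustrative, and the split x = u − v with u, v ≥ 0 is a standard LP reformulation, not part of the slides.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)

d, n, N = 60, 3, 30           # illustrative sizes (not from the slides)
x = np.zeros(d)
x[rng.choice(d, n, replace=False)] = rng.standard_normal(n)

Phi = rng.standard_normal((N, d)) / np.sqrt(N)
b = Phi @ x

# min ||x||_1 subject to Phi x = b, as an LP:
# write x = u - v with u, v >= 0 and minimize sum(u) + sum(v)
# subject to Phi (u - v) = b.
c = np.ones(2 * d)
A_eq = np.hstack([Phi, -Phi])
res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None), method="highs")
x_hat = res.x[:d] - res.x[d:]

print(np.linalg.norm(x_hat - x))  # recovery error
```

With these sizes the Gaussian matrix satisfies the restricted isometry condition with high probability, and the recovery is exact up to solver tolerance.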
Restricted Isometries
Equivalently, all N × n minors of Φ are almost isometries.
Verifying this directly is hard: there are (d choose n) ∼ (d/n)^n = exp(n log(d/n)) minors.
Probabilistic method. Draw the measurement matrix Φ at random.
Verify the restricted isometry condition for one minor. Suppose it holds
with probability at least 1 − exp(−Cn log(d/n)). Then take the union bound
over all minors; with positive probability Φ satisfies all of them.
The problem reduces to one minor = one N × n random matrix.
Random matrix ensembles
Generate a random N × d measurement matrix Φ̄ = Φ/√N, where Φ is:
Gaussian: i.i.d. standard normal entries
Bernoulli: i.i.d. symmetric ±1 entries
Projections: orthogonal projection in R^d onto a random N-dimensional subspace
Fourier: N random rows of the d × d Discrete Fourier Transform.
(Φx = random frequencies of x)
Unifying framework for Gaussian, Bernoulli, bounded: subgaussian
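Three of these ensembles are easy to generate with numpy; a quick sketch (sizes are illustrative, and the random-projection ensemble is omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(2)
N, d = 50, 200                    # illustrative sizes

# Gaussian: i.i.d. standard normal entries
gaussian = rng.standard_normal((N, d))

# Bernoulli: i.i.d. symmetric +/-1 entries
bernoulli = rng.choice([-1.0, 1.0], size=(N, d))

# Fourier: N random rows of the d x d DFT matrix
fourier = np.fft.fft(np.eye(d))[rng.choice(d, N, replace=False)]

# Normalize by 1/sqrt(N), as on the slide
Phi_bar = gaussian / np.sqrt(N)
```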

Deﬁnition (Subgaussian random variables)
A random variable X is called subgaussian if its tail is dominated by
the gaussian tail:

P(|X| > t) ≤ 2 exp(−Ct^2)    for all t > 0.

Will also assume variance 1.
From restricted isometries to singular values
Restricted isometry condition = all minors are almost isometries.
Minors of subgaussian random matrices are also subgaussian.
So the problem reduces to studying an
N × n subgaussian matrix A, normalized: Ā = A/√N.
Main question: is Ā an approximate isometry?

C1 ‖x‖ ≤ ‖Āx‖ ≤ C2 ‖x‖          for all vectors x,

with C1 ≈ C2 ≈ 1 (like 0.8 and 1.2).
In terms of the singular values of Ā (the eigenvalues of |Ā| = √(Ā*Ā)):
the best constant C2 is the largest singular value λmax(Ā);
the best constant C1 is the smallest singular value λmin(Ā).
We want to bound λmax(Ā) above and λmin(Ā) below,
with high probability.
Asymptotic theory of random matrices
Studies limiting spectral properties of random N × n matrices A,
as n → ∞ and the aspect ratio n/N → y.
During 1980–1993, it was shown for subgaussian i.i.d. matrices:

λmax(Ā) → 1 + √y;       λmin(Ā) → 1 − √y        a.s.
Thus close to 1 for small y (tall matrices are almost isometries).
Does this solve our problem, i.e. prove the restricted isometry condition?
No: the asymptotic theory holds in the limit, not for ﬁnite sizes n.
Also, the probability is insufﬁcient to take the union bound over all
(exponentially many) minors A.
We need a non-asymptotic theory of random matrices,
valid for all ﬁnite sizes.
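For intuition, the asymptotic edges 1 ± √y are already visible at moderate finite sizes. A numerical illustration (sizes arbitrary; this does not substitute for the non-asymptotic bounds the course develops):

```python
import numpy as np

rng = np.random.default_rng(3)
N, n = 2000, 200              # tall matrix, aspect ratio y = n/N = 0.1
y = n / N

# Normalized Gaussian matrix A_bar = A / sqrt(N)
A_bar = rng.standard_normal((N, n)) / np.sqrt(N)
s = np.linalg.svd(A_bar, compute_uv=False)

print(s.max(), 1 + np.sqrt(y))   # largest singular value vs 1 + sqrt(y)
print(s.min(), 1 - np.sqrt(y))   # smallest singular value vs 1 - sqrt(y)
```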
Non-asymptotic theory of random matrices
Theorem (Largest singular value)
Let A be an N × n subgaussian matrix. Then

λmax(A) = ‖A‖ ≤ C(√N + √n)

with exponentially large probability 1 − e^{−c(n+N)}.

Equivalently, for the normalized matrix Ā = A/√N,

λmax(Ā) ≤ C(1 + √y).

Matches the asymptotic bound up to the constant C.
Non-asymptotic: holds for every size N, n.
Proof: the ε-net method.
The ε-net method
Need to bound ‖A‖ = max_{x ∈ S^{n−1}} ‖Ax‖
(over the unit Euclidean sphere S^{n−1} in R^n).
Discretization: approximate the sphere with a ﬁnite set of points.
Why possible? The sphere is compact ⇒ has a ﬁnite ε-net.
The ε-net method
Deﬁnition (ε-net)
A subset N of a metric space T is called an ε-net if every point of T is
within distance ε from some point in N .
The minimum cardinality of an ε-net of K is the covering number N(K, ε).

N(K, ε) is the minimal number of ε-balls needed to cover K.
N(K, ε) measures the amount of information in K (its complexity).
N(K, ε) is usually exponential in the dimension.
The ε-net method
Proposition (Cardinality of an ε-net)

N(S^{n−1}, ε) ≤ (3/ε)^n.

Proof.
Construct an ε-net by the greedy algorithm:

x1: arbitrary;
x2: such that ‖x2 − x1‖ > ε;
x3: such that ‖x3 − xk‖ > ε,      k = 1, 2;
. . . until no such point can be found.

Then N = {x1, x2, . . .} is an ε-net
(otherwise there would exist a point at distance > ε from all the xk,
and it could have been added),
and it is ε-separated: ‖xj − xk‖ > ε.
Want to bound |N| above.
The ε/2-balls centered at the xk are disjoint
and are contained in the ball of radius 1 + ε/2.

Count their volumes (B = unit ball):

|N| · Vol( (ε/2) B ) ≤ Vol( (1 + ε/2) B )
|N| · (ε/2)^n ≤ (1 + ε/2)^n
|N| ≤ (2/ε + 1)^n ≤ (3/ε)^n      for ε ≤ 1.
The ε-net method
We want to replace the sphere by its small ε-net in the operator norm

‖A‖ = max_{x ∈ S^{n−1}} ‖Ax‖ = max_{x ∈ S^{n−1}, y ∈ S^{N−1}} ⟨Ax, y⟩.

Lemma (Discretization of the norm)
Let N, M be ε-nets of the unit spheres S^{n−1}, S^{N−1}. Then

‖A‖ ≤ C(ε) max_{x ∈ N} ‖Ax‖ ≤ C′(ε) max_{x ∈ N, y ∈ M} ⟨Ax, y⟩.

Proof.
Write x ∈ S^{n−1} as x = y + z, where y ∈ N and ‖z‖ ≤ ε. Then

‖A‖ ≤ max_{y ∈ N} ‖Ay‖ + max_{z: ‖z‖ ≤ ε} ‖Az‖.

The last term is bounded by ε‖A‖, so ‖A‖ ≤ (1 − ε)^{−1} max_{y ∈ N} ‖Ay‖.
This proves the first inequality.
The ε-net method
Now we know how to compute the norm using 1/2-nets N and M:

‖A‖ ≲ max_{x ∈ N, y ∈ M} ⟨Ax, y⟩.

And these nets are small: |N| ≤ 6^n, |M| ≤ 6^N.
For each x and y, the form ⟨Ax, y⟩ is a random variable.
Entries of A are subgaussian ⇒ ⟨Ax, y⟩ is subgaussian. (Exercise)
This means: P(⟨Ax, y⟩ > t) ≤ 2e^{−Ct^2} for all t.
Take the union bound over x ∈ N, y ∈ M:

P( max_{x ∈ N, y ∈ M} ⟨Ax, y⟩ > t ) ≤ |N||M| · 2e^{−Ct^2} ≤ 6^{n+N} · 2e^{−Ct^2}.

Choose t ∼ √n + √N so that the probability is ≤ 2e^{−c(n+N)}.
This proves: ‖A‖ ≲ √n + √N with high probability.
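A quick numerical sanity check of the resulting bound (sizes arbitrary; for Gaussian matrices the expected norm is in fact at most √N + √n, so the constant is close to 1 for this ensemble):

```python
import numpy as np

rng = np.random.default_rng(4)
N, n = 400, 100

A = rng.standard_normal((N, n))   # Gaussian: one subgaussian ensemble
norm = np.linalg.norm(A, 2)       # operator norm = largest singular value

print(norm, np.sqrt(N) + np.sqrt(n))  # ||A|| vs sqrt(N) + sqrt(n)
```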
Conclusion
Using the ε-net method, we proved an upper bound for subgaussian matrices:

Theorem (Largest singular value)
Let A be an N × n subgaussian matrix. Then

λmax(A) = ‖A‖ ≤ C(√N + √n)

with exponentially large probability 1 − e^{−c(n+N)}.

For the lower bound, we will need to refine the ε-net method (Lecture 2).

```