# Approximate Nearest Neighbors and the Fast Johnson-Lindenstrauss Transform

Nir Ailon, Bernard Chazelle
(Princeton University)

## Dimension Reduction

- Metric embedding technique: (R^d, ℓq) → (R^k, ℓp), k << d
- Useful in algorithms requiring exponential (in d) time/space
- Johnson-Lindenstrauss: the case of ℓ2
- Algorithmic question: what is the exact complexity?

## Dimension Reduction Applications

- Approximate nearest neighbor [KOR00, IM98]
- Text analysis [PRTV98]
- Clustering [BOR99, S00]
- Streaming [I00]
- Linear algebra [DKM05, DKM06]
  - Matrix multiplication
  - SVD computation
  - ℓ2 regression
- VLSI layout design [V98]
- Learning [AV99, D99, V98]
- ...

## Three Quick Slides on: Approximate Nearest Neighbor Searching...

## Approximate Nearest Neighbor

- P = set of n points
- Given a query x, let pmin be its exact nearest neighbor in P
- Goal: return any p ∈ P with dist(x, p) ≤ (1+ε) dist(x, pmin)

(Figure: query x, exact nearest neighbor pmin, acceptable answer p.)

## Approximate Nearest Neighbor

- d can be very large; ε-approximation beats the "curse of dimensionality"
- [IM98, H01] (Euclidean), [KOR00] (cube):
  - Time: O(ε^-2 d log n)
  - Space: n^O(ε^-2)
- Bottleneck: dimension reduction
- Using FJLT: time O(d log d + ε^-3 log^2 n)

## The d-Hypercube Case

- [KOR00]: binary search on distance ℓ ∈ [d]
- For distance ℓ, multiply the space by a random matrix Φ ∈ Z_2^{k×d}, k = O(ε^-2 log n)
  - Φ_ij i.i.d. ~ biased coin
- Preprocess lookup tables for Φx (mod 2)
- Uses a "handle" to p ∈ P s.t. dist(x, p) ≈ ℓ
- Our observation: Φ can be made sparse
  - Time for each step: O(ε^-2 d log n) ⇒ O(d + ε^-2 log n)

How to make a similar improvement for ℓ2?
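The bit-sampling idea can be illustrated in a few lines of numpy. This is a toy sketch, not the full [KOR00] data structure: the per-coordinate bias 1/ℓ and all dimensions are assumptions chosen for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, ell = 1024, 512, 64

# Random sparse sketch matrix over Z_2: each entry is 1 with
# probability 1/ell (a biased coin), so each row touches ~d/ell coords.
Phi = (rng.random((k, d)) < 1.0 / ell).astype(np.uint8)

def sketch(x):
    return Phi.dot(x) % 2          # k parity bits

x = rng.integers(0, 2, d, dtype=np.uint8)
near = x.copy(); near[rng.choice(d, ell // 4, replace=False)] ^= 1  # dist ell/4
far = x.copy(); far[rng.choice(d, 2 * ell, replace=False)] ^= 1     # dist 2*ell

# The fraction of disagreeing parity bits grows with Hamming distance,
# so the k-bit sketches separate "closer than ell" from "farther than ell".
mism_near = np.mean(sketch(x) != sketch(near))
mism_far = np.mean(sketch(x) != sketch(far))
assert mism_near < mism_far
```

A row's parity flips with probability roughly (1 - (1 - 2/ℓ)^h)/2 for points at Hamming distance h, which is what makes the binary search on ℓ work.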

Back to Euclidean Space and Johnson-Lindenstrauss . . .

## History of Johnson-Lindenstrauss Dimension Reduction

- [JL84]: Φ = projection of R^d onto a random subspace of dimension k = c ε^-2 log n
- w.h.p.: ∀ p_i, p_j ∈ P: ||Φp_i - Φp_j||_2 = (1 ± O(ε)) ||p_i - p_j||_2
- An ℓ2 → ℓ2 embedding

## History of Johnson-Lindenstrauss Dimension Reduction

- [FM87], [DG99]: simplified proof, improved constant c
- Φ ∈ R^{k×d}: random orthogonal matrix, rows Φ_1, ..., Φ_k
  - ||Φ_i||_2 = 1, Φ_i · Φ_j = 0

## History of Johnson-Lindenstrauss Dimension Reduction

- [IM98]: Φ ∈ R^{k×d}, Φ_ij i.i.d. ~ N(0, 1/d)
  - E ||Φ_i||_2^2 = 1, E Φ_i · Φ_j = 0

## History of Johnson-Lindenstrauss Dimension Reduction

- [A03]: need only tight concentration of |Φ_i · v|^2
- Φ ∈ R^{k×d}: Φ_ij i.i.d. ~ { +d^-1/2 w.p. 1/2, -d^-1/2 w.p. 1/2 }
  - E ||Φ_i||_2^2 = 1, E Φ_i · Φ_j = 0

## History of Johnson-Lindenstrauss Dimension Reduction

- [A03]: Φ_ij i.i.d. ~ sparse distribution (up to scaling):
  { +1 w.p. 1/6, 0 w.p. 2/3, -1 w.p. 1/6 }
  - E ||Φ_i||_2^2 = 1, E Φ_i · Φ_j = 0

## Sparse Johnson-Lindenstrauss

- Sparsity parameter: s = Pr[ Φ_ij ≠ 0 ]
- s cannot be o(1), due to a "hidden coordinate":

  v = (0, ..., 0, 1, 0, ..., 0) ∈ R^d

(A sparse Φ almost surely misses the single nonzero coordinate of v, so |Φ_i · v|^2 cannot concentrate.)

## Uncertainty Principle

v sparse ⇒ v̂ = Hv dense

H = Walsh-Hadamard matrix:
- Fourier transform on {0,1}^{log_2 d}
- Computable in time O(d log d)
- Isometry: ||Hv||_2 = ||v||_2

But H is deterministic and invertible ⇒ we're back to square one!

Precondition: H with a random diagonal D = Diag(±1, ..., ±1)
- Computable in time O(d)
- Isometry
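The preconditioning step HD can be sketched with the standard iterative fast Walsh-Hadamard transform (the O(d log d) butterfly algorithm; variable names are mine):

```python
import numpy as np

def fwht(v):
    """Fast Walsh-Hadamard transform in O(d log d), d a power of 2.
    Normalized by 1/sqrt(d) so it is an isometry."""
    v = v.astype(float).copy()
    h = 1
    while h < len(v):
        for i in range(0, len(v), 2 * h):
            a = v[i:i + h].copy()
            b = v[i + h:i + 2 * h].copy()
            v[i:i + h] = a + b
            v[i + h:i + 2 * h] = a - b
        h *= 2
    return v / np.sqrt(len(v))

rng = np.random.default_rng(2)
d = 1024
D = rng.choice([-1.0, 1.0], d)       # random diagonal preconditioner

v = np.zeros(d); v[3] = 1.0          # the "hidden coordinate" vector
u = fwht(D * v)                      # HDv

assert np.isclose(np.linalg.norm(u), 1.0)              # isometry
assert np.max(np.abs(u)) <= 1 / np.sqrt(d) + 1e-12     # mass spread evenly
```

For a one-hot v, every coordinate of HDv has magnitude exactly d^-1/2: the sparse worst case becomes perfectly dense.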

## The ℓ∞-Bound Lemma

w.h.p.: ∀ p_i, p_j ∈ P ⊆ R^d:

  ||HD(p_i - p_j)||_∞ ≤ O(d^-1/2 log^1/2 n) · ||p_i - p_j||_2

This rules out HD(p_i - p_j) = "hidden coordinate vector"!! Instead...

## Hidden Coordinate-Set

Worst-case v = p_i - p_j, assuming the ℓ∞-bound (and ||v||_2 = 1):
- ∀ j ∈ J: |v_j| = Θ(d^-1/2 log^1/2 n)
- ∀ j ∉ J: v_j = 0
- J ⊆ [d], |J| = Θ(d / log n)

## Fast J-L Transform

FJLT = Φ H D

- D: Diag(±1)
- H: Hadamard
- Φ: sparse JL matrix, Φ_ij i.i.d. ~ { 0 w.p. 1-s, N(0,1) w.p. s }

Two regimes:
- ℓ2 → ℓ2: s = log^2 n / d; bottleneck: variance of |Φ_i · v|^2
- ℓ2 → ℓ1: s = ε^-1 log n / d; bottleneck: bias of |Φ_i · v|
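Putting the three matrices together gives a minimal end-to-end FJLT sketch. The N(0, 1/s) normalization of the nonzero entries and all constants are illustrative assumptions, not the paper's tuned parameters.

```python
import numpy as np

def fwht(v):
    # Normalized fast Walsh-Hadamard transform: O(d log d), isometry.
    v = v.astype(float).copy()
    h = 1
    while h < len(v):
        for i in range(0, len(v), 2 * h):
            a, b = v[i:i + h].copy(), v[i + h:i + 2 * h].copy()
            v[i:i + h], v[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return v / np.sqrt(len(v))

def fjlt(v, k, s, rng):
    """FJLT = Phi H D: random signs, Hadamard mixing, then a sparse
    Gaussian projection with sparsity s = Pr[Phi_ij != 0]."""
    d = len(v)
    D = rng.choice([-1.0, 1.0], d)
    u = fwht(D * v)
    mask = rng.random((k, d)) < s
    Phi = mask * rng.normal(0.0, np.sqrt(1.0 / s), (k, d))  # N(0,1/s) where nonzero
    return (Phi @ u) / np.sqrt(k)

rng = np.random.default_rng(3)
d, n, k = 1024, 32, 256
s = min(1.0, np.log(n)**2 / d)     # s ~ log^2(n)/d, the l2 -> l2 regime

v = np.zeros(d); v[0] = 1.0        # even the "hidden coordinate" vector works
est = np.linalg.norm(fjlt(v, k, s, rng))**2
assert abs(est - 1.0) < 0.5
```

Because H mixes v before the sparse projection, the sparse-Φ obstruction from the hidden-coordinate example disappears.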

## Applications

- Approximate nearest neighbor in (R^d, ℓ2)
- ℓ2 regression: minimize ||Ax - b||_2, A ∈ R^{n×d}, over-constrained (d << n)
  - [DMM06]: approximate by sampling; non-constructive
  - [Sarlos06]: using FJLT ⇒ constructive
- More applications...?
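The sketch-and-solve recipe for ℓ2 regression can be illustrated with a dense Gaussian sketch standing in for the FJLT; the dimensions and the 1.2 residual tolerance are assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, k = 2000, 10, 400

A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.1 * rng.standard_normal(n)

# Exact over-constrained least squares.
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

# Sketch-and-solve: compress the n rows with a JL-style matrix S
# (a dense Gaussian stand-in for the FJLT), solve the k x d problem.
S = rng.standard_normal((k, n)) / np.sqrt(k)
x_sk, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)

# The sketched solution's residual is close to the true optimum.
r_exact = np.linalg.norm(A @ x_exact - b)
r_sk = np.linalg.norm(A @ x_sk - b)
assert r_sk <= 1.2 * r_exact
```

With an FJLT in place of S, forming S·A costs roughly O(nd log n) instead of the O(knd) of a dense sketch, which is the point of [Sarlos06].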

## Interesting Problem I

Improvement & lower bound for J-L computation

## Interesting Problem II

- Dimension reduction is sampling
- Sampling by random walk:
  - Expander graphs for uniform sampling
  - Convex bodies for volume estimation
- [Kac59]: random walk on the orthogonal group:
  - for t = 1..T:
    - pick i, j ∈_R [d], θ ∈_R [0, 2π)
    - v_i ← v_i cos θ + v_j sin θ
    - v_j ← -v_i sin θ + v_j cos θ
  - Output (v_1, ..., v_k) as the dimension reduction of v
- How many steps T for a J-L guarantee? [CCL01], [DS00], [P99] ...
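The [Kac59] walk translates directly into numpy. T = 20·d below is an arbitrary guess; how large T must be for a J-L guarantee is exactly the open question.

```python
import numpy as np

def kac_walk(v, T, rng):
    """Kac's random walk: T random planar (Givens) rotations of v."""
    v = v.astype(float).copy()
    d = len(v)
    for _ in range(T):
        i, j = rng.choice(d, size=2, replace=False)
        theta = rng.uniform(0.0, 2.0 * np.pi)
        vi, vj = v[i], v[j]                      # simultaneous update:
        v[i] = vi * np.cos(theta) + vj * np.sin(theta)
        v[j] = -vi * np.sin(theta) + vj * np.cos(theta)
    return v

rng = np.random.default_rng(5)
d, k = 256, 32
v = np.zeros(d); v[0] = 1.0

w = kac_walk(v, 20 * d, rng)
assert np.isclose(np.linalg.norm(w), 1.0)   # every rotation is an isometry
est = np.linalg.norm(w[:k])**2 * d / k      # first k coordinates as the reduction
assert abs(est - 1.0) < 0.8
```

After enough steps the walk nearly uniformizes v on the sphere, so the first k coordinates carry about a k/d fraction of its squared norm.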

Thank You!
