# Algorithm by dfhdhdhdhjr

```
Re-ranking method based on inter-document distances

Instructors: Prof. 陳彥良, Prof. 許秉瑜

Presenter: 吳家齊
Outline
   Background
   Main conception
   Method
   An example
   Conclusion
Background & Motivation
   The best documents should be located as
close to the top of the list as possible.
   None of the existing models can
guarantee that relevant documents will
occupy the top positions in the list.
retrieval.
   One kind of additional information is
inter-document relationships.
Objective
   Proposing a new document re-ranking
method.
   It uses the distances between documents
for modifying initial relevance weights.
Main conception
   Similarities can be equivalently described
by means of distances.
   Documents strongly interrelated should
not be assigned very different weights.
   Metric space property:
|δ(d, d*) - δ(e, d)| ≤ δ(e, d*) ≤ δ(d, d*) + δ(e, d)
Main conception

[Figure: points d, e, and d*]
Main conception

δ(d, e) ≤ δ(e, d*) + δ(d, d*)

[Figure: e, d*, and d on a line, with segments δ(e, d*) and δ(d, d*) marked]
Main conception

δ(d, e) ≥ δ(e, d*) - δ(d, d*)

[Figure: e, d, and d* on a line, with segments δ(d, d*) and δ(e, d*) marked]
Method
   D = {d1, d2, …, dn} is a set of documents
returned in response to a query.
   Input
   distance vector c = [δ(d1, d*), δ(d2, d*), …, δ(dn, d*)]
   distance matrix D = [dij], where dij = δ(di, dj)
   maxerror > 0
   Output
   A better distance vector c’
Method
    Change matrix
|δ(d, d*) - δ(e, d)| ≤ δ(e, d*) ≤ δ(d, d*) + δ(e, d)
|ci - dij| ≤ cj ≤ ci + dij

         |ci - dij| - cj    if cj < |ci - dij|
Zi,j =   ci + dij - cj      if cj > ci + dij
         0                  if |ci - dij| ≤ cj ≤ ci + dij
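   Reading Zi,j (follows directly from the three cases):
   Zi,j > 0: cj is too small, and cj + Zi,j = |ci - dij| (the lower bound)
   Zi,j < 0: cj is too large, and cj + Zi,j = ci + dij (the upper bound)
   Zi,j = 0: the triangle inequality already holds for di, dj, and d*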
Method
   An example of change matrix

C = [2, 8, 3]

      0  5  7
D =   5  0  3
      7  3  0

      0  -1  +2
Z =  +1   0  +2
     +2  -2   0
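   Worked entries, applying the definition of Zi,j to C and D:
   Z1,2: c1 + d12 = 2 + 5 = 7 < c2 = 8, so Z1,2 = 7 - 8 = -1
   Z1,3: |c1 - d13| = |2 - 7| = 5 > c3 = 3, so Z1,3 = 5 - 3 = +2
   Z3,2: c3 + d32 = 3 + 3 = 6 < c2 = 8, so Z3,2 = 6 - 8 = -2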
An example
   query = data mining methods.
   A list produced by a dictionary
S = (association, classification, clustering, data,
method, mining, regression)
q = [0,0,0,1,1,1,0]
   Response documents D = {d1,d2,d3,d4}
d1=[0,0,0,2,1,2,0]
d2=[2,2,2,4,4,4,2]
d3=[0,1,2,1,2,2,1]
d4=[0,1,0,0,0,1,2]
An example
   Determine the relevance weights of
documents using the cosine coefficient
(Salton & McGill, 1983)
   r1 = sim(d1, q) = 0.96
   r2 = sim(d2, q) = 0.87
   r3 = sim(d3, q) = 0.75
   r4 = sim(d4, q) = 0.24
   r = [0.96, 0.87, 0.75, 0.24]
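   Worked check for d1, using the cosine coefficient:
   d1·q = 2 + 1 + 2 = 5, |d1| = √(4 + 1 + 4) = 3, |q| = √3
   r1 = 5 / (3·√3) ≈ 0.96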
An example
   Obtain the distance matrix using the
Hamming distance (computed here as the sum
of absolute coordinate differences)

d1 =   [0,   0,   0,   2,   1,   2,   0]
d2 =   [2,   2,   2,   4,   4,   4,   2]
d3 =   [0,   1,   2,   1,   2,   2,   1]
d4 =   [0,   1,   0,   0,   0,   1,   2]

       0  15   6   7
D =   15   0  11  16
       6  11   0   7
       7  16   7   0
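   Worked check for the entry d12 = δ(d1, d2), summing the coordinate-wise
   absolute differences of d1 and d2:
   |0-2| + |0-2| + |0-2| + |2-4| + |1-4| + |2-4| + |0-2|
   = 2 + 2 + 2 + 2 + 3 + 2 + 2 = 15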
An example
   Determine distances with the relevance
function
   relevance function: r = 1 – 0.03c
ri = 1 – 0.03ci
ci = (1 – ri) / 0.03

   c = [1.33, 4.33, 8.33, 25.33]
An example
   Produce initial change matrix with DWI
   c = [1.33, 4.33, 8.33, 25.33]
       0  15   6   7
D =   15   0  11  16
       6  11   0   7
       7  16   7   0

       0     9.34   -1    -17
Z =    9.34  0       0     -5
       1     0       0    -10
      17     5      10      0
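   Worked entries, applying the definition of Zi,j to c and D:
   Z1,4: c1 + d14 = 1.33 + 7 = 8.33 < c4 = 25.33, so Z1,4 = 8.33 - 25.33 = -17
   Z4,1: |c4 - d41| = |25.33 - 7| = 18.33 > c1 = 1.33, so Z4,1 = 18.33 - 1.33 = 17
   Z2,3: |c2 - d23| = 6.67 ≤ c3 = 8.33 ≤ c2 + d23 = 15.33, so Z2,3 = 0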
An example
   A new distance vector is determined
c’ = [10.25, 4.75, 9.08, 16.08]
   Using the relevance function
ri’ = 1 – 0.03ci’
new weights are calculated.
r’ = [0.69, 0.86, 0.73, 0.52]
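   Worked check: r1’ = 1 – 0.03 × 10.25 = 0.6925 ≈ 0.69
   Ranking by r: d1, d2, d3, d4
   Ranking by r’: d2, d3, d1, d4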
Conclusion
   This method takes the relationships among
all the documents in the answer into
account.
   The practical benefits of the proposed
method remain to be shown.