
```
Graph-Modeled Data Clustering:
Fixed-Parameter Algorithms for
Clique Generation

J. Gramm, J. Guo, F. Hüffner and R. Niedermeier
Theory of Computing Systems (2005)

Student: Vishal Kapoor
Presentation Outline
•   Problem Introduction
•   Past Research
•   Results of the paper
•   CLUSTER EDITING
– Kernelization
– Search Tree
• CLUSTER DELETION
• Questions
Problem Statement
• Make at most k changes to the edge set of an input
graph to obtain vertex-disjoint cliques.
• Each connected component of the resulting
cluster graph is a clique

• CLUSTER EDITING
– Both edge additions and deletions are allowed
• CLUSTER DELETION
– Only edge deletions are allowed

• Used in clustering of data – vertices are adjacent iff their
similarity exceeds a threshold
Past Research
• [2000] Study of both these problems was started by Shamir
et al., who proved that they are NP-complete and APX-hard
• [1996] Cai studied graph modification by edge additions,
edge deletions and vertex deletions for certain graph
properties and showed that it is FPT
• [2001] Natanzon et al. gave a general constant-factor
(c-)approximation for deletion and editing problems on
bounded-degree graphs, for graphs with certain properties
• [2002] Khot and Raman investigated the complexity of
vertex deletion problems to find subgraphs with hereditary
properties
Results of this paper
• CLUSTER EDITING – O(2.27^k + |V|^3)
• CLUSTER DELETION – O(1.77^k + |V|^3)

• By using certain reduction rules, the
resulting kernel size = O(k^3)
– Has at most 2k^2 + k vertices and 2k^3 + k^2
edges.
CLUSTER EDITING

[Figure: two vertices u and v with a common neighbor and a non-common neighbor]
Reduction Rules
•   Rule 1: For every pair of vertices u and v:
a. If u and v have more than k common
neighbors, then {u,v} is set to ADDED and
added to E if not already there
b. If u and v have more than k non-common
neighbors, then {u,v} is set to DELETED and
deleted from E if already there
c. If u and v have both more than k common
neighbors and more than k non-common
neighbors, then the instance has no solution
Reduction Rules
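•   Rule 2: For every 3 vertices u, v and w:
a. If {u,v} = ADDED and {u,w} = ADDED
then {v,w} should be set to ADDED and
added to E if not already present
b. If {u,v} = ADDED and {u,w} = DELETED
then {v,w} should be set to DELETED and
deleted from E if already present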
Running Time
• What is checked?
– Every pair of vertices
• Every vertex which is a neighbor of both of
them

• Takes time O(|V|^3)
Kernel Size
• The kernel contains at most (2k+1)·k
vertices and at most C(2k+1, 2)·k
edges, where C(n, 2) means "n choose 2".

• Proof Skipped
Branch and Search Algorithm
• Identify a bad triple (3 vertices) in the
kernel and repair it by adding/deleting
an edge among its vertices; repeat until the
graph consists of disjoint cliques
• Overall at most k edge additions/deletions
are allowed

• 2 branching strategies:
– Basic = O(3^k)
– Refined = O(2.27^k)
Basic Branching
• Lemma: A graph consists of disjoint cliques iff
there are no three vertices u,v,w such that {u,v},
{u,w} are edges, but {v,w} is not an edge
• i.e., any three vertices that are connected must form
a triangle; a path on three vertices is a bad triple
[Figure: bad triple with edges {u,v} and {u,w} but no edge {v,w}]
• Thus if a graph is not a union of disjoint cliques,
then a bad triple can be found and repaired
Basic Branch Algorithm
1. If G is a union of disjoint cliques, return
SUCCESS
2. If k <= 0, return FAIL
3. Otherwise, find 3 vertices u,v,w such that
edges {u,v}, {u,w} exist and {v,w} does not, and
branch into 3 instances G’ as follows:
a. E’ = E – {u,v}, k’ = k-1, and set {u,v} = DELETED
b. E’ = E – {u,w}, k’ = k-1, and set {u,w} = DELETED
c. E’ = E + {v,w}, k’ = k-1, and set all three edges = ADDED
Branching Rules
[Figure: a bad triple u, v, w and the three branches BR1, BR2 and BR3]
Running time
The algorithm solves CLUSTER EDITING in
time O(3^k · k^2 + |V|^3)

1. O(|V|^3) is the time required to find all bad
triples
2. O(3^k) is the size of the search tree
3. The kernel (modified input G’) has |V| = O(k^2)
vertices. So a newly added/deleted edge can
create/remove at most O(k^2) bad triples. [And
the edge list can then be updated only for
vertices affected by that edge in O(k^2) time.]

NOTE: The time can be improved to O(3^k + |V|^3)
by re-applying the kernelization at every
search tree node whenever possible; this works
for any polynomial-size problem kernel

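• Similarly, CLUSTER DELETION can be
solved in time O(2^k + |V|^3), since only the
two deletion branches apply
• For the refined branching, bad triples are still
considered, but their classification is refined
further into three cases:
[Figure: the three cases C1, C2 and C3 of a bad triple u, v, w]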
Branching for each case
• For C1: BR3 never gives a solution better than the best
of BR1 and BR2, so it can be omitted
[Figure: case C1 with vertices u, v, w and further neighbors u1, u2, v1, v2, w1, w2]

• If N(v) >= N(w) (comparing neighborhood sizes), then the
total edges changed to make 1 clique >= total edges changed
to make 2 cliques

•   Edges added to make 1 clique =
–   {v,N(w)} added minus {u,v} existing = N(v) – 1
–   {w,N(v)} added minus {u,w} existing = N(w) – 1
–   joining all N(w) and N(v) = C(N(w)+N(v), 2)
–   joining each N(v) and N(w) with u = N(v)+N(w)
–   Total = 2·[N(v) + N(w)] + C(N(w)+N(v), 2) – 1   => (A)

•   Edges changed to make 2 cliques =
–   N(w) deleted = N(w)
–   {v,N(w)} added minus {u,v} existing = N(v) – 1
–   joining all N(w) and N(v) = C(N(w)+N(v), 2)
–   joining each N(v) and N(w) with u = N(v)+N(w)
–   Total = N(v) + 3·N(w) + C(N(w)+N(v), 2) – 1   => (B)

•   Conclusion: since N(v) >= N(w), (A) >= (B).
• Thus only BR1 and BR2 need to be used:
[Figure: the two branches BR1 and BR2]

• So the resulting graphs are G\{u,v} or G\{u,w}, and the
branching vector is (1,1)

• The recurrence relation is T(k) = 2·T(k-1), with
characteristic root 2.
• So the search tree size for C1 is 2^k.
• For C2:

• Branching Vector = (1,2,3,2,3)
• For C3:

• Branching Vector = (1,2,3,2,3)
Overall Running Time
• Solve T(k) = T(k-1) + 2·[T(k-2) + T(k-3)]
• Its characteristic equation x^3 = x^2 + 2x + 2
has largest real root of about 2.27

• So the worst-case search tree size is O(2.27^k)

• Thus CLUSTER EDITING can be solved
in O(2.27^k + |V|^3)
• Cases for CLUSTER-DELETION:

• Branching Vector = (2,3,2,3), i.e. x^3 = 2x + 2 with
root of about 1.77, so the running time is
O(1.77^k + |V|^3)
Questions?
Thanks.

```
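The basic branching algorithm from the slides can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the names `find_bad_triple` and `cluster_editing` and the representation of edges as a frozenset of two-element frozensets are assumptions made here. It omits the kernelization and the ADDED/DELETED marks, and it returns the resulting edge set rather than SUCCESS.

```python
from itertools import combinations


def find_bad_triple(vertices, edges):
    """Return (u, v, w) with {u,v}, {u,w} in edges and {v,w} missing, else None."""
    adj = {x: set() for x in vertices}
    for e in edges:
        a, b = tuple(e)
        adj[a].add(b)
        adj[b].add(a)
    for u in vertices:
        # Any two non-adjacent neighbors of u form a bad triple with u.
        for v, w in combinations(sorted(adj[u]), 2):
            if w not in adj[v]:
                return u, v, w
    return None


def cluster_editing(vertices, edges, k):
    """Return the edge set of a cluster graph reachable with at most k
    edge changes, or None if the basic search tree finds none."""
    edges = frozenset(edges)
    triple = find_bad_triple(vertices, edges)
    if triple is None:
        return edges            # already a union of disjoint cliques
    if k <= 0:
        return None             # budget used up, but a bad triple remains
    u, v, w = triple
    uv, uw, vw = frozenset((u, v)), frozenset((u, w)), frozenset((v, w))
    # Three branches: delete {u,v}, delete {u,w}, or add {v,w}.
    for candidate in (edges - {uv}, edges - {uw}, edges | {vw}):
        result = cluster_editing(vertices, candidate, k - 1)
        if result is not None:
            return result
    return None
```

For example, on the path a-b-c the call `cluster_editing(["a", "b", "c"], path, 1)` returns a cluster graph, while the same call with `k = 0` returns `None`. Each recursion level spends one unit of the budget, so the search tree has at most 3^k leaves.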
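The search tree sizes quoted in the slides (2^k, 2.27^k, 1.77^k) follow from the branching vectors. A branching vector (d1, ..., dr) gives the recurrence T(k) = T(k-d1) + ... + T(k-dr), and its branching number is the root x > 1 of x^(-d1) + ... + x^(-dr) = 1. The sketch below finds that root by bisection; the function name `branching_number` and the tolerance parameter are assumptions of this example, and it expects a vector with at least two positive entries.

```python
def branching_number(vector, tol=1e-9):
    """Root x > 1 of sum(x**-d for d in vector) == 1, found by bisection."""
    def excess(x):
        return sum(x ** -d for d in vector) - 1.0

    lo, hi = 1.0, 2.0
    while excess(hi) > 0:       # widen the bracket until the sum drops to 1 or below
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if excess(mid) > 0:
            lo = mid            # sum still above 1: root lies to the right
        else:
            hi = mid
    return hi


for vec in [(1, 1), (1, 1, 1), (1, 2, 3, 2, 3), (2, 3, 2, 3)]:
    print(vec, round(branching_number(vec), 2))
```

The four vectors are the ones appearing in the slides: (1,1) for case C1, (1,1,1) for basic branching, (1,2,3,2,3) for cases C2 and C3 of CLUSTER EDITING, and (2,3,2,3) for CLUSTER DELETION.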