					Graph Modeled Data Clustering:
Fixed Parameter Algorithms for
      Clique Generation

 J. Gramm, J. Guo, F. Hüffner and R. Niedermeier
       Theory of Computing Systems (2005)

             Student: Vishal Kapoor
          Presentation Outline
•   Problem Introduction
•   Past Research
•   Results of the paper
•   CLUSTER EDITING
    – Kernelization
    – Search Tree
• CLUSTER DELETION
• Questions
            Problem Statement
• Make at most k changes to the edge set of an input
  graph to obtain a vertex-disjoint union of cliques.
• Each connected component is a clique in the
  resulting cluster graph

• CLUSTER EDITING
   – Both edge additions and deletions are allowed
• CLUSTER DELETION
   – Only edge deletions are allowed

• Used in clustering of data – vertices are adjacent iff their
  similarity exceeds a threshold
                 Past Research
• [2000] Study of both these problems started by Shamir et
  al., who proved that they are NP-complete and APX-hard
• [1996] Cai studied the problem of edge additions and
  deletions and vertex deletions for certain graphs and
  showed that it is fixed-parameter tractable (FPT)
• [2001] Natanzon et al. gave a general c-approximation for
  deletion and editing problems on bounded-degree graphs
  with certain properties
• [2002] Khot and Raman investigated the complexity of
  vertex deletion problems to find subgraphs with hereditary
  properties
        Results of this paper
• CLUSTER EDITING – O(2.27^k + |V|^3)
• CLUSTER DELETION – O(1.77^k + |V|^3)

• By using certain reduction rules, the
  resulting kernel size = O(k^3)
  – Has at most (2k+1)·k = 2k^2 + k vertices and 2k^3 + k^2
    edges.
    CLUSTER EDITING

[Figure: two vertices u and v, with a common neighbor and a non-common neighbor labeled]
             Reduction Rules
•   Rule 1: for every pair of vertices u and v:
    a. If u and v have more than k common
       neighbors, then {u,v} is set to ADDED and
       added to E if not already there
    b. If u and v have more than k non-common
       neighbors, then {u,v} is set to DELETED and
       deleted from E if already there
    c. If u and v have both more than k common
       neighbors and more than k non-common
       neighbors, then the instance has no solution
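Rule 1 can be sketched in Python as follows. This is a minimal illustration, not the paper's implementation. It assumes the graph is a dict mapping each vertex to its set of neighbors, and that a "non-common neighbor" of u and v is a vertex other than u and v that is adjacent to exactly one of them. The function name `apply_rule1` and the returned change counter are hypothetical; a caller would presumably charge the counted changes against the budget k.

```python
from itertools import combinations

def apply_rule1(adj, k):
    """One pass of Rule 1 over all vertex pairs.

    adj: dict vertex -> set of neighbors (undirected, not modified).
    Returns (new_adj, labels, changes), or None if some pair triggers
    case (c), i.e. the instance has no solution.
    """
    g = {x: set(ns) for x, ns in adj.items()}  # work on a copy
    labels = {}
    changes = 0
    for u, v in combinations(sorted(g), 2):
        common = g[u] & g[v]
        # Assumption: adjacent to exactly one of u, v, excluding u and v.
        non_common = (g[u] ^ g[v]) - {u, v}
        if len(common) > k and len(non_common) > k:
            return None  # case (c)
        if len(common) > k:  # case (a)
            labels[frozenset((u, v))] = "ADDED"
            if v not in g[u]:
                g[u].add(v)
                g[v].add(u)
                changes += 1
        elif len(non_common) > k:  # case (b)
            labels[frozenset((u, v))] = "DELETED"
            if v in g[u]:
                g[u].discard(v)
                g[v].discard(u)
                changes += 1
    return g, labels, changes
```

The scan touches every pair and intersects two neighbor sets per pair, which matches the cubic bound quoted for the reduction rules.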
             Reduction Rules
•   Rule 2: for every 3 vertices u, v and w:
    a. If {u,v} = ADDED and {u,w} = ADDED then
       {v,w} should be set to ADDED and added if
       not already in E
    b. If {u,v} = ADDED and {u,w} = DELETED
       then {v,w} should be set to DELETED and
       deleted from E if already present
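A possible sketch of Rule 2 as a label-propagation loop. It assumes labels are stored in a dict keyed by `frozenset` pairs with values "ADDED" or "DELETED". The function name `apply_rule2` and the contradiction check (returning False when an implied label clashes with an existing one) are illustrative additions, not taken from the slides. Updating the edge set E to match the labels is left to the caller.

```python
from itertools import permutations

def apply_rule2(vertices, labels):
    """Propagate ADDED/DELETED labels until nothing changes.

    labels: dict frozenset({x, y}) -> "ADDED" | "DELETED" (modified in place).
    Returns False if two implied labels contradict each other, else True.
    """
    changed = True
    while changed:
        changed = False
        for u, v, w in permutations(vertices, 3):
            uv = labels.get(frozenset((u, v)))
            uw = labels.get(frozenset((u, w)))
            if uv != "ADDED" or uw is None:
                continue
            # (a) ADDED + ADDED   -> {v,w} ADDED
            # (b) ADDED + DELETED -> {v,w} DELETED
            implied = uw
            vw = frozenset((v, w))
            if vw not in labels:
                labels[vw] = implied
                changed = True
            elif labels[vw] != implied:
                return False  # assumption: treat a clash as "no solution"
    return True
```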
             Running Time
• What is checked?
  – Every pair of vertices
     • Every vertex which is a neighbor of both of
       them


• Takes time O(|V|^3)
              Kernel Size
• The kernel contains at most (2k+1)·k
  vertices and at most (2k+1 choose 2)·k
  edges.

• Proof Skipped
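Expanding the two bounds makes the polynomial kernel size explicit. This is plain arithmetic on the stated expressions, not the skipped proof:

```latex
\begin{align*}
(2k+1)\cdot k &= 2k^2 + k, \\
\binom{2k+1}{2}\cdot k &= \frac{(2k+1)\,2k}{2}\cdot k = (2k+1)\,k^2 = 2k^3 + k^2 .
\end{align*}
```

So the kernel has O(k^2) vertices and O(k^3) edges.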
  Branch and Search Algorithm
• Identify a bad triple (of 3 vertices) in the
  kernel and repair it by adding/deleting
  edges to/from it, to transform the graph
  into disjoint cliques
• Overall at most k edge additions/deletions
  are allowed

• 2 branching strategies:
  – Basic = O(3^k)
  – Advanced = O(2.27^k)
             Basic Branching
• Lemma: A graph consists of disjoint cliques iff
  there are no three vertices u,v,w such that {u,v},
  {u,w} are edges, but {v,w} is not an edge
• i.e. any three vertices must span no edge, a
  single edge, or a triangle
[Figure: a bad triple: u is adjacent to both v and w, but {v,w} is not an edge]
• Thus if a graph is not a union of disjoint cliques,
  then a bad triple can be found and repaired
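The lemma gives a direct test for cluster graphs: search for a bad triple. A minimal sketch, assuming the graph is a dict from vertex to neighbor set with sortable vertex names (the function name `find_bad_triple` is hypothetical):

```python
from itertools import combinations

def find_bad_triple(adj):
    """Return (u, v, w) with edges {u,v}, {u,w} but no edge {v,w}, or None.

    By the lemma, None means the graph is a disjoint union of cliques.
    """
    for u in adj:
        for v, w in combinations(sorted(adj[u]), 2):
            if w not in adj[v]:
                return (u, v, w)
    return None
```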
       Basic Branch Algorithm
1. If G is a union of disjoint cliques, return
   SUCCESS
2. If k <= 0, return FAIL
3. Otherwise, find 3 vertices u,v,w such that
   edges {u,v}, {u,w} exist and {v,w} does not and
   branch on 3 instances of G’ as follows:
  a. E’ = E – {u,v}, k’=k-1 and set {u,v}=DELETED
  b. E’ = E – {u,w}, k’=k-1 and set {u,w} and
     {v,w}=DELETED, {u,v}=ADDED
  c. E’ = E + {v,w}, k’=k-1 and set all edges=ADDED
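The three-way branching can be sketched as a recursive search. This is a simplified illustration under stated assumptions: it omits the ADDED/DELETED bookkeeping from steps a to c (so a pair may be edited again deeper in the tree), it works on the graph directly rather than on a kernel, and the names `cluster_editing`, `find_bad_triple` and `allow_add` are hypothetical. Setting `allow_add=False` drops the edge-addition branch, which gives a search for CLUSTER DELETION.

```python
from itertools import combinations

def find_bad_triple(adj):
    """Return (u, v, w) with edges {u,v}, {u,w} and no edge {v,w}, or None."""
    for u in adj:
        for v, w in combinations(sorted(adj[u]), 2):
            if w not in adj[v]:
                return (u, v, w)
    return None

def set_edge(adj, a, b, present):
    """Insert or remove the undirected edge {a,b}."""
    if present:
        adj[a].add(b)
        adj[b].add(a)
    else:
        adj[a].discard(b)
        adj[b].discard(a)

def cluster_editing(adj, k, allow_add=True):
    """True iff at most k edge changes turn adj into disjoint cliques."""
    triple = find_bad_triple(adj)
    if triple is None:
        return True   # step 1: already a cluster graph
    if k <= 0:
        return False  # step 2: budget exhausted
    u, v, w = triple
    # step 3: delete {u,v}, delete {u,w}, or add {v,w}
    branches = [(u, v, False), (u, w, False)]
    if allow_add:
        branches.append((v, w, True))
    for a, b, present in branches:
        set_edge(adj, a, b, present)
        ok = cluster_editing(adj, k - 1, allow_add)
        set_edge(adj, a, b, not present)  # undo before the next branch
        if ok:
            return True
    return False
```

Each call spawns at most three recursive calls and decreases k by one, so the search tree has at most 3^k leaves (2^k when additions are disabled).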
          Branching Rules
[Figure: a bad triple u, v, w and the three resulting branches BR1, BR2 and BR3]
              Running time
The algorithm solves CLUSTER EDITING in
   time = O(3^k · k^2 + |V|^3)

  1. O(|V|^3) is the time required to find all bad
     triples
  2. O(3^k) is the size of the search tree
  3. The kernel (modified input G’) has |V| = O(k^2)
     vertices. So a newly added/deleted edge can
     create or destroy at most O(k^2) bad triples. [And
     the edge list can then be updated only for
     vertices affected by that edge in O(k^2) time.]

  NOTE: The time can be improved to O(3^k + |V|^3)
   by repeating the kernelization at every
   search tree node whenever possible, which works for
   any polynomial-size problem kernel

• Similarly, CLUSTER DELETION can be
  solved in time = O(2^k + |V|^3), since only the
  two deletion branches remain
   Advanced Branch Algorithm
1. Bad triples are considered, but their
   classification is refined further as follows:
[Figure: the three cases C1, C2 and C3 of a bad triple u, v, w]
        Branching for each case
• For C1: BR3 cannot give a solution better than both BR1
  and BR2 and can be omitted
[Figure: case C1, showing the bad triple u, v, w and the results of splitting it into two cliques or merging it into one]

• If N(v) >= N(w), then total edges changed to make 1 clique
  >= total edges changed to make 2 cliques


•   Edges added to make 1 clique =
     –   {v,w} added = +1
     –   {v,N(w)} added – {u,v} existing = N(v) – 1
     –   {w,N(v)} added – {u,w} existing = N(w) – 1
     –   joining all N(w) and N(v) = ([N(w)+N(v)] choose 2)
     –   joining each N(v) and N(w) with u = N(v)+N(w)
     –   Total = 2.[N(v) + N(w)] + ([N(w)+N(v)] choose 2) – 1 =>(A)

•   Edges changed to make 2 cliques =
     –   N(w) deleted = N(w)
     –   {v,N(w)} added – {u,v} existing = N(v) – 1
     –   joining all N(w) and N(v) = ([N(w)+N(v)] choose 2)
     –   joining each N(v) and N(w) with u = N(v)+N(w)
     –   Total = N(v) + 3.N(w) + ([N(w)+N(v)] choose 2) – 1 =>(B)

•   Conclusion: Since N(v) >= N(w), (A) >= (B).
• Thus only BR1 and BR2 can be used:
[Figure: the two remaining branches BR1 and BR2 for the triple u, v, w]

• So resulting graphs = G\{u,v} or G\{u,w} and branching
  vector = (1,1)

• And final recurrence relation: T(k) = 2·T(k-1) with root =
  2.
• So final tree size for C1 = 2^k.
• For C2:




• Branching Vector = (1,2,3,2,3)
• For C3:




• Branching Vector = (1,2,3,2,3)
       Overall Running Time
• Solve T(k) = T(k-1) + 2 [T(k-2) + T(k-3)]

• So final worst search tree size = O(2.27^k)

• Thus CLUSTER-EDITING can be solved
  in O(2.27^k + |V|^3)
• Cases for CLUSTER-DELETION:




• Branching Vector = (2,3,2,3) and running
  time = O(1.77^k + |V|^3)
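The bases 2, 2.27 and 1.77 can be checked numerically from the branching vectors. A branching vector (d1, ..., dr) corresponds to the recurrence T(k) = T(k-d1) + ... + T(k-dr), and the search tree size is O(x^k), where x > 1 is the root of x^(-d1) + ... + x^(-dr) = 1. The sketch below finds that root by bisection; the function name `branching_number` and the tolerance are illustrative choices.

```python
def branching_number(vector, tol=1e-9):
    """Root x > 1 of sum(x**-d for d in vector) == 1, found by bisection.

    Assumes the vector has at least two entries, all of them >= 1.
    """
    def excess(x):
        return sum(x ** -d for d in vector) - 1.0

    lo, hi = 1.0, float(len(vector)) + 1.0  # excess(lo) > 0 > excess(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if excess(mid) > 0:
            lo = mid  # the sum is still too large, so the root is above mid
        else:
            hi = mid
    return (lo + hi) / 2.0

for vec in [(1, 1), (1, 2, 3, 2, 3), (2, 3, 2, 3)]:
    print(vec, round(branching_number(vec), 4))
```

The roots for (1,2,3,2,3) and (2,3,2,3) lie just below 2.27 and 1.77, so the slides' figures are these values rounded up.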
Questions?
 Thanks.

				