Graph Modeled Data Clustering: Fixed Parameter Algorithms for Clique Generation
J. Gramm, J. Guo, F. Hüffner and R. Niedermeier
Theory of Computing Systems (2005)
Student: Vishal Kapoor

Presentation Outline
• Problem Introduction
• Past Research
• Results of the paper
• CLUSTER EDITING
  – Kernelization
  – Search Tree
• CLUSTER DELETION
• Questions

Problem Statement
• Make at most k changes to the edge set of an input graph to get vertex-disjoint cliques.
• Each connected component is a clique in the resulting cluster graph.
• CLUSTER EDITING
  – Both edge additions and deletions are allowed
• CLUSTER DELETION
  – Only edge deletions are allowed
• Used in clustering of data
  – Vertices are adjacent iff their similarity exceeds a threshold

Past Research
• [2000] Study of both problems started by Shamir et al., who proved that they are NP-complete and APX-hard
• [1996] Cai studied edge addition, edge deletion and vertex deletion problems for certain graphs and showed that they are FPT
• [2001] Natanzon et al. gave a general constant-factor approximation for deletion and editing problems on bounded-degree graphs, for graphs with certain properties
• [2002] Khot and Raman investigated the complexity of vertex deletion problems to find subgraphs with hereditary properties

Results of this paper
• CLUSTER EDITING: O(2.27^k + |V|^3)
• CLUSTER DELETION: O(1.77^k + |V|^3)
• By using certain reduction rules, the resulting kernel size = O(k^3)
  – Has at most 2k^2 + k vertices and 2k^3 + k^2 edges.

CLUSTER EDITING
[Figure: two vertices u and v with a common neighbor and a non-common neighbor]

Reduction Rules
• Rule 1:
  a. If u and v have more than k common neighbors, then {u,v} is set to ADDED and added to E if not already there
  b. If u and v have more than k non-common neighbors, then {u,v} is set to DELETED and deleted from E if already there
  c. If u and v have both more than k common neighbors and more than k non-common neighbors, then the instance has no solution
• Rule 2: For every three vertices u, v and w:
  a. If {u,v} = ADDED and {u,w} = ADDED, then {v,w} should be set to ADDED and added to E if not already there
  b. If {u,v} = ADDED and {u,w} = DELETED, then {v,w} should be set to DELETED and deleted from E if already present

Running Time
• What is checked?
  – Every pair of vertices
    • Every vertex which is a neighbor of both of them
• Takes time O(|V|^3)

Kernel Size
• The kernel contains at most (2k+1)·k vertices and at most (2k+1 choose 2)·k edges.
• Proof skipped

Branch and Search Algorithm
• Identify a bad triple (of three vertices) in the kernel and repair it by adding or deleting edges, to transform the graph into disjoint cliques
• Overall, at most k edge additions/deletions are allowed
• Two branching strategies:
  – Basic: O(3^k)
  – Advanced: O(2.27^k)

Basic Branching
• Lemma: A graph consists of disjoint cliques iff there are no three vertices u, v, w such that {u,v} and {u,w} are edges but {v,w} is not an edge
• i.e. any three vertices must have no edge, a single edge or a triangle among them, never exactly two edges
[Figure: a bad triple u, v, w]
• Thus if a graph is not a union of disjoint cliques, a bad triple can be found and repaired

Basic Branch Algorithm
1. If G is a union of disjoint cliques, return SUCCESS
2. If k <= 0, return FAIL
3. Otherwise, find three vertices u, v, w such that edges {u,v} and {u,w} exist and {v,w} does not, and branch on three instances G' as follows:
  a. E' = E - {u,v}, k' = k-1, and set {u,v} = DELETED
  b. E' = E - {u,w}, k' = k-1, and set {u,w} and {v,w} = DELETED, {u,v} = ADDED
  c. E' = E + {v,w}, k' = k-1, and set {u,v}, {u,w} and {v,w} = ADDED

Branching Rules
[Figure: the bad triple u, v, w and the three branches BR1 (delete {u,v}), BR2 (delete {u,w}) and BR3 (add {v,w})]

Running time
The algorithm solves CLUSTER EDITING in time O(3^k · k^2 + |V|^3)
1. O(|V|^3) is the time required to find all bad triples
2. O(3^k) is the size of the search tree
3. The kernel (modified input G') has |V| = O(k^2) vertices, so a newly added or deleted edge can create or remove at most O(k^2) bad triples. [The edge list can then be updated only for the vertices affected by that edge, in O(k^2) time.]
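The basic branching algorithm above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `find_bad_triple` and `cluster_editing` are hypothetical, edges are stored as a set of two-element frozensets, the ADDED/DELETED marks and the kernelization are left out, and the bad triple is recomputed from scratch at every node, so only the O(3^k) search tree shape matches the slides.

```python
from itertools import combinations


def find_bad_triple(vertices, edges):
    """Return (u, v, w) with {u,v} and {u,w} in edges but {v,w} missing, else None."""
    adj = {x: set() for x in vertices}
    for e in edges:
        a, b = tuple(e)
        adj[a].add(b)
        adj[b].add(a)
    for u in vertices:
        # Any two non-adjacent neighbors of u form a bad triple with u.
        for v, w in combinations(sorted(adj[u]), 2):
            if w not in adj[v]:
                return u, v, w
    return None


def cluster_editing(vertices, edges, k):
    """Return an edge set of disjoint cliques reachable with at most k edits, else None."""
    triple = find_bad_triple(vertices, edges)
    if triple is None:
        return edges  # step 1: already a union of disjoint cliques
    if k <= 0:
        return None  # step 2: budget exhausted
    u, v, w = triple
    uv, uw, vw = frozenset((u, v)), frozenset((u, w)), frozenset((v, w))
    # Step 3: the three branches (delete {u,v}, delete {u,w}, add {v,w}).
    for modified in (edges - {uv}, edges - {uw}, edges | {vw}):
        result = cluster_editing(vertices, modified, k - 1)
        if result is not None:
            return result
    return None


if __name__ == "__main__":
    V = [1, 2, 3]
    E = frozenset({frozenset((1, 2)), frozenset((2, 3))})  # the path 1-2-3
    print(cluster_editing(V, E, 0))  # no solution without edits
    print(cluster_editing(V, E, 1))  # some cluster graph one edit away
```

Without the ADDED/DELETED marks a branch may revisit a pair it has already changed; the search stays correct but does more work than the marked version described in the slides.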
NOTE: The time can be improved to O(3^k + |V|^3) by repeating the kernelization at every search tree node whenever possible; this works for any polynomial-size problem kernel
• Similarly, CLUSTER DELETION can be solved in time O(2^k + |V|^3)

Advanced Branch Algorithm
1. Bad triples are considered, but their classification is refined further into three cases, C1, C2 and C3:
[Figure: the three cases C1, C2 and C3 for a bad triple u, v, w]

Branching for each case
• For C1: BR3 cannot give a solution better than both BR1 and BR2 and can be omitted
[Figure: case C1, with u, v, w and further neighbors u1, u2, v1, v2, w1, w2]
• If N(v) >= N(w), then the total number of edges changed to make one clique >= the total number of edges changed to make two cliques (N(x) is used here for the number of neighbors of x)
• Edges changed to make one clique:
  – {v,w} added = +1
  – {v,N(w)} added, minus {u,v} existing = N(w) - 1
  – {w,N(v)} added, minus {u,w} existing = N(v) - 1
  – joining all N(w) and N(v) = ([N(w)+N(v)] choose 2)
  – joining each N(v) and N(w) with u = N(v) + N(w)
  – Total = 2·[N(v) + N(w)] + ([N(w)+N(v)] choose 2) - 1 => (A)
• Edges changed to make two cliques:
  – N(w) deleted = N(w)
  – {v,N(w)} added, minus {u,v} existing = N(w) - 1
  – joining all N(w) and N(v) = ([N(w)+N(v)] choose 2)
  – joining each N(v) and N(w) with u = N(v) + N(w)
  – Total = N(v) + 3·N(w) + ([N(w)+N(v)] choose 2) - 1 => (B)
• Conclusion: since N(v) >= N(w), (A) - (B) = N(v) - N(w) >= 0, so (A) >= (B).
• Thus only BR1 and BR2 need to be used:
[Figure: branches BR1 and BR2]
• So the resulting graphs are G - {u,v} or G - {u,w}, and the branching vector is (1,1)
• Final recurrence relation: T(k) = 2·T(k-1), with root 2
• So the final tree size for C1 is 2^k
• For C2:
  – Branching vector = (1,2,3,2,3)
• For C3:
  – Branching vector = (1,2,3,2,3)

Overall Running Time
• Solve T(k) = T(k-1) + 2·[T(k-2) + T(k-3)]
• So the final worst-case search tree size is O(2.27^k)
• Thus CLUSTER EDITING can be solved in O(2.27^k + |V|^3)
• Cases for CLUSTER DELETION:
  – Branching vector = (2,3,2,3) and running time = O(1.77^k + |V|^3)

Questions?
Thanks.
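The bases 2, 2.27 and 1.77 quoted above are the branching numbers of the corresponding branching vectors. For a vector (d1, ..., dr), the recurrence T(k) = T(k-d1) + ... + T(k-dr) grows like x^k, where x > 1 is the root of x^(-d1) + ... + x^(-dr) = 1. The sketch below finds that root by bisection; the function name `branching_number` and the tolerance are assumptions made for illustration, and the vector is assumed to have at least two entries.

```python
def branching_number(vector, tol=1e-9):
    """Approximate the root x > 1 of sum(x**-d for d in vector) == 1 by bisection."""
    def excess(x):
        # Positive while x is below the root, negative above it
        # (the sum is strictly decreasing in x for x > 0).
        return sum(x ** (-d) for d in vector) - 1.0

    lo, hi = 1.0, 2.0
    while excess(hi) > 0:  # grow the bracket until it contains the root
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    for name, vec in [
        ("basic CLUSTER EDITING", (1, 1, 1)),
        ("case C1", (1, 1)),
        ("cases C2 and C3", (1, 2, 3, 2, 3)),
        ("CLUSTER DELETION", (2, 3, 2, 3)),
    ]:
        print(name, vec, round(branching_number(vec), 2))
```

Rounded to two decimals, the vectors (1,2,3,2,3) and (2,3,2,3) give 2.27 and 1.77, matching the search tree sizes stated in the slides.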
