Proximity Oblivious Testing
Oded Goldreich, Weizmann Institute of Science
Joint work with Dana Ron

Property Testing: informal definition
A relaxation of a decision problem: For a fixed property P and any object O, determine whether O has property P or is far from having property P (i.e., O is far from any object having P).
Focus: sub-linear time algorithms, which perform the task by inspecting the object at few locations.

Property Testing: the standard (one-sided error) def’n
A property P = ∪_n P_n, where P_n is a set of functions with domain D_n.
The tester T gets explicit input n and ε, and oracle access to a function f with domain D_n.
• If f ∈ P_n then Prob[T^f(n,ε) accepts] = 1.
• If f is ε-far from P_n then Prob[T^f(n,ε) rejects] > 2/3.
(Distance is defined as the fraction of disagreements.)
Focus: query complexity q(n,ε)=q(ε) ( « |D_n| )
Terminology: ε is called the proximity parameter.

How does a tester use the proximity parameter?
Some testers use the proximity parameter merely in order to determine the number of times that a basic test is performed, where the basic test is oblivious of the proximity parameter. We call such basic tests proximity oblivious testers.
Example: the BLR (linearity) tester. On input (prox. par.) ε and oracle f, repeat the following test O(1/ε) times:
1. Select uniformly x,y in D_n.
2. Accept iff f(x)+f(y)=f(x+y).
(The tester accepts iff all repetitions accept.)

Proximity Oblivious Testing: the basic definition
A property P = ∪_n P_n, where P_n is a set of functions with domain D_n.
A P.O. Tester (POT) gets explicit input n (but not ε), and oracle access to a function f with domain D_n.
• If f ∈ P_n then Prob[T^f(n) accepts] = 1.
• If f ∉ P_n then Prob[T^f(n) rejects] > ρ(δ_P(f)), where ρ: (0,1] → (0,1] is the “detection rate” and δ_P(f) denotes the distance of f from P.
N.B.: A standard tester is obtained by repeating the POT (i.e., on prox. par. ε, repeat O(1/ρ(ε)) times).
Focus: constant query complexity q(n)=q ( « |D_n| )

Questions addressed in this work
1. Which “testable” properties have POTs?
2.
How does the complexity of the standard tester obtained by repeating the POT compare to the complexity of the best possible standard tester?
These questions are studied mainly in two standard models of testing graph properties: (i) the adjacency matrix model and (ii) the bounded-degree model.
Example: the BLR (linearity) tester. The complexity of the (std.) tester obtained by repeating the POT equals (up to a constant) the complexity of the best possible standard tester.

PART 1: In the adjacency matrix model
A graph G=(V,E) is represented by a function g:[N]×[N]→{0,1} (i.e., g(u,v)=1 iff (u,v) is an edge in G).
This (representation) determines:
1. The type of queries: adjacency queries
2. The distance measure: #differences/N²

The adjacency matrix model: two simple examples
A graph G=(V,E) is represented by a function g:[N]×[N]→{0,1}.
Example 1: Clique. The property of being a clique has a “trivial” single-query POT with ρ(δ)=δ.
Example 2: BiClique. The property of being a biclique has a three-query POT with ρ(δ)=δ. Select s∈[N] arbitrarily, and random u,v∈[N], and accept iff the induced subgraph is a biclique (i.e., has an even number of edges).

Example 2: analysis of the 3-query POT
Select s∈[N] arbitrarily, and random u,v∈[N], and accept iff the induced subgraph is a biclique (i.e., has an even number of edges).
Analysis technique: s induces a partition, u and v check it. The partition is (Γ(s), [N] \ Γ(s)), where Γ(s) denotes the set of neighbors of s.
Suppose that the graph is δ-far from Biclique. Then
#edges in same side + #non-edges between sides > δN²
• If (u,v) is an edge within the same side, then the induced subgraph has 1 or 3 edges.
• If (u,v) is a non-edge between the two sides, then the induced subgraph has a single edge.

Example 3: triangle-freeness [AFKS, Alon]
THM: Δ-freeness has a 3-query POT with ρ(δ)=1/Tower(1/δ), but no O(1)-query POT with ρ(δ)=poly(δ).
The point is that being δ-far from Δ-freeness means that δN² edges must be omitted to obtain a Δ-free graph, but this does not mean that the graph has δN³ (nor poly(δ)N³) triangles.
Conclusion: easy testability and POT-ness are “far from straightforward”.
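The three-query test of Example 2, and the standard tester obtained by repeating it, can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: the names `biclique_pot`, `standard_tester`, and the two example oracles are hypothetical, vertices are numbered 0..n-1, the pivot s is fixed to vertex 0, and samples in which s, u, v are not distinct are simply accepted (an assumption of this sketch).

```python
import random
from typing import Callable

# Adjacency oracle g(u, v): True iff (u, v) is an edge (assumed symmetric).
Adjacency = Callable[[int, int], bool]


def biclique_pot(g: Adjacency, n: int, rng: random.Random, s: int = 0) -> bool:
    """One run of the three-query test; returns True for accept."""
    u = rng.randrange(n)
    v = rng.randrange(n)
    if len({s, u, v}) < 3:
        # Degenerate sample: accepted by assumption in this sketch.
        return True
    # Three adjacency queries on the subgraph induced by {s, u, v}.
    edges = int(g(s, u)) + int(g(s, v)) + int(g(u, v))
    # A 3-vertex induced subgraph is a biclique iff it has 0 or 2 edges.
    return edges % 2 == 0


def standard_tester(g: Adjacency, n: int, repetitions: int,
                    rng: random.Random) -> bool:
    """Repeat the proximity-oblivious test; accept iff every run accepts."""
    return all(biclique_pot(g, n, rng) for _ in range(repetitions))


def biclique_oracle(u: int, v: int) -> bool:
    """Complete bipartite graph with sides {0..39} and {40..}."""
    return (u < 40) != (v < 40)


def clique_oracle(u: int, v: int) -> bool:
    """Complete graph (far from being a biclique)."""
    return u != v


if __name__ == "__main__":
    rng = random.Random(0)
    print(standard_tester(biclique_oracle, 100, 200, rng))
    print(standard_tester(clique_oracle, 100, 50, rng))
```

Each run makes three adjacency queries regardless of the proximity parameter; only the number of repetitions passed to `standard_tester` would depend on it.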
Example 4: testing bipartiteness
Recall that Bipartiteness is efficiently testable with poly(1/ε) queries.
THM: Bipartiteness has no O(1)-query POT.
PF: A graph can be δ-far from Bipartiteness while all its O(1)-vertex induced subgraphs are bipartite. E.g., an odd-length super-cycle consisting of Θ(1/√δ) (equal-sized) independent sets such that each adjacent pair of sets is connected by a complete bipartite graph.
Conclusion: easily testable properties may not have POTs.

Characterization of graph properties having a POT
THM (oversimplified): Property P has an O(1)-query POT iff P equals the set of F-free graphs, where F is a fixed set of O(1)-size graphs.
PF idea: Given a POT, we derive a canonical POT (à la [GT]), which yields a characterization of P in terms of forbidden subgraphs (equiv., allowed induced subgraphs). In the other direction, use [AFKS].
Clarification: For a set of graphs F and a graph G, we say that G is F-free if no induced subgraph of G belongs to F.
THM (actual): Property P = ∪_N P_N has an O(1)-query POT iff for some constant c and every N, it holds that P_N equals the set of F_N-free graphs, where F_N is a set of c-size graphs.

Example 5: testing Clique Collection (CC)
Recall that CC is efficiently testable with Õ(1/ε) queries [GR], and even Õ(ε^(-4/3)) non-adaptive queries suffice.
THM: CC has a 3-query POT with ρ(δ)=O(δ²), and no O(1)-query POT can do better.
PF (of the lower bound): Consider a collection of 1/(4δ) balanced bicliques, each of size 4δN. This graph is δ-far from CC, while rejecting it requires hitting some biclique at least three times.
Conclusion: The (std.) tester obtained by repeating the best POT may have significantly higher complexity than the standard tester.

Example 6: testing c-Clique Collection (c-CC)
Recall that c-CC is testable with Õ(1/ε) queries [GR], even non-adaptively!
THM: For every c≥2, the property c-CC has a (c+1)-query POT with ρ(δ)=O(δ^(c/2)), and no O(1)-query POT can do better.
PF (of the lower bound): Consider a graph consisting of c small cliques, each of size √δ·N, and one large clique of size (1-c√δ)N. This graph is δ-far from c-CC, while rejecting it requires hitting each of the c small cliques.
Conclusion: The (std.) tester obtained by repeating the best POT may have tremendously higher complexity than the standard tester.

PART 2: In the bounded-degree model
A graph G=(V,E) of degree bound d is represented by a function g:[N]×[d]→[N]∪{0} (i.e., g(u,i)=v iff v is the ith neighbor of u in G, and g(u,i)=0 iff u has fewer than i neighbors).
This (representation) determines:
1. The type of queries: incidence queries
2. The distance measure: #differences/dN

The bounded-degree model: preliminaries to the characterization
Augment the definition of (induced) subgraph freeness by referring to the non-existence of external edges that are incident at certain (marked) vertices.
E.g., standard triangle freeness (a triangle with unmarked vertices) vs. no isolated triangles (a triangle with marked vertices).
DEF (generalized subgraph freeness): The specified graphs should not appear as induced subgraphs unless some marked vertex has an external neighbor.
E.g., this can express degree upper bounds.

Generalized subgraph freeness: non-propagation
DEF (generalized subgraph freeness): The specified graphs should not appear as induced subgraphs unless some marked vertex has an external neighbor.
Def: F is non-propagating if there exists τ:(0,1]→(0,1] such that if some vertex set B covers all occurrences in G of graphs in F, then G is τ(|B|/N)-close to being F-free.
• Not all sets F are non-propagating.
• For any F with no marked vertices, F is non-propagating.
• Degree-regularity is captured by a non-propagating F. Note that this is a non-hereditary property.

The bounded-degree model: characterization
Def: F is non-propagating if there exists τ:(0,1]→(0,1] such that if some vertex set B covers all occurrences in G of graphs in F, then G is τ(|B|/N)-close to being F-free.
• Not all sets F are non-propagating.
• For any F with no marked vertices, F is non-propagating.
• Degree-regularity is captured by a non-propagating F. Note that this is a non-hereditary property.
THM (over. sim.): A property P has an O(1)-query POT iff for some non-propagating F it holds that P equals F-freeness.
OPEN: Can every generalized subgraph freeness property be captured by F-freeness for some non-propagating F?

Other Models (of property testing)
THM: If property P is testable by a non-adaptive tester that (i) makes a number of queries that only depends on the proximity parameter and (ii) rejects based on a constant-sized “refutation”, then P has a POT.
Note: strong codeword tests (cf. [GS]) correspond to POTs.
OPEN: Do codes of 1/polylog rate have O(1)-query codeword POT? The codeword tester of [BS]+[D] is not strong.

The End
The slides of this talk are available at http://www.wisdom.weizmann.ac.il/~oded/T/pot.ppt
The paper itself is available at http://www.wisdom.weizmann.ac.il/~oded/p_testPOT.html
A companion paper is available at http://www.wisdom.weizmann.ac.il/~oded/p_testAA.html

On the companion paper “Algorithmic Aspects of Property Testing in the Dense Graphs Model”
THM [GT]: If a graph property is testable by q(N,ε) queries then it is testable by a canonical tester of query complexity O(q(N,ε)²).
A canonical tester inspects a random induced subgraph and accepts iff the inspected graph has a predetermined property.
Me (since 2001): “In this model, there is no room for algorithms -- property testing reduces to sheer combinatorics.”
Me (now): A finer examination (which cares for the quadratic blow-up) reveals the role of algorithms; as shown in the paper, adaptive algorithms outperform non-adaptive ones, which in turn outperform canonical testers.
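The notion of a canonical tester in the adjacency matrix model can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the names `canonical_tester` and `is_clique_collection` are hypothetical, vertices are numbered 0..n-1, and the predetermined property used here is "no induced path on three vertices", which holds exactly for collections of isolated cliques.

```python
import random
from itertools import combinations
from typing import Callable, FrozenSet, Tuple

Adjacency = Callable[[int, int], bool]
EdgeSet = FrozenSet[Tuple[int, int]]


def canonical_tester(g: Adjacency, n: int, k: int,
                     accepts_subgraph: Callable[[int, EdgeSet], bool],
                     rng: random.Random) -> bool:
    """Inspect the subgraph induced by k random vertices (k*(k-1)/2 queries)
    and accept iff it satisfies the predetermined property."""
    vertices = rng.sample(range(n), k)  # k distinct random vertices
    # Relabel the sampled vertices as 0..k-1 and record the induced edges.
    edges = frozenset(
        (i, j)
        for i, j in combinations(range(k), 2)
        if g(vertices[i], vertices[j])
    )
    return accepts_subgraph(k, edges)


def is_clique_collection(k: int, edges: EdgeSet) -> bool:
    """True iff no three vertices span exactly two edges (no induced path)."""
    def adjacent(i: int, j: int) -> bool:
        return (min(i, j), max(i, j)) in edges

    for a, b, c in combinations(range(k), 3):
        if adjacent(a, b) + adjacent(a, c) + adjacent(b, c) == 2:
            return False
    return True


if __name__ == "__main__":
    rng = random.Random(0)

    def ten_cliques(u: int, v: int) -> bool:
        # Ten disjoint cliques of ten vertices each.
        return u != v and u // 10 == v // 10

    print(canonical_tester(ten_cliques, 100, 6, is_clique_collection, rng))
```

The sketch queries all pairs among the sampled vertices, which is where the quadratic blow-up in query complexity comes from; the decision depends only on the inspected induced subgraph.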