Jo Ellis-Monaghan
St. Michael's College, Colchester, VT 05439
e-mail: jellis-monaghan@smcvt.edu
website: http://academics.smcvt.edu/jellis-monaghan

Graphs and Networks

A graph, or network, is a set of vertices (dots) with edges (lines) connecting them.

[Figures: three example graphs on the vertices A, B, C, D; one has a multiple edge and one has a loop]

Two vertices are adjacent if there is an edge between them. The vertices A and B above are adjacent because the edge AB is between them. An edge is incident to each of the vertices that are its endpoints. The degree of a vertex is the number of edge ends sticking out from it.

The Kevin Bacon Game, or Six Degrees of Separation

Number    # of people (Bacon)    # of people (Connery)
0                      1                        1
1                   1766                     2216
2                 141840                   204269
3                 385670                   330591
4                  93598                    32857
5                   7304                     2948
6                    920                      409
7                    115                       46
8                     61                        8

Total number of linkable actors: 631275
Weighted total of linkable actors: 1860181
Average Bacon number: 2.947
Average Connery number: 2.706

Kevin Bacon is not even among the top 1000 most connected actors in Hollywood (1222nd).
Data from The Oracle of Bacon at UVA.
http://www.spub.ksu.edu/issues/v100/FA/n069/fea-making-bacon-fuqua.html

Maximal Matchings in Bipartite Graphs

[Figure: a bipartite graph]

• Start with any matching.
• Find an alternating path: start at an unmatched vertex on the left and end at an unmatched vertex on the right.
• Switch matching edges to nonmatching edges and vice versa along the path. Each switch enlarges the matching by one edge.
• When no such path remains: a maximal matching!

The small world phenomenon

Stanley Milgram sent a series of traceable letters from people in the Midwest to one of two destinations in Boston. The letters could be sent only to someone whom the current holder knew by first name. Milgram kept track of the letters and found a median chain length of about six, thus supporting the notion of “six degrees of separation.”
http://mathforum.org/mam/04/poster.html
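A Bacon number is a shortest-path distance in a graph whose vertices are actors and whose edges join actors who appeared together. The following is a minimal sketch, not the Oracle of Bacon's actual method: it uses breadth-first search on a made-up toy graph (the names `toy_graph`, `separation_numbers` and `table_average` are illustrative assumptions) and recomputes the two averages from the table above.

```python
from collections import deque

def separation_numbers(graph, source):
    """Breadth-first search: fewest edges from source to each reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:          # first visit is along a shortest path
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def table_average(counts):
    """Average number, where counts[k] people have number k."""
    total = sum(counts)
    weighted = sum(k * c for k, c in enumerate(counts))
    return weighted / total

# Hypothetical toy graph (adjacency lists); "E" is not linkable to "Bacon".
toy_graph = {
    "Bacon": ["A", "B"],
    "A": ["Bacon", "C"],
    "B": ["Bacon"],
    "C": ["A", "D"],
    "D": ["C"],
    "E": [],
}
numbers = separation_numbers(toy_graph, "Bacon")

# Counts copied from the table above.
bacon_counts = [1, 1766, 141840, 385670, 93598, 7304, 920, 115, 61]
connery_counts = [1, 2216, 204269, 330591, 32857, 2948, 409, 46, 8]

print(numbers)
print(round(table_average(bacon_counts), 3), round(table_average(connery_counts), 3))
```

Unlinkable vertices simply never receive a number, which is why the table speaks of "linkable" actors.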
Social Networks

• Stock Ownership (2001 NY Stock Exchange)
• Children’s Social Network
• Social Network of Sexual Contacts
http://mathforum.org/mam/04/poster.html

Infrastructure and Robustness

[Figure: Scale Free network (JetBlue), with a plot of number of vertices against vertex degree]
[Figure: Distributed network (MapQuest), with a plot of number of vertices against vertex degree]

Rolling Blackouts in August 2003
http://encyclopedia.thefreedictionary.com/_/viewer.aspx?path=2/2f/&name=2003-blackout-after.jpg

Some networks are more robust than others. But how do we measure this?
http://www.caida.org/tools/visualization/mapnet/Backbones/

A network modeled by a graph (electrical, communication, transportation)

Question: If each edge operates independently with probability p, what is the probability that the whole network is functional?

[Figure: a functional network (can get from any vertex to any other along functioning edges)]
[Figure: a dysfunctional network (vertices s and t can’t communicate)]

Deletion and Contraction is a Natural Reduction for Network Reliability

If an edge is working (this happens with probability p), it’s as though the two vertices were “touching”, i.e. just contract the edge.
If an edge is not working (this happens with probability 1-p), it might as well not be there, i.e. just delete it.
Thus, if $R(G;p)$ is the reliability of the network G where all edges function with a probability of p, and e is neither a bridge nor a loop, then

$R(G;p) = (1-p)\,R(G-e;p) + p\,R(G/e;p)$

Reliability Example

Note that if every edge of the network is a bridge (i.e. the network is a tree), then $R(G;p) = p^{E}$, where E is the number of edges. Also note that $R(\text{loop};p) = 1$.

E.g., for the triangle (figures omitted): deleting an edge leaves a path with two edges, contracting it leaves a double edge, and the double edge in turn reduces to a single edge and a loop:

$R(G;p) = (1-p)\,p^2 + p\,[(1-p)\,p + p \cdot 1] = (1-p)p^2 + p(1-p)p + p^2$

So $R(G;p) = 3p^2 - 2p^3$ gives the probability that the network is functioning. E.g. $R(G; 0.5) = 0.5$.

Bothersome question: Does the order in which the edges are deleted and contracted matter?

Conflict Scheduling

[Figures: two copies of a graph on the vertices A, B, C, D, E, the second one colored]

Draw edges between classes with conflicting times.
Color so that adjacent vertices have different colors.
Minimum number of colors = minimum required classrooms.

Coloring Algorithm

The Chromatic Polynomial counts the ways to vertex color a graph: $C(G;n)$ = number of proper vertex colorings of G in n colors.

Recursively: let e be an edge of G, with G-e the graph with e deleted and G/e the graph with e contracted. Then

$C(G;n) = C(G-e;n) - C(G/e;n)$, and $C(\bullet;n) = n$ for a single vertex.

[Worked example with graph pictures omitted; its final lines read:]

$n(n-1)^2 + n(n-1) + 0 = n^2(n-1)$

Conflict Scheduling

Frequency Assignment
Assign frequencies to mobile radios and other users of the electromagnetic spectrum. Two customers that are sufficiently close must be assigned different frequencies, while those that are distant can share frequencies. Minimize the number of frequencies.
• Vertices: users of mobile radios
• Edges: between users whose frequencies might interfere
• Colors: assignments of different frequencies
Need at least as many frequencies as the minimum number of colors required!

Register Allocation
Assign variables to hardware registers during program execution. Variables conflict with each other if one is used both before and after the other within a short period of time (for instance, within a subroutine). Minimize the use of non-register memory.
• Vertices: the different variables
• Edges: between variables which conflict with each other
• Colors: assignment of registers
Need at least as many registers as the minimum number of colors required!

The Ising Model

Consider a sheet of metal: it has the property that at low temperatures it is magnetized, but as the temperature increases, the magnetism “melts away”. We would like to model this behavior. We make some simplifying assumptions to do so:
• The individual atoms have a “spin”, i.e., they act like little bar magnets, and can either point up (a spin of +1) or down (a spin of -1).
• Neighboring atoms with different spins have an interaction energy, which we will assume is constant.
• The atoms are arranged in a regular lattice.
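These three assumptions are already enough to write down a tiny computational model. The sketch below is only an illustration under stated assumptions: a small square lattice without wrap-around, stored as a list of rows of +1/-1 values, with hypothetical helper names. It counts the neighboring pairs with different spins (each contributes one unit of the constant interaction energy) and the average spin.

```python
def neighbor_pairs(rows, cols):
    """Adjacent pairs of sites on a rows x cols square lattice (no wrap-around)."""
    pairs = []
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                pairs.append(((r, c), (r, c + 1)))   # horizontal neighbor
            if r + 1 < rows:
                pairs.append(((r, c), (r + 1, c)))   # vertical neighbor
    return pairs

def unlike_pairs(state):
    """Number of neighboring pairs whose spins differ."""
    rows, cols = len(state), len(state[0])
    return sum(
        1
        for (r1, c1), (r2, c2) in neighbor_pairs(rows, cols)
        if state[r1][c1] != state[r2][c2]
    )

def average_spin(state):
    """Mean of all spins; +1 or -1 means every atom points the same way."""
    spins = [s for row in state for s in row]
    return sum(spins) / len(spins)

aligned = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
checkerboard = [[1, -1, 1], [-1, 1, -1], [1, -1, 1]]

for name, state in [("aligned", aligned), ("checkerboard", checkerboard)]:
    print(name, unlike_pairs(state), average_spin(state))
```

The aligned state has no unlike pairs and full average spin, while the checkerboard has every pair unlike and an average spin near zero.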
At low temperature, “coalescing” states are more probable and there is non-zero magnetization. As the temperature rises, the states become more random, and the magnetization “melts away”.

Applet by Peter Young at http://bartok.ucsc.edu/peter/java/ising/keep/ising.html

Magnetization = $\frac{1}{N}\sum_i s_i$, Energy = $-\frac{1}{N}\sum_{\langle i,j \rangle} s_i s_j$, where N is the number of lattice points and the second sum runs over neighboring pairs. Critical temperature is $\frac{2}{\ln(1+\sqrt{2})}$.

Lattice and Hamiltonian

A choice of spins at each point gives what is called a “state” of the lattice.

[Figure: a state of the lattice]

The Hamiltonian (total energy) of a state w is

$H(w) = \sum f(s_i, s_j)$

where the sum is over all adjacent points, and f is 0 if the spins are the same and 1 if they are different. H(w) is just the total number of edges in the state with different spins on their endpoints.

A Little Thermodynamics

The probability of a state w occurring is

$\frac{e^{-\beta H(w)}}{\sum_{\text{all states } w'} e^{-\beta H(w')}}$

Here $\beta = \frac{1}{kT}$, where T is the temperature and k is the Boltzmann constant, $1.38 \times 10^{-23}$ joules/Kelvin.

The numerator is easy. The denominator, called the partition function, is the interesting (hard) piece. It has a deletion-contraction reduction!

Let $P(G;\beta) = \sum_{\text{all states } w} e^{-\beta H(w)}$. Splitting the states according to whether the endpoints of an edge e have the same spin or different spins gives

$P(G;\beta) = e^{-\beta}\,P(G-e;\beta) + (1 - e^{-\beta})\,P(G/e;\beta)$

Rectilinear pattern recognition
joint work with J. Cohn (IBM), R. Snapp and D. Nardi (UVM)

IBM’s objective is to check a chip’s design and find all occurrences of a simple pattern to:
– Find possible error spots
– Check for already patented segments
– Locate particular devices for updating

[Figures: The Haystack; The Needle…]

Pre-Processing

Raw data format:

    BEGIN /* GULP2A CALLED ON THU FEB 21 15:08:23 2002 */
    EQUIV 1 1000 MICRON +X,+Y
    MSGPER -1000000 -1000000 1000000 1000000 0 0
    HEADER GYMGL1 'OUTPUT 2002/02/21/14/47/12/cohn'
    LEVEL PC
    LEVEL RX
    CNAME ULTCB8AD
    CELL ULTCB8AD PRIME
    PGON N RX 1467923 780300 1468180 780300 1468180 780600 +
    1469020 780600 1469020 780300 1469181 780300 1469181 +
    781710 1469020 781710 1469020 781400 1468180 781400 +
    1468180 781710 1467923 781710
    PGON N PC 1468500 782100 1468300 782100 1468300 781700 +
    1468260 781700 1468260 780300 1468500 780300 1468500 +
    780500 1468380 780500 1468380 781500 1468500 781500
    RECT N PC 1467800 780345 1503 298
    ENDMSG

Two different layers/rectangles are combined into one layer that contains three shapes: one rectangle (purple) and two polygons (red and blue).

The algorithm is cutting edge, and not currently used for this application in industry.

Linear time subgraph search for target

Both the target pattern and the entire chip are encoded like this, with the vertices also holding geometric information about the shape they represent. Then we do a depth-first search for the target subgraph. The additional information in the vertices reduces the search to linear time, while the entire chip encoding is theoretically $N^2$ in the number of faces, but practically $N \log N$.

Netlist Layout
(joint work with J. Cohn, A. Dean, P. Gutwin, J. Lewis, G. Pangborn)

How do we convert this… [figure] … into this? [figure]

Netlist

• A set S of vertices (the pins): hundreds of thousands.
• A partition P1 of the pins (the gates): 2 to 1000 pins per gate, average of about 3.5.
• A partition P2 of the pins (the wires): again 2 to 1000 pins per wire, average of about 3.5.
• A maximum permitted delay between pairs of pins.
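The netlist definition above is essentially a data structure: one set of pins carrying two different partitions plus delay bounds. The sketch below is only one possible encoding, with hypothetical names (`Netlist`, `is_partition`, `max_delay`) and made-up toy data. It is not the representation used in the joint work.

```python
from dataclasses import dataclass

def is_partition(blocks, universe):
    """True if the blocks are non-empty, pairwise disjoint, and cover the universe."""
    seen = set()
    for block in blocks:
        block = set(block)
        if not block or block & seen:
            return False
        seen |= block
    return seen == set(universe)

@dataclass
class Netlist:
    pins: set         # the set S
    gates: list       # partition P1: each gate is a set of pins
    wires: list       # partition P2: each wire is a set of pins
    max_delay: dict   # (pin, pin) -> maximum permitted delay

    def is_valid(self):
        """Both groupings must partition the pins; delay pairs must be pins."""
        return (
            is_partition(self.gates, self.pins)
            and is_partition(self.wires, self.pins)
            and all(a in self.pins and b in self.pins for a, b in self.max_delay)
        )

# Toy example: 6 pins, 2 gates of 3 pins each, 3 wires of 2 pins each.
toy = Netlist(
    pins={1, 2, 3, 4, 5, 6},
    gates=[{1, 2, 3}, {4, 5, 6}],
    wires=[{1, 4}, {2, 5}, {3, 6}],
    max_delay={(1, 4): 2.0},
)
# Pin 4 appears on two wires and pin 6 on none, so this is not a partition.
broken = Netlist(
    pins={1, 2, 3, 4, 5, 6},
    gates=[{1, 2, 3}, {4, 5, 6}],
    wires=[{1, 4}, {4, 5}, {2, 3}],
    max_delay={},
)
print(toy.is_valid(), broken.is_valid())
```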
Example

[Figure labels: Gate, Pin, Wire]

The Wires

[Figure]

The Wiring Space

[Figure labels:]
• Placement layer: gates/pins go here
• Vias (vertical connectors)
• Horizontal wiring layer
• Vertical wiring layer
• Up to 12 or so layers

The general idea

Place the pins so that pins are in their gates on the placement layer with non-overlapping gates.
Place the wires in the wiring space so that the delay constraints on pairs of pins are met, where delay is proportional to minimum distance within the wiring, and via delay is negligible.

Lots of Problems….

Identify Congestion

• Identify dense substructures from the netlist
• Develop a congestion ‘metric’

[Figures: a netlist on the vertices A to H with congested areas marked; what often happens versus what would be good]

Automate Wiring Small Configurations

Some are easy to place and route:
• Simple left to right logic
• No / few loops (circuits)
• Uniform, low fan-out
• Statistical models work

Some are very difficult:
• E.g. ‘Crossbar Switches’
• Many loops (circuits)
• Non-uniform fan-out
• Statistical models don’t work

SPRING EMBEDDING

[Figures: Random layout; Spring embedded layout]

Biomolecular constructions

Nano-Origami: Scientists At Scripps Research Create Single, Clonable Strand Of DNA That Folds Into An Octahedron

A group of scientists at The Scripps Research Institute has designed, constructed, and imaged a single strand of DNA that spontaneously folds into a highly rigid, nanoscale octahedron that is several million times smaller than the length of a standard ruler and about the size of several other common biological structures, such as a small virus or a cellular ribosome.
http://www.sciencedaily.com/releases/2004/02/040212082529.htm

DNA Strands Forming a Cube
http://seemanlab4.chem.NYU.edu

Assuring cohesion

A problem from biomolecular computing: physically constructing graphs by ‘zipping together’ single strands of DNA.

[Figure: a configuration marked “not allowed”]

N. Jonoska, N. Saito, ’02

A Characterization

A theorem of C. Thomassen specifies precisely when a graph may be constructed from a single strand of DNA, and theorems of Hongbing and Zhu characterize graphs that require at least m strands of DNA in their construction.

Theorem: A graph G may be constructed from a single strand of DNA if and only if G is connected, has no vertex of degree 1, and has a spanning tree T such that every connected component of G - E(T) has an even number of edges or a vertex v with degree greater than 3.

L. M. Adleman, Molecular Computation of Solutions to Combinatorial Problems. Science, 266 (5187) Nov. 11 (1994) 1021-1024.

Oriented Walk Double Covering and Bidirectional Double Tracing
Fan Hongbing, Xuding Zhu, 1998

“The authors of this paper came across the problem of bidirectional double tracing by considering the so called “garbage collecting” problem, where a garbage collecting truck needs to traverse each side of every street exactly once, making as few U-turns (retractions) as possible.”

DNA sequencing
(joint work with I. Sarmiento)

It is very hard in general to “read off” the sequence of a long strand of DNA. Instead, researchers probe for “snippets” of a fixed length, and read those. The problem then becomes reconstructing the original long strand of DNA from the set of snippets.

[Figure labels: AGGCTC, AGGCT, GGCTC, CTACT, TCTAC, CTCTA, TTCTA]

Enumerating the reconstructions

This leads to a directed graph with the same number of in-arrows as out-arrows at each vertex. The number of reconstructions is then equal to the number of paths through the graph that traverse all the edges in the direction of their arrows.

Graph Polynomials Encode the Enumeration

A very fancy polynomial, the interlace polynomial of Arratia, Bollobás, and Sorkin (2000), encodes the number of ways to reassemble the original strand of DNA. It is related, with a lot of work, to the contraction-deletion approach of the Chromatic and Reliability polynomials.
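For reference, the contraction-deletion approach mentioned here can be sketched for the reliability polynomial. This is a minimal illustration under stated assumptions, not an efficient algorithm: a multigraph is a vertex count plus a list of edge pairs, and instead of testing for bridges the sketch gives a disconnected graph reliability 0, which makes the bridge case fall out automatically. The function name `reliability` is a hypothetical choice. The loop at the end also tries every edge order on the triangle.

```python
from fractions import Fraction
from itertools import permutations

def reliability(num_vertices, edges, p):
    """All-terminal reliability by deletion-contraction on the first edge.

    edges is a list of (u, v) pairs; multiple edges and loops are allowed.
    """
    if not edges:
        # No edges left: functional only if everything merged to one vertex.
        return 1 if num_vertices == 1 else 0
    (u, v), rest = edges[0], edges[1:]
    if u == v:
        # A loop never affects connectivity, so just discard it.
        return reliability(num_vertices, rest, p)
    # Contract e: rename v to u in every remaining edge.
    contracted = [(u if a == v else a, u if b == v else b) for a, b in rest]
    return ((1 - p) * reliability(num_vertices, rest, p)
            + p * reliability(num_vertices - 1, contracted, p))

triangle = [(0, 1), (1, 2), (0, 2)]
half = Fraction(1, 2)

# Every order of processing the edges gives the same value on this example.
values = {reliability(3, list(order), half) for order in permutations(triangle)}
print(values)

# Compare with 3p^2 - 2p^3 at another exact value of p.
p = Fraction(3, 4)
print(reliability(3, triangle, p), 3 * p**2 - 2 * p**3)
```

Using `Fraction` keeps the arithmetic exact, so the comparison with $3p^2 - 2p^3$ is an equality test and not a floating-point approximation.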
The interlace polynomial is computed, not on the “snippet” graph, but on an associated circle graph.

[Figures: the “snippet” graph; a chord diagram on the labels a, b, c, d; the associated circle graph]

Pendant Duplicate Graphs

Effect of adding a pendant vertex or duplicating a vertex:

[Figure: adding a pendant vertex v' to v]
[Figure: duplicating the vertex v]

Theorem: A set of subsequences of DNA permits exactly two reconstructions iff the circle graph associated to any Eulerian circuit of the ‘snippet’ graph is a pendant-duplicate graph.

Side note to the cognoscenti: Pendant-duplicate graphs correspond to series-parallel graphs via a medial graph construction, so having exactly two reconstructions is actually a new interpretation of the beta invariant.
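The enumeration of reconstructions described in the DNA sequencing slides (walks that use every arrow of the directed graph exactly once) can be checked by brute force on small examples. The sketch below is an illustration only: it assumes the walk closes up into a circuit, treats parallel arrows as distinct, fixes the first arrow so that rotations of the same circuit are not counted twice, and uses made-up toy graphs, not real snippet data. It does not compute the interlace polynomial.

```python
def count_eulerian_circuits(edges):
    """Count closed walks that traverse every directed edge exactly once.

    edges is a list of (tail, head) pairs. Edge 0 is always traversed first.
    """
    if not edges:
        return 0
    outgoing = {}
    for index, (tail, _head) in enumerate(edges):
        outgoing.setdefault(tail, []).append(index)
    used = [False] * len(edges)
    start = edges[0][0]

    def extend(vertex, remaining):
        if remaining == 0:
            # All edges used: valid only if we are back at the start.
            return 1 if vertex == start else 0
        total = 0
        for index in outgoing.get(vertex, []):
            if not used[index]:
                used[index] = True
                total += extend(edges[index][1], remaining - 1)
                used[index] = False   # backtrack
        return total

    used[0] = True
    return extend(edges[0][1], len(edges) - 1)

# Hypothetical toy graphs: "petals" are two-edge cycles through vertex "a".
cycle = [("a", "b"), ("b", "c"), ("c", "a")]
two_petals = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "a")]
three_petals = two_petals + [("a", "d"), ("d", "a")]

for name, graph in [("cycle", cycle), ("two petals", two_petals),
                    ("three petals", three_petals)]:
    print(name, count_eulerian_circuits(graph))
```

Backtracking like this takes exponential time in general, which is one reason a polynomial that encodes the count is attractive.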