The Emerging Power of Network Analysis for Complex Systems
Christopher L. Magee
SDM Alumni Conference, October 28, 2005
Professor C. Magee, 2005

Motivating Questions and Outline of Talk
• What has been going on recently in “The New Science of Networks”?
• Why might we (who are interested in design, management, etc. of complex systems) care?
• What are some interesting and possibly lasting findings?
• What are the general strengths and possible weaknesses of using these approaches?
• How far have we gotten towards the potential value to us?

What has been going on recently in “The New Science of Networks”?
• Physicists and their friends have come to this area strongly, starting with the paper in Nature by Watts and Strogatz in 1998.
• Publications started at a few per year and have now reached hundreds per year in various journals (plus 3 books).
• All of the effort builds upon work done by sociologists and operations researchers over the preceding 40 or more years.
• Strong activities now exist at a variety of academic institutions:
  – The University of Michigan
  – Oxford University
  – The Santa Fe Institute
  – Columbia University
  – Notre Dame University
  – Many others

New Network Science Essentials
• Network Analysis (or Science) rests on a relatively simple way (Euler was first) of modeling or representing a system:
  – Each element (or subsystem) is a node
  – Each relationship between nodes (elements) is a link
• The appeal of generality of application comes from the very simple system model this representation gives, combined with the mathematics of graph theory for quantifying various aspects of such models.
• A limitation for widespread utility is that same simplicity.
• The research front is where people are sacrificing as little simplicity as possible while making the models reflect more reality and thus have increased utility.

Why might we (who are interested in design, management, behavior, etc. of complex systems) care?
• A strong mathematical basis is being established for developing relatively tractable models of large-scale complex systems
  – We need more modeling tools that are useful for large-scale systems with many elements, interactions and complex behaviors
• Quantifiable metrics are being developed that may be of use in predicting the behavior of complex large-scale systems
  – We need such metrics, as they would be valuable in designing and managing our systems
• Algorithms for extracting information from complex systems are being developed, and these can improve the “observability” of such systems.
• New visual representations for complex systems are being developed

Comparative Progress in Understanding and Performance: CLM objective/subjective observations
• 1940-2000 improvement
  – Small-scale electro-mechanical systems (x 40-100)
  – Energy transformation systems (x 10-20)
  – Information processing systems (x 10^5 to 10^7)
  – Cosmology (x 30-100)
  – Paleontology (x 50)
  – Organizational theory and practice (x 1.1 to 2)
  – Economic systems (x 1.1 to 2)
  – Complex large-scale socio-technological systems (?)
• Possible reasons for the large differences for organization issues
  – Lack of attention by deep thinkers
  – Low utilization of mathematical tools
  – “Hardness” of the problem, particularly human intent
  – Lack of detailed quantitative observation to improve models

What is needed to greatly improve the practice of complex social/technological system design?
• The major opportunity is to transition from the “pre-engineering” (experiential or craft) approach now widely used to a solid (post-1870 at least) engineering approach to these design problems. What does this entail?
• If you are doing work where all factors involved are quantitatively and accurately determined from mathematical approaches, you are possibly doing accounting or actuarial work, but you are not doing engineering design because you are not doing creative work.
• If you are pursuing a creative end but are using no quantitative methods developed from a scientific perspective, you are possibly a sculptor or a painter but are certainly not an engineer.
• The critical need, if the rate of improvement in practice is to accelerate greatly, is for objective methods of quantitative observation that support reliable and well-understood “small” models.

The Iterative Learning Process
[Diagram: objectively obtained quantitative data (facts, phenomena) and hypothesis (model, theory that can be disproved), linked by alternating deduction and induction steps]
As this process matures, what new things can the models accomplish? The major accomplishment will be the rapid facilitation of a transition to engineering (vs. craft approaches) for the design of complex social/technological systems.

What are some interesting and possibly lasting findings?
• Metrics
• Comparison among network descriptions of different kinds of systems
  – Social
  – Information
  – Biological
  – Technological
• Models
• Key characterization procedures (and algorithms)
  – Community Structure
  – Motifs and coarse-graining
  – Self-dissimilarity

Network metrics
• Size
• Density of interactions, sparseness
• Path length and its dependence on size

Network Metrics I
• n, the number of nodes
• m, the number of links
• The degree, k, is the number of links on a given node. m/n is the average degree <k> when each link is counted once per node (as for the in-degree of a directed network); when each undirected link is counted at both of its ends, <k> = 2m/n.
• m/[n(n-1)] (taking <k> = m/n), or in general <k>/(n-1), is the “sparseness” or normalized interconnection “density”
• Path length, l: $\ell = \frac{1}{\frac{1}{2} n(n+1)} \sum_{i \ge j} d_{ij}$, where $d_{ij}$ is the shortest-path distance between nodes i and j
  – “Small Worlds”
  – In a “Small World”, l is relatively small
  – At a given <k>, l rising in proportion to ln n, or more slowly, is taken to mean “Small World” (where clustering is also high)

Network Metrics II
• Size
• Density (mean degree), maximum degree
• Path length and its dependence on size
• Connectivity
• Degree Distribution (power laws normal and uninformative)
• Social Network Analysis
  – Centrality, prestige, closeness, proximity, etc.
• Clustering
• Degree correlation coefficient, assortativity and homophily

Definition of the clustering coefficient C
This network has one triangle and eight connected triples, and therefore has a clustering coefficient, $C^{(1)}$, of 3 x 1/8 = 3/8. The individual vertices have local clustering coefficients of 1, 1, 1/6, 0 and 0, for a mean value $C^{(2)}$ = 13/30.
Source: M. E. J. Newman, The Structure and Function of Complex Networks, SIAM Review, Vol. 45, No. 2, pp. 167–256, 2003, Society for Industrial and Applied Mathematics

Selective Linking
• Which nodes link with which nodes: food webs, social networks
• Internet
  – Three broad categories of nodes: A. Internet backbone, B. ISP’s and C. customers
  – Many A-B and B-C links but few A-C or C-C
• Social network researchers have labeled this assortative mixing or homophily. Age, income, geography, profession, etc. show correlations.
• The assortativity coefficient, r, is computed from the normalized sociomatrix, e, where $E_{ij}$ is the number of links that connect node types i and j: $e = \frac{E}{\lVert E \rVert}$, $r = \frac{\operatorname{Tr} e - \lVert e^2 \rVert}{1 - \lVert e^2 \rVert}$, with $\lVert x \rVert$ the sum of all elements of the matrix x
• r is 0 for randomly mixed networks and 1 for perfectly assortative networks

Degree correlation
• A special case of assortative mixing is mixing according to node degree.
  – Do high-degree nodes associate with other high-degree nodes, or do they prefer low-degree nodes?
  – This has been studied with two-dimensional histograms and by examining the mean degree of a node’s neighbors as a function of k.
  – Recently, it has been shown that the Pearson correlation coefficient of the degrees at either end of a link is the most compact representation. This coefficient is positive for assortatively mixed networks and negative for disassortative networks.

Example Networks
• Road map
• Electric circuit or pipe system
• Structure of a bridge or building, with load paths
• Organizational chart
• Acquaintance network
• Supply chain
• Markov chain
• Electric power grid
• Co-authors of papers
• Citations in papers
• Food webs
• Nerve pathways
• Gene control systems
• Electronic control systems
• Phone system
• Chemical reaction
• Sequential event plan
• Internet
• Product development tasks
• 1000’s of other examples

Various classes of networks
• An undirected network with only a single type of node and a single type of link
• A network with a number of discrete node and link types
• A network with varying node and link weights
• A directed network in which each link has a direction
Acyclic and cyclic networks are shown on a later slide.
Source: M. E. J. Newman, The Structure and Function of Complex Networks, SIAM Review, Vol. 45, No. 2, pp. 167–256, 2003, Society for Industrial and Applied Mathematics
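
The metrics defined on the preceding slides can be sketched in a few lines of plain Python. This is an illustrative sketch only, not code from the talk: the edge list is an assumed encoding of the five-node, one-triangle network from the clustering-coefficient slide, all function names are hypothetical, the mean degree counts each undirected link at both of its ends (2m/n), and the path-length normalization follows the n(n+1)/2 convention of the Newman review.

```python
from collections import deque
from fractions import Fraction
from itertools import combinations

# Assumed edge list: a triangle (0, 1, 2) whose node 2 also links to 3 and 4.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4)]

def adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def hop_distances(adj, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def mean_path_length(adj):
    """Sum of d_ij over pairs i >= j, divided by n(n+1)/2 (connected graph)."""
    n = len(adj)
    total = sum(d for s in adj for t, d in hop_distances(adj, s).items() if s > t)
    return Fraction(total, n * (n + 1) // 2)

def clustering(adj):
    """Return (C1, C2): 3 x triangles / connected triples, and mean local value."""
    closed = triples = 0
    local = []
    for nbrs in adj.values():
        pairs = list(combinations(sorted(nbrs), 2))
        linked = sum(1 for a, b in pairs if b in adj[a])
        triples += len(pairs)
        closed += linked  # each triangle is counted once at each of its 3 nodes
        local.append(Fraction(linked, len(pairs)) if pairs else Fraction(0))
    return Fraction(closed, triples), sum(local) / len(local)

def degree_correlation(edges, adj):
    """Pearson correlation of the degrees at the two ends of each link.

    Undefined (division by zero) when every node has the same degree.
    """
    ends = [(len(adj[a]), len(adj[b])) for a, b in edges]
    ends += [(y, x) for x, y in ends]  # count every link in both directions
    count = len(ends)
    mean = sum(x for x, _ in ends) / count
    cov = sum((x - mean) * (y - mean) for x, y in ends) / count
    var = sum((x - mean) ** 2 for x, _ in ends) / count
    return cov / var

adj = adjacency(EDGES)
n, m = len(adj), len(EDGES)
mean_degree = Fraction(2 * m, n)   # undirected: each link has two ends
density = mean_degree / (n - 1)
ell = mean_path_length(adj)
c1, c2 = clustering(adj)
r = degree_correlation(EDGES, adj)
print(n, m, mean_degree, density, ell, c1, c2, round(r, 3))
```

On this toy network the two clustering values come out as 3/8 and 13/30, matching the slide, and r is negative because the single high-degree node links mostly to low-degree nodes.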

Network Typology
• Social Networks: a set of people or groups of people with some pattern of contact or interaction among them. Studied examples include friendships between individuals, business relationships between companies, intermarriages between families and many others, as this work dates from the 1920’s.
• Information Networks are those where the nodes contain information or knowledge (citation networks and the World Wide Web are the best-studied examples).

The 2 best-studied information networks: citation networks and the World Wide Web
• Citation network of academic papers, in which the nodes are papers and the directed links are citations of one paper by another. Since papers can only cite those that came before them, the graph is acyclic: it has no closed loops.
• World Wide Web, a network of text pages accessible over the Internet, in which the nodes are pages and the directed links are hyperlinks. There are no constraints on the Web that forbid cycles, and hence it is in general cyclic.
Source: M. E. J. Newman, The Structure and Function of Complex Networks, SIAM Review, Vol. 45, No. 2, pp. 167–256, 2003, Society for Industrial and Applied Mathematics

Network Typology
• Technological Networks are human designed and produced usually for distributing some resource.
Examples include electric power, the Internet, road systems, supply chains, railways, telephone systems, water and many others.
• Biological Networks are biological systems usefully represented as networks and include metabolic pathways, protein networks, gene regulation, food webs, neural networks, blood vessels and vascular networks in plants, etc.

Metrics in a Variety of Networks
• See Table Handout (adapted from the Newman review article)
  – 10/27 are directed networks
  – Large range of network scales
  – Good coverage of all 4 network “types”
  – Decent scale range in all 4 types, but the maximum size of studied technological and biological networks is smaller
  – Decent range on <k> overall, and all are sparse (biological and technological less so)
  – All social networks are assortative, with positive r (except student relationships)
  – All technological and biological networks are non-assortative
  – Information networks are an open question

Network Mathematical Models
• Poisson Random Networks
• Small World Models
• Generalized Random Networks
• Growth Models
• Search Models
• Community-based Models
• Reliability and Failure Cascade Models
• Communication Models
• Organization Modeling

Metrics/Models in a Variety of Networks
• See Table Handout (adapted from the Newman review article)
  – Large range of network scales
  – 10/27 are directed networks
  – Good coverage of all 4 network “types”.
  – Decent scale range in all 4 types, but the maximum size of studied technological and biological networks is smaller
  – Decent range on <k> overall, and all are sparse
  – All social networks are assortative (except student relationships)
  – All technological and biological networks are non-assortative
  – Path length, l, is generally small (small worlds) and often approximately equal to that given by a Poisson random network
  – Clustering is usually orders of magnitude higher than predicted by random networks for the large networks and is roughly constant
  – Degree correlation is as predicted for social networks with homophily

Community Structure
• Various algorithms have been developed for decomposing a network into its “logical” sub-structure:
  – For simple (all nodes equivalent) models, the most useful of these is one due to Girvan and Newman that finds the minimal set of links to cut in order to find the most appropriate subsystems.
  – Separation and optimization parameters are also calculated that allow one to determine the “best” number of subsystems
• Example from a co-author network

Motifs
• Milo et al. first extended the concept of motifs beyond sociological networks in a 2002 article in Science titled “Network Motifs: Simple Building Blocks of Complex Networks”.
  – They defined motifs in this paper as patterns of interactions that occur at significantly higher rates in an actual network than in randomized networks, and developed an algorithm for extracting them from (directed) networks

Schematic of network motif detection. Motifs are found in the real network (A) much more frequently than in an ensemble of random networks (B).

Motifs b
• Milo et al. studied 19 networks (in six different classes)
  – For 2 gene transcription networks, they found that the two different transcription systems showed the same motifs

The number of times these two motifs occur is more than 10 standard deviations greater than their mean number of appearances in randomized networks. None of the other 13 possible three-node patterns, nor any other of the 199 possible four-node patterns, appears more often than the mean plus 2 standard deviations of its appearances in randomized networks.

Dependence of detectability of the 3-node motif on the size of the E. coli network studied.

Motifs c
• For 8 electronic circuits (in 2 classes), they found:

The extremely high ratios for the motifs in these cases (even at small size) can probably be interpreted as evidence of design intent. For these small technological systems, the importance of available modules probably accounts for the reuse of the same “motifs” across the variety of circuits of the same class.
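
The detection criterion on the motif slides (how often a pattern occurs in the real network, compared with the mean and standard deviation of its count in randomized networks) can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of Milo et al.: it counts only one pattern (the three-node feed-forward loop), it randomizes by swapping link targets so that every node keeps its in- and out-degree, and the toy network and all function names are hypothetical.

```python
import random
from statistics import mean, pstdev

def count_feed_forward_loops(edges):
    """Count node triples (a, b, c) with links a->b, b->c and a->c."""
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)
    count = 0
    for a, b in edges:
        for c in succ.get(b, ()):
            if c != a and c in succ[a]:
                count += 1
    return count

def randomize(edges, swaps, rng):
    """Degree-preserving null model: swap the targets of random link pairs."""
    edge_list = list(edges)
    present = set(edge_list)
    for _ in range(swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        new1, new2 = (a, d), (c, b)
        # Reject swaps that would create a self-loop or a duplicate link.
        if a == d or c == b or new1 in present or new2 in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
    return edge_list

def motif_z_score(edges, samples=200, seed=0):
    """Return (real count, null mean, null std, z); z is None if std is 0."""
    rng = random.Random(seed)
    real = count_feed_forward_loops(edges)
    null = [count_feed_forward_loops(randomize(edges, 10 * len(edges), rng))
            for _ in range(samples)]
    spread = pstdev(null)
    z = (real - mean(null)) / spread if spread > 0 else None
    return real, mean(null), spread, z

# Hypothetical toy network: six feed-forward loops chained end to end.
TOY = []
for k in range(6):
    x, y, z = 3 * k, 3 * k + 1, 3 * k + 2
    TOY += [(x, y), (y, z), (x, z)]
    if k < 5:
        TOY.append((z, z + 1))

real, null_mean, null_std, z_score = motif_z_score(TOY)
print(real, round(null_mean, 2), round(null_std, 2), z_score)
```

A pattern would be called a motif when the z-score is large; the slides quote more than 10 standard deviations for the two motifs found in the gene transcription networks.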

Motifs d
• For 8 electronic circuits (in 2 classes), Milo et al. found reproducible motifs at high concentration for each class of circuit studied
• One interesting conclusion is that the technique can be applied to networks with variable nodes and links.
• A second interesting conclusion, coming from comparison of neurons, genes, food webs and electronic circuits, is
  – “Information processing seems to give rise to significantly different structures than does energy flow.” The possible relevance to Whitney’s work is intriguing.

Self-similarity and self-dissimilarity
• Wolpert and Macready (2000) introduced the concept of self-dissimilarity as a complexity metric
• Self-dissimilarity is defined as “the variability of interaction patterns of a system at different spatio-temporal scales”
• Note that this definition is in a sense counter to the (often loosely defined) notion of “scale free”, which implies (or at least seems to) that structure is repetitive at various scales
• Wolpert and Macready invented relatively elaborate methods for statistically applying their concept and demonstrated it only through numerical simulations
• Itzkovitz et al. (2004) have recently developed a method they call “coarse-graining”, based on their prior work on motifs. This method also assesses self-dissimilarity and has been applied to biological and technological networks.

Hierarchical Organization of Modularity in Metabolic Networks
E. Ravasz, A. L. Somera, D. A.
Mongru, Z. N. Oltvai, A.-L. Barabási, Science, Vol. 297, 30 August 2002, p. 1551
Panels: A. Scale free; B. Modular, not scale free; C. Nested modular, scale free

Coarse-Graining: An extension of motifs
• Itzkovitz et al. investigate Coarse-Graining as an objective means for “reverse-engineering” that can be applied even when the lower-level functional units are unknown (biological focus).
• The coarse-grained version of a network is a new network with fewer elements. This is achieved by replacing some of the nodes by CGU’s (coarse-grained units: patterns of node interactions at the level being examined, in effect motifs chosen somewhat differently).
• Itzkovitz et al. apply simulated annealing to arrive at an optimum set of CGU’s (minimizing the “vocabulary” of CGU’s while maximizing the coverage of the original network by the coarse-grained description).
• Applying this algorithm to an electronic circuit:

Transistor-level map of an 8-bit binary counter used in a digital fractional multiplier. Highlighted is a sub-graph that represents the transistors that make up one NOT gate. Examining possible motifs of up to 6 nodes shows two sets of possible optimal motif-based CGU’s. The choice shown in solid boxes can be arranged to arrive at a “gate-level coarse-graining”.

In the transistor level, nodes represent transistor junctions. In the gate level, nodes are CGU’s, made of transistors, each representing a logic gate. Shown is the CGU that corresponds to a NAND gate. Re-applying the coarse-graining optimization sequentially yields 2 more levels.

In the flip-flop level, nodes are either gates or a CGU made of gates that corresponds to a D-type flip-flop with an additional gate as logic input. In the counter level, each node is either a gate or a CGU of gates/flip-flops that corresponds to a counter unit.

Coarse-Graining b
• Itzkovitz et
al. investigate Coarse-Graining as an objective means for “reverse-engineering” that can be applied even when the lower-level functional units are unknown (biological focus).
• The coarse-grained version of a network is a new network with fewer elements. This is achieved by replacing some of the nodes by CGU’s (patterns of node interactions at the level being examined).
• Itzkovitz et al. apply simulated annealing to arrive at an optimum set of CGU’s (minimizing the “vocabulary” of CGU’s while maximizing the coverage of the original network by the coarse-grained description).
• Applying this algorithm to an electronic circuit, one finds a four-level description which has variable functional significance and self-dissimilarity at each level

Self-dissimilarity at multiple levels in the electronic circuit. This change of patterns with level apparently applies to all biological and technological networks studied thus far.

Coarse-Graining c
• Itzkovitz et al.
investigate Coarse-Graining as an objective means for “reverse-engineering”.
• The coarse-grained version of a network is a new network with fewer elements.
• Itzkovitz et al. apply simulated annealing to arrive at an optimum set of CGU’s
• Applying this algorithm to an electronic circuit, one finds a four-level description which has variable functional significance and self-dissimilarity at each level
• Note the fundamental difference between Coarse-Graining and algorithms for detection of community structure:
  – Community structure algorithms try to optimally divide networks into sub-graphs with minimal interconnections (modularity 1), but these sub-graphs are distinct and complex
  – Coarse-Graining seeks a small dictionary of simple sub-graph types in order to elucidate the function of the network in terms of recurring building blocks (modularity 2)

Different Definitions of “Modular” or “Module” (after Whitney)
• You can see different elements and the places where they join (modularity 1)
• Each item does a specific thing (form-function, genotype-phenotype in a one-to-one relationship) (Suh, Altenberg) (modularity 2)
• You need only know how to use them and don’t need to know what’s inside (modularity 2)
• Interconnectedness is concentrated inside them (Alexander) (software design) (modularity 1)
• Their links to the outside are standardized (modularity 2), or simple and few (Alexander) (modularity 1)

Coarse-Graining d
• Itzkovitz et al. investigate Coarse-Graining as an objective means for “reverse-engineering”.
• The coarse-grained version of a network is a new network with fewer elements.
• Itzkovitz et al.
apply simulated annealing to arrive at an optimum set of CGU’s
• Applying this algorithm to an electronic circuit, one finds a four-level description which has variable functional significance and self-dissimilarity at each level
• Note the fundamental difference between Coarse-Graining and algorithms for detection of community structure
• Research Hypothesis: Simultaneous study of communities and CGU’s in a variety of complex technological systems would further clarify the concept of modularity.
• Note that motifs and coarse-graining have thus far only been applied to fairly simple technological systems

What are the general strengths and possible weaknesses of using these approaches?
• Some sub-questions:
  – To what extent are all networks the same?
  – What network growth/development processes can be modelled, and are these actually operating in specific networks?
  – How should local and global descriptions of networks be integrated?
  – What principles appear to operate in nature to select network and agent characteristics?
  – To what extent can the desirable properties from social, biological and other networks be transferred to designed networks?
  – What principles (should) operate in designed and human-evolved systems?

How far have we gotten towards the potential value to us?
• First examine some examples of initial studies and then consider some general questions
• Examples
  – Worldwide Airport Network
  – Shape and efficiency in spatial distribution networks
  – Heuristic design of the Internet network

Three Case Studies from the Network Literature: I Worldwide Airport Network
• 1. The Structure and Efficiency of the Worldwide Airport Network
  – By Guimera et al.
  – More analysis and reverse engineering than design
• A network of 3883 cities with airports was studied to examine the drivers of airport utilization and the evolution of the network
• All passenger flights from Nov.
1-Nov. 7, 2000 were the focus of the work, with 531,574 unique non-stop flight segments between the 3883 cities
• Guimera et al. view the airport network as a communication (process ID) network and interpret airports as routers (queues that receive passengers and direct them to a new destination).

Worldwide Airport Network b
• The authors assert that the relevant design question is: “What is the network topology that minimizes the number of waiting flights/passengers?”
• They also hypothesize that a star network is plausibly optimal (at least regionally and up to a traffic limit)

Worldwide Airport Network c
• They also hypothesize that as flight frequency increases, the waiting times for planes and passengers (at the single hub) become unacceptably large, so the star is replaced by a partly decentralized network.

Worldwide Airport Network d
• They test whether the multiple hubs seen in the actual network evolved according to their hypotheses (principles) and conclude that physical limits in router capacity, not just saturation, do limit the capacity of a given airport
• Guimera et al. also study the betweenness centrality of all the airports and arrive at the same conclusion from these data.

Worldwide Airport Network e
• Guimera et al. also show that, under preferential attachment, the most connected cities would also be the most central cities, but that the real data do not show this, for geographic and political reasons.
• Further desired work
  – Consideration of other properties of importance in the worldwide airport network, such as
    – Fuel costs, flight lengths and airplane limits, economic tradeoffs that passengers might make (e.g.
time and cost)
  – Study of airport routing capacity from empirical data, and in-depth study of factors limiting airport routing capacity, such as
    – Flight number limits set by runway capacity, by air traffic control capacity (landing pattern limits), by gate capacity
    – Router capacity due to airport/customer use limitations such as distance between gates and ground transport links

Three Case Studies from the Network Literature: II Spatial distribution networks
• 1. The Structure and Efficiency of the Worldwide Airport Network
• 2. Shape and efficiency in spatial distribution networks
  – By Gorman and Kulkarni and by Gastner and Newman
  – A suggested design method compared to existing systems
• Study of technological systems with nodes that are located at specific geographical sites. Such network representations can be used as models for
  – many distribution networks such as water, sewage, oil pipelines, natural gas, electric power grids, FedEx
  – many transportation networks such as air, rail, road, etc.

Spatial distribution networks b
• The model developed was for the case where the distribution system has a “root node” which is the sole source or sink for the items being distributed.
• Additional design factors considered
  – Additional node locations (constraint)
  – Total link length (minimize to minimize cost)
  – Shortest path length between two nodes (minimize to minimize transport time)
• The tradeoff between the last two factors is the design/architecting problem
  – Look at ideal solutions for each criterion
  – Examine how real networks compare on the tradeoffs
  – Build a growth model to derive the pattern and look for consistency.

Spatial distribution networks c
• For the actual example system (a), minimum total edge length, including paths to the root node, is given by a Minimum Spanning Tree (c), while obtaining shortest paths to the root node is optimized by a star graph (b)
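
The two ideal solutions named on this slide can be compared numerically. The sketch below is illustrative only: the node coordinates are invented, node 0 is assumed to be the root, and `mean_detour` is a hypothetical name for the average ratio of network path length to straight-line distance from each node to the root. It builds a star and a minimum spanning tree (Prim's algorithm) on the same points and reports total link length and mean detour for each.

```python
from math import dist

# Hypothetical node coordinates; index 0 is the root (sole source or sink).
POINTS = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 2.0), (1.0, 2.0)]

def star(points):
    """Link every node directly to the root: parent[i] = 0."""
    return {i: 0 for i in range(1, len(points))}

def minimum_spanning_tree(points):
    """Prim's algorithm grown from the root; returns parent[i] for i > 0."""
    parent = {}
    # best[k] = (cheapest link length from the tree to node k, tree endpoint)
    best = {i: (dist(points[0], points[i]), 0) for i in range(1, len(points))}
    while best:
        i = min(best, key=lambda k: best[k][0])
        parent[i] = best.pop(i)[1]
        for k in best:
            d = dist(points[i], points[k])
            if d < best[k][0]:
                best[k] = (d, i)
    return parent

def total_length(points, parent):
    """Sum of the lengths of all links in the tree."""
    return sum(dist(points[i], points[p]) for i, p in parent.items())

def path_to_root(points, parent, i):
    """Length of the route from node i to the root along tree links."""
    length = 0.0
    while i != 0:
        length += dist(points[i], points[parent[i]])
        i = parent[i]
    return length

def mean_detour(points, parent):
    """Average of (network path to root) / (straight-line distance to root)."""
    ratios = [path_to_root(points, parent, i) / dist(points[i], points[0])
              for i in parent]
    return sum(ratios) / len(ratios)

for name, tree in (("star", star(POINTS)), ("MST", minimum_spanning_tree(POINTS))):
    print(name, round(total_length(POINTS, tree), 3), round(mean_detour(POINTS, tree), 3))
```

The star gives every node a straight-line route (detour ratio 1) at the cost of more total link length, while the spanning tree minimizes link length and lengthens the routes; that is the tradeoff the slide describes.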

Spatial distribution networks d
• From transportation research, a route factor is $q = \frac{1}{n} \sum_{i=1}^{n} \frac{l_{i0}}{d_{i0}}$ (in Gastner and Newman’s notation), with $l_{i0}$ the shortest actual path length from node i to the root and $d_{i0}$ the corresponding straight-line Euclidean distance; it is equal to 1 for a star graph
• For three real technological system networks (results table shown on the slide):
• The systems favor minimum edge length but have route factors considerably superior to the MST optimums, indicating an effective tradeoff between the two criteria.
• A simple growth model is used to explain this result

Spatial distribution networks e
• The growth model assumes that the systems evolve from the root node by adding new (but already existing) nodes using a greedy optimization criterion that adds an unconnected node, i, to an already connected node, j, with the weighting factor given by $w_{ij} = d_{ij} + \beta\, l_{j0}$ (in Gastner and Newman’s notation)
• Simulations using these model assumptions yield results (figure on the slide) showing that small tradeoffs in total link length give large improvements in path length
• In real systems, $\beta$ = ?

Spatial distribution networks f: desirable future work
• Consideration of other network properties
  – Shipment capacity
  – Link capacities (and scaling/cost effects for key links)
  – Node capacities and roles (joints vs. transfer/routers)
  – Flexibility for growth (new nodes as well as new connections of existing nodes)
  – Robustness to node or link breakdowns
• Development of more broadly applicable models
  – More than one source/sink node
• Development of other rules/protocols for growth that achieve the key properties well
• Consideration of top-down vs. evolved systems

Three Case Studies from the Network Literature: III Heuristic internet design
• 1. The Structure and Efficiency of the Worldwide Airport Network
• 2. Shape and efficiency in spatial distribution networks
• 3. Heuristically designed internets
  – Fabrikant et al.
  – Li et al.
• Toy internets designed for studying some principles and making comparisons
  – Fabrikant et al.
attempt to balance the “last mile costs” and the communication distance in a growing system (the internet). – They use (and were the originators) of the already seen – They focused on the ease of obtaining power laws but noted transition between MST and star for this case as well. Professor C. Magee, 2005 Page 63 Heuristic internet design b • Li et al. spend much of their time correcting the previous over- emphasis on power-laws as an indicator of structure. Their “First-Principles Approach” to the Internet router- level design problem outlines the approach. • For their “First-Principles Approach, Li et. al. start simple and attempt “to identify some minimal functional requirements and physical constraints needed to develop simple models that are .. consistent with engineering reality”. They also focus on single ISP’s as the fundamental building block. • They argue that the best candidates for a minimal set of constraints on topology construction (architecture) for a single ISP are: – Router technology and – Network economics Professor C. Magee, 2005 Page 64 Heuristic internet design c-Router Technology Limits • Li et al point out that for a given router there is a limit on the number of packets that can be processed in any given time. This limits the number of connections and connection speeds and creates a “feasible region” and “efficient frontier” for given router designs Professor C. Magee, 2005 Page 65 Heuristic internet design d- Router Technology Constraints II Considering multiple routers and other technologies, a feasible region results Professor C. Magee, 2005 Page 66 Heuristic internet design e- Economic Constraints • Costs of installing and operating physical links can dominate the cost of the infrastructure so the availability of multiplexing and aggregating throughout the hierarchy is essential • These technologies are deployed depending upon customer needs and willingness to pay Professor C. 
Magee, 2005 Page 67 Heuristic internet design f: Heuristically Optimal Networks • Li et al then define a heuristically optimal network: • They also show that several real Internet network elements have these broad characteristics (Abilene and CENIC) • Note the “hierarchical tree” in the quote above would actually be better described by the model in the preceding case ( a modified MST arrived at by a “growth rule” followed by the ISP). Professor C. Magee, 2005 Page 68 Heuristic internet design g: Properties and designs evaluated • Performance – Throughput – Router utilization ( distance to frontier) – End user bandwidth Distribution • Self similarity Professor C. Magee, 2005 Page 69 Heuristic internet design g: Properties and designs evaluated • Performance – Throughput – Router utilization ( distance to frontier) – End user bandwidth Distribution • Self-similarity • Li et al create 4 different designs for comparison purposes. These all have the same number of nodes and degree distribution, and the same routers but are constructed according to: – HSF (hierarchically scale free-the modular hierarchical) – Random (also with highly connected central nodes) – Poor design (heuristic network with central overloaded) – Heuristically optimal design (HOT) with 3 tier network hierarchy Professor C. Magee, 2005 Page 70 Heuristic internet design h: designs Professor C. Magee, 2005 Page 71 Heuristic internet design j: performance as a function of the Scale-Free parameter Professor C. Magee, 2005 Page 72 Heuristic internet design j; Router utilization comparison Professor C. Magee, 2005 Page 73 Heuristic internet design k: Summary • Li et al introduce some additional engineering design constraints and then are able to use this insight to produce simple (toy) models that demonstrate very clearly that the mental image of a scale-free graph is totally inconsistent with real ISP’s. 
• They do not calculate correlation coefficients, but it is clear that the designs that “..have consistency with real design considerations” have far better performance than the other examples. In addition, they have minimum values of the S metric (= 0), whereas the random and hierarchically modular designs have much higher values of S. Thus, it appears that meeting the design requirements drives one towards negative r. This occurs because the efficiency frontier of a router trades off the number of connections vs. bandwidth, and most customers do not want to pay for high bandwidth (thus moderate k backbone routers/nodes connect to other moderate k routers, and high k routers tend to connect to very low k customer(?) nodes).

Heuristic internet design l: Possible further work
• Consider robustness of toy models to see if this influences topology choice
• Consider distribution and throughput in one model
• Investigate how other constraints such as new customer desires for bandwidth, new router technology, wireless technology, cable vs. DSL and other issues may affect internet topology (architecture) and desired flexibility
• Observe more actual networks (internet and others) to further test modeling assumptions.
• Can one quantitatively predict degree correlation from models such as theirs for various infrastructures and other technological systems and learn something from its experimental variation?

How far have we gotten towards the potential value to us?
• Some Questions
– Are the network representations useful for complex engineering systems?
– Are the models that have been developed useful when transferred directly to technological networks?
– Are the structural characteristics so far developed adequate for technological/engineered networks? Do we need new ones?
– To what extent can the desirable properties from social, biological and other networks be transferred to designed networks?

Network representations and utility
• Are graphs and networks the only approach to quantification?
– No
• Are networks useful in fulfilling this need?
– Yes
• Evidence
– Most importantly, the research has developed a number of objective methods for quantitative observation
• The best are probably community structure algorithms and motif/coarse-graining approaches, but all the metrics, particularly those developed in social science network research, have potential value for objectively observing complex systems quantitatively. Care must be exercised in defining and measuring such systems (as always), and not all quantitative observations (for example, degree distribution) will prove to be very useful.
– Models?

Existing Models and utility
• What complex systems do we want to model?
– Sociological
• Informal or self-organized
• Formal or organized by design
– Technological (and biological)
• For what purposes?
– Organizational “goodness” (designed sociological networks, to design more effective organizations)
– Robustness, Flexibility (and other properties?) in “Engineering Systems”, or systems that simultaneously possess high levels of social and technical complexity
– To understand how to choose (and cause to occur) better architectures for such systems
– To be able to apply the strong existing engineering design approach, which combines strong quantitative, science-based models with heuristics and other creatively adapted human skills

Network representations and utility
• Are graphs and networks the only approach to quantification?
– No
• Are networks useful in fulfilling this need?
– Yes
• Evidence
– The development of a number of objective methods for quantitative observation
– Models
• Sociological network models are applicable when an aspect of a problem contains such networks; the homophily and group structure must be understood to intelligently discuss such networks
• Organizational and technological/social system models exist but will not be improved without observational work. It is clear that such models will have to contain a sense of hierarchy and that they will have to deal with tradeoffs among processes and properties to be appropriate to such systems

How far have we gotten towards the potential value to us?
• Some Questions
– Are the network representations useful for complex engineering systems?
– Are the models that have been developed useful when transferred directly to technological networks?
– Are the structural characteristics so far developed adequate for technological/engineered networks? Do we need new ones?
– To what extent can the desirable properties from social, biological and other networks be transferred to designed networks?
• We are clearly not there yet, and it is hard to say whether one or many breakthroughs are needed to get there.
• The participation of domain experts with network science experts in the research will hasten development.

References
• D. J. Watts, Six Degrees: The Science of a Connected Age, W. W. Norton and Co. (2003)
• M. E. J. Newman, “The structure and function of complex networks”, SIAM Review, vol. 45, pp. 167-256 (2003)
• R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, U. Alon, “Network Motifs: Simple building blocks of Complex Networks”, Science, vol. 298, pp. 824-827 (2002)
• Pierre-Alain Martin, “A Framework for Quantifying Complexity and Understanding its Sources: Application to two Large-Scale Systems”, SM thesis, MIT (2004)
• D. H. Wolpert and W. Macready, “Self-dissimilarity: An empirically observable complexity Metric”, Unifying Themes in Complex Systems, New England Complex Systems Institute (2000). A second paper found on the NASA Moffett web site has similar information and is titled “Self-dissimilarity as a high dimension complexity measure” (2004?)

References II
• S. Itzkovitz, R. Levitt, N. Kashtan, R. Milo, M. Itzkovitz and U. Alon, “Coarse-Graining and Self-Dissimilarity of Complex Networks” (Oct. 2004)
• A. Fabrikant, E. Koutsoupias, and C. H. Papadimitriou, “Heuristically optimized trade-offs: A new paradigm for Power Laws in the Internet”, in Proceedings of the International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science, vol. 2380, pp. 110-112 (2002)
• L. Li, D. Alderson, W. Willinger and J. Doyle, “A First-Principles Approach to Understanding the Internet’s Router-Level Topology”, SIGCOMM 04 (2004)
• L. Li, D. Alderson, R. Tanaka, J. C. Doyle and W. Willinger, “Towards a Theory of Scale-free graphs: Definitions, Properties, and Implications”, Technical Report CIT-CDS-04-006, Engineering and Applied Sciences Division, California Institute of Technology, Pasadena, CA (Dec. 2004)
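
Illustrative sketch for the spatial distribution networks case study
The tradeoff in that case study (total link length versus path length to the root node) can be explored with a small simulation. The formulas on the slides were not preserved in this text, so the sketch below rests on assumptions: the greedy weight is taken to be w_ij = d_ij + beta * l_j0 (Euclidean distance from new node i to connected node j, plus beta times the path length from j to the root), and the route factor is taken to be the mean of l_i0 / d_i0 over non-root nodes. The function names, the parameter name beta, and the random point set are hypothetical and are not taken from the talk or the cited papers.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def grow_tree(points, beta):
    """Greedily grow a tree from the root (index 0).

    Assumed weight (not confirmed by the slides):
        w_ij = d_ij + beta * l_j0
    Returns (parent, path_len), where path_len[i] is the
    along-tree distance from node i to the root.
    """
    parent = {0: None}
    path_len = {0: 0.0}
    unconnected = set(range(1, len(points)))
    while unconnected:
        best = None
        for i in unconnected:
            for j in parent:  # every already connected node
                w = dist(points[i], points[j]) + beta * path_len[j]
                if best is None or w < best[0]:
                    best = (w, i, j)
        _, i, j = best
        parent[i] = j
        path_len[i] = path_len[j] + dist(points[i], points[j])
        unconnected.remove(i)
    return parent, path_len


def total_length(points, parent):
    """Sum of the Euclidean lengths of all tree links."""
    return sum(dist(points[i], points[j])
               for i, j in parent.items() if j is not None)


def route_factor(points, path_len):
    """Mean ratio of tree path length to straight-line distance to root."""
    n = len(points)
    return sum(path_len[i] / dist(points[i], points[0])
               for i in range(1, n)) / (n - 1)


if __name__ == "__main__":
    random.seed(1)
    # Hypothetical site layout: root in the middle of a unit square.
    pts = [(0.5, 0.5)] + [(random.random(), random.random())
                          for _ in range(60)]
    for beta in (0.0, 0.2, 0.4, 1.5):
        parent, path_len = grow_tree(pts, beta)
        print(beta, round(total_length(pts, parent), 3),
              round(route_factor(pts, path_len), 3))
```

Under the assumed weight, beta = 0 reduces the rule to Prim's algorithm and so yields a minimum spanning tree, while any beta above 1 always attaches new nodes directly to the root (by the triangle inequality) and so yields the star graph with route factor 1. Intermediate values of beta show how much path length can be gained for a given increase in total link length on a particular point set.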
