The Emerging Power of Network Analysis for Complex Systems


Christopher L. Magee   SDM Alumni Conference
                           October 28, 2005


                           Professor C. Magee, 2005
                           Page 1
 Motivating Questions and Outline of talk

• What has been going on recently in “The New
  Science of Networks”?
• Why might we (who are interested in design,
  management, etc. of complex systems) care?
• What are some interesting and possibly lasting
  findings?
• What are the general strengths and possible
  weaknesses of using these approaches?
• How far have we gotten towards the potential value
  to us?


 What has been going on recently in
 “The New Science of Networks”?

• Physicists and their collaborators have moved strongly into this area,
  starting with the paper in Nature by Watts and Strogatz in 1998
• Publications started at a few per year and have now reached
  hundreds per year in various journals (plus 3 books).
• All of the effort builds upon work done by sociologists and
  operations researchers over the preceding 40 or more years.
• Strong activities now exist at a variety of academic institutions:
    – The University of Michigan
    – Oxford University
    – The Santa Fe Institute
    – Columbia University
    – The University of Notre Dame
    – Many others
  New Network Science Essentials

• Network Analysis (or Science) consists of a relatively simple
  way (Euler was first) of modeling or representing a system
    – Each element (or subsystem or ...) is a node
    – Each relationship between nodes (elements or ...) is a link
• The appeal of generality of application is based upon the very
  simple model for a system described by this representation
  combined with the mathematics of graph theory for quantifying
  various aspects of such models.
• A limitation on widespread utility is this same simplicity

• The research front is where people are sacrificing as little
  simplicity as possible while making the models reflect more
  reality and thus have increased utility

  Why might we (who are interested in
  design, management, behavior, etc.
  of complex systems) care?
• A strong mathematical basis is being established for developing
  relatively tractable models of large-scale complex systems
    – We need more modeling tools that are useful for large-scale
       systems with many elements, interactions and complex
       behaviors
• Quantifiable metrics are being developed that may be of use in
  predicting behavior of complex large-scale systems
    – We need such metrics as they would be valuable in
       designing and managing our systems
• Algorithms for extracting information from complex systems
  are being developed and these can improve “observability” of
  such systems.
• New visual representations for complex systems are being
  developed
 Comparative Progress in Understanding
 and performance: CLM
 objective/subjective observations
• 1940-2000 improvement
    – Small-scale electro-mechanical systems (x40-100)
    – Energy transformation systems (x 10-20)
    – Information processing systems (x 10^5 to 10^7)
    – Cosmology (x 30-100)
    – Paleontology (x 50)
    – Organizational theory and practice (x 1.1 to 2)
    – Economic systems (x 1.1 to 2)
    – Complex large-scale socio-technological systems (?)
• Possible reasons for large differences for organization issues
    – Lack of attention by deep thinkers
    – Low utilization of mathematical tools
    – “Hardness” of problem particularly human intent
    – Lack of detailed quantitative observation to improve models

 What is needed to greatly improve the
 practice of complex social/ technological
 system design?
• The major opportunity is to transition from the “pre-
  engineering” (experiential or craft) approach now widely used
  to a solid (post 1870 at least) engineering approach to these
  design problems. What does this entail?
• If you are doing work where all factors involved are
  quantitatively and accurately determined from mathematical
  approaches, you are possibly doing accounting or actuarial
  work but you are not doing engineering design because you are
  not doing creative work.
• If you are pursuing a creative end but are using no quantitative
  methods developed from a scientific perspective, you are
  possibly a sculptor or a painter but are certainly not an engineer
• The critical need to greatly accelerate the rate of improvement
  in practice is objective methods for quantitative observation to
  develop reliable and well-understood “small” models.
The Iterative Learning Process
  [Diagram: objectively obtained quantitative data (facts, phenomena) and
   hypothesis (model, theory that can be disproved), linked by alternating
   deduction and induction steps]

                   As this process matures,
             what new can the models accomplish?


     The major accomplishment will be the rapid facilitation
    of a transition to engineering (vs. craft approaches) for the
          design of complex social/ technological systems


  What are some interesting and
  possibly lasting findings?
• Metrics
• Comparison among network descriptions of different kinds of
  systems
    – Social
    – Information
    – Biological
    – Technological
• Models

• Key characterization procedures (and algorithms)
   – Community Structure
   – Motifs and coarse-graining
   – Self-dissimilarity

 Network metrics

• Size
• Density of interactions, sparseness
• Path Length –dependence on size




 Network Metrics I
• n, the number of nodes
• m, the number of links
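• <k> is the average degree, where the degree, k, of a node is the
  number of links on it: <k> = m/n when each link is counted at one
  end only (mean in- or out-degree of a directed network) and
  <k> = 2m/n for an undirected network, where each link adds to the
  degree of two nodes
• <k>/(n-1) is the “sparseness” or normalized interconnection
  “density”; with <k> = m/n this is m/[(n)(n-1)]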
• Path length, l

    $$ l = \frac{1}{\frac{1}{2} n(n+1)} \sum_{i \ge j} d_{ij} $$

    – “Small Worlds”
    – In a “Small World”, l is relatively small
    – At a given <k>, l rising in proportion to ln n, or more slowly,
      is taken to mean “Small World” (where clustering is high)
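
The metrics on this slide can be computed directly from a node count and a link list. The following is a minimal sketch, assuming an undirected, connected network (so the mean degree is 2m/n) and the path-length normalisation n(n+1)/2 used in Newman's review, which includes each node's zero distance to itself. The function name `network_metrics` and the six-node ring are illustrative assumptions, not part of the talk.

```python
from collections import deque

def network_metrics(n, edges):
    """Basic metrics for an undirected, connected network.

    n     -- number of nodes, labelled 0 .. n-1
    edges -- list of (i, j) pairs, one entry per undirected link
    """
    m = len(edges)
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)

    mean_degree = 2 * m / n           # each undirected link has two ends
    density = mean_degree / (n - 1)   # fraction of possible links present

    # Sum shortest-path distances d_ij over unordered pairs using
    # one breadth-first search per source node.
    total = 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(d for node, d in dist.items() if node > source)

    path_length = total / (0.5 * n * (n + 1))
    return {"n": n, "m": m, "mean_degree": mean_degree,
            "density": density, "path_length": path_length}

# Hypothetical example: a ring of six nodes.
ring = [(i, (i + 1) % 6) for i in range(6)]
metrics = network_metrics(6, ring)
```

For large networks one BFS per node costs O(nm) in total, which is usually acceptable for sparse networks of the sizes discussed here.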


    Network metrics II

• Size
• Density (mean degree), maximum degree
• Path Length –dependence on size
• Connectivity
• Degree Distribution-(power laws normal and uninformative)
• Social Network Analysis
    – Centrality, prestige, closeness, proximity, etc.
• Clustering
• Degree Correlation Coefficient, assortativity and homophily




           Definition of the clustering coefficient C




This network has one triangle and eight connected triples, and therefore has a
clustering coefficient, C1, of 3 x 1/8 = 3/8. The individual vertices have local
clustering coefficients of 1, 1, 1/6, 0 and 0, for a mean value, C2, of 13/30.

Source: M. E. J. Newman, The Structure and Function of Complex Networks, SIAM Review, Vol. 45, No. 2, pp. 167–256, 2003, Society for Industrial and Applied Mathematics
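
The two definitions can be checked with a short calculation. The sketch below assumes a five-node edge list (a triangle 1-2-3 with two extra nodes hanging off node 3); the figure itself is not reproduced here, so this graph is an assumption chosen to match the counts quoted above. Nodes with fewer than two neighbours are given a local coefficient of 0.

```python
from fractions import Fraction
from itertools import combinations

# Assumed edge list: one triangle (1, 2, 3) plus two pendant nodes on node 3.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5)]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

def local_counts(v):
    """Return (linked neighbour pairs, all neighbour pairs) for node v."""
    pairs = list(combinations(sorted(adj[v]), 2))
    closed = sum(1 for a, b in pairs if b in adj[a])
    return closed, len(pairs)

# Every pair of neighbours of v is one connected triple centred on v.
triples = sum(local_counts(v)[1] for v in adj)
# Each triangle is seen once from each of its three corners.
triangles = sum(local_counts(v)[0] for v in adj) // 3

# C1: three times the number of triangles over the number of connected triples.
C1 = Fraction(3 * triangles, triples)

# C2: mean of the local coefficients.
local = []
for v in sorted(adj):
    closed, pairs = local_counts(v)
    local.append(Fraction(closed, pairs) if pairs else Fraction(0))
C2 = sum(local) / len(local)
```

Exact fractions are used so that the results can be compared with 3/8 and 13/30 without rounding.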




  Selective Linking
• Which nodes link with which nodes: food webs, social networks
• Internet
    – Three broad categories of nodes: A. Internet backbone, B.
       ISPs and C. customers
    – Many A-B and B-C links but few A-C or C-C
• Social networks researchers have labeled this assortative mixing
  or homophily. Age, income, geography, profession etc. show
  correlations
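• Assortativity coefficient, r, from the normalized sociomatrix, e, where
  E_ij is the number of links that connect node types i and j and
  ||x|| denotes the sum of all elements of the matrix x:

    $$ e = \frac{E}{\|E\|}, \qquad r = \frac{\operatorname{Tr} e - \|e^2\|}{1 - \|e^2\|} $$

• r is 0 for randomly mixed networks and 1 for perfectly
  assortative networks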
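
As a rough illustration of the coefficient, the sketch below computes r from a matrix of link counts between node types. The function name `assortativity` and the three-type count matrix (loosely modelled on the backbone / ISP / customer example, with made-up numbers) are assumptions for illustration only.

```python
def assortativity(E):
    """Assortativity coefficient r from a square matrix of link counts.

    E[i][j] is the number of links connecting node type i to node type j.
    """
    k = len(E)
    total = sum(sum(row) for row in E)   # ||E||: sum of all elements
    e = [[E[i][j] / total for j in range(k)] for i in range(k)]
    trace = sum(e[i][i] for i in range(k))
    # ||e^2||: sum of all elements of the matrix product e.e
    e2_sum = sum(e[i][l] * e[l][j]
                 for i in range(k) for l in range(k) for j in range(k))
    # Undefined (division by zero) when ||e^2|| = 1, e.g. a single node type.
    return (trace - e2_sum) / (1 - e2_sum)

# Hypothetical counts: many A-B and B-C links, no A-A, A-C or C-C links.
internet_like = [[0, 10, 0],
                 [10, 0, 30],
                 [0, 30, 0]]
r_internet = assortativity(internet_like)

r_perfect = assortativity([[5, 0], [0, 5]])   # links only within a type
r_random = assortativity([[1, 1], [1, 1]])    # types mixed at random
```

A network in which links mostly join unlike types, as in the first example, gives a negative r.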

 Degree correlation

• A special case of assortative mixing is according to node
  degree.
    – Do high-degree nodes associate with other high-degree
      nodes? or do they prefer low-degree nodes?
    – This has been done by 2 dimensional histograms and by
      studying mean degree of node neighbors as a function of k.
    – Recently, it has been shown that the Pearson correlation
      coefficient of the degrees at either end of an edge is the
      most compact representation. This coefficient is positive
      for assortatively mixed networks and negative for
      disassortative networks.
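
The Pearson coefficient described above can be sketched as follows for an undirected network, where each link contributes its pair of end-point degrees in both orders. The function name `degree_correlation` and the two small test graphs are illustrative assumptions.

```python
def degree_correlation(edges):
    """Pearson correlation of the degrees at the two ends of each link."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1

    # An undirected link (a, b) contributes (k_a, k_b) and (k_b, k_a).
    xs, ys = [], []
    for a, b in edges:
        xs += [deg[a], deg[b]]
        ys += [deg[b], deg[a]]

    count = len(xs)
    mean = sum(xs) / count   # xs and ys have the same mean by symmetry
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / count
    var = sum((x - mean) ** 2 for x in xs) / count
    if var == 0:
        raise ValueError("all link ends have the same degree; r is undefined")
    return cov / var

# A star: the high-degree hub links only to degree-1 leaves.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
r_star = degree_correlation(star)

# A triangle with two pendant nodes attached to one corner.
small = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5)]
r_small = degree_correlation(small)
```

Both examples are disassortative by degree, so both coefficients come out negative.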




  What are some interesting and
  possibly lasting findings?
• Metrics
• Comparison among network descriptions of different kinds of
  systems
    – Social
    – Information
    – Biological
    – Technological
• Models

• Key characterization procedures (and algorithms)
   – Community Structure
   – Motifs and coarse-graining
   – Self-dissimilarity

 Example Networks

• Road map
• Electric circuit or pipe system
• Structure of bridge or building, with load paths
• Organizational chart
• Acquaintance network
• Supply chain
• Markov chain
• Electric power grid
• Co-authors of papers
• Citations in papers
• Food webs
• Nerve pathways
• Gene control systems
• Electronic control systems
• Phone system
• Chemical reaction
• Sequential event plan
• Internet
• Product development tasks
• 1000s of other examples

              Various classes of networks



[Figure: four example networks]
• an undirected network with only a single type of node and a single type of link
• a network with a number of discrete node and link types
• a network with varying node and link weights
• a directed network in which each link has a direction

Acyclic and cyclic on later slide

Source: M. E. J. Newman, The Structure and Function of Complex Networks, SIAM Review, Vol. 45, No. 2, pp. 167–256, 2003, Society for Industrial and Applied Mathematics



 Network Typology

• Social Networks are sets of people or groups of people with
  some pattern of contact or interaction among them; those
  studied include friendships between individuals, business
  relationships between companies, intermarriages between
  families and many others, as this work dates from the 1920s.
• Information Networks are those where the nodes contain
  information or knowledge (citation networks and the world
  wide web are the best studied examples).




         The 2 best studied information networks;
         citation network and the World Wide Web




Citation network of academic papers, in which the nodes are papers and the
directed links are citations of one paper by another. Since papers can only
cite those that came before them, the graph is acyclic: it has no closed loops.

World Wide Web, a network of text pages accessible over the Internet, in which
the nodes are pages and the directed links are hyperlinks. There are no
constraints on the Web that forbid cycles and hence it is in general cyclic.

Source: M. E. J. Newman, The Structure and Function of Complex Networks, SIAM Review, Vol. 45, No. 2, pp. 167–256, 2003, Society for Industrial and Applied Mathematics
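
Whether a directed network is acyclic can be tested mechanically. The sketch below uses Kahn's topological-sort procedure, a standard technique that is not mentioned in the talk: nodes with no incoming links are removed repeatedly, and the graph is acyclic exactly when every node can be removed. The function name `is_acyclic` and the paper and page labels are hypothetical.

```python
from collections import deque

def is_acyclic(edges):
    """True if the directed graph given by (source, target) pairs has no cycle."""
    out = {}
    indeg = {}
    for s, t in edges:
        out.setdefault(s, []).append(t)
        out.setdefault(t, [])
        indeg[t] = indeg.get(t, 0) + 1
        indeg.setdefault(s, 0)

    # Start from nodes that nothing points to, and peel the graph away.
    queue = deque(v for v, d in indeg.items() if d == 0)
    removed = 0
    while queue:
        v = queue.popleft()
        removed += 1
        for w in out[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                queue.append(w)
    return removed == len(indeg)

# Hypothetical citation network: later papers cite earlier ones only.
citations = [("paper_2003", "paper_1998"),
             ("paper_2003", "paper_2002"),
             ("paper_2002", "paper_1998")]

# Hypothetical web pages that link to each other.
hyperlinks = [("page_a", "page_b"), ("page_b", "page_c"), ("page_c", "page_a")]
```

Here `is_acyclic(citations)` holds while `is_acyclic(hyperlinks)` does not, mirroring the distinction drawn in the two captions.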



 Network Typology

• Technological Networks are human designed and produced
  usually for distributing some resource. Examples include
  electric power, internet, road systems, supply chains, railways,
  telephone system, water and many others
• Biological Networks are biological systems usefully
  represented as networks and include metabolic pathways,
  protein networks, gene regulation, food webs, neural networks,
  blood vessels and vascular networks in plants, etc.
 Metrics in a Variety of Networks
• See Table Handout (adapted from Newman review
  article)
    – 10 of the 27 networks are directed
    – Large range of network scales
    – Good coverage of all 4 network “types”.
    – Decent scale range in all 4 types but the maximum
       size of studied technological and biological networks
       is smaller
    – Decent range on <k> overall and all are sparse
       (biological and technological less so)
    – All social networks are assortative-positive r (except
       student relationships)
    – All technological and biological are non-assortative
    – Information networks are an open question

  What are some interesting and
  possibly lasting findings?
• Metrics
• Comparison among network descriptions of different kinds of
  systems
    – Social
    – Information
    – Biological
    – Technological
• Models

• Key characterization procedures (and algorithms)
   – Community Structure
   – Motifs and coarse-graining
   – Self-dissimilarity

 Network Mathematical Models

• Poisson Random Networks
• Small World Models
• Generalized Random Networks
• Growth Models
• Search Models

• Community-based Models
• Reliability and Failure Cascade Models
• Communication Models
• Organization Modeling
  Metrics/Models in a Variety of Networks

• See Table Handout (adapted from Newman review article)
    – Large range of network scales
    – 10 of the 27 networks are directed
    – Good coverage of all 4 network “types”.
    – Decent scale range in all 4 types but the maximum size of studied
       technological and biological networks is smaller
    – Decent range on <k> overall and all are sparse
    – All social networks are assortative (except student relationships)
    – All technological and biological are non-assortative
    – Path Length, l, is generally small (small worlds) and often
       approximately equal to that given by Poisson random network
    – Clustering is usually orders of magnitude higher than predicted
       by random networks for the large networks and is ~constant
                                        
    – Degree correlation as predicted for social networks with
       homophily
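
For the comparison with a Poisson random network, the usual large-n approximations are a path length of about ln n / ln <k> and a clustering coefficient of about <k>/n. The sketch below evaluates both; the function name `poisson_random_baseline` and the values n = 10,000 and <k> = 10 are hypothetical and are not taken from the table handout.

```python
from math import log

def poisson_random_baseline(n, mean_degree):
    """Approximate path length and clustering of a Poisson random network.

    Uses the standard large-n estimates: l ~ ln n / ln <k> and C ~ <k> / n.
    Requires mean_degree > 1 so that the logarithm is positive.
    """
    path_length = log(n) / log(mean_degree)
    clustering = mean_degree / n
    return path_length, clustering

# Hypothetical network size and mean degree.
l_random, c_random = poisson_random_baseline(10_000, 10)
```

Because the random-network clustering falls as 1/n while measured clustering stays roughly constant, the gap between the two widens with network size.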

  What are some interesting and
  possibly lasting findings?
• Metrics
• Comparison among network descriptions of different kinds of
  systems
    – Social
    – Information
    – Biological
    – Technological
• Models

• Key characterization procedures (and algorithms)
   – Community Structure
   – Motifs and coarse-graining
   – Self-dissimilarity

 Community Structure

• Various algorithms have been developed for decomposing a
  network into its “logical” sub-structure:
    – For simple (all nodes equivalent) models, the most useful of
      these is one due to Girvan and Newman that finds the
      fewest links to cut in order to find the most appropriate
      subsystems.
    – Separation and optimization parameters are also calculated
      that allow one to determine the “best” number of
      subsystems
• Example from a co-author network
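
One way to compare candidate decompositions is the modularity Q of Newman and Girvan: the fraction of links that fall inside subsystems minus the fraction expected if links were placed at random with the same node degrees. The slide does not name its optimization parameter, so treating it as modularity is an assumption; the function name and the six-node test network below are illustrative as well.

```python
def modularity(edges, communities):
    """Modularity Q of a partition of an undirected network.

    edges       -- list of (a, b) pairs, one per undirected link
    communities -- list of collections of nodes covering every node once
    """
    m = len(edges)
    label = {v: c for c, group in enumerate(communities) for v in group}
    internal = [0] * len(communities)     # links with both ends in community c
    degree_sum = [0] * len(communities)   # total degree of nodes in community c
    for a, b in edges:
        degree_sum[label[a]] += 1
        degree_sum[label[b]] += 1
        if label[a] == label[b]:
            internal[label[a]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in range(len(communities)))

# Hypothetical network: two triangles joined by a single bridging link (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]

q_natural = modularity(edges, [{0, 1, 2}, {3, 4, 5}])   # cut the bridge
q_awkward = modularity(edges, [{0, 1}, {2, 3, 4, 5}])   # cut through a triangle
q_single = modularity(edges, [{0, 1, 2, 3, 4, 5}])      # no decomposition
```

The partition that cuts only the bridge scores highest, which is the sense in which such a parameter picks out the "best" number of subsystems.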




 Motifs

• Milo et al. first extended the concept of motifs beyond
  sociological networks in a 2002 article in Science titled
  “Network Motifs: Simple Building Blocks of Complex
  Networks”
    – They defined motifs in this paper as patterns of
      interactions that occur at significantly higher rates in an
      actual network than in randomized networks and
      developed an algorithm for extracting them from (directed)
      networks
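
The idea of comparing pattern counts in a real network with counts in randomized versions can be sketched as follows. This is a simplified illustration, not the published algorithm: it counts only one three-node pattern (the feed-forward loop x->y, y->z, x->z), randomizes by swapping link targets so that each node keeps its in- and out-degree, and reports a z-score. All function names and the example link list are hypothetical.

```python
import random
from statistics import mean, pstdev

def count_feed_forward(edges):
    """Count feed-forward loops: links x->y, y->z and x->z."""
    edge_set = set(edges)
    succ = {}
    for a, b in edge_set:
        succ.setdefault(a, set()).add(b)
    count = 0
    for x, y in edge_set:
        for z in succ.get(y, ()):
            if z != x and (x, z) in edge_set:
                count += 1
    return count

def randomize(edges, swaps, rng):
    """Degree-preserving randomization by repeated swaps of link targets."""
    edges = list(edges)
    present = set(edges)
    for _ in range(swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        # Proposed swap: a->d and c->b; reject self-loops and duplicates.
        if a == d or c == b or (a, d) in present or (c, b) in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return edges

def motif_z_score(edges, samples=200, seed=0):
    """Return (real count, mean random count, z-score or None)."""
    rng = random.Random(seed)
    real = count_feed_forward(edges)
    counts = [count_feed_forward(randomize(edges, 10 * len(edges), rng))
              for _ in range(samples)]
    spread = pstdev(counts)
    z = (real - mean(counts)) / spread if spread > 0 else None
    return real, mean(counts), z

# Hypothetical directed network containing several feed-forward loops.
example = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4),
           (4, 5), (5, 6), (4, 6), (6, 7), (7, 0)]
real, expected, z = motif_z_score(example)
```

A pattern would be reported as a motif when its z-score is well above zero; the threshold, the number of random samples and the number of swaps per sample are all choices left open here.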




Schematic of network motif detection. Motifs are found in the real
network (A) much more frequently than in an ensemble of random networks (B)

 Motifs b

• Milo et al. first extended the concept in a 2002 article in Science
  titled “Network Motifs: Simple Building Blocks of Complex
  Networks”
     – They define motifs as patterns of interactions that occur
       significantly more often than in randomized networks
     – They studied 19 networks (in six different classes)
         • For 2 gene transcription networks they found that the
           two different transcription systems showed the same
           motifs




The number of times these two motifs occur is more than 10 standard
deviations greater than their mean number of appearances in randomized
networks. None of the other three-node patterns (of 13 possible) or
four-node patterns (of 199 possible) appears more often than the mean plus
2 standard deviations of its appearances in randomized networks.

Dependence of detectability of the 3 node motif on
     the size of the E. coli network studied.


 Motifs c

• Milo et al. first extended the concept in a 2002 article in Science
  titled “Network Motifs: Simple Building Blocks of Complex
  Networks”
     – They define motifs as patterns of interactions that occur
       significantly more often than in randomized networks
     – They studied 19 networks (in six different classes)
         • For 2 gene transcription networks they found that the
           two different transcription systems showed the same
           motifs
         • For 8 electronic circuits (in 2 classes), they found
           reproducible motifs at high concentration for each class
           of circuit studied




The extremely high ratios for the motifs in these cases (even at small
size) can probably be interpreted as evidence of design intent. For these
small technological systems, the importance of available modules probably
accounts for the reuse of the same “motifs” across circuits of the same class.

 Motifs d
• Milo et al. first extended the concept in a 2002 article in Science
  titled “Network Motifs: Simple Building Blocks of Complex
  Networks”
     – They define motifs as patterns of interactions that occur
        significantly more often than in randomized networks
     – They studied 19 networks (in six different classes)
          • For 2 gene transcription networks they found that the two
            different transcription systems showed the same motifs
          • For 8 electronic circuits (in 2 classes), they found
            reproducible motifs at high concentration for each class of
            circuit studied
• One interesting conclusion is that the technique can be applied to
  networks with variable nodes and links.
• A second interesting conclusion coming from comparison of neurons,
  genes, food webs and electronic circuits is
     – “Information processing seems to give rise to significantly
        different structures than does energy flow.” The possible
        relevance to Whitney’s work is intriguing.
Self-similarity and self-dissimilarity
• Wolpert and Macready (2000) introduced the concept of
  self-dissimilarity as a complexity metric
• Self-dissimilarity is defined as “the variability of interaction
  patterns of a system at different spatio-temporal scales”
• Note that this definition is in a sense counter to the (often
  loosely defined) notion of “scale free”, which implies (or at
  least seems to) that structure is repetitive at various scales
• Wolpert and Macready devised relatively elaborate methods
  for applying their concept statistically and demonstrated it only
  through numerical simulations
• Itzkovitz et al. (2004) have recently developed a method they
  call “coarse-graining” based on their prior work on motifs. This
  method also assesses self-dissimilarity and has been applied to
  biological and technological networks.

Hierarchical Organization of Modularity in Metabolic Networks
E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai, A.-L. Barabási,
Science, Vol. 297, p. 1551, 30 August 2002




  A. Scale free
  B. Modular, not scale free
  C. Nested modular, scale free


Coarse-Graining: An extension of motifs

• Itzkovitz et al. investigate coarse-graining as an objective
  means for “reverse-engineering” that can be applied even when
  the lower-level functional units are unknown (biological focus).
• The coarse-grained version of a network is a new network with
  fewer elements. This is achieved by replacing some of the nodes
  by coarse-grained units, or CGUs (patterns of node interactions
  at the level being examined: motifs, chosen somewhat differently).
• Itzkovitz et al. apply simulated annealing to arrive at an
  optimum set of CGUs (minimizing the “vocabulary” of CGUs
  while maximizing the coverage of the original network by the
  coarse-grained description).
• Applying this algorithm to an electronic circuit...



Transistor level map of an 8 bit binary counter used in a digital fractional
 multiplier. Highlighted is a sub-graph that represents the transistors that
make up one NOT gate. Examining possible motifs up to 6 nodes shows


Two sets of possible optimal motif-based CGUs. The choice shown in solid
boxes can be arranged to arrive at a “gate-level coarse-graining”


   In the transistor level, nodes represent transistor junctions. In the gate
level, nodes are CGU’s, made of transistors, each representing a logic gate.
  Shown is the CGU that corresponds to a NAND gate. Re-applying the
       coarse-graining optimization sequentially yields 2 more levels..

 In the flip-flop level, nodes are either gates or a CGU made of gates that
  corresponds to a D-type-flip-flop with an additional gate as logic input.
In the counter level, each node is either a gate or a CGU of gates/flip-flops
                      that corresponds to a counter unit.
Coarse-Graining b

• Itzkovitz et al. investigate coarse-graining as an objective
  means for “reverse-engineering” that can be applied even when
  the lower-level functional units are unknown (biological focus).
• The coarse-grained version of a network is a new network with
  fewer elements. This is achieved by replacing some of the nodes
  by CGUs (patterns of node interactions at the level being
  examined).
• Itzkovitz et al. apply simulated annealing to arrive at an
  optimum set of CGUs (minimizing the “vocabulary” of CGUs
  while maximizing the coverage of the original network by the
  coarse-grained description).
• Applying this algorithm to an electronic circuit, one finds a
  four-level description which has variable functional significance
  and self-dissimilarity at each level

Self-dissimilarity at multiple levels in the electronic circuit.
 This change of patterns with level apparently applies to all
  biological and technological networks studied thus far.


Coarse-Graining c
• Itzkovitz et al. investigate coarse-graining as an objective means for
  “reverse-engineering”.
• The coarse-grained version of a network is a new network with fewer
  elements.
• Itzkovitz et al. apply simulated annealing to arrive at an optimum set
  of CGUs
• Applying this algorithm to an electronic circuit, one finds a four-level
  description which has variable functional significance and
  self-dissimilarity at each level
• Note the fundamental difference between coarse-graining and
  algorithms for detection of community structure:
     – Community structure algorithms try to optimally divide networks
       into sub-graphs with minimal interconnections (modularity 1)
       but these sub-graphs are distinct and complex
     – Coarse-graining seeks a small dictionary of simple sub-graph
       types in order to elucidate the function of the network in terms
       of recurring building blocks (modularity 2)

   Different Definitions of “Modular” or
   “Module” (after Whitney)
• You can see different elements and the places where they join
  (modularity 1)
• Each item does a specific thing (form-function, genotype-
  phenotype in a one-to-one relationship) (Suh, Altenberg)
  (modularity 2)
• You need only know how to use them and don’t need to know
  what’s inside (modularity 2)
• Interconnectedness is concentrated inside them
  (Alexander)(software design) (modularity 1)
• Their links to the outside are standardized (modularity 2), or
  simple and few (Alexander) (modularity 1)




Coarse-Graining d
• Itzkovitz et al. investigate coarse-graining as an objective
  means for “reverse-engineering”.
• The coarse-grained version of a network is a new network with
  fewer elements.
• Itzkovitz et al. apply simulated annealing to arrive at an
  optimum set of CGUs
• Applying this algorithm to an electronic circuit, one finds a
  four level description which has variable functional significance
  and self-dissimilarity at each level
• Note the fundamental difference between Coarse-Graining and
  algorithms for detection of community structure
• Research Hypothesis: Simultaneous study of communities and
  CGU’s in a variety of complex technological systems would
  further clarify the concept of modularity.
• Note that motifs and coarse-graining have thus far only been
  applied to fairly simple technological systems

 What are the general strengths and
 possible weaknesses of using these
 approaches?
• Some Sub-questions:
   – To what extent are all networks the same?
   – What network growth/development processes can be
     modelled and are these actually operating in specific
     networks?
   – How should local and global descriptions of networks be
     integrated?
   – What principles appear to operate in nature to select
     network and agent characteristics?
   – To what extent can the desirable properties from social,
     biological and other networks be transferred to designed
     networks?
   – What principles (should) operate in designed and human
     evolved systems?


 How far have we gotten towards the
 potential value to us?
• First examine some examples of initial studies and
  then consider some general questions
• Examples
    – Worldwide Airport Network
    – Shape and efficiency in Spatial distribution
      networks
    – Heuristic Design of the Internet Network




 Three Case studies from the network
 literature: I Worldwide Airport Network
• 1. The Structure and Efficiency of the Worldwide Airport
  network
    – By Guimera et al.
    – More analysis and reverse engineering than design
• Network of 3883 cities with airports studied to examine the
  drivers of airport utilization and the evolution of the network
• The work focused on all passenger flights from Nov. 1 to Nov. 7,
  2000: 531,574 unique non-stop flight segments between the
  3883 cities

• Guimera et al. view the airport network as a communication
  (process ID) network and interpret airports as routers (queues
  that receive passengers and direct them to a new destination).



 Worldwide Airport Network b

• The authors assert that the relevant design question is: “What
  is the network topology that minimizes the number of waiting
  flights/passengers?”
• They also hypothesize the plausibility of a star-network being
  optimal (at least regionally and up to a traffic limit)




 Worldwide Airport Network c
• They also hypothesize that as flight frequency increases, the
  waiting times for planes and passengers (at the single hub)
  become unacceptably large, so the star is replaced by a partly
  decentralized network…




 Worldwide Airport Network d
• They test whether the multiple hubs seen in the actual network
  evolved according to their hypotheses (principles) and conclude
  that physical limits in router capacity, not just saturation, do
  limit the capacity of a given airport




• Guimera et al. also study betweenness centrality of all the
  airports and arrive at the same conclusion from these data.
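
As a reminder of what betweenness centrality measures, the sketch below counts, for each node of a small hub-and-spoke graph, how many shortest paths between other node pairs pass through it (Brandes' algorithm for unweighted graphs). The five-node graph and its labels are hypothetical and are not the airport data.

```python
from collections import deque

def betweenness(adj):
    """Betweenness centrality for an unweighted, undirected graph
    given as {node: [neighbours]} (Brandes' algorithm)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma)
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every undirected pair was counted once from each end
    return {v: b / 2 for v, b in bc.items()}

# Hypothetical toy network: one hub, three spokes, one remote node behind C
adj = {
    "HUB": ["A", "B", "C"],
    "A": ["HUB"],
    "B": ["HUB"],
    "C": ["HUB", "D"],
    "D": ["C"],
}
bc = betweenness(adj)
print(bc)
```

In this toy case the best connected node is also the most central one; the talk's point is that real airport data depart from that pattern.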

 Worldwide Airport Network e
• Guimera et al. also show that, under preferential attachment,
  the most connected cities would also be the most central cities,
  but that the real data do not show this for geographic and
  political reasons.
• Further desired work
    – Consideration of other properties of importance in the
       world-wide airport network such as
        • Fuel costs, flight lengths and airplane limits, economic
          tradeoffs that passengers might make (e.g. time and cost)
    – Study of airport routing capacity from empirical data and
       in-depth study of factors limiting airport routing capacity
       such as
        • Flight number limits by runway capacity, by air traffic
          control capacity (landing pattern limits), gate capacity…
        • Router capacity due to airport/customer use limitations
          such as distance between gates, ground transport links…
 Three Case studies from the network
 literature: II Spatial distribution networks
• 1. The Structure and Efficiency of the Worldwide Airport
  network
• 2. Shape and efficiency in Spatial distribution networks
    – By Gorman and Kulkarni and by Gastner and Newman
    – A suggested design method compared to existing systems
• Study of technological systems with nodes that are located at
  specific geographical sites. Such network representations can
  be used as models for
    – many distribution networks such as water, sewage, oil
      pipelines, natural gas, electric power grids, Fedex
    – many transportation networks such as air, rail, road etc.




 Spatial distribution networks b
• The model developed was for the case where the distribution
  system has a “root node” which is the sole source or sink for
  the items being distributed.

• Additional Design factors considered
   – Additional node locations (constraint)
   – Total link length (minimize to minimize cost )
   – Shortest path length between two nodes (minimize to
      minimize transport time)
• The tradeoff between the last two factors is the design/architecting
  problem
   – Look at ideal solutions for each criterion
   – Examine how real networks compare on the tradeoffs
   – Build growth model to derive pattern and look for
      consistency.


 Spatial distribution networks c

• For actual example system (a), minimum total edge length
  including paths to the root node is given by a Minimum
  Spanning Tree (c) while obtaining shortest paths to the root
  node is optimized by a star graph (b)
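
The two ideal solutions can be compared numerically. The sketch below builds a Minimum Spanning Tree with Prim's algorithm and a star graph over the same five hypothetical node locations (index 0 is the root node), then reports total edge length and mean path length to the root for each. The coordinates and function names are assumptions made for illustration.

```python
import math

def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def prim_mst(points):
    """Return parent[] of a minimum spanning tree rooted at index 0."""
    n = len(points)
    parent = [None] * n
    best = [math.inf] * n   # cheapest known link from the tree to each node
    best[0] = 0.0
    in_tree = [False] * n
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        for v in range(n):
            d = dist(points[u], points[v])
            if not in_tree[v] and d < best[v]:
                best[v], parent[v] = d, u
    return parent

def tree_stats(points, parent):
    """Total edge length and mean path length to the root (index 0)."""
    def path_to_root(i):
        length = 0.0
        while parent[i] is not None:
            length += dist(points[i], points[parent[i]])
            i = parent[i]
        return length
    n = len(points)
    total = sum(dist(points[i], points[parent[i]]) for i in range(1, n))
    mean_path = sum(path_to_root(i) for i in range(1, n)) / (n - 1)
    return total, mean_path

# Hypothetical node locations; the root node is points[0]
points = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1)]
star_parent = [None, 0, 0, 0, 0]   # every node linked directly to the root
mst_parent = prim_mst(points)

star_total, star_path = tree_stats(points, star_parent)
mst_total, mst_path = tree_stats(points, mst_parent)
print("star:", star_total, star_path)
print("MST: ", mst_total, mst_path)
```

The MST wins on total edge length and the star wins on path length to the root, which is the tradeoff the slide describes.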




 Spatial distribution networks d
• From transportation research, a route factor is
   $q = \frac{1}{n} \sum_{i=1}^{n} l_{i0} / d_{i0}$
   with $l_{i0}$ the shortest actual path length from node i to the
   root node and $d_{i0}$ the shortest Euclidean distance between
   them; q is equal to 1 for a star graph

• For three real technological system networks,




• The systems favor minimum edge length but have route factors considerably
  superior to MST optimums indicating effective tradeoff in the two criteria.
• A simple growth model is used to explain this result
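
A minimal sketch of the route factor calculation follows. It assumes the route factor is the average, over all non-root nodes, of the path length to the root along the network divided by the straight-line distance to the root. The tree encoding (a `parent` list with the root at index 0) and the four node locations are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def route_factor(points, parent):
    """Mean of (path length to root) / (straight-line distance to root)
    over all non-root nodes; the root is index 0 and parent[0] is None."""
    ratios = []
    for i in range(1, len(points)):
        path, node = 0.0, i
        while parent[node] is not None:
            path += dist(points[node], points[parent[node]])
            node = parent[node]
        ratios.append(path / dist(points[i], points[0]))
    return sum(ratios) / len(ratios)

# Hypothetical layout: four nodes on the corners of a unit square
points = [(0, 0), (1, 0), (1, 1), (0, 1)]
q_star = route_factor(points, [None, 0, 0, 0])   # all nodes linked to root
q_chain = route_factor(points, [None, 0, 1, 2])  # nodes linked in a chain
print(q_star, q_chain)
```

The star gives a route factor of exactly 1, while the chain, which has the same total edge length as a spanning tree of the square, gives a much larger value.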

 Spatial distribution networks e
• The growth model assumes that the systems evolve from the root node
  by adding new (but already existing) nodes using a greedy optimization
  criterion that adds unconnected node, i, to an already connected node, j
  with the weighting factor given by

• Simulations using these model assumptions yield
  [figure not reproduced]
  showing that small tradeoffs in total link length give large
  improvements in path length

• In real systems,      =?
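
The greedy growth rule can be sketched as below. The weighting formula did not survive on the slide, so the sketch assumes a weight of the form d_ij + beta * l_j0, where d_ij is the Euclidean distance from the unconnected node i to the connected node j, l_j0 is the path length from j to the root, and beta is a tunable parameter. That form, the parameter name `beta` and the node locations are assumptions chosen for illustration only.

```python
import math

def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def grow(points, beta):
    """Greedy growth from the root (index 0).

    At each step, attach the unconnected node i to the connected node j
    that minimizes the ASSUMED weight d_ij + beta * l_j0.
    Returns (parent list, path length to root for each node)."""
    n = len(points)
    parent = [None] * n
    path = [0.0] * n
    connected = {0}
    while len(connected) < n:
        _, i, j = min(
            (dist(points[i], points[j]) + beta * path[j], i, j)
            for i in range(n) if i not in connected
            for j in connected
        )
        parent[i] = j
        path[i] = path[j] + dist(points[i], points[j])
        connected.add(i)
    return parent, path

# Hypothetical node locations; the root node is points[0]
points = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1)]
mst_like, _ = grow(points, beta=0.0)    # only link length matters
star_like, _ = grow(points, beta=10.0)  # path length to root dominates
print(mst_like)
print(star_like)
```

With beta = 0 the rule reduces to Prim's algorithm and yields the Minimum Spanning Tree; with a large beta every node attaches directly to the root, giving the star. Intermediate values trade total link length against path length.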

Spatial distribution networks f: desirable
future work
• Consideration of other network properties
    – Shipment capacity
    – Link capacities (and scaling/cost effects for key links)
    – Node capacities and roles (joints vs. transfer/routers)
    – Flexibility for growth (new nodes as well as new
       connections of existing nodes)
    – Robustness to node or link breakdowns
• Development of more broadly applicable models
    – More than one source/sink node
• Development of other rules/protocols for growth that achieve
  the key properties well
• Consideration of top-down vs. evolved systems


 Three Case studies from the network
 literature: III Heuristic internet design
• 1. The Structure and Efficiency of the Worldwide Airport network
• 2. Shape and efficiency in Spatial distribution networks
• 3. Heuristically designed internets
    – Fabrikant et al.
    – Li et al.
        • Toy internets designed for studying some principles and
          making comparisons
    – Fabrikant et al. attempt to balance the “last mile costs” and the
       communication distance in a growing system (the internet).
    – They use (and were the originators of) the already seen
       weighting factor from the preceding case

    – They focused on the ease of obtaining power laws but noted
      a transition between MST and star for this case as well.

 Heuristic internet design b
• Li et al. spend much of their time correcting the previous over-
  emphasis on power-laws as an indicator of structure. Their
  “First-Principles Approach” to the Internet router-level design
  problem outlines the approach.
• For their “First-Principles Approach”, Li et al. start simple and
  attempt “to identify some minimal functional requirements and
  physical constraints needed to develop simple models that are
  .. consistent with engineering reality”. They also focus on
  single ISPs as the fundamental building block.
• They argue that the best candidates for a minimal set of
  constraints on topology construction (architecture) for a single
  ISP are:
     – Router technology and
     – Network economics



 Heuristic internet design c-Router
 Technology Limits
• Li et al. point out that for a given router there is a limit on the
  number of packets that can be processed in any given time. This
  limits the number of connections and connection speeds and
  creates a “feasible region” and “efficient frontier” for given
  router designs
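
One simple way to picture the feasible region is sketched below. It assumes a router is limited by a total switching capacity, a maximum number of ports and a maximum bandwidth per port, so that the efficient frontier is the largest per-connection bandwidth available at each degree. The three limits and their numerical values are hypothetical and do not describe any real router.

```python
def max_bandwidth_per_link(degree, total_capacity, max_ports, max_port_bw):
    """Largest per-connection bandwidth a router can sustain at a given
    degree (number of connections), under the assumed limits."""
    if degree < 1 or degree > max_ports:
        return 0.0  # infeasible: the router cannot terminate this many links
    return min(max_port_bw, total_capacity / degree)

# Hypothetical router: 100 Gb/s total capacity, 64 ports, 10 Gb/s per port
frontier = {
    k: max_bandwidth_per_link(k, total_capacity=100.0,
                              max_ports=64, max_port_bw=10.0)
    for k in (1, 10, 20, 64, 65)
}
print(frontier)
```

Up to 10 connections the per-port limit binds; beyond that, each added connection lowers the bandwidth available per connection, which is the degree versus bandwidth tradeoff along the frontier.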




Heuristic internet design d- Router
Technology Constraints II
Considering multiple routers and other technologies, a feasible region results




Heuristic internet design e- Economic
Constraints
• Costs of installing and operating physical links can dominate the cost
  of the infrastructure so the availability of multiplexing and
  aggregating throughout the hierarchy is essential
• These technologies are deployed depending upon customer needs and
  willingness to pay




 Heuristic internet design f: Heuristically
 Optimal Networks
• Li et al. then define a heuristically optimal network:




• They also show that several real Internet network elements have
  these broad characteristics (Abilene and CENIC)
• Note the “hierarchical tree” in the quote above would actually be
  better described by the model in the preceding case (a modified MST
  arrived at by a “growth rule” followed by the ISP).


 Heuristic internet design g: Properties
 and designs evaluated
• Performance
   – Throughput
   – Router utilization (distance to frontier)
   – End user bandwidth distribution
• Self-similarity

• Li et al. create 4 different designs for comparison purposes.
  These all have the same number of nodes, the same degree
  distribution and the same routers, but are constructed according
  to:
    – HSF (hierarchical scale-free: the modular hierarchical design)
    – Random (also with highly connected central nodes)
    – Poor design (heuristic network with central nodes overloaded)
    – Heuristically optimal design (HOT) with 3 tier network
       hierarchy


Heuristic internet design h: designs




Heuristic internet design j: performance
as a function of the Scale-Free parameter




Heuristic internet design j: Router
utilization comparison




 Heuristic internet design k: Summary
• Li et al. introduce some additional engineering design
  constraints and then are able to use this insight to produce
  simple (toy) models that demonstrate very clearly that the
  mental image of a scale-free graph is totally inconsistent with
  real ISPs.
• They do not calculate correlation coefficients but it is clear that
  the designs that “..have consistency with real design
  considerations” have far better performance than the other
  examples. In addition, they have minimum values of the S
  metric (= 0) whereas the random and hierarchically modular
  designs have much higher values of S. Thus, it appears that
  meeting the design requirements drives one towards negative r.
  This occurs because the efficiency frontier of a router trades off
  the number of connections vs. bandwidth and most customers
  do not want to pay for high bandwidth (thus moderate k
  backbone routers/nodes connect to other moderate k routers and
  high k routers tend to connect to very low k customer(?) nodes).
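
The link between the S metric and the degree correlation r can be illustrated on two toy graphs that share the same degree sequence. The sketch assumes S refers to the unnormalized s(g) of Li et al.'s technical report, the sum over all links of the product of the degrees at the two ends, and takes r to be the Pearson correlation of those end degrees. Both graphs and all node names are hypothetical.

```python
from collections import Counter

def degrees(edges):
    """Degree k of every node in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def s_metric(edges):
    """Sum over links of (degree of one end) * (degree of the other end)."""
    deg = degrees(edges)
    return sum(deg[u] * deg[v] for u, v in edges)

def degree_correlation(edges):
    """Pearson correlation r of the degrees at the two ends of each link."""
    deg = degrees(edges)
    # Count each undirected link in both directions so r is symmetric
    pairs = [(deg[u], deg[v]) for u, v in edges]
    pairs += [(deg[v], deg[u]) for u, v in edges]
    n = len(pairs)
    mean = sum(x for x, _ in pairs) / n
    cov = sum((x - mean) * (y - mean) for x, y in pairs) / n
    var = sum((x - mean) ** 2 for x, _ in pairs) / n
    return cov / var

# Graph A: the two high k nodes (H1, H2) are linked to each other
hub_core = [("H1", "H2"), ("H1", "m1"), ("H1", "m2"), ("H2", "m3"),
            ("H2", "m4"), ("m1", "c1"), ("m2", "c2"), ("m3", "c3"),
            ("m4", "c4")]
# Graph B: same degrees, but the high k nodes serve low k customer nodes
hub_edge = [("H1", "c1"), ("H1", "c2"), ("H1", "m1"), ("H2", "c3"),
            ("H2", "c4"), ("H2", "m2"), ("m1", "m3"), ("m2", "m4"),
            ("m3", "m4")]

s_a, r_a = s_metric(hub_core), degree_correlation(hub_core)
s_b, r_b = s_metric(hub_edge), degree_correlation(hub_edge)
print(s_a, r_a)
print(s_b, r_b)
```

For a fixed degree sequence r rises and falls with s(g), so the graph whose high k nodes attach to low k nodes has both the lower S value and a negative r.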
  Heuristic internet design l: Possible
  further work
• Consider robustness of toy models to see if this influences
  topology choice
• Consider distribution and throughput in one model
• Investigate how other constraints such as new customer desires
  for bandwidth, new router technology, wireless technology,
  cable vs. DSL and other issues may affect internet topology
  (architecture) and desired flexibility
• Observe more on actual networks (internet and others) to
  further test modeling assumptions.
• Can one quantitatively predict degree correlation from models
  such as theirs for various infrastructures and other technological
  systems and learn something from its experimental variation?




 How far have we gotten towards the
 potential value to us?
• Some Questions
   – Are the network representations useful for complex
     engineering systems?
   – Are the models that have been developed directly
     transferable to technological networks?
   – Are the structural characteristics so far developed
     adequate for technological/engineered networks? Do we
     need new ones?
   – To what extent can the desirable properties from social,
     biological and other networks be transferred to designed
     networks?




 Network representations and utility
• Are graphs and networks the only approach to quantification?
   – No
• Are networks useful in fulfilling this need?
   – Yes
• Evidence
   – Most importantly, the research has developed a number of
      objective methods for quantitative observation
       • The best are probably community structure algorithms
         and motif/coarse-graining approaches, but all the
         metrics, particularly those developed in social science
         network research, have potential value for objectively
         observing complex systems quantitatively.
       • Care must be exercised in defining and measuring
         such systems (as always), and not all quantitative
         observations (for example, degree distribution) will
         prove to be very useful.
   – Models?
   Existing Models and utility
• What complex systems do we want to model?
   – Sociological
       • Informal or self-organized
       • Formal or organized by design
   – Technological (and biological)

• For what purposes?
    – Organizational “goodness” (designed sociological network-to
       design more effective organizations)
    – Robustness, Flexibility (and other properties?) in “Engineering
       Systems” or systems that simultaneously possess high levels of
       social and technical complexity
    – To understand how to choose (and cause to occur) better
       architectures for such systems
    – To be able to apply the strong existing engineering design approach,
       which combines strong quantitative, science-based models with
       heuristics and other creatively adapted human skills


 Network representations and utility
• Are graphs and networks the only approach to quantification?
   – No
• Are networks useful in fulfilling this need?
   – Yes
• Evidence
   – The development of a number of objective methods for
      quantitative observation
   – Models
       • Sociological network models are applicable when an
         aspect of a problem contains such networks; the
         homophily and group structure must be understood to
         intelligently discuss such networks
       • Organizational and technological/social system models
         exist but will not be improved without observational
         work. It is clear that such models will have to contain a
         sense of hierarchy and that they will have to deal with
         tradeoffs among processes and properties to be
         appropriate to such systems
 How far have we gotten towards the
 potential value to us?
• Some Questions
    – Are the network representations useful for complex
      engineering systems?
    – Are the models that have been developed directly
      transferable to technological networks?
    – Are the structural characteristics so far developed
      adequate for technological/engineered networks? Do we
      need new ones?
    – To what extent can the desirable properties from social,
      biological and other networks be transferred to designed
      networks?
• We are clearly not there yet and it is hard to say whether one or
  many breakthroughs are needed to get there.
• The participation of domain experts with network science
  experts in the research will hasten development.
References
• D. J. Watts, Six Degrees: The Science of a Connected Age, W.
  W. Norton and Co. (2003)
• M. E. J. Newman, “The structure and function of complex
  networks”, SIAM Review, Vol. 45, pp. 167-256 (2003)
• R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii,
  U. Alon, “Network Motifs: Simple building blocks of Complex
  Networks”, Science, Vol. 298, pp. 824-827 (2002)
• Pierre-Alain Martin, “A Framework for Quantifying
  Complexity and Understanding its Sources: Application to two
  Large-Scale Systems”, SM thesis, MIT (2004)
• D. H. Wolpert and W. Macready, “Self-dissimilarity: An
  empirically observable complexity Metric”, Unifying Themes in
  Complex Systems, New England Complex Systems
  Institute (2000). A second paper found on the NASA Moffett
  web site has similar information and is titled “Self-dissimilarity
  as a high dimension complexity measure” (2004?)


References II
• S. Itzkovitz, R. Levitt, N. Kashtan, R. Milo, M. Itzkovitz and U.
  Alon, “Coarse-Graining and Self-Dissimilarity of Complex
  Networks” (Oct. 2004)
• A. Fabrikant, E. Koutsoupias, and C. H. Papadimitriou,
  “Heuristically optimized trade-offs: A new paradigm for Power
  Laws in the Internet”, in Proceedings of the International
  Colloquium on Automata, Languages and Programming, Lecture
  Notes in Computer Science, Vol. 2380, pp. 110-112 (2002)
• L. Li, D. Alderson, W. Willinger and J. Doyle, “A First-
  Principles Approach to Understanding the Internet’s Router-
  Level Topology”, SIGCOMM ’04 (2004)
• L. Li, D. Alderson, R. Tanaka, J. C. Doyle and W. Willinger,
  “Towards a Theory of Scale-free Graphs: Definitions,
  Properties, and Implications”, Technical Report CIT-CDS-04-
  006, Engineering and Applied Sciences Division, California
  Institute of Technology, Pasadena, CA (Dec. 2004)

