					              USC 3001: Complexity
               AY 05/06 Semester 2
                  Project Report




     Around the world in eighty flights:
 Modeling the world-wide airport network




By: Tan Kai Xin, Grace and Tan Li Yuan


Instructor: Dr Rajesh Parwani


                    Contents
1. Introduction
2. Concept of low degree and high betweenness-centrality
3. Degree-betweenness anomalies and multi-community
  networks
4. Model A: Based on preferential attachment and
  geographical distance constraints
5. Validating Model A
6. Model B: Inclusion of geo-political constraints
7. Validating Model B
8. Critical Analysis
9. Conclusion




1. Introduction
The world-wide airport network does more than provide convenience to travelers. Like other
critical infrastructures, the air transportation network has an enormous impact on local,
national, and international economies. It also plays an indirect role in the propagation of
infectious diseases such as influenza and SARS.


The world-wide air transportation network moves millions of people every
day. Often, particular states within a continent are designated to handle a high volume of
daily flights, sometimes more than they can handle. This results in delays and flight
cancellations across the country in the event of severe weather conditions, leading to large
economic losses. These failures and inefficiencies prompt various questions: What has led
the system to this point? Why has a more efficient system not been developed? In order to
address these questions, it is important to characterize the structure and the evolutionary
mechanisms of the world-wide airport transportation network.


In principle, the structure of the air transportation network is mainly determined by the
airline companies, which aim to maximize their immediate profit. However, it is also the
result of geographical and political factors.


It was observed that the world-wide air transportation network is a small-world network
for which (i) the number of direct connections k to a given city (degree), and (ii) the
number of shortest paths b going through a given city (betweenness centrality) have
distributions that are scale-free.


However, in contrast to the prediction of scale-free network models, it was observed that
the most connected cities (largest degree) are not necessarily the most central cities
(largest betweenness centrality), both at the world-wide level and in regional airport
networks. This was an important finding, as it has been shown that nodes with high
betweenness tend to play a more important role than those with high degree in keeping
networks connected, and may therefore play a key role in the propagation of
diseases.


In "Modeling the world-wide airport network", Guimera and Amaral aimed to address the
issue of identifying the mechanism by which central nodes that are not hubs can emerge.


They showed that current models, which consider only preferential attachment and
geographical distance constraints, cannot reproduce the observed combination of low
degree and high betweenness centrality in airport networks, although such models can
reproduce the tendency of airports to be connected with other airports that are
geographically close. They then took a step further and accounted for the large-betweenness,
small-degree occurrence by introducing a new mechanism that encompasses geo-political
constraints.


In this paper, we shall introduce the concept of low degree and high betweenness, in
relation to the airport network.




2. Concept of low degree and high betweenness-centrality
The degree of a node, also known as its connectivity, provides information on its importance;
however, it is certainly not the only factor that determines the significance of a node in the
network.




 Fig 1: Node v has low degree but all the shortest paths from region C1 to C2 have to go through v,
 implying very large centrality.



Indeed, the node v in Fig. 1 above has a small connectivity (linking only 2 neighbors);
however, the effect of its removal is determined not by its connectivity but by the
fact that it links together different parts of the network. A good measure of the centrality of
a node thus has to incorporate more global information, such as the role the node plays in the
existence of paths between any two given nodes in the network. One is thus naturally led
to the definition of the betweenness centrality (BC), which counts the fraction of shortest
paths going through a given node. More precisely, the BC of a node v is given by:

    $$ g(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}} \qquad (1) $$

where σst is the total number of shortest paths from node s to node t and σst (v) is the
number of shortest paths from s to t going through v.


A pair-dependency relationship, describing the relationship between node v and the
shortest paths from node s to node t, is defined as:

    $$ \delta_{st}(v) = \frac{\sigma_{st}(v)}{\sigma_{st}} $$


Hence, any two nodes that reside within the same region (e.g. s and t both in region
C1) will have δst(v) = 0, because the shortest path between them does not include node v
and σst (v) is therefore 0. It is thus sufficient to consider only pairs in which nodes s
and t reside in two separate regions.


Consider the nodes v1, v and v2 in Fig. 1 above.
Number of shortest paths from node v1 to node v2: σst = 1 (v1 → v → v2)
Number of shortest paths from node v1 to node v2 through v: σst (v) = 1 (v1 → v → v2)
Hence δst(v) = 1. A pair dependency of 1 implies that all the shortest
paths linking the two nodes pass through v, which shows the importance of node v
in linking them.




              Fig 1a: A modification to Figure 1 (shows the shortest paths from node v1 to v2)



To further illustrate the importance of a node via the pair dependency relationship,
consider now Fig 1a shown above.
Number of shortest paths from node v1 to node v2: σst = 2
(v1 → v → v2 or v1 → v' → v2)

Number of shortest paths from node v1 to node v2 through v: σst (v) = 1 (v1 → v → v2)

Thus δst(v) now takes on the value of ½. A smaller value of δst(v) implies that node v
now takes on a less important role in linking the two nodes. Fig 1a shows that even
without node v, an alternative shortest path exists between v1 and v2.
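
The two worked examples above can be checked numerically. The following is a minimal
sketch and not part of the papers reviewed: the adjacency dictionaries are hypothetical
stand-ins for Fig 1 and Fig 1a (only the nodes v1, v, v' and v2 are modeled), and the
function names are our own.

```python
from collections import deque
from fractions import Fraction


def shortest_path_counts(adj, source):
    """Breadth-first search returning hop distances and shortest-path counts."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:           # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:  # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma


def pair_dependency(adj, s, t, v):
    """delta_st(v) = sigma_st(v) / sigma_st for an undirected, connected graph."""
    dist_s, sigma_s = shortest_path_counts(adj, s)
    dist_v, sigma_v = shortest_path_counts(adj, v)
    if dist_s[v] + dist_v[t] != dist_s[t]:
        return Fraction(0)              # v lies on no shortest s-t path
    return Fraction(sigma_s[v] * sigma_v[t], sigma_s[t])


# Hypothetical stand-in for Fig 1: a single route v1 - v - v2.
fig1 = {"v1": ["v"], "v": ["v1", "v2"], "v2": ["v"]}
# Hypothetical stand-in for Fig 1a: a second route through v'.
fig1a = {"v1": ["v", "v'"], "v": ["v1", "v2"], "v'": ["v1", "v2"], "v2": ["v", "v'"]}

delta_fig1 = pair_dependency(fig1, "v1", "v2", "v")
delta_fig1a = pair_dependency(fig1a, "v1", "v2", "v")
print(delta_fig1, delta_fig1a)
```

Exact fractions are used so that the value ½ appears as such rather than as a rounded
floating-point number.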




BC can also be rewritten as:

    $$ g(v) = \sum_{s \neq v \neq t} \delta_{st}(v) = \sum_{i \neq j} \; \sum_{s \in C_i,\, t \in C_j} \delta_{st}(v) $$

where i and j label different regions and nodes s and t reside in regions Ci and Cj respectively.


For Fig 1, δst(v) = 1 for all s ∈ C1, t ∈ C2 since all shortest paths have to go through node v, as
in the case of v1 and v2. Therefore, the BC of the node v (in Fig 1) is given by

    $$ g(v) = 2 \sum_{s \in C_1,\, t \in C_2} \frac{\sigma_{st}(v)}{\sigma_{st}} = 2 \sum_{s \in C_1,\, t \in C_2} \delta_{st}(v) = 2 \sum_{s \in C_1,\, t \in C_2} 1 = 2 N_1 N_2 $$


where N1 and N2 are the numbers of nodes in regions C1 and C2. The factor of 2 arises because
each pair is counted in both directions (from s to t and from t to s). This result shows that
although v has small connectivity, its BC defined by (1) is large. 2N1N2 is the largest
contribution that pairs from the two regions can make, reached when all the shortest paths
between nodes from separate regions pass through that particular node. Thus BC can be large
regardless of the degree of a particular node. In Fig 1, nodes v1, v and v2, despite having
different connectivity, have comparably large BC.
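
The 2N1N2 result can be verified on a small example. The sketch below is our own
illustration under stated assumptions: each region is modeled as a fully connected
cluster, the bridge node is called "v", and the sizes N1 = 5 and N2 = 3 are arbitrary.
It evaluates definition (1) directly by summing over ordered pairs (s, t).

```python
from collections import deque
from itertools import combinations


def shortest_path_counts(adj, source):
    """Breadth-first search returning hop distances and shortest-path counts."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma


def betweenness(adj, v):
    """g(v): sum of sigma_st(v) / sigma_st over ordered pairs with s != v != t, s != t.

    Assumes an undirected, connected graph.
    """
    counts = {u: shortest_path_counts(adj, u) for u in adj}
    dist_v, sigma_v = counts[v]
    total = 0.0
    for s in adj:
        if s == v:
            continue
        dist_s, sigma_s = counts[s]
        for t in adj:
            if t in (s, v):
                continue
            if dist_s[v] + dist_v[t] == dist_s[t]:   # v is on a shortest s-t path
                total += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return total


def two_regions(n1, n2):
    """Two fully connected regions joined only through the bridge node 'v'."""
    c1 = [f"a{i}" for i in range(n1)]
    c2 = [f"b{i}" for i in range(n2)]
    adj = {u: set() for u in c1 + c2 + ["v"]}
    for region in (c1, c2):
        for x, y in combinations(region, 2):
            adj[x].add(y)
            adj[y].add(x)
    for gateway in (c1[0], c2[0]):                   # v has degree 2
        adj["v"].add(gateway)
        adj[gateway].add("v")
    return adj


n1, n2 = 5, 3
network = two_regions(n1, n2)
g_v = betweenness(network, "v")
print("degree of v:", len(network["v"]), "BC of v:", g_v, "2*N1*N2:", 2 * n1 * n2)
```

This brute-force evaluation is cubic in the number of nodes, which is fine for a toy
network; faster algorithms exist for networks of realistic size.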


High values of the centrality indicate that a node can reach the others on short paths or that
it lies on many short paths. If one removes a node with large centrality, the paths
between many pairs of nodes will be lengthened. The extreme case is when the node is a
cut-vertex (e.g. node v in Fig 1); its removal breaks the network into two disconnected
components.


This highlights that central nodes with low degree may play a significant role in the
evolution of the structure, as well as the efficiency, of the network. Thus, it is not sufficient
to focus our attention on hubs. Instead, it is crucial to examine the underlying causes
which result in central airports that are not hubs, in order to design a better system for
the world-wide airport network. We shall illustrate how such a phenomenon may occur in
the real world network in the following section.




3. Degree-betweenness anomalies and multi-community networks
In most complex networks, nodes with small degree and large betweenness centrality
are rare. In particular, the degree and betweenness
centrality of a node are highly correlated in random networks. In other words, a node
with small degree has a high probability of having a small betweenness centrality, and
vice versa. Hence, the presence of central airports which are not hubs can be considered
an anomaly. More important, however, is the mechanism that drives the formation of
scale-free networks with this anomalous distribution of betweenness centralities. In what
follows, a region (or community) is defined to be a cluster of densely connected nodes.


To illustrate this, we shall consider Alaska, which is a sparsely populated, isolated
community. Despite its low population density, it has a disproportionately large number of
airports. However, only a few Alaskan airports, including Anchorage
and Fairbanks, are connected to the continental US, while most Alaskan airports only have
connections to other airports within Alaska. Furthermore, although Alaska is nearer to Canada
than to the continental US, there are no connections from Alaskan airports to
airports in Canada's Northern Territories (see Fig 3). The main reason for this observation
is that the Alaskan population has to be connected to its political centers, which are
situated in the continental US. Thus, a geopolitical constraint is the main factor behind
this observation.




                       Figure 3: Geographical location of Alaska and Canada




In a simplified model for the airport network the following scenarios could result:




 Figure 4a: A simplified model displaying low degree and large centrality of node v (representing
 Anchorage). Node s is the hub of the Alaskan community.

For the first scenario in Fig 4a, Anchorage has low degree and high centrality, since
all the flights from within Alaska to the continental US must pass through
Anchorage. In this case, all the Alaskan airports are connected to node s (the hub) before
connecting to Anchorage. This is possibly due to the effects of preferential attachment as
well as geographical constraints: the other Alaskan airports could be located much
further away from Anchorage than from node s, and would hence preferably
connect to an airport that is geographically close by (represented by node s) before
connecting to Anchorage.


However, Anchorage could also serve as the hub within the community (as depicted in Fig
4b) if the other Alaskan airports are located geographically close to it. In that case,
Anchorage may have high degree within Alaska, but it is certainly not a hub with respect to
all the nodes in the network. Nonetheless, it still serves as the main link to the continental
US. Thus, cities like Anchorage provide the major connection to the outside world for the
other cities in their communities, which explains their large betweenness centrality. Indeed, the
existence of nodes with anomalous centrality is related to the existence of regions with a
high density of airports but few connections to the outside. The degree-betweenness
anomaly is therefore ultimately related to the existence of "communities" in the network.


Having given a generalized description of the real-world airport network, we will describe
how the models were derived by the authors, in an attempt to prove the hypothesis that
geopolitical constraints play a part in affecting the structure of the airport network.




4. Model A: Based on preferential attachment and geographical distance constraints
Aim: Construct a simple model which takes into account preferential attachment and
geographical distance constraints.

 - At each time step, one of the following events takes place:
      (i)   A new link between two existing nodes is established with probability p
      (ii)  A new node is added and connected to m existing nodes with probability (1-p)

 - When:
      Event (i) occurs, a new link is created between existing nodes i and j with
      probability:

            $$ \Pi_{ij} \propto \frac{k_i k_j}{F(d_{ij})} $$

      Event (ii) occurs, a link is created between the new node i and an existing node
      j, with j selected with probability:

            $$ \Pi_{ij} \propto \frac{k_j}{F(d_{ij})} $$

      Here k_i is the degree of node i and d_ij is the distance between nodes i and j.

 - Investigate two different forms for the function F(d):
      (a) F(d) = d^r        (b) F(d) = exp(d / d_x), where d_x is the characteristic distance

 - Preferential attachment leads to a power-law degree distribution.
 - F(d) leads to the truncation of the power-law decay, and when F(d) increases very
   rapidly, the power-law decay regime may disappear completely.


Notes:
     1. Nodes are created in locations which correspond to actual airport locations
     2. Size of the model network is the same as the size of the real network
     3. Locations of new nodes are chosen in random order




5. Validating Model A
Case (a): F(d) = d^r



           Figure 5: Degree distribution for scaled degree and betweenness for F(d) = d^r




Fix p=0.65 so that the exponent of the degree distribution agrees with the observed
data. Also fix m=1 so that the average degree is as close as possible to the average degree
of the world-wide airport network.


The results for r=1, r=2 and r=3 are presented in Figure 5. From Figure 5, it can be
observed that the model is able to reproduce the observed degree distributions and the
observed betweenness distributions.


Next, plot "Betweenness against Degree" (b(k)) to check if the simulation produces data
which correspond to the real world data, in which there exist airports which have low
degree but high betweenness. If the simulation results contain
some nodes which have low degree but high betweenness, then the model can be used
to represent the world-wide airport network.



Figure 6a: Betweenness of the nodes as a function of their degree for a model world-wide airport network




The points in Figure 6a correspond to the simulations of the model, while the shaded
regions represent the 95% confidence intervals for random networks which have the same
degree distributions as the model networks.


Note that the confidence intervals for random networks are used here because there exists a
high correlation between betweenness and degree in random networks.
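
The report does not state how the random reference networks with the same degree
distribution were generated. One common way to obtain such networks is repeated
double-edge swapping, sketched below as an assumption on our part; the function name
and the small example edge list are hypothetical.

```python
import random
from collections import Counter


def degree_preserving_shuffle(edges, n_swaps, seed=0):
    """Randomise an undirected simple graph by double-edge swaps.

    Each accepted swap replaces links (a, b) and (c, d) with (a, d) and (c, b),
    so every node keeps its degree.
    """
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edge_list}
    done = 0
    attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        x, y = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[x]
        c, d = edge_list[y]
        if rng.random() < 0.5:
            c, d = d, c                      # try both orientations of the second link
        if len({a, b, c, d}) < 4:
            continue                         # swap would create a self-loop
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue                         # swap would create a duplicate link
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edge_list[x], edge_list[y] = (a, d), (c, b)
        done += 1
    return edge_list


def degree_table(edges):
    """Count how many links touch each node."""
    count = Counter()
    for a, b in edges:
        count[a] += 1
        count[b] += 1
    return count


# Hypothetical toy network.
original = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4), (4, 5), (5, 6), (6, 7)]
shuffled = degree_preserving_shuffle(original, n_swaps=20)
print(degree_table(original) == degree_table(shuffled))
```

Computing b(k) on many such shuffled networks would give a reference band against which
the betweenness of a node with a given degree can be compared.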




Since most of the simulation data fall within or close to the shaded regions, there
exists a high correlation between betweenness and degree in the model
network, especially at small degrees.


However, in the world-wide airport network, the correlation between degree and
betweenness is observed to be low, especially at small
degrees. Hence the model is unable to account for the presence of central airports
which are not hubs.



Figure 6b: Betweenness of the nodes as a function of their degree for a model North American airport network



The model gives rise to similar results for the North American airport network
(Figure 6b), in which there exists a high correlation between betweenness and degree
in the model network, especially at small degrees.


Although the model is unable to account for the presence of central airports which are not
hubs, it can be observed that its behavior is fairly consistent, regardless of the size of the
network.


Case (b): F(d) = exp(d / d_x)



          Figure 7: Degree distribution for scaled degree and betweenness for F(d) = exp(d / d_x)




As for case (a), fix p=0.65 so that the exponent of the degree distribution agrees with the
observed data. Also fix m=1 so that the average degree is as close as possible to the
average degree of the world-wide airport network.


The results for dx = 1.0 RT and dx = 0.2 RT are presented in Figure 7, where RT is the radius of
the Earth.


It can be observed from Figure 7 that the model is able to reproduce the observed degree
distributions and the observed betweenness distributions. Besides that, it was observed that
F(d) only affects the structure of the network when considering regions much larger than
dx. Hence, one may foresee some problems when applying the model to a regional airport
network.
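
A short calculation illustrates why the exponential form matters only on scales larger
than dx. The sketch below is our own illustration: the pairs of link lengths (in units
of RT) are hypothetical, and the ratio F(near) / F(far) measures how much less likely a
far link is than a near one when the degrees are equal.

```python
import math


def power_law(d, r):
    """F(d) = d**r, with d in units of the Earth's radius R_T."""
    return d ** r


def exponential(d, d_x):
    """F(d) = exp(d / d_x), with d and d_x in units of R_T."""
    return math.exp(d / d_x)


def far_to_near_ratio(F, near, far):
    """Relative attachment weight of a far link versus a near one: F(near) / F(far)."""
    return F(near) / F(far)


d_x = 0.2
# Hypothetical (near, far) link lengths: both pairs differ by a factor of five.
scales = {"regional": (0.01, 0.05), "global": (0.2, 1.0)}
ratios = {}
for name, (near, far) in scales.items():
    ratios[name] = {
        "power law, r=1": far_to_near_ratio(lambda d: power_law(d, 1), near, far),
        "exponential": far_to_near_ratio(lambda d: exponential(d, d_x), near, far),
    }
    print(name, ratios[name])
```

With these numbers the power law penalizes the far link by the same factor at both
scales, whereas the exponential form is almost flat at the regional scale and very
steep at the global scale.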


Similar to case (a), plot "Betweenness against Degree" (b(k)) to check if the simulation
produces data which correspond to the real world data.



  Figure 8a: Betweenness of the nodes as a function of their degree for a model world-wide airport network



Similarly, the points in Figure 8a correspond to the simulations of the model, while the
shaded regions represent the 95% confidence intervals for random networks which have
the same degree distributions as the model networks.


There are some fluctuations of b(k) in the model world-wide airport network. The model
produces the presence of nodes with relatively small degree and high betweenness, which
is in agreement with the real world case.


However, the model North American airport network does not produce results that are
consistent with those of the model world-wide airport network.



Figure 8b: Betweenness of the nodes as a function of their degree for a model North American airport network



As observed in Figure 8b, unlike in the model world-wide airport network, there exists a
high correlation between betweenness and degree in the model North American network, and it
does not produce nodes with small degree and high betweenness.


Hence, it can be deduced that the introduction of a characteristic length (dx) poses some
problems. Since the results are not consistent between the world-wide network and the North
American network, this supports the earlier observation that F(d) only affects the structure
of the network when considering regions much larger than dx.


Thus, case (a) F(d) = d^r provides a better model than case (b) F(d) = exp(d / d_x).



6. Model B: Inclusion of geo-political constraints
From the analysis of Model A, which takes into account the influence of preferential
attachment and distance constraints, it can be observed that these two factors appear to
explain the degree and betweenness distributions, but fail to account for the presence of
nodes with small degree but high betweenness. Hence, there must be other factors which
affect the formation and evolution of the airport network.


Suppose that there is an additional constraint which is a consequence of geo-political
considerations. In other words, only a few airports in each country are connected to
airports in other countries, while the other airports are only allowed to connect to airports
within the same country, regardless of the geographical distance between the airports. This
was illustrated earlier in the paper by the case of Alaskan airports being connected to
the continental US for political reasons, instead of being connected to nearer airports in
Canada's Northern Territories.


In order to account for the effect of geo-political constraints, the following model
modifies Model A such that most nodes are only allowed to establish connections
with other nodes within the same country and only a few are allowed to establish
international connections.


 - The first 10% of nodes are added exactly as in Model A, with F(d) = d^r and r=1 (which
   gives the best fit for the degree and betweenness distributions of the real world data).


 - Next, the remaining nodes are added according to the rules in Model A, but with the
   additional rule that connections may only be formed within the same country.
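
The additional rule can be sketched as a filter on the candidate partners of a node.
This is our own illustration rather than the authors' code: the function `pick_partner`,
the node labels, the degrees and the distances are all hypothetical, and we assume that
once the first 10% of nodes have been added every new link must be domestic. The report
does not say what happens when no domestic partner exists, so the sketch returns None
in that case.

```python
import random


def pick_partner(i, existing, degree, country, dist, restricted, r=1.0, rng=random):
    """Choose a partner j for node i with weight k_j / d_ij**r.

    When `restricted` is True (assumed to hold after the first 10% of nodes have
    been added), only nodes in the same country as i are eligible.
    """
    candidates = [j for j in existing
                  if j != i and (not restricted or country[j] == country[i])]
    if not candidates:
        return None                          # unspecified case, left to the caller
    weights = [degree[j] / max(dist(i, j), 1e-9) ** r for j in candidates]
    return rng.choices(candidates, weights=weights)[0]


# Hypothetical data: the new node is close to a foreign airport and far from a
# domestic one; distances are in units of the Earth's radius.
country = {"new": "US", "us_far": "US", "ca_near": "CA"}
degree = {"us_far": 4, "ca_near": 4}
dist_table = {("new", "us_far"): 0.40, ("new", "ca_near"): 0.05}


def dist(i, j):
    return dist_table[(i, j)]


rng = random.Random(0)
free = [pick_partner("new", list(degree), degree, country, dist, False, rng=rng)
        for _ in range(1000)]
domestic = [pick_partner("new", list(degree), degree, country, dist, True, rng=rng)
            for _ in range(1000)]
print("unrestricted:", free.count("ca_near"), "links to the near foreign airport")
print("restricted:  ", domestic.count("us_far"), "links to the far domestic airport")
```

Without the restriction the nearby foreign airport is strongly favoured; with it, the new
node can only attach to the distant domestic airport, which mirrors the Alaskan example.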




7. Validating Model B



  Figure 9: Betweenness of the nodes as a function of their degree for cases with geo-political constraints




Model B generates central nodes with small degree (as shown in fig 9), which is consistent
with the observations in the real airport network, for both the case of the world-wide
airport network and North American airport network.


Hence, it can be concluded that the new model is able to explain the existence of large-
betweenness and small-degree nodes at both the global and regional level.




8. Critical Analysis
The aim of the models was to determine the main factor which gives rise to the existence
of airports which have low degree but high betweenness centrality.


It was observed that Model A (based on preferential attachment and geographical
constraints) was unable to produce simulation results which exhibit
nodes with small degree and high betweenness centrality. Hence the authors modified it to
introduce geopolitical constraints in Model B. The simulation results of Model B were
able to account for the low degree and high betweenness-centrality phenomenon. However,
several assumptions were made in the model.


The authors fixed p=0.65 and m=1 such that the cumulative distributions of the
betweenness and degree agree with the real world data, as shown in Fig. 5 and Fig. 7. This
was a reasonable choice since the main aim of the models was to identify the factors that
result in nodes with low degree and high betweenness centrality. Hence, it was necessary
to ensure that the cumulative distributions matched the real world data before
proceeding with the analysis of the correlation between degree and centrality.


In Model B, the first 10% of nodes were allowed to establish international connections.
However, we feel that this value should be varied so that we may better understand
the significance of geopolitical constraints. For example, if varying this
percentage yields a more accurate description of the correlation between the degree and
centrality of the nodes in the network, we can approximate the extent of the effect of
geopolitical constraints on the evolution of the world-wide airport network.


Besides that, F(d) was taken to follow a power law (F(d) = d^r) in Model B. We feel that
this assumption is reasonable, since the validation of Model A already showed that
F(d) = d^r provides a better approximation than F(d) = exp(d / d_x).


Based on these reasonable assumptions, the model produced simulation data which fit
the real world data well and accounted for the effects of geo-political constraints.




9. Conclusion
In this project, we have shown that the authors have created a good model that accounts
for the constraints and pressures governing the evolution of the airport transportation
network. This explains why central nodes that are not hubs can emerge. However, one
other factor they could possibly consider is population density; within a region that is
sparsely populated, it is natural that there will be fewer links extending out of that region.
Hence, this could also influence the structuring of the airport transportation network.


One benefit of understanding the airport network structure is the ability to
identify central nodes, which play an important role in the propagation of infectious
diseases. The further spread of a disease could be curbed by temporarily breaking the links
through these nodes so that the disease remains contained within the region. Hence knowledge
of the topology of the airport transportation network allows quicker identification of such
nodes, which in turn makes controlling the transmission of diseases across regions more effective.




References
Barthelemy, Marc. "Betweenness centrality in large complex networks." arXiv:cond-mat/0309436
v2 (13 May 2004).


Guimera, R. and Amaral, L.A.N. "Modeling the world-wide airport network." The
European Physical Journal B 38 (2004): 381-385.


Guimera, R. et al. "The world-wide air transportation network: Anomalous centrality,
community structure, and cities' global roles." arXiv:cond-mat/0312535 v2 (12 Jul 2005).



