
Diffusion Over Dynamic Networks
(plus some social network intro since I’m first)

NetSci Workshop, May 16, 2006
James Moody

This work supported by the Network Modeling Project through the University of Washington: NIH grants DA12831 and HD41877.

Introduction

We live in a connected world:

“To speak of social life is to speak of the association between people – their associating in work and in play, in love and in war, to trade or to worship, to help or to hinder. It is in the social relations men establish that their interests find expression and their desires become realized.”
Peter M. Blau, Exchange and Power in Social Life, 1964

"If we ever get to the point of charting a whole city or a whole nation, we would have … a picture of a vast solar system of intangible structures, powerfully influencing conduct, as gravitation does in space. Such an invisible structure underlies society and has its influence in determining the conduct of society as a whole."
J.L. Moreno, New York Times, April 13, 1933

These patterns of connection form a social space that can be seen in multiple contexts:

[Figure. Source: Linton Freeman, “See you in the funny pages,” Connections, 23, 2000, 32-42.]

[Figure: High Schools as Networks]

And yet, standard social science analysis methods do not take this space into account.

“For the last thirty years, empirical social research has been dominated by the sample survey. But as usually practiced, …, the survey is a sociological meat grinder, tearing the individual from his social context and guaranteeing that nobody in the study interacts with anyone else in it.”
Allen Barton, 1968 (quoted in Freeman 2004)

Moreover, the complexity of the relational world makes it impossible to identify social connectivity using only our intuitive understanding. Social Network Analysis (SNA) provides a set of tools to empirically extend our theoretical intuition of the patterns that construct social structure.
Why do Networks Matter?

Local vision

Why networks matter:
• Intuitive: “goods” travel through contacts between actors, which can reflect a power distribution or influence attitudes and behaviors. Our understanding of social life improves if we account for this social space.
• Less intuitive: patterns of inter-actor contact can have effects on the spread of “goods” or power dynamics that could not be seen by focusing only on individual behavior.

Social network analysis is:
• a set of relational methods for systematically understanding and identifying connections among actors.

SNA
• is motivated by a structural intuition based on ties linking social actors
• is grounded in systematic empirical data
• draws heavily on graphic imagery
• relies on the use of mathematical and/or computational models.
(Freeman, 2004)
• Social Network Analysis embodies a range of theories relating types of observable social spaces.

1. Introduction
2. Social Network Basics
   a. Basic data elements
   b. Basic data structures
   c. Network Analysis Buffet
3. Networks & Diffusion
   a. Structural constraints on network diffusion
      a. Reachability
      b. Distance
      c. Connectivity
      d. Closeness centrality
   b. Temporal constraints on network diffusion
      a. Defining dynamic networks
      b. How order constrains flow
      c. Reachability variance w. constant structure
      d. Minimum temporal reachability
   c. New time-dependent network measures
      a. Graph-level measures
      b. Node-level measures
   d. Visualizing diffusion potential in time-dependent graphs

Social Network Data Elements

Social network data consists of two linked classes of data:

a) Information on the individuals (aka: actors, nodes, points)
• Network nodes are most often people, but can be any other unit capable of being linked to another (schools, countries, organizations, personalities, etc.)
• The information about nodes is what we usually collect in standard social science research: demographics, attitudes, behaviors, etc.
• Includes the times when the node is active

b) Information on relations among individuals (lines, edges, arcs)
• Records a connection between the nodes in the network
• Can be valued, directed (arcs), binary or undirected (edges)
• One-mode (direct ties between actors) or two-mode (actors share membership in an organization)
• Includes the times when the relation is active

The units of interest in a network are the combined sets of actors and their relations. We represent actors with points and relations with lines.

Actors are referred to variously as: nodes, vertices or points
Relations are referred to variously as: edges, arcs, lines, ties

Example: [diagram of a five-node network on a, b, c, d, e]

In general, a relation can be:
• Binary or valued
• Directed or undirected

[Figure: the same five-node network drawn four ways: undirected, binary; directed, binary; undirected, valued; directed, valued]

Social network data are substantively divided by the number of modes in the data.

1-mode data represents edges based on direct contact between actors in the network. All the nodes are of the same type (people, organizations, ideas, etc.). Examples: communication, friendship, giving orders, sending email.

1-mode data are usually singly reported (each person reports on their friends), but you can use multiple-informant data, which is more common in child development research (Cairns and Cairns).

2-mode data represents nodes from two separate classes, where all ties are across classes.
Examples:
• People as members of groups
• People as authors on papers
• Words used often by people
• Events in the life history of people

The two modes of the data represent a duality: you can project the data as people connected to people through joint membership in a group, or groups connected to each other through common membership.

There may be multiple relations of multiple types connecting nodes in any given substantive setting.

Levels of analysis
[Figure: Global-Net, Ego-Net, Partial-Network]

We can examine networks across multiple levels:

1) Ego-network
- Have data on a respondent (ego) and the people they are connected to (alters). Example: 1985 GSS module
- May include estimates of connections among alters

2) Partial network
- Ego networks plus some amount of tracing to reach contacts of contacts
- Something less than a full account of connections among all pairs of actors in the relevant population
- Example: CDC contact tracing data for STDs

3) Complete or “Global” data
- Data on all actors within a particular (relevant) boundary
- Never exactly complete (due to missing data), but boundaries are set
- Example: coauthorship data among all writers in the social sciences, friendships among all students in a classroom

For the most part, I will be discussing issues surrounding global networks.

Social Network Data Structures

Visualization

A good network drawing allows viewers to come away from the image with an almost immediate intuition about the underlying structure of the network being displayed. However, because there are multiple ways to display the same information, and standards for doing so are few, the information content of a network display can be quite variable.

Each of these images represents the exact same graph information.
Network visualization helps build intuition, but you have to keep the drawing algorithm in mind. Again, the same graph with two different techniques:

• Tree-based layouts (good): most effective for very sparse, regular graphs. Very useful when relations are strongly directed, such as organization charts.
• Spring embedder layouts (fair to poor): most effective with graphs that have a strong community structure (clustering, etc.). Provides a very clear correspondence between social distance and plotted distance.

[Figure: two images of the same network]

Another example:
• Tree-based layouts (poor)
• Spring embedder layouts (good)

[Figure: two layouts of the same network]

Network visualization helps build intuition, but you have to keep the drawing algorithm in mind.

Hierarchy & tree models
Use optimization routines to add meaning to the vertical dimension of the plot. This makes it possible to easily see who is most central by who is on the top of the figure. These also include some routine for minimizing line-crossing.

Spring embedder layouts
Work on an analogy to a physical system: ties connecting a pair have ‘springs’ that pull them together. Unconnected nodes have springs that push them apart. The resulting image reflects the balance of these two forces. This usually creates a layout with a close correspondence between physical closeness and network distance.

In the next slides we give examples of successful graph layouts.

A spring embedder layout of romantic relations in a single high school. This image “works” because the sparse nature of the graph allows you to easily trace all of the connections without any line crossings.

[Figure legend: Male, Female]

Using colors to code attributes makes it simpler to compare attributes and relations.
This plot compares the effectiveness of two different clustering routines on a school friendship network. Because the spring-embedder model pulls communities close, we would expect cohesive groups to be in the same region of the graph. This is what we see in the RNM solution at the bottom.

As networks increase in size, the effectiveness of point-and-line display routines diminishes, because you simply run out of plotting space. You can still get some insight by using the ‘overlap’ that results from a space-based layout as information. Here we plot a very large and dense network (the standard point-and-line image is in the upper right).

Adding time to social networks is also complicated, as you run out of space to put time in most network figures. One solution is to animate the network. Here we see streaming interaction in a classroom, where the teacher (yellow square) has trouble maintaining order. The SONIA software program (McFarland and Bender-deMoll) will produce these figures.
Data Representations

Pictures only take us so far: from pictures to adjacency matrices.

Undirected, binary (the five-node example network):

     a  b  c  d  e
a    0  1  0  0  0
b    1  0  1  0  0
c    0  1  0  1  1
d    0  0  1  0  1
e    0  0  1  1  0

[Adjacency matrix for the directed, binary version of the same network]

From matrices to lists (undirected example):

Adjacency list     Arc list
a: b               ab
b: a c             ba, bc
c: b d e           cb, cd, ce
d: c e             dc, de
e: c d             ec, ed

Social Networks & Diffusion

“Goods” flow through networks:

In addition to the dyadic probability that one actor passes something to another (pij), two factors affect flow through a network:

Topology
- the shape, or form, of the network
- Example: one actor cannot pass information to another unless they are either directly or indirectly connected

Time
- the timing of contact matters
- Example: an actor cannot pass information he has not received yet

Three features of the network's topology are known to be important: reachability, distance and number of paths (redundancy).

Connectivity refers to how actors in one part of the network are connected to actors in another part of the network.
• Reachability: Is it possible for actor i to reach actor j? This can only be true if there is a chain of contact from one actor to another.
• Distance: Given they can be reached, how many steps are they from each other?
• How efficiently do ties reach new nodes? (How clustered is the network?)
• Number of paths: How many different paths connect each pair?

Without full network data, you can't distinguish actors with limited diffusion potential from those more deeply embedded in a setting.

[Figure: three actors a, b and c]

Reachability

Given that ego can reach alter, distance determines the likelihood of information passing from one end of the chain to another.
• Because flow is rarely certain, the probability of transfer decreases over distance.
• However, the probability of transfer increases with each alternative path connecting pairs of people in the network.

Reachability

Indirect connections are what make networks systems. One actor can reach another if there is a path in the graph connecting them.

[Figure: two example networks, one on nodes a to e and one on nodes a to f]

Paths can be directed, leading to a distinction between “strong” and “weak” components.

Basic elements in connectivity
• A path is a sequence of nodes and edges starting with one node and ending with another, tracing the indirect connection between the two. On a path, you never go backwards or revisit the same node twice. Example: a, b, c, d
• A walk is any sequence of nodes and edges, and may go backwards. Example: a, b, c, b, c, d
• A cycle is a path that starts and ends with the same node. Example: a, b, c, a

If you can trace a sequence of relations from one actor to another, then the two are reachable. If there is at least one path connecting every pair of actors in the graph, the graph is connected and is called a component.

Intuitively, a component is the set of people who are all connected by a chain of relations.

[Figure: this example contains many components.]

In general, components can be directed or undirected. For a graph with any directed edges, there are two types of components:
• Strong components consist of the set(s) of all nodes that are mutually reachable.
• Weak components consist of the set(s) of all nodes where at least one node can reach the other.

Distance & number of paths

Distance is measured by the (weighted) number of relations separating a pair.

[Figure] Actor “a” is:
• 1 step from 4
• 2 steps from 5
• 3 steps from 4
• 4 steps from 3
• 5 steps from 1

Paths are the different routes one can take.
Node-independent paths are particularly important.

[Figure: nodes a and b] There are 2 independent paths connecting a and b. There are many non-independent paths.

Measuring Networks: Large-Scale Models

Social Cohesion

White, D. R. and F. Harary. 2001. "The Cohesiveness of Blocks in Social Networks: Node Connectivity and Conditional Density." Sociological Methodology 31:305-59.
Moody, James and Douglas R. White. 2003. “Structural Cohesion and Embeddedness: A Hierarchical Conception of Social Groups.” American Sociological Review 68:103-127.
White, Douglas R., Jason Owen-Smith, James Moody, and Walter W. Powell. 2004. "Networks, Fields, and Organizations: Scale, Topology and Cohesive Embeddings." Computational and Mathematical Organization Theory 10:95-117.
Moody, James. "The Structure of a Social Science Collaboration Network: Disciplinary Cohesion from 1963 to 1999." American Sociological Review 69:213-238.

• Networks are structurally cohesive if they remain connected even when nodes are removed.

[Figure: four graphs with node connectivity 0, 1, 2 and 3] Each of these graphs has the exact same density.

Formal definition of structural cohesion:
(a) A group’s structural cohesion is equal to the minimum number of actors who, if removed from the group, would disconnect the group.
Equivalently (by Menger's Theorem):
(b) A group’s structural cohesion is equal to the minimum number of node-independent paths linking each pair of actors in the group.

Structural cohesion gives rise automatically to a clear notion of embeddedness, since cohesive sets nest inside of each other.
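Definition (a) can be checked directly on small graphs. The following is a minimal brute-force sketch, not the algorithm used in the cited papers: the function names and the two toy graphs are hypothetical, and the convention that a complete graph on n nodes scores n - 1 is an assumption.

```python
from itertools import combinations

def is_connected(nodes, edges):
    """Return True if the undirected graph restricted to `nodes` is connected."""
    nodes = set(nodes)
    if not nodes:
        return True
    adj = {n: set() for n in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u] - seen:
            seen.add(w)
            stack.append(w)
    return seen == nodes

def structural_cohesion(nodes, edges):
    """Smallest number of nodes whose removal disconnects the graph.

    Brute force over node subsets, so only suitable for small examples.
    Assumed convention: a complete graph on n nodes scores n - 1.
    """
    nodes = list(nodes)
    if not is_connected(nodes, edges):
        return 0
    for k in range(1, len(nodes) - 1):
        for removed in combinations(nodes, k):
            if not is_connected(set(nodes) - set(removed), edges):
                return k
    return len(nodes) - 1

# Hypothetical toy graphs: a 4-cycle and a 4-node chain.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
chain = [(0, 1), (1, 2), (2, 3)]
print(structural_cohesion(range(4), ring))   # 2
print(structural_cohesion(range(4), chain))  # 1
```

The ring survives the loss of any single node while the chain does not, which is the sense in which the ring is more cohesive. For real data one would use a flow-based routine instead of enumeration.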
[Figure: example network with numbered nodes]

[Figure: Project 90, sex-only network (n=695); 3-component (n=58)]

[Figure: IV drug sharing, connected bicomponents. Largest BC: 247; k > 4: 318; max k: 12]

Structural cohesion simultaneously gives us a positional and subgroup analysis.

Social Networks & Diffusion

Distance & number of paths

[Figure: probability of transfer by distance and number of paths, assuming a constant pij of 0.6. Y-axis: probability; x-axis: path distance (2 to 6); curves for 1, 2, 5 and 10 paths.]

Clustering and diffusion

[Figure: two networks with 11 arcs each. First: largest component 12, clustering 0. Second: largest component 8, clustering 0.205.]

Clustering turns network paths back on already identified nodes. This has been well known since at least Rapoport, and is a key feature of the “Biased Network” models in sociology.

Diffusion features on static graphs

Example on static graphs

Define, as a general measure of the “diffusion susceptibility” of a graph, the ratio of the area under the observed curve to the area under the random curve. As this ratio falls below 1.0, median transmission effectively slows.

Table 2.
OLS Regression of Relative Diffusion Ratio on Network Structure

Variable                  Model 1    Model 2    Model 3    Model 4    Model 5
Intercept                 1.62***    1.90***    1.02***    1.81***    1.71***
Connectivity
  Distance                -0.207***                        -0.179***  -0.171***
  Independent Paths       -0.077***                        -0.056***  -0.052***
  Distance x Paths        0.023***                         0.015***   0.016***
Clustering
  Clustering Coefficient             -0.692***             -0.653***  -0.454***
  Grade Homophily                    -0.026**              -0.007     -0.009*
  Peer Group Strength                -0.868***             -0.141     -0.146
Degree Distribution
  Degree Skew                                   -0.023     -0.007     -0.002
  Assortative Mixing                            -0.189*    -0.059     -0.071
Control Variables
  Network Size/100        0.005***   -0.005***  -0.005***  0.004*     0.002**
  Proportion Isolated     -0.007     -1.106***  -0.984***  -0.300*    0.058
  Non-Complete            -0.006     -0.052*    -0.078**   -0.006     0.018
Adj-R2                    0.85       0.76       0.60       0.90       0.93
N                         124        124        124        124        121

[Figure 4. Relative Diffusion Ratio by Distance and Number of Independent Paths. Y-axis: observed / random (0.4 to 1.2); x-axis: average path length (2.3 to 6.3); curves for k=2, k=4, k=6 and k=8.]

Centrality

Centrality refers to (one dimension of) location, identifying where an actor resides in a network.
• For example, we can compare actors at the edge of the network to actors at the center.
• In general, this is a way to formalize intuitive notions about the distinction between insiders and outsiders.

At the individual level, one dimension of position in the network can be captured through centrality.

Conceptually, centrality is fairly straightforward: we want to identify which nodes are in the ‘center’ of the network. In practice, identifying exactly what we mean by ‘center’ is somewhat complicated, but substantively we often have reason to believe that people at the center are very important.
Three standard centrality measures capture a wide range of “importance” in a network:
• Degree
• Closeness
• Betweenness

A common measure of centrality is closeness centrality. An actor is considered important if he/she is relatively close to all other actors. Closeness is based on the inverse of the distance of each actor to every other actor in the network.

Closeness centrality:

$$C_C(n_i) = \left[\sum_{j=1}^{g} d(n_i, n_j)\right]^{-1}$$

Normalized closeness centrality:

$$C'_C(n_i) = \left(C_C(n_i)\right)(g - 1)$$

[Figure: closeness centrality in 4 examples: C=1.0, C=0.0, C=0.36, C=0.28]

Measuring Networks: Flow

Time

Two factors that affect network flows:

Topology
- the shape, or form, of the network
- simple example: one actor cannot pass information to another unless they are either directly or indirectly connected

Time
- the timing of contacts matters
- simple example: an actor cannot pass information he has not yet received

Timing in networks

A focus on contact structure has often slighted the importance of network dynamics, though a number of recent pieces are addressing this.

Time affects networks in two important ways:
1) The structure itself evolves, in ways that will affect the topology and thus flow.
2) The timing of contact constrains information flow.

Drug Relations, Colorado Springs

Data on drug users in Colorado Springs, over 5 years.

[Figures: Year 1 through Year 5. From Year 2 on, the current year is in red and past relations are in gray.]

When is a network?
[Figure. Source: Bender-deMoll & McFarland, “The Art and Science of Dynamic Network Visualization,” JoSS, forthcoming.]

At the finest levels of aggregation networks disappear, but at the higher levels of aggregation we mistake momentary events for long-lasting structure.

Is there a principled way to analyze and visualize networks where the edges are not stable?

There is unlikely to be a single answer for all questions, but the set of types of questions might be manageable:

• Diffusion and flow (networks as resources or constraints for actors):
  • The timing of relations affects flow in a way that changes many of our standard measures. If our interest is in “Relational ties [as] channels for transfer or flow of resources” (W&F p.4), then we can use the diffusion process to shape our analyses.
• Structural change (networks as dynamic objects of study):
  • The interest is in mapping changes in the topography of the network, to model how the field itself changes over time.
  • Ultimately, this has to be linked to questions about how network macro-structures emerge as the result of actor behavior rules.

Network Dynamics & Flow

The key element that makes a network a system is the path: it's how sets of actors are linked together indirectly.

A walk is a sequence of nodes and lines, starting and ending with nodes, in which each node is incident with the lines following and preceding it in a sequence.

A path is a walk where all of the nodes and lines are distinct.

Paths are the routes through networks that make diffusion possible.

In a dynamic network, the timing of edges affects whether a good can flow across a path. A good cannot pass along a relation that ends prior to the actor receiving the good: goods can only flow forward in time.

A time-ordered path exists between i and j if a graph-path from i to j can be identified where the starting time for each edge step precedes the ending time for the next edge.
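One way to make the forward-in-time rule concrete is an earliest-arrival search. The sketch below is illustrative and rests on stated assumptions rather than on anything specified in the talk: contacts are undirected (u, v, start, end) periods, a good held at time t can cross a contact if t <= end, it then arrives at max(t, start), and transfer itself takes no time. The function name and the four-node chain (contact periods 1-2, 3-4, 1-2) are chosen for illustration.

```python
import heapq

def earliest_arrival(contacts, source, t0=0):
    """Earliest time each node can hold a good released by `source` at t0.

    contacts: list of (u, v, start, end) undirected contact periods.
    Assumption: a good held at time t crosses a contact if t <= end,
    arriving at max(t, start). Nodes missing from the result are unreachable.
    """
    adj = {}
    for u, v, s, e in contacts:
        adj.setdefault(u, []).append((v, s, e))
        adj.setdefault(v, []).append((u, s, e))
    best = {source: t0}
    heap = [(t0, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if t > best.get(u, float("inf")):
            continue  # stale queue entry
        for v, s, e in adj.get(u, []):
            if t <= e:  # the contact has not ended before the good arrives
                arrive = max(t, s)
                if arrive < best.get(v, float("inf")):
                    best[v] = arrive
                    heapq.heappush(heap, (arrive, v))
    return best

# Hypothetical chain A-B-C-D with contact periods 1-2, 3-4, 1-2.
contacts = [("A", "B", 1, 2), ("B", "C", 3, 4), ("C", "D", 1, 2)]
from_a = earliest_arrival(contacts, "A")
from_c = earliest_arrival(contacts, "C")
print(sorted(from_a))  # ['A', 'B', 'C']
print(sorted(from_c))  # ['B', 'C', 'D']
```

Each adjacent pair in the chain is connected, yet A never reaches D, because anything A sends arrives at C only after the C-D contact has ended.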
The notion of a time-ordered path must change our understanding of the system structure of the network. Networks exist both in relation-space and time-space.

A time-ordered path exists between i and j if a graph-path from i to j can be identified where the starting time for each edge step precedes the ending time for the next edge.

Note that this allows for non-intuitive non-transitivity. Consider this simple example:

A --(1-2)-- B --(3-4)-- C --(1-2)-- D

Here A can reach B, B can reach C, and C can reach D. But A cannot reach D, since any flow from A to C would have happened after the relation between C and D ended.

This can also introduce a new dimension for “shortest” paths:

[Figure: timed network among A, B, C, D and E]

The geodesic from A to D is AE, ED and is two steps long. But the fastest path would be AB, BC, CD, which while 3 steps long could get there by day 5 compared to day 7.

Reachability

[Matrix figures for a network of 8 people in a ring:]
• Direct contact network
• Implied contact network, all relations concurrent
• Implied contact network, mixed concurrent: 0.57 reachability
• Implied contact network, serial monogamy (1): 0.71 reachability
• Implied contact network, serial monogamy (2): 0.51 reachability
• Minimum contact network, serial monogamy (3): 0.43 reachability, which is the minimum possible reachability given the contact structure.

Identifying the Minimum Path Density of a Graph

A 2-regular graph
[Figure: a line and a cycle with edges alternating between times t1 and t2]
Line: l = 3g - 4
Cycle: l = 3g

A 3-regular spanning tree
[Figure: tree on 22 numbered nodes with edges timed t1, t2, t3]
l = 7g

A 3-regular grid
[Figure: grid with edges timed t1, t2, t3]
Each person can reach 4 people indirectly, leading again to l = 7g total arcs (7 per person).

A 3-regular linked clusters
[Figure: 12 numbered nodes in linked clusters with edges timed t1, t2, t3]
If you count self-loops, one still hits 7l overall.

Reachability as a function of relationship adjacency

Identified paths: t1 t2; t1 t3; t2 t3; t1 t2 t3

For a regular graph with d() = T:

$$P_i = T + \sum_{l=2}^{T} \frac{T(T-1)(T-2)\cdots(T-l+1)}{l!}$$

I think it's an open question to define a minimum reachability graph for non-regular structures.

Network Dynamics & Flow

In this graph, timing alone can change mean reachability from 2.0 when all ties are concurrent to 0.43: a factor of ~4.7.

In general, ignoring time order is equivalent to assuming all relations occur simultaneously; it assumes perfect concordance across all relations.

At the graph level, we are interested in two properties immediately:
a) The temporal-implied reachability (perhaps relative to minimum).
b) The asymmetry in reachability. What proportion of reachable dyads can mutually reach each other?
These are directly relevant for overall diffusion potential in a network.

Alternative measures:

Relative reach:

$$R = \frac{P}{\min(P)}$$

Conditional reachability (Harary, 1983):

$$\frac{P - \min(P)}{\max(P) - \min(P)}$$

The distribution of paths is important for many of the measures we typically construct on networks, and these will change if timing is taken into consideration:

Centrality:
• Closeness centrality
• Path centrality
• Information centrality
• Betweenness centrality

Network topography:
• Clustering
• Path distance

Groups & roles:
• Correspondence between degree-based position and reach-based position
• Structural cohesion & embeddedness
• Opportunities for time-based block-models (similar reachability profiles)

In general, any measure that takes the systems nature of the graph into account will differ in a dynamic graph from a static graph.

New versions of classic reachability measures:
1) Temporal reach: The ij cell = 1 if i can reach j through time.
2) Temporal geodesic: The ij cell equals the number of steps in the shortest path linking i to j over time.
3) Temporal cohesion: The ij cell equals the number of time-ordered node-independent paths linking i to j.

These will only equal the standard versions when all ties are concurrent.

Duration-explicit measures:
4) Quickest path: The ij cell equals the shortest time within which i could reach j.
5) Earliest path: The ij cell equals the real-clock time when i could first reach j.
6) Latest path: The ij cell equals the real-clock time when i could last reach j.
7) Exposure duration: The ij cell equals the longest (shortest) interval of time over which i could transfer a good to j.

Each of these also implies different types of “betweenness” roles for nodes or edges, such as a “limiting time” edge, which would be the edge whose comparatively short duration places the greatest limits on other paths.
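Measure 1 (temporal reach) and the two graph-level summaries discussed in this section can be sketched on an 8-person ring. This is an illustrative reconstruction under stated assumptions, not the code behind the slides: each contact is active in a single period, contacts around the ring alternate between periods 1 and 2, a good held at time t can cross a contact if t <= end, and all function names are hypothetical.

```python
def temporal_reach(contacts, nodes):
    """reach[i] = set of nodes j (j != i) that i can reach through time."""
    reach = {}
    for src in nodes:
        arrival = {src: float("-inf")}  # the good is available from the outset
        changed = True
        while changed:  # relax every contact until no arrival time improves
            changed = False
            for u, v, s, e in contacts:
                for a, b in ((u, v), (v, u)):
                    if a in arrival and arrival[a] <= e:
                        t = max(arrival[a], s)
                        if t < arrival.get(b, float("inf")):
                            arrival[b] = t
                            changed = True
        reach[src] = set(arrival) - {src}
    return reach

def reach_proportion(reach):
    """Share of ordered pairs (i, j) where i can reach j."""
    n = len(reach)
    return sum(len(r) for r in reach.values()) / (n * (n - 1))

def mutual_share(reach):
    """Share of reachable dyads in which both members can reach each other."""
    dyads = {frozenset((i, j)) for i in reach for j in reach[i]}
    mutual = [d for d in dyads if all(b in reach[a] for a in d for b in d - {a})]
    return len(mutual) / len(dyads)

nodes = list(range(8))
# Assumed timing: ring contacts alternate between period 1 and period 2.
alternating = [(i, (i + 1) % 8, 1 + i % 2, 1 + i % 2) for i in nodes]
concurrent = [(i, (i + 1) % 8, 1, 1) for i in nodes]

print(round(reach_proportion(temporal_reach(alternating, nodes)), 2))  # 0.43
print(reach_proportion(temporal_reach(concurrent, nodes)))             # 1.0
print(mutual_share(temporal_reach(alternating, nodes)))                # 0.5
```

Under the alternating schedule each person reaches only 3 of the 7 others (3/7 ≈ 0.43), and only the direct contacts are mutual: every indirect reach runs one way.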
Define time-dependent closeness as the inverse of the sum of the distances needed for an actor to reach others in the network.*

$$C_{TDCloseness}(i) = \left[\sum_{j} D_{ij}\right]^{-1}$$

Actors with high time-dependent closeness centrality are those that can reach others in few steps given temporal order. Note that this is directed, since Dij =/= Dji (in most cases) once you take time into account.

*If i cannot reach j, I set the distance to n+1.

Timing affects the symmetry of a symmetric contact graph.

[Figure: contact graph among A, B, C, D, E and F. Numbers above lines indicate contact periods (labels 2-5, 3-5 and 8-9 are visible).]

[Figure: the implied reachability graph among A, B, C, D, E and F]

Define fastness centrality as the average of the clock-time needed for an actor to reach others in the network:

$$C_{fast}(i) = \frac{1}{N-1} \sum_{j} \left(\max(time) - time_{ij}\right)$$

Actors with high fastness centrality are those that would reach the most people early. These are likely important for any “first mover” problem.

Define quickness centrality as the average of the minimum amount of time needed for an actor to reach others in the network:

$$C_{quick}(i) = \frac{1}{N-1} \sum_{j} \min\left(T_{jit} - T_{it}\right)$$

where Tjit is the time that j receives the good sent by i at time t, and Tit is the time that i sent the good. This then represents the shortest duration between transmission and receipt between i and j.

Note that this is a time-dependent feature, depending on when i “transmits” the good out into the population. The min is one of many possible functions, since the time-to-target speed is really a profile over the duration of t.

Define exposure centrality as the average of the amount of time that actor j is at risk to a good introduced by actor i:

$$C_{exposure}(i) = \frac{1}{N-1} \sum_{j} \left(T_{ijl} - T_{ijf}\right)$$

where Tijl is the last time that j could receive the good from i and Tijf is the first time that j could receive the good from i, so the difference is the interval in time when j is at risk from i.
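As a rough sketch of the first of these scores, time-dependent closeness can be computed from temporal geodesics (fewest contacts along a time-ordered route), scoring unreachable pairs as n+1 as in the footnote. The function names, the crossing rule (a good held at time t can cross a contact if t <= end) and the four-node chain example are assumptions made for illustration.

```python
def temporal_geodesics(contacts, nodes, source):
    """Fewest contacts `source` needs to reach each node in time order.

    Round k keeps, for every node, the earliest arrival time achievable
    with at most k contacts; a node's geodesic is the first round in which
    it becomes reachable. Unreachable nodes are absent from the result.
    """
    inf = float("inf")
    arrival = {n: inf for n in nodes}
    arrival[source] = float("-inf")  # the source holds the good from the outset
    steps = {source: 0}
    for k in range(1, len(nodes)):
        new = dict(arrival)
        for u, v, s, e in contacts:
            for a, b in ((u, v), (v, u)):
                if arrival[a] <= e:  # contact still open when the good is at a
                    new[b] = min(new[b], max(arrival[a], s))
        for n in nodes:
            if n not in steps and new[n] < inf:
                steps[n] = k
        arrival = new
    return steps

def td_closeness(contacts, nodes):
    """Inverse of summed temporal distances; unreachable pairs count as n + 1."""
    n = len(nodes)
    scores = {}
    for i in nodes:
        steps = temporal_geodesics(contacts, nodes, i)
        scores[i] = 1 / sum(steps.get(j, n + 1) for j in nodes if j != i)
    return scores

# Hypothetical chain A-B-C-D with contact periods 1-2, 3-4, 1-2.
nodes = ["A", "B", "C", "D"]
contacts = [("A", "B", 1, 2), ("B", "C", 3, 4), ("C", "D", 1, 2)]
scores = td_closeness(contacts, nodes)
print(temporal_geodesics(contacts, nodes, "A"))  # {'A': 0, 'B': 1, 'C': 2}
print(scores["A"], scores["B"])                  # 0.125 0.14285714285714285
```

The distances are directed: A reaches C in two steps, while C cannot reach A at all and so is charged the n+1 penalty for that pair.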
How do these centrality scores compare?

Here I compare the duration-dependent measures to the standard measures on this example graph.

Based only on the structure of the ties, this graph has lots of different centers, depending on closeness, betweenness or degree (size). In this graph, closeness and betweenness correlate at 0.64, closeness and degree at 0.56, and betweenness and degree at 0.71.

[Figure: node size proportional to degree]

But these edges are timed, since publications occur at a particular date. Here I treat the edges as lasting between the first and last publication date, and animate the resulting network. Dark blue edges are active; past edges are “ghosted” onto the map. Note the fairly high concurrency (some of it necessary due to two-mode data).

At the individual level, what is the relation between structural centrality and duration centrality?

[Figure: correlation with closeness centrality] Box plots based on 500 permutations of the observed time durations. This holds constant the duration distribution and the number of edges active at any given time.

What about at the system level? How do the features of the temporal ordering affect the overall asymmetry in reachability and the proportion of pairs reachable?

[Figures: reachability and asymmetry, each plotted against concordance (k3)]

How do these centrality scores compare?
The “most important actors” in the graph depend crucially on when they are active. The correlations can range wildly over the exact same contact structure.
Concordance is important, but not determinative (at least within the range studied here). We need to extend our intuition on the global distribution of time in the graph.
The “centrality” scores described here are low-hanging fruit: simple extensions of graph-based ideas. For population-level questions, the crucial step will be building aggregations of these features: something like “centralization” that captures the regularity, asymmetry and temporal role-structure of the network.

Network Dynamics & Flow
How can we visualize such graphs?
Animation of the edges, when the graph is sparse, helps us see the emergence of the graph, but diffusion paths are difficult to see.
Consider an example: romantic relations at “Jefferson” high school.

Network Dynamics & Flow
How can we visualize such graphs?
Animation of the edges, even when the graph is sparse, does not typically help us see the potential flow space, as it's just too hard to follow the implication paths with our eyes, so it seems better to plot the implied paths directly.
Consider an example: plotting the reachability matrix can be informative if the graph has clear pockets of reachability.
[Figure: good readability example.]

Network Dynamics & Flow
How can we visualize such graphs?
Animation of the edges, even when the graph is sparse, does not typically help us see the potential flow space, as it's just too hard to follow the implication paths with our eyes, so it seems better to plot the implied paths directly.
Consider an example: edges have discrete start and end times, tagged as days over a 2-year window. So the first contact between nodes 10 and 4 was on day 40, and the last contact on day 72.

Network Dynamics & Flow
How can we visualize such graphs?
Consider an example: here we plot the reachability matrix over the coordinates for the direct network. Direct ties are retained as green lines; if node i can reach node j, a directed arrow joins the two nodes. Pairs of nodes that can reach each other are marked in red, purely asymmetric pairs in blue.
This is accurate, but hard to read when reachability paths are long.
[Figure: poor readability example.]

Network Dynamics & Flow
How can we visualize such graphs?
Consider an example: various weightings of the indirect paths also don't help in an example like this one. Here I weight the edges of the reachability graph as 1/d, and plot using the FR (Fruchterman-Reingold) layout. You get some sense of the nodes that reach many others (size is proportional to out-reach), but you really miss the asymmetry in reach (the correlation between number reached and number reached by is nearly 0).

Network Dynamics & Flow
How can we visualize such graphs?
Another tack is to shift our attention from nodes to edges, by plotting the line graph (thanks to Scott Feld for making this suggestion).
The idea is to identify an ordering to the vertical dimension of the graph to capture the flow through the network.
Consider an example. So now we:
1) Convert every edge to a node.
2) Draw a directed arc between edges that (a) share a node and (b) are ordered in time, running from the earlier edge to the later one.
3) Concurrent edges (such as {13-8, 13-5} or {1-16, 2-16}) will be connected with a bi-directed edge (they will form fully connected cliques), while the remainder of the graph will be asymmetric and ordered in time.

Network Dynamics & Flow
Further complications, which ultimately link us back to the question of “When is a network?”
1) Range of temporal activity
- When the graph is globally sparse (like the example above), the path-structure will also be sparse. Increasing density will lead to lots of repeated interactions, and thus reachability cycles.
- Consider email exchange networks or classroom communication networks vs. sexual networks. In sexual or romantic networks, returning to a partner once the relation has ended is rare; in communication networks it is common.
2) Observed vs. real
- We will often have discrete observations of real-time processes. How do we account for between-wave temporal ordering? What are the limits of observed measures for capturing such inter-wave activity?
- The Snijders et al. Siena modeling approach is an obvious first step here.
Network Dynamics & Flow
Further complications, which ultimately link us back to the question of “When is a network?”
3) Temporal reachability as a higher-order model feature
- As the capacity of ERGM models continues to expand, we can start to build temporal sequence rules into the local models (such as communication triplets, or avoidance of past relations once ended), which then makes it sensible to ask whether the models fit the time-structure of the data.
4) Optimal observation windows
- Either for data collection or visualization, we often have to decide on a time-range for our analyses. What should that range be?
5) Relational temporal asymmetry
- For many types of relations, it is difficult to decide when relations end. This taps a distinction between activated and potential relations.
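Returning to the line-graph view described earlier (every edge becomes a node, with arcs between edges that share an actor and are ordered in time), the construction can be sketched as follows. The rule used to decide when one interval “precedes” another is an assumption, not taken from the slides: edge e1 can pass a good to edge e2 if e1 starts no later than e2 ends. Under that rule, overlapping (concurrent) edges point at each other and so come out bi-directed. The contact list and function name are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: undirected contacts given as (u, v, start, end).
TOY_CONTACTS = [
    ("a", "b", 1, 2),
    ("b", "c", 3, 4),
    ("b", "d", 3, 5),
    ("c", "e", 6, 7),
]


def temporal_line_graph(contacts):
    """Return the set of directed arcs between contacts.

    Each contact (u, v, start, end) becomes a node labeled (u, v).
    An arc e1 -> e2 is drawn when the two contacts share an actor and
    e1 starts no later than e2 ends (assumed ordering rule).
    """
    arcs = set()
    for (u1, v1, s1, t1), (u2, v2, s2, t2) in combinations(contacts, 2):
        if not {u1, v1} & {u2, v2}:
            continue  # no shared actor, so no flow between these contacts
        e1, e2 = (u1, v1), (u2, v2)
        if s1 <= t2:
            arcs.add((e1, e2))
        if s2 <= t1:
            arcs.add((e2, e1))  # both arcs exist only when intervals overlap
    return arcs


line_arcs = temporal_line_graph(TOY_CONTACTS)
for src, dst in sorted(line_arcs):
    print(src, "->", dst)
```

Here the contacts b-c and b-d overlap in time and share actor b, so they are joined in both directions, while a-b feeds into both of them without any arc coming back.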
