					    Networks and Graphs

IS 247 Information Visualization and Presentation
                  19 April 2002

                 James Reffell
               Moryma Aydelott
             Jean-Anne Fitzpatrick
            This week's papers
• K. M. Fairchild, S. E. Poltrock, and G. W. Furnas, "SemNet:
  Three-Dimensional Graphic Representations of Large Knowledge
  Bases" (1988)
• S. G. Eick and G. J. Wills, "Navigating Large Networks with
  Hierarchies" (1993)
• R. A. Becker, S. G. Eick, and A. R. Wilks, "Visualizing Network
  Data" (1995)
• T. Munzner, "Exploring Large Graphs in 3D Hyperbolic Space"
• T. Munzner, F. Guimbretière, and G. Robertson, "Constellation:
  A Visualization Tool for Linguistic Queries from MindNet" (1999)
              Problem Statement
 "Visual representations of generalized
  graphs of even modest size tend to look
  like a ball of tangled string. While the
  indicated relationships may be logically
  correct, they may also be visually

(From Readings in Information Visualization: Using Vision to Think ,
   Introduction to Section 2.5)
                   Key Goals
• Represent complexity of graphs, including cross-links,
• Scale to large networks (thousands of nodes, millions
  of edges)
• Provide interaction and navigation that assists in
• Position and represent nodes in a way that is clear
  and conveys information
• Display links in a way that is clear and conveys
  information
            Key Goals, Cont'd
Note that the last two goals each include a
  negative side:
   – Avoid occlusion, avoid edge crossing, avoid
     overwhelming quantity of edges, etc., so that
     visualization is clear
and a positive side:
   – Use positioning, retinal qualities, and
     selection/filtering to actually add to the
     information conveyed

   Rules are made to be broken?
              Purposes / Uses
• Various applications:
  –   Software architecture
  –   Web architecture
  –   Semantic relationship data
  –   Physical network data (Internet traffic)
  –   Large social networks
  –   Email, telephone (Social and physical networks!)
  –   The "old reliable" data set: the Unix file system
  –   More?
• Filtering
• Clustering
• Aggregation
• Focus + context
• Semantic zooming
• Selection of organizing structure (geographic map,
  spanning tree, semantic structure)
• Selection of representation (color, size, glyphs)

    Typically, these techniques reduce the amount of
    detail displayed simultaneously; interaction is key.
           Important point!
• Fairchild notes that visualizations of large
  networks are constrained by the limitations of
  hardware responsiveness and human
  perceptual capabilities:
   – By Moore's law, the former can be
     expected to (and has) become less of an
   – The latter will not!
   SemNet// Fairchild et al.
Fairchild, Poltrock & Furnas, SemNet:
  Three-Dimensional Graphic
  Representations of Large Knowledge
  Bases (1988)
– Early work, visualization of large knowledge base
– General overview of problems & solutions for large
  network visualizations
– Directed graphs in three-dimensional space
– Labeled rectangles represent groups of rules
  expressed as Prolog modules. These are connected
  by colored arcs representing possible paths between
  the groups of rules. Messages are shown as labeled
  objects that move from group to group via the arcs.
   SemNet// Fairchild et al.

Positioning Elements
  - One purpose of a network visualization is to reveal structure:
    "the details … are de-emphasized and the structure is
  - Large number of arc (edge) crossings a problem—3D helps,
    but layout is an issue
  - Poor positioning can confuse, hide structure instead of
    revealing it
  - No general solution, can be dependent on domain and other
  - Three tactics: mapping, connectivity, manipulation
     SemNet// Fairchild et al.
Positioning Elements (cont'd)
  - Mapping: Using domain-specific properties to arrange layout.
      - Examples: Geography, dimensions of mammals (size, predacity,
  - Connectivity: Using structural properties to arrange layout.
      - Elements that are more directly connected to other elements should be
        displayed as closer together.
      - Perfection is not possible (b/c of dimension limitations)
      - Technique: Multidimensional scaling
      - Technique: Heuristics
          - Initial conditions—sometimes random
  - Manipulation: User-controlled layout.
      - Can be used in combination with other methods
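
The connectivity tactic can be sketched in a few lines. This is an illustrative example only, not SemNet's code. It assumes that "closeness" means hop count on a connected, undirected graph and uses classical (Torgerson) multidimensional scaling from NumPy. The names hop_distances and classical_mds, and the five-node path graph, are hypothetical.

```python
from collections import deque

import numpy as np


def hop_distances(n, edges):
    """All-pairs hop counts via BFS (assumes a connected, undirected graph)."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    dist = np.full((n, n), np.inf)
    for start in range(n):
        dist[start, start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[start, v] == np.inf:
                    dist[start, v] = dist[start, u] + 1
                    queue.append(v)
    return dist


def classical_mds(dist, dims=3):
    """Place nodes in `dims` dimensions so distances approximate `dist`."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dims]
    # Negative eigenvalues mean the distances cannot be embedded exactly;
    # clipping them is the "perfection is not possible" compromise.
    scale = np.sqrt(np.clip(eigvals[top], 0, None))
    return eigvecs[:, top] * scale


# Hypothetical example: a simple chain of five elements.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
coords = classical_mds(hop_distances(5, edges), dims=3)
```

Directly connected elements end up near each other, and elements many hops apart end up far apart. For graphs whose hop distances do not fit in three dimensions the result is only an approximation, which is where heuristics and user manipulation come in.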
     SemNet// Fairchild et al.
Coping with too much information
   – One method: Show subsets by type / property
   – Another: Fisheye Views! Approaches include:
      • Clustering: Group elements together. Elements closer to the focal
        point in small sets, elements farther away in progressively larger sets.
        Clusters are represented as rectangles.
           – SemNet adds functions to assist with understanding clusters: color coding
             of rectangles to indicate adjacency to focal point, naming by most-
             connected node, representation of proportion of total nodes included in
             each set
           – Note: this approach is similar to semantic zooming—and they suggest this
             as a possibility for extension!
      • 3D Point Perspective: Implicit in 3D view—nearer objects appear
        larger than those farther away.
      • Sampling Density (not used): Focal point has higher resolution,
        distant nodes fade into nothingness
   – Arcs! Arcs are only shown when an element in connects is
      SemNet// Fairchild et al.
Navigation & Browsing
   – Recognition: Where am I?
       •   Allow users to leave 'markers'
       •   Path retracing
       •   Consistency with 3D metaphor is important
       •   Depth—oscillation distracting, small random movement OK
   – Control & Movement: How do I move around?
       • Relative: Movement along 3 rotations plus forward and backward
         movement. Difficult to control and very disorienting.
       • Absolute: Movement using separate (2D each) maps of the space. Fine
         control difficult, and relationship between the dimensions
       • Teleportation: Move directly to a location—normally one already
         visited (so similar effect to path retracing)
       • Hyperspace: Movement by semantic relationship. Links!
       • Moving the space: Rotate and move the structure rather than the
        SemNet// Fairchild et al.
Other issues:
• Dynamic execution
     – Labeled sprites representing pieces of
       words move along arcs.
     – User controls speed
     – Color changes of arcs and elements
       indicate status (used / unused)
•   Application use
     – This application is for debugging—so is
       domain specific, and includes tools for
       doing so.
   Bell Labs Papers: Eick and
  Wills // Becker, Eick and Wilks
S. G. Eick and G. J. Wills, Navigating Large
 Networks with Hierarchies (1993)
• Visualizations of e-mail and software engineering data
• Data used has no direct spatial layout

R. A. Becker, S. G. Eick, A. R. Wilks, Visualizing
 Network Data (1995)
• Multiple visualizations of long distance network, internet traffic, and e-mail data.
• Data often has a direct spatial layout.

(Both emphasize user control/manipulation of display)
   Bell Labs Papers: Eick and
  Wills // Becker, Eick and Wilks
"Since the needs of each user are unique, our visualizations are
task-oriented. Our most successful visualizations help frame
interesting questions as well as answer them. Our
 • Make use of existing data….
 • Focus on real problems with targeted users….
 • Leverage interaction. … Dynamic interaction allows users to
   separate the wheat from the chaff.
 • Are information dense. …
 • Focus on understanding and insight. Results are more
   important than any particular technique."

    Navigating Large Networks
         with Hierarchies
                          (Eick and Wills)

• Used to examine large hierarchical networks of 500 to millions of
  nodes (they claim).
• Nodes placed to show relationships (not to convey geographic
  data or use all screen space).
• Node area and color show size and function; links show which
  relationships exist and how "hot" they are.
• User controls the time period viewed and the "spread" of link colors;
  can change view to hide/display chosen portions or focus on
  selected regions.
   Navigating Large Networks
        with Hierarchies
                         (Eick and Wills)

• Placement algorithm showed both expected and surprising
• Using hierarchical data for scalable aggregation allowed users to
  reduce the size of and actually interact with the data set without
  reducing content (they claim)
• Zooming allows users to drill down the hierarchy (from module
  to file views)
• Use of linking (for link strengths) and mouseovers (for labels)
  adds to usability.
    Navigating Large Networks
         with Hierarchies
                          (Eick and Wills)

What's different (and actually works):
• Not using geography to map data with geographic components
  (their rationale: since geography is known and unchanging why
  bother showing it)
• Not mapping data to give max distance between nodes (their
  rationale: more chance for overlap but should increase
  significance of distance between nodes)
• Not trying to show everything - heavy use of aggregation
  (reducing 8 million links to a much smaller set, then displaying
  only the top 1%) reduced screen clutter yet still allowed users
  to see important data. But who knows if something valuable
  was left out?
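
The aggregate-then-filter idea can be sketched as follows. This is a hedged illustration, not the algorithm from the paper: it assumes each file belongs to exactly one module, that links inside a module are dropped when rolling up, and that "top" means largest total weight. The function names and the tiny data set are hypothetical.

```python
import math
from collections import Counter


def aggregate_links(links, parent):
    """Roll file-level links up to their parent modules, summing weights.

    Assumption: links whose endpoints share a parent are dropped.
    """
    totals = Counter()
    for (src, dst), weight in links.items():
        a, b = parent[src], parent[dst]
        if a != b:
            totals[tuple(sorted((a, b)))] += weight
    return totals


def top_fraction(totals, fraction):
    """Keep only the heaviest `fraction` of links (always at least one)."""
    keep = max(1, math.ceil(len(totals) * fraction))
    return dict(totals.most_common(keep))


# Hypothetical data: four files in three modules.
parent = {"a.c": "ui", "b.c": "ui", "c.c": "db", "d.c": "net"}
links = {
    ("a.c", "c.c"): 5,
    ("b.c", "c.c"): 7,
    ("a.c", "b.c"): 9,
    ("c.c", "d.c"): 1,
}
modules = aggregate_links(links, parent)
visible = top_fraction(modules, 0.5)
```

In this toy case the weak db/net link is filtered out of the display, which is exactly the kind of omission the last bullet worries about.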
    Navigating Large Networks
         with Hierarchies
                          (Eick and Wills)

What's different (and not so effective):
• Confusing color mappings (using red for both implies a
  connection between clerical nodes and "hot" links)
• Redundant node messages (box shape and color both say the
  same thing)
• Filtering may result in nodes without links, not considered a
  problem in the paper but may be confusing for a user
• Aesthetic considerations – color combinations are a bit jarring
     Visualizing Network Data
                  (Becker, Eick, Wilks)

SeeNet System -
• A walk through varied representations of the same
  data: AT&T Long Distance Network calls to/from the
  Bay Area after the October 17, 1989 Loma Prieta
  earthquake

Other Applications of the SeeNet System
• CICNet, E-mail communications, World Internet data
     Visualizing Network Data
                   (Becker, Eick, Wilks)

Similarities to previous system
• Similar uses of color and node proportion
• User can interact with and manipulate display

With some differences…
• (In many cases) nodes placed to reflect their
  geographic/spatial relationships
• Multiple types of views available
      Visualizing Network Data
                    (Becker, Eick, Wilks)

Representations Used:
• Bi-directional values shown in one line.
• Box width/height to show numbers of
  inbound/outbound calls
• Link colors and width reflect statistical
• Matrix position (approximately) reflects
  geographic position
• Matrix box size (if small) and color (if
  not blue) represent call load
         Visualizing Network Data
                             (Becker, Eick, Wilks)

Pro
• Shows network connectivity
• Color and line thickness compactly convey statistical data
• Half-lines aren't so bad when you know where at least one half goes

Con
• Cross-country lines obscure middle of the country data
• Longer lines are "given undue visual
• When they could be to/from anywhere, half-lines are hard to follow
• (and NJ/NY are in the Atlantic!)
       Visualizing Network Data
                           (Becker, Eick, Wilks)

Node Map – Pro
• Display is uncluttered

Node Map – Con
• Aggregation again – only overall node data available
• Lots of empty space that's wasted?

Matrix – Pro
• All links given same visual importance
• Easy to see patterns

Matrix – Con
• "Ambiguity of row and column order" - could maybe get a feel for location, but
  have to select a data point to really know what nodes are involved.
         Visualizing Network Data
                               (Becker, Eick, Wilks)
Parameter Focusing
• Statistics (using logs or percentages or …)
• Levels (selecting data to show or suppress)
• Geography (zooming in or panning out)
• Time (selecting a time period)
• Size (changing overall size of symbols or link length/width)
• Color (using color slider to maximize/highlight differences)

Direct Manipulation
• Identification (mousing over a node or link to see the data behind it)
• Link Map Parameter Controls (dynamic manipulation of link color, width and length)
• Matrix Display Parameter Controls (dragging and dropping rows and columns)
• Node Map Parameter Controls (dynamic manipulation of node color and size)
• Animation (controlling speed)
• Zooming (changing focus, filter by what's in the view)
• Conditioning (viewing two variables at once)
• Sound (to convey state, frame and selection changes)
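
Three of the focusing parameters (statistics, levels, and color) combine naturally, as the rough sketch below shows. It is not SeeNet code. It assumes the statistic is a positive count, that the user picks a low and a high threshold (as with a slider), and that color is a 0-to-1 shade stretched across that range. The function name and the call counts are hypothetical.

```python
import math


def focus_links(links, low, high, use_log=True):
    """Return {link: shade} for links whose statistic falls in [low, high].

    Assumes all values are positive when use_log is True.
    """
    shown = {}
    for link, value in links.items():
        stat = math.log10(value) if use_log else value
        if low <= stat <= high:
            # Stretch the visible range over the full shade scale so that
            # small differences inside the range become easier to see.
            shade = (stat - low) / (high - low) if high > low else 1.0
            shown[link] = shade
    return shown


# Hypothetical call counts between city pairs.
calls = {("SF", "NY"): 10000, ("SF", "LA"): 100, ("SF", "Reno"): 1}
shown = focus_links(calls, low=1.5, high=4.5)
```

Moving the low threshold up suppresses more links (levels), switching use_log changes the statistic, and the shade rescales to whatever remains visible (color).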
          Other SeeNet Applications

Adaptability of the system
demonstrated by the variety of
data it can effectively display

•   (Long Distance Network Load)
•   Internet Packet Flows
•   E-mail Communication
•   Country-Country Internet Traffic
                               Further Work
   (Though the image below is from: Kenneth C. Cox, Stephen G. Eick, and Taosong He, 3D geographic network
                         displays, ACM Sigmod Record, 25(4), 50-54, December 1996)

Internet traffic flows between fifty countries, as measured by the NSFNET backbone in 1993.

• Differing heights make the most important (high-traffic) links the highest and
  therefore the most visually prominent on the map.
• Overlapping lines and the flat orientation of the map make it difficult to pick out which arcs go from
  where to where.
    Many more interactive infovis systems
       from AT&T Bell Laboratories :
                             (An overview available at

•   SeeData relational data
•   SeeDiff file system differences
•   SeeLib bibliographic databases
•   NicheWorks[1] abstract networks
•   SeeLog time-stamped log reports
•   SeeNet[2] linked geographic data
•   SeeSlice[3] program slices and code coverage
•   SeeSoft[4] lines of text in files
•   SeeSys[5] hierarchical software modules
•   SeeTree[6] hierarchical data
              H3 / H3Viewer

A "second-generation hyperbolic cone tree"
• Similar concept to Lamping and Rao's hyperbolic
   browser (seen last week), but projects onto sphere
   rather than circle, handles graphs as well as trees
• Also draws on ConeTree (seen several times earlier in
   semester), but distributes child nodes on surface of
   hemisphere rather than circumference of circle
• Basis for XML3D application (paper by Risden et al.
   discussed in week on empirical evaluation)
             H3 / H3Viewer
• Spanning tree used as backbone for layout
  – A tree that contains all nodes (but only a subset of
    the edges) of the graph
• Crucial point: Choosing the right spanning
  tree requires domain-specific knowledge.
  – "The key idea is that many non-tree graphs exist
    for which the right spanning tree can provide a
    useful mental model of the entire structure"
  – Examples: Directory structure as spanning tree for
    web architecture, total execution time data to
    select spanning tree for function calls
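
A minimal sketch of the spanning-tree backbone idea follows. It uses plain breadth-first search as a domain-agnostic stand-in; as the slide stresses, a real application would choose the tree using domain knowledge (directory structure, execution time, and so on). This is not the H3 algorithm itself, and the function names and the small site graph are hypothetical.

```python
from collections import deque


def bfs_spanning_tree(adjacency, root):
    """Return {node: parent} for a breadth-first spanning tree from `root`."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return parent


def split_edges(adjacency, parent):
    """Separate edges into the tree backbone and the remaining non-tree links."""
    tree, extra = set(), set()
    for node, neighbors in adjacency.items():
        for neighbor in neighbors:
            edge = frozenset((node, neighbor))
            if parent.get(node) == neighbor or parent.get(neighbor) == node:
                tree.add(edge)
            else:
                extra.add(edge)
    return tree, extra


# Hypothetical web site: pages linked both by hierarchy and by a cross-link.
site = {
    "/": ["/a", "/b"],
    "/a": ["/", "/b", "/a/x"],
    "/b": ["/", "/a"],
    "/a/x": ["/a"],
}
parent = bfs_spanning_tree(site, "/")
tree_edges, cross_links = split_edges(site, parent)
```

The tree edges drive the layout, while the cross-links are the leftover edges that a viewer can draw on demand.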
                H3 / H3Viewer
• Benefits of hyperbolic projection as noted before:
  large amount of information can be displayed, distant
  objects are automatically reduced to < 1 pixel, but
  even distant trees (when visible) can be perceived as
  dense or sparse.
• Various techniques to preserve orientation
   – Animation
   – Rotate to maintain orientation with ancestors on left,
     descendants on right
   – Child node with the most descendants always at "pole" of
• Static or dynamic attributes of nodes and edges can
  be coded with color, line-width
             H3 / H3Viewer
• Described as providing "reasonable balance
  between information density and clutter"
• Interactivity supports exploration
  – Also implies each view need not be "polished" in
    part because user can adjust it, e.g., if lines
    intersect or nodes occlude
• "Graph as index": combine with other views
  to get all details
  – Example: Site Manager
      (Munzner, Guimbretière, and Robertson)

• Very different graph layout
• Special purpose, highly complex subject
  domain! (semantic networks)
• Interesting interface features:
  – hovering for "light-weight query"
  – pie menu for selection of constellation to highlight
• Data is calculated "best paths" between
  two words:
  – Words actually on the path linking the start
    and end term
  – Words whose definitions were used in
    constructing the path
  – Different relationships between words are
    encoded (e.g., part of, is a, has a)
• Paths are ordered by plausibility (before
  being graphed)
• Layout uses left-to-right orientation to encode
  plausibility (vs. more typical use of layout to
  show clustering / distance between nodes)
• "Plausibility gradient" is also represented by
• Does not attempt to minimize edge crossings;
  instead, use visual properties such that most
  edges remain in background except when
  part of highlighted ―constellation‖ of nodes
  and edges
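
The idea of spending horizontal position on plausibility rather than on clustering can be sketched very simply. This is an illustration under stated assumptions, not the layout from the paper: it assumes each path gets its own column, that the most plausible path sits leftmost, and that words in a path are stacked top to bottom. The function name and the sample paths and scores are made up.

```python
def layout_paths(paths):
    """Return (word, column, row) triples with columns ordered by plausibility.

    Assumption: column 0 (leftmost) holds the most plausible path.
    """
    ranked = sorted(paths, key=lambda p: p["plausibility"], reverse=True)
    positions = []
    for column, path in enumerate(ranked):
        for row, word in enumerate(path["words"]):
            positions.append((word, column, row))
    return positions


# Hypothetical paths between two words, with made-up plausibility scores.
paths = [
    {"words": ["kangaroo", "marsupial", "tail"], "plausibility": 0.4},
    {"words": ["kangaroo", "tail"], "plausibility": 0.9},
]
positions = layout_paths(paths)
```

Because position is already used up, the same word can appear in several columns, and edge crossings between columns are left to highlighting rather than to layout.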

• Video makes it clearer! (I hope)
• Illustrates:
  – Rules were made to be broken
  – Intersection of mental model and data
            User Testing?
• Fairchild: Evaluation based on user
  study, but data not included
• Bell Labs: mention users but not testing
• Munzner et al.:
  – Previous study on XML3D
  – Constellation used user-centered design
    process, no formal testing because only 3