Analysis of Fusing Online and Co-presence Social Networks

					Juan (Susan) Pan, Daniel Boston, and
Cristian Borcea
Department of Computer Science
New Jersey Institute of Technology
   Traditional social apps               Location-aware social apps

   Socially-aware apps
        BUBBLE Rap
             Use social knowledge to improve packet forwarding in
              delay-tolerant networks
        Tribler
             Use social knowledge to reduce peer-to-peer communication
   Declared by users
     Implicitly, through online social networks
     Explicitly, through surveys
   Extracted from user online interactions
   Extracted from user mobility traces
     Location traces
     Co-presence traces (e.g., using Bluetooth)

   Multiple social graphs (e.g., Facebook and co-presence)
     Vertices -> users
     Edges -> social ties
   Online social networks (OSN) provide a relatively stable
    social graph
     Many connections are weak
      ▪ Example: actors have millions of “friends”
     Not all social contacts use OSN apps
   Co-presence social network (CSN) identifies social ties
    grounded in real-world interactions
     Hard to differentiate social connections from passers-by
 • Do OSN and CSN just reinforce each other or
   capture different types of social ties?
 • Can a fused network take advantage of the
   strengths of both?
     How can we quantify the benefits of this fusion?
     Can we measure the contribution of each source network
      to the fused network?

   Motivation
   Data collection
   Social graph representation
   Analysis of global network parameters
   Analysis of local network parameters
   Conclusions

   One month of CSN data and Facebook data for the
    same set of 104 students
     Volunteers
     Received compensation
     Belong to various departments at NJIT

               User   Seen   Time
               A      B      1:00
               B      A      1:05
               B      C      1:05
               A      B      1:07
               A      C      1:07


                     Max            Mean          Standard Dev.
Meeting Duration     220 hr 2 min   1 hr 16 min   7 hr 34 min
Meeting Frequency    51             2.2           3.7
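The slides do not say how raw sightings are grouped into meetings. The following is a minimal sketch, assuming times are minutes since midnight, pairs are unordered, and sightings of the same pair no more than `max_gap` minutes apart belong to one meeting. The function name and the 10-minute gap are hypothetical choices for illustration.

```python
from collections import defaultdict

def summarize_meetings(records, max_gap=10):
    """Return {pair: (total_minutes, n_meetings)} from (user, seen, minute) records.

    Assumption: sightings of the same unordered pair that are at most
    max_gap minutes apart are merged into a single meeting.
    """
    sightings = defaultdict(set)
    for user, seen, minute in records:
        if user != seen:
            sightings[frozenset((user, seen))].add(minute)

    summary = {}
    for pair, minutes in sightings.items():
        ordered = sorted(minutes)
        total, count = 0, 1
        start = prev = ordered[0]
        for t in ordered[1:]:
            if t - prev > max_gap:      # gap too long: close the meeting
                total += prev - start
                count += 1
                start = t
            prev = t
        total += prev - start           # close the last meeting
        summary[pair] = (total, count)
    return summary

# The five records from the sample trace (1:00 -> minute 60, and so on).
records = [("A", "B", 60), ("B", "A", 65), ("B", "C", 65),
           ("A", "B", 67), ("A", "C", 67)]
summary = summarize_meetings(records)
print(summary[frozenset(("A", "B"))])   # (7, 1): one meeting lasting 7 minutes
```

A pair seen only once gets a duration of 0 under this scheme, which is one reason a duration threshold is needed before such a pair counts as a social tie.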
   Subjects gave us permission to
    collect data
     Friends, wall writings, comments,
      photo tags
   Online interaction is wall
    writing, comment or photo tag
     Count number of interactions
      between user pairs

                        Max   Mean   Standard Dev.
Online interactions     40    2      4
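Counting interactions between user pairs can be sketched as below. The event tuples and the `kind` labels are hypothetical; the slides only state that wall writings, comments, and photo tags are counted per pair.

```python
from collections import Counter

def count_interactions(events):
    """Count online interactions per unordered user pair.

    events: iterable of (actor, target, kind) tuples, where kind is a
    label such as "wall", "comment", or "photo_tag" (illustrative names).
    """
    counts = Counter()
    for actor, target, _kind in events:
        if actor != target:             # ignore self-interactions
            counts[frozenset((actor, target))] += 1
    return counts

events = [("A", "B", "wall"), ("B", "A", "comment"),
          ("A", "C", "photo_tag"), ("C", "C", "comment")]
weights = count_interactions(events)
print(weights[frozenset(("A", "B"))])   # 2
```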
   Motivation
   Data collection
   Social graph representation
   Analysis of global network parameters
   Analysis of local network parameters
   Conclusions

 • OSN: Weight_online = number of interactions
 • CSN: Weight_co-presence = 0.5 × Weight_duration +
                             0.5 × Weight_frequency
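As a small sketch of the co-presence weight, assuming the duration and frequency components have already been brought to comparable scales (the slides do not say how); the function and parameter names are illustrative:

```python
def copresence_weight(w_duration, w_frequency, share=0.5):
    """Equal-share combination of the duration and frequency weights.

    Assumption: both inputs are already on comparable scales.
    share=0.5 reproduces the 0.5 / 0.5 split stated on the slide.
    """
    return share * w_duration + (1 - share) * w_frequency

print(copresence_weight(4.0, 2.0))   # 3.0
```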

   How to remove edges due to passers-by in CSN?
   Find duration & frequency thresholds for adding a CSN edge
     Very short and infrequent co-presence does not indicate the
      presence of a social tie
     Increase thresholds until the edit distance between CSN and OSN
      is minimized
      ▪ Edit distance: number of edge additions/deletions needed to transform
        one graph into the other
      ▪ Keep OSN unchanged because Facebook friendship confirmations validate
        social ties

Total meeting duration threshold:    α = 160 minutes per month
Total meeting frequency threshold:   β = 3 times per month
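The threshold search can be sketched as a grid search that keeps the OSN fixed and picks the (α, β) pair giving the smallest edit distance. This is an illustration under assumptions: the `>=` comparisons, the candidate grids, and the toy data are not from the slides.

```python
def edit_distance(edges_a, edges_b):
    """Number of edge additions/deletions to turn one edge set into the other."""
    return len(edges_a ^ edges_b)       # size of the symmetric difference

def filter_csn(pair_stats, alpha, beta):
    """Keep pairs with total duration >= alpha and frequency >= beta (assumed)."""
    return {pair for pair, (dur, freq) in pair_stats.items()
            if dur >= alpha and freq >= beta}

def best_thresholds(pair_stats, osn_edges, alphas, betas):
    """Return (distance, alpha, beta) with the smallest edit distance to the OSN."""
    best = None
    for alpha in alphas:
        for beta in betas:
            d = edit_distance(filter_csn(pair_stats, alpha, beta), osn_edges)
            if best is None or d < best[0]:
                best = (d, alpha, beta)
    return best

# Toy data: pair -> (total minutes per month, meetings per month).
pair_stats = {
    frozenset("AB"): (300, 5),
    frozenset("AC"): (200, 4),
    frozenset("BD"): (20, 1),
    frozenset("CD"): (170, 2),
}
osn_edges = {frozenset("AB"), frozenset("AC"), frozenset("AD")}
best = best_thresholds(pair_stats, osn_edges, alphas=[0, 80, 160], betas=[1, 3])
print(best)   # (1, 0, 3)
```

Ties go to the first candidate found, so in practice the grid order (or an explicit tie-breaking rule) decides which of several equally good threshold pairs is reported.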
[Figure: the Co-presence Social Network and the Online Social Network combine into the Fused Network (51 shared edges)]
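A minimal sketch of the fusion at the level of edge sets, assuming the fused network is the union of the two source networks (how edge weights are combined is not stated on the slides and is not modeled here). The arithmetic at the end uses the edge counts reported in the global-parameter table: 165 OSN edges, 196 CSN edges, 310 fused edges.

```python
def fuse(osn_edges, csn_edges):
    """Return (fused edge set, shared edge set) for two undirected edge sets."""
    return osn_edges | csn_edges, osn_edges & csn_edges

osn = {frozenset("AB"), frozenset("AC")}
csn = {frozenset("AB"), frozenset("CD")}
fused, shared = fuse(osn, csn)
print(len(fused), len(shared))          # 3 1

# Consistency check on the reported counts: |OSN| + |CSN| - shared = |fused|.
n_osn, n_csn, n_shared, n_fused = 165, 196, 51, 310
assert n_osn + n_csn - n_shared == n_fused

# Edges that only one source contributes to the fused network.
osn_only = n_osn - n_shared             # 114
csn_only = n_csn - n_shared             # 145
extra = round(csn_only / osn_only - 1, 2)
print(extra)                            # 0.27
```

Under this reading, the "27% more edges" figure on the connectivity slide matches the ratio of CSN-only to OSN-only edges.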
   Motivation
   Data collection
   Social graph representation
   Analysis of global network parameters
     Degree, connectivity, centrality, cohesiveness

   Analysis of local network parameters
   Conclusions

                     OSN     CSN     Fused
    Average degree   3.17    3.77    5.96

    Correlation (online, co-presence) = 0.202

       • OSN degree approximately follows a power-law distribution
       • CSN degree does not resemble a power-law distribution as strongly
         as OSN's
            • Due to meetings with familiar strangers
            • Consequently, a similar result is observed for the fused network
       • Most nodes have high degree in either CSN or OSN, but not both
       • 3 nodes have high degree in both CSN and OSN: these 3 nodes are
         social butterflies
       • Increased average degree means people meet different sets of
         contacts in the two source networks
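The degree figures can be illustrated with a short sketch. The average degree of an undirected graph is 2E/N, which reproduces the reported values for 104 nodes. The correlation is assumed here to be a Pearson correlation between per-node degrees in the two networks; the slides do not name the measure, and the toy graphs are invented.

```python
from math import sqrt

def average_degree(n_edges, n_nodes):
    """Average degree of an undirected graph: each edge adds 2 to the degree sum."""
    return 2 * n_edges / n_nodes

def degrees(nodes, edges):
    """Degree of every node, in the order given by nodes."""
    deg = {n: 0 for n in nodes}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return [deg[n] for n in nodes]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print([round(average_degree(e, 104), 2) for e in (165, 196, 310)])
# [3.17, 3.77, 5.96]

# Toy example: a node that is a hub online but absent from co-presence.
nodes = ["A", "B", "C", "D"]
osn_deg = degrees(nodes, [("A", "B"), ("A", "C"), ("A", "D")])
csn_deg = degrees(nodes, [("B", "C"), ("C", "D"), ("B", "D")])
r = pearson(osn_deg, csn_deg)
```

In the toy example the correlation is strongly negative because the online hub has no co-presence ties at all; the reported 0.202 indicates a weak positive relationship instead.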
                                   OSN     CSN     Fused   Weighted

Number of edges                    165     196     310     N
Size of LCC                        63      84      98      N
(largest connected component)
Diameter of LCC                    7       8       7       N
Average length of shortest path    12.3    21.98   8.77    Y

           • CSN contributes 27% more edges than OSN
           • Compared to OSN, CSN has 55% more connected people
           • Almost all people connected in fused network
           • Average weighted shortest path reduced in fused
               • Stronger social connectivity: reason to leverage it in social apps
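The two unweighted connectivity measures, size and diameter of the largest connected component, can be sketched with breadth-first search. This is an illustrative implementation on a toy graph, not the tooling used for the study; nodes without any edge do not appear in the adjacency map.

```python
from collections import defaultdict, deque

def adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def bfs_distances(adj, source):
    """Hop distance from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def largest_component(adj):
    """Node set of the largest connected component."""
    seen, best = set(), set()
    for node in list(adj):
        if node not in seen:
            comp = set(bfs_distances(adj, node))
            seen |= comp
            if len(comp) > len(best):
                best = comp
    return best

def diameter(adj, component):
    """Longest shortest path (in hops) between two nodes of the component."""
    return max(max(bfs_distances(adj, n).values()) for n in component)

adj = adjacency([("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")])
lcc = largest_component(adj)
print(len(lcc), diameter(adj, lcc))     # 4 3
```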
                                          OSN      CSN      Fused    Weighted

Average weighted betweenness              49.1     90.13    94.83    Y
Average length of shortest path           12.3     21.98    8.77     Y
Average edge weight                       3.02     3.64     1.95     Y
Average weighted clustering coefficient   0.156    0.122    0.157    Y

 • CSN has much longer average shortest path than OSN
     • Hence, average betweenness is high
 • In fused network, average shortest path is low, but betweenness is highest
     • Social centrality is improved
 • Average edge weight shows that people interact more in real life than online
 • Highly socially active person online is not necessarily highly socially
   active in real life
     • Thus, smaller values in fused network
 • OSN has higher cohesiveness
     • People become friends when sharing common friends
 • OSN contributes more to fused
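Cohesiveness is reported as an average weighted clustering value. As a simplified sketch, the unweighted local clustering coefficient below measures how often two contacts of a person are themselves connected, which is the "friends sharing common friends" effect. The weighted variant used for the table is not specified on the slides, so this is a stand-in, not the reported measure.

```python
from itertools import combinations

def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are directly connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))

def average_clustering(adj):
    """Mean local clustering coefficient over all nodes in adj."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)

# Triangle A-B-C with one extra contact D attached to C.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
avg = average_clustering(adj)
```

Here A and B each score 1.0, C scores 1/3, and D scores 0, so the average is 7/12.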
   Motivation
   Data collection
   Social graph representation
   Analysis of global network parameters
   Analysis of local network parameters
     Node, edge, community

   Conclusions
   Calculate Euclidean distance of the degree vector (104
    nodes) and shared edge weight vector (51 edges)
     Similarity is inverse of distance

                  Distance(OSN, CSN)   Distance(OSN, fused)   Distance(CSN, fused)

Weighted node     0.558                0.306                  0.256
Node degree       0.399                0.305                  0.225
Edge weight       0.560                0.324                  0.295

   CSN is more similar to the fused network than OSN is (smaller distance on all three measures)
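A sketch of the vector comparison, assuming each vector is scaled to unit length before the Euclidean distance is taken. The slides do not state the normalization, so `l2_normalize` is an assumption chosen to keep distances comparable across networks with different average degrees.

```python
from math import sqrt

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (assumed normalization)."""
    norm = sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def network_distance(vec_a, vec_b):
    """Distance between two per-node (or per-edge) vectors after normalization."""
    return euclidean(l2_normalize(vec_a), l2_normalize(vec_b))

# Degree vectors over the same four nodes in two hypothetical networks.
osn_deg = [3, 1, 1, 1]
csn_deg = [0, 2, 2, 2]
d = network_distance(osn_deg, csn_deg)
```

With this normalization, two networks whose degree vectors are proportional have distance 0, and the distance between non-negative vectors never exceeds the square root of 2.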
   How to quantify community similarity across networks?
     Few communities are the same
     Better to quantify community overlap
 • Compute k-clique overlapping clusters on the three
   networks separately
 • Use community overlap matrix to compute distance
   between networks (inverse of similarity)

                                                     K=3    K=4   K=5
                                  Dist(OSN, fused)   2561   142   26.5

                                  Dist(CSN, fused)   2289   135   32.0
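The community overlap matrix can be sketched as below, assuming each entry counts the members that a community in one network shares with a community in the other. How the matrix is turned into the single distance values in the table is not stated on the slides, so only the matrix itself is shown; the community lists are invented.

```python
def overlap_matrix(communities_a, communities_b):
    """Entry [i][j] = number of nodes shared by community i of A and j of B."""
    return [[len(a & b) for b in communities_b] for a in communities_a]

# Hypothetical k-clique communities found in two networks.
osn_comms = [{"A", "B", "C"}, {"C", "D", "E"}]
fused_comms = [{"A", "B", "C", "D"}, {"D", "E", "F"}]
matrix = overlap_matrix(osn_comms, fused_comms)
print(matrix)   # [[3, 0], [2, 2]]
```

Because k-clique communities may overlap, a node such as C or D can contribute to more than one row or column.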

 • Fused network has larger average community size than OSN and
   CSN (fused=6.1, CSN=4.9, OSN=5.2)
 • CSN is closer to the fused network for weaker communities (k=3,4)
 • OSN is closer to the fused network for stronger communities (k=5)
 • OSN contributes stronger social communities than CSN
 • CSN and OSN represent two different classes of social ties
 • Applications may benefit from a fused network that
   merges CSN and OSN
     CSN increases the fused network connectivity and
      communication strength
     OSN strengthens the community structure and lowers the
      average path length of fused network
     Typical example is friend-of-friend apps

   Decentralized two-tier infrastructure
    for mobile social computing
   P2P tier
     Collects on-line social information
     Manages social state
     Runs user-deployed services to support
      mobile apps
     Dynamically adapts to geo-social context
       ▪ Energy-efficiency, scalability, reliability
   Mobile tier
     Runs mobile applications
     Collects geo-social information from
      phones
   Application scenario: community multimedia sharing system
Acknowledgment: NSF Grant CNS-0831753

   Kostakos[2010]
       The networks are very sparse
       Co-presence social ties are based on only one meeting
       Does not consider user interaction (edge weight)
       There is no proper noise reduction
   Eagle[2009], Cranshaw[2010]
     Focused on using co-presence data to predict friendship
   Mtibaa[2008]
     Concludes that the two graphs are similar
     Data comes from a conference over a single day
     These results cannot be generalized
 • Node degrees in real-world large-scale social
   networks often follow a power-law distribution
 • A few nodes have high degree, while most nodes
   have low degree

