Measurement-driven Modeling and Design of Internet

					OSN Research As If Sociology Mattered

          Krishna P. Gummadi

     Networked Systems Research Group
                 OSN research today

• Computational sociology: A natural sciences approach
   – Gather and analyze OSN data to study problems in sociology
   – Sociologists today use pretty sophisticated computing tools

• Social computing: An engineering approach
   – Build systems that support / leverage human social interactions
   – But, we tend to treat human behavior as annoying noise
      • rather than leverage insights from sociology
                         This talk

• Argues that insights from sociology can help design
  better systems

• Example 1: Dunbar’s number
   – The case for decentralized content sharing in OSNs

• Example 2: Group attachment theory
   – How social network-based Sybil defenses do or don’t work
           Example 1: Dunbar’s number

• Limits the # of stable social relationships a user can have
   – To fewer than a couple of hundred
   – Linked to the size of the neocortex region of the brain
   – Observed throughout history since hunter-gatherer societies

• Also observed repeatedly in studies of OSN user activity
   – Users might have a large number of contacts
   – But they regularly interact with fewer than a couple of hundred of them
  User generated content sharing over OSNs

• A very popular activity over Facebook
   – UGC like pictures, videos, and wall posts

• Facebook is building massive datacenters to support UGC
   – Uses Akamai to deliver it

• But, most of Facebook’s UGC is of personal nature
   – Pictures and videos of family and social events

• Content popularity would be limited by Dunbar’s number!

• Do we really need datacenters & CDNs to share this UGC?
  Why not share personal UGC from homes?

• Advantage: Regain control over personal data sharing
   – Better control over what you share & with whom you share it

• Concerns:
   – Can we get good performance?
      • Yes, due to Dunbar’s limit on popularity

   – Can we get good availability?
      • Yes, using always-on and always-connected gateways
      • They are inexpensive and low-power
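
A back-of-the-envelope sketch of the performance argument above. The audience cap of 150, the photo size, and the uplink rate are illustrative assumptions, not figures from the talk; the point is only that a Dunbar-bounded audience keeps the total upload per item small.

```python
# Rough estimate: uplink time needed to serve one item to a bounded audience.
# All constants below are assumed values chosen for illustration.

def home_upload_seconds(item_bytes, audience, uplink_bits_per_s):
    """Seconds of uplink time to send one item to every viewer once."""
    total_bits = item_bytes * 8 * audience
    return total_bits / uplink_bits_per_s

DUNBAR_AUDIENCE = 150      # assumed cap, "fewer than a couple of hundred"
PHOTO_BYTES = 200_000      # assumed size of one shared photo
UPLINK_BPS = 1_000_000     # assumed 1 Mbit/s home uplink

seconds = home_upload_seconds(PHOTO_BYTES, DUNBAR_AUDIENCE, UPLINK_BPS)
print(f"Uplink time to reach the whole audience: {seconds:.0f} s")
```

Under these assumptions the whole audience is served with a few minutes of uplink time spread over however long the friends take to view the item. A different photo size or link speed changes the number, not the shape of the argument.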
    Example 2: Group attachment theory

• Explains how humans join and relate to groups

• Common-bond based groups
   – Membership based on inter-personal ties, e.g., family or kinship
   – Necessarily small, but tightly-knit and cohesive

• Common-identity based groups
   – Membership based on self-interest or shared interest
   – Could be larger, but become less cohesive with scale
             OSN graphs and groups

• Most OSN graphs include all manner of links

• Can extract bond groups from graph structure
   – By looking for highly clustered communities of nodes

• But, not identity groups
   – Loosely-knit, they merge into the rest of the network

• Result: A size limit on detectable graph communities
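
One simple way to look for "highly clustered" nodes is the local clustering coefficient: the fraction of a node's neighbor pairs that are themselves linked. The sketch below uses a hypothetical toy graph (a tight four-node clique standing in for a bond group, plus a hub with unconnected followers standing in for an identity group). It illustrates the idea only and is not a method taken from the talk.

```python
from itertools import combinations

def add_edge(adj, u, v):
    """Add an undirected edge to an adjacency dict of sets."""
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def clustering(adj, node):
    """Fraction of the node's neighbor pairs that are directly linked."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    linked = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2 * linked / (k * (k - 1))

adj = {}
# Toy bond group: everyone knows everyone.
for u, v in combinations(["a", "b", "c", "d"], 2):
    add_edge(adj, u, v)
# Toy identity group: members link to a hub but not to each other.
for member in ["e", "f", "g", "a"]:
    add_edge(adj, "h", member)

for node in sorted(adj):
    print(node, round(clustering(adj, node), 2))
```

Nodes inside the clique score high, while the hub and its followers score zero, which mirrors the claim that loosely knit identity groups blend into the rest of the graph.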
                            Sybil attack
     • A fundamental problem in distributed systems
     • Attacker creates many fake/sybil identities
     • Many real-world attacks, e.g., on Digg and YouTube

Automated Sybil attack on YouTube for $147!
         Defending against Sybil attacks

• Traditional solutions rely on central trusted authorities
   – Runs counter to open membership policies of OSNs
• Recent proposals leverage social networks
   – Key Insight: Social links are hard to acquire in abundance
   – Look for small cuts in the graph
   – Conversely, look for communities around known trusted nodes
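
The "small cut" and "community around a trusted node" ideas can be sketched with a generic greedy heuristic: grow a set from the trusted node, always adding the neighbor that gives the lowest conductance (cut edges divided by the smaller side's total degree), and stop when conductance no longer improves. This is not the algorithm of any of the published schemes; the toy graph, node names, and function names are all hypothetical.

```python
from itertools import combinations

def add_edge(adj, u, v):
    """Add an undirected edge to an adjacency dict of sets."""
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def volume(adj, group):
    """Sum of degrees of the nodes in the group."""
    return sum(len(adj[n]) for n in group)

def cut_size(adj, group):
    """Number of edges with exactly one endpoint in the group."""
    return sum(1 for n in group for m in adj[n] if m not in group)

def conductance(adj, group):
    rest = set(adj) - group
    denom = min(volume(adj, group), volume(adj, rest))
    return cut_size(adj, group) / denom if denom else 1.0

def community_around(adj, trusted):
    """Greedily grow a low-conductance set starting at the trusted node."""
    group = {trusted}
    best = conductance(adj, group)
    while True:
        frontier = sorted({m for n in group for m in adj[n]} - group)
        if not frontier:
            return group
        score, node = min((conductance(adj, group | {c}), c) for c in frontier)
        if score >= best:
            return group
        group.add(node)
        best = score

adj = {}
honest = ["h0", "h1", "h2", "h3"]
sybils = ["s0", "s1", "s2", "s3"]
for region in (honest, sybils):
    for u, v in combinations(region, 2):
        add_edge(adj, u, v)
add_edge(adj, "h3", "s0")  # the single attack edge

print(sorted(community_around(adj, "h0")))
```

On this toy graph the growth stops at the single attack edge, because crossing it would raise the conductance. Real graphs are far less clean, so the sketch should be read as an illustration of the intuition rather than a workable defense.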

[Figure label: links difficult to create]
          Lots of research activity recently

All schemes analyse the graph structure to isolate Sybils

SybilGuard [SIGCOMM’06]
SybilLimit [Oakland’08]
Ostra [NSDI’08]
SumUp [NSDI’09]
SybilInfer [NDSS’09]
Whanau [NSDI’10]
MobID [INFOCOM’10]

• Each optimized under assumptions about the graph structure
   – E.g., graphs are fast-mixing
• Each evaluated on different datasets
• Comparative evaluations yield inconsistent results
Sybil resilience & group attachment theory

•   Sybil schemes find bond groups around a trusted node
•   But, these are only a fraction of all honest nodes
•   Bond groups are hard for Sybils to infiltrate
•   Not the case with identity groups

• Graph structure can identify nodes that are non-Sybils

• But, it cannot identify nodes that are Sybils

• Most nodes cannot be classified into either category

• Does this imply Sybil schemes are useless?
   – No, they can be used conservatively to find content from
     people you trust

• OSN system designers should look to leverage
  insights from sociology

• Presented two examples where some very basic
  knowledge of sociology proved useful

• Lots more ways to leverage sociology in the future
   – Can we leverage strength of ties to set privacy policies or
     prioritize updates from friends?
