Diffusion in Online Product Networks

THE QUEST FOR CONTENT:
The Integration of Product and Social
Networks in UGC Environments

              Gal Oestreicher-Singer
                     Tel Aviv University


                       December, 2009

                       Joint Work with
Jacob Goldenberg (Hebrew University and Columbia University)
         and Shachar Reichman (Tel Aviv University)
Motivation
UGC is huge …
• 83.4 million videos on YouTube.com
• Over 5000 pictures uploaded to Flickr.com every
  minute
• Technorati currently states it is tracking over 112.8
  million blogs in the blogosphere
      How do people find what to consume?
  How will they find it, if they often don’t know
            what they are looking for?
  What kind of structure leads to a faster path to
                 “good” content?
When consumption goes online…
A product network is created
Very limited research on product networks

 • Links among content pages (Mayzlin and
   Yoganarasimhan, 2008; Dellarocas, Rand
   and Katona, 2009)

 • The influence of “product”
   recommendations on sales in the Amazon
   product network (Oestreicher-Singer and
   Sundararajan, 2008, 2009)
 When consumption goes online…

 The dual-network
   • Social network
   • Product network
The YouTube Dual-Network - Product Node

• Related Videos
The YouTube Dual-Network - User Node

• Subscribers
• Friends
The YouTube Dual-Network - User Node

• Owner
• Own Videos
• Favorites
• Comments
  The YouTube Dual-Network

• Nodes:
  – Videos
  – People (Users)


• Edges:
  – Video network: Related Videos
  – Social network: Contacts
  – Interlinked Connections:
     • Owner
     • Favorites
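
The node and edge types listed above can be sketched as a small data structure. This is only an illustration under stated assumptions, not the authors' implementation: the class name `DualNetwork`, its methods, the edge-type strings, and the sample video and user IDs are all hypothetical.

```python
from collections import defaultdict


class DualNetwork:
    """Toy dual network: video and user nodes joined by typed, directed edges.

    Hypothetical illustration; the edge types mirror the slide
    (related videos, contacts, owner, favorites).
    """

    EDGE_TYPES = frozenset({"related", "contact", "owner", "favorite"})

    def __init__(self):
        self.node_kind = {}             # node id -> "video" or "user"
        self.edges = defaultdict(set)   # (source, edge type) -> set of targets

    def add_node(self, node, kind):
        if kind not in ("video", "user"):
            raise ValueError(f"unknown node kind: {kind}")
        self.node_kind[node] = kind

    def add_edge(self, source, target, edge_type):
        if edge_type not in self.EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        if source not in self.node_kind or target not in self.node_kind:
            raise KeyError("add both endpoints as nodes first")
        self.edges[(source, edge_type)].add(target)

    def neighbors(self, node, edge_types=None):
        """Pages one click away, optionally restricted to some edge types."""
        types = self.EDGE_TYPES if edge_types is None else edge_types
        result = set()
        for edge_type in types:
            result |= self.edges.get((node, edge_type), set())
        return result


net = DualNetwork()
for video in ("v1", "v2", "v3"):
    net.add_node(video, "video")
for user in ("alice", "bob"):
    net.add_node(user, "user")

net.add_edge("v1", "v2", "related")      # product network
net.add_edge("alice", "bob", "contact")  # social network
net.add_edge("v1", "alice", "owner")     # video page links to its owner
net.add_edge("alice", "v1", "owner")     # user page lists own videos
net.add_edge("bob", "v3", "favorite")    # user page lists favorites

product_only = net.neighbors("v1", {"related"})
dual = net.neighbors("v1")
```

In this toy example the product network alone offers only `v2` from `v1`, while the dual network also exposes the owner's page, from which `v3` is reachable through a contact's favorites.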
   The Dual-Network

• Practically all UGC sites
  have a product network
• Some of them also have
  social networks (e.g.,
  Flickr.com, Digg.com
  and YouTube.com)


 Why would you
 add a social
 network to a UGC
 website?
One example

Some background:
  2009 MTV Video Music Awards - Kanye West and Taylor Swift


A new clip was uploaded to YouTube mocking the
  event
  – Over 3M views in two weeks
    (one of the most viewed videos of the week)




       Source: http://it.themarker.com/tmit/article/8302
• The first video (~3M views): sources of traffic
   – Intrinsic demand
   – Traffic from a video's page (product network)
   – Traffic from a personal page (blogs, Facebook)
   – Traffic from YouTube users’ pages (social network)

• The second video (~30K views)
   – 59% of traffic from exploration within YouTube
Ill-Defined Exploration

• When a consumer explores a space of options,
  with no specific pre-defined target in mind
• The process continues until the consumer
  finds an object that matches her taste, or until
  search costs reach a specific level
• Examples: Shopping for a gift; Looking for
  something interesting to watch on TV
• Online exploration is different
   – Different option space
   – Different structure
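
The stopping rule described above (continue until a match is found or the search cost reaches a limit) can be sketched as a walk over a link graph. This is a minimal illustration with assumptions that are not in the slides: the function name `explore`, uniform random clicks, and a cost of one unit per click are all hypothetical.

```python
import random


def explore(graph, start, is_match, max_cost, rng):
    """Walk the graph until a page matches the user's taste or the budget runs out.

    graph: dict mapping each page to the list of pages it links to.
    is_match: function deciding whether a page satisfies the user.
    max_cost: search-cost limit, counted here as one unit per click (assumption).
    Returns (matching page or None, clicks spent).
    """
    current = start
    cost = 0
    while True:
        if is_match(current):
            return current, cost
        links = graph.get(current, [])
        if cost >= max_cost or not links:
            return None, cost           # gave up: budget exhausted or dead end
        current = rng.choice(links)     # assumption: the next click is uniform at random
        cost += 1


# Tiny deterministic example: a chain a -> b -> c.
chain = {"a": ["b"], "b": ["c"], "c": []}
found = explore(chain, "a", lambda page: page == "c", max_cost=5, rng=random.Random(0))
gave_up = explore(chain, "a", lambda page: page == "c", max_cost=1, rng=random.Random(0))
```

With a budget of 5 the walk reaches `c` after two clicks; with a budget of 1 it stops at `b` without a match.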
Research Questions

Do dual networks facilitate ill-defined exploration
               in UGC sites? How?

• Which connections matter more – product or social?

• Do user pages have a special role in ill-defined
  exploration?

• How does the dual network compare to alternatives
  (such as co-consumption, sponsored) in facilitating
  ill-defined exploration? Which kind of network leads to
  a faster path to “good” content? To better eventual
  satisfaction among users?
Summary of approach

• Studying the YouTube dual network structure
   – The largest available dual network
   – Using data about ~700,000 videos (with 9M links)
     and 50,000 users (with 120,000 links)
   – Study properties of the product nodes and the user
     nodes
• Network simulation analysis
   – Comparing dual network, co-consumption and
     sponsored network structures
• Internet experiment
   – Following ~400 (and counting…) users of a
     YouTube-like website
   – Studying what kind of networks lead to a faster path
     to “good” content and overall higher satisfaction
Summary of data

YouTube.com

• The largest UGC site (Product network)
  – 84.3 million videos
  – 30 million registered users
  – 13 hours of video uploaded every minute
  – 100 million videos watched per day

• Active social network
• We collected data about ~700,000 videos (with
  9M links) and 50,000 users (with 120,000 links)
The Combined Network
How important are the social nodes?

• Closeness Centrality
  The average shortest-path length between a node v and all
  other nodes in the network.
• Betweenness Centrality
  A measure of the number of times a node lies on the
  shortest paths between any two products in the
  network

  \[
  BC(v) = \sum_{\substack{s \neq v \neq t \in V \\ s \neq t}} \frac{n_{st}(v)}{n_{st}}
  \]

  n_{st} - the number of shortest paths from s to t
  n_{st}(v) - the number of shortest paths from s to t that pass
  through a node v
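
Both indices can be computed directly from their definitions on a small graph. The sketch below is a brute-force illustration, not the method used in the study: it runs one breadth-first search per node, which is only practical for toy graphs, and the function names and example graphs are hypothetical.

```python
from collections import deque


def bfs_counts(graph, source):
    """Return (dist, sigma): shortest-path lengths and shortest-path counts from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]    # every shortest path to u extends to w
    return dist, sigma


def closeness(graph, v):
    """Average shortest-path length from v to the other nodes it can reach."""
    dist, _ = bfs_counts(graph, v)
    lengths = [d for node, d in dist.items() if node != v]
    return sum(lengths) / len(lengths) if lengths else float("inf")


def betweenness(graph, v):
    """Sum over ordered pairs (s, t), s != v != t, s != t, of n_st(v) / n_st."""
    nodes = set(graph) | {w for targets in graph.values() for w in targets}
    info = {s: bfs_counts(graph, s) for s in nodes}
    dist_v, sigma_v = info[v]
    total = 0.0
    for s in nodes - {v}:
        dist_s, sigma_s = info[s]
        if v not in dist_s:
            continue
        for t in nodes - {s, v}:
            if t not in dist_s or t not in dist_v:
                continue
            # v lies on a shortest s-t path exactly when the distances add up
            if dist_s[v] + dist_v[t] == dist_s[t]:
                total += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return total


# Path a - b - c, stored as symmetric directed links.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
# Diamond: two equally short routes from s to t.
diamond = {"s": ["x", "y"], "x": ["t"], "y": ["t"], "t": []}
```

On the path graph, `b` sits on every shortest path between `a` and `c`, so it is the broker; on the diamond, `x` carries half of the shortest paths from `s` to `t`.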
  The Dual-Network Indices

                          Video Nodes     User Nodes       Ratio
                             Average         Average    Users/Videos
 Closeness                6.25           6.75               1.08
 Betweenness              3.89E+06       12.8E+06           3.29
 InDegree                 14.25          3.97               0.28
 OutDegree                14.50          4.49               0.31
 PageRank                 1.91E-06       1.38E-06           0.72
 Clustering Coefficient   0.20           0.10               0.50




Insight 1:
In the dual network, user pages act as
brokers in the product network
Is this simply a network with more edges?


   Could we get the same results without the
         investment in a social network?

We compare 3 different networks:

• Product network (co-consumption network)
  The original video network

• Dual network (product and social networks)
  the dual social-product network

• Product network with random nodes added
  (product and sponsored network)
  additional random nodes with random links
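
The third comparison network can be sketched as follows. This is a hypothetical construction, since the slides do not say how the random nodes were wired: the function name `add_random_nodes`, the number of links per added node, and the choice to give each new node both outgoing and incoming links are all assumptions.

```python
import random


def add_random_nodes(graph, n_new, links_per_node, rng):
    """Return a copy of graph with n_new extra nodes wired at random.

    graph: dict mapping every node to the list of nodes it links to.
    Each new node gets links_per_node outgoing links to random original nodes
    and links_per_node incoming links from random original nodes (assumption).
    """
    original = sorted(graph)
    augmented = {node: list(targets) for node, targets in graph.items()}
    for i in range(n_new):
        new_node = f"random_{i}"
        augmented[new_node] = rng.sample(original, links_per_node)
        for source in rng.sample(original, links_per_node):
            augmented[source].append(new_node)
    return augmented


product = {"v1": ["v2"], "v2": ["v3"], "v3": ["v1"]}
with_random = add_random_nodes(product, n_new=2, links_per_node=2, rng=random.Random(0))
edge_count = sum(len(targets) for targets in with_random.values())
```

Comparing indices such as average distance or betweenness between `product`, a dual network, and `with_random` then mirrors the three-way comparison described above.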
Different Networks Indices

                    Product      Social       Product network
                    network      network      with random links
Nodes               688760       688760       688760
Edges               9712812      8533413      9688009
Diameter            14           20           14
Average distance    6.46         6.53         6.07
Betweenness         4.14E+06     13.4E+06     7.73E+06
Closeness           6.4602       6.5214       6.0640


Insight 2:
Adding random nodes reduces distances, but social
nodes are better brokers
Dual networks and ill-defined exploration




User pages are central to the network…

  But do they influence consumption?
The challenge


• We need an established dual
  network with a massive amount of content
• We need clickstream data (we
  need to know when the exploration ends
  successfully)


 The solution:
  Create our own YouTube-based website
Exploration scenarios from our website
The YouTube-based website


• Different exploration scenarios:
  1. Product network only (related videos)

  2. Dual network (product and social networks)

  3. Product network and random links
Dual networks and ill-defined exploration


The variable of interest:

How fast does an ill-defined search
end with a desirable outcome?


   desirable =       OR
Dual networks and ill-defined exploration

 Kaplan-Meier Survival function




Insight 3:
Users presented with a dual network find
“good” content faster
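
For readers unfamiliar with the Kaplan-Meier estimator, the sketch below computes the product-limit survival curve from scratch. It is an illustration only: the function name and the sample sessions are made up, and it assumes the "event" is finding a desirable video, so the curve gives the estimated share of users still searching after a given number of clicks (a lower curve means a faster path to "good" content).

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate: S(t) is the product over event times t_i <= t of (1 - d_i / n_i).

    durations: clicks until each session ended.
    observed: True if the session ended with the event (a desirable video found),
              False if it was censored (the user left without finding one).
    Returns a list of (t, S(t)) at every time where at least one event occurred.
    """
    pairs = sorted(zip(durations, observed))
    at_risk = len(pairs)        # n_i: sessions still running just before time t
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        events = 0              # d_i: events at exactly time t
        leaving = 0             # events plus censored sessions at time t
        while i < len(pairs) and pairs[i][0] == t:
            events += 1 if pairs[i][1] else 0
            leaving += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Made-up sessions: clicks spent, and whether a desirable video was found.
clicks = [1, 2, 2, 3, 4]
found = [True, True, False, True, False]
curve = kaplan_meier(clicks, found)
```

Computing one such curve per scenario (related, social, random) and comparing them is the kind of analysis the slide title refers to.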
  Dual networks and ill-defined exploration

    Kaplan-Meier Survival function




          B
social
related   -.532**
          (0.24)

random    -.451**
          (0.23)
 Dual networks and ill-defined exploration

[Chart: overall satisfaction, axis from 3.00 to 4.60; series: Related, Random, Social]




Insight 4:
The overall satisfaction of users presented
with a dual network is higher
Dual networks and ill-defined exploration




User pages are central to the network…

   and they influence consumption …

                   How?
  Exploration mileage
• A “related” video is one step away

• A “random” video is 6 clicks away
  (on average)

How far are the “social”
recommendations?

[Figure: Path Length Distribution. Six degrees of separation…]
 Exploration mileage


• A “related” video is one step away (on average)

• A “random” video is 6 steps away (on average)

• A “social” video is 3 steps away (on average)


Is it simply the distance or is there more?
Dual networks and ill-defined exploration




Insight 5:
Social steps are more effective than
random steps of the same distance
Conclusion

• The social network is important for the success of
  the product network
   – User pages have unique structural characteristics.
   – Specifically, they serve as brokers in the dual
     network, thus facilitating ill-defined exploration.

• When presented with a social network, consumers
  find a product they are interested in faster
  (compared to co-consumption networks and
  sponsored networks)
• Should more electronic commerce websites
  consider facilitating social networking on their
  sites?
Questions & Suggestions?

Gal Oestreicher-Singer
GalOS@post.tau.ac.il
   The 2 Networks Indices

The Videos Network
# of nodes     700K
# of edges     9M
                        Average   SD      Min   Max
InDegree                14.1      51.56   1     10311
OutDegree               14.38     10.06   1     100
Density                 4.1*10^-5
Clustering Coefficient  0.20
Average distance        6.4

The Social Network
# of nodes     50K
# of edges     120K
                        Average   SD      Min   Max
InDegree                2.36      8.62    1     104
OutDegree               2.36      8.62    1     104
Density                 2.4*10^-4
Clustering Coefficient  0.05
   The 2 Networks Indices

[Figure: Path Length Distribution for the Videos Network (average distance 6.4). Six degrees of separation…]

				