An Introduction To Social Network Data by DavidWalker18


									            An Introduction To
           Social Network Data
                      David M Walker
               Data Management & Warehousing
                         May 2012

May 2012
                           S1           © 2012 Data Management & Warehousing
                       Hi, I’m on Facebook!

   S  I’m one of 900 Million people as of May 2012 that has a
           Facebook account

   S  That’s more than 1 in 8 of every man, woman and child on the
           planet (and the 6 crew of the International Space Station)
           regardless of age, race, religion, location, sexuality, etc.

   S  I’ve also completed my profile – it helps my family & friends find
           and communicate with me
           S  It even reminds people to wish me ‘Happy Birthday’

           My Profile Page

                       But what am I sharing?

   S  Depending on my privacy settings I will be sharing anything
           from ‘some data’ to ‘everything about my life’

   S  You can edit your privacy settings here:

   S  Remember:
       S  Todays ‘friends’ may not be tomorrows friends
       S  Sharing with family, school/work colleagues can have
           unexpected consequences

                          How is this data used?

   S  Developers use this data to ‘profile’ people

   S  This is both free to use and easy to do

   S  Uses an Application Programming Interface (API) based on
           a URL
           S  Jargon for ‘just connect to the website with the right options’

   S  Try it:

                                      George H Takei

   S      Helmsman Sulu in Star Trek (The Original Series)

   S      Gay Rights and Japanese American Internment Activist

   S      Popular Facebook Page (1,962,290 likes) and secured

   S      Basic Info

   S      Photographs

                             George H Takei’s
                             photo and its data

                 George Takei posted this photograph

   API Output (Snippet):

   {
       "id": "373438362685623_1722672",
       "from": {
           "name": "Trevor Mullins",
           "id": "1024732813"
       },
       "message": "This. So much this.",
       "created_time": "2012-02-09T03:43:56+0000",
       "likes": 3
   }

   It Tells Me:

   S    Trevor Mullins was one of several hundred people who commented on this photo

   S    He did so at 03:43:56 GMT on 9th Feb 2012

   S    Which 3 people liked the comment

   S    And from his profile: his username is Ertrov, he describes himself as
        “Agnostic-atheist/Anti-theist”, is male, likes Sci-Fi, and is affiliated to
        Sinclair Community College, Ohio and many, many more things
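
As a rough illustration of how little work it takes to read this output, the sketch below parses the snippet shown on the slide with Python's standard library. The variable names are assumptions; only the JSON itself comes from the slide.

```python
import json
from datetime import datetime

# The API output snippet from the slide, as a JSON string.
comment_json = """
{
  "id": "373438362685623_1722672",
  "from": {"name": "Trevor Mullins", "id": "1024732813"},
  "message": "This. So much this.",
  "created_time": "2012-02-09T03:43:56+0000",
  "likes": 3
}
"""

comment = json.loads(comment_json)

# The timestamp carries its own UTC offset (+0000).
posted_at = datetime.strptime(comment["created_time"], "%Y-%m-%dT%H:%M:%S%z")

summary = (
    f'{comment["from"]["name"]} commented at '
    f'{posted_at:%H:%M:%S} UTC on {posted_at:%Y-%m-%d}; '
    f'{comment["likes"]} likes'
)
print(summary)
```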
                               Back to me –
                             My profile contains:

   id:                   Facebook's unique reference number for me
   name:                 My Full Name
   username:             My Username
   birthday:             My Date of Birth
   hometown:             Where I was born
   location:             Where I live now
   employer:             Who I work for
   employer:             Who I used to work for
   projects:             Which projects I worked on for that employer
   sports:               Which sports I like
   favorite_teams:       Who are my favourite teams
   education:            Where I went to school
   year:                 And when I left
   type:                 And what type of school it was
   gender:               My Gender
   relationship_status:  Am I married?
   email:                My private email
   website:              My website
   timezone:             My timezone
   locale:               What language I read Facebook in
   languages:            What languages I speak
   verified:             Have I verified my email address
   updated_time:         When did I last update my profile
   type:                 What type of user account do I have

   These are just some of the fields I could populate and developers could access
                                             I like …

   S  If I ‘like’ a product or brand on Facebook then the owner of that
           brand can use the developers interface to get information about
           me and others who ‘like’ their product

   S  For example the developer can get the age, marital status, gender,
           sexual preference (‘interested in’) and location of the ‘likers’

   S  The developer can then look for groups of people who share the
           same characteristics (e.g. 18-25, single, female, straight, Liverpool)
           S  This is called Cluster Analysis – looking for groups of similar

                           This data is valuable:
                            Very Very Valuable

   S  Once the developer has identified a ‘cluster’ of people they can
           ask Facebook to advertise to others who don’t yet ‘like’ the
           product but share the same characteristics as those that do

   S  For example, based on our previous cluster, a nightclub may want
           to target adverts to similar people in their area

   S  Facebook makes this very easy to do, you just go here:


           Very precise targeting –
           know exactly who is going to see your advert

                        Target audiences
                        using their
                        stated preferences

           Very low cost –
           Know exactly how much you are going to spend
           From an advertiser’s point of view this is very cost effective
           For Facebook, done at scale, it is very, very profitable

                   Dealing with the data

   S  We can look at individuals manually

   S  We can deal with ‘small’ data sets with a spread sheet
       S  50,000 rows i.e. 50,000 individuals
       S  250 columns i.e. 250 different characteristics

   S  We can deal with ‘larger’ data sets with statistical tools
       S  There are commercial and open source tool to do the stats
       S  For example: ‘R’ is free and provide direct access to the
           Facebook API and functions to do complex cluster analysis

                           Advanced Techniques

   S  Exploiting the social network
           S  Which of my ‘likers’ know each other?
           S  Is it possible to identify an individual in the group who is the
           S  Can the ring-leader be influenced towards my offering/product
           S  Can the ring-leader influence others to follow them?

                                        My Social Network

   [Figure: diagram of my social network, with a detail panel of friends who know each other
    (initials only for confidentiality). Annotations on the diagram:]

       S  Small groups of friends that don’t know each other
       S  A group that all worked on a project together
       S  A group of friends who I watch rugby with
       S  A tight knit group of friends from where I used to work
                             Sentiment Analysis

   S  Analyse peoples comments and use this to change your interaction
           with the you customer
   S  Use feedback (positive and negative) to respond to customers –
           remember you are looking for the main affect, you will always
           have people who have a minority opinion

   S  Simple Examples
           S  “Don’t like the new flavour”
           S  “Wish the new website had a help button”

   S  There are plenty of more sophisticated examples


   S  Facebook also allows users to develop Applications
       S  Socialcam (54M users), Cityville (35M users)
       S  Texas HoldEm (35M users), DrawSomething (29M users)

   S  Allows users to buy virtual tokens with real money
       S  This in itself is a revenue generating stream

   S  Allows developers to place very targeted adverts
       S  Revenue derived from selling targeted marketing

   S  Allows developers to monitor social interactions for new trends
       S  Who do you ‘Draw Something’ with?

                               Third Party Vetting

   S  Looking for a new job?
           S  Someone you are friends with may also know someone at your
               new employer – what information will they share?
           S  Your social activities – don’t post that you are out partying and
               then call in sick
           S  Don’t tell the world what you think of your boss, even after you
               leave the organisation – you might need a reference from him or
               your new employer might not want to expose themselves in the

   S  Journalists looking for background
           S  Those grainy news photos are often found on social websites

                               Coffee with my son

   S  One day I had coffee with my son, I took this photo and uploaded
           it to Facebook, tagging him and adding the place
   S  Facebook stored the following data:
           S    The exact date, time & GPS location of where I checked in
           S    The details of the person I was with
           S    The application on my iPhone that I used to upload the picture
           S    The people who commented, their comments and their profile
           S    And more

   S  But the photograph told another part of the story …

                            Photographic Data

   S  Digital Cameras store data too
       S  This is called Metadata (data about data)
       S  What each device stores varies
       S  But you can download a free tool to read the metadata
           S  Data is stored against images, audio and video files by most
               digital recording devices including cameras, phones, scanners.
               The data is known as EXIF data
           S  This data isn’t protected by your Facebook settings

                        What the photo told me:

   S      File name, size and type

   S      Date and Time created

   S      GPS co-ordinates - longitude, latitude & altitude

   S      Make & Model of the device used to take the photo

   S      Technical details about the photo including focal length, exposure, whether a flash was
           used, etc

   S      Whether the photo has subsequently been edited and if so when and by what application

   S      Copyright information could also have be added to the image

               What does all this add to the
                data stored by Facebook?

   S  I can validate the date, time and location of the check-in on

   S  I can understand what type of device the user carries around

   S  I can understand a breach of copyright for certain materials

                   What about other sites?

   S      Facebook 900M users
   S      Qzone (China) 480M users
   S      Twitter 300M users
   S      Sina Weibo (China) 300M users
   S      Habbo (31 countries) 200M users
   S      Google+ 170M users
   S      Renren (China) 160M users
   S      Badoo (Europe & Latin America) 120M users
   S      LinkedIn 120M users

   S    This is not a Facebook-specific thing
   S    All sites allow developers to access the data
   S    Developer access is key to how organisations make money from social websites
   S    Many people put different data on different social websites
   S    Developers can use common data (e.g. an e-mail address) to piece together an
        even deeper picture of an individual

                                    (internal) data

   S  Other organisations are gathering lots of data from internal
           sources rather than social networks
           S  Telematics devices for car insurance
           S  Smart metering devices for energy consumption
           S  Credit card transactions for fraud detection

   S  These are being manipulated and analysed using the same
   S  These are the ‘Big Data’ stories you read about in the press

                              Telematics Insurance

   S  Buy cheap car insurance in exchange for having a ‘black box’ installed in your
           car, known as a Telematics box

   S  This sends data back to a central computer periodically
       S  Typically every couple of minutes/miles
       S  All the data every 100ms over a 2 second interval when there is an impact

   S  Minimum data set
       S  Longitude, Latitude, Altitude, X-Acceleration, Y-Acceleration, Z-Acceleration,
           Speed, Compass Direction Of Travel

   S  More advance units gather more data
       S  Camera data, Engine data, Service History, etc.

                            Telematics Plot

   S  Trip from Wokingham to Walton-Upon-Thames

   S  Rendered on Google Maps with a KML file (Free to use)
                          Using Telematics Data

   S  Assess customer driving pattern
           S  Adjust the car insurance premium accordingly

   S  Assess accidents
           S  Can be used to determine fault in collisions
           S  Can be used to determine if whiplash is likely

   S  Assess other types of car insurance fraud

   S  Allows insurance companies to “optimize” premiums
           S  Charge as much as possible but be cheaper than the competition

    Telematics Insurers in the UK

                      Integrating Social Data
                       and Non-Social Data

   S  Organisations are starting to combine internal data with
           social network data to create an even deeper understanding
           of the customer

   S  All of the above examples given are from real projects that
           we, as a company, have already been involved in

                                 Integrated Data

   S  A youth buys cheap telematics insurance …
           S  When he gets it he ‘likes’ the product on on Facebook
               S  Positive Sentiment Analysis – Opportunity to thank customer
           S  When he gets charged for the top-up miles he ‘dislikes’ the cost
               S  Negative Sentiment Analysis – Opportunity to address concerns
           S  When he has an accident and tells his mates what really happened
               S  Fraud detection – Opportunity to check the veracity of the claim

   S  What you say and do socially now will affect your commercial
           transactions in the future

                                   Can I Opt-Out?

   S  No – you can limit your exposure but you can’t opt out of big data

   S  You don’t have to join social networks but:
           S  Many social activities are based around Twitter/Facebook
           S  Most business people will want to use LinkedIn
           S  Peer pressure to join, especially for younger people, is high

   S  Your data will be analysed by companies involved in
           S  Marketing, Financial (especially underwriting & fraud),
           S  Energy consumption, and many more
           S  They will source the data internally and from social networks

                                      What about crime?

   S      Most uses of social data are positive
           S    Reduce fraud, improve product, more precisely targeted marketing, energy efficiency

   S      But criminals can use this technology too
           S    Most of the technology is either low cost or free
           S    New techniques for exploiting data evolve very quickly

   S      Identity theft is just one possible outcome

   S      It’s an arms race – Can we (the good guys) find ways to protect ourselves and those that
           share their data with us faster than the bad guys develop techniques to exploit this?

   S      Make sure you understand what you are sharing and with whom you are sharing data


   S  Remember
           S  Set your privacy settings on Facebook
           S  Things that help people communicate with you (data of birth, first
               school, first pet, mothers maiden name, etc.) are also the most
               common security questions for online banking, etc.
           S  Facebook friends are not real friends – beware of ‘friending’
               people you don’t actually know and ‘liking’ dubious groups
           S  Remember your ‘friends’ may not be so in the future or may have
               greater loyalties to others than they do to you
           S  You may get profiled and targeted as a ‘false positive’ i.e. you
               aren’t interested in the product/offering but match the criteria

                                       It’s not just
                                     social websites

   S  Other sites also hold complex social information

           S  Directory Websites:,

           S  Family History Websites:,

           S  Large scale online retailers:,,

                         Who does this work?

   S  Data Scientists
       S  A data scientist is a job title for an employee or business intelligence (BI)
           consultant who excels at analysing data, particularly large amounts of data, to
           help a business gain a competitive edge
       S  The position is gaining acceptance (and significant salaries) with large
           enterprises who are interested in deriving meaning from big data, the
           voluminous amount of structured, unstructured and semi-structured data that
           a large enterprise produces.
       S  A data scientist possesses a combination of analytic, machine learning, data
           mining and statistical skills as well as experience with algorithms and coding.
           Perhaps the most important skill a data scientist possesses, however, is the
           ability to explain the significance of data in a way that can be easily
           understood by others.
       S  Most often Maths or Computer Studies graduates with Business skills

            Notes on this presentation

   S  All trademarks and brand names are the property of their respective owners

   S  This presentation is designed to show capabilities, tools and techniques and is
           in no way condoning or condemning any organisation, product, technology or

   S  Other tools and products are available

   S  Data access may be restricted by user permissions

   S  Data access may be restricted by law

   S  Data access may be restricted by data provider terms & conditions

                       Contact Us

   S  Data Management & Warehousing
       S  Website:
       S  Telephone: +44 (0) 118 321 5930

   S  David Walker
       S  E-Mail:
       S  Telephone: +44 (0) 7990 594 372
       S  Skype: datamgmt
       S  White Papers:

                           About Us

     Data Management & Warehousing is a UK based consultancy that has
     been delivering successful business intelligence and data warehousing
                             solutions since 1995.

    Our consultants have worked with major corporations around the world
            including the US, Europe, Africa and the Middle East.

    We have worked in many industry sectors such as telcos, manufacturing,
      retail, financial and transport. We provide governance and project
         management as well as expertise in the leading technologies.

             Thank You
           ©2012 - Data Management & Warehousing
