

Using SAS/Hadoop to Support Marketing Analytics with Big Data

                                       Kerem Tomak
                      VP, Marketing Analytics,
•   Who is the customer?
•   Life and death of a customer
•   Data galore
•   Crystal Ball
•   What matters the most…
      Who is the customer?
• .com
• stores
           Life of a customer
• Present value of all future profits obtained from a customer
  over the life of his or her relationship with a firm.
            Customer Lifetime Value
• The CLV of a customer i is the discounted value of the future
  profits yielded by this customer

• Where
 – CFi,t = net cash flow generated by the customer i activity at time t
 – h = time horizon for estimating the CLV
 – d = discount rate
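
The definitions above correspond to the usual discounted-sum form, CLV_i = sum over t = 1..h of CF_i,t / (1 + d)^t. A minimal sketch of that calculation follows, assuming the first cash flow is discounted by one full period (the slide does not show the starting index) and that d is expressed per period. The function name and the sample figures are illustrative only.

```python
from typing import Sequence


def customer_lifetime_value(cash_flows: Sequence[float], discount_rate: float) -> float:
    """Discounted sum of one customer's net cash flows CF_i,t.

    Assumptions (not stated on the slide): cash_flows[0] is the cash flow
    at t = 1, and the horizon h is simply len(cash_flows).
    """
    if discount_rate <= -1:
        raise ValueError("discount_rate must be greater than -1")
    return sum(
        cf / (1.0 + discount_rate) ** t
        for t, cf in enumerate(cash_flows, start=1)
    )


# Three periods of 100 in net cash flow, discounted at 10% per period.
clv = customer_lifetime_value([100.0, 100.0, 100.0], discount_rate=0.10)
print(round(clv, 2))  # 248.69
```

With a discount rate of zero the result reduces to the plain sum of the cash flows, which is a quick sanity check on any implementation.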

• The CLV is the value added, by an individual customer, to the company
             Why is CLV important?
• By knowing the CLV of the customers, one can
 – Focus on groups of customers of equal wealth
 – Evaluate the budget of a marketing campaign
 – Measure the efficiency of a past marketing campaign by
   evaluating the CLV change it incurred
   • Focus on the most valuable customers, which deserve to be closely followed
   • Neglect the less valuable ones, to which the company should pay less attention
 – Use CLV to introduce new segmentation opportunities
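
Two of the uses listed above, measuring a campaign by the CLV change it produced and segmenting customers by CLV, can be sketched as follows. This is only an illustration under simple assumptions: CLV values are already computed per customer, the campaign effect is taken as a naive before/after average (which does not control for other influences), and the two-segment split with a single threshold is hypothetical.

```python
from statistics import mean
from typing import Dict, List


def clv_lift(clv_before: Dict[str, float], clv_after: Dict[str, float]) -> float:
    """Average per-customer CLV change for customers present in both snapshots."""
    common = clv_before.keys() & clv_after.keys()
    if not common:
        raise ValueError("no customers in common")
    return mean(clv_after[c] - clv_before[c] for c in common)


def segment_by_clv(clv: Dict[str, float], high_threshold: float) -> Dict[str, List[str]]:
    """Split customers into 'high' and 'low' value segments by a CLV threshold."""
    segments: Dict[str, List[str]] = {"high": [], "low": []}
    for customer, value in sorted(clv.items()):
        segments["high" if value >= high_threshold else "low"].append(customer)
    return segments


# Hypothetical CLV snapshots taken before and after a campaign.
before = {"a": 100.0, "b": 50.0, "c": 20.0}
after = {"a": 120.0, "b": 50.0, "c": 25.0}

print(clv_lift(before, after))
print(segment_by_clv(after, high_threshold=50.0))
```

In practice the threshold would come from the business (for example a percentile of the CLV distribution), and a control group would be needed to attribute the change to the campaign.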
           Tapping into the data
•   Data Storage
•   Reporting
•   Analytics
•   Advanced Analytics
    – Computing with big datasets is a fundamentally different
      challenge than doing “big compute” over a small dataset

Utilized data
Unutilized data that can be available to business
Hadoop & RDBMS Analogy
   RDBMS and Hadoop are like a car and a train
             RDBMS                     Hadoop

   Sports car:                 Cargo train:
   •   refined                 •   rough
   •   has a lot of features   •   missing a lot of “luxury”
   •   accelerates very fast   •   slow to accelerate
   •   pricey                  •   carries almost anything
   •   expensive to maintain   •   moves a lot of stuff very efficiently
RDBMS & Hadoop Comparison*
                         Traditional RDBMS (Oracle, DB2)          Hadoop

 Maximum Data Capacity   Up to 100’s of TBs                       Up to 10’s of PBs (hundreds times
 Processing Capacity     Up to 10’s of TBs                        Up to 10’s of PBs (thousands times
 Costs                   High software, license and               Cost effective: commodity hardware +
                         hardware/storage costs                   open source software

 Transactional           Yes                                      No (batch process)

 Update Patterns         Supported                                Not Supported Yet

 Schema Complexity       Structured (tables only)                 Structured or Unstructured

 Processing Freedom      SQL                                      MapReduce, SQL (Hive), Streaming,
                                                                  Pig, HBase, etc.

 Scalability             Non-linear scaling                       Fully distributed and linearly scalable

 Reliability             Fault-tolerant at high cost, but without Fault-tolerant and self-healing by
                         self-healing by design                   design

 Real Time Response      Yes                                      No (HBase required)
                                                      * Cloudera comparison chart
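
To make the "batch process" and MapReduce entries in the chart more concrete, here is a small sketch of the map, shuffle, and reduce steps in plain Python. It does not use Hadoop or any Hadoop API; it only illustrates the programming model on a single machine. The record format ("customer_id,amount") and all function names are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, float]]:
    """Map: turn each 'customer_id,amount' record into a (key, value) pair."""
    for line in lines:
        customer_id, amount = line.strip().split(",")
        yield customer_id, float(amount)


def shuffle(pairs: Iterable[Tuple[str, float]]) -> Dict[str, List[float]]:
    """Shuffle: group all values that share the same key."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[float]]) -> Dict[str, float]:
    """Reduce: collapse each group of values into one total per key."""
    return {key: sum(values) for key, values in groups.items()}


def total_spend_per_customer(lines: Iterable[str]) -> Dict[str, float]:
    """Run the three phases in sequence over an in-memory batch of records."""
    return reduce_phase(shuffle(map_phase(lines)))


print(total_spend_per_customer(["c1,10.0", "c2,5.0", "c1,2.5"]))
# {'c1': 12.5, 'c2': 5.0}
```

On a real cluster the map and reduce steps run in parallel across many machines and the shuffle moves data between them, which is where the linear scalability in the chart comes from.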
                    Crystal Ball

Source: Forrester
          What matters the most
• Building data infrastructure
   – Fast processing of large amounts of data and deployment of model
     scoring on the same environment
• Business task execution
   – Real-time optimization for customized offer management
• Planning tools
   – Give analytical guidelines to campaign management
• Strategic support
   – Develop robust analytics that look at the customer’s environment

 “Making sense out of models” “Deploying in production”

