
Using SAS Hadoop to Support Marketing Analytics with Big Data

					 Using SAS/Hadoop to Support
Marketing Analytics with Big Data




                                       Kerem Tomak
                      VP, Marketing Analytics, Macys.com
                 Agenda
•   Who is the customer?
•   Life and death of a customer
•   Data galore
•   Crystal Ball
•   What matters the most…
      Who is the customer?
• .com
• stores
           Life of a customer
• Present value of all
  future profits
  obtained from a
  customer over the
  life of his or her
  relationship with a firm.
            Customer Lifetime Value
• The CLV of a customer i is the discounted value of the future
  profits yielded by this customer

      $$CLV_i = \sum_{t=0}^{h} \frac{CF_{i,t}}{(1+d)^{t}}$$

• Where
 – CF_{i,t} = net cash flow generated by the activity of customer i at time t
 – h = time horizon for estimating the CLV
 – d = discount rate

• The CLV is the value added, by an individual customer, to the
  company
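
The definition above can be sketched in a few lines of Python. This is a minimal illustration, assuming the discounted sum runs from t = 0 to h and that one net cash flow is known per period; the function name and the sample figures are hypothetical and do not come from the presentation.

```python
def customer_lifetime_value(cash_flows, discount_rate):
    """Return the discounted sum of cash_flows[t] / (1 + d)**t for t = 0..h.

    cash_flows    -- net cash flow per period for one customer (CF_{i,t})
    discount_rate -- per-period discount rate d, e.g. 0.10 for 10%
    """
    return sum(cf / (1.0 + discount_rate) ** t
               for t, cf in enumerate(cash_flows))


# Hypothetical customer: 100 of net cash flow in each of three periods.
flows = [100.0, 100.0, 100.0]
clv = customer_lifetime_value(flows, 0.10)
print(round(clv, 2))
```

With a zero discount rate the result is simply the sum of the cash flows; a higher rate shrinks the contribution of later periods.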
             Why is CLV important?
• By knowing the CLV of the customers, one can
 – Focus on groups of customers of equal wealth
   • Focus on the most valuable customers, which deserve to be closely
     followed
   • Neglect the less valuable ones, to which the company should pay less
     attention
 – Evaluate the budget of a marketing campaign
 – Measure the efficiency of a past marketing campaign by
   evaluating the CLV change it incurred
 – Use CLV to introduce new segmentation opportunities
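
As a rough illustration of CLV-based segmentation and of measuring a campaign by the CLV change it produced, the sketch below ranks customers by CLV and labels the top share as high value. The 20% cutoff, the tier names, the function names, and the sample values are all assumptions made for the example, not figures from the presentation.

```python
def segment_by_clv(clv_by_customer, top_share=0.2):
    """Label the top `top_share` of customers by CLV as 'high', the rest 'standard'."""
    ranked = sorted(clv_by_customer.items(), key=lambda kv: kv[1], reverse=True)
    # Always keep at least one customer in the high-value tier.
    cutoff = max(1, round(len(ranked) * top_share))
    return {cid: ("high" if rank < cutoff else "standard")
            for rank, (cid, _) in enumerate(ranked)}


def clv_change(clv_before, clv_after):
    """Per-customer CLV difference between two points in time (e.g. around a campaign)."""
    return {cid: clv_after[cid] - clv_before[cid] for cid in clv_before}


# Hypothetical CLV estimates before and after a campaign.
before = {"a": 900.0, "b": 120.0, "c": 300.0, "d": 80.0, "e": 450.0}
after = {"a": 950.0, "b": 120.0, "c": 360.0, "d": 70.0, "e": 450.0}

segments = segment_by_clv(before)
change = clv_change(before, after)
total_change = sum(change.values())
```

Summing the per-customer change over a segment gives one simple way to compare how a campaign moved high-value and standard customers.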
           Tapping into the data
•   Data Storage
•   Reporting
•   Analytics
•   Advanced Analytics
    – Computing with big datasets is a fundamentally different
      challenge than doing “big compute” over a small dataset

Utilized data
Unutilized data that can be available to business
Hadoop & RDBMS Analogy
   RDBMS and Hadoop are like a sports car and a cargo train
             RDBMS                     Hadoop

   Sports car:                 Cargo train:
   •   refined                 •   rough
   •   has a lot of features   •   missing a lot of “luxury”
   •   accelerates very fast   •   slow to accelerate
   •   pricey                  •   carries almost anything
   •   expensive to maintain   •   moves a lot of stuff very
                                   efficiently
RDBMS & Hadoop Comparison*
                         Traditional RDBMS (Oracle, DB2)          Hadoop

 Maximum Data Capacity   Up to 100s of TBs                        Up to 10s of PBs (hundreds of times
                                                                  more)
 Processing Capacity     Up to 10s of TBs                         Up to 10s of PBs (thousands of times
                                                                  more)
 Costs                   High software, license, and              Cost-effective: commodity hardware +
                         hardware/storage costs                   open-source software
 Transactional           Yes                                      No (batch processing)
 Update Patterns         Supported                                Not supported yet
 Schema Complexity       Structured (tables only)                 Structured or unstructured
 Processing Freedom      SQL                                      MapReduce, SQL (Hive), Streaming,
                                                                  Pig, HBase, etc.
 Scalability             Non-linear scaling                       Fully distributed and linearly scalable
 Reliability             Fault-tolerant at high cost, but         Fault-tolerant and self-healing by
                         without self-healing by design           design
 Real-Time Response      Yes                                      No (HBase required)

* Cloudera comparison chart
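
To make the MapReduce entry in the comparison more concrete, here is a small sketch of the map, shuffle, and reduce steps written in plain Python. It only imitates the programming model on an in-memory list and does not use any Hadoop API; the record layout (customer id, purchase amount) and all function names are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Emit one (key, value) pair per input record."""
    customer_id, amount = record
    yield customer_id, amount


def shuffle(pairs):
    """Group emitted values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single total."""
    return key, sum(values)


# Hypothetical purchase records: (customer id, purchase amount).
records = [("c1", 20.0), ("c2", 5.0), ("c1", 30.0)]

pairs = [pair for record in records for pair in map_phase(record)]
totals = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(totals)
```

Because each map call looks at one record and each reduce call looks at one key, the work can be split across many machines, which is the property behind the linear scalability row in the table.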
                    Crystal Ball




Source: Forrester
Toolshed
          What matters the most
• Building data infrastructure
   – Fast processing of large amounts of data and deployment of model
     scoring in the same environment
• Business task execution
   – Real-time optimization for customized offer management
• Planning tools
   – Give analytical guidelines to campaign management
• Strategic support
   – Develop robust analytics that look at the customer’s environment




 “Making sense out of models” “Deploying in production”
              Questions?
Kerem Tomak

kerem.tomak@macys.com

4154221408

				