
Data Mining Techniques


Part 6: Data Mining Techniques for Customer Relationship Marketing
The Five Steps of Data Mining (SEMMA)

 Sample
 Explore
 Modify
 Model
 Assess
Six Types of Data Mining Analysis

 Classification
 Estimation
 Prediction
 Affinity Grouping
 Clustering
 Description
Classification
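  Assigns records to classes based on their attributes or characteristics
  Applicable algorithms:
   – Decision Trees
   – Memory Based Reasoning
   – Link Analysis
Estimation
 Shares the goal of classification; the difference is that
 classification assigns records to well-defined discrete classes,
 whereas estimation usually deals with continuous values.
 Applicable algorithms:
  – Neural Networks
Prediction
  Predicts events that have not yet happened
  Applicable algorithms:
   – Market Basket Analysis
   – Memory Based Reasoning
   – Artificial Neural Networks
Affinity Grouping
  Discovers which things tend to occur together
  Applicable algorithms:
   – Association Rules
Clustering
 Divides the data into subsets, or clusters,
 without predefined classification criteria
 Applicable algorithms:
  – Cluster Detection
Description
  Derives a result from a large volume of data
  to describe or explain a situation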
Business Drivers for Data Mining
Customer-focused
– Lifetime value
– Profiling/segmentation
– Retention
– Acquisition
– Winback
– Cross-selling
– Campaign analysis/management
– Channel development
Operations-focused
– Profitability analysis
– Pricing
– Fraud detection
– Risk assessment
– Portfolio management
– Employee turnover
– Cash management
Data Mining Analysis Techniques
 Market Basket Analysis
 Memory Based Reasoning
 Automatic Cluster Detection
 Link Analysis
 Decision Trees
 Artificial Neural Networks
 Genetic Algorithms
 OLAP
Market Basket Analysis
What is Market Basket Analysis?
Association rules
– If…then…
Results of Market Basket Analysis
– useful
– trivial
– inexplicable
The Basic Process
– Choose the right set of items
– Derive rules from the co-occurrence counts
– Overcome practical limits
Improvement
\[
\text{improvement} = \frac{P(\text{condition and result})}{P(\text{condition})\,P(\text{result})}
\]

Combination           Probability
      A                  45%
      B                 42.5%
      C                  40%
   A and B               25%
   A and C               20%
   B and C               15%
A and B and C             5%

       Rule      p(condition) p(condition and result) confidence
If A and B then C 25%                   5%               0.20
If A and C then B 20%                   5%               0.25
If B and C then A 15%                   5%               0.33

       Rule        support  confidence  improvement
If A and B then C    5%        0.20        0.50
If A and C then B    5%        0.25        0.59
If B and C then A    5%        0.33        0.74
If A then B         25%        0.56        1.31

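 Improvement > 1: the rule predicts the result better than chance
 Improvement < 1: the rule predicts worse than chance, so negate the result
  – "if B and C then A" becomes "if B and C then not A"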
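
The support, confidence, and improvement figures can be recomputed from the combination probabilities listed earlier. The following is a minimal Python sketch; the dictionary `p` and the helper `rule_metrics` are illustrative names, not part of the original slides.

```python
# Probabilities of each item combination, copied from the combination table.
p = {
    frozenset("A"): 0.45,
    frozenset("B"): 0.425,
    frozenset("C"): 0.40,
    frozenset("AB"): 0.25,
    frozenset("AC"): 0.20,
    frozenset("BC"): 0.15,
    frozenset("ABC"): 0.05,
}

def rule_metrics(condition, result):
    """Return (support, confidence, improvement) for 'if condition then result'."""
    cond = frozenset(condition)
    both = cond | frozenset(result)
    support = p[both]                                  # P(condition and result)
    confidence = support / p[cond]                     # P(result | condition)
    improvement = confidence / p[frozenset(result)]    # confidence / P(result)
    return support, confidence, improvement

for condition, result in [("AB", "C"), ("AC", "B"), ("BC", "A"), ("A", "B")]:
    s, c, i = rule_metrics(condition, result)
    rule = f"If {' and '.join(condition)} then {result}"
    print(f"{rule:<20} support={s:.0%} confidence={c:.2f} improvement={i:.2f}")
```

Only the two-item rule "If A then B" has an improvement above 1, so it is the only rule here that does better than guessing the result at its base rate.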
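 Advantages
 – Produces clear, understandable results
 – The computation is simple and easy to follow
 Disadvantages
 – Computation grows rapidly as the problem gets larger
 – It is hard to decide on the right set of items
 – Too little data does not yield a reliable analysis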
Memory Based Reasoning
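What is Memory Based Reasoning?
The three main problems
 How to select appropriate historical records
 Choosing an efficient way to represent the historical records
 Defining the distance function, the combination function,
  and the number of neighbors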
Applying MBR
 –   Choosing the training set
 –   Determining the distance function
 –   Choosing the number of nearest neighbors
 –   Determining the combination function
  Define Distance Function
   –   d(A,B)>=0
   –   d(A,A)=0
   –   d(A,B)=d(B,A)
   –   d(A,B)<=d(A,C)+d(B,C)
Recnum Gender    Age       Salary
   1   female     27   $         19,000
   2    male      51   $         64,000
   3    male      52   $        105,000
   4   female     33   $         55,000
   5    male      45   $         45,000

 new    female   45    $       100,000
 dsum(A,B) = dgender(A,B) + dage(A,B) + dsalary(A,B)
 dnorm(A,B) = dsum(A,B) / max(dsum)
 deuclid(A,B) = sqrt(dgender(A,B)^2 + dage(A,B)^2 + dsalary(A,B)^2)
Record  dsum        dnorm       deuclid
1       1,4,5,2,3   1,4,5,2,3   1,4,5,2,3
2       2,5,3,4,1   2,5,3,4,1   2,5,3,4,1
3       3,2,5,4,1   3,2,5,4,1   3,2,5,4,1
4       4,1,5,2,3   4,1,5,2,3   4,1,5,2,3
5       5,2,3,4,1   5,2,3,4,1   5,2,3,4,1

         1      2      3      4      5      Neighbors
dsum     1.662  1.659  1.338  1.003  1.64   4,3,5,2,1
dnorm    0.554  0.553  0.446  0.334  0.547  4,3,5,2,1
deuclid  0.781  1.052  1.251  0.494  1      4,1,5,2,3

Recnum   Gender   Age    Salary    Attriter
   1     female   27    $19,000      no
   2      male    51    $64,000     yes
   3      male    52    $105,000    yes
   4     female   33    $55,000     yes
   5      male    45    $45,000      no
 new     female   45    $100,000      ?

          Neighbors  Neighbor attrition  k=1  k=2  k=3  k=4  k=5
dsum      4,3,5,2,1  Y,Y,N,Y,N           yes  yes  yes  yes  yes
deuclid   4,1,5,2,3  Y,N,N,Y,Y           yes  ?    no   ?    yes

          k=1        k=2        k=3       k=4       k=5
dsum      yes, 100%  yes, 100%  yes, 67%  yes, 75%  yes, 60%
deuclid   yes, 100%  yes, 50%   no, 67%   yes, 50%  yes, 60%
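
The dsum rows can be reproduced with a short Python sketch. The slides do not define the per-field distances, so the ones used here are assumptions: gender distance is 0 for a match and 1 otherwise, and age and salary differences are divided by the largest difference in the training set (25 years and $86,000). These choices are used only because they reproduce the dsum values shown. The combination function is assumed to be a simple majority vote, and all function names are illustrative.

```python
from collections import Counter

# Training records from the table: (recnum, gender, age, salary, attriter).
records = [
    (1, "female", 27, 19_000, "no"),
    (2, "male", 51, 64_000, "yes"),
    (3, "male", 52, 105_000, "yes"),
    (4, "female", 33, 55_000, "yes"),
    (5, "male", 45, 45_000, "no"),
]
new = ("female", 45, 100_000)  # the unclassified record

# Assumed normalisation: divide by the largest difference in the training set.
AGE_RANGE = max(r[2] for r in records) - min(r[2] for r in records)
SALARY_RANGE = max(r[3] for r in records) - min(r[3] for r in records)

def d_sum(a, b):
    """Sum of the per-field distances between two (gender, age, salary) tuples."""
    d_gender = 0.0 if a[0] == b[0] else 1.0
    d_age = abs(a[1] - b[1]) / AGE_RANGE
    d_salary = abs(a[2] - b[2]) / SALARY_RANGE
    return d_gender + d_age + d_salary

def neighbors(query):
    """Training records ordered from nearest to farthest."""
    return sorted(records, key=lambda r: d_sum(query, r[1:4]))

def classify(query, k):
    """Majority vote among the k nearest neighbors: (label, share of votes)."""
    votes = Counter(r[4] for r in neighbors(query)[:k])
    label, count = votes.most_common(1)[0]
    return label, count / k

print([r[0] for r in neighbors(new)])
for k in range(1, 6):
    label, share = classify(new, k)
    print(f"k={k}: {label}, {share:.0%}")
```

A tied vote would be resolved arbitrarily by `most_common`; with dsum no tie occurs for this record, which matches the dsum row of the table.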
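  Advantages
  – Not limited to RDBMS data
  – The process is simple and easy to understand
  Disadvantages
  – Computationally time-consuming
  – Requires a large amount of historical data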
Automatic Cluster Detection
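 What is Automatic Cluster Detection?
 MacQueen proposed the K-means algorithm in 1967

(i) Randomly select K candidate centroids from the
    original data set.
(ii) For each sample point in the data set, find the
    nearest candidate cluster centroid and assign the
    point to that candidate cluster.
(iii) After one full pass of cluster assignment over the
   data set, recompute the centroid of each candidate
   cluster from all the sample points assigned to it.
(iv) When the value of the error function no longer
   changes, the centroids have converged to their final
   positions and the clustering is complete; otherwise
   return to (ii).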
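
Steps (i) to (iv) can be sketched in plain Python as follows. The slides do not specify the error function, so this sketch assumes the sum of squared distances from each point to its cluster centroid; `kmeans`, `sq_dist`, and the sample points are illustrative.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)            # (i) pick K starting centroids
    prev_error = None
    for _ in range(max_iter):
        # (ii) assign every point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # (iii) recompute each centroid as the mean of its members
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
        # (iv) stop once the error function (assumed: sum of squares) is unchanged
        error = sum(sq_dist(p, centroids[j])
                    for j, members in enumerate(clusters) for p in members)
        if error == prev_error:
            break
        prev_error = error
    return centroids, clusters

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
print(clusters)
```

Because the starting centroids are drawn at random, different seeds can lead to different final clusters on less clearly separated data.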

   Advantages
   – Applicable to many data types
   – Easy to apply
   Disadvantages
   – Very sensitive to the initial values
Link Analysis
 What is Link Analysis?

 Nodes and Edges

 Traveling Salesman Problem
 Advantages
  – Supports visual analysis
 Disadvantages
  – Not easily applied to many data types
  – Few tools are available
  – Inefficient when applied to an RDBMS
Decision Trees
 What are Decision Trees?

 Algorithms
  – CART
  – CHAID
  – C4.5
[Figure: a decision tree splits the data on Income; Income = High leads to D1 and Income = Low leads to D2.]
[Figure: a second split divides D1 into D1a and D1b.]
Artificial Neural Networks
 What are Neural Networks?

 [Figure: Input → Neural Network Model → Output]

1. Identify the input and output features.
2. Massage the inputs and outputs so
   their range is between 0 and 1.
3. Set up a network with an appropriate
   topology.
4. Train the network on a representative
   set of training examples.
5. Test the network on a test set strictly
   independent from the training
   examples. If necessary, repeat the
   training, adjusting the training set,
   network topology, and parameters.
   Evaluate the network using the
   evaluation set to see how well it
   performs.
6. Apply the model generated by the
   network to predict outcomes for
   unknown inputs.
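
Step 2 is often done with min-max scaling. The Python sketch below is one common way to map a numeric feature onto the range 0 to 1; the slides do not say which method is meant, so treat this as an assumption. The helper name `min_max_scale` is illustrative, and the ages are reused from the Memory Based Reasoning example.

```python
def min_max_scale(values):
    """Linearly rescale values so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

ages = [27, 51, 52, 33, 45]
print(min_max_scale(ages))
```

The same minimum and maximum found on the training set should be reused when scaling new inputs in step 6, otherwise the network sees values on a different scale from the one it was trained on.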
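
 Advantages
  – Produces good results even in very complex domains
  – Handles both continuous values and discrete categories
  – Many products are already available
 Disadvantages
  – Inputs must be in the range 0 to 1
  – Results are hard to explain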
Genetic Algorithms
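 What are Genetic Algorithms?

 Chromosome
 Gene
 Algorithm flow

1. Obtain an initial solution.
2. Take the fittest chromosomes of the previous
   generation and evolve them:
   – Crossover
   – Mutation
   – Reproduction
3. Compute the fitness function value of the newly
   generated chromosomes.
4. Has the final generation been reached? If no,
   return to step 2; if yes, continue.
5. Take the best solution as the final answer.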
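
The flow above can be sketched as a small genetic algorithm in Python. Everything specific here is an assumption for illustration: chromosomes are lists of 0/1 genes, the fitness function simply counts the 1 genes, crossover is one-point, and the parameter values are arbitrary.

```python
import random

def fitness(chromosome):
    """Hypothetical fitness function: the number of 1 genes."""
    return sum(chromosome)

def evolve(n_genes=20, pop_size=30, generations=40, elite=6, p_mut=0.02, seed=0):
    rng = random.Random(seed)
    # Initial solution: a random population of chromosomes.
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    history = [max(map(fitness, pop))]
    for _ in range(generations):
        # Take the fittest chromosomes of the previous generation.
        pop.sort(key=fitness, reverse=True)
        parents = pop[:elite]
        # Reproduction: copy the parents unchanged into the next generation.
        children = [p[:] for p in parents]
        while len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randint(1, n_genes - 1)
            child = mum[:cut] + dad[cut:]                  # one-point crossover
            child = [1 - g if rng.random() < p_mut else g  # mutation: flip a gene
                     for g in child]
            children.append(child)
        pop = children
        # Fitness of the newly generated chromosomes.
        history.append(max(map(fitness, pop)))
    # After the final generation, take the best solution as the answer.
    return max(pop, key=fitness), history

best, history = evolve()
print(fitness(best), best)
```

Because the parents are copied unchanged, the best fitness in `history` never decreases from one generation to the next.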
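
 Advantages
 – Produces explainable results
 – More likely than other algorithms to find a better solution
 Disadvantages
 – Requires a large amount of computation time
 – Few packaged software products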
OLAP
Online Analytical Processing

OLAP Data Model
 – Star
 – Snowflake
 Flat table / dimension table

 Cube

 MOLAP
 ROLAP
 HOLAP
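
 Advantages
 – Well suited to time-series analysis
 – Fast response time
 – Many products are already available
 Disadvantages
 – Building the cube is difficult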
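Part 7: Success and Failure in CRM Implementation
– An estimated 65% of customer relationship
marketing applications fail.
– Senior executives have considerable influence on whether a CRM initiative succeeds
– Choosing a CRM software vendor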
The 10 Biggest Mistakes in CRM
1. Not Asking What’s in It for the Customer
– The only way to benefit your organization is to
  first benefit your customer.
– You need to be able to seriously imagine your
  customer saying
2. Not Asking What’s in It for You
– The customer may always be right, but he is not
  always profitable for your company.
3. Not Putting Your Strategy First
– Think CRM is a technology?
– CRM is a process for managing relationships
  with your customers and partners.
4. Not Getting the Data; Not Using What You Have
– Data is king in the world of CRM.
5. Underestimating the Degree of Cross-company
Involvement Required
– A good CRM initiative requires people across
  your company to work differently than they did
  before.
6. Not Outsourcing When Appropriate
– Microsoft, Pizza Hut.
7. Taking Too Long to Become Operational
– It’s too easy for the project to be killed as a
  money loser if the startup period is too lengthy.
8. Not Starting Small
– How do you work with a database of millions
  and millions of customers?
9. Not Testing
– System behavior varies over time and across regions and environments.
10. Overestimating What You Need to Begin
– Start the rollout with the core components.
– Roll out in phases.
Choosing a CRM Software Vendor
Vendors fall into three general categories
– The first is the enterprise resource planning (ERP)
  vendors
– The second comprises the numerous “part-of-the-solution”
  CRM vendors
– The third is the independent, horizontal CRM
  vendors
12 Pieces of an Effective CRM System
1. Enterprise-wide customer management
– Customer management by department is not enough
2. Web integration for e-business backbone
– Customer interaction
– Real-time access to customer and company data
– Access to company knowledge base and the ability to
  submit, check, update
– Prospect interaction
– Immediate response to information inquiries
3. Single, consolidated user interface
– An intuitive interface that gives users rapid, easy
  access to information about customers, partners,
  and prospects
4. Collaboration among teams
– A good CRM system should provide for intra-
  and inter-departmental collaboration
5. Usability
– This is really the linchpin of success for all
  collaboration-dependent enterprise applications
6. Process automation technology
– Literature fulfillment
– Problem resolution
– Knowledge retrieval
– Marketing campaign execution
7. Customer management cycle reduction
– The system should help you reduce the sales cycle and cut
  customer support response time
8. Low total cost of ownership (TCO)
– The best CRM systems promise a low TCO
   •   Strong internet integration?
   •   Rapid implementation?
   •   Use of industry-standard development tools?
   •   The ability to leverage your existing business applications?
   •   Customer references to verify these capabilities?
9. Self-service
– This functionality is an important tool for
  improving service levels and reducing cost.
10. Knowledge management tools
– To keep up with the variety of resources
  available to customers on your site
11. Integrated marketing automation
– Effective CRM software should offer a complete
  marketing automation capability
   • Integrated campaign management
   • Customer and prospect analysis
   • Feedback (sales, product, customer)
12. Rapid implementation
– This is one effective way to promote the success
  of your CRM projects

								