Data Mining

David L. Olson

James & H.K. Stuart Professor in MIS, University of Nebraska-Lincoln



Korea Telecom: KM1 Data Mining



David L. Olson



Definition

• DATA MINING: exploration & analysis

– by automatic means
– of large quantities of data
– to discover actionable patterns & rules

• Data mining is a way to utilize the massive quantities of data that businesses generate



Political Data Mining

Grossman et al., 10/18/2004, Time, 38



• 2004 Election

– Republicans: VoterVault

• From mid-1990s
• About 165 million voters
• Massive get-out-the-vote drive for those expected to vote Republican



– Democrats: Demzilla

• Also about 165 million voters
• Names typically have 200 to 400 information items



Medical Diagnosis

J. Morris, Health Management Technology Nov 2004, 20,22-24



• Electronic Medical Records

– Associated Cardiovascular Consultants

• 31 physicians
• 40,000 patients per year, southern NJ



– Data mined to identify efficient medical practice
– Enhance patient outcomes
– Reduced medical liability insurance



Mayo Clinic

Swartz, Information Management Journal Nov/Dec 2004, 8



• IBM developed EMR program

– Complete records on almost 4.4 million patients
– Doctors can ask how the last 100 Mayo patients with the same gender, age, and medical history responded to particular treatments



Retail Outlets

• Bar coding & Scanning generate masses of data

– customer service
– inventory control
– MICROMARKETING
– CUSTOMER PROFITABILITY ANALYSIS
– MARKET BASKET ANALYSIS



FINGERHUT

• Founded 1948

– today sends out 130 different catalogs to over 65 million customers
– 6 terabyte data warehouse
– 3000 variables on the 12 million most active customers
– over 300 predictive models

• Focused marketing



Fingerhut

• Purchased by Federated Department Stores for $1.7 billion in 1999 (for database)
• Fingerhut had $1.6 to $2 billion business per year, targeted at lower-income households
• Can mail 400,000 packages per day
• Each product line has its own catalog



Fingerhut

• Uses segmentation, decision tree, regression, neural network tools from SAS and SPSS
• Segmentation - combines order & demographic data with product offerings

– can target mailings to greatest payoff

• customers who recently had moved tripled their purchasing 12 weeks after the move
• send furniture, telephone, decoration catalogs



Data for SEGMENTATION

cluster                     indices
subj  age  income  marital  grocery  dine out  savings
1001   53   80000  wife         180        90    30000
1002   48  120000  husband      120       110    20000
1003   32   90000  single        30       160     5000
1004   26   40000  wife          80        40        0
1005   51   90000  wife         110        90    20000
1006   59  150000  wife         160       120    30000
1007   43  120000  husband      140       110    10000
1008   38  160000  wife          80       130    15000
1009   35   70000  single        40       170     5000
1010   27   50000  wife         130        80        0



Initial Look at Data

• Want to know features of those who spend a lot dining out
• INCLUDE AS MANY ACTIONABLE VARIABLES AS POSSIBLE

– things you can identify



• Manipulate data

– sort on most likely indicator (dine out)




Sorted by Dine Out

cluster                     indices
subj  age  income  marital  grocery  dine out  savings
1004   26   40000  wife          80        40        0
1010   27   50000  wife         130        80        0
1001   53   80000  wife         180        90    30000
1005   51   90000  wife         110        90    20000
1002   48  120000  husband      120       110    20000
1007   43  120000  husband      140       110    10000
1006   59  150000  wife         160       120    30000
1008   38  160000  wife          80       130    15000
1003   32   90000  single        30       160     5000
1009   35   70000  single        40       170     5000



Analysis

• Best indicators

– marital status
– groceries



• Available

– marital status might be easier to get
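The sort and the comparison by marital status can be sketched in a few lines of Python. This is only an illustration on the ten rows of the "Data for SEGMENTATION" slide; the variable names are made up for the example, and averaging the dine-out index per marital status is one simple way to look at the indicator, not necessarily the analysis used for the slides.

```python
from collections import defaultdict

# Rows copied from the "Data for SEGMENTATION" slide:
# (subject, age, income, marital, grocery index, dine-out index, savings)
subjects = [
    (1001, 53, 80000, "wife", 180, 90, 30000),
    (1002, 48, 120000, "husband", 120, 110, 20000),
    (1003, 32, 90000, "single", 30, 160, 5000),
    (1004, 26, 40000, "wife", 80, 40, 0),
    (1005, 51, 90000, "wife", 110, 90, 20000),
    (1006, 59, 150000, "wife", 160, 120, 30000),
    (1007, 43, 120000, "husband", 140, 110, 10000),
    (1008, 38, 160000, "wife", 80, 130, 15000),
    (1009, 35, 70000, "single", 40, 170, 5000),
    (1010, 27, 50000, "wife", 130, 80, 0),
]

# Sort on the most likely indicator (dine out), lowest to highest.
by_dine_out = sorted(subjects, key=lambda row: row[5])

# Average dine-out index for each marital status.
groups = defaultdict(list)
for row in subjects:
    groups[row[3]].append(row[5])
mean_dine_out = {status: sum(vals) / len(vals) for status, vals in groups.items()}

for row in by_dine_out:
    print(row)
print(mean_dine_out)
```

On this small sample the single subjects have the highest average dine-out index, which is consistent with marital status being listed as a best indicator.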






Fingerhut

• Mailstream optimization

– which customers are most likely to respond to existing catalog mailings
– saves nearly $3 million per year
– reversed trend of catalog sales industry in 1998
– reduced mailings by 20% while increasing net earnings to over $37 million



Banking

• Among first users of data mining
• Used to find out what motivates their customers (reduce churn)
• Loan applications
• Target marketing

• Norwest: 3% of customers provided 44% of profits
• Bank of America: program cultivating top 10% of customers



CREDIT SCORING

Bank Loan Applications

Age  Income  Assets   Debts   Want  On-time
 24   55557   27040   48191   1500        1
 20   17152   11090   20455    400        1
 20   85104       0   14361   4500        1
 33   40921   91111   90076   2900        1
 30   76183  101162  114601   1000        1
 55   80149  511937   21923   1000        1
 28   26169   47355   49341   3100        0
 20   34843       0   21031   2100        1
 20   52623       0   23054  15900        0
 39   59006  195759  161750    600        1



Characteristics of Not On-time

Age  Income  Assets   Debts   Want  On-time
 28   26169   47355   49341   3100        0
 20   52623       0   23054  15900        0

Here:
– Debts exceed Assets
– Age Young
– Income Low

BETTER: Base on statistics, large sample; supplement data with other relevant variables
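A small check shows why reading rules off two rows is risky and statistics on a larger sample are better. The sketch below uses the ten rows of the Bank Loan Applications slide and compares late-payment rates for applicants whose debts exceed their assets against everyone else; `late_rate` is a helper invented for this example. In these ten rows, seven applicants have debts above assets and five of those still paid on time.

```python
# Rows copied from the "Bank Loan Applications" slide:
# (age, income, assets, debts, want, on_time)
applications = [
    (24, 55557, 27040, 48191, 1500, 1),
    (20, 17152, 11090, 20455, 400, 1),
    (20, 85104, 0, 14361, 4500, 1),
    (33, 40921, 91111, 90076, 2900, 1),
    (30, 76183, 101162, 114601, 1000, 1),
    (55, 80149, 511937, 21923, 1000, 1),
    (28, 26169, 47355, 49341, 3100, 0),
    (20, 34843, 0, 21031, 2100, 1),
    (20, 52623, 0, 23054, 15900, 0),
    (39, 59006, 195759, 161750, 600, 1),
]


def late_rate(rows):
    """Share of applicants in rows who did not pay on time (hypothetical helper)."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r[5] == 0) / len(rows)


# Split on the candidate rule "debts exceed assets".
debts_exceed = [r for r in applications if r[3] > r[2]]
others = [r for r in applications if r[3] <= r[2]]

print(len(debts_exceed), late_rate(debts_exceed))
print(len(others), late_rate(others))
```

With only ten rows this is an illustration, not evidence; the same comparison on a large sample is what the slide recommends.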




CHURN

• Customer turnover
• critical to:
– telecommunications
– banks
– human resource management
– retailers



Identify characteristics of those who leave

Age    Time-job  Time-town  min bal  checking
years  months    months     $
27     12        12         549      x
41     18        41         3259     x
28     9         15         286      x
55     301       5          2854     x
43     18        18         1112     x
29     6         3          0        x
38     55        20         321      x
63     185       3          2175     x
26     15        15         386      x
46     13        12         1187     x
37     32        25         1865     x

[Columns savings, card, loan: x marks for some customers; row positions not recoverable]



Analysis

• What are the characteristics of those who leave?

– Correlation analysis



• Which customers do you want to keep?

– Customer value - net present value of customer to the firm






Correlation

              Age     Job    Town Min-Bal   Check  Saving    Card    Loan
Age           1.0
Time Job      0.6     1.0
Time Town     0.4     0.9     1.0
min-bal      -0.4    -0.6    -0.5     1.0
check         0.0     0.1    -0.1    -0.2     1.0
saving        0.4     0.6     0.3     0.3     0.5     1.0
card          0.2     0.9     0.5     0.6     0.2     0.9     1.0
loan          0.3    -0.2     0.4    -0.1     0.2     0.3     0.5     1.0
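Each cell of a correlation matrix like this is a pairwise correlation coefficient. As a minimal sketch, assuming the Pearson coefficient is the one intended (the slides do not say), the function below computes it for two columns. The example data are the age and time-on-job columns from the slide "Identify characteristics of those who leave"; the sketch does not attempt to reproduce the matrix on the slide, whose source data the slides do not identify.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


# Age (years) and time on job (months), eleven rows from the churn slide.
age = [27, 41, 28, 55, 43, 29, 38, 63, 26, 46, 37]
time_job = [12, 18, 9, 301, 18, 6, 55, 185, 15, 13, 32]

r = pearson(age, time_job)
print(round(r, 2))
```

Values near 1 or -1 indicate a strong linear relationship; values near 0 indicate little linear relationship.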



Mortgage Market

• Early 1990s - massive refinancing
• need to keep customers happy to retain
• contact current customers who have rates significantly higher than market
– a major change in practice
– data mining & telemarketing increased Crestar Mortgage’s retention rate from 8% to over 20%



Banking

• Fleet Financial Group

– $30 million data warehouse
– hired 60 database marketers, statistical/quantitative analysts & DSS specialists
– expect to add $100 million in profit by 2001



Banking

• First Union

– concentrated on contact-point
– previously had very focused product groups, little coordination
– Developed offers for customers



CREDIT SCORING

• Data warehouse including demand deposits, savings, loans, credit cards, insurance, annuities, retirement programs, securities underwriting, other

• Statistical & mathematical models (regression) to predict repayment



CUSTOMER RELATIONSHIP MANAGEMENT (CRM)

• understanding value customer provides to firm

– Kathleen Khirallah - The Tower Group

• Banks will spend $9 billion on CRM by end of 1999



– Deloitte

• only 31% of senior bank executives confident that their current distribution mix anticipated customer needs




Customer Value

Middle aged (41-55), 3-9 years on job, 3-9 years in town, savings account

rate 1.3
year  annual purchases  profit  discounted   net
   1              1000     200         153   153
   2              1000     200         118   272
   3              1000     200          91   363
   4              1000     200          70   433
   5              1000     200          53   487
   6              1000     200          41   528
   7              1000     200          31   560
   8              1000     200          24   584
   9              1000     200          18   603
  10              1000     200          14   618



Younger Customer

Young (21-29), 0-2 years on job, 0-2 years in town, no savings account

rate 1.3
year  annual purchases  profit  discounted   net
   1               300      60          46    46
   2               360      72          43    89
   3               432      86          39   128
   4               518     104          36   164
   5               622     124          34   198
   6               746     149          31   229
   7               896     179          29   257
   8              1075     215          26   284
   9              1290     258          24   308
  10              1548     310          22   331
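The two customer value tables can be approximated with a short calculation. The sketch below rests on assumptions inferred from the numbers rather than stated on the slides: profit is 20% of annual purchases, each year's profit is divided by 1.3 raised to the year number, "net" is the running total of discounted profit, and the younger customer's purchases grow 20% per year from 300. Under those assumptions the results agree with the slide values to within about a dollar (the slides show whole numbers).

```python
def cumulative_discounted_profit(purchases, margin=0.20, factor=1.3):
    """Return (discounted, cumulative) profit lists, one entry per year.

    margin and factor are assumptions inferred from the slide values.
    """
    discounted, cumulative = [], []
    total = 0.0
    for year, amount in enumerate(purchases, start=1):
        value = amount * margin / factor ** year  # discount year-by-year
        total += value
        discounted.append(value)
        cumulative.append(total)
    return discounted, cumulative


# Middle-aged customer: flat purchases of 1000 per year for ten years.
middle_aged = [1000] * 10
# Younger customer: purchases start at 300 and grow 20% per year (assumed).
younger = [300 * 1.2 ** k for k in range(10)]

mid_disc, mid_cum = cumulative_discounted_profit(middle_aged)
young_disc, young_cum = cumulative_discounted_profit(younger)

print(round(mid_cum[-1], 2), round(young_cum[-1], 2))
```

Over ten years the middle-aged customer is worth roughly twice the younger one on this measure, even though the younger customer's annual purchases end up higher.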



Credit Card Management

• Very profitable industry
• Card surfing - pay old balance with new card
• promotions typically generate 1000 responses, about 1%
• in early 1990s, almost all mass-marketing
• data mining improves (lift)



LIFT

• LIFT = probability in class by sample divided by probability in class by population
– if population probability is 20% and sample probability is 30%, LIFT = 0.3/0.2 = 1.5

• best lift not necessarily best
– need sufficient sample size
– as confidence increases, longer list but lower lift
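The lift definition translates directly into code. The function name and argument names below are illustrative only; the call reproduces the 30% versus 20% example from the slide.

```python
def lift(sample_rate, population_rate):
    """Lift = rate of the class in the selected sample / rate in the population."""
    if population_rate <= 0:
        raise ValueError("population_rate must be positive")
    return sample_rate / population_rate


# Slide example: population probability 20%, sample probability 30%.
print(round(lift(0.30, 0.20), 2))
```

A lift above 1 means the selected sample contains the class more often than the population as a whole.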



Lift Example

• Product to be promoted
• Sampled over 10 identifiable segments of potential buying population
– Profit $50 per item sold
– Mailing cost $1
– Sorted by estimated response rates



Lift Data

Seg   Rate    Rev  Cost  Profit
  1  0.042  $2.10    $1   $1.10
  2  0.035  $1.75    $1   $0.75
  3  0.025  $1.25    $1   $0.25
  4  0.017  $0.85    $1  -$0.15
  5  0.015  $0.75    $1  -$0.25
  6  0.013  $0.65    $1  -$0.35
  7  0.009  $0.45    $1  -$0.55
  8  0.005  $0.25    $1  -$0.75
  9  0.004  $0.20    $1  -$0.80
 10  0.001  $0.05    $1  -$0.95
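The lift data can be recomputed from the response rates alone, using the $50 profit per item and $1 mailing cost from the Lift Example slide. The sketch below works per piece mailed and assumes the ten segments are the same size, which the slides do not state; the variable names are illustrative.

```python
PROFIT_PER_SALE = 50.0  # from the Lift Example slide
MAILING_COST = 1.0      # from the Lift Example slide, per piece mailed

# Estimated response rates for segments 1..10, already sorted high to low.
rates = [0.042, 0.035, 0.025, 0.017, 0.015, 0.013, 0.009, 0.005, 0.004, 0.001]

# Cumulative profit per piece as each further segment is added to the mailing.
cum_profit = []
running = 0.0
for rate in rates:
    running += rate * PROFIT_PER_SALE - MAILING_COST
    cum_profit.append(round(running, 2))

# Number of segments to mail for the highest cumulative profit.
best_cutoff = cum_profit.index(max(cum_profit)) + 1

# Cumulative share of all responses captured (equal segment sizes assumed).
total_rate = sum(rates)
cum_share = []
captured = 0.0
for rate in rates:
    captured += rate
    cum_share.append(captured / total_rate)

print(cum_profit)
print(best_cutoff)
print([round(share, 3) for share in cum_share])
```

Cumulative profit peaks after the third segment, the last one whose expected revenue per piece exceeds the mailing cost; mailing to all ten segments loses money overall.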



Lift Chart

[Chart: LIFT. Cumulative Proportion (0 to 1.2) by Segment (0 to 10); series: Cum Response, Random]



Profit Impact

[Chart: PROFIT. Dollars (-4 to 12) by Segment (0 to 10); series: Cum Revenue, Cum Cost, Cum Profit]



INSURANCE

• Marketing, as retailing & banking
• Special:

– Farmers Insurance Group - underwriting system generating $ millions in higher revenues, lower claims

• 7 databases, 35 million records



– better understanding of market niches

• lower rates on sports cars, increasing business




Insurance Fraud

• Specialist criminals - multiple personas
• InfoGlide specializes in fraud detection products
– similarity search engine
• link names, telephone numbers, streets, birthdays, variations
• identify 7 times more fraud than exact-match systems



Insurance Fraud - Link Analysis

claim type  amount  physician  attorney
back         50000  Welby      McBeal
neck         80000  Frank      Jones
arm          40000  Barnard    Fraser
neck         80000  Frank      Jones
leg          30000  Schmidt    Mason
multiple    120000  Heinrich   Feiffer
neck         80000  Frank      Jones
back         60000  Schwartz   Nixon
arm          30000  Templer    White
internal    180000  Weiss      Richards
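The pattern in the link-analysis data is a physician and attorney who keep appearing together. Counting repeated physician/attorney pairs is a very simple stand-in for link analysis, shown here only to make the idea concrete; it is not the method of any product named in these slides.

```python
from collections import Counter

# Rows copied from the "Insurance Fraud - Link Analysis" slide:
# (claim type, amount, physician, attorney)
claims = [
    ("back", 50000, "Welby", "McBeal"),
    ("neck", 80000, "Frank", "Jones"),
    ("arm", 40000, "Barnard", "Fraser"),
    ("neck", 80000, "Frank", "Jones"),
    ("leg", 30000, "Schmidt", "Mason"),
    ("multiple", 120000, "Heinrich", "Feiffer"),
    ("neck", 80000, "Frank", "Jones"),
    ("back", 60000, "Schwartz", "Nixon"),
    ("arm", 30000, "Templer", "White"),
    ("internal", 180000, "Weiss", "Richards"),
]

# Count how often each physician/attorney pair appears together.
pair_counts = Counter((physician, attorney) for _, _, physician, attorney in claims)

# Pairs linked by more than one claim deserve a closer look.
repeated = {pair: n for pair, n in pair_counts.items() if n > 1}
print(repeated)
```

Here the only repeated pair is Frank and Jones, linked by three identical neck claims of 80000 each.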



Insurance Fraud

• Analytics’ NetMap for Claims

– uses industry-wide database
– creates data mart of internal, external data
– unusual activity for specific chiropractors, attorneys

• HNC Insurance Solutions
– workers compensation fraud

• VeriComp - predictive software (neural nets)
– saved Utah over $2 million



TELECOMMUNICATIONS

• Deregulation - widespread competition

– churn

• 1/3rd poor call quality, 1/2 poor equipment



– wireless performance monitor tracking

• reduced churn about 61%, $580,000/year



– cellular fraud prevention
– spot problems when cell phones begin to go bad



Telecommunications

• Metapath’s Communications Enterprise Operating System

– help identify telephone customer problems

• dropped calls, mobility patterns, demographics
• to target specific customers



– reduce subscription fraud

• $1.1 billion



– reduce cloning fraud

• cost $650 million in 1996




Telecommunications

• Churn Prophet, ChurnAlert

– data mining to predict subscribers who cancel



• Arbor/Mobile

– set of products, including churn analysis






TELEMARKETING

• MCI uses data marts to extract data on prospective customers

– typically a 2 month program
– 20% improvement in sales leads
– multimillion investment in data marts & hardware
– staff of 45
– trend spotting (which approaches specific customers like)



Telemarketing

• Australian Tourist Commission

– maintained database since 1992

• responses to travel inquiries on tours, hotels, airlines, travel agents, consumers
• data mine to identify travel agents & consumers responding to various media
• sales closure rate at 10% and up
• lead lists faxed weekly to productive travel agents



Telemarketing

• Segmentation

– which customers respond to new promotions, to discounts, to new product offers
– Determine who
• to offer new service to
• is most likely to commit fraud



Human Resource Management

• Identify individuals liable to leave company without additional compensation or benefits
• Firm may already know 20% use 80% of offered services
– don’t know which 20%
– data mining (business intelligence) can identify

• Use most talented people in highest priority (or most profitable) business units



Human Resource Management

• Downsizing

– identify right people, treat them well
– track key performance indicators
– data on talents, company needs, competitor requirements

• State of Mississippi’s MERLIN network
– 30 databases (finance, payroll, personnel, capital projects)
– Cognos Impromptu system - 230 users



CASINOS

• Casino gaming one of richest data sets known
• Harrah’s - incentive programs
– about 8 million customers hold Total Gold cards, used whenever the customer spends money in the casino
– comprehensive data collection

• Trump’s Taj Card similar



Casinos

• Bellagio & Mandalay Bay
– strategy of luxury visits
– child entertainment
– change from old strategy - cheap food

• Identify high rollers - cultivate
– identify those to discourage from play
– estimate lifetime value of players



ARTS

• computerized box offices lead to high volumes of data
• Identify potential consumers for shows
• software to manage shows
– similar to airline seating chart software



