SQL Server Data Mining Presentation
SSAS Mining Model Demo – Target Profitable Potential Customers
Explain the Database: Adventure works is a company that sells bicycles and bike parts.
Problem: Have a new list of potential customers and would like to see which customers would be
most likely to buy bike for a phone call marketing plan.
1. Start Business Intelligence Development Studio
2. Create a new integration services project called “Data Preparation” in a solution call
“Data Mining Demo”.
3. Create a data flow task and open the data flow task.
4. Add an OLE DB Data Source and connect it to the AdventureWorksDW database.
Select the vTargetMail view which has targeted customer information and whether they
bought a bike or not.
5. Add a Percentage Sampling transform to the data flow and connect the output of the
OLEDB data source to it. Use the following Settings:
a. Percentage of Rows = 10
b. Sample Output Name = “Test Data Output”
c. Unselected Output Name = “Training Data Output”
6. Create two (2) OLEDB Destinations. Label the first “Test Data” and the second
7. Map the outputs from the Percentage Sampling transform to the appropriate targets.
8. Connect the OLEDB Destinations to the AdventureWorksDW database and have each
of them create a new table. Name the new tables as follows (they should default to this in
a. [Test Data]
b. [Training Data]
9. Map the columns (should happen automatically when you click on “Mappings”)
10. Run the package and review the tables in Management Studio.
1. Add a new Analysis Services project called “Data Mining” in the same solution.
2. Create a new Data Source and call it Customer Data Source. It should point to the
AdventureWorksDW database. . (Use service account)
3. Create a new Data Source View
4. Add the Test Data and Training Data and ProspectiveBuyer tables to the DSV. No
relationships are necessary and name the view Customer Data.
5. Create a new Mining Structure for an existing relational database.
6. Select the Decision Tree Modeling Technique. Briefly show the other techniques and
note that custom algorithms can be added.
7. Select the DSV you created above (Customer Data) and click NEXT.
8. Select the Training Data table as the Case table and click NEXT.
9. Specify the Training Data as follows:
a. Predictable = BikeBuyer
b. Input = All other columns (except date of first purchase causes new customers
to all have the same probability)
c. Key = CustomerKey
d. Be sure to uncheck BikeBuyer from Input and click NEXT.
10. Specify Columns Content Data Type by using the smart detect (Detect) button. Explain
how the BikeBuyer column changed to Discrete as a result of data sampling.
Note: Discrete means that a numeric column can have only a finite quantity of values.
Continuous means that the any quantity of values (any number) can be in the data.
BikeBuyer is Discrete because it must be either 1 or 0. click NEXT.
11. Name the Mining Structure TargetCustomers. Name the Mining Model
TargetCustomersDT for Decision Tree and click FINISH.
12. View the Mining Structure tab and Mining Models Tab.
13. On the Mining Models Tab, Right Click the white space can select New Mining Model.
14. Name the mining model TargetCustomersNB and select Microsoft Naïve Bayes for the
15. A warning appears that Naïve Bayes does not support continuous data type and those
fields will be ignored in the model. Click YES to continue.
16. Deploy the DataMining Project.
1. Select the Mining Model Viewer Tab and Review the DT Model, Expand all and select 1
for background and select a bar with the darkest blue and look at Mining Legend
description. Select the Dependency Network and use link slider to see importance of
2. Review the NB Model, look at tabs, on the attribute characteristics tab change value to 1.
Look at Attribute Discrimination
3. Show the Mining Accuracy feature by selecting the Mining Accuracy Chart Tab. Select
the Test Data table as the Input Data for comparing the models and note the mappings.
Set the Predict Value to 1 for both models
4. Select the Lift Chart Tab to show how the two models compare to an ideal model.
5. Change the chart type to Profit chart type to compare the two models potential
Population: 5000 new customersFixed Cost: 1000
Individual Cost: 5 Revenue per individual: 10
6. Which model gives maximum profit?
Apply Model Creating a Report
1. Add a new Report Server Project Wizard to the Solution and name it Customers to
2. Select Analysis Services as the type of data source and click EDIT to edit connection
3. type localhost for the server name and select DataMining as the database to connect to.
Test Connection and click OK. And then click Next.
4. Select the Query Builder shows the prediction query builder.
5. Select the decision tree model TargetCustomersDT.
6. Select ProspectiveBuyer for the case table.
7. Fix mappings for education and occupation.
8. Drag firstname, lastname and phone from ProspectiveBuyer to list at bottom.
9. Add a Prediction Function as a where clause and select Predict Probability and give it
an alias of Prob.
10. Add Parameters BikeBuyer = 1 by dragging bikebuyer from model to criteria/argument
11. Add another Prediction Function and select Predict Probability, uncheck show and add
condition for where clause by coping previous criteria and pasting it in to look like:
([TargetCustomersDT].[Bike Buyer],1) > 0.40
Click last tool in query toolbar to view sql.
12. Click the Prepare Query button to test (next to last button)
13. Click Ok and click Next.
14. Select tabular for the report type and click Next.
15. Add all columns to Detail section.
16. select defaults for style.
17. change report server to http://localhost/reportserver
18. Name the report Customers to Call and click Finish.
19. Preview Report.
Optionally, Duplicate report properties using Mining Model Prediction Tab in BIDS to