Dimensions Overview

					The SPSS Portfolio

      Tim Daciuk
      Services Manager, Canada
      SPSS Inc.

       April 15, 2011




SPSS At A Glance

   Leadership
       Market leader in Predictive Analytics
       Focus on online & offline customer data acquisition and analysis

   Stability
       30+ year heritage in analytic technologies

   Proven track record
       250,000+ customers worldwide
       NASDAQ: SPSS

   Analytics standard
       80% of Fortune 500 are SPSS customers
       80%+ market share in the Survey & Market Research sector
       Ranked #1 Data Mining solution by KD Nuggets
Public Sector Customers




    Centers for Disease Control        Internal Revenue Service

    DHHS Office of Inspector General   TX Comptroller of Public Accounts

    NY Department of Public Health     UK Gov’t Communications Bureau

    Canada Revenue Agency              Department of Justice

    Centers for Medicare & Medicaid    USAF Leaders Project

    Florida Department of Revenue      US Department of State

    HM Customs & Excise                US Army Recruiting Command

    New York City Human Resources      US Joint Forces Command

    US Dept. of Agriculture

Predictive Analytics: Defined


 Predictive analysis helps connect data to effective
   action by drawing reliable conclusions about
       current conditions and future events.


              — Gareth Herschel, research director, Gartner group




Core Technologies
1. Statistical Analysis
2. Data Mining
3. Text Mining
4. Web Analytics
5. Data Collection
6. Deployment




Statistical Analysis




Descriptive Analysis
    Analytic software:
        Data displays (e.g., frequency distributions)
        Graphic displays of data (e.g., histogram)
        Measures of central tendency (e.g., mean, median)
        Estimates of variance (e.g., standard deviation)

    [Figure: histogram of "Satisfaction with service 1-10"; x-axis: rating (3.0 to 10.0), y-axis: Frequency (0 to 80); Mean = 8.3, Std. Dev = 1.65, N = 248]

SPSS product: SPSS Base
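
The descriptive measures listed on this slide can be sketched in a few lines of Python using only the standard library. This is an illustration, not SPSS Base output: the `ratings` list is made-up data on the same 1-10 satisfaction scale, not the 248 responses behind the histogram.

```python
import statistics
from collections import Counter

# Hypothetical satisfaction ratings on a 1-10 scale (NOT the slide's survey data).
ratings = [8, 9, 10, 7, 8, 9, 6, 10, 8, 9]

# Frequency distribution: how many respondents gave each rating.
frequencies = Counter(ratings)

# Measures of central tendency.
mean = statistics.mean(ratings)
median = statistics.median(ratings)

# Estimate of variability: sample standard deviation (n - 1 denominator).
std_dev = statistics.stdev(ratings)

# A plain-text histogram, one row per observed rating.
for value in sorted(frequencies):
    print(f"{value:>2} | {'#' * frequencies[value]}")
print(f"mean={mean:.2f} median={median} std_dev={std_dev:.2f} n={len(ratings)}")
```

The same four ingredients (a frequency table, a chart, a center, a spread) are what the histogram on the slide summarizes as Mean, Std. Dev and N.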
Inferential Analysis

   Predicting numerical or
    categorical outcomes
       Linear regression
       GLM Multivariate/Repeated
        Measures
       Non-linear regression
       Weighted least squares
       Two Stage Least Squares
       Survival Analysis/Cox regression
       Structural Equation Modeling
SPSS products: SPSS Base,
Regression Models, Advanced
Models & AMOS
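
As a rough illustration of the first technique on this slide, linear regression with one predictor can be fitted with the ordinary least squares closed form. The function name `fit_line` and the data are hypothetical; this is a sketch of the method, not of how any SPSS product implements it.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-products of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical, noise-free data generated from y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
intercept, slope = fit_line(xs, ys)

# Predict a numerical outcome for a new case.
predicted = intercept + slope * 6
print(f"intercept={intercept:.2f} slope={slope:.2f} prediction at x=6: {predicted:.2f}")
```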
Reporting
   Graphical software
       Visually communicate your
        results
       Create more visually
        compelling information




SPSS products: SPSS Base
(and Trinity)
Powerful Data Management

    Control Panel for OMS (Output Management System)
        Allows users to turn
         output into:
           XML
           HTML
           Text
           SPSS data file




Data Mining




Where does Data Mining fit?

   Three classes of data mining algorithms
       Prediction
       Association
       Clustering

   [Diagram: "Data Mining" at the center, linked to three tasks]
       Predict: Predict who is likely to exhibit specific behavior in the future.
       Associate: What events occur together? Given a series of actions, what action is likely to occur next?
       Cluster: Group cases that exhibit similar characteristics.

Profile and Predict: Supervised Learning

   Build a predictive profile of the historical outcome using a collection of potential input fields.
   Explores all combinations, interactions and contingencies.
   Use this profile to understand and predict future cases.

   [Figure: decision tree for "Credit ranking (1=default)"; each node lists Bad and Good as % (n), with the node's share of all cases in parentheses]

   All cases: Bad 52.01% (168), Good 47.99% (155); Total (100.00) 323
     Split on Paid Weekly/Monthly: P-value=0.0000, Chi-square=179.6665, df=1
       Weekly pay: Bad 86.67% (143), Good 13.33% (22); Total (51.08) 165
         Split on Age Categorical: P-value=0.0000, Chi-square=30.1113, df=1
           Young (< 25);Middle (25-35): Bad 90.51% (143), Good 9.49% (15); Total (48.92) 158
           Old ( > 35): Bad 0.00% (0), Good 100.00% (7); Total (2.17) 7
       Monthly salary: Bad 15.82% (25), Good 84.18% (133); Total (48.92) 158
         Split on Age Categorical: P-value=0.0000, Chi-square=58.7255, df=1
           Young (< 25): Bad 48.98% (24), Good 51.02% (25); Total (15.17) 49
             Split on Social Class: P-value=0.0016, Chi-square=12.0388, df=1
               Management;Clerical: Bad 0.00% (0), Good 100.00% (8); Total (2.48) 8
               Professional: Bad 58.54% (24), Good 41.46% (17); Total (12.69) 41
           Middle (25-35);Old ( > 35): Bad 0.92% (1), Good 99.08% (108); Total (33.75) 109
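
Each split in the tree on this slide is reported with a chi-square statistic. The sketch below recomputes the statistic for the first split (Paid Weekly/Monthly) from the node counts shown in the tree. It uses the likelihood-ratio form of chi-square, which is an assumption: the slide does not say which form was used, but for these counts the likelihood-ratio value lands close to the printed 179.6665, while the Pearson form does not. The function name is hypothetical.

```python
import math


def likelihood_ratio_chi_square(table):
    """Likelihood-ratio chi-square (G^2) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed == 0:
                continue  # a zero cell contributes nothing (0 * ln 0 is taken as 0)
            # Count expected in this cell if the row and column were independent.
            expected = row_totals[i] * col_totals[j] / total
            g2 += observed * math.log(observed / expected)
    return 2.0 * g2


# Counts read from the tree: rows are Weekly pay and Monthly salary,
# columns are Bad and Good credit ranking.
root_split = [
    [143, 22],
    [25, 133],
]
g2 = likelihood_ratio_chi_square(root_split)
df = (len(root_split) - 1) * (len(root_split[0]) - 1)
print(f"chi-square={g2:.2f} df={df}")
```

A large statistic on one degree of freedom is what lets the tree-growing procedure pick pay frequency as the first field to split on: the Bad/Good mix differs sharply between the two groups.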
Cluster and Associate: Unsupervised
Learning

   Find emerging
    patterns and unusual
    cases.
       Use data mining to examine
        the differences and shifts
        across all dimensions of the
        data.
       Select large groups to identify
        common patterns. Select
        small groups to identify
        unusual patterns.
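
Association, one of the unsupervised tasks named in this deck ("What events occur together?"), is commonly summarized with two ratios: support (how often two items appear together) and confidence (how often the second appears given the first). The sketch below computes both for item pairs. The `baskets` data and the function names are hypothetical, and this is not a description of the algorithms Clementine ships.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set holds the items seen together in one event.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "eggs"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so that each unordered pair is always stored under the same key.
    pair_counts.update(combinations(sorted(basket), 2))


def support(a, b):
    """Share of all baskets that contain both items."""
    return pair_counts[tuple(sorted((a, b)))] / len(baskets)


def confidence(antecedent, consequent):
    """Share of baskets containing the antecedent that also contain the consequent."""
    return pair_counts[tuple(sorted((antecedent, consequent)))] / item_counts[antecedent]


print(f"support(bread, milk) = {support('bread', 'milk'):.2f}")
print(f"confidence(bread -> milk) = {confidence('bread', 'milk'):.2f}")
```

Pairs with high support correspond to the "large groups" and common patterns mentioned above; rare pairs are where unusual cases show up.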



The Product: Clementine




Read your data in…




Define your Target and Predictors




Build a Rule-based Predictive Model




Text Mining




From Concepts to Predictive Analytics
Components




   LexiQuest Mine
       Discover concepts, relationships and trends

   LexiQuest Categorize
       Understand documents and assign them to pre-defined categories

   Linguistic Terminology Extractor (center of the diagram)

   Text Mining for Clementine
       Add text fields to data mining for better prediction

Underlying Technology is Linguistics-Based

 Text is:
    Unstructured
    Ambiguous
    Language dependent

 Linguistic Approach
    Does not treat a document as a bag of words
    Removes ambiguity by extracting structured
    concepts
    Concepts are the DNA of text
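
To make the contrast on this slide concrete, the toy sketch below compares a bag-of-words count with a lookup of multi-word terms against a small concept dictionary. Everything here is hypothetical (the `CONCEPTS` table, the function names, the sample sentences), and real linguistic extraction does far more than dictionary lookup; the point is only that word counts alone cannot tell "credit card" from a sentence that merely contains both words.

```python
from collections import Counter

# Hypothetical concept dictionary: multi-word terms mapped to one canonical concept.
CONCEPTS = {
    ("credit", "card"): "credit_card",
    ("call", "center"): "call_center",
}
MAX_TERM_LENGTH = max(len(term) for term in CONCEPTS)


def bag_of_words(text):
    """Count individual words, ignoring their order."""
    return Counter(text.lower().split())


def extract_concepts(text):
    """Scan left to right, matching the longest dictionary term at each position."""
    tokens = text.lower().split()
    found = []
    i = 0
    while i < len(tokens):
        for length in range(MAX_TERM_LENGTH, 0, -1):
            candidate = tuple(tokens[i:i + length])
            if candidate in CONCEPTS:
                found.append(CONCEPTS[candidate])
                i += length
                break
        else:
            i += 1  # no term starts at this token
    return Counter(found)


doc_a = "the call center lost my credit card"
doc_b = "call me about the credit on my card at the center"

shared_words = ["call", "center", "credit", "card"]
print([bag_of_words(doc_a)[w] for w in shared_words])  # all four words present
print([bag_of_words(doc_b)[w] for w in shared_words])  # all four words present
print(extract_concepts(doc_a))  # both concepts found
print(extract_concepts(doc_b))  # no concepts found
```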




Core Technology…
Linguistics Based Search Technology


      SPSS LexiMine
          Concept Extraction and Query Building




Classification Document Management

      SPSS Categorize
          Supervised Learning in Existing Systems

Text Mining for Clementine

Text Mining for Clementine consists of three
 nodes:
   Text Mining Source - uses LexiQuest Mine to automatically extract
    concepts, categories and frequencies from a set of documents

   Text Mining Process - uses LexiQuest Mine to automatically extract
    concepts, categories and frequencies from text data stored in a
    database, and links these results to structured data

   Document Viewer - displays the document or documents selected from
    the Text Mining Source node




Web Analytics




     Web Measurement Continuum

   [Diagram: three stages of increasing Insight, Value and ROI; each stage builds on the one before]

   Activity Counts (Web Stats)
       # Users
       # Visits
       # Page Views
       Top Pages
       Top Referrers
       # Errors

   Business Insight (Web Stats + Web Analytics)
       Recency
       Frequency
       Average Visit Streak
       Campaign Sales
       Eventstream
       Sectionstream

   Customer Intimacy (Web Stats + Web Analytics + Predictive Web Analytics)
       Predict Likelihood to Respond
       Automatic User Segments
       Content Clustering
       Significant Activity Sequences
       Content & Activity Associations
       Textbook Visits
       Homepage Bouncing

Web Mining for Clementine (WM4C)

   Takes web data preparation directly into
    Clementine – removes the need for NetGenesis

   Turns huge volumes of web logs into business
    events data

   Allows for very fast deployment of data mining on
    top of web data
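
The kind of preparation this slide describes, turning raw web log lines into event records that can then be counted or mined, can be sketched as follows. This is not how WM4C works internally. The sketch assumes logs in the Common Log Format, treats the client host as a stand-in for a user, counts every request as a page view, and starts a new visit after 30 minutes of inactivity; all names and sample lines are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Common Log Format: host ident authuser [time] "method path protocol" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)
SESSION_GAP = timedelta(minutes=30)  # assumed inactivity cutoff between visits


def parse_line(line):
    """Turn one log line into an event record, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    return {
        "host": match["host"],
        "time": datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z"),
        "path": match["path"],
        "status": int(match["status"]),
    }


def summarize(lines):
    """Compute simple activity counts from raw log lines."""
    events = sorted(
        (event for event in map(parse_line, lines) if event is not None),
        key=lambda event: event["time"],
    )
    last_seen = {}
    visits = 0
    for event in events:
        previous = last_seen.get(event["host"])
        if previous is None or event["time"] - previous > SESSION_GAP:
            visits += 1  # first request from this host, or a long pause since the last
        last_seen[event["host"]] = event["time"]
    return {
        "users": len(last_seen),
        "visits": visits,
        "page_views": len(events),
        "errors": sum(1 for event in events if event["status"] >= 400),
    }


sample = [
    '10.0.0.1 - - [15/Apr/2011:10:00:00 -0400] "GET /index.html HTTP/1.0" 200 1043',
    '10.0.0.1 - - [15/Apr/2011:10:05:00 -0400] "GET /products.html HTTP/1.0" 200 2310',
    '10.0.0.2 - - [15/Apr/2011:10:06:00 -0400] "GET /missing.html HTTP/1.0" 404 512',
    '10.0.0.1 - - [15/Apr/2011:11:30:00 -0400] "GET /index.html HTTP/1.0" 200 1043',
]
summary = summarize(sample)
print(summary)
```

Once requests are grouped into visits per user, the per-visit event records are the sort of "business events data" that a data mining model can take as input.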




Data Collection




Dimensions Capabilities




    Dimensions Objectives

   Software and Development Platform, not
    just a set of applications

   Design once, use many.

   Powerful: handles the most complex
    survey designs and multimodal
    deployment.

   Centralized: defines and translates
    metadata that drives all downstream
    processes
Dimensions Solutions
   Survey Design
       Interview Builder – web based (included w/ mrInterview)

   Data Collection Methods
       mrInterview data collection engine
            web-based and/or Call-Center functions available
       mrPaper – create paper surveys within MS Word
       mrPaper/mrScan – scan solution with Readsoft EHF
       CATI – call center solution software
       mrDialer – call center automated/predictive dialers
       Palm PDA – data collection with Techneos EntryWare




Dimensions Solutions

   Analysis/ Reporting/ Publishing
       mrTables – web based reporting and publishing
       mrStudio – desktop and script based - automate processes
       Dimensions Component Pack – server processing
       Interview Reporter for real-time reporting of web data

   mrTranslate: Managing Multiple Languages
       Easy-to-use tool that requires no research knowledge
       Writes directly to the questionnaire metadata
       Supports all single and double byte character sets
          European (Spanish, French, etc.)
          Double Byte (Japanese, Mandarin, Korean, etc.)




Deploying SPSS Models




  Deployment Solutions

   Data Collection
       Existing Data
       Survey Data
       Web Behavior
       Text Extraction

   Reporting/Analysis
       Text Mining for Clementine
       Web Mining for Clementine

   Deployment
       SPSS Web Deployment Framework
       Web Based Applications
       Predictive Marketing
       Predictive Call Center

Summary

   Predictive analytics

   Wide range of products
       Collect data
       Analyze data
       Mine data
       Score and handle data

   Wide range of applications
       Predict
       Group
       Associate
       Anomalies

Questions


            ?


Contacts
   Tim Daciuk                    Angie Mohr
   Services Manager, Canada      SPSS Sales, Canada
   416-410-7921                  613-599-3377
   tdaciuk@spss.com              amohr@spss.com




                   www.spss.com


				