What is Data Mining - UCLA Computer Science

    Knowledge Discovery from
    Databases (KDD)
  A.K.A. Data Mining &
  by other names as well

          Carlo Zaniolo
          UCLA CS Dept




            What is Data Mining?
Data mining
  Extraction of interesting (non-trivial, implicit,
   previously unknown & potentially useful) patterns
   or knowledge from huge amounts of data.
Alternative names
  Knowledge discovery (mining) in databases
   (KDD), knowledge extraction, data/pattern
   analysis, data archeology, data dredging,
   information harvesting, business intelligence, ...




                 Why Data Mining?

 Explosive growth of data available—the Big-Data
  Revolution
      Business: Web, e-commerce, transactions, stocks, …
      Science: Remote sensing, bioinformatics, scientific
       simulation, …
      Society and everyone: news, digital cameras, ...

 We are drowning in data -- but starving for
  knowledge!
   Knowledge is the key to improving your business and
    operations
   Data Mining tools and techniques: automate knowledge
    discovery from large data sets

                  DM Applications
E.g.: Marketing products to customers:
1. Find clusters of customers who share the same
   characteristics: interests, income level, spending
   habits, etc.
2. Determine customer purchasing patterns over time.
3. Cross-market analysis: find associations/correlations
   between product sales (and predict on that basis).
4. Profiling: what types of customers buy what
   products.
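
As an illustration of the cross-market analysis item, here is a minimal sketch of how associations between product sales might be measured with support and confidence. The baskets, the function name `pair_rules`, and the `min_support` threshold are hypothetical and are not taken from the slides.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy market baskets (illustrative only).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
]

def pair_rules(baskets, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(baskets)
    # How many baskets contain each single item.
    item_counts = Counter(item for b in baskets for item in b)
    # How many baskets contain each unordered pair of items.
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(b), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of baskets containing both items
        if support < min_support:
            continue
        # Confidence of "x -> y" is count(x and y) / count(x).
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return sorted(rules)

rules = pair_rules(baskets)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data every basket containing butter also contains bread, so the rule butter -> bread has confidence 1.0, while bread -> butter is weaker. Real association mining prunes candidate itemsets far more aggressively than this brute-force pair count.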


                   DM Applications:
              Fraud Detection and Security
 Approaches: Clustering & outlier detection, looking for
  unusual patterns.
 Applications: health care, retail, credit card services,
  telecommunications.
   Auto insurance: ring of collisions
   Money laundering: suspicious monetary transactions
   Medical insurance
       Professional patients, ring of doctors, and ring of references
       Unnecessary or correlated screening tests
   Telecommunications: phone-call fraud
       Phone call model: destination of the call, duration, time of day
        or week. Analyze patterns that deviate from an expected norm
   Anti-terrorism
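
To make "patterns that deviate from an expected norm" concrete, here is a minimal sketch of outlier detection on a single phone-call attribute. It assumes a simple z-score rule; the call durations, the function name `flag_outliers`, and the threshold are hypothetical choices, not a method prescribed by the slides.

```python
from statistics import mean, stdev

def flag_outliers(values, threshold=2.0):
    """Return indices of values more than `threshold` sample standard
    deviations away from the mean (a plain z-score rule)."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical: nothing deviates
    return [i for i, v in enumerate(values)
            if abs(v - mu) / sigma > threshold]

# Hypothetical call durations in minutes for one subscriber.
durations = [3, 4, 2, 5, 3, 4, 3, 120, 4, 2]
suspicious = flag_outliers(durations)
print(suspicious)  # index of the 120-minute call
```

The mean and standard deviation are themselves pulled toward the outlier, which is why a low threshold is used here. A practical model would combine several attributes (destination, duration, time of day or week) and would likely rely on more robust statistics.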

                 New Applications
Software Bug Mining
Graph Mining: e.g. finding social networks
Web Mining
Personalization and recommendations
Mining and Scientific Applications—Biology
Spatio-Temporal and GIS:
  Find geographical clusters.
  Mine for trajectories and travel plans.
Multi Relational Data Mining
  Mining for knowledge and relationships from
   multiple tables, as in
  Inductive Logic Programming.
            New Research Topics


Theoretical foundations
Statistical Data Mining
Visual Data Mining
Privacy-Preserving Data Mining




          A Historical Perspective

1. Machine Learning (AI)
2. Decision Support Environments:
   Scalability, Integration, Warehousing,
   OLAP (DB)
3. Statistical foundation and synergism with
   other disciplines—e.g., visualization.
4. Mining Streams of sensor & web data




                        Work plan


 Introduction
   Core Techniques:
       1. Classification,
       2. Association, and
       3. Clustering

 Process and Systems

 New Applications and Research Directions



        Knowledge Discovery (KDD) Process
  Data mining: core of the knowledge discovery process

  [Figure: stages of the KDD process, from bottom to top]
  1. Data Sources: transactional & operational data
  2. Data Cleaning, Data Integration
  3. Data Warehouse
  4. Data Selection & preprocessing
  5. Task-Specific Data
  6. Data Mining
  7. Patterns & Rules
  8. Auditing
  9. Useful New knowledge
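
The stages of the KDD process can be sketched as a chain of small functions, each feeding the next. This is only a toy illustration: the records, the function names, and the frequency-count "mining" step are hypothetical stand-ins, not techniques taken from the slides.

```python
from collections import Counter

# Hypothetical records from two operational data sources.
sales = [
    {"customer": "c1", "item": "milk"},
    {"customer": "c2", "item": None},   # dirty record with a missing value
    {"customer": "c1", "item": "bread"},
    {"customer": "c3", "item": "milk"},
]
returns = [{"customer": "c3", "item": "milk"}]

def clean(records):
    """Data cleaning: drop records that have missing fields."""
    return [r for r in records if all(v is not None for v in r.values())]

def integrate(*sources):
    """Data integration: merge named sources into one warehouse table."""
    return [dict(r, source=name) for name, recs in sources for r in recs]

def select(warehouse, source):
    """Data selection: keep only the task-specific data."""
    return [r["item"] for r in warehouse if r["source"] == source]

def mine(items):
    """Data mining (stand-in): count how often each item occurs."""
    return Counter(items).most_common()

def audit(patterns, min_count=2):
    """Auditing: keep only patterns that are frequent enough to be useful."""
    return [(item, n) for item, n in patterns if n >= min_count]

warehouse = integrate(("sales", clean(sales)), ("returns", clean(returns)))
task_data = select(warehouse, "sales")
knowledge = audit(mine(task_data))
print(knowledge)
```

Each function corresponds to one stage of the process, so any stage (for example, the mining step) can be swapped for a real technique without touching the others.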

								