Data Mining

 Mandeep Jandir
    CS157B
       What is Data Mining?
Data mining, or knowledge discovery, is
 the process of discovering hidden patterns
 and relationships in data in order to make
 better and more informed decisions.
Data mining tools predict behaviors and
 future trends, allowing businesses to make
 knowledge-driven decisions.
        Why use Data Mining?
Data mining is a technique that helps
 individuals or companies find useful
 information in large amounts of data so
 that they can make better decisions.
     Reduce risks
     Find problems and issues
     Save money
     Make high-confidence predictions
     Simplify information
          Goals of Data Mining
Prediction
     Data mining can show how certain attributes
      within the data will behave in the future.
     Ex. - certain seismic wave patterns may
      predict an earthquake with high probability.
Identification
     Data patterns can be used to identify the
      existence of an item, an event, or an activity.
  Goals of Data Mining (cont’d)
Classification
     Data mining can partition the data so that
      different classes or categories can be
      identified based on combinations of
      parameters.
     Ex. - customers in a supermarket can be
      categorized into discount-seeking shoppers,
      shoppers in a rush, loyal regular shoppers,
      shoppers attached to name brands, and
      infrequent shoppers.
  Goals of Data Mining (cont’d)
Optimization
     Optimize the use of limited resources such as
      time, space, money, or materials and
      maximize output variables such as sales or
      profits under a given set of constraints.
Types of Knowledge Discovered
      during Data Mining
Knowledge is often classified as inductive
 versus deductive.
     Deductive knowledge is new information
      derived by applying pre-specified logical
      rules of deduction to the given data.
     Data mining addresses inductive knowledge:
      new rules and patterns discovered from
      the supplied data.
Types of Knowledge Discovered
   during Data Mining (cont’d)
It is common to describe knowledge
 discovered during data mining as:
     Association rules
     Classification hierarchies
     Sequential patterns
     Patterns within time series
     Clustering
   Types of Association Rules
Market-Basket Model, Support, and
 Confidence
Apriori Algorithm
Sampling Algorithm
Frequent-Pattern Tree Algorithm
Partition Algorithm
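The slides name support and confidence without defining them. A minimal sketch under the standard definitions: the support of an itemset is the fraction of transactions that contain it, and the confidence of a rule X => Y is support(X and Y together) divided by support(X). The function names and the four example baskets are made up for illustration and are not from the slides.

```python
# Illustrative market-basket data (hypothetical, not from the slides).
baskets = [
    {"milk", "bread", "cookies", "juice"},
    {"milk", "juice"},
    {"milk", "eggs"},
    {"bread", "cookies", "coffee"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(lhs, rhs, transactions):
    """Confidence of the rule lhs => rhs: support(lhs | rhs) / support(lhs).

    Assumes lhs occurs in at least one transaction; otherwise this divides by zero.
    """
    both = set(lhs) | set(rhs)
    return support(both, transactions) / support(lhs, transactions)


print(support({"milk", "juice"}, baskets))
print(confidence({"milk"}, {"juice"}, baskets))
```

Here milk appears in three of the four baskets and milk with juice in two, so the rule milk => juice holds in two out of the three baskets that contain milk.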
           Apriori Algorithm
Principle: Any subset of a frequent itemset
 must be frequent.
Generate candidate k-itemsets by joining
 large (k-1)-itemsets, then delete any
 candidate that is not large (frequent).
Notation:
      Apriori Algorithm (cont’d)
Input: A database D of m transactions and
 a minimum support, mins, expressed as
 a fraction of m.
Output: Frequent itemsets L1, L2, …, Lk
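The join-and-prune idea and the input/output above can be sketched in Python as follows. This is a simplified illustration, not the exact pseudocode from the cited sources: the function name `apriori`, the use of frozensets, and the example database are assumptions, and the support counting rescans the whole database for every candidate, which a practical implementation would avoid.

```python
from itertools import combinations


def apriori(transactions, mins):
    """Return {k: set of frequent k-itemsets}, i.e. L1, L2, ..., Lk.

    transactions: list of sets (the database D of m transactions)
    mins: minimum support, as a fraction of m
    """
    m = len(transactions)
    min_count = mins * m

    def frequent(candidates):
        # Keep only candidates contained in at least min_count transactions.
        return {c for c in candidates
                if sum(1 for t in transactions if c <= t) >= min_count}

    # L1: frequent single items.
    current = frequent({frozenset([i]) for t in transactions for i in t})
    levels = {}
    k = 1
    while current:
        levels[k] = current
        k += 1
        # Join step: union pairs of large (k-1)-itemsets that form a k-itemset.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be large.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        current = frequent(candidates)
    return levels


# Hypothetical example database, not taken from the slides.
D = [
    {"milk", "bread", "cookies", "juice"},
    {"milk", "juice"},
    {"milk", "eggs"},
    {"bread", "cookies", "coffee"},
]
levels = apriori(D, 0.5)
for k, itemsets in levels.items():
    print(k, sorted(sorted(s) for s in itemsets))
```

With mins = 0.5 an itemset must appear in at least two of the four transactions. The prune step is where the principle "any subset of a frequent itemset must be frequent" is applied: a candidate with an infrequent subset is dropped before its support is ever counted.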

				