Data Mining

  Dave Maung
What is Data Mining?

  The process of automatically searching large
   volumes of data for patterns.
  Also known as Knowledge Discovery in Databases (KDD).
Different types of Data Mining

  Relational data mining
  Text mining
  Web mining
Relational Data Mining

  A data mining technique for relational
   databases
  Relational data mining algorithms look
   for patterns among multiple tables
  Uses classification rules and association
   rules
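
The idea of a pattern that spans multiple tables can be sketched with a candidate association rule. This is only an illustration: the `customers` and `purchases` tables, their columns, and the `rule_stats` helper are hypothetical and not taken from the slides. It computes the two usual rule measures, support and confidence, for the rule "segment -> buys item".

```python
import sqlite3

# Hypothetical two-table database, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, segment TEXT);
    CREATE TABLE purchases (customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'student'), (2, 'student'),
                                 (3, 'retired'), (4, 'student');
    INSERT INTO purchases VALUES (1, 'notebook'), (2, 'notebook'),
                                 (3, 'teapot'), (4, 'teapot');
""")

def rule_stats(conn, segment, item):
    """Return (support, confidence) of the rule: segment -> buys item."""
    total = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
    in_segment = conn.execute(
        "SELECT COUNT(*) FROM customers WHERE segment = ?", (segment,)
    ).fetchone()[0]
    # The join is what makes the pattern span both tables.
    both = conn.execute(
        """SELECT COUNT(DISTINCT c.id)
           FROM customers c JOIN purchases p ON p.customer_id = c.id
           WHERE c.segment = ? AND p.item = ?""",
        (segment, item),
    ).fetchone()[0]
    # Assumes the segment is non-empty; otherwise confidence is undefined.
    return both / total, both / in_segment

support, confidence = rule_stats(conn, "student", "notebook")
print(f"support={support:.2f} confidence={confidence:.2f}")
```

Support is the share of all customers that match both sides of the rule; confidence is the share of customers in the segment that also bought the item.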

  Classification: predicting an item's class
  Finding rules that partition the given data
   into disjoint groups
  A popular classification method is the
   decision tree
Decision Tree

  A graph of decisions and their possible
   consequences
  Decision trees are constructed to help
   with making decisions.
  A decision tree uses a tree structure.
Example of Decision Tree
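
A minimal sketch of how a decision tree classifies an item, assuming a hand-built tree over a hypothetical weather dataset. The attributes (`outlook`, `humidity`, `windy`), the class labels, and the `classify` helper are illustrative assumptions, not the example from the original slide. Internal nodes test one attribute; leaves hold a class.

```python
# Hypothetical tree: a dict is an internal node, a string is a leaf (class label).
TREE = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "stay in", "normal": "play"},
        },
        "overcast": "play",
        "rainy": {
            "attribute": "windy",
            "branches": {True: "stay in", False: "play"},
        },
    },
}

def classify(tree, record):
    """Walk from the root to a leaf, following the record's attribute values."""
    node = tree
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["branches"][value]
    return node

record = {"outlook": "sunny", "humidity": "high", "windy": False}
print(classify(TREE, record))
```

Each root-to-leaf path is one rule, and every record follows exactly one path, so the leaves partition the data into disjoint groups.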
Text Mining

    The process of extracting interesting,
     non-trivial information and knowledge
     from unstructured text
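
One very simple step toward pulling information out of unstructured text is counting which terms dominate a document. The sketch below assumes a tiny hypothetical stop-word list and a hypothetical `top_terms` helper; real text mining goes well beyond term counts.

```python
import re
from collections import Counter

# Small hypothetical stop-word list; real systems use much larger ones.
STOP_WORDS = {"the", "a", "of", "and", "to", "in", "is", "for"}

def top_terms(text, n=3):
    """Return the n most frequent terms that are not stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)

doc = "Mining text is the mining of patterns in text, and text is unstructured."
print(top_terms(doc, 2))
```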
Text Mining (continued)

    Also known as
      intelligent text analysis,
      text data mining,
      unstructured data management,
      or knowledge discovery in text
Web Mining

  The extraction of interesting and potentially
   useful patterns and implicit information
   from artifacts or activity related to the
   World Wide Web
Web Mining (continued)

    Three knowledge discovery domains
     pertain to web mining:
      Web Content Mining
      Web Structure Mining
      Web Usage Mining
Web Content Mining

  Is an automatic process that goes
   beyond keyword extraction.
  There are two groups of web content
   mining strategies:
      mine the content of documents
      improve on the content search of other tools
       like search engines.
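
For the first strategy, mining the content of documents directly, a typical first step is to separate the readable text of a page from its markup. This is a minimal sketch using Python's standard `html.parser`; the `TextExtractor` class, the `page_text` helper, and the sample page are illustrative assumptions.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside script/style elements.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

def page_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)

html = (
    "<html><head><title>Mining</title><style>p{color:red}</style></head>"
    "<body><p>Web content <b>mining</b></p></body></html>"
)
print(page_text(html))
```

The extracted text can then be passed to whatever content analysis follows.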
Web Structure Mining

    The link structure of the World Wide Web
     can reveal more information than just the
     information contained in documents
Web Structure Mining (example)

  Links pointing to a document indicate the
   popularity of the document.
  Links coming out of a document indicate
   the richness or perhaps the variety of
   topics covered in the document.
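
The two indicators above can be sketched as simple link counts. The link graph and the `link_counts` helper are hypothetical; the counts are only a rough proxy for popularity and richness.

```python
from collections import Counter

# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "home.html": ["about.html", "products.html", "news.html"],
    "about.html": ["home.html"],
    "products.html": ["home.html", "news.html"],
    "news.html": ["home.html"],
}

def link_counts(links):
    """Return (in_links, out_links): links pointing to and out of each page."""
    out_links = {page: len(set(targets)) for page, targets in links.items()}
    in_links = Counter()
    for targets in links.values():
        for target in set(targets):  # count each linking page once
            in_links[target] += 1
    return in_links, out_links

in_links, out_links = link_counts(LINKS)
for page in LINKS:
    print(page, "in:", in_links[page], "out:", out_links[page])
```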
Web Usage Mining

  Web servers record and accumulate
   data about user interactions whenever
   requests for resources are received.
  Web usage mining analyzes the web access
   logs of different web sites
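
Before any analysis, each raw log line has to be turned into fields. The sketch below assumes the server writes the Common Log Format, which the slides do not specify; the `parse_line` helper and the sample line are illustrative.

```python
import re

# Common Log Format: host ident user [time] "method path protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

line = '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
record = parse_line(line)
print(record["host"], record["path"], record["status"])
```

Lines that do not match return `None`, so malformed entries can be skipped rather than crashing the analysis.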
Web Usage Mining (continued)

  Two main tendencies in Web Usage
  Mining:
   General Access Pattern Tracking
   Customized Usage Tracking
General Access Pattern Tracking

  Analyzes the web logs to understand
   access patterns and trends
  Gives better structure and grouping of
   resource providers
  Can be used to restructure sites into a
   more efficient grouping, and to target
   specific users with specific ads
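
A minimal sketch of site-wide pattern tracking, assuming the logs have already been parsed into hypothetical `(visitor, hour, path)` records. It aggregates over all visitors to show which site sections and which hours attract the most requests.

```python
from collections import Counter

# Hypothetical, already-parsed log records: (visitor, hour of day, path).
RECORDS = [
    ("192.0.2.1", 9, "/products/a"),
    ("192.0.2.2", 9, "/products/a"),
    ("192.0.2.1", 10, "/products/b"),
    ("192.0.2.3", 21, "/support/faq"),
    ("192.0.2.2", 21, "/products/a"),
]

def access_summary(records):
    """Count requests per top-level site section and per hour of day."""
    by_section = Counter(path.split("/")[1] for _, _, path in records)
    by_hour = Counter(hour for _, hour, _ in records)
    return by_section, by_hour

by_section, by_hour = access_summary(RECORDS)
print(by_section.most_common())
print(sorted(by_hour.items()))
```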
Customized Usage Tracking

  Analyzes individual trends
  Customizes web sites to users
  The success of this application depends
   on what and how much valid and reliable
   knowledge one can discover from the
   large raw log data.
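
In contrast to site-wide totals, individual trends require grouping the log by visitor. The sketch below assumes hypothetical `(visitor, path)` records and identifies a visitor by IP address, which is a crude assumption since several users can share one address.

```python
from collections import Counter, defaultdict

# Hypothetical parsed log records: (visitor, path).
RECORDS = [
    ("192.0.2.1", "/products/a"),
    ("192.0.2.1", "/products/a"),
    ("192.0.2.1", "/support/faq"),
    ("192.0.2.2", "/news"),
    ("192.0.2.2", "/news"),
    ("192.0.2.2", "/products/b"),
]

def visitor_profiles(records):
    """Build one page-count profile per visitor."""
    profiles = defaultdict(Counter)
    for visitor, path in records:
        profiles[visitor][path] += 1
    return profiles

def favorite_page(profiles, visitor):
    """Return the page this visitor requested most often."""
    return profiles[visitor].most_common(1)[0][0]

profiles = visitor_profiles(RECORDS)
print(favorite_page(profiles, "192.0.2.1"))
```

A site could use such a profile to decide which content to show a returning visitor first.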
Web Mining Architecture

