# Classification

```
MIS 451

Classification (2)
Data

2
Classification
   Divide and Conquer
   Pick the attribute whose split of the data set gives
the greatest entropy reduction

   Stop when no attribute is left to pick or the data in
every leaf node are pure (i.e., belong to one class)

3
Classification
   Step 1: there are four attributes to pick:
student, income, age, and credit rating
   E(BD) = 0.940

   E(D|student) = 0.789
   E(D|age) = 0.694
   E(D|income) = 0.911
   E(D|credit) = 0.892

4
Classification
   Step 2: age has the lowest conditional entropy
(0.694), so divide the original data set by age into
subset 1 (<=30), subset 2 (31-40), and subset 3 (>40)

   Step 3-1: For subset 1, there are three attributes to
pick: income, student, and credit
   E(BD) = 0.971

   E(D|student) = 0
   E(D|income) = ??
   E(D|credit) = ??

5
Classification
   Step 3-2: Divide subset 1 by student into subset 1-1
(yes) and subset 1-2 (no)

   Step 4-1: For subset 3, there are three attributes to
pick: income, student, and credit
   E(BD) = 0.971

   E(D|credit) = 0
   E(D|income) = ??
   E(D|student) = ??

   Step 4-2: Divide subset 3 by credit into subset 3-1
(fair) and subset 3-2 (excellent)

6
Extract rules from the model
   Each path from the root to a leaf node forms
an IF-THEN rule.

   In this rule, the tests at the root and internal
nodes are joined with AND to form the IF part.

   The leaf node denotes the THEN part of the rule.

```
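
The attribute selection in Step 1 can be sketched in Python. The slides give only the resulting entropy values, not the data table, so the (yes, no) class counts below are an assumption: they are taken from the widely used 14-row "buys computer" textbook example, which reproduces the slide's values to within rounding. The names `entropy`, `conditional_entropy`, and `SPLITS` are illustrative, not from the course material.

```python
from math import log2


def entropy(counts):
    """Shannon entropy in bits of a class distribution given as counts."""
    total = sum(counts)
    # Empty classes contribute nothing, so skip them to avoid log2(0).
    return sum(c / total * log2(total / c) for c in counts if c > 0)


def conditional_entropy(partition):
    """Size-weighted average entropy of the subsets produced by a split."""
    total = sum(sum(counts) for counts in partition.values())
    return sum(sum(counts) / total * entropy(counts)
               for counts in partition.values())


# ASSUMPTION: (yes, no) class counts per attribute value, taken from the
# standard textbook example; the slides do not show the underlying data.
SPLITS = {
    "student": {"yes": (6, 1), "no": (3, 4)},
    "age": {"<=30": (2, 3), "31-40": (4, 0), ">40": (3, 2)},
    "income": {"high": (2, 2), "medium": (4, 2), "low": (3, 1)},
    "credit": {"fair": (6, 2), "excellent": (3, 3)},
}

before = entropy((9, 5))  # entropy before any division
after = {attr: conditional_entropy(split) for attr, split in SPLITS.items()}

# The greatest entropy reduction is the smallest conditional entropy.
best = min(after, key=after.get)

print(f"E(D) = {before:.3f}")
for attr, value in after.items():
    print(f"E(D|{attr}) = {value:.3f}  reduction = {before - value:.3f}")
print("pick:", best)
```

Under these assumed counts the split on age wins, and each of the two mixed age subsets (<=30 and >40) has an entropy of about 0.971 before it is divided further, since `entropy((2, 3))` and `entropy((3, 2))` are equal.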
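
Rule extraction can be sketched the same way: walk every path from the root to a leaf, collect the attribute tests along the way, and join them with AND. The tree shape below follows Steps 2 to 4-2 (age at the root, then student for subset 1 and credit for subset 3). The class labels at the leaves, the assumption that subset 2 is already pure, and the names `TREE` and `extract_rules` are hypothetical, since the slides do not state them.

```python
# Internal nodes are dicts with an attribute and its branches;
# leaves are plain strings holding the predicted class.
# ASSUMPTION: the leaf labels are illustrative, not given in the slides.
TREE = {
    "attribute": "age",
    "branches": {
        "<=30": {
            "attribute": "student",
            "branches": {"yes": "yes", "no": "no"},
        },
        "31-40": "yes",
        ">40": {
            "attribute": "credit",
            "branches": {"fair": "yes", "excellent": "no"},
        },
    },
}


def extract_rules(node, conditions=()):
    """Return one IF-THEN rule per root-to-leaf path."""
    if not isinstance(node, dict):
        # Leaf: the collected tests form the IF part, the leaf the THEN part.
        return [f"IF {' AND '.join(conditions)} THEN class = {node}"]
    rules = []
    for value, child in node["branches"].items():
        test = f"{node['attribute']} = {value}"
        rules.extend(extract_rules(child, conditions + (test,)))
    return rules


rules = extract_rules(TREE)
for rule in rules:
    print(rule)
```

The tree has five leaves, so the sketch yields five rules, one per path.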