(IJCSIS) International Journal of Computer Science and Information Security, Vol. 10, No. 4, April 2012



Mining Rules from Crisp Attributes by Rough Sets on the Fuzzy Class Sets

Mojtaba MadadyarAdeh (1), Dariush Dashchi Rezaee (2), Ali Soultanmohammadi (3)
Sama Technical and Vocational Training College, Islamic Azad University, Urmia Branch, Urmia, Iran
(1) m.madadyar@iaurmia.ac.ir
(2) d_dashchi_rezaee@yahoo.com
(3) ali_soultanmohammadi@yahoo.com

Abstract—Machine learning can extract desired knowledge and ease the development bottleneck in building expert systems. Among the proposed approaches, deriving classification rules from training examples is the most common. Given a set of examples, a learning program tries to induce rules that describe each class. Rough-set theory has served as a good mathematical tool for dealing with data classification problems. In the past, rough-set theory was widely applied to classification problems in which the data sets contained crisp attributes and crisp class sets. This paper extends the previous rough-set approach to the problem of producing a set of certain and possible rules from crisp attributes by rough sets on fuzzy class sets. The proposed approach combines rough-set theory and fuzzy-set theory for learning. The examples and the approximations then interact with each other to derive certain and possible rules. The derived rules can then serve as knowledge about data sets with fuzzy class sets.

Keywords—Fuzzy set; Rough set; Data mining; Fuzzy class sets; Crisp attributes; Certain rule; Possible rule; α-cut

I. INTRODUCTION

Machine learning and data mining techniques have recently been developed to find implicitly meaningful patterns and ease the knowledge-acquisition bottleneck. Among these approaches, deriving inference or association rules from training examples is the most common [11], [13]. Given a set of examples and counterexamples of a concept, the learning program tries to induce general rules that describe all or most of the positive training instances and none or few of the counterexamples [6]. If the training instances belong to more than two classes, the learning program tries to induce general rules that describe each class. Recently, rough-set theory has been used in reasoning and knowledge acquisition for expert systems [3][13]. It was proposed by Pawlak in 1982, with the concept of equivalence classes as its basic principle. Several applications and extensions of rough-set theory have also been proposed. Examples include Orlowska's reasoning with incomplete information [13], knowledge-base reduction [1], data mining [9], and rule discovery [18]. Due to the success of rough-set theory in knowledge acquisition, many researchers in the database and machine-learning fields have become interested in this research topic, since it offers opportunities to discover useful information in training examples. Ziarko [19] noted that the main issue in the rough-set approach is the formation of good rules, and compared the rough-set approach with several other classification approaches. The main characteristic of the rough-set approach is that it can use the notion of inadequacy of the available information to perform classification of objects [19][20]. It can also form an approximation space for the analysis of information systems, and partial classifications may be formed from the given objects. Ziarko also mentioned the limitations of the rough-set model: for example, classification with a controlled degree of uncertainty or misclassification error is outside the realm of the approach, and overgeneralization is another limitation. Ziarko thus proposed the variable precision rough-set model to solve these problems. The variable precision rough-set model, however, has only been shown to handle binary or crisp-valued training data, while training data in real-world applications usually consist of quantitative values. Although the variable precision rough-set model can also manage quantitative values by taking each quantitative value as an attribute value, the rules formed in this way may be too specific and hard for humans to interpret. Extending the variable precision rough-set model to deal effectively with quantitative values is thus important for real applications of the model. Since fuzzy-set concepts are often used to represent quantitative data by linguistic terms and membership functions, because of their simplicity and similarity to human reasoning [2], we attempt to combine the variable precision rough-set model and fuzzy-set theory to solve the above problems.





The rules mined are expressed in linguistic terms, which are more natural and understandable for human beings. Since the number of linguistic terms is much smaller than the number of possible quantitative values, the over-specialization problem can be avoided. Hong et al. [7] proposed a mining algorithm to find fuzzy rules based on the rough-set model; the variable precision rough-set model can be thought of as a generalization of the rough-set model. Hong et al. [10] dealt with the problem of producing a set of certain and possible rules from incomplete data sets on crisp class sets.

In this paper, we thus deal with the problem of producing a set of certain and possible rules by mining crisp attributes with rough sets on fuzzy class sets. A new approach, which combines rough-set theory and fuzzy-set theory for learning, is proposed to solve this problem. It first transforms each quantitative class-set value into a fuzzy set of linguistic terms using membership functions and converts each fuzzy class set into several crisp subclasses by α-cuts. It then calculates the lower and the upper approximations, and the certain and possible rules are generated based on these approximations. The paper thus extends the existing rough-set mining approaches to process quantitative class data with tolerance of noise and uncertainty.

The remainder of this paper is organized as follows. In Section 2, the rough-set theory is reviewed. In Section 3, α-cuts and fuzzy class sets are described. In Section 4, the notation used in this paper is defined. In Section 5, the proposed algorithm for crisp-attribute data sets on fuzzy class sets is presented. In Section 6, an example is given to illustrate the proposed algorithm.

II. REVIEW OF THE ROUGH-SET THEORY

The rough-set theory, proposed by Pawlak in 1982 [14], can serve as a mathematical tool for dealing with data classification problems. It adopts the concept of equivalence classes to partition training instances according to some criteria. Two kinds of partitions are formed in the mining process, lower approximations and upper approximations, from which certain and possible rules can easily be derived. Formally, let U be a set of training examples (objects), A be a set of attributes describing the examples, C be a set of classes, and Vj be the value domain of an attribute Aj. Also let vj(i) be the value of attribute Aj for the ith object Obj(i). When two objects Obj(i) and Obj(k) have the same value of attribute Aj (that is, vj(i) = vj(k)), Obj(i) and Obj(k) are said to have an indiscernibility (equivalence) relation on attribute Aj. Likewise, if Obj(i) and Obj(k) have the same value for each attribute in a subset B of A, Obj(i) and Obj(k) are said to have an indiscernibility (equivalence) relation on the attribute set B. These equivalence relations thus partition the object set U into disjoint subsets, denoted by U/B, and the partition including Obj(i) is denoted by B(Obj(i)). The set of equivalence classes for subset B is referred to as the B-elementary set.

Example 1. Table I shows a data set containing seven objects, denoted by U = {Obj(1), Obj(2), ..., Obj(7)}, two attributes, denoted by A = {Systolic Pressure (SP), Diastolic Pressure (DP)}, and a class set Blood Pressure (BP). Assume the attributes and the class set have three possible values: {Low (L), Normal (N) and High (H)}.

TABLE I. THE DATA SET FOR EXAMPLE 1.

Object    Systolic Pressure (SP)    Diastolic Pressure (DP)    Blood Pressure (BP)
obj(1)    L                         N                          L
obj(2)    H                         N                          H
obj(3)    N                         N                          N
obj(4)    L                         L                          L
obj(5)    H                         H                          H
obj(6)    N                         H                          H
obj(7)    N                         L                          N

Since Obj(1) and Obj(4) have the same attribute value (L) for attribute SP, they share an indiscernibility relation and thus belong to the same equivalence class for SP. The equivalence partitions (elementary sets) for singleton attributes can be derived as follows:

U/{SP} = {{obj(2), obj(5)}, {obj(3), obj(6), obj(7)}, {obj(1), obj(4)}}, and
U/{DP} = {{obj(1), obj(2), obj(3)}, {obj(4), obj(7)}, {obj(5), obj(6)}}.

Also, {SP}(obj(1)) = {SP}(obj(4)) = {obj(1), obj(4)}.

The rough-set approach analyzes data according to two basic concepts, namely the lower and the upper approximations of a set. Let X be an arbitrary subset of the universe U, and B be an arbitrary subset of the attribute set A. The lower and the upper approximations for B on X, denoted B_*(X) and B^*(X) respectively, are defined as follows [20][4]:

    B_*(X) = {x | x ∈ U, B(x) ⊆ X},    (1)

    B^*(X) = {x | x ∈ U and B(x) ∩ X ≠ Ø}.    (2)

Elements in B_*(X) can be classified as members of set X with full certainty using attribute set B, so B_*(X) is called the lower approximation of X. Similarly, elements in B^*(X) can be classified as members of set X with only partial certainty using attribute set B, so B^*(X) is called the upper approximation of X.

Example 2. Continuing from Example 1, assume X = {Obj(1), Obj(4)}. The lower and the upper approximations of attribute DP with respect to X can be calculated as follows:

DP_*(X) = Ø, and
DP^*(X) = {{obj(1), obj(2), obj(3)}, {obj(4), obj(7)}}.

After the lower and the upper approximations have been found, the rough-set theory can then be used to derive certain information and induce certain and possible rules from them [3].
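To make these definitions concrete, the following Python sketch (our own illustration; the function and variable names are assumptions, not part of the paper) computes the elementary sets U/B and the approximations of equations (1) and (2) for the data of Table I, reproducing the results of Examples 1 and 2.

def elementary_sets(objects, attrs):
    # Group objects by their value tuple on `attrs` (the indiscernibility
    # relation); the resulting classes form the partition U/B.
    classes = {}
    for oid, row in objects.items():
        classes.setdefault(tuple(row[a] for a in attrs), set()).add(oid)
    return list(classes.values())

def lower_approx(partition, x):
    return [c for c in partition if c <= x]        # equation (1)

def upper_approx(partition, x):
    return [c for c in partition if c & x]         # equation (2)

# The crisp attributes of Table I.
data = {1: {"SP": "L", "DP": "N"}, 2: {"SP": "H", "DP": "N"},
        3: {"SP": "N", "DP": "N"}, 4: {"SP": "L", "DP": "L"},
        5: {"SP": "H", "DP": "H"}, 6: {"SP": "N", "DP": "H"},
        7: {"SP": "N", "DP": "L"}}

print(elementary_sets(data, ["SP"]))   # [{1, 4}, {2, 5}, {3, 6, 7}] = U/{SP}

X = {1, 4}                             # the X of Example 2
u_dp = elementary_sets(data, ["DP"])   # [{1, 2, 3}, {4, 7}, {5, 6}] = U/{DP}
print(lower_approx(u_dp, X))           # []                  : DP_*(X) = Ø
print(upper_approx(u_dp, X))           # [{1, 2, 3}, {4, 7}] : DP^*(X)

No DP-equivalence class lies wholly inside X, so the lower approximation is empty, while both classes meeting X appear in the upper approximation, exactly as in Example 2.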




III. α-CUT AND FUZZY CLASS SETS

An α-level set of a fuzzy set A of X is a non-fuzzy set denoted by [A]^α and defined by

    [A]^α = {t ∈ X | A(t) ≥ α}    if α > 0,
    [A]^α = cl(supp(A))           if α = 0,    (3)

where cl(supp(A)) denotes the closure of the support of A.

Definition 1 (Support). Let A be a fuzzy subset of X; the support of A, denoted supp(A), is the crisp subset of X whose elements all have nonzero membership grades in A:

    supp(A) = {x ∈ X | A(x) > 0}.    (4)

Definition 2 (Triangular fuzzy number). A fuzzy set A is called a triangular fuzzy number with peak (or center) a, left width α > 0 and right width β > 0 if its membership function has the following form:

    A(t) = 1 − (a − t)/α    if a − α ≤ t ≤ a,
    A(t) = 1 − (t − a)/β    if a < t ≤ a + β,
    A(t) = 0                otherwise,    (5)

and we use the notation A = (a, α, β). It can easily be verified that

    [A]^γ = [a − (1 − γ)α, a + (1 − γ)β],  for all γ ∈ [0, 1].    (6)

The support of A is (a − α, a + β).
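As a small check of Definition 2 and equation (6), the sketch below implements a triangular fuzzy number A = (a, α, β) in Python; the numeric instance (a = 105, widths 25) is only an assumed example, not taken from the paper.

def triangular(a, alpha, beta):
    # Membership function of the triangular fuzzy number A = (a, alpha, beta),
    # following equation (5).
    def A(t):
        if a - alpha <= t <= a:
            return 1 - (a - t) / alpha
        if a < t <= a + beta:
            return 1 - (t - a) / beta
        return 0.0
    return A

def level_set(a, alpha, beta, gamma):
    # Equation (6): [A]^gamma = [a - (1 - gamma)*alpha, a + (1 - gamma)*beta].
    return (a - (1 - gamma) * alpha, a + (1 - gamma) * beta)

A = triangular(105, 25, 25)            # assumed instance: peak 105, widths 25
print(A(89))                           # 0.36 (to floating-point precision)
print(level_set(105, 25, 25, 0.3))     # (87.5, 122.5): all t with A(t) >= 0.3
print(level_set(105, 25, 25, 1.0))     # (105.0, 105.0): only the peak

For γ = 0.3 the level set is exactly the interval on which the membership reaches 0.3; such α-cuts are what the proposed method uses below to form crisp subclasses from the fuzzy class sets.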





In the past, the rough-set theory was widely used for data classification problems [10]. Most conventional mining algorithms based on the rough-set theory identify relationships among data using crisp class-set values. Class sets with quantitative values, however, are commonly seen in real-world applications. In this paper, we thus deal with the problem of learning from quantitative class data sets based on rough sets. A learning algorithm is proposed which can simultaneously derive certain and possible rules from quantitative class data sets. Class sets with quantitative values are first transformed into fuzzy sets of linguistic terms using membership functions; the fuzzy class sets are then converted into several crisp subclasses by α-cuts. The number of divisions, that is, the number of α-cuts performed on the linguistic terms, is arbitrary.

IV. NOTATION

The notation used in this paper is as follows:

U: universe of all objects
n: total number of training examples (objects) in U
Obj(i): ith training example (object), 1 ≤ i ≤ n
A: set of all attributes describing U
m: total number of attributes in A
B: an arbitrary subset of A
Aj: jth attribute, 1 ≤ j ≤ m
|Aj|: number of attribute values for Aj
vj(i): the value of Aj for Obj(i)
d: number of divisions (α-cuts) performed on the linguistic terms
C: set of classes to be determined
c: total number of classes in C
Rk: kth fuzzy region of C, 1 ≤ k ≤ c
e(i): the value of C for Obj(i)
f(i): the fuzzy set converted from e(i)
fk(i): the membership value of e(i) in region Rk
Xl: lth subclass, 1 ≤ l ≤ (c×d)
B(Obj(i)): the equivalence classes (for attribute subset B) in which Obj(i) exists
B_*(X): the lower approximation for B on X
B^*(X): the upper approximation for B on X

These equivalence relations partition the object set U into several subsets that may overlap with respect to the subclasses, and the result is denoted by U/B. The set of partitions, based on B and including Obj(i), is denoted B(Obj(i)). Thus, B(Obj(i)) = {B1(Obj(i)), ..., Br(Obj(i))}, where r is the number of partitions included in B(Obj(i)).

Example 3. Consider the three objects shown in Table II. Assume the linguistic terms in the objects are transformed from quantitative class-set values by membership functions; for instance, Obj(1) is classified as having an (L2 + N1) blood pressure, and Obj(2) and Obj(3) are classified similarly. Assume the attributes SP and DP have three possible values (L, N, H), and the class set BP has three possible linguistic terms (L, N, H), which are divided into nine subclass labels by three α-cuts on the linguistic terms (L1, L2, L3; N1, N2, N3; H1, H2, H3).

TABLE II. THE DATA SET FOR EXAMPLE 3.

Object    Systolic Pressure (SP)    Diastolic Pressure (DP)    Blood Pressure (BP)
obj(1)    L                         N                          L2 + N1
obj(2)    H                         N                          H3 + N1
obj(3)    N                         N                          N3

The subclass BP = N1 is then formed as {Obj(1), Obj(2)}. The other subclasses of the fuzzy class set can be similarly derived:

XL2 = {Obj(1)}
XN1 = {Obj(1), Obj(2)}
XH3 = {Obj(2)}
XN3 = {Obj(3)}

It is easily observed that an object may exist in more than one subclass of a class set. In the above example, Obj(1) exists in two subclasses (XL2, XN1).

For the attributes, SP = N is formed as {Obj(3)}; the other indiscernibility relations can be similarly derived. U/{SP} has thus been found as follows:

U/{SP} = {{Obj(1)}, {Obj(2)}, {Obj(3)}}.

Similarly,

U/{DP} = {{Obj(1), Obj(2), Obj(3)}}.

The lower and upper approximations for B on X, denoted B_*(X) and B^*(X) respectively, are defined as in equations (1) and (2).

Assume XN1 = {Obj(1), Obj(2)}. Since the equivalence classes {Obj(1)} and {Obj(2)} in U/{SP} are included in XN1, the lower approximation for attribute SP on XN1 is:

SP_*(XN1) = {{Obj(1)}, {Obj(2)}}.

The equivalence classes in U/{SP} that have non-empty intersections with XN1 have already been included in the lower approximation, so the upper approximation for attribute SP on XN1 retains no further classes:

SP^*(XN1) = Ø.

The lower and upper approximations for attribute DP on XN1 can be similarly derived.
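As an illustration of Example 3, the following sketch (our own code, with hypothetical helper names) extracts the overlapping subclasses Xl from the fuzzy class labels and computes the approximations of SP on XN1. Note that, following the example's convention, an equivalence class already placed in the lower approximation is not repeated in the upper approximation.

def subclasses(labels):
    # Collect X_l: for each subclass label (e.g. "N1"), the set of objects
    # whose fuzzy class set contains that label; subclasses may overlap.
    xs = {}
    for oid, lab in labels.items():
        for part in lab.split("+"):
            xs.setdefault(part.strip(), set()).add(oid)
    return xs

def approximations(partition, x):
    lower = [c for c in partition if c <= x]
    # The example's convention: keep only boundary classes in the upper
    # approximation, dropping classes already in the lower approximation.
    upper = [c for c in partition if (c & x) and c not in lower]
    return lower, upper

bp = {1: "L2 + N1", 2: "H3 + N1", 3: "N3"}   # the class column of Table II
xs = subclasses(bp)
print(xs)           # {'L2': {1}, 'N1': {1, 2}, 'H3': {2}, 'N3': {3}}

u_sp = [{1}, {2}, {3}]                        # U/{SP} from Example 3
lower, upper = approximations(u_sp, xs["N1"])
print(lower)        # [{1}, {2}] = SP_*(XN1)
print(upper)        # []         = SP^*(XN1) is empty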





V. THE PROPOSED ALGORITHM FOR CRISP ATTRIBUTES BY ROUGH SETS ON THE FUZZY CLASS SETS

In this section, a learning algorithm based on rough sets is proposed which converts each fuzzy class set into several crisp subclasses by α-cuts and, at the same time, derives certain and possible rules from crisp-attribute data sets on the fuzzy class sets. The proposed learning algorithm first transforms each quantitative class-set value into a fuzzy set of linguistic terms using membership functions and converts each fuzzy class set into crisp subclasses by α-cuts. The algorithm then calculates the lower and upper approximations. The details of the proposed learning algorithm are described as follows.

Mining rules from crisp attributes by rough sets on the fuzzy class sets:

Input: A quantitative data set with n objects, each with m attribute values, and a set of membership functions for the class sets.

Output: A set of certain and possible rules.

Step 1: Transform the quantitative class-set value e(i) of each object Obj(i), i = 1 to n, for each class set C, into a fuzzy set f(i), represented as (f1(i)/R1 + f2(i)/R2 + ... + fl(i)/Rl), using the given membership functions, where Rk is the kth fuzzy region of class set C, fk(i) is e(i)'s fuzzy membership value in region Rk, and l (= c×d) is the number of fuzzy regions for C.

Step 2: Convert each fuzzy class set into several crisp subclasses by α-cuts. The number of divisions, that is, the number of α-cuts performed on the linguistic terms, is arbitrary.

Step 3: Partition the object set into subsets according to the subclass labels. Denote the set of objects belonging to subclass Cl as Xl.

Step 4: Find the elementary sets of the singleton attributes.

Step 5: Initialize q = 1, where q counts the number of attributes currently being processed.

Step 6: Compute the lower approximation of each attribute subset B with q attributes for each subclass Xl as:

    B_*(Xl) = {obj(i) | obj(i) ∈ U, B(obj(i)) ⊆ Xl},    (7)

where B(obj(i)) is the equivalence class including Obj(i) and derived from attribute subset B.

Step 7: Compute the upper approximation of each attribute subset B with q attributes for each subclass Xl as:

    B^*(Xl) = {obj(i) | obj(i) ∈ U and B(obj(i)) ∩ Xl ≠ Ø},    (8)

where B(obj(i)) is the equivalence class including Obj(i) and derived from attribute subset B.

Step 8: Calculate the plausibility measure of each equivalence class in an upper approximation for each subclass Xl as:

    P(B(obj(i))) = |B(obj(i)) ∩ Xl| / |B(obj(i))|.    (9)

Step 9: Set q = q + 1 and repeat Steps 6-9 until q > m.

Step 10: Derive the certain rules from the lower approximation B_*(Xl) of any subset B.

Step 11: Remove the certain rules whose condition parts are more specific than those of other certain rules. This is done by intersecting the subclasses of the rules; for example, because "H3" includes "H1" and "H2" under the α-cut definition, the rules made redundant in this way can be removed.

Step 12: Derive the possible rules from the upper approximation B^*(Xl) of any subset B.

Step 13: Remove the possible rules whose condition parts are more specific than those of other possible rules, by intersecting the subclasses and comparing the plausibility measures.

Step 14: Output the certain and possible rules.
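Steps 1 and 2 can be sketched in a few lines of Python. The membership functions below are not given in this text (Fig. 1 is not reproduced here); they are an assumed reconstruction chosen to be consistent with the fuzzy sets of Table IV in the next section, and binning the nonzero memberships by the α levels 0.3, 0.7 and 1 is one reading of Step 2 that reproduces the labels of Table V.

def mu_L(t):   # "Low": full below 80, falling to 0 at 90 (assumed shape)
    return 1.0 if t <= 80 else max(0.0, (90 - t) / 10)

def mu_N(t):   # "Normal": triangular, peak 105, widths 25 (assumed shape)
    return max(0.0, 1 - abs(t - 105) / 25)

def mu_H(t):   # "High": rising from 120, full above 130 (assumed shape)
    return 1.0 if t >= 130 else max(0.0, (t - 120) / 10)

def fuzzify(t):
    # Step 1: quantitative value -> fuzzy set {term: membership}, zeros dropped.
    f = {"L": mu_L(t), "N": mu_N(t), "H": mu_H(t)}
    return {term: round(mu, 2) for term, mu in f.items() if mu > 0}

def alpha_labels(fuzzy, cuts=(0.3, 0.7)):
    # Step 2: bin each nonzero membership by the alpha levels 0.3, 0.7, 1:
    # label 1 for mu <= 0.3, label 2 for mu <= 0.7, label 3 otherwise.
    return [term + str(1 + sum(mu > c for c in cuts)) for term, mu in fuzzy.items()]

f2 = fuzzify(124)
print(f2)                          # {'N': 0.24, 'H': 0.4}  (Table IV, obj(2))
print(alpha_labels(f2))            # ['N1', 'H2']           (Table V, obj(2))
print(alpha_labels(fuzzify(75)))   # ['L3']                 (Table V, obj(4))

Under this reading, each object's class value receives one crisp subclass label per nonzero linguistic term, e.g. 124 becomes N1 + H2, which is exactly the conversion shown in Table V below.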





VI. AN EXAMPLE

In this section, an example is given to show how the proposed algorithm can be used to generate maximally general certain and possible rules. Table III shows a quantitative data set similar to that of Table I, except that the class-set values are quantitative. Assume the membership functions for the class set are given by experts as shown in Fig. 1. The proposed learning algorithm processes this quantitative data set as follows.

TABLE III. A QUANTITATIVE DATA SET AS AN EXAMPLE.

Object    Systolic Pressure (SP)    Diastolic Pressure (DP)    Blood Pressure (BP)
obj(1)    L                         N                          89
obj(2)    H                         L                          124
obj(3)    N                         H                          122
obj(4)    L                         L                          75
obj(5)    H                         H                          135
obj(6)    N                         H                          125
obj(7)    L                         L                          78
obj(8)    L                         H                          85
obj(9)    H                         N                          121

Step 1: The quantitative value of each object is transformed into a fuzzy set. Take the class set Blood Pressure of Obj(2) as an example: the value "124" is converted into the fuzzy set (0.24/N + 0.4/H) using the given membership functions. The results for all the objects are shown in Table IV.

TABLE IV. THE FUZZY SETS TRANSFORMED FROM THE CLASS SETS IN TABLE III.

Object    Systolic Pressure (SP)    Diastolic Pressure (DP)    Blood Pressure (BP)
obj(1)    L                         N                          0.36/N + 0.1/L
obj(2)    H                         L                          0.24/N + 0.4/H
obj(3)    N                         H                          0.32/N + 0.2/H
obj(4)    L                         L                          1/L
obj(5)    H                         H                          1/H
obj(6)    N                         H                          0.2/N + 0.5/H
obj(7)    L                         L                          1/L
obj(8)    L                         H                          0.2/N + 0.5/L
obj(9)    H                         N                          0.36/N + 0.1/H

Step 2: Convert the fuzzy class sets into several crisp subclasses by α-cuts; the number of divisions performed on the linguistic terms is arbitrary. Here, the subclass label is "1" for α = 0.3, "2" for α = 0.7 and "3" for α = 1; note that, by the α-cut definition, "H3" includes "H1" and "H2". The results are shown in Table V.

Figure 1. The given membership functions of the class set.

TABLE V. THE FUZZY CLASS SETS OF TABLE IV CONVERTED BY α-CUTS.

Object    Systolic Pressure (SP)    Diastolic Pressure (DP)    Blood Pressure (BP)
obj(1)    L                         N                          N2 + L1
obj(2)    H                         L                          N1 + H2
obj(3)    N                         H                          N2 + H1
obj(4)    L                         L                          L3
obj(5)    H                         H                          H3
obj(6)    N                         H                          N1 + H2
obj(7)    L                         L                          L3
obj(8)    L                         H                          N1 + L2
obj(9)    H                         N                          N2 + H1

Step 3: Partition the object set into subsets according to the subclass labels, denoting the set of objects belonging to subclass Cl as Xl:

XL1 = {Obj(1)}, XL2 = {Obj(8)}, XL3 = {Obj(4), Obj(7)},
XN1 = {Obj(2), Obj(6), Obj(8)}, XN2 = {Obj(1), Obj(3), Obj(9)}, XN3 = Ø,
XH1 = {Obj(3), Obj(9)}, XH2 = {Obj(2)}, XH3 = {Obj(5), Obj(6)}.

Step 4: Find the elementary sets of the singleton attributes:

U/{SP} = {{Obj(1), Obj(4), Obj(7), Obj(8)}, {Obj(3), Obj(6)}, {Obj(2), Obj(5), Obj(9)}}, and
U/{DP} = {{Obj(2), Obj(4), Obj(7)}, {Obj(1), Obj(9)}, {Obj(3), Obj(5), Obj(6), Obj(8)}}.

Step 5: Initialize q = 1.

Step 6: Compute the lower approximations of each subset B with q attributes for each subclass Xl:

SP_*(XL1) = Ø, SP_*(XL2) = Ø, SP_*(XL3) = Ø,
SP_*(XN1) = Ø, SP_*(XN2) = Ø,
SP_*(XH1) = Ø, SP_*(XH2) = Ø, SP_*(XH3) = Ø, and
DP_*(XL1) = Ø, DP_*(XL2) = Ø, DP_*(XL3) = Ø,
DP_*(XN1) = Ø, DP_*(XN2) = {{Obj(1), Obj(9)}},
DP_*(XH1) = Ø, DP_*(XH2) = Ø, DP_*(XH3) = Ø.

Step 7: Compute the upper approximations of each subset B with q attributes for each subclass Xl:

SP^*(XL1) = {{Obj(1), Obj(4), Obj(7), Obj(8)}}, SP^*(XL2) = {{Obj(1), Obj(4), Obj(7), Obj(8)}}, SP^*(XL3) = {{Obj(1), Obj(4), Obj(7), Obj(8)}},
SP^*(XN1) = {{Obj(1), Obj(4), Obj(7), Obj(8)}, {Obj(3), Obj(6)}, {Obj(2), Obj(5), Obj(9)}}, SP^*(XN2) = {{Obj(1), Obj(4), Obj(7), Obj(8)}, {Obj(3), Obj(6)}, {Obj(2), Obj(5), Obj(9)}},
SP^*(XH1) = {{Obj(3), Obj(6)}, {Obj(2), Obj(5), Obj(9)}}, SP^*(XH2) = {{Obj(2), Obj(5), Obj(9)}}, SP^*(XH3) = {{Obj(3), Obj(6)}, {Obj(2), Obj(5), Obj(9)}}, and
DP^*(XL1) = {{Obj(1), Obj(9)}}, DP^*(XL2) = {{Obj(3), Obj(5), Obj(6), Obj(8)}}, DP^*(XL3) = {{Obj(2), Obj(4), Obj(7)}},
DP^*(XN1) = {{Obj(2), Obj(4), Obj(7)}, {Obj(3), Obj(5), Obj(6), Obj(8)}}, DP^*(XN2) = {{Obj(3), Obj(5), Obj(6), Obj(8)}},
DP^*(XH1) = {{Obj(1), Obj(9)}, {Obj(3), Obj(5), Obj(6), Obj(8)}}, DP^*(XH2) = {{Obj(2), Obj(4), Obj(7)}}, DP^*(XH3) = {{Obj(3), Obj(5), Obj(6), Obj(8)}}.

Step 8: Calculate the plausibility measure of each equivalence class in an upper approximation for each subclass Xl. For example, for the equivalence class {Obj(1), Obj(4), Obj(7), Obj(8)} in SP^*(XL1):

P({Obj(1), Obj(4), Obj(7), Obj(8)}) = |{Obj(1), Obj(4), Obj(7), Obj(8)} ∩ XL1| / |{Obj(1), Obj(4), Obj(7), Obj(8)}| = 1/4.

Step 9: Set q = q + 1 and repeat Steps 6-9 until q > m. For q = 2:

U/{SP, DP} = {{Obj(1)}, {Obj(2)}, {Obj(3), Obj(6)}, {Obj(4), Obj(7)}, {Obj(5)}, {Obj(8)}, {Obj(9)}},

SP,DP_*(XL1) = {{Obj(1)}}, SP,DP_*(XL2) = {{Obj(8)}}, SP,DP_*(XL3) = {{Obj(4), Obj(7)}},
SP,DP_*(XN1) = {{Obj(2)}, {Obj(8)}}, SP,DP_*(XN2) = {{Obj(1)}, {Obj(9)}},
SP,DP_*(XH1) = {{Obj(9)}}, SP,DP_*(XH2) = {{Obj(2)}}, SP,DP_*(XH3) = {{Obj(5)}}, and
SP,DP^*(XL1) = Ø, SP,DP^*(XL2) = Ø, SP,DP^*(XL3) = Ø,
SP,DP^*(XN1) = {{Obj(3), Obj(6)}}, SP,DP^*(XN2) = {{Obj(3), Obj(6)}},
SP,DP^*(XH1) = {{Obj(3), Obj(6)}}, SP,DP^*(XH2) = Ø, SP,DP^*(XH3) = {{Obj(3), Obj(6)}}.
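The plausibility measure of equation (9) is easy to verify in code. The short sketch below (our own illustration, with assumed helper names) reproduces the value 1/4 computed in Step 8 and two of the plausibility values attached to the possible rules derived in Step 12 below.

def plausibility(eq_class, x):
    # Equation (9): the fraction of the equivalence class lying inside X_l.
    return len(eq_class & x) / len(eq_class)

u_sp = {"L": {1, 4, 7, 8}, "N": {3, 6}, "H": {2, 5, 9}}   # U/{SP} from Step 4
u_dp = {"L": {2, 4, 7}, "N": {1, 9}, "H": {3, 5, 6, 8}}   # U/{DP} from Step 4
x = {"L1": {1}, "L3": {4, 7}}                             # subclasses, Step 3

print(plausibility(u_sp["L"], x["L1"]))   # 0.25, the 1/4 of Step 8
print(plausibility(u_sp["L"], x["L3"]))   # 0.5  : "If SP = Low then BP = L3"
print(plausibility(u_dp["L"], x["L3"]))   # 0.66.: "If DP = Low then BP = L3"

In Step 13 below, a rule such as "If SP = Low then BP = L1" (plausibility 0.25) is pruned because the more plausible rule "If SP = Low then BP = L3" shares its condition part.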
                                                                      Pressure = L3, with plausibility=0.5.




Step 10: Derive the certain rules from the lower approximations B_*(Xl):

1. If Diastolic Pressure = Normal, then Blood Pressure = N2.
2. If Systolic Pressure = Low and Diastolic Pressure = Normal, then Blood Pressure = L1.
3. If Systolic Pressure = Low and Diastolic Pressure = High, then Blood Pressure = L2.
4. If Systolic Pressure = Low and Diastolic Pressure = Low, then Blood Pressure = L3.
5. If Systolic Pressure = High and Diastolic Pressure = Low, then Blood Pressure = N1.
6. If Systolic Pressure = Low and Diastolic Pressure = High, then Blood Pressure = N1.
7. If Systolic Pressure = Low and Diastolic Pressure = Normal, then Blood Pressure = N2.
8. If Systolic Pressure = High and Diastolic Pressure = Normal, then Blood Pressure = N2.
9. If Systolic Pressure = High and Diastolic Pressure = Normal, then Blood Pressure = H1.
10. If Systolic Pressure = High and Diastolic Pressure = Low, then Blood Pressure = H2.
11. If Systolic Pressure = High and Diastolic Pressure = High, then Blood Pressure = H3.

Step 11: Since the condition parts of certain rules 7 and 8 are more specific than that of rule 1, these two rules are removed from the certain rule set.

Step 12: Derive the possible rules from the upper approximations B^*(Xl):

1. If Systolic Pressure = Low, then Blood Pressure = L1, with plausibility = 0.25.
2. If Systolic Pressure = Low, then Blood Pressure = L2, with plausibility = 0.25.
3. If Systolic Pressure = Low, then Blood Pressure = L3, with plausibility = 0.5.
4. If Systolic Pressure = Low, then Blood Pressure = N1, with plausibility = 0.25.
5. If Systolic Pressure = Normal, then Blood Pressure = N1, with plausibility = 0.5.
6. If Systolic Pressure = High, then Blood Pressure = N1, with plausibility = 0.33.
7. If Systolic Pressure = Low, then Blood Pressure = N2, with plausibility = 0.25.
8. If Systolic Pressure = Normal, then Blood Pressure = N2, with plausibility = 0.5.
9. If Systolic Pressure = High, then Blood Pressure = N2, with plausibility = 0.33.
10. If Systolic Pressure = Normal, then Blood Pressure = H1, with plausibility = 0.5.
11. If Systolic Pressure = High, then Blood Pressure = H1, with plausibility = 0.33.
12. If Systolic Pressure = High, then Blood Pressure = H2, with plausibility = 0.33.
13. If Systolic Pressure = Normal, then Blood Pressure = H3, with plausibility = 0.5.
14. If Systolic Pressure = High, then Blood Pressure = H3, with plausibility = 0.33.
15. If Diastolic Pressure = Normal, then Blood Pressure = L1, with plausibility = 0.5.
16. If Diastolic Pressure = High, then Blood Pressure = L2, with plausibility = 0.25.
17. If Diastolic Pressure = Low, then Blood Pressure = L3, with plausibility = 0.66.
18. If Diastolic Pressure = Low, then Blood Pressure = N1, with plausibility = 0.33.
19. If Diastolic Pressure = High, then Blood Pressure = N1, with plausibility = 0.5.
20. If Diastolic Pressure = High, then Blood Pressure = N2, with plausibility = 0.25.
21. If Diastolic Pressure = Normal, then Blood Pressure = H1, with plausibility = 0.5.
22. If Diastolic Pressure = High, then Blood Pressure = H1, with plausibility = 0.25.
23. If Diastolic Pressure = Low, then Blood Pressure = H2, with plausibility = 0.33.
24. If Diastolic Pressure = High, then Blood Pressure = H3, with plausibility = 0.33.
25. If Systolic Pressure = Normal and Diastolic Pressure = High, then Blood Pressure = N1, with plausibility = 0.5.
26. If Systolic Pressure = Normal and Diastolic Pressure = High, then Blood Pressure = N2, with plausibility = 0.5.
27. If Systolic Pressure = Normal and Diastolic Pressure = High, then Blood Pressure = H1, with plausibility = 0.5.
28. If Systolic Pressure = Normal and Diastolic Pressure = High, then Blood Pressure = H3, with plausibility = 0.5.

Step 13: Since the condition parts of possible rules 1 and 2 are more specific and their plausibility measures smaller than those of rule 3, they are removed from the possible rule set. The remaining rules are treated in the same way.

Step 14: Output the certain and possible rules.

Certain rules:
1. If Diastolic Pressure = Normal, then Blood Pressure = N2.
2. If Systolic Pressure = Low and Diastolic Pressure = Normal, then Blood Pressure = L1.
3. If Systolic Pressure = Low and Diastolic Pressure = High, then Blood Pressure = L2.
4. If Systolic Pressure = Low and Diastolic Pressure = Low, then Blood Pressure = L3.
5. If Systolic Pressure = High and Diastolic Pressure = Low, then Blood Pressure = N1.
6. If Systolic Pressure = Low and Diastolic Pressure = High, then Blood Pressure = N1.
7. If Systolic Pressure = High and Diastolic Pressure = Normal, then Blood Pressure = H1.
8. If Systolic Pressure = High and Diastolic Pressure = Low, then Blood Pressure = H2.
9. If Systolic Pressure = High and Diastolic Pressure = High, then Blood Pressure = H3.

Possible rules:
1. If Systolic Pressure = Low, then Blood Pressure = L3, with plausibility = 0.5.
2. If Systolic Pressure = Low, then Blood Pressure = N2, with plausibility = 0.25.
3. If Systolic Pressure = Normal, then Blood Pressure = N2, with plausibility = 0.5.
4. If Systolic Pressure = High, then Blood Pressure = N2, with plausibility = 0.33.
5. If Systolic Pressure = Normal, then Blood Pressure = H3, with plausibility = 0.5.
6. If Systolic Pressure = High, then Blood Pressure = H3, with plausibility = 0.33.
7. If Diastolic Pressure = Normal, then Blood Pressure = L1, with plausibility = 0.5.
8. If Diastolic Pressure = High, then Blood Pressure = L2, with plausibility = 0.25.
9. If Diastolic Pressure = Low, then Blood Pressure = L3, with plausibility = 0.66.
10. If Diastolic Pressure = Low, then Blood Pressure = N1, with plausibility = 0.33.
11. If Diastolic Pressure = High, then Blood Pressure = N1, with plausibility = 0.5.
12. If Diastolic Pressure = High, then Blood Pressure = N2, with plausibility = 0.25.
13. If Diastolic Pressure = Normal, then Blood Pressure = H1, with plausibility = 0.5.
14. If Diastolic Pressure = Low, then Blood Pressure = H2, with plausibility = 0.33.
15. If Diastolic Pressure = High, then Blood Pressure = H3, with plausibility = 0.33.





VII. DISCUSSION AND CONCLUSION

In this paper, we have proposed a novel data mining algorithm which can process quantitative class-set data with rough sets. The algorithm integrates the fuzzy-set theory and the variable precision rough-set model to discover knowledge. The lower and upper approximations have been defined for managing the objects in the data sets, and the interaction between the data and the approximations helps derive certain and possible rules from data sets with fuzzy class sets. The rules thus mined exhibit fuzzy quantitative regularity in databases and can be used to provide suggestions to appropriate supervisors. Most conventional mining algorithms based on the rough-set theory identify relationships among data using crisp class-set values; class sets with quantitative values, however, are commonly seen in real-world applications. We have therefore dealt with the problem of learning from quantitative class data sets based on rough sets and proposed a learning algorithm which can simultaneously derive certain and possible rules from such data sets, transforming class sets with quantitative values into fuzzy sets of linguistic terms using membership functions. One aspect of our future research is to extend our method with Hong's model [10] for managing data sets with fuzzy attributes and fuzzy class sets.

ACKNOWLEDGEMENT

This research was supported by the Sama Technical and Vocational Training College, Islamic Azad University, Urmia Branch.

REFERENCES

[1] Germano, L. T., & Alexandre, P. (1996). Knowledge-base reduction based on rough set techniques. Canadian Conference on Electrical and Computer Engineering (pp. 278-281).
[2] Graham, I., & Jones, P. L. (1988). Expert systems—knowledge, uncertainty and decision (pp. 117-158). Boston: Chapman and Computing.
[3] Grzymala-Busse, J. W. (1988). Knowledge acquisition under uncertainty: A rough set approach. Journal of Intelligent Robotic Systems, 1, 3-16.
[4] Hong, T. P., Kuo, C. S., & Chi, S. C. (1999). Mining association rules from quantitative data. Intelligent Data Analysis, 3(5), 363-376.
[5] Hong, T. P., & Lee, C. Y. (1996). Induction of fuzzy rules and membership functions from training examples. Fuzzy Sets and Systems, 84, 33-47.
[6] Hong, T. P., & Tseng, S. S. (1997). A generalized version space learning algorithm for noisy and uncertain data. IEEE Transactions on Knowledge and Data Engineering, 9(2), 336-340.
[7] Hong, T. P., Wang, T. T., & Wang, S. L. (2000). Knowledge acquisition from quantitative data using the rough-set theory. Intelligent Data Analysis, 4, 289-304.
[8] Kodratoff, Y., & Michalski, R. S. (1983). Machine learning: An artificial intelligence approach, 3. San Mateo, CA: Morgan Kaufmann Publishers.
[9] Lingras, P. J., & Yao, Y. Y. (1998). Data mining using extensions of the rough set model. Journal of the American Society for Information Science, 49(5), 415-422.
[10] Hong, T. P., Tseng, L. H., & Wang, S. L. (2002). Learning rules from incomplete training examples by rough sets. Expert Systems with Applications, 22, 285-293.
[11] Michalski, R. S., Carbonell, J. G., & Mitchell, T. M. (1983). Machine learning: An artificial intelligence approach, 1. Los Altos, CA: Morgan Kaufmann Publishers.
[12] Michalski, R. S., Carbonell, J. G., & Mitchell, T. M. (1983). Machine learning: An artificial intelligence approach, 2. Los Altos, CA: Morgan Kaufmann Publishers.
[13] Orlowska, E. (1993). Reasoning with incomplete information: Rough set based information logics. In V. Alagar, S. Bergler, & F. Q. Dong (Eds.), Incompleteness and uncertainty in information systems (pp. 16-33). Springer.
[14] Pawlak, Z. (1982). Rough sets. International Journal of Computer and Information Sciences, 11, 341-356.
[15] Rives, J. (1990). FID3: Fuzzy induction decision tree. The First International Symposium on Uncertainty Modeling and Analysis (pp. 457-462).
[16] Wang, C. H., Hong, T. P., & Tseng, S. S. (1998). Integrating fuzzy knowledge by genetic algorithms. IEEE Transactions on Evolutionary Computation, 2(4), 138-149.
[17] Yuan, Y., & Shaw, M. J. (1995). Induction of fuzzy decision trees. Fuzzy Sets and Systems, 69, 125-139.
[18] Zhong, N., Dong, J. Z., Ohsuga, S., & Lin, T. Y. (1998). An incremental, probabilistic rough set approach to rule discovery. IEEE International Conference on Fuzzy Systems, 2, 933-938.
[19] Ziarko, W. (1993). Variable precision rough set model. Journal of Computer and System Sciences, 46, 39-59.
[20] Hong, T. P., Tseng, L. H., & Chien, B. C. (2010). Mining from incomplete quantitative data by fuzzy rough sets. Expert Systems with Applications, 37, 2644-2653.





AUTHORS PROFILE

Mojtaba MadadyarAdeh was born in Urmia, Iran, in 1983. He earned his BSc and MSc degrees from the Islamic Azad University in software engineering. He works at Sama Technical and Vocational Training College, Urmia Branch, Iran, as a faculty member and is the director of the computer group. His studies involve research on distributed systems, neural networks and data mining.

Dariush Dashchi Rezaee works in the department of computer engineering. He received his BSc and MSc from Islamic Azad University in computer architecture. He is interested in research on data mining with rough sets and fuzzy systems.

Ali Soultanmohammadi received his BSc and MSc from Islamic Azad University in computer architecture. He is interested in research on data mining with rough sets and fuzzy systems.
