(IJCSIS) International Journal of Computer Science and Information Security, Vol. 10, No. 4, April 2012
http://sites.google.com/site/ijcsis/ ISSN 1947-5500

Mining Rules from Crisp Attributes by Rough Sets on the Fuzzy Class Sets

Mojtaba MadadyarAdeh#1, Dariush Dashchi Rezaee#2, Ali Soultanmohammadi#3
# Sama Technical and Vocational Training College, Islamic Azad University, Urmia Branch, Urmia, Iran
1 m.madadyar@iaurmia.ac.ir
2 d_dashchi_rezaee@yahoo.com
3 ali_soultanmohammadi@yahoo.com

Abstract: Machine learning can extract desired knowledge and ease the development bottleneck in building expert systems. Among the proposed approaches, deriving classification rules from training examples is the most common. Given a set of examples, a learning program tries to induce rules that describe each class. The rough-set theory has served as a good mathematical tool for dealing with data classification problems. In the past, the rough-set theory was widely used for data classification problems in which the data sets contained crisp attributes and crisp class sets. This paper extends the previous rough-set approach to deal with the problem of producing a set of certain and possible rules from crisp attributes by rough sets on fuzzy class sets. The proposed approach combines the rough-set theory and the fuzzy class set theory to learn. The examples and the approximations then interact with each other to derive certain and possible rules. The rules derived can then serve as knowledge concerning the data sets on the fuzzy class sets.

Keywords: Fuzzy set; Rough set; Data mining; Fuzzy class sets; Crisp attributes; Certain rule; Possible rule; α-cut

I. INTRODUCTION

Machine learning and data mining techniques have recently been developed to find implicitly meaningful patterns and ease the knowledge-acquisition bottleneck. Among these approaches, deriving inference or association rules from training examples is the most common [11], [13]. Given a set of examples and counterexamples of a concept, the learning program tries to induce general rules that describe all or most of the positive training instances and none or few of the counterexamples [6]. If the training instances belong to more than two classes, the learning program tries to induce general rules that describe each class. Recently, the rough-set theory has been used in reasoning and knowledge acquisition for expert systems [3][13]. It was proposed by Pawlak in 1982, with the concept of equivalence classes as its basic principle. Several applications and extensions of the rough-set theory have also been proposed. Examples are Orlowska's reasoning with incomplete information [13], knowledge-base reduction [1], data mining [9], and the rule discovery of Zhong et al. [18]. Because of the success of the rough-set theory in knowledge acquisition, many researchers in the database and machine learning fields are interested in this research topic, since it offers opportunities to discover useful information in training examples.

Ziarko [19] mentioned that the main issue in the rough-set approach was the formation of good rules. He compared the rough-set approach with some other classification approaches. The main characteristic of the rough-set approach is that it can use the notion of inadequacy of available information to perform classification of objects [19][20]. It can also form an approximation space for the analysis of information systems. Partial classification may be formed from the given objects. Ziarko also mentioned the limitations of the rough-set model. For example, classification with a controlled degree of uncertainty or misclassification error is outside the realm of the approach. Overgeneralization is another limitation of the rough-set approach. Ziarko thus proposed the variable precision rough-set model to solve these problems. The variable precision rough-set model has, however, only shown how binary or crisp-valued training data may be handled. Training data in real-world applications usually consist of quantitative values. Although the variable precision rough-set model can also manage quantitative values by taking each quantitative value as an attribute value, the rules formed in this way may be too specific. They may also be hard for humans to interpret. Extending the variable precision rough-set model to deal effectively with quantitative values is thus important to real applications of the model. Since fuzzy set concepts are often used to represent quantitative data by linguistic terms and membership functions because of their simplicity and similarity to human reasoning [2], we attempt to combine the variable precision rough-set model and the fuzzy set theory to solve the above problems. The rules mined are expressed in linguistic terms, which are more natural and understandable for human beings. Since the number of linguistic terms is much smaller than that of possible quantitative values, the over-specialization problem can be avoided. Hong et al. [7] have successfully proposed a mining algorithm to find fuzzy rules based on the rough-set model. The variable precision rough-set model can be thought of as a generalization of the rough-set model. Hong et al. [10] dealt with the problem of producing a set of certain and possible rules from incomplete data sets on crisp class sets.

In this paper, we deal with the problem of producing a set of certain and possible rules by mining crisp attributes by rough sets on fuzzy class sets. A new method, which combines the rough-set theory and the fuzzy class set theory to learn, is proposed to solve this problem. It first transforms each quantitative class value into a fuzzy set of linguistic terms using membership functions and converts each fuzzy class set by α-cuts into several crisp subclasses. It then calculates the lower and the upper approximations. The certain and possible rules are then generated based on these approximations. The paper thus extends the existing rough-set mining approaches to process quantitative data with tolerance of noise and uncertainty.

The remaining parts of this paper are organized as follows. In Section II, the rough-set theory is reviewed. In Section III, α-cuts and fuzzy class sets are described. In Section IV, the notation used in this paper is described. In Section V, the proposed algorithm for crisp-attribute data sets on fuzzy class sets is presented. In Section VI, an example is given to illustrate the proposed algorithm.

II. REVIEW OF THE ROUGH-SET THEORY

The rough-set theory, proposed by Pawlak in 1982 [14], can serve as a mathematical tool for dealing with data classification problems. It adopts the concept of equivalence classes to partition training instances according to some criteria. Two kinds of partitions are formed in the mining process: lower approximations and upper approximations, from which certain and possible rules can easily be derived. Formally, let U be a set of training examples (objects), A be a set of attributes describing the examples, C be a set of classes, and V_j be the value domain of an attribute A_j. Also let v_j(i) be the value of attribute A_j for the ith object Obj(i). When two objects Obj(i) and Obj(k) have the same value of attribute A_j (that is, v_j(i) = v_j(k)), Obj(i) and Obj(k) are said to have an indiscernibility relation (or an equivalence relation) on attribute A_j. Also, if Obj(i) and Obj(k) have the same values for each attribute in a subset B of A, Obj(i) and Obj(k) are said to have an indiscernibility (equivalence) relation on attribute set B. These equivalence relations partition the object set U into disjoint subsets, denoted by U/B, and the partition including Obj(i) is denoted by B(Obj(i)). The set of equivalence classes for subset B is referred to as the B-elementary set.

Example 1. Table I shows a data set containing seven objects denoted by U = {Obj(1), Obj(2), ..., Obj(7)}, two attributes denoted by A = {Systolic Pressure (SP), Diastolic Pressure (DP)}, and a class set Blood Pressure (BP). Assume the attributes and the class set have three possible values: {Low (L), Normal (N), High (H)}.

TABLE I. THE DATA SET FOR EXAMPLE 1.

Object   Systolic Pressure (SP)   Diastolic Pressure (DP)   Blood Pressure (BP)
obj(1)   L                        N                         L
obj(2)   H                        N                         H
obj(3)   N                        N                         N
obj(4)   L                        L                         L
obj(5)   H                        H                         H
obj(6)   N                        H                         H
obj(7)   N                        L                         N

Since Obj(1) and Obj(4) have the same attribute value (L) for attribute SP, they share an indiscernibility relation and thus belong to the same equivalence class for SP. The equivalence partitions (elementary sets) for singleton attributes can be derived as follows:

U/{SP} = {{obj(2), obj(5)}, {obj(3), obj(6), obj(7)}, {obj(1), obj(4)}}, and
U/{DP} = {{obj(1), obj(2), obj(3)}, {obj(4), obj(7)}, {obj(5), obj(6)}}.

Also, {SP}(obj(1)) = {SP}(obj(4)) = {obj(1), obj(4)}.

The rough-set approach analyzes data according to two basic concepts, namely the lower and the upper approximations of a set. Let X be an arbitrary subset of the universe U, and B be an arbitrary subset of attribute set A. The lower and the upper approximations for B on X, denoted B_*(X) and B^*(X) respectively, are defined as follows [20][4]:

$$B_*(X) = \{x \mid x \in U,\ B(x) \subseteq X\} \qquad (1)$$

$$B^*(X) = \{x \mid x \in U \text{ and } B(x) \cap X \neq \emptyset\} \qquad (2)$$

Elements in B_*(X) can be classified as members of set X with full certainty using attribute set B, so B_*(X) is called the lower approximation of X. Similarly, elements in B^*(X) can be classified as members of set X with only partial certainty using attribute set B, so B^*(X) is called the upper approximation of X.

Example 2. Continuing from Example 1, assume X = {Obj(1), Obj(4)}. The lower and the upper approximations of attribute DP with respect to X can be calculated as follows:

DP_*(X) = Ø, and
DP^*(X) = {{obj(1), obj(2), obj(3)}, {obj(4), obj(7)}}.

After the lower and the upper approximations have been found, the rough-set theory can then be used to derive certain information and induce certain and possible rules from them (Grzymala-Busse, 1988).

III. α-CUT AND FUZZY CLASS SETS

An α-level set of a fuzzy set A of X is a non-fuzzy set denoted by [A]^α and is defined by

$$[A]^{\alpha} = \begin{cases} \{t \in X \mid A(t) \ge \alpha\} & \text{if } \alpha > 0 \\ \operatorname{cl}(\operatorname{supp}(A)) & \text{if } \alpha = 0 \end{cases} \qquad (3)$$

where cl(supp(A)) denotes the closure of the support of A.

Definition 1 (Support). Let A be a fuzzy subset of X; the support of A, denoted supp(A), is the crisp subset of X whose elements all have nonzero membership grades in A:

$$\operatorname{supp}(A) = \{x \in X \mid A(x) > 0\}. \qquad (4)$$

Definition 2 (Triangular fuzzy number). A fuzzy set A is called a triangular fuzzy number with peak (or center) a, left width α > 0 and right width β > 0 if its membership function has the following form:

$$A(t) = \begin{cases} 1 - (a - t)/\alpha & \text{if } a - \alpha \le t \le a \\ 1 - (t - a)/\beta & \text{if } a \le t \le a + \beta \\ 0 & \text{otherwise} \end{cases} \qquad (5)$$

We use the notation A = (a, α, β). It can easily be verified that

$$[A]^{\gamma} = [a - (1 - \gamma)\alpha,\ a + (1 - \gamma)\beta], \quad \forall \gamma \in [0, 1]. \qquad (6)$$

The support of A is (a - α, a + β).

In the past, the rough-set theory was widely used in dealing with data classification problems [10]. Most conventional mining algorithms based on the rough-set theory identify relationships among data using crisp class set values. Class sets with quantitative values, however, are commonly seen in real-world applications. In this paper, we thus deal with the problem of learning from data sets with quantitative class values based on rough sets. A learning algorithm is proposed, which can simultaneously derive certain and possible rules from such data sets. Class sets with quantitative values are first transformed into fuzzy sets of linguistic terms using membership functions. The fuzzy class sets are then converted by α-cuts into several crisp subclasses. The number of divisions that the α-cuts perform on the linguistic terms is arbitrary.

IV. NOTATION

Notation used in this paper is described as follows:

U           universe of all objects
n           total number of training examples (objects) in U
Obj(i)      ith training example (object), 1 ≤ i ≤ n
A           set of all attributes describing U
m           total number of attributes in A
B           an arbitrary subset of A
A_j         jth attribute, 1 ≤ j ≤ m
|A_j|       number of attribute values for A_j
v_j(i)      the value of A_j for Obj(i)
d           number of divisions (arbitrary) that the α-cuts perform on the linguistic terms
C           set of classes to be determined
c           total number of classes in C
R_k         kth fuzzy region of C, 1 ≤ k ≤ c
e(i)        the value of C for Obj(i)
f(i)        the fuzzy set converted from e(i)
f_k(i)      the membership value of e(i) in region R_k
X_l         lth class, 1 ≤ l ≤ (c×d)
B(Obj(i))   the fuzzy incomplete equivalence classes in which Obj(i) exists
B_*(X)      the fuzzy incomplete lower approximation for B on X
B^*(X)      the fuzzy incomplete upper approximation for B on X

These fuzzy equivalence relations partition the fuzzy object set U into several fuzzy subsets that may overlap, and the result is denoted by U/B. The set of partitions, based on B and including Obj(i), is denoted B(Obj(i)). Thus, B(Obj(i)) = {B_1(Obj(i)), …, B_r(Obj(i))}, where r is the number of partitions included in B(Obj(i)).
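As an illustration of Sections II and III, the following Python sketch computes the elementary sets and the lower and upper approximations of equations (1) and (2) on the data of Table I, and the γ-cut interval of equation (6) for a triangular fuzzy number. It is a minimal sketch, not the authors' implementation: the function names, the dictionary encoding of Table I, and the triangular number (120, 20, 20) used in the demonstration are assumptions made here for illustration. The upper approximation follows equation (2) literally and is returned as a set of objects.

```python
from collections import defaultdict

# Table I of Example 1, encoded as object id -> attribute values.
# This dictionary encoding is an assumption made for illustration.
TABLE_I = {
    1: {"SP": "L", "DP": "N"},
    2: {"SP": "H", "DP": "N"},
    3: {"SP": "N", "DP": "N"},
    4: {"SP": "L", "DP": "L"},
    5: {"SP": "H", "DP": "H"},
    6: {"SP": "N", "DP": "H"},
    7: {"SP": "N", "DP": "L"},
}


def elementary_sets(table, attrs):
    """Return U/B: groups of objects that agree on every attribute in attrs."""
    classes = defaultdict(set)
    for obj, values in table.items():
        classes[tuple(values[a] for a in attrs)].add(obj)
    return list(classes.values())


def lower_approximation(table, attrs, x):
    """Equation (1): objects whose equivalence class is contained in x."""
    result = set()
    for cls in elementary_sets(table, attrs):
        if cls <= x:
            result |= cls
    return result


def upper_approximation(table, attrs, x):
    """Equation (2): objects whose equivalence class intersects x."""
    result = set()
    for cls in elementary_sets(table, attrs):
        if cls & x:
            result |= cls
    return result


def alpha_cut_triangular(a, left, right, gamma):
    """Equation (6): gamma-cut of the triangular fuzzy number (a, left, right)."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return (a - (1.0 - gamma) * left, a + (1.0 - gamma) * right)


if __name__ == "__main__":
    x = {1, 4}  # the set X of Example 2
    print(elementary_sets(TABLE_I, ["DP"]))
    print(lower_approximation(TABLE_I, ["DP"], x))
    print(upper_approximation(TABLE_I, ["DP"], x))
    # Hypothetical triangular number with peak 120 and widths 20 and 20.
    print(alpha_cut_triangular(120.0, 20.0, 20.0, 0.5))
```

For X = {Obj(1), Obj(4)} and attribute DP, the sketch gives an empty lower approximation and an upper approximation made of the objects in {obj(1), obj(2), obj(3)} and {obj(4), obj(7)}, which agrees with Example 2.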
Example 3. Consider the three objects shown in Table II. Assume the linguistic terms in the objects are transformed from quantitative class values by membership functions. Obj(1) is classified as having an (L2 + N1) blood pressure. Obj(2) and Obj(3) are classified similarly. Assume the attributes SP and DP have three possible values (L, H, N). The class set BP has three possible linguistic terms (L, H, N), but these three values are divided into nine subclass sets by three α-cuts on the linguistic terms (L1, L2, L3; H1, H2, H3; N1, N2, N3).

TABLE II. THE DATA SET FOR EXAMPLE 3.

Object   Systolic Pressure (SP)   Diastolic Pressure (DP)   Blood Pressure (BP)
obj(1)   L                        N                         L2+N1
obj(2)   H                        N                         H3+N1
obj(3)   N                        N                         N3

BP = N1 is then formed as (Obj(1), Obj(2)). The other fuzzy class set indiscernibility relations can be similarly derived:

X_L2 = {Obj(1)}
X_N1 = {Obj(1), Obj(2)}
X_H3 = {Obj(2)}
X_N3 = {Obj(3)}

It is easily observed that an object may exist in more than one subclass of a class set. In the above example, Obj(1) exists in two subclasses of the class set (X_L2, X_N1).

Also, for attributes, SP = N is formed as Obj(3). The other indiscernibility relations can be similarly derived. U/{SP} has thus been found as follows:

U/{SP} = {(Obj(1)), (Obj(2)), (Obj(3))}

Similarly,

U/{DP} = {(Obj(1), Obj(2), Obj(3))}

The lower and upper approximations for B on X, denoted B_*(X) and B^*(X) respectively, are defined as in equations (1) and (2).

Assume X_N1 = {Obj(1), Obj(2)}. Since the equivalence classes (Obj(1)) and (Obj(2)) in U/{SP} are included in X_N1, the lower approximation for attribute SP on X_N1 is:

SP_*(X_N1) = {(Obj(1)), (Obj(2))}

The same equivalence classes in U/{SP} have non-empty intersections with X_N1. Since they have already been included in the lower approximation, they are not listed again, and the upper approximation for attribute SP on X_N1 is:

SP^*(X_N1) = Ø

The examples in this paper follow this convention: an upper approximation lists only the equivalence classes that are not already in the corresponding lower approximation. The lower and upper approximations for attribute DP on X_N1 can be similarly derived.

V. THE PROPOSED ALGORITHM FOR CRISP ATTRIBUTES ROUGH SETS ON THE FUZZY CLASS SETS

In this section, a learning algorithm based on rough sets is proposed, which can convert each fuzzy class set by α-cuts into several crisp subclasses and derive certain and possible rules from crisp-attribute data sets on fuzzy class sets. The proposed learning algorithm first transforms each quantitative class value into a fuzzy set of linguistic terms using membership functions and converts each fuzzy class set by α-cuts into three crisp subclasses. The algorithm then calculates lower and upper approximations. The details of the proposed learning algorithm are described as follows.

The algorithm for mining rules from crisp attributes by rough sets on the fuzzy class sets:

Input: A quantitative data set with n objects, each with m attribute values, and a set of membership functions for the class sets.

Output: A set of certain and possible rules.

Step 1: Transform the quantitative class value e(i) of each object Obj(i), i = 1 to n, for each class set C, into a fuzzy set f(i), represented as (f_1(i)/R_1 + f_2(i)/R_2 + … + f_l(i)/R_l), using the given membership functions, where R_k is the kth fuzzy region of class set C, f_k(i) is the fuzzy membership value of e(i) in region R_k, and l (= c×d) is the number of fuzzy regions for C.

Step 2: Convert the fuzzy class sets by α-cuts into several crisp subclasses. The number of divisions that the α-cuts perform on the linguistic terms is arbitrary.

Step 3: Partition the object set into subsets according to subclass labels. Denote each set of objects belonging to the same subclass C_l as X_l.

Step 4: Find the elementary sets of singleton attributes.

Step 5: Initialize q = 1, where q is used to count the number of attributes currently being processed for lower approximations.

Step 6: Compute the lower approximations of each subset B with q attributes for each class X_l as:

$$B_*(X) = \{Obj^{(i)} \mid Obj^{(i)} \in U,\ B(Obj^{(i)}) \subseteq X\} \qquad (7)$$

where B(Obj(i)) is the set of equivalence classes including Obj(i) and derived from attribute subset B.

Step 7: Compute the upper approximations of each subset B with q attributes for each class X_l as:

$$B^*(X) = \{Obj^{(i)} \mid Obj^{(i)} \in U \text{ and } B(Obj^{(i)}) \cap X \neq \emptyset\} \qquad (8)$$

where B(Obj(i)) is the set of equivalence classes including Obj(i) and derived from attribute subset B.

Step 8: Calculate the plausibility measure of each fuzzy incomplete equivalence class in an upper approximation for each class X_l as:

$$P(B(Obj^{(i)})) = \frac{|B(Obj^{(i)}) \cap X|}{|B(Obj^{(i)})|} \qquad (9)$$

Step 9: Set q = q + 1 and repeat Steps 6–9 until q > m.

Step 10: Derive the certain rules from the fuzzy lower approximation B_*(X_l) of any subset B.

Step 11: Remove the certain rules whose condition parts are more specific than those of other rules. The removal follows the inclusion (intersection) between subclasses. For example, because "H3" includes "H2" and "H1", those can be removed.

Step 12: Derive the β-possible rules from the fuzzy β-upper approximation B^*_β(X) of any subset B.

Step 13: Remove the possible rules whose condition parts are more specific than those of other rules. The removal follows the inclusion (intersection) between subclasses and the plausibility measures.

Step 14: Output the certain and possible rules.

VI. AN EXAMPLE

In this section, an example is given to show how the proposed algorithm can be used to generate maximally general certain and possible rules from data with quantitative class values. Table III shows a data set with quantitative class values, which is similar to that shown in Table I except that the class values are represented as quantitative values. Assume the membership functions for the class set are given by experts as shown in Fig. 1. The proposed learning algorithm processes this quantitative data set as follows.

TABLE III. A QUANTITATIVE DATA SET AS AN EXAMPLE.

Object   Systolic Pressure (SP)   Diastolic Pressure (DP)   Blood Pressure (BP)
obj(1)   L                        N                         89
obj(2)   H                        L                         124
obj(3)   N                        H                         122
obj(4)   L                        L                         75
obj(5)   H                        H                         135
obj(6)   N                        H                         125
obj(7)   L                        L                         78
obj(8)   L                        H                         85
obj(9)   H                        N                         121

Step 1: The quantitative values of each object are transformed into fuzzy sets. Take the class set Blood Pressure in Obj(2) as an example. The value "124" is converted into the fuzzy set (0.24/N + 0.4/H) using the given membership functions. Results for all the objects are shown in Table IV.

TABLE IV. THE FUZZY SETS TRANSFORMED FROM THE CLASS SETS IN TABLE III.

Object   Systolic Pressure (SP)   Diastolic Pressure (DP)   Blood Pressure (BP)
obj(1)   L                        N                         0.36/N+0.1/L
obj(2)   H                        L                         0.24/N+0.4/H
obj(3)   N                        H                         0.32/N+0.2/H
obj(4)   L                        L                         1/L
obj(5)   H                        H                         1/H
obj(6)   N                        H                         0.2/N+0.5/H
obj(7)   L                        L                         1/L
obj(8)   L                        H                         0.2/N+0.5/L
obj(9)   H                        N                         0.36/N+0.1/H

Step 2: Convert the fuzzy class sets by α-cuts into several crisp subclasses. The number of divisions that the α-cuts perform on the linguistic terms is arbitrary. Here, if α = 0.3 the subclass label is "1", if α = 0.7 the subclass label is "2", and if α = 1 the subclass label is "3". With these α-cuts, "H3" includes "H1" and "H2". In Table V, memberships up to 0.3 receive label 1, memberships above 0.3 and up to 0.7 receive label 2, and a membership of 1 receives label 3.

Figure 1. The given membership function of class sets.

TABLE V. THE FUZZY CLASS SETS OF TABLE IV CONVERTED BY α-CUTS.

Object   Systolic Pressure (SP)   Diastolic Pressure (DP)   Blood Pressure (BP)
obj(1)   L                        N                         N2 + L1
obj(2)   H                        L                         N1 + H2
obj(3)   N                        H                         N2 + H1
obj(4)   L                        L                         L3
obj(5)   H                        H                         H3
obj(6)   N                        H                         N1 + H2
obj(7)   L                        L                         L3
obj(8)   L                        H                         N1 + L2
obj(9)   H                        N                         N2 + H1

Step 3: Partition the object set into subsets according to subclass labels. Denote each set of objects belonging to the same subclass C_l as X_l.

X_L1 = {Obj(1)}, X_L2 = {Obj(8)}, X_L3 = {Obj(4), Obj(7)}
X_N1 = {Obj(2), Obj(6), Obj(8)}, X_N2 = {Obj(1), Obj(3), Obj(9)}, X_N3 = Ø
X_H1 = {Obj(3), Obj(9)}, X_H2 = {Obj(2)}, X_H3 = {Obj(5), Obj(6)}

Step 4: Find the elementary sets of singleton attributes.

U/{SP} = {{Obj(1), Obj(4), Obj(7), Obj(8)}, {Obj(3), Obj(6)}, {Obj(2), Obj(5), Obj(9)}} and
U/{DP} = {{Obj(2), Obj(4), Obj(7)}, {Obj(1), Obj(9)}, {Obj(3), Obj(5), Obj(6), Obj(8)}}.

Step 5: Initialize q = 1, where q is used to count the number of attributes currently being processed for lower approximations.

Step 6: Compute the lower approximations of each subset B with q attributes for each class X_l as:

SP_*(X_L1) = Ø, SP_*(X_L2) = Ø, SP_*(X_L3) = Ø
SP_*(X_N1) = Ø, SP_*(X_N2) = Ø
SP_*(X_H1) = Ø, SP_*(X_H2) = Ø, SP_*(X_H3) = Ø
and
DP_*(X_L1) = Ø, DP_*(X_L2) = Ø, DP_*(X_L3) = Ø
DP_*(X_N1) = Ø, DP_*(X_N2) = {{Obj(1), Obj(9)}}
DP_*(X_H1) = Ø, DP_*(X_H2) = Ø, DP_*(X_H3) = Ø

Step 7: Compute the upper approximations of each subset B with q attributes for each class X_l as:

SP^*(X_L1) = {{Obj(1), Obj(4), Obj(7), Obj(8)}}
SP^*(X_L2) = {{Obj(1), Obj(4), Obj(7), Obj(8)}}
SP^*(X_L3) = {{Obj(1), Obj(4), Obj(7), Obj(8)}}
SP^*(X_N1) = {{Obj(1), Obj(4), Obj(7), Obj(8)}, {Obj(3), Obj(6)}, {Obj(2), Obj(5), Obj(9)}}
SP^*(X_N2) = {{Obj(1), Obj(4), Obj(7), Obj(8)}, {Obj(3), Obj(6)}, {Obj(2), Obj(5), Obj(9)}}
SP^*(X_H1) = {{Obj(3), Obj(6)}, {Obj(2), Obj(5), Obj(9)}}
SP^*(X_H2) = {{Obj(2), Obj(5), Obj(9)}}
SP^*(X_H3) = {{Obj(3), Obj(6)}, {Obj(2), Obj(5), Obj(9)}}
and
DP^*(X_L1) = {{Obj(1), Obj(9)}}
DP^*(X_L2) = {{Obj(3), Obj(5), Obj(6), Obj(8)}}
DP^*(X_L3) = {{Obj(2), Obj(4), Obj(7)}}
DP^*(X_N1) = {{Obj(2), Obj(4), Obj(7)}, {Obj(3), Obj(5), Obj(6), Obj(8)}}
DP^*(X_N2) = {{Obj(3), Obj(5), Obj(6), Obj(8)}}
DP^*(X_H1) = {{Obj(1), Obj(9)}, {Obj(3), Obj(5), Obj(6), Obj(8)}}
DP^*(X_H2) = {{Obj(2), Obj(4), Obj(7)}}
DP^*(X_H3) = {{Obj(3), Obj(5), Obj(6), Obj(8)}}.

Step 8: Calculate the plausibility measure of each fuzzy equivalence class in an upper approximation for each subclass X_l. For example, for subclass L1:

P(SP_L1(Obj(1) or Obj(4) or Obj(7) or Obj(8))) = 1/4

Step 9: Set q = q + 1 and repeat Steps 6–9 until q > m.

U/{SP,DP} = {{Obj(1)}, {Obj(2)}, {Obj(3), Obj(6)}, {Obj(4), Obj(7)}, {Obj(5)}, {Obj(8)}, {Obj(9)}}.

The lower approximations are:

SP,DP_*(X_L1) = {{Obj(1)}}, SP,DP_*(X_L2) = {{Obj(8)}}, SP,DP_*(X_L3) = {{Obj(4), Obj(7)}}
SP,DP_*(X_N1) = {{Obj(2)}, {Obj(8)}}, SP,DP_*(X_N2) = {{Obj(1)}, {Obj(9)}}
SP,DP_*(X_H1) = {{Obj(9)}}, SP,DP_*(X_H2) = {{Obj(2)}}, SP,DP_*(X_H3) = {{Obj(5)}}

and the upper approximations are:

SP,DP^*(X_L1) = Ø, SP,DP^*(X_L2) = Ø, SP,DP^*(X_L3) = Ø
SP,DP^*(X_N1) = {{Obj(3), Obj(6)}}, SP,DP^*(X_N2) = {{Obj(3), Obj(6)}}
SP,DP^*(X_H1) = {{Obj(3), Obj(6)}}, SP,DP^*(X_H2) = Ø, SP,DP^*(X_H3) = {{Obj(3), Obj(6)}}

Step 10: Derive the certain rules from the fuzzy lower approximation B_*(X_l) of any subset B.

1. If Diastolic Pressure = Normal Then Blood Pressure = N2.
2. If Systolic Pressure = Low and Diastolic Pressure = Normal Then Blood Pressure = L1.
3. If Systolic Pressure = Low and Diastolic Pressure = High Then Blood Pressure = L2.
4. If Systolic Pressure = Low and Diastolic Pressure = Low Then Blood Pressure = L3.
5. If Systolic Pressure = High and Diastolic Pressure = Low Then Blood Pressure = N1.
6. If Systolic Pressure = Low and Diastolic Pressure = High Then Blood Pressure = N1.
7. If Systolic Pressure = Low and Diastolic Pressure = Normal Then Blood Pressure = N2.
8. If Systolic Pressure = High and Diastolic Pressure = Normal Then Blood Pressure = N2.
9. If Systolic Pressure = High and Diastolic Pressure = Normal Then Blood Pressure = H1.
10. If Systolic Pressure = High and Diastolic Pressure = Low Then Blood Pressure = H2.
11. If Systolic Pressure = High and Diastolic Pressure = High Then Blood Pressure = H3.

Step 11: Since the condition parts of certain rules 7 and 8 are more specific than that of the first rule, and their subclass labels under the intersection between subclasses are not larger than its label, the two certain rules are removed from the certain rule set.

Step 12: Derive the possible rules from the fuzzy upper approximation B^*(X) of any subset B.

1. If Systolic Pressure = Low Then Blood Pressure = L1, with plausibility = 0.25.
2. If Systolic Pressure = Low Then Blood Pressure = L2, with plausibility = 0.25.
3. If Systolic Pressure = Low Then Blood Pressure = L3, with plausibility = 0.5.
4. If Systolic Pressure = Low Then Blood Pressure = N1, with plausibility = 0.25.
5. If Systolic Pressure = Normal Then Blood Pressure = N1, with plausibility = 0.5.
6. If Systolic Pressure = High Then Blood Pressure = N1, with plausibility = 0.33.
7. If Systolic Pressure = Low Then Blood Pressure = N2, with plausibility = 0.25.
8. If Systolic Pressure = Normal Then Blood Pressure = N2, with plausibility = 0.5.
9. If Systolic Pressure = High Then Blood Pressure = N2, with plausibility = 0.33.
10. If Systolic Pressure = Normal Then Blood Pressure = H1, with plausibility = 0.5.
11. If Systolic Pressure = High Then Blood Pressure = H1, with plausibility = 0.33.
12. If Systolic Pressure = High Then Blood Pressure = H2, with plausibility = 0.33.
13. If Systolic Pressure = Normal Then Blood Pressure = H3, with plausibility = 0.5.
14. If Systolic Pressure = High Then Blood Pressure = H3, with plausibility = 0.33.
15. If Diastolic Pressure = Normal Then Blood Pressure = L1, with plausibility = 0.5.
16. If Diastolic Pressure = High Then Blood Pressure = L2, with plausibility = 0.25.
17. If Diastolic Pressure = Low Then Blood Pressure = L3, with plausibility = 0.66.
18. If Diastolic Pressure = Low Then Blood Pressure = N1, with plausibility = 0.33.
19. If Diastolic Pressure = High Then Blood Pressure = N1, with plausibility = 0.5.
20. If Diastolic Pressure = High Then Blood Pressure = N2, with plausibility = 0.25.
21. If Diastolic Pressure = Normal Then Blood Pressure = H1, with plausibility = 0.5.
22. If Diastolic Pressure = High Then Blood Pressure = H1, with plausibility = 0.25.
23. If Diastolic Pressure = Low Then Blood Pressure = H2, with plausibility = 0.33.
24. If Diastolic Pressure = High Then Blood Pressure = H3, with plausibility = 0.33.
25. If Systolic Pressure = Normal and Diastolic Pressure = High Then Blood Pressure = N1, with plausibility = 0.5.
26. If Systolic Pressure = Normal and Diastolic Pressure = High Then Blood Pressure = N2, with plausibility = 0.5.
27. If Systolic Pressure = Normal and Diastolic Pressure = High Then Blood Pressure = H1, with plausibility = 0.5.
28. If Systolic Pressure = Normal and Diastolic Pressure = High Then Blood Pressure = H3, with plausibility = 0.5.

Step 13: Since the condition parts, plausibility measures, and subclass labels (under the intersection between subclasses) of possible rules 1 and 2 are more specific and smaller than those of rule 3, these two rules are removed from the possible fuzzy rule set. The remaining rules are processed in the same way.

Step 14: Output the certain and possible rules.

Certain rules:

1. If Diastolic Pressure = Normal Then Blood Pressure = N2.
2. If Systolic Pressure = Low and Diastolic Pressure = Normal Then Blood Pressure = L1.
3. If Systolic Pressure = Low and Diastolic Pressure = High Then Blood Pressure = L2.
4. If Systolic Pressure = Low and Diastolic Pressure = Low Then Blood Pressure = L3.
5. If Systolic Pressure = High and Diastolic Pressure = Low Then Blood Pressure = N1.
6. If Systolic Pressure = Low and Diastolic Pressure = High Then Blood Pressure = N1.
7. If Systolic Pressure = High and Diastolic Pressure = Normal Then Blood Pressure = H1.
8. If Systolic Pressure = High and Diastolic Pressure = Low Then Blood Pressure = H2.
9. If Systolic Pressure = High and Diastolic Pressure = High Then Blood Pressure = H3.

Possible rules:

1. If Systolic Pressure = Low Then Blood Pressure = L3, with plausibility = 0.5.
2. If Systolic Pressure = Low Then Blood Pressure = N2, with plausibility = 0.25.
3. If Systolic Pressure = Normal Then Blood Pressure = N2, with plausibility = 0.5.
4. If Systolic Pressure = High Then Blood Pressure = N2, with plausibility = 0.33.
5. If Systolic Pressure = Normal Then Blood Pressure = H3, with plausibility = 0.5.
6. If Systolic Pressure = High Then Blood Pressure = H3, with plausibility = 0.33.
7. If Diastolic Pressure = Normal Then Blood Pressure = L1, with plausibility = 0.5.
8. If Diastolic Pressure = High Then Blood Pressure = L2, with plausibility = 0.25.
9. If Diastolic Pressure = Low Then Blood Pressure = L3, with plausibility = 0.66.
10. If Diastolic Pressure = Low Then Blood Pressure = N1, with plausibility = 0.33.
11. If Diastolic Pressure = High Then Blood Pressure = N1, with plausibility = 0.5.
12. If Diastolic Pressure = High Then Blood Pressure = N2, with plausibility = 0.25.
13. If Diastolic Pressure = Normal Then Blood Pressure = H1, with plausibility = 0.5.
14. If Diastolic Pressure = Low Then Blood Pressure = H2, with plausibility = 0.33.
15. If Diastolic Pressure = High Then Blood Pressure = H3, with plausibility = 0.33.

VII. DISCUSSION AND CONCLUSION

In this paper, we have proposed a data mining algorithm that applies rough sets to data with quantitative class values. The algorithm integrates both the fuzzy set theory and the variable precision rough-set model to discover knowledge. The lower and upper approximations have been defined for managing objects in data sets. The interaction between data and approximations helps derive certain and possible rules from data sets and fuzzy class sets. The rules thus mined exhibit fuzzy quantitative regularity in databases and can be used to provide some suggestions to appropriate supervisors. Most conventional mining algorithms based on the rough-set theory identify relationships among data using crisp class set values. Class sets with quantitative values, however, are commonly seen in real-world applications. We thus deal with the problem of learning from data sets with quantitative class values based on rough sets. A learning algorithm is proposed, which can simultaneously derive certain and possible rules from such data sets. Class sets with quantitative values are first transformed into fuzzy sets of linguistic terms using membership functions. One aspect of our future research is to extend our method with the model of Hong et al. for managing data sets with fuzzy attributes and fuzzy class sets.

ACKNOWLEDGEMENT

This research was supported by the Sama Technical and Vocational Training College, Islamic Azad University, Urmia Branch.

REFERENCES

[1] Germano, L. T., & Alexandre, P. (1996). Knowledge-base reduction based on rough set techniques. Canadian conference on electrical and computer engineering (pp. 278–281).
[2] Graham, I., & Jones, P. L. (1988). Expert systems: knowledge, uncertainty and decision (pp. 117–158). Boston: Chapman and Computing.
[3] Grzymala-Busse, J. W. (1988). Knowledge acquisition under uncertainty: A rough set approach. Journal of Intelligent Robotic Systems, 1, 3–16.
[4] Hong, T. P., Kuo, C. S., & Chi, S. C. (1999). Mining association rules from quantitative data. Intelligent Data Analysis, 3(5), 363–376.
[5] Hong, T. P., & Lee, C. Y. (1996). Induction of fuzzy rules and membership functions from training examples. Fuzzy Sets and Systems, 84, 33–47.
[6] Hong, T. P., & Tseng, S. S. (1997). A generalized version space learning algorithm for noisy and uncertain data. IEEE Transactions on Knowledge and Data Engineering, 9(2), 336–340.
[7] Hong, T. P., Wang, T. T., & Wang, S. L. (2000). Knowledge acquisition from quantitative data using the rough-set theory. Intelligent Data Analysis, 4, 289–304.
[8] Kodratoff, Y., & Michalski, R. S. (1983). Machine learning: An artificial intelligence approach, 3. San Mateo, CA: Morgan Kaufmann Publishers.
[9] Lingras, P. J., & Yao, Y. Y. (1998). Data mining using extensions of the rough set model. Journal of the American Society for Information Science, 49(5), 415–422.
[10] Hong, T. P., Tseng, L. H., & Wang, S. L. (2002). Learning rules from incomplete training examples by rough sets. Expert System with Application, 22, 285–293.
[11] Michalski, R. S., Carbonell, J. G., & Mitchell, T. M. (1983). Machine learning: An artificial intelligence approach 1. Los Altos, CA: Morgan Kaufmann Publishers.
[12] Michalski, R. S., Carbonell, J. G., & Mitchell, T. M. (1983). Machine learning: An artificial intelligence approach 2. Los Altos, CA: Morgan Kaufmann Publishers.
[13] Orlowska, E. (1993). Reasoning with incomplete information: rough set based information logics. In V. Alagar, S. Bergler, & F. Q. Dong (Eds.), Incompleteness and uncertainty in information systems (pp. 16–33). Springer.
[14] Pawlak, Z. (1982). Rough set. International Journal of Computer and Information Sciences, 341–356.
[15] Rives, J. (1990). FID3: Fuzzy induction decision tree. In The first international symposium on uncertainty modeling and analysis (pp. 457–462).
[16] Wang, C. H., Hong, T. P., & Tseng, S. S. (1998). Integrating fuzzy knowledge by genetic algorithms. IEEE Transactions on Evolutionary Computation, 2(4), 138–149.
[17] Yuan, Y., & Shaw, M. J. (1995). Induction of fuzzy decision trees. Fuzzy Sets and Systems, 69, 125–139.
[18] Zhong, N., Dong, J. Z., Ohsuga, S., & Lin, T. Y. (1998). An incremental, probabilistic rough set approach to rule discovery. IEEE International Conference on Fuzzy Systems, 2, 933–938.
[19] Ziarko, W. (1993). Variable precision rough set model. Journal of Computer and System Sciences, 46, 39–59.
[20] Hong, T. P., Tseng, L. H., & Chien, B. C. (2010). Mining from incomplete quantitative data by fuzzy rough sets. Expert System with Application, 37, 2644–2653.

AUTHORS PROFILE

Mojtaba MadadyarAdeh was born in Urmia, Iran in 1983. He earned his BSc and MSc degrees in software engineering from the Islamic Azad University. He works at Sama Technical and Vocational Training College, Urmia Branch, Iran, as a faculty member and is the director of the computer group. His studies involve research on distributed systems, neural networks and data mining.

Dariush Dashchi Rezaee is working as master of the department of computer engineering. He received his BSc and MSc in computer architecture from the Islamic Azad University. He is interested in research on data mining with rough sets and fuzzy systems.

Ali Soultanmohammadi received his BSc and MSc in computer architecture from the Islamic Azad University. He is interested in research on data mining with rough sets and fuzzy systems.