A Fuzzy Clustering-Based Algorithm for Fuzzy Modeling

GEORGE E. TSEKOURAS*, CHRISTOS KALLONIATIS, EVAGELIA KAVAKLI, AND THOMAS MAVROFIDES
Laboratory of Image Processing and Multimedia Applications, Department of Cultural Technology and Communication, University of the Aegean, Faonos & Harilaou Trikoupi Str., 81100, Mytilene, Greece
* Tel: +301-2251-0-36631, Fax: +301-2251-0-36609

Abstract: Fuzzy rules have a simple structure within a multidimensional vector space and they are produced by partitioning this space into fuzzy subspaces. The most efficient way to produce fuzzy partitions in a vector space is the use of fuzzy clustering analysis. This paper proposes a fuzzy clustering-based algorithm, which generates fuzzy rules from a set of input-output data. The algorithm is based on the assumption that, when an input fully matches the premise part of a specific fuzzy rule, the corresponding output should completely participate in the consequent part. In order to accomplish this, certain conditions are derived. The application of the algorithm to a test case, which has been considered as a benchmark in fuzzy modeling applications, shows that the produced models are of compact size and perform accurately.

Key-Words: Fuzzy clustering; Fuzzy modeling; System identification; Model parameter estimation; Fuzzy partition; Crisp partition.

1. Introduction

The basic issue in fuzzy modeling is the identification procedure that is employed. Fuzzy model identification consists of structure identification and parameter estimation. Structure identification is directly related to the determination of the appropriate number of rules [1,2]. On the other hand, parameter estimation concerns the calculation of the appropriate model parameter values that provide an accurate system description. Structure identification and parameter estimation are usually carried out via a training procedure. So far, a wide spectrum of methods has been proposed to train fuzzy systems. Many of these methods use heuristic approaches [3], self-learning and adaptive schemes [4,5], or gradient descent algorithms [6].

One of the most efficient fuzzy modeling procedures is the utilization of fuzzy clustering analysis. Fuzzy clustering provides a certain advantage over other approaches, since the partition of the input (or the product) space is obtained as a direct result [7]. The method developed in [8] uses fuzzy clustering analysis to detect multidimensional reference fuzzy areas, where the number of rules is determined by reducing the model parameters, based on a system performance index. An algorithm is proposed in [9] that yields clusters in the mapping space by incorporating the nature of the functional relationships into an objective function. In [10] the structure identification is obtained via hyper-ellipsoidal clustering with simultaneous use of human intuition, while in [11] the hyper-ellipsoidal subspaces have been replaced by spherical fuzzy areas where the membership functions determine the structure of the rules.

In this paper, a novel fuzzy clustering-based method is proposed for system identification. The proposed algorithm is based on decomposing the input space into a certain number of subspaces (clusters), each of which is assigned to a specific fuzzy rule. The output space is then correspondingly partitioned into the same number of clusters, in such a way that certain conditions are satisfied.

2. The Proposed Algorithm

In this section the proposed algorithm is analyzed in detail. The algorithm is able to efficiently generate fuzzy rules based on a set of $n$ input-output data pairs of the form $(x_k; y_k)$ $(1 \le k \le n)$. The basic design issues of the proposed method are described in the next subsections.

2.1 Partitioning the Input Space by Using Fuzzy Clustering Analysis

A major issue in fuzzy modeling is the reduction of the computational complexity, and since simplified fuzzy models use fewer parameters their usefulness is considerable. In our approach we adopt the simplified fuzzy model introduced in [3], which is described by the following fuzzy rules,

$$R^i: \ \text{If } x_1 \text{ is } X_1^i \text{ and } x_2 \text{ is } X_2^i \text{ and } \dots \text{ and } x_p \text{ is } X_p^i \ \text{ Then } y \text{ is } b^i \quad (1 \le i \le c) \tag{1}$$

where $p$ is the number of inputs, $c$ is the number of rules, $X_j^i$ $(1 \le i \le c;\ 1 \le j \le p)$ are fuzzy sets, and $b^i$ are real numbers. The above fuzzy model can approximate any nonlinear function to arbitrary accuracy on a compact set [3].

Based on fuzzy reasoning, it is evident that even when an input linguistic variable does not appear in the premise part of a rule, a fuzzy set can be assigned to it with a firing degree of unity. This remark suggests a uniform structure of the premise part of the rule base, where all the input linguistic variables participate in all of the fuzzy rules. In addition, a more credible fuzzy rule base can be created by assuming that the output variable participates in each rule with a normal fuzzy set, meaning that there is at least one element belonging to the fuzzy set with membership degree of unity. Considering that $c$ fuzzy rules are needed to describe a nonlinear system, the uniform structure of the premise part of the rule base enables us to partition the input space $X$ into $c$ fuzzy subspaces $X^1, X^2, \dots, X^c$. Each of these subspaces is assigned to only one fuzzy rule. Therefore, the fuzzy rule in (1) can be modified as,

$$R^i: \ \text{If } x \text{ is } X^i \ \text{ Then } y \text{ is } b^i \quad (1 \le i \le c) \tag{2}$$

where $x = [x_1, x_2, \dots, x_p]^T$ and $X^i \subset X$ with $X^i = \{X_1^i, X_2^i, \dots, X_p^i\}$. Since our model is described by fuzzy rules of the form (2), we can produce a constrained fuzzy c-partition of the input space $X$ by applying the well-known fuzzy c-means algorithm to the input training data set. The fuzzy c-means is based on the minimization of the following objective function [12],

$$J_m = \sum_{k=1}^{n} \sum_{i=1}^{c} (u_{ik})^m \, \| x_k - v_i \|^2 \tag{3}$$

under the equality constraint,

$$\sum_{i=1}^{c} u_{ik} = 1, \quad \forall k \tag{4}$$

where $n$ is the number of training data vectors, $c$ is the number of clusters, $u_{ik}$ is the membership degree of the $k$-th training vector to the $i$-th cluster, $m \in (1, \infty)$ is a factor that adjusts the membership degree weighting effect, $x_k \in \Re^p$ are the input training data vectors, and $v_i \in \Re^p$ are the cluster centers. The cluster centers and the respective membership degrees that solve the above constrained optimization problem are respectively given by the following equations [12],

$$v_i = \frac{\sum_{k=1}^{n} (u_{ik})^m \, x_k}{\sum_{k=1}^{n} (u_{ik})^m}, \quad 1 \le i \le c \tag{5}$$

and

$$u_{ik} = \frac{1}{\sum_{j=1}^{c} \left( \dfrac{\| x_k - v_i \|}{\| x_k - v_j \|} \right)^{\frac{2}{m-1}}}, \quad 1 \le i \le c, \ 1 \le k \le n \tag{6}$$

Eqs (5) and (6) constitute an iterative optimization procedure.

By applying the above minimization procedure to the input training data vectors, these vectors are classified into $c$ fuzzy clusters, where the $i$-th cluster corresponds to the $i$-th fuzzy subspace. Since a single fuzzy subspace corresponds to a specific fuzzy rule, the number of clusters coincides with the total number of fuzzy rules. Consequently, the membership degree of the training vector $x_k$ to the $i$-th fuzzy subspace $X^i$ is the membership degree $u_{ik}$. In the rest of the paper, the term "fuzzy cluster" replaces the term "fuzzy subspace"; the two terms refer to the same concept.

2.2 Model Parameter Initialization

Based on the analysis presented in the previous section, the premise part of each rule consists of multidimensional fuzzy clusters, the membership functions of which are given in eq. (6). The form of this equation indicates that the membership function is interpreted as the membership degree that is assigned to the input vector $x_k$ by the center element $v_i$ of the cluster $X^i$. Thus, the width of the cluster $X^i$ is not included in the membership function and therefore it is not taken into account in the parameter estimation either. Another important issue is the presence of the parameter $m$. This parameter controls the fuzziness of the resulting partition and thus affects the overlapping degree between the multidimensional fuzzy clusters. More specifically, as this parameter increases, the overlapping degree also increases. This means that for a specific value of the parameter $m$ the overlapping degree between the clusters is known, and therefore the locations of the cluster centers indicate the distances between the clusters. Thus, the premise parameter identification only concerns the estimation of the appropriate cluster centers. To this end, the premise parameter estimation is based on iteratively applying eqs (5) and (6) to the input training data, where the resulting cluster centers provide the fuzzy rule premise parameters and the respective membership degrees provide the firing degrees of the fuzzy rules. Therefore, the output of the fuzzy model can be calculated as,

$$\tilde{y}_k = \frac{\sum_{i=1}^{c} u_{ik} \, b^i}{\sum_{i=1}^{c} u_{ik}} \quad (1 \le k \le n)$$

Taking into account eq. (4), the above equation is modified as follows,

$$\tilde{y}_k = \sum_{i=1}^{c} u_{ik} \, b^i \quad (1 \le k \le n) \tag{7}$$

With the fuzzy c-partition of the input space introduced, the output space should be partitioned in a similar way. Moreover, this partition should be based on the following conditions [11, 12],

Condition 1: If in the $i$-th fuzzy rule the vector $x_k$ is the center element of the cluster $X^i$, then the output $y_k$ should satisfy the rule's consequence with a truth degree equal to unity.

Condition 2: If in the $i$-th fuzzy rule the vector $x_k$ is not the center element of the cluster $X^i$, then the output $y_k$ should satisfy the rule's consequence with a truth degree less than unity.

The above conditions refer to the matching degree between the premise and the consequent part of each fuzzy rule. One feasible way to satisfy these two conditions is to perform clustering analysis in the product space (i.e. the input-output space) and then induce fuzzy sets by projecting the resulting clusters on each dimension. Approaches of this kind are investigated in [7,8,9]. However, the main drawback of these approaches is that the consequent parameters are not calculated by the use of an optimizing criterion. In order to solve this problem, we introduce the following condition,

Condition 3: The consequent parameters should be estimated by minimizing the sum of the square errors (SSE) criterion.

The above condition has to be satisfied together with conditions 1 and 2. The SSE criterion is given as,

$$J_1 = \sum_{k=1}^{n} (y_k - \tilde{y}_k)^2 \tag{8}$$

With the premise parameters known, the respective consequent parameters can be obtained by minimizing $J_1$ over the $n$ input-output data pairs. Using eq. (7), eq. (8) gives,

$$J_1 = \sum_{k=1}^{n} \left( y_k - \sum_{i=1}^{c} u_{ik} \, b^i \right)^2 \tag{9}$$

One feasible way to minimize $J_1$ is to employ the well-known least squares algorithm. However, the utilization of this algorithm does not guarantee that conditions 1 and 2 will be satisfied. Therefore, we introduce the following procedure.

Theorem 1
If $m \to 1^+$ then the objective function $J_1$, given in eq. (9), can be calculated as,

$$J_1 = \sum_{k=1}^{n} \sum_{i=1}^{c} (u_{ik})^2 \, (y_k - b^i)^2 \tag{10}$$

Proof
For $1 \le i \le c$ and $1 \le k \le n$, from eq. (6) we obtain,

$$\lim_{m \to 1^+} u_{ik} = \lim_{m \to 1^+} \left[ \sum_{j=1}^{c} \left( \frac{\| x_k - v_i \|}{\| x_k - v_j \|} \right)^{\frac{2}{m-1}} \right]^{-1} = \begin{cases} 1, & \text{if } \| x_k - v_i \| < \| x_k - v_j \| \ \ \forall j \ne i \\ 0, & \text{otherwise} \end{cases}$$

Thus, as $m \to 1^+$ the membership degrees in the input space are given as follows,

$$u_{ik} = \begin{cases} 1, & \text{if } x_k \in X^i \\ 0, & \text{otherwise} \end{cases} \tag{11}$$

where $X = X^1 \cup X^2 \cup \dots \cup X^c$ is a crisp partition of $X$. From eq. (11) it follows that there are $k_1$ input data vectors that belong to the cluster $X^1$, $k_2$ data vectors that belong to the cluster $X^2$, ..., and $k_c$ data vectors that belong to the cluster $X^c$, such that,

$$k_1 + k_2 + \dots + k_c = n \tag{12}$$

Therefore, the following relation holds,

$$\left( y_k - \sum_{i=1}^{c} u_{ik} \, b^i \right)^2 = \left( y_k - u_{l_i k} \, b^{l_i} \right)^2 = (u_{l_i k})^2 \, (y_k - b^{l_i})^2 \tag{13}$$

where the index $l_i$ corresponds to the crisp cluster $X^{l_i}$ to which $x_k$ belongs. Based on eqs (11), (12), and (13) the objective function in (9) can be modified as follows,

$$J_1 = \sum_{k=1}^{k_1} (u_{1k})^2 (y_k - b^1)^2 + \sum_{k=1}^{k_2} (u_{2k})^2 (y_k - b^2)^2 + \dots + \sum_{k=1}^{k_c} (u_{ck})^2 (y_k - b^c)^2$$

which means that,

$$J_1 = \sum_{i=1}^{c} \sum_{k=1}^{k_i} (u_{ik})^2 \, (y_k - b^i)^2 \tag{14}$$

The $i$-th crisp cluster $X^i$ includes $k_i$ training vectors and therefore the remaining $(n - k_i)$ training data vectors are assigned by $X^i$ membership degrees equal to zero. Therefore, the following relation holds,

$$\sum_{k=1}^{k_i} (u_{ik})^2 (y_k - b^i)^2 = \sum_{k=1}^{k_i} (u_{ik})^2 (y_k - b^i)^2 + \sum_{k=1}^{n-k_i} (u_{ik})^2 (y_k - b^i)^2 = \sum_{k=1}^{n} (u_{ik})^2 (y_k - b^i)^2 \tag{15}$$

Substituting eq. (15) into eq. (14) we derive eq. (10). This completes the proof of theorem 1.

The next theorem provides the values of the consequent parameters that minimize the objective function in (10).

Theorem 2
For $1 \le i \le c$: if the values of the membership degrees $u_{ik}$ $(1 \le k \le n)$ are fixed, then the values of the consequent parameters $b^i$ that minimize the objective function $J_1$, given in eq. (10), are calculated as,

$$b^i = \frac{\sum_{k=1}^{n} (u_{ik})^2 \, y_k}{\sum_{k=1}^{n} (u_{ik})^2} \tag{16}$$

Proof
Setting the partial derivative $\partial J_1 / \partial b^i$ equal to zero and solving with respect to $b^i$, we can easily derive eq. (16). This completes the proof of theorem 2.

Summarizing, the premise parameters are calculated by eq. (5) and the consequent parameters by eq. (16).

2.3 Fine Tuning of the Model Parameters

In this section the model parameters, obtained in the previous step, are further tuned by using a gradient descent approach. The objective function that is used for this purpose is given as,

$$J_2 = \frac{1}{2n} \sum_{k=1}^{n} (y_k - \tilde{y}_k)^2$$

By substituting eq. (7) into the above function we obtain,

$$J_2 = \frac{1}{2n} \sum_{k=1}^{n} \left( y_k - \sum_{i=1}^{c} u_{ik} \, b^i \right)^2 \tag{17}$$

In order to minimize $J_2$ the premise parameters have to be adjusted as follows,

$$\Delta v_i = \frac{\beta_1}{n} \sum_{k=1}^{n} (y_k - \tilde{y}_k) \, b^i \, \frac{\partial u_{ik}}{\partial v_i} \tag{18}$$

where, based on (6), the partial derivative is given as,

$$\frac{\partial u_{ik}}{\partial v_i} = \frac{2 \displaystyle\sum_{\substack{j=1 \\ j \ne i}}^{c} \left( \frac{\| x_k - v_i \|}{\| x_k - v_j \|} \right)^{\frac{2}{m-1}}}{(m-1) \, (x_k - v_i) \left[ \displaystyle\sum_{j=1}^{c} \left( \frac{\| x_k - v_i \|}{\| x_k - v_j \|} \right)^{\frac{2}{m-1}} \right]^2} \tag{19}$$

Correspondingly, the learning rule for the consequent parameters is as follows,

$$\Delta b^i = \frac{\beta_2}{n} \sum_{k=1}^{n} \left[ (y_k - \tilde{y}_k) \, u_{ik} \right] \tag{20}$$

In the above equations, the parameters $\beta_1$ and $\beta_2$ are the gradient descent learning parameters.

2.4 The Identification Algorithm

Based on the previous analysis, the proposed fuzzy modeling algorithm is now given as follows.

The Proposed Algorithm

Suppose we are given $n$ input-output data pairs of the form $(x_k; y_k)$ $(1 \le k \le n)$. Initially select a small value for the parameter $m$, which is close to unity. Set the number of rules $c = 2$, and select values for the terminal condition parameters $\varepsilon_1$ and $\varepsilon_2$.

Step 1). Randomly initialize the premise parameters $v_i$ $(1 \le i \le c)$ and the consequent parameters $b^i$ $(1 \le i \le c)$.

Step 2). For $k = 1, 2, \dots, n$ and $i = 1, 2, \dots, c$: use eq. (6) to calculate the membership degrees $u_{ik}$.

Step 3). For $i = 1, 2, \dots, c$: update the premise parameters $v_i$ using eq. (5).

Step 4). For $i = 1, 2, \dots, c$: calculate the consequent parameters using eq. (16).

Step 5). Calculate the distance $\| b - b_p \|$, where $b = [b^1, b^2, \dots, b^c]^T$ and $b_p$ is the previous state of $b$.

Step 6). If $\| b - b_p \| \le \varepsilon_1$ then go to step 7; else go to step 2.

Step 7). Employ the gradient descent approach to minimize $J_2$, where the model parameter learning rules are given by eqs (18) and (20).

Step 8). Calculate the performance index of the model: $PI = \sum_{k=1}^{n} (y_k - \tilde{y}_k)^2 / n$. If $PI \le \varepsilon_2$ then stop; else set $c = c + 1$ and go to step 1.

The final result of the above iterative optimization is that, when an input fully matches the premise part of one of the rules, the corresponding output satisfies the consequence completely, meaning that the truth degree of each fuzzy rule is equal to unity. Thus, eq. (7) can be used for inference of the output from a specific input data vector.

3. Simulation Study

In this section the proposed algorithm is applied to the well-known Box and Jenkins data set [13], which consists of 296 input-output measurements of a gas-furnace process, obtained using a sampling interval of 9 s. At each sampling time $k$ the input $x(k)$ of this process is the gas flow rate and the output $y(k)$ is the output CO2 concentration. The proposed method was used to design a fuzzy model for this process with 6 inputs: $x(k)$, $x(k-1)$, $x(k-2)$, $y(k-1)$, $y(k-2)$, $y(k-3)$ and one output: $y(k)$. In order to compare our method with other approaches, we performed two experimental cases, namely case 1 and case 2.

In case 1 we used the first 148 input-output data as training data to build the fuzzy model and the last 148 as test data to validate its performance. The terminal conditions were selected as $\varepsilon_1 = 10^{-4}$ and $\varepsilon_2 = 10^{-2}$, and the learning rates for the gradient descent method were $\beta_1 = \beta_2 = 0.55$. The final number of rules was equal to $c = 3$. The predicted and the original output values for the training data are given in Fig. 1, where the corresponding Mean Square Error (MSE) was equal to 0.045. Fig. 2 shows the predicted and the actual values for the validation data, for which the MSE was equal to 0.251. The MSEs obtained for the same case study by the method developed in [14] were 0.071 for the training data and 0.261 for the test data, meaning that our model performs better than this method.

Figure 1: Original and predicted values for the training data set of the Box and Jenkins system (Case 1). [Plot of Output against Sample; series: Model Predictions, Original Data.]

Figure 2: Original and predicted values for the test data set of the Box and Jenkins system (Case 1). [Plot of Output against Sample; series: Model Predictions, Original Data.]

In case 2 we used the whole data set to build the fuzzy model and to validate its performance. The terminal conditions were selected as $\varepsilon_1 = 10^{-4}$ and $\varepsilon_2 = 10^{-2}$, and the learning rates for the gradient descent method were $\beta_1 = \beta_2 = 0.3$. The final number of rules was equal to $c = 4$. Fig. 3 depicts the predicted and the actual values, where the MSE was equal to 0.1398. Table 4 compares the performance of the produced fuzzy model to other models that can be found in the literature. The table shows that our model achieves the lowest MSE among the listed models.

Figure 3: Original and predicted values for the Box and Jenkins system (Case 2). [Plot of Output against Sample; series: Model Predictions, Original Data.]

Table 4: Comparison results for the Box-Jenkins example (Case 2)

Model                          Number of rules    MSE
Box and Jenkins [13]           ---                0.2020
Chen et al. [11]               3                  0.2678
Sugeno and Yasukawa [2]        6                  0.1900
Xu and Lu [5]                  25                 0.3280
Gomez-Skarmeta et al. [7]      2                  0.1570
Kroll [8]                      2                  0.1495
Our Model                      4                  0.1398

4. Conclusions

In this paper we have proposed a novel method to train fuzzy models. The method is developed so that emphasis is given to both the accuracy and the size of the produced model. In order to achieve these targets, the method follows a number of separate steps, arranged so that the result of each step becomes the input of the next step. The basic design issue of the algorithm is that both the premise and the consequent parts contribute equally to the firing degree of each rule. In order to accomplish this, certain conditions are taken into account. The application of the algorithm to a test case shows that the algorithm is able to achieve a very efficient performance, while keeping the size of the model within reasonable and acceptable levels.

References

[1] T. Takagi and M. Sugeno, Fuzzy identification of systems and its application to modeling and control, IEEE Trans. Systems Man Cybern., Vol. 15 (1), 1985, pp. 116-132.
[2] M. Sugeno and T. Yasukawa, A fuzzy-logic-based approach to qualitative modeling, IEEE Trans. Fuzzy Syst., Vol. 1 (2), 1993, pp. 7-31.
[3] K. Nozaki, H. Ishibuchi, and H. Tanaka, A simple but powerful method for generating fuzzy rules from numerical data, Fuzzy Sets and Systems, Vol. 86, 1997, pp. 251-270.
[4] J.S.R. Jang, ANFIS: Adaptive-Network-based Fuzzy Inference Systems, IEEE Trans. Systems Man and Cybern., Vol. 23 (3), 1993, pp. 665-685.
[5] C.W. Xu and Y.Z. Lu, Fuzzy Model Identification and Self-Learning for Dynamic Systems, IEEE Trans. Syst. Man & Cyber., Vol. 17 (4), 1987, pp. 683-689.
[6] E. Kim, H. Lee, M. Park, and M. Park, A simply identified Sugeno-type fuzzy model via double clustering, Information Sciences, Vol. 110, 1998, pp. 25-39.
[7] A.G. Skarmeta, M. Delgado, and M. Vila, About the use of fuzzy clustering techniques for fuzzy model identification, Fuzzy Sets and Systems, Vol. 106, 1999, pp. 179-188.
[8] A. Kroll, Identification of functional fuzzy models using multidimensional reference fuzzy sets, Fuzzy Sets and Systems, Vol. 80, 1996, pp. 149-158.
[9] K. Hirota and W. Pedrycz, Directional fuzzy clustering and its application to fuzzy modeling, Fuzzy Sets and Systems, Vol. 80, 1996, pp. 315-326.
[10] Y. Nakamori and M. Ryoke, Identification of fuzzy prediction models through hyper-ellipsoidal clustering, IEEE Trans. Syst. Man and Cybern., Vol. 24 (8), 1994, pp. 1153-1173.
[11] J. Chen, Y. Xi, and Z. Zhang, A clustering algorithm for fuzzy model identification, Fuzzy Sets and Systems, Vol. 98, 1998, pp. 319-329.
[12] J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, N. Y., 1981.
[13] G.E. Box and G.M. Jenkins, Time series analysis, forecasting and control, San Francisco, CA: Holden Day, 1970.
[14] Y. Lin and G.A. Cunningham, A new approach to fuzzy-neural modeling, IEEE Trans. Fuzzy Systems, Vol. 3, 1995, pp. 190-197.
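
Illustrative sketch of the initialization stage. The initialization stage of Section 2.4 (Steps 1 to 6), which iterates eqs (6), (5) and (16) until the consequent vector stops changing and then infers outputs with eq. (7), can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions and not the authors' implementation: the function names, the defaults m = 1.2 and max_iter, the optional init_centers argument, and the choice to seed the centers from randomly chosen training vectors are all assumptions made here. Eq. (6) is evaluated in an algebraically equivalent rescaled form to avoid overflow when m is close to 1. The gradient descent tuning of Step 7 and the rule-growing loop of Step 8 are omitted.

```python
import math
import random


def memberships(xs, centers, m):
    """Eq. (6): u[i][k] is the membership degree of x_k in cluster i."""
    expo = 2.0 / (m - 1.0)
    u = [[0.0] * len(xs) for _ in centers]
    for k, x in enumerate(xs):
        d = [math.dist(x, v) for v in centers]
        dmin = min(d)
        if dmin == 0.0:
            # x_k sits on a center: full membership there, zero elsewhere.
            u[d.index(0.0)][k] = 1.0
            continue
        # Algebraically equal to eq. (6), rescaled by the smallest distance
        # so that the large exponent used when m is near 1 cannot overflow.
        w = [(dmin / di) ** expo for di in d]
        total = sum(w)
        for i, wi in enumerate(w):
            u[i][k] = wi / total
    return u


def update_centers(xs, u, m):
    """Eq. (5): each center is the u^m-weighted mean of the input vectors.

    Assumes every cluster keeps at least one nonzero membership degree.
    """
    p = len(xs[0])
    centers = []
    for row in u:
        w = [uik ** m for uik in row]
        total = sum(w)
        centers.append([sum(wk * x[j] for wk, x in zip(w, xs)) / total
                        for j in range(p)])
    return centers


def consequents(ys, u):
    """Eq. (16): b^i = sum_k u_ik^2 y_k / sum_k u_ik^2."""
    return [sum(uik ** 2 * y for uik, y in zip(row, ys)) /
            sum(uik ** 2 for uik in row) for row in u]


def predict(x, centers, b, m):
    """Eq. (7): model output as the membership-weighted sum of the b^i."""
    u = memberships([x], centers, m)
    return sum(u[i][0] * b[i] for i in range(len(b)))


def initialize_model(xs, ys, c, m=1.2, eps1=1e-4, max_iter=200,
                     init_centers=None, seed=0):
    """Steps 1-6: iterate eqs (6), (5), (16) until ||b - b_p|| <= eps1."""
    if init_centers is None:
        # Assumption: seed the centers from random training vectors.
        rng = random.Random(seed)
        centers = [list(x) for x in rng.sample(xs, c)]
    else:
        centers = [list(v) for v in init_centers]
    b, b_prev = None, None
    for _ in range(max_iter):
        u = memberships(xs, centers, m)        # Step 2
        centers = update_centers(xs, u, m)     # Step 3
        b = consequents(ys, u)                 # Step 4
        if b_prev is not None and math.dist(b, b_prev) <= eps1:  # Steps 5-6
            break
        b_prev = b
    return centers, b
```

A typical call would be `centers, b = initialize_model(xs, ys, c=2)`, with `xs` a list of input vectors and `ys` the matching list of outputs, followed by `predict(x, centers, b, m=1.2)` for a new input. The value of m passed to `predict` should match the one used during initialization, since eq. (6) depends on it.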

