A Novel Neuro-Fuzzy Classifier Based on Weightless Neural Network

Raida Al-Alawi
Department of Computer Engineering, College of Information Technology, University of Bahrain, P. O. Box 32038, Bahrain

Abstract: - A single-layer weightless neural network is used as the feature extraction stage of a new neuro-fuzzy classifier. The extracted feature vector measures the similarity of an input pattern to each of the classification groups. A fuzzy inference system uses the feature vectors of the training set to generate the set of rules required to classify unknown patterns. The resulting architecture is called the Fuzzy Single Layer Weightless Neural Network (F-SLWNN). The performance of the F-SLWNN is contrasted with that of the original Winner-Takes-All Single Layer Weightless Neural Network (WTA-SLWNN). Comparative experimental results highlight the superior properties of the F-SLWNN classifier.

Key-Words: - Weightless neural networks, fuzzy inference system, pattern recognition, n-tuple networks.

1. Introduction
Weightless neural networks (WNN) have received extensive research attention and are regarded as powerful learning machines, in particular as excellent pattern classifiers [1, 2]. WNNs possess many prominent features, such as a simple one-shot learning scheme [3], fast execution time, and readiness for hardware implementation [4, 5, 6]. The simplest form of WNN is the n-tuple classifier, or Single Layer Weightless Neural Network (SLWNN), proposed by Bledsoe and Browning in 1959 [7]. The neurons in a SLWNN are RAM-like cells whose acquired knowledge is stored in look-up tables (LUTs). These models are referred to as weightless networks because adaptation to new training patterns is performed by changing the contents of the LUTs rather than by adjusting weights, as in conventional neural network models. Although many weightless neural network models have been proposed in the literature, few combine these networks with fuzzy logic. The work presented in this paper is a new approach to combining the excellent features of the SLWNN with a fuzzy rule-based system to produce a neuro-fuzzy classifier.

This paper first gives an overview of the F-SLWNN classifier. Section 3 describes the training and testing methodology of the proposed F-SLWNN. Section 4 presents the experimental results conducted to test the performance of the F-SLWNN, contrasted with that of the WTA-SLWNN. Conclusions and future work are addressed in the last section.

2. The F-SLWNN Classifier
The F-SLWNN is a two-stage system: the first stage is a single-layer weightless neural network (SLWNN) used to extract similarity feature vectors from the training data, while the second stage is a fuzzy inference system (FIS) that builds the fuzzy classification rules from the knowledge learnt by the trained SLWNN. Figure 1 shows a block diagram of the F-SLWNN.

[Figure 1. Architecture of the F-SLWNN: the input pattern P feeds the N discriminators of the SLWNN; their normalized responses form the feature vector x, which is passed to the fuzzy networks F-Net 1, ..., F-Net N of the FIS, and the class with the maximum membership is identified.]

The SLWNN is a multi-discriminator classifier, as shown in Figure 1. Each discriminator consists of M RAM-like neurons (weightless neurons, or n-tuples) with n address lines, 2^n storage locations (sites), and a 1-bit word length. Each RAM randomly samples n pixels of the input image, as shown in Figure 2. Each pixel must be sampled by at least one RAM; therefore the number of RAMs within a discriminator, M, depends on the n-tuple size n and the size of the input image I, and is given by:

    M = I / n    (1)

The RAM's input pattern forms an address to a specific site (memory location) within the RAM. The outputs of all RAMs in each discriminator are summed together to give its response; the j-th discriminator response z_j is given by:

    z_j = \sum_{k=1}^{M} o_j^k    (2)

where o_j^k is the output of the k-th RAM of the j-th discriminator.

[Figure 2. The architecture of the j-th discriminator: each weightless neuron samples an n-tuple of pixels from the normalized 16x16-pixel input image, the sampled bits form the address of one site, and the outputs o_j^1, ..., o_j^M of the M neurons are summed to give the response z_j.]

Before training, the contents of all RAM sites are cleared (store "0"). The training set consists of an equal number of patterns from each class. Training is a one-shot process in which each discriminator is trained individually on the set of patterns that belongs to it: when a training pattern is presented to the network, all the addressed sites of the discriminator to which the pattern belongs are set to "1". Once all training patterns have been presented, the data stored inside the discriminators' RAMs give the WNN its generalization ability.

Testing the SLWNN is performed by presenting an input pattern to the discriminators' inputs and summing the RAM outputs of each discriminator to obtain its response. The normalized response of the class-j discriminator, x_j, is defined by:

    x_j = z_j / M    (3)

The value of x_j can be used as a measure of the similarity of the input pattern to the training patterns of the j-th class. The normalized response vector generated by all discriminators for an input pattern is given by:

    x = [x_1, x_2, ..., x_N],  0 <= x_j <= 1    (4)

This response vector can be regarded as a feature vector that measures the similarity of an input pattern to all classes. In the classical SLWNN (the n-tuple network), a Winner-Takes-All (WTA) decision scheme is used to classify an unseen pattern; in other words, the discriminator with the highest response specifies the class to which the test pattern belongs.
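To make Equations (1) to (4) concrete, the following Python sketch implements one discriminator of the SLWNN. It is a minimal illustration, not the authors' implementation: the class and variable names are ours, binary images are assumed to be flattened numpy arrays of I pixels, and the random mapping is taken to be a permutation so that each pixel is sampled by exactly one RAM (hence n must divide I).

```python
import numpy as np

class Discriminator:
    """One class discriminator of a SLWNN, built from M = I / n RAM neurons (Eq. 1)."""

    def __init__(self, image_size, tuple_size, rng):
        self.n = tuple_size
        self.M = image_size // tuple_size            # Eq. (1): M = I / n
        # Random input mapping: a permutation of the pixel indices, split into
        # M n-tuples, so every pixel is sampled by exactly one RAM (assumption).
        self.mapping = rng.permutation(image_size).reshape(self.M, self.n)
        # Each RAM has 2^n one-bit sites, cleared to "0" before training.
        self.rams = np.zeros((self.M, 2 ** self.n), dtype=np.uint8)

    def _addresses(self, image):
        """Turn each sampled n-tuple of binary pixels into a RAM site address."""
        tuples = image[self.mapping]                 # shape (M, n), values 0/1
        return tuples @ (1 << np.arange(self.n))     # binary place values

    def train(self, image):
        """One-shot learning: set every addressed site to "1"."""
        self.rams[np.arange(self.M), self._addresses(image)] = 1

    def response(self, image):
        """Eq. (2): z_j is the sum of the M RAM outputs for this pattern."""
        return int(self.rams[np.arange(self.M), self._addresses(image)].sum())
```

Dividing response() by M gives the normalized response x_j of Eq. (3), and stacking the N normalized responses yields the feature vector x of Eq. (4).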
In this paper, the set of feature vectors generated by the SLWNN for a given training set is instead used as input data to a fuzzy rule-based system. Fuzzy rules are extracted from the feature vectors using the methodology described in [8], which is an extension of the fuzzy min-max classifier neural network described in [9].

Figure 1 shows the general architecture of the second stage of the F-SLWNN system. Every class has a dedicated network (F-Net) that calculates the degree of membership of the input pattern to its class. Each fuzzy network F-Net_i consists of a number of neural-like subnetworks, as illustrated in Figure 3; the number of subnetworks in F-Net_i depends on the number of classes that overlap with class i.

Each node in the first layer calculates the degree of membership of the input vector to its class, denoted by d_{f_{ij}(l)}(x); a detailed analysis of how the degree of membership of an input pattern x to a specific rule is computed can be found in [8]. The second-layer neurons find the maximum degree of membership of the input vector x among the values generated by the first-layer nodes:

    d_{f_{ij}}(x) = \max_l ( d_{f_{ij}(l)}(x) )    (5)

The output node of network i finds the degree of membership of x to class i, denoted by d_i(x), given by:

    d_i(x) = \min_j ( d_{f_{ij}}(x) )    (6)

Finally, an unknown vector x is assigned to the class that generates the maximum degree of membership, i.e.:

    x \in class i  if  d_i(x) = \max_{j = 1, 2, ..., N} ( d_j(x) )    (7)

[Figure 3. Detailed architecture of the i-th fuzzy network (F-Net_i): for every class j that overlaps with class i, a sub-net evaluates the rule memberships d_{f_{ij}(l)}(x) of the feature vector x and takes their MAX (Eq. 5); a final MIN node over all sub-nets yields d_i(x) (Eq. 6).]
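Equations (5) to (7) amount to a max-min composition over the rule memberships, followed by a winner-takes-all over the class memberships. The sketch below assumes the per-rule membership functions d_{f_{ij}(l)}(x) of [8] are already available as callables; their construction is outside this sketch.

```python
import numpy as np

def fuzzy_classify(x, rule_memberships):
    """Max-min decision of Eqs. (5)-(7).

    rule_memberships[i] describes F-Net_i as a list of sub-nets, one per class
    overlapping class i; each sub-net is a list of callables mapping the
    feature vector x to a rule membership degree in [0, 1].  The membership
    functions themselves come from the rule extraction method of [8] and are
    assumed given here.
    """
    degrees = []
    for subnets in rule_memberships:                      # one F-Net per class i
        # Eq. (5): MAX over the rules l of each overlap sub-net.
        subnet_max = [max(rule(x) for rule in rules) for rules in subnets]
        # Eq. (6): MIN over the sub-nets gives d_i(x); a class with no
        # overlaps gets full membership (our assumption).
        degrees.append(min(subnet_max, default=1.0))
    # Eq. (7): assign x to the class with the maximum degree of membership.
    return int(np.argmax(degrees)), degrees
```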
3. Hybrid Training Algorithm
Training the F-SLWNN is performed in two steps. The first step generates initial clusters from a subset of the training set: a subset of training patterns, T1, is used to train the SLWNN, with each discriminator trained to recognize one class. After this phase is completed, the information stored in the discriminators' RAMs gives the SLWNN its generalization capability. The second phase of the learning algorithm is the generalization phase, in which the complete training set T (including T1) is applied to the trained SLWNN. Each pattern produces a feature vector that describes its similarity to all classes. The generated set of feature vectors is then used to extract the fuzzy rules for each fuzzy network (F-Net), based on the overlapping regions of its own cluster with the clusters of the other classes.

Once the generalization phase is completed, the SLWNN stage has encoded in its memory sites information about the training subset T1, and the fuzzy inference stage has extracted its set of rules from the whole training set T. When an unknown pattern is applied to the system, the SLWNN generates a feature vector x that describes the belongingness of the input pattern to each class. The fuzzy rules are then applied to this feature vector to find the class with the maximum degree of membership, and the unknown pattern is associated with that class.
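The two training phases can be summarized in a short pipeline. This sketch reuses the Discriminator class sketched in Section 2; `extract_rules` is a hypothetical placeholder for the rule extraction procedure of [8], and class labels are assumed to be integers 0, ..., N-1 indexing the discriminator list.

```python
import numpy as np

def train_f_slwnn(discriminators, T1, T, extract_rules):
    """Two-phase hybrid training of the F-SLWNN (sketch).

    T1 and T map each integer class label to a list of flattened binary
    images; T includes T1.  `extract_rules` stands in for the fuzzy-rule
    extraction method of [8].
    """
    # Phase 1: one-shot training of the SLWNN on the subset T1, each
    # discriminator seeing only the patterns of its own class.
    for label, images in T1.items():
        for image in images:
            discriminators[label].train(image)

    # Phase 2 (generalization): pass the complete set T through the trained
    # SLWNN and collect one normalized feature vector per pattern (Eqs. 3-4).
    features, labels = [], []
    for label, images in T.items():
        for image in images:
            features.append([d.response(image) / d.M for d in discriminators])
            labels.append(label)

    # The labelled feature vectors feed the rule extraction of [8].
    return extract_rules(np.array(features), np.array(labels))
```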
4. Discussion and Experimental Results
The classification ability of the F-SLWNN has been tested on the classification of handwritten numerals from the NIST standard database SD19 [10]. A 10-discriminator SLWNN is used in the first stage of the system. Each discriminator consists of 32 neurons with an n-tuple size of 8, sampling binary images from the database that were normalized and resized to a 16x16-pixel grid (256 pixels, so M = 256/8 = 32, in agreement with Eq. (1)). The parameters of the SLWNN (n-tuple size, mapping method, number of training patterns) are fixed in the experiments conducted; the effect of changing these parameters on the performance of the network is beyond the scope of this work.

The complete training set T consists of 150 handwritten numerals from each class (a total of 1500 patterns). The subset T1, which is used to train the SLWNN, contains 100 patterns from each class.

The performance of the F-SLWNN is compared with that of the standard WTA-SLWNN, trained on the same set of patterns T1. Table 1 summarizes the results obtained when testing the two classifiers on the different sets. The first experiments verify the behavior of the two models on the training sets T1 and T. Both systems produce 100% correct recognition on the training set T1. However, on the training set T, the WTA-SLWNN gives an average of 91.7% correct recognition, compared with 100% for the F-SLWNN. In this case, the WTA-SLWNN had to use its encoded data to generalize to the patterns of T it had not been trained on, while the fuzzy inference stage had used the SLWNN-generated response vectors to extract the rules required to classify them correctly.

The second experiment tests the generalization ability of the two classifiers. A new set of 500 unseen patterns, selected equally from the 10 numeral classes, is presented to the two networks. The performance of the F-SLWNN is higher than that of the WTA-SLWNN, as indicated in Table 1.

Table 1. Comparison of the percentage of correct recognition between the F-SLWNN and the WTA-SLWNN.

    Applied set                         WTA-SLWNN (%)   F-SLWNN (%)
    Classifier verification: Set T1     100             100
    Classifier verification: Set T      91.7            100
    Generalization ability: Test set    92.2            97.6

The final experiments test the noise tolerance of the two models. "Salt and pepper" noise, generated with the noise function supported by MATLAB, is added to the training set T with different noise densities. Figure 4 shows the percentage of correct recognition as a function of noise density for both classifiers. The results show that the performance of the two models is comparable when the noise density is low; however, the performance of the WTA-SLWNN degrades more rapidly than that of the F-SLWNN as the noise density increases.

[Figure 4. Percentage of correct recognition as a function of added noise density.]
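The paper generates the noise with MATLAB's built-in salt-and-pepper noise function; an equivalent effect on flattened binary images can be sketched in numpy as follows (the function name and the half-salt, half-pepper split are our assumptions, not the paper's exact procedure):

```python
import numpy as np

def salt_and_pepper(image, density, rng):
    """Flip roughly `density` of the binary pixels to random 0/1 values,
    mimicking MATLAB-style salt-and-pepper noise on a binary image."""
    noisy = image.copy()
    hit = rng.random(image.shape) < density               # pixels struck by noise
    noisy[hit] = rng.integers(0, 2, size=int(hit.sum()))  # salt (1) or pepper (0)
    return noisy

# Example: corrupt a 16x16 pattern with 10% noise density.
rng = np.random.default_rng(0)
pattern = rng.integers(0, 2, size=256).astype(np.uint8)
noisy_pattern = salt_and_pepper(pattern, 0.10, rng)
```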
5. Conclusions
A new two-stage neuro-fuzzy classifier has been introduced. The first stage utilizes a Single Layer Weightless Neural Network (SLWNN) as a feature extractor; the second stage is a fuzzy inference system whose rules are extracted from the feature vectors generated by the trained SLWNN. This approach is an alternative to the traditional crisp WTA n-tuple classifiers. The effectiveness of the proposed system has been validated on the NIST database SD19 of handwritten numerals, and the experimental results reveal that the F-SLWNN classifier outperforms the WTA-SLWNN. Future work will investigate the effect of varying the n-tuple size and the sizes of the training sets T1 and T on the performance of the system.

References
[1] T. M. Jørgensen, "Classification of Handwritten Digits Using a RAM Neural Net Architecture," International Journal of Neural Systems, Vol. 8, No. 1, pp. 17-25, 1997.
[2] T. G. Clarkson et al., "Speaker Identification for Security Systems Using Reinforcement-Trained pRAM Neural Network Architectures," IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, Vol. 31, No. 1, pp. 65-76, Feb. 2001.
[3] J. Austin, RAM-Based Neural Networks, Singapore: World Scientific, 1998.
[4] I. Aleksander, W. Thomas, and P. Bowden, "WISARD, a Radical New Step Forward in Image Recognition," Sensor Review, Vol. 4, pp. 29-40, 1984.
[5] E. V. Simões, L. F. Uebel, and D. A. C. Barone, "Hardware Implementation of RAM Neural Networks," Pattern Recognition Letters, Vol. 17, No. 4, pp. 421-429, 1996.
[6] R. Al-Alawi, "FPGA Implementation of a Pyramidal Weightless Neural Networks Learning System," International Journal of Neural Systems, Vol. 13, No. 4, pp. 225-237, 2003.
[7] W. W. Bledsoe and I. Browning, "Pattern Recognition and Reading by Machine," in Proc. Eastern Joint Computer Conference, Boston, MA, pp. 232-255, 1959.
[8] S. Abe and M. S. Lan, "A Method for Fuzzy Rule Extraction Directly from Numerical Data and Its Application to Pattern Classification," IEEE Transactions on Fuzzy Systems, Vol. 3, No. 1, pp. 18-28, Feb. 1995.
[9] P. K. Simpson, "Fuzzy Min-Max Neural Networks, Part 1: Classification," IEEE Transactions on Neural Networks, Vol. 3, No. 5, pp. 776-786, Sept. 1992.
[10] National Institute of Standards and Technology, NIST Special Database 19: NIST Handprinted Forms and Characters Database, 2002.