A Novel Neuro-Fuzzy Classifier Based on Weightless Neural Network
Raida Al-Alawi
Department of Computer Engineering
College of Information Technology
University of Bahrain
P. O. Box 32038
Bahrain

Abstract: - A single layer weightless neural network is used as the feature extraction stage of a new neuro-fuzzy
classifier. The extracted feature vector measures the similarity of an input pattern to the different classification
groups. A fuzzy inference system uses the feature vectors of the training set to generate the set of rules required to
classify unknown patterns. The resultant architecture is called the Fuzzy Single Layer Weightless Neural Network
(F-SLWNN). The performance of the F-SLWNN is contrasted with that of the original Winner-Takes-All
Single Layer Weightless Neural Network (WTA-SLWNN). Comparative experimental results highlight the
superior properties of the F-SLWNN classifier.

Key-Words: - Weightless neural networks, Fuzzy inference system, pattern recognition, n-tuple networks.

1. Introduction
Weightless neural networks (WNNs) have received extensive research attention and are regarded as
powerful learning machines, in particular as excellent pattern classifiers [1,2]. WNNs possess many
prominent features, such as a simple one-shot learning scheme [3], fast execution time and readiness
for hardware implementation [4, 5, 6]. The simplest form of WNN is the n-tuple classifier, or Single
Layer Weightless Neural Network (SLWNN), proposed by Bledsoe and Browning in 1959 [7]. The neurons
in an SLWNN are RAM-like cells whose acquired knowledge is stored in look-up tables (LUTs). These
models are called weightless networks because adaptation to new training patterns is performed by
changing the contents of the LUTs rather than by adjusting weights, as in conventional neural network
models. Although many weightless neural network models have been proposed in the literature, few
combine these networks with fuzzy logic. The work presented in this paper is a new approach that
combines the excellent features of the SLWNN with a fuzzy rule-based system to generate a
neuro-fuzzy classifier.
This paper first gives an overview of the F-SLWNN classifier. Section 3 describes the training and
testing methodology of the proposed F-SLWNN. Section 4 presents the experimental results that test
the performance of the F-SLWNN and contrast it with that of the WTA-SLWNN. Finally, conclusions and
future work are addressed in the last section.

2. The F-SLWNN Classifier
The F-SLWNN is a two-stage system in which the first stage is a single layer weightless neural
network (SLWNN) used to extract the similarity feature vectors from the training data, while the
second stage is a fuzzy inference system (FIS) that builds the fuzzy classification rules from the
knowledge learnt by the trained SLWNN. Figure 1 shows a block diagram of the F-SLWNN.
The SLWNN is a multi-discriminator classifier, as shown in Figure 1. Each discriminator consists of
M RAM-like neurons (weightless neurons or n-tuples) with n address lines, $2^n$ storage locations
(sites) and a 1-bit word length. Each RAM randomly samples n pixels of the input image, as shown in
Figure 2. Each pixel must be sampled by at least one RAM. Therefore, the number of RAMs within a
discriminator (M) depends on the n-tuple size (n) and the size of the input image (I), and is given by:

     $M = \frac{I}{n}$          ...... (1)

The RAM's input pattern forms an address to a specific site (memory location) within the RAM. The
outputs of all RAMs in each discriminator are summed together to give its response.
The jth discriminator response ($z_j$) is given by:

     $z_j = \sum_{k=1}^{M} o_j^k$          ...... (2)

where $o_j^k$ is the output of the kth RAM of the jth discriminator.

Figure 1. Architecture of F-SLWNN.
[Block diagram: the input pattern (P) feeds Discriminators 1 to N (SLWNN), which produce the feature
vector x = (x_1, ..., x_N); F-Net 1 to F-Net N (FIS) compute d_1(x), ..., d_N(x); a MAX block gives
the identified class.]

Before training, the contents of all RAM sites are cleared (set to "0"). The training set consists of
an equal number of patterns from each class. Training is a one-shot process in which each
discriminator is trained individually on the set of patterns that belong to its class. When a
training pattern is presented to the network, all the addressed sites of the discriminator to which
the training pattern belongs are set to "1". Once all training patterns have been presented, the data
stored inside the discriminators' RAMs give the WNN its generalization ability.
Testing the SLWNN is performed by presenting an input pattern to the discriminators' inputs and
summing the RAM outputs of each discriminator to get its response.
The normalized response of the class j discriminator, $x_j$, is defined by:

     $x_j = \frac{z_j}{M}$          ...... (3)

The value of $x_j$ can be used as a measure of the similarity of the input pattern to the jth class
training patterns. The normalized response vector generated by all discriminators for an input
pattern is given by:

     $x = [x_1, x_2, \ldots, x_N], \quad 0 \le x_j \le 1$          ...... (4)

This response vector can be regarded as a feature vector that measures the similarity of an input
pattern to all classes.
In the classical SLWNN (n-tuple network), a Winner-Takes-All (WTA) decision scheme is used to
classify an unseen pattern; in other words, the discriminator with the highest response specifies
the class to which the test pattern belongs.

Figure 2. The architecture of the jth discriminator.
[Diagram: n-tuple address lines sample a normalized 16x16 pixel image; each of the M weightless
neurons (RAMs) has sites addressed 00...00 to 11...11 and produces an output o_j^1, ..., o_j^M; an
adder sums these outputs to give the discriminator response z_j.]

In this paper, the set of feature vectors generated by the SLWNN for a given training set is used as
input data to the fuzzy rule-based system. Fuzzy rules are extracted from the feature vectors using
the methodology described in [8]. This method is an extension of the fuzzy min-max classifier neural
network described in [9].
Figure 1 shows the general architecture of the second stage of our F-SLWNN system. Every class has a
dedicated network (F-Net) that calculates the degree of membership of the input pattern to its class.
Each fuzzy network (F-Net_i) consists of a number of neural-like subnetworks, as illustrated in
Figure 3. The number of subnetworks in F-Net_i depends on the number of classes that overlap with
class i.
Each node in the first layer calculates the degree of membership of the input vector to its class
(denoted by $d_{f_{ij}(l)}(x)$). A detailed analysis of how to compute the degree of membership of an
input pattern x to a specific rule is found in [8].
The second layer neurons find the maximum degree of membership of the input vector x among the
degree of membership values generated by the nodes in the first layer, given by:

     $d_{f_{ij}}(x) = \max_{l = 1, \ldots} ( d_{f_{ij}(l)}(x) )$          ...... (5)

The output node in network i finds the degree of membership of x to class i, denoted by $d_i(x)$,
which is given by:

     $d_i(x) = \min ( d_{f_{ij}}(x) )$          ...... (6)

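As an illustration only, the first-stage computations in Equations (1) to (4), the WTA decision, and
the max-min combination in Equations (5) and (6) can be sketched in Python. This is a minimal sketch
under stated assumptions, not the author's implementation: the names make_mapping, Discriminator,
feature_vector, wta_class and class_membership are hypothetical, the sketch assumes that n divides I
exactly and that all discriminators share one random pixel mapping, and the rule extraction of [8] is
not reproduced, so class_membership takes the first-layer membership degrees as given inputs and
takes the minimum over the overlapping classes, as suggested by Figure 3.

```python
import random


def make_mapping(image_size, n, rng):
    """Randomly split the pixel indices into M = image_size // n tuples (Eq. 1).

    Assumption: n divides image_size, so every pixel is sampled exactly once.
    """
    pixels = list(range(image_size))
    rng.shuffle(pixels)
    return [pixels[k:k + n] for k in range(0, image_size, n)]


class Discriminator:
    """One class discriminator: M RAMs, each kept as the set of sites holding 1."""

    def __init__(self, mapping):
        self.mapping = mapping
        self.rams = [set() for _ in mapping]  # empty set = all sites cleared to 0

    def _addresses(self, image):
        # The n sampled bits of each RAM form the address of one site.
        return [tuple(image[p] for p in pix) for pix in self.mapping]

    def train(self, image):
        # One-shot training: set every addressed site to 1.
        for ram, addr in zip(self.rams, self._addresses(image)):
            ram.add(addr)

    def response(self, image):
        # Eq. (2): z_j is the number of RAMs whose addressed site holds 1.
        return sum(addr in ram for ram, addr in zip(self.rams, self._addresses(image)))


def feature_vector(discriminators, image):
    # Eqs. (3) and (4): x_j = z_j / M for every class j.
    return [d.response(image) / len(d.rams) for d in discriminators]


def wta_class(x):
    # Classical WTA decision: index of the discriminator with the highest response.
    return max(range(len(x)), key=lambda j: x[j])


def class_membership(rule_degrees):
    """rule_degrees[j] holds the first-layer degrees d_{f_ij(l)}(x) for overlapping class j.

    Eq. (5): max over the rules l; Eq. (6): min over the overlapping classes (assumed).
    """
    return min(max(degrees) for degrees in rule_degrees.values())


# Toy usage with two classes and 16-pixel binary images (illustrative values only).
rng = random.Random(0)
mapping = make_mapping(image_size=16, n=4, rng=rng)  # M = 16 / 4 = 4 RAMs
zeros, ones = [0] * 16, [1] * 16
d0, d1 = Discriminator(mapping), Discriminator(mapping)
d0.train(zeros)
d1.train(ones)
x = feature_vector([d0, d1], ones)
print(x)             # [0.0, 1.0]
print(wta_class(x))  # 1
print(class_membership({1: [0.2, 0.9], 2: [0.6, 0.4]}))  # 0.6
```

Storing each RAM as a set of addressed sites avoids allocating all $2^n$ sites explicitly; a site
absent from the set corresponds to a stored "0".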
Finally, an unknown vector x is classified to the class that generates the maximum degree of
membership, i.e.:

     $x \in \text{class } i \quad \text{if} \quad d_i(x) = \max_{j = 1, 2, \ldots, N} ( d_j(x) )$          ...... (7)

Figure 3. Detailed architecture of the ith fuzzy network (F-Net_i).
[Diagram: the feature vector x = (x_1, ..., x_N) feeds one sub-net per overlap (for example, the
overlap between class i and class j, and between class i and class k); in each sub-net the
first-layer nodes compute d_{f_ij(1)}(x), d_{f_ij(2)}(x), ..., a MAX node gives d_{f_ij}(x), and a
final MIN node gives d_i(x).]

3. Hybrid Training Algorithm
Training the F-SLWNN is performed in two steps. The first step generates initial clusters from a
subset of the training set. In this phase, a subset of training patterns T1 is used to train the
SLWNN. Each discriminator is trained to recognize one class. After completing this phase, the
information stored in the discriminators' RAMs gives the SLWNN its generalization capability. The
second phase of the learning algorithm is the generalization phase, in which the complete training
set T (including T1) is applied to the trained SLWNN. Each pattern produces a feature vector that
describes its similarity to all classes. The generated set of feature vectors is used to extract the
fuzzy rules for each fuzzy network (F-Net), based on the overlapping regions of its own cluster with
the other classes' clusters.
Once the generalization phase is completed, the SLWNN stage has encoded in its memory sites
information about the training pattern subset T1, and the fuzzy inference stage has extracted the
set of rules from the whole training set T. When an unknown pattern is applied to the system, the
SLWNN generates a feature vector (x) that describes the belongingness of the input vector to each
class. Fuzzy rules are then applied to this feature vector to find the class that generates the
maximum degree of membership, and the unknown pattern is associated with that class.

4. Discussion and Experimental results
The classification ability of the F-SLWNN has been tested on the classification of handwritten
numerals using the NIST standard database SD19 [10]. A 10-discriminator SLWNN is used in the first
stage of the system. Each discriminator consists of 32 neurons, each with an n-tuple size of 8,
sampling binary images of 16x16 pixels. The parameters of the SLWNN (n-tuple size, mapping method,
number of training patterns) are fixed in the experiments conducted. The effect of changing these
parameters on the performance of the network is beyond the scope of this work. The complete training
set T consists of 150 handwritten numerals from each class (a total of 1500 patterns). The subset
T1, which is used to train the SLWNN, contains 100 patterns from each class. The binary images from
the database were normalized and resized into a 16x16 pixel grid.
The performance of the F-SLWNN is compared with that of the standard WTA-SLWNN. The WTA-SLWNN is
trained on the same set of patterns T1. Table 1 summarizes the results obtained when testing the two
classifiers on different sets. The first experiments are conducted to verify the behavior of the two
models on the training sets T1 and T. Both systems produce 100% correct recognition for the training
set T1. However, for training set T, the WTA-SLWNN gives an average of 91.7% correct recognition
compared with 100% correct recognition for the F-SLWNN. In this case, the WTA-SLWNN used its encoded
data to generalize for the unseen patterns in the set T, while the fuzzy inference stage used the
SLWNN-generated response vectors to extract the rules required to correctly classify them.
The second experiment is conducted to test the generalization ability of the two classifiers. A new
set of 500 unseen patterns, selected equally from the 10 numeral classes, is presented to the two
networks. The performance of the F-SLWNN is higher than that of the WTA-SLWNN (97.6% versus 92.2%
correct classification), as indicated in Table 1.
The final experiments are conducted to test the noise tolerance of the two models. 'Salt and pepper'
noise, generated with a function supported by MATLAB, is added to the
training set T, with different values of noise density. Figure 4 shows the percentage of correct
recognition as a function of noise density for both classifiers. The results show that the
performance of both models is comparable when the noise density is low. However, the performance of
the WTA-SLWNN degrades more rapidly than that of the F-SLWNN as the noise density increases.

                                                 Percentage of correct classification
 Experiment                  Applied patterns    WTA-SLWNN        F-SLWNN
 Classifier verification     Set T1              100              100
 Classifier verification     Set T               91.7             100
 Generalization ability      Test set            92.2             97.6

Table 1. A comparison of the percentage of correct recognition between F-SLWNN and WTA-SLWNN.

Figure 4. Percentage of correct recognition as a function of added noise density.

5. Conclusions
A new two-stage neuro-fuzzy classifier is introduced. The first stage of the classifier utilizes a
Single Layer Weightless Neural Network (SLWNN) as a feature extractor. The second stage is a fuzzy
inference system whose rules are extracted from the feature vectors generated by the trained SLWNN.
This approach is an alternative to the traditional crisp WTA n-tuple classifiers.
The effectiveness of the proposed system has been validated using the NIST database SD19 for
handwritten numerals. Experimental results reveal that the F-SLWNN classifier outperforms the
WTA-SLWNN. Future work will investigate the effect of varying the n-tuple size and the size of the
training sets T1 and T on the performance of the system.

References
[1]  T. M. Jørgensen, "Classification of Handwritten Digits Using a RAM Neural Net Architecture,"
     Int'l J. Neural Systems, Vol. 8, No. 1, pp. 17-25, 1997.
[2]  T. G. Clarkson et al., "Speaker Identification for Security Systems Using Reinforcement-Trained
     pRAM Neural Network Architectures," IEEE Transactions on Systems, Man and Cybernetics-Part C:
     Applications and Reviews, Vol. 31, No. 1, pp. 65-76, Feb. 2001.
[3]  J. Austin, RAM-Based Neural Networks, Singapore: World Scientific, 1998.
[4]  I. Aleksander, W. Thomas, P. Bowden, "WISARD, a radical new step forward in image recognition,"
     Sensor Review 4, 29-40, 1984.
[5]  E. V. Simões, L. F. Uebel and D. A. C. Barone, "Hardware Implementation of RAM Neural Networks,"
     Pattern Recognition Letters, Vol. 17, No. 4, pp. 421-429, 1996.
[6]  R. Al-Alawi, "FPGA Implementation of a Pyramidal Weightless Neural Networks Learning System,"
     International Journal of Neural Systems, Vol. 13, No. 4, pp. 225-237, 2003.
[7]  W. W. Bledsoe and I. Browning, "Pattern Recognition and Reading by Machine," in Proc. Eastern
     Joint Computer Conference, Boston, MA, pp. 232-255, 1959.
[8]  S. Abe and M. S. Lan, "A Method for Fuzzy Rule Extraction Directly from Numerical Data and its
     Application to Pattern Classification," IEEE Transactions on Fuzzy Systems, Vol. 3, No. 1,
     pp. 18-28, Feb. 1995.
[9]  P. K. Simpson, "Fuzzy Min-Max Neural Networks-Part 1: Classification," IEEE Transactions on
     Neural Networks, Vol. 3, No. 5, pp. 776-786, Sept. 1992.
[10] National Institute of Standards and Technology, NIST Special Data Base 19, NIST Handprinted
     Forms and Characters Database, 2002.

								