Introduction to Pattern Recognition


   Neural Networks

   Kamal Nasrollahi, Assistant Professor
Computer Vision and Media Technology Lab
            Aalborg University
A new sort of computer
 What are (everyday) computer systems good at...
  and not so good at?

   Good at                        Not so good at
   Rule-based systems: doing      Dealing with noisy data
   what the programmer wants      Dealing with unknown environment data
   them to do                     Massive parallelism
                                  Fault tolerance
                                  Adapting to circumstances
A new sort of computer
when we can't formulate an algorithmic solution
when we can get lots of examples of the
 behavior we require.
         ‘learning from experience’
when we need to pick out the structure
 from existing data.
Neural networks

Neural network: information processing
 paradigm inspired by biological nervous
 systems, such as our brain
Structure: large number of highly
 interconnected processing elements
 (neurons) working together
Like people, they learn from experience
 (by example)
Neural networks

Neural networks are configured for a
 specific application, such as pattern
 recognition or data classification, through
 a learning process
In a biological system, learning involves
 adjustments to the synaptic connections
 between neurons
Inspiration from Neurobiology
 A neuron: many-inputs /
  one-output unit
 output can be excited or
  not excited
 incoming signals from other
  neurons determine if the
  neuron shall excite ("fire")
 Output subject to
  attenuation in the synapses,
  which are junction parts of
  the neuron
Inspiration from Neurobiology
 The human brain contains about 10 billion
  nerve cells

 Each neuron is connected to the others
 Synapse concept

  The synapse's resistance to the incoming signal can be
   changed during a "learning" process [Hebb, 1949]

  Hebb's Rule:
   If an input of a neuron repeatedly and persistently causes
   the neuron to fire, a metabolic change happens in the synapse
   of that particular input that reduces its resistance.

  Learning in biological systems involves adjustments to the
   synaptic connections that exist between the neurons. This is
   true of ANNs as well.
Inspiration from Neurobiology

 [Figure: a biological neuron; the only surviving label is "Terminal Branches of Axon"]

What is an artificial neuron ?

Definition : Non linear, parameterized function
 with restricted output range

 $$ y = f\left( w_0 + \sum_{i=1}^{n-1} w_i x_i \right) $$

 [Figure: a neuron with inputs x1, x2, x3, bias w0, and output y]
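
The definition above can be written as a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `neuron`, the choice of the hyperbolic tangent as f, and the example numbers are all assumptions made for illustration.

```python
import math

def neuron(inputs, weights, bias, activation=math.tanh):
    """Return f(w0 + sum_i w_i * x_i) for one artificial neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Weighted sum of the inputs plus the bias term w0.
    v = bias + sum(w * x for w, x in zip(weights, inputs))
    # The activation restricts the output range, here to (-1, 1).
    return activation(v)

# Hypothetical inputs x1, x2, x3 and weights, chosen only for illustration.
y = neuron(inputs=[1.0, 0.5, -1.0], weights=[0.2, -0.4, 0.1], bias=0.3)
print(y)
```

The parameters of the function are the weights and the bias; the inputs are the data the neuron is applied to.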
Activation functions




 [Figure: plots of three activation functions]
  Linear (plotted for x from 0 to 20)
  Logistic: $y = \frac{1}{1 + \exp(-x)}$ (plotted for x from -10 to 10)
  Hyperbolic tangent: $y = \frac{\exp(x) - \exp(-x)}{\exp(x) + \exp(-x)}$ (plotted for x from -10 to 10)
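
A minimal Python sketch of the three activation functions, written directly from their formulas. The function names are assumptions, and the linear function is taken to be the identity, which the slide residue does not confirm.

```python
import math

def linear(x):
    # Assumed to be the identity: the output range is not restricted.
    return x

def logistic(x):
    # Output in (0, 1). math.exp overflows for very negative x (below about -709).
    return 1.0 / (1.0 + math.exp(-x))

def hyperbolic_tangent(x):
    # Output in (-1, 1); written out from the definition rather than math.tanh.
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

for x in (-10, -2, 0, 2, 10):
    print(x, linear(x), round(logistic(x), 4), round(hyperbolic_tangent(x), 4))
```

In practice `math.tanh` is the safer choice, since the explicit formula overflows for large |x|.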
Artificial Neural Networks
   A mathematical model to solve engineering problems
       Group of highly connected neurons to realize compositions of
        non linear functions
   Tasks
       Function Approximation
       Classification
       Time Series Prediction
       Data Mining
   2 types of networks
       Feed forward Neural Networks
       Recurrent Neural Networks
   Feed Forward Neural Networks
    The information is propagated from the inputs to the outputs
    Computation of No nonlinear functions of n input variables
     by composition of Nc algebraic functions
    Time has no role (NO cycle between outputs and inputs)
    [Figure: layered network with inputs x1, x2, ..., xn, a 1st hidden
     layer, and a 2nd hidden layer]
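
The forward propagation described on this slide can be sketched in Python as follows. The layer sizes, the hand-picked weights, and the use of the logistic activation in every layer are assumptions for illustration only.

```python
import math

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

def layer_forward(inputs, weights, biases):
    """One layer: each row of `weights` feeds one neuron of the layer."""
    return [logistic(b + sum(w * x for w, x in zip(row, inputs)))
            for row, b in zip(weights, biases)]

def feed_forward(inputs, layers):
    """Propagate the inputs through the layers in order: no cycles, no state."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hypothetical network: 2 inputs -> 3 hidden -> 2 hidden -> 1 output.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.3, -0.6, 0.2], [0.7, 0.1, -0.5]], [0.05, -0.05]),
    ([[1.0, -1.0]], [0.0]),
]
outputs = feed_forward([1.0, 2.0], layers)
print(outputs)
```

Because nothing is stored between calls, the same input always produces the same output, which is what "time has no role" means here.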
Recurrent Neural Networks
 Can have arbitrary topologies
 Can model systems with internal states (dynamic ones)
 Delays are associated with a specific weight
 Training is more difficult
 Performance may be problematic
   Stable outputs may be more difficult to evaluate
   Unexpected behavior (oscillation, chaos, …)
 [Figure: recurrent network with inputs x1 and x2 and connections labeled 0 or 1]
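
To make the idea of an internal state concrete, here is a minimal sketch of a single recurrent unit whose output is fed back with a delay of one time step. The update rule, the weights `w_in` and `w_rec`, and the input sequence are assumptions, not taken from the slides.

```python
import math

def run_recurrent_unit(inputs, w_in=1.0, w_rec=0.5):
    """Run h_t = tanh(w_in * x_t + w_rec * h_{t-1}) over an input sequence."""
    state = 0.0  # internal state, carried from one time step to the next
    outputs = []
    for x in inputs:
        # The previous output is fed back through the recurrent weight.
        state = math.tanh(w_in * x + w_rec * state)
        outputs.append(state)
    return outputs

# A single pulse followed by zeros: the unit keeps responding after the
# input has gone back to zero, because of its internal state.
impulse_response = run_recurrent_unit([1.0, 0.0, 0.0, 0.0])
print(impulse_response)
```

Unlike a feed-forward network, the output for a given input depends on what the unit has seen before.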

Learning

The procedure of estimating the parameters
 of the neurons so that the whole network
 can perform a specific task

2 types of learning
 The supervised learning
 The unsupervised learning
Supervised learning

Present the network a number of inputs
 and their corresponding outputs
See how closely the actual outputs match
 the desired ones
Modify the parameters to better
 approximate the desired outputs
Unsupervised learning

Idea: group typical input data according to
 resemblance criteria unknown a priori
Data clustering
No need for a teacher
   The network finds the correlations between the data by itself
  Examples of such networks:
     Kohonen feature maps
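
As a rough illustration of learning without a teacher, the sketch below uses winner-take-all competitive learning on one-dimensional data. It is a simplification and not a Kohonen feature map (there is no neighborhood between units); the data, the initial prototypes, and the learning rate are all assumptions.

```python
def competitive_learning(data, prototypes, rate=0.1, epochs=50):
    """Move the closest prototype toward each sample (winner-take-all)."""
    prototypes = list(prototypes)
    for _ in range(epochs):
        for x in data:
            # The winner is the prototype closest to the sample.
            winner = min(range(len(prototypes)),
                         key=lambda k: abs(x - prototypes[k]))
            # Only the winner is updated; no desired output is involved.
            prototypes[winner] += rate * (x - prototypes[winner])
    return prototypes

# Hypothetical 1-D data with two visible groups, around 1 and around 9.
data = [0.0, 1.0, 2.0, 8.0, 9.0, 10.0]
centers = competitive_learning(data, prototypes=[4.0, 6.0])
print(centers)
```

Each prototype drifts toward the group of samples it wins, so the grouping emerges from the data alone.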
Properties of Neural Networks

Supervised networks are universal approximators
Theorem: Any bounded function can be
 approximated by a neural network with a finite
 number of hidden neurons to an arbitrary precision
Types of approximators
  Linear approximators
  Non-linear approximators
Other properties

Adaptivity
  Adapt weights to the environment; easily retrained
Generalization ability
  May compensate for a lack of data
Fault tolerance
  Graceful degradation of performance if damaged, because
     the information is distributed within the entire net.
What do we need to use NN ?

Determination of the number of inputs and outputs
Collection of data for the learning and testing
 phase of the neural network
Finding the optimum number of hidden nodes
Estimate the parameters (Learning)
Evaluate the performances of the network
If the performance is not satisfactory, then
 review all the preceding points
Classical neural architectures

Multi-Layer Perceptron
Radial Basis Function (RBF)
Kohonen feature maps
Other architectures
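  An example: shared weights neural networks
Perceptron

Rosenblatt (1962)
Linear separation
Inputs: vector of real values
Outputs: 1 or -1
 $y = \operatorname{sign}(v)$
 $v = c_0 + c_1 x_1 + c_2 x_2$
 Decision boundary: $c_0 + c_1 x_1 + c_2 x_2 = 0$, with $y = +1$ on one side and $y = -1$ on the other
[Figure: scatter of "+" points split by the decision boundary; neuron diagram with a constant input 1 weighted by c0 and inputs x1, x2 weighted by c1, c2]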
Learning with a perceptron
 Perceptron: $y_{out} = c^T x$
 Data: $(x^1, y_1), (x^2, y_2), \ldots, (x^N, y_N)$
 Error: $E(t) = (y(t)_{out} - y_t)^2 = (c(t)^T x^t - y_t)^2$
 Learning:
   $c_i(t+1) = c_i(t) - K \frac{\partial E(t)}{\partial c_i} = c_i(t) - K \frac{\partial (c(t)^T x^t - y_t)^2}{\partial c_i}$
   $c_i(t+1) = c_i(t) - K \, (c(t)^T x^t - y_t) \, x_i^t$ (the factor 2 from the derivative is absorbed into K)
   where $c(t)^T x^t = \sum_{j} c_j(t) \, x_j^t$

 A perceptron is able to learn a linear function.

The perceptron algorithm converges if
 examples are linearly separable
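
The learning rule on this slide is gradient descent on the squared error, applied one example at a time. Differentiating the square gives a factor 2, which the update rule folds into the constant K. A minimal sketch, assuming noise-free data from a hypothetical line y = 1 + 2x, a step size K = 0.1, and a constant first input component that acts as the bias:

```python
def train_linear_unit(samples, n_weights, K=0.1, epochs=200):
    """Online gradient descent on E(t) = (c^T x^t - y_t)^2."""
    c = [0.0] * n_weights
    for _ in range(epochs):
        for x, y in samples:
            y_out = sum(cj * xj for cj, xj in zip(c, x))
            error = y_out - y
            # c_i(t+1) = c_i(t) - K * (c(t)^T x^t - y_t) * x_i^t
            c = [ci - K * error * xi for ci, xi in zip(c, x)]
    return c

# Hypothetical data from y = 1 + 2*x. The first component of each input
# vector is a constant 1, so that c[0] plays the role of a bias.
samples = [([1.0, x], 1.0 + 2.0 * x) for x in (-1.0, 0.0, 1.0, 2.0)]
c = train_linear_unit(samples, n_weights=2)
print(c)
```

The step size matters: if K is too large relative to the size of the inputs, the updates overshoot and the weights diverge instead of settling.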
Description of a simple neuron
 An artificial neuron is a device with many inputs and one output.
 The neuron has two modes of operation; the training mode and the
  using mode.
 In the training mode, the neuron can be trained to fire (or not), for
  particular input patterns.
 In the using mode, when a taught input pattern is detected at the
  input, its associated output becomes the current output.
 If the input pattern does not belong in the taught list of input
  patterns, the firing rule is used to determine whether to fire or not.
Firing rules - How neurons
make decisions

 The firing rule is an important concept in neural networks and
  accounts for their high flexibility. A firing rule determines how
  one calculates whether a neuron should fire for any input
  pattern. It relates to all the input patterns, not only the ones
  on which the node was trained.
 A simple firing rule can be implemented using the Hamming
  distance. The rule goes as follows:
   Take a collection of training patterns for a node, some of which
    cause it to fire (the 1-taught set of patterns) and others which
    prevent it from doing so (the 0-taught set).
   Then the patterns not in the collection cause the node to fire if,
    on comparison, they have more input elements in common with
    the 'nearest' pattern in the 1-taught set than with the 'nearest'
    pattern in the 0-taught set. If there is a tie, then the pattern
    remains in the undefined state.
Firing rules - How neurons
make decisions

For example, a 3-input neuron is taught
 to output 1 when the input (X1, X2 and
 X3) is 111 or 101 and to output 0 when
 the input is 000 or 001. Then, before
 applying the firing rule, the truth table is:

   X1:   0    0    0    0    1    1    1    1
   X2:   0    0    1    1    0    0    1    1
   X3:   0    1    0    1    0    1    0    1
   OUT:  0    0   0/1  0/1  0/1   1   0/1   1
Firing rules - How neurons
make decisions

 As an example of the way the firing rule is applied, take
  the pattern 010. It differs from 000 in 1 element, from
  001 in 2 elements, from 101 in 3 elements and from 111
  in 2 elements.
 Therefore, the 'nearest' pattern is 000 which belongs in
  the 0-taught set. Thus the firing rule requires that the
  neuron should not fire when the input is 010.
Firing rules - How neurons
make decisions

On the other hand, 011 is equally distant
 from two taught patterns that have
 different outputs and thus the output
 stays undefined (0/1).
Firing rules - How neurons
make decisions

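• By applying the firing rule in every column, the following truth table is
  obtained:

   X1:   0    0    0    0    1    1    1    1
   X2:   0    0    1    1    0    0    1    1
   X3:   0    1    0    1    0    1    0    1
   OUT:  0    0    0   0/1  0/1   1    1    1

• The difference between the two truth tables is called the generalization
  of the neuron.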
• Therefore the firing rule gives the neuron a sense of similarity and
  enables it to respond 'sensibly' to patterns not seen during training.
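
The Hamming-distance firing rule and the 3-input example can be checked with a short Python sketch. The function names and the string encoding of patterns are assumptions; "more input elements in common" is implemented as a smaller Hamming distance.

```python
from itertools import product

def hamming(a, b):
    """Number of positions in which two equal-length patterns differ."""
    return sum(x != y for x, y in zip(a, b))

def fire(pattern, taught_1, taught_0):
    """Return "1", "0", or "0/1" (undefined) for an input pattern."""
    # Distance to the nearest pattern in each taught set. A taught pattern
    # has distance 0 to its own set, so it keeps its taught output.
    nearest_1 = min(hamming(pattern, p) for p in taught_1)
    nearest_0 = min(hamming(pattern, p) for p in taught_0)
    if nearest_1 < nearest_0:
        return "1"
    if nearest_0 < nearest_1:
        return "0"
    return "0/1"  # tie: the output stays undefined

taught_1 = {"111", "101"}
taught_0 = {"000", "001"}
table = {"".join(bits): fire("".join(bits), taught_1, taught_0)
         for bits in product("01", repeat=3)}
for pattern, output in table.items():
    print(pattern, output)
```

For the pattern 010 the nearest taught pattern is 000, at distance 1, so the sketch returns 0, in line with the worked example.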
   Multi-Layer Perceptron

    One or more hidden layers
    Sigmoid activation functions
    [Figure: network with input data, a 1st hidden layer, and a 2nd hidden layer]
Back-propagation algorithm
  Most common method of obtaining the many weights in
   the network
     A form of supervised training
The basic backpropagation algorithm is based on
minimizing the error of the network using the
derivatives of the error function
Prone to local minima issues
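
A minimal sketch of the basic back-propagation idea in pure Python: a network with one hidden layer, trained by gradient descent on the squared error. The network size, learning rate, number of epochs, random seed, and the choice of the XOR data are all assumptions. Because of the local-minima issue noted above, the sketch is only expected to reduce the error, not necessarily to solve the task.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward(x, w_hidden, w_out):
    """Return the hidden activations and the network output for input x."""
    x_b = x + [1.0]  # constant 1 appended as bias input
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x_b))) for row in w_hidden]
    out = sigmoid(sum(w * h for w, h in zip(w_out, hidden + [1.0])))
    return hidden, out

def total_error(data, w_hidden, w_out):
    return sum((forward(x, w_hidden, w_out)[1] - y) ** 2 for x, y in data)

def train(data, n_hidden=2, rate=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for x, y in data:
            hidden, out = forward(x, w_hidden, w_out)
            # Output delta: derivative of (out - y)^2 / 2 through the sigmoid.
            delta_out = (out - y) * out * (1.0 - out)
            # Hidden deltas: the output delta propagated back through w_out.
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                            for j in range(n_hidden)]
            # Gradient step on the output weights, then on the hidden weights.
            w_out = [w - rate * delta_out * h
                     for w, h in zip(w_out, hidden + [1.0])]
            w_hidden = [[w - rate * delta_hidden[j] * xi
                         for w, xi in zip(w_hidden[j], x + [1.0])]
                        for j in range(n_hidden)]
    return w_hidden, w_out

xor_data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
            ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
initial = train(xor_data, epochs=0)   # untrained weights, same seed
trained = train(xor_data)
print(total_error(xor_data, *initial), total_error(xor_data, *trained))
```

Running it with several seeds is a simple way to see the local-minima issue: some starting points end with a much lower error than others.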
Error Back-Propagation
Different non-linearly separable problems

 Structure      Types of decision regions          Exclusive-OR   Classes with      Most general
                                                   problem        meshed regions    region shapes
 Single-layer   Half plane bounded by hyperplane   (figure)       (figure)          (figure)
 Two-layer      Convex open or closed regions      (figure)       (figure)          (figure)
 Three-layer    Arbitrary (complexity limited      (figure)       (figure)          (figure)
                by no. of nodes)
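
The Exclusive-OR column can be illustrated with hand-set weights. A single threshold unit can only cut the input plane into two half planes, which cannot separate XOR, but a two-layer network can. The sketch below is an assumption-laden example: the weights are picked by hand rather than learned, "two-layer" is taken to mean one hidden layer plus one output unit, and the inputs are coded as 0 and 1.

```python
def step(v):
    """Threshold unit: 1 if v > 0, else 0."""
    return 1 if v > 0 else 0

def xor_two_layer(x1, x2):
    # Hidden layer: each unit draws one line (half plane) in the input plane.
    h_or = step(x1 + x2 - 0.5)   # fires when at least one input is 1
    h_and = step(x1 + x2 - 1.5)  # fires only when both inputs are 1
    # Output unit: fires in the strip between the two lines, a convex region
    # obtained by combining the two half planes.
    return step(h_or - h_and - 0.5)

truth_table = {(a, b): xor_two_layer(a, b) for a in (0, 1) for b in (0, 1)}
for inputs, output in truth_table.items():
    print(inputs, output)
```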
 Neural Networks – An Introduction Dr. Andrew Hunter

 Andrew L. Nelson, Introduction to Artificial Neural
 Nicolas Galoppo von Borries, Introduction to Artificial
  Neural Networks.
