Posted on: 3/25/2012. Public Domain.
Artificial Neural Networks
Kamal Nasrollahi, Assistant Professor
Computer Vision and Media Technology Lab, Aalborg University
kn@create.aau.dk

A new sort of computer
What are (everyday) computer systems good at... and not so good at?
Good at:
• Rule-based systems: doing what the programmer wants them to do
Not so good at:
• Dealing with noisy data
• Dealing with unknown environment data
• Massive parallelism
• Fault tolerance
• Adapting to circumstances

A new sort of computer
• When we can't formulate an algorithmic solution
• When we can get lots of examples of the behavior we require ('learning from experience')
• When we need to pick out the structure from existing data

Neural networks
• Neural network: an information processing paradigm inspired by biological nervous systems, such as our brain
• Structure: a large number of highly interconnected processing elements (neurons) working together
• Like people, they learn from experience (by example)

Neural networks
• Neural networks are configured for a specific application, such as pattern recognition or data classification, through a learning process
• In a biological system, learning involves adjustments to the synaptic connections between neurons

Inspiration from Neurobiology
• A neuron: a many-inputs / one-output unit
• The output can be excited or not excited
• Incoming signals from other neurons determine whether the neuron shall excite ("fire")
• The output is subject to attenuation in the synapses, which are junction parts of the neuron

Inspiration from Neurobiology
• The human brain contains about 10 billion nerve cells (neurons) by older estimates; more recent counts put the figure near 86 billion
• Each neuron is connected to the others through about 10,000 synapses

Synapse concept
• The resistance of a synapse to the incoming signal can be changed during a "learning" process
• Hebb's Rule (1949): If an input of a neuron is repeatedly and persistently causing the neuron to fire, a metabolic change happens in the synapse of that particular input to reduce its resistance

Learning in biological systems involves adjustments to the synaptic connections
that exist between the neurons. This is true of ANNs as well.

Inspiration from Neurobiology
[Figure: biological neuron, labeled with dendrites, axon, and terminal branches of axon]

What is an artificial neuron?
• Definition: nonlinear, parameterized function with restricted output range
• $y = f\left(w_0 + \sum_{i=1}^{n-1} w_i x_i\right)$
[Figure: neuron with inputs $x_1$, $x_2$, $x_3$, bias $w_0$, and output $y$]

Activation functions
• Linear: $y = x$
• Logistic: $y = \frac{1}{1 + \exp(-x)}$
• Hyperbolic tangent: $y = \frac{\exp(x) - \exp(-x)}{\exp(x) + \exp(-x)}$
[Figure: plots of the three activation functions]

Artificial Neural Networks
• A mathematical model to solve engineering problems
• Group of highly connected neurons to realize compositions of nonlinear functions
• Tasks: function approximation, classification, time series prediction, data mining
• 2 types of networks: feed forward neural networks, recurrent neural networks

Feed Forward Neural Networks
• The information is propagated from the inputs to the outputs
• Computations of $N_o$ nonlinear functions from $n$ input variables by compositions of $N_c$ algebraic functions
• Time has no role (NO cycle between outputs and inputs)
Figure labels, top to bottom: output layer, 2nd hidden layer, 1st hidden layer, inputs x1, x2, ...,
xn

Recurrent Neural Networks
• Can have arbitrary topologies
• Can model systems with internal states (dynamic ones)
• Delays are associated with a specific weight
• Training is more difficult
• Performance may be problematic:
  • Stable outputs may be more difficult to evaluate
  • Unexpected behavior (oscillation, chaos, …)
[Figure: recurrent network over inputs x1, x2, with connections labeled 0 or 1]

Learning
• The procedure that consists in estimating the parameters of neurons so that the whole network can perform a specific task
• 2 types of learning: supervised learning and unsupervised learning

Supervised learning
• Present the network with a number of inputs and their corresponding outputs
• See how closely the actual outputs match the desired ones
• Modify the parameters to better approximate the desired outputs

Unsupervised learning
• Idea: group typical input data according to resemblance criteria that are unknown a priori
• Data clustering
• No need for a teacher: the network finds the correlations between the data by itself
• Examples of such networks: Kohonen feature maps

Properties of Neural Networks
• Supervised networks are universal approximators
• Theorem: Any bounded, sufficiently regular (e.g. continuous) function can be approximated on a bounded domain to arbitrary precision by a neural network with a finite number of hidden neurons
• Types of approximators: linear approximators, nonlinear approximators

Other properties
• Adaptivity: weights adapt to the environment and the network can be retrained easily
• Generalization ability: may compensate for a lack of data
• Fault tolerance: graceful degradation of performance if damaged, because the information is distributed within the entire net

What do we need to use NN?
• Determination of the number of inputs and outputs
• Collection of data for the learning and testing phase of the neural network
• Finding the optimum number of hidden nodes
• Estimate the parameters (Learning)
• Evaluate the performance of the network
• IF the performance is not satisfactory, then review all the preceding points

Classical neural architectures
• Perceptron
• Multi-Layer Perceptron
• Radial Basis Function (RBF)
• Kohonen feature maps
• Other architectures (an example: shared weights neural networks)

Perceptron
• Rosenblatt (1962)
• Linear separation
• Inputs: vector of real values
• Outputs: 1 or -1
• $y = \mathrm{sign}(v)$, with $v = c_0 + c_1 x_1 + c_2 x_2$
[Figure: points marked + on either side of the line $c_0 + c_1 x_1 + c_2 x_2 = 0$, with $y = +1$ on one side and $y = -1$ on the other; a unit with inputs 1, $x_1$, $x_2$ weighted by $c_0$, $c_1$, $c_2$]

Learning with a perceptron
• Perceptron: $y_{out} = c^T x$
• Data: $(x^1, y_1), (x^2, y_2), \ldots, (x^N, y_N)$
• Error: $E(t) = (y_{out}(t) - y_t)^2 = (c(t)^T x^t - y_t)^2$
• Learning: $c_i(t+1) = c_i(t) - K \frac{\partial E(t)}{\partial c_i} = c_i(t) - K \frac{\partial (c(t)^T x^t - y_t)^2}{\partial c_i}$
• $c_i(t+1) = c_i(t) - K \, (c(t)^T x^t - y_t) \, x_i^t$, where $c(t)^T x^t = \sum_{j=1}^{m} c_j(t) x_j^t$ (the constant factor 2 from differentiating the square is absorbed into the learning rate $K$)
• A perceptron is able to learn a linear function.

Perceptron
• The perceptron algorithm converges if the examples are linearly separable

Description of a simple neuron
• An artificial neuron is a device with many inputs and one output.
• The neuron has two modes of operation: the training mode and the using mode.
• In the training mode, the neuron can be trained to fire (or not) for particular input patterns.
• In the using mode, when a taught input pattern is detected at the input, its associated output becomes the current output. If the input pattern does not belong to the taught list of input patterns, the firing rule is used to determine whether to fire or not.

Firing rules - How neurons make decisions
The firing rule is an important concept in neural networks and accounts for their high flexibility. A firing rule determines how one calculates whether a neuron should fire for any input pattern.
It relates to all the input patterns, not only the ones on which the node was trained. A simple firing rule can be implemented using the Hamming distance technique. The rule goes as follows: take a collection of training patterns for a node, some of which cause it to fire (the 1-taught set of patterns) and others which prevent it from doing so (the 0-taught set). Then the patterns not in the collection cause the node to fire if, on comparison, they have more input elements in common with the 'nearest' pattern in the 1-taught set than with the 'nearest' pattern in the 0-taught set. If there is a tie, then the pattern remains in the undefined state.

Firing rules - How neurons make decisions
For example, a 3-input neuron is taught to output 1 when the input (X1, X2 and X3) is 111 or 101 and to output 0 when the input is 000 or 001. Then, before applying the firing rule, the truth table is:

X1:   0    0    0    0    1    1    1    1
X2:   0    0    1    1    0    0    1    1
X3:   0    1    0    1    0    1    0    1
OUT:  0    0    0/1  0/1  0/1  1    0/1  1

Firing rules - How neurons make decisions
As an example of the way the firing rule is applied, take the pattern 010. It differs from 000 in 1 element, from 001 in 2 elements, from 101 in 3 elements and from 111 in 2 elements. Therefore, the 'nearest' pattern is 000, which belongs to the 0-taught set. Thus the firing rule requires that the neuron should not fire when the input is 010.

Firing rules - How neurons make decisions
On the other hand, 011 is equally distant from two taught patterns that have different outputs (001 and 111, each at distance 1), and thus the output stays undefined (0/1).

Firing rules - How neurons make decisions
• By applying the firing rule to every column, the following truth table is obtained:

X1:   0    0    0    0    1    1    1    1
X2:   0    0    1    1    0    0    1    1
X3:   0    1    0    1    0    1    0    1
OUT:  0    0    0    0/1  0/1  1    1    1

• The difference between the two truth tables is called the generalization of the neuron.
• Therefore the firing rule gives the neuron a sense of similarity and enables it to respond 'sensibly' to patterns not seen during training.
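The Hamming-distance firing rule described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function names `hamming` and `fire`, the use of bit strings for patterns, and the `"0/1"` marker for the undefined state are all assumptions made for this example. It uses the 3-input neuron from the example (1-taught set 111 and 101, 0-taught set 000 and 001).

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    return sum(ca != cb for ca, cb in zip(a, b))


def fire(pattern: str, taught_1: list[str], taught_0: list[str]) -> str:
    """Apply the firing rule to one input pattern.

    Returns "1" if the nearest 1-taught pattern is closer than the nearest
    0-taught pattern, "0" in the opposite case, and "0/1" (undefined) on a tie.
    Taught patterns have distance 0 to themselves, so they keep their output.
    """
    d1 = min(hamming(pattern, p) for p in taught_1)
    d0 = min(hamming(pattern, p) for p in taught_0)
    if d1 < d0:
        return "1"
    if d0 < d1:
        return "0"
    return "0/1"


# Training sets from the 3-input example in the text.
taught_1 = ["111", "101"]
taught_0 = ["000", "001"]

# Evaluate the rule on all 8 possible input patterns (000 ... 111).
truth_table = {
    "".join(bits): fire("".join(bits), taught_1, taught_0)
    for bits in product("01", repeat=3)
}

for pattern, out in truth_table.items():
    print(pattern, out)
```

Running the sketch gives 0 for the pattern 010 and 0/1 for 011, in line with the worked example. Having "more input elements in common" with a pattern is the same as having a smaller Hamming distance to it, which is why the comparison uses `min` over distances.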
Multi-Layer Perceptron
• One or more hidden layers
• Sigmoid activation functions
[Figure: network with input data, 1st hidden layer, 2nd hidden layer, and output layer]

Learning
• Back-propagation algorithm
• Most common method of obtaining the many weights in the network
• A form of supervised training
• The basic back-propagation algorithm is based on minimizing the error of the network using the derivatives of the error function
• Simple
• Slow
• Prone to local minima issues

Error Back-Propagation
[Figure]

Error Back-Propagation - Weights
[Figure]

Different non linearly separable problems

Structure      Types of decision regions
Single-layer   Half plane bounded by hyperplane
Two-layer      Convex open or closed regions
Three-layer    Arbitrary (complexity limited by number of nodes)

[Figure columns for each structure: Exclusive-OR problem, classes with meshed regions, most general region shapes, each drawn with classes A and B]
Source: Neural Networks – An Introduction, Dr. Andrew Hunter

References
• http://galaxy.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html
• http://www.cse.wustl.edu/~bayazit/courses/cs527a/
• Andrew L. Nelson, Introduction to Artificial Neural Networks.
• Nicolas Galoppo von Borries, Introduction to Artificial Neural Networks.