# ann

This simple type of network is interesting because the hidden units are free to construct their own
representations of the input. The weights between the input and hidden units determine when each hidden unit is
active, and so by modifying these weights, a hidden unit can choose what it represents. We also distinguish single-layer
and multi-layer architectures. The single-layer organization, in which all units are connected to one another, constitutes
the most general case and has more potential computational power than hierarchically structured multi-layer
organizations. In multi-layer networks, units are often numbered by layer instead of following a global numbering.

3.1 Artificial Neural Networks
In this paper we extend the analysis to use the concept of neural networks ([14], [15], [16]).
A neural network is composed of a number of interconnected units (artificial neurons). Each
unit has an input/output (I/O) characteristic and implements a local computation or function.
The output of any unit is determined by its I/O characteristic, its interconnection to
other units and (possibly) the external inputs. The applications of neural networks are enumerated
in [18] and [19].
Figure 2: An "Artificial Neuron"
A single neuron by itself is not a very useful pattern recognition tool. The real power
of neural networks comes when we combine neurons into multilayer structures, called
neural networks.
Figure 3: A "Simple Neural Network"
The neuron has:
1. A set of nodes that connect it to inputs, outputs, or other neurons, also called synapses.
2. A linear combiner, which is a function that takes all inputs and produces a single value. A simple
way of doing this is to add together each input multiplied by its synaptic weight.
3. An activation function, which takes any input from minus infinity to plus infinity and
squeezes it into the -1 to 1 interval or the 0 to 1 interval.
4. A threshold, which defines the internal activity of the neuron when there is no input.
In general, for the neuron to fire, the sum should be greater than the threshold. For simplicity,
the threshold can be replaced with an extra input whose value is fixed at -1 and whose weight
can change during the learning process.
The first layer is known as the input layer, the middle layer as the hidden layer, and the last
layer as the output (O/P) layer.
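
The neuron described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the choice of a sigmoid as the activation function, and the numeric values are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Activation function: squeeze any real input into the (0, 1) interval."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, threshold):
    """Single artificial neuron with the threshold folded in as an extra input.

    The extra input is fixed at -1 and its weight is the threshold, so the
    linear combiner computes sum(w_i * x_i) - threshold.
    """
    extended_inputs = list(inputs) + [-1.0]
    extended_weights = list(weights) + [threshold]
    # Linear combiner: each input multiplied by its synaptic weight, then summed.
    activity = sum(w * x for w, x in zip(extended_weights, extended_inputs))
    return sigmoid(activity)


# Illustrative values only (assumed, not from the paper).
print(neuron_output([1.0, 0.5], [0.8, -0.4], threshold=0.6))
```

With these values the weighted sum equals the threshold, so the neuron sits at the midpoint of the sigmoid and the output is approximately 0.5.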

3.2.1 Feed-forward Dynamics
When a BackProp network is cycled, the activations of the input units are propagated forward
to the output layer through the connecting weights:

$$\mathrm{net}_j = \sum_i w_{ji}\, a_i \tag{2}$$

where $a_i$ is the input activation from unit i and $w_{ji}$ is the weight connecting unit i to unit
j. However, instead of calculating a binary output, the net input is added to the unit's
bias $\theta_j$ and the resulting value is passed through a sigmoid function:

$$F(\mathrm{net}_j) = \frac{1}{1 + e^{-(\mathrm{net}_j + \theta_j)}} \tag{3}$$

The sigmoid function is sometimes called a "squashing" function because it maps its inputs
onto a fixed range.
Figure 4: Sigmoid Activation Function
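
As a rough illustration of equations (2) and (3), the forward pass can be sketched as follows. The function names, the nested-list weight layout, and the variable name `theta_j` for the bias are assumptions made for this example, and the weights are made-up values.

```python
import math


def forward_layer(activations, weights, biases):
    """Propagate activations through one layer.

    weights[j][i] plays the role of w_ji (weight from unit i to unit j);
    biases[j] is the bias of unit j (named theta_j here by assumption).
    """
    outputs = []
    for w_j, theta_j in zip(weights, biases):
        # Equation (2): net input is the weighted sum of incoming activations.
        net_j = sum(w_ji * a_i for w_ji, a_i in zip(w_j, activations))
        # Equation (3): add the bias and pass the result through the sigmoid.
        outputs.append(1.0 / (1.0 + math.exp(-(net_j + theta_j))))
    return outputs


def forward(inputs, layers):
    """Cycle the network: feed the inputs through each (weights, biases) layer."""
    a = list(inputs)
    for weights, biases in layers:
        a = forward_layer(a, weights, biases)
    return a


# Hypothetical 2-2-1 network with made-up weights.
hidden = ([[0.5, -0.3], [0.8, 0.2]], [0.1, -0.1])
output = ([[1.0, -1.0]], [0.0])
print(forward([1.0, 0.0], [hidden, output]))
```

Whatever the weights, every output lies strictly between 0 and 1, which is the fixed range the squashing function maps onto.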

INTRODUCTION
There is a vital need for speaker identification in all spheres of life, most importantly
because such a system enables people to have secure access to information and property. It
offers a significant advantage in electronic banking and Internet access. A great deal of money is lost
each year due to white-collar crime, fraud and embezzlement. In today's complex economic times,
businesses and individuals are both falling victim to these devastating crimes. Employees embezzle
funds or steal goods from employers, then disappear or hide behind legal issues. Individuals can
easily become helpless victims of identity theft, stock schemes and other scams that rob them of their
money.
One solution to avoid such white-collar crimes, and to shorten the lengthy time spent locating and
serving perpetrators with a judgment, is the use of biometric techniques for verifying individuals.
Artificial neural networks (ANNs) are intelligent systems that are related in some way to a
simplified biological model of the human brain. Voice signals are attenuated and distorted over
telephone lines, and an artificial neural network, despite the nonlinear, noisy and non-stationary
environment, is still good at recognizing and verifying unique characteristics of signals such as
speech. Speaker recognition involves speaker identification or speaker verification based on a
person's voice in the form of speech.

SPEAKER RECOGNITION:
Speaker recognition is the generic term used for two related problems:
1. Speaker Identification: the problem is to determine the identity of a
speaker from a known group of N possible speakers.
2. Speaker Verification: basically the same problem as speaker identification, except
that a claimed identity is also given and the problem is "merely" to confirm or
disconfirm the identity claim.
The speaker recognition problem using an ANN is divided into two parts:
i) Feature extraction
ii) Pattern matching
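
The difference between the two problems can be illustrated with a small sketch. It assumes the pattern-matching stage yields one score per enrolled speaker; the score dictionary, the speaker labels, and the 0.5 acceptance threshold are all hypothetical and not taken from the paper.

```python
def identify(scores):
    """Speaker identification: pick the best-scoring speaker among the N known speakers."""
    return max(scores, key=scores.get)


def verify(scores, claimed_id, threshold=0.5):
    """Speaker verification: confirm or reject a claimed identity.

    The threshold is an assumed tuning parameter, not a value from the paper.
    """
    return scores.get(claimed_id, 0.0) >= threshold


# Made-up per-speaker scores for one utterance.
scores = {"speaker_a": 0.12, "speaker_b": 0.81, "speaker_c": 0.07}
print(identify(scores))             # speaker_b
print(verify(scores, "speaker_b"))  # True
print(verify(scores, "speaker_a"))  # False
```

Identification is a choice among N candidates, while verification is a yes/no decision about a single claimed identity.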

SYSTEM CONCEPT:
This paper investigates a speaker recognition system over telephone lines using the
artificial neural network shown in Figure 1.

Figure 1: Block Diagram of the Speaker recognition system using an ANN
In this paper, the speaker recognition system reported is of the text-dependent type. The system
is trained on a group of people to be identified, with each person speaking the same phrase.
The voices are recorded on a standard 16-bit computer sound card from a telephone handset
receiver. Although the frequency of the human voice ranges from 0 kHz to 20 kHz, most of the signal
content lies in the 0.3 kHz to 4 kHz range. The frequency over telephone lines is limited to 0.3
kHz to 3.4 kHz, and this is the frequency band of interest in this paper. A sampling
rate of 16 kHz, which satisfies the Nyquist criterion, is therefore used. The voices are stored as sound
files on the computer. Digital processing techniques are used to convert the sound files into a
form that can be presented as input vectors to the neural network. The output of the neural network
verifies the speaker in the group.
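
The sampling-rate choice can be checked with a line of arithmetic: the Nyquist criterion requires a sampling rate above twice the highest frequency of interest. The sketch below uses the 3.4 kHz band limit and the 16 kHz sampling rate stated above; the function and variable names are illustrative.

```python
def min_sampling_rate_hz(max_signal_freq_hz: float) -> float:
    """Nyquist criterion: the sampling rate must exceed twice the highest frequency."""
    return 2.0 * max_signal_freq_hz


telephone_band_top_hz = 3400.0  # upper edge of the telephone band
sampling_rate_hz = 16000.0      # sampling rate used for recording

required_hz = min_sampling_rate_hz(telephone_band_top_hz)
print(required_hz)                     # 6800.0
print(sampling_rate_hz > required_hz)  # True
```

At 16 kHz the signal is sampled at more than twice the 6.8 kHz minimum, so the telephone band is captured with margin.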

Feature extraction

Feature extraction involves information retrieval from the audio signal.
Linear Predictive Coding
In this paper we propose to use Linear Predictive Coding (LPC), which is a modification of the
Discrete Fourier Transform (DFT). LPC analyzes the speech signal by estimating the formants,
removing their effects from the speech signal, and estimating the intensity and frequency of
the remaining buzz. The method employed is a difference equation, which expresses
each sample of the signal as a linear combination of previous samples. Such an
equation is called a linear predictor, which is why this is called Linear Predictive Coding.
The basic assumption behind LPC is the correlation between the n-th sample and the p
previous samples of the target signal. Namely, the n-th signal sample is represented as a linear
combination of the previous p samples, plus a residual representing the prediction error:

$$x(n) = -a_1 x(n-1) - a_2 x(n-2) - \dots - a_p x(n-p) + e(n)$$

This equation is an autoregressive formulation of the target signal.
The coefficients of the difference equation (the prediction coefficients) characterize the formants,
so the LPC system needs to estimate these coefficients. The estimate is obtained by minimizing
the mean-square error between the predicted signal and the actual signal.
It is more accurate than DFT.
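
One common way to carry out this minimization is the autocorrelation method solved with the Levinson-Durbin recursion. The paper does not say which solver it uses, so the sketch below is an assumption, and the function names and test signal are illustrative. It follows the sign convention of the equation above, x(n) = -a_1 x(n-1) - ... - a_p x(n-p) + e(n).

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[t] * x[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]


def lpc(x, order):
    """Estimate prediction coefficients a_1..a_p with Levinson-Durbin.

    Returns (coefficients, residual_energy), where the coefficients follow
    x(n) = -a_1 x(n-1) - ... - a_p x(n-p) + e(n).
    """
    r = autocorrelation(x, order)
    a = [1.0] + [0.0] * order  # a[0] is fixed at 1 by convention
    err = r[0]                 # prediction error energy at order 0
    for i in range(1, order + 1):
        if err <= 0.0:         # degenerate (e.g. all-zero) signal
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err         # reflection coefficient for this order
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)   # error energy shrinks at each order
    return a[1:], err


# Illustrative signal: a decaying exponential, x(n) = 0.9 * x(n-1).
signal = [0.9 ** n for n in range(200)]
coeffs, residual = lpc(signal, order=2)
print(coeffs, residual)
```

For this test signal each sample is 0.9 times the previous one, so the first coefficient comes out close to -0.9 and the second close to 0. In a speaker recognition setting, the coefficient vector for each speech frame would be a candidate input vector to the network.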
