(IJCSIS) International Journal of Computer Science and Information Security, Vol. 9, No. 11, November 2011



Fast Detection of H1N1 and H1N5 Viruses in DNA Sequence by using High Speed Time Delay Neural Networks

Hazem M. El-Bakry
Faculty of Computer Science & Information Systems, Mansoura University, EGYPT
helbakry20@yahoo.com

Nikos Mastorakis
Technical University of Sofia, BULGARIA


Abstract—Fast detection of biological viruses in DNA sequences is very important for the investigation of patients and for overcoming diseases. First, an intelligent algorithm to completely retrieve a DNA sequence is presented. DNA codes that may be missed during the splitting process are retrieved by using Hopfield neural networks. Then, a new approach for fast detection of biological viruses such as H1N1 and H1N5 in a DNA sequence is presented. The algorithm uses high speed time delay neural networks (HSTDNNs). The operation of these networks relies on performing cross correlation in the frequency domain between the input DNA sequence and the input weights of the neural networks. It is proved mathematically and practically that the number of computation steps required by the presented HSTDNNs is less than that needed by conventional time delay neural networks (CTDNNs). Simulation results using MATLAB confirm the theoretical computations.

Keywords- High Speed Neural Networks; Cross Correlation; Frequency Domain; H1N1 and H1N5 Detection

I. INTRODUCTION

A virus is a tiny bundle of genetic material - either DNA or RNA - carried in a shell called a viral coat, or capsid, which is made up of protein. Some viruses have an additional layer around this coat called an envelope. When a virus particle enters a cell and begins to reproduce itself, this is called a viral infection. The virus is usually very, very small compared to the size of a living cell. The information carried in the virus's DNA allows it to take over the operation of the cell, converting it into a factory that makes more copies of the virus. For example, the polio virus can make over one million copies of itself inside a single infected human intestinal cell [32-35].

All viruses exist only to make more viruses. With the possible exception of bacterial viruses, which can kill harmful bacteria, all viruses are considered harmful, because their reproduction causes the death of the cells they enter. If a virus contains DNA, it inserts its genetic material into the host cell's DNA. If the virus contains RNA, it must first turn its RNA into DNA using the host cell's machinery before inserting it into the host DNA. Once it has taken over the cell, the viral genes are copied thousands of times, using the machinery the host cell would ordinarily use to reproduce its own DNA. The host cell is then forced to encapsulate this viral DNA into new protein shells; the new viruses created are then released, destroying the cell [32-35].

All living things are susceptible to viral infections: plants, animals, and bacteria can all be infected by a virus specific to that type of organism. Moreover, within an individual species there may be a hundred or more different viruses that can infect that species alone. There are viruses that infect only humans (for example, smallpox), viruses that infect humans and one or two additional kinds of animals (for example, influenza), viruses that infect only a certain kind of plant (for example, the tobacco mosaic virus), and some viruses that infect only a particular species of bacteria (for example, the bacteriophage that infects E. coli) [32-35].

Sometimes when a virus reproduces, mutations occur. The offspring that have been changed by the mutation may no longer be infectious. But a virus replicates itself thousands of times, so there will usually be some offspring that are still infectious yet sufficiently different from the parent virus that existing vaccines no longer kill them. The influenza virus can do this, which is why flu vaccines for last year's flu do not work the next year. The common cold virus changes so quickly that vaccines are useless; the cold you have today will be a different strain than the cold you had last month! [31-34]

For efficient treatment of patients in real time, it is important to detect biological viruses such as H1N1 and H1N5. Recently, time delay neural networks have shown very good results in different areas such as automatic control, speech recognition, blind equalization of time-varying channels, and other communication applications. The main objective of this research is to reduce the response time of time delay neural networks. The idea is to perform the testing process in the frequency domain instead of the time domain. Our approach was successfully applied to fast detection of computer viruses as shown in [4]. Sub-image detection by using fast neural networks (FNNs) was proposed in [5,6]. Furthermore, it was used for fast face detection [7,10,12] and fast iris detection [11].
Another idea to further increase the speed of FNNs through image decomposition was suggested in [10]. In addition, the approach was applied to fast prediction of new data as described in [1,3].

FNNs for detecting a certain code in a one dimensional serial stream of sequential data were described in [1,2,3,4,8,14,15,20,23,27,28,29]. Compared with conventional neural networks, FNNs based on cross correlation between the tested data and the input weights of the neural networks in the frequency domain showed a significant reduction in the number of computation steps required for data detection [1-29]. Here, we make use of the theory of FNNs implemented in the frequency domain to increase the speed of time delay neural networks for biological virus detection [2]. The idea of moving the testing process from the time domain to the frequency domain is applied to time delay neural networks. Theoretical and practical results show that the proposed HSTDNNs are faster than CTDNNs. Retrieval of missed DNA codes by using Hopfield neural networks is introduced in Section II. Section III presents HSTDNNs for detecting biological viruses in DNA sequences. A complexity analysis and experimental results for fast biological virus detection by using HSTDNNs are given in Section IV.

II. RETRIEVAL OF MISSED DNA CODES BY USING HOPFIELD NEURAL NETWORKS

One of the most important functions of our brain is the laying down and recall of memories. It is difficult to imagine how we could function without both short and long term memory. The absence of short term memory would render most tasks extremely difficult if not impossible - life would be punctuated by a series of one time images with no logical connection between them. Equally, the absence of any means of long term memory would ensure that we could not learn by past experience. Indeed, much of our impression of self depends on remembering our past history [36-40].

Our memories function in what is called an associative or content-addressable fashion. That is, a memory does not exist in some isolated fashion, located in a particular set of neurons. All memories are in some sense strings of memories - you remember someone in a variety of ways: by the color of their hair or eyes, the shape of their nose, their height, the sound of their voice, or perhaps by the smell of a favorite perfume. Thus memories are stored in association with one another. These different sensory units lie in completely separate parts of the brain, so it is clear that the memory of the person must be distributed throughout the brain in some fashion. Indeed, PET scans reveal that during memory recall there is a pattern of brain activity in many widely different parts of the brain [36-43].

Notice also that it is possible to access the full memory (all aspects of the person's description, for example) by initially remembering just one or two of these characteristic features. We access the memory by its contents, not by where it is stored in the neural pathways of the brain. This is very powerful; given even a poor photograph of that person, we are quite good at reconstructing the person's face quite accurately. This is very different from a traditional computer, where specific facts are located in specific places in computer memory. If only partial information is available about this location, the fact or memory cannot be recalled at all [35-42].

Theoretical physicists are an unusual lot, acting like gunslingers in the old West, anxious to prove themselves against a really good problem. And there aren't that many really good problems that might be solvable. As soon as Hopfield pointed out the connection between a new and important problem (network models of brain function) and an old and well-studied problem (the Ising model), many physicists rode into town, so to speak, with the intention of shooting the problem full of holes and then, the brain understood, riding off into the sunset looking for a newer, tougher problem. (Who was that masked physicist?)

Hopfield made the portentous comment 'This case is isomorphic with an Ising model,' thereby allowing a deluge of physical theory (and physicists) to enter neural network modeling. This flood of new participants transformed the field. In 1974, Little and Shaw made a similar identification of neural network dynamics with the Ising model but, for whatever reason, their idea was not widely picked up at the time. Unfortunately, the problem of brain function turned out to be more difficult than expected, and it is still unsolved, although a number of interesting results about Hopfield nets were proved. At present, many of the traveling theoreticians have traveled on [38].

The Hopfield neural network is a simple artificial network which is able to store certain memories or patterns in a manner rather similar to the brain: the full pattern can be recovered if the network is presented with only partial information. Furthermore, there is a degree of stability in the system - if just a few of the connections between nodes (neurons) are severed, the recalled memory is not too badly corrupted, and the network can respond with a "best guess". Of course, a similar phenomenon is observed in the brain: during an average lifetime many neurons will die, but we do not suffer a catastrophic loss of individual memories; our brains are quite robust in this respect (by the time we die we may have lost 20 percent of our original neurons) [44-57].

The nodes in the network are vast simplifications of real neurons - they can exist in only one of two possible "states", firing or not firing. Every node is connected to every other node with some strength. At any instant of time a node will change its state (i.e., start or stop firing) depending on the inputs it receives from the other nodes [44-57].

If we start the system off with any general pattern of firing and non-firing nodes, then this pattern will in general change with time. To see this, think of starting the network with just one firing node. This will send a signal to all the other nodes via its connections, so that a short time later some of these other nodes will fire. These new firing nodes will then excite others after a further short time interval, and a whole cascade of different firing patterns will occur.
One might imagine that the firing pattern of the network would change in a complicated, perhaps random way with time. The crucial property of the Hopfield network which renders it useful for simulating memory recall is the following: we are guaranteed that the pattern will settle down, after a long enough time, to some fixed pattern. Certain nodes will be always "on" and others "off". Furthermore, it is possible to arrange that these stable firing patterns of the network correspond to the desired memories we wish to store! [44-57]

The reason for this is somewhat technical, but we can proceed by analogy. Imagine a ball rolling on some bumpy surface. We imagine the position of the ball at any instant to represent the activity of the nodes in the network. Memories will be represented by special patterns of node activity corresponding to wells in the surface. Thus, if the ball is let go, it will execute some complicated motion, but we are certain that eventually it will end up in one of the wells of the surface. We can think of the height of the surface as representing the energy of the ball. We know that the ball will seek to minimize its energy by seeking out the lowest spots on the surface - the wells. Furthermore, the well it ends up in will usually be the one it started off closest to. In the language of memory recall, if we start the network off with a pattern of firing which approximates one of the "stable firing patterns" (memories), it will "under its own steam" end up in the nearby well in the energy surface, thereby recalling the original perfect memory. The smart thing about the Hopfield network is that there exists a rather simple way of setting up the connections between nodes in such a way that any desired set of patterns can be made "stable firing patterns". Thus any set of memories can be burned into the network at the beginning. Then, if we kick the network off with any old set of node activity, we are guaranteed that a "memory" will be recalled. Not too surprisingly, the memory that is recalled is the one which is "closest" to the starting pattern. In other words, we can give the network a corrupted image or memory and the network will "all by itself" try to reconstruct the perfect image. Of course, if the input image is sufficiently poor, it may recall the incorrect memory - the network can become "confused" - just like the human brain. We know that when we try to remember someone's telephone number we will sometimes produce the wrong one! Notice also that the network is reasonably robust: if we change a few connection strengths just a little, the recalled images are "roughly right". We do not lose any of the images completely [44-57].

As with the Linear Associative Memory, the "stored patterns" are represented by the weights. To be effective, the patterns should be reasonably orthogonal. The basic Hopfield model can be described as follows [38]:

• N neurons, fully connected in a cyclic fashion.
• Values are +1 and -1.
• Each neuron has a weighted input from all other neurons.
• The weight matrix w is symmetric (w_{ij} = w_{ji}) with zero diagonal terms (self-weights w_{ii} = 0).
• The activation function on each neuron i is:

    f(net) = \mathrm{sgn}(net) = \begin{cases} 1 & \text{if } net > 0 \\ -1 & \text{if } net < 0 \end{cases}    (1)

  where:

    net_i = \sum_j w_{ij} x_j    (2)

• If net = 0, then the output is the same as before, by convention.
• There are no separate thresholds or biases. However, these could be represented by units that have all weights = 0 and thus never change their output.
• The energy function is defined as:

    E(y_1, y_2, \ldots, y_n) = -\sum_i \sum_j w_{ij} y_i y_j    (3)

  where (y_1, y_2, \ldots, y_n) are the outputs, w_{ij} is the weight between neurons i and j, and the double sum runs over i and j.

Different DNA patterns are stored in the Hopfield neural network. In the testing process, the missed codes (if any) are retrieved.
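The storage and recall procedure just described is small enough to state directly in code. The following Python/NumPy sketch implements Hebbian storage with zero self-weights, the sgn update of Eqs. (1)-(2), and the energy of Eq. (3); the 2-bit bipolar encoding of the DNA letters, the example patterns, and all names are illustrative assumptions of ours, not details taken from the paper.

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian storage: W is the sum of outer products of the stored
    bipolar (+1/-1) patterns, symmetric with zero self-weights (w_ii = 0)."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0.0)
    return W

def energy(W, y):
    """Eq. (3): E = -sum_i sum_j w_ij y_i y_j; it never increases under
    the asynchronous updates below, so the state settles into a well."""
    return -y @ W @ y

def recall(W, x, steps=200, seed=0):
    """Asynchronous updates with f(net) = sgn(net) from Eqs. (1)-(2);
    by convention a unit with net = 0 keeps its previous output."""
    rng = np.random.default_rng(seed)
    y = x.copy()
    for _ in range(steps):
        i = rng.integers(len(y))
        net = W[i] @ y
        if net != 0:
            y[i] = 1 if net > 0 else -1
    return y

# Illustrative use: encode A, C, G, T as 2-bit bipolar codes, store two
# short DNA patterns, then retrieve one from a copy with missed codes.
CODE = {'A': (1, 1), 'C': (1, -1), 'G': (-1, 1), 'T': (-1, -1)}

def encode(seq):
    return np.array([bit for ch in seq for bit in CODE[ch]])

stored = np.vstack([encode("ACGTACGT"), encode("TTGGCCAA")])
W = train_hopfield(stored)

probe = encode("ACGTACGT")
probe[:4] = -probe[:4]                  # corrupt the first two letters
restored = recall(W, probe)
print(np.array_equal(restored, stored[0]))   # expected: True
```

Because the two stored example patterns happen to be orthogonal, recall from a probe with a few flipped bits settles into the nearest stored pattern, which is exactly the behavior used here to retrieve missed DNA codes.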
III. FAST BIOLOGICAL VIRUS DETECTION BY USING HSTDNNS

Finding a biological virus such as H1N1 or H1N5 in a DNA sequence is a searching problem. First, neural networks are trained to distinguish codes which contain viruses from codes that do not; this training is done in the time domain. In the biological virus detection phase, each position in the DNA sequence is tested for the presence or absence of a biological virus code. At each position in the input DNA one dimensional matrix, each sub-matrix is multiplied by a window of weights which has the same size as the sub-matrix. The outputs of the neurons in the hidden layer are multiplied by the weights of the output layer. When the final output is 10, the sub-matrix under test contains H1N1. When the final output is 01, H1N5 is detected. Otherwise, there is no virus. Thus, we may conclude that this searching problem is a cross correlation between the incoming serial data and the weights of the neurons in the hidden layer.

The convolution theorem in mathematical analysis says that a convolution of f with h is identical to the result of the following steps: let F and H be the Fourier transforms of f and h in the frequency domain; multiply F and H* point by point in the frequency domain and then transform this product into the spatial domain via the inverse Fourier transform. As a result, these cross correlations can be represented by a product in the frequency domain, so by using cross correlation in the frequency domain a speed up of an order of magnitude can be achieved during the detection process [1-29]. Assume that the size of the biological virus code is 1xn. In the biological virus detection phase, a sub-matrix I of size 1xn (sliding window) is extracted from the tested matrix, which has a size of 1xN. Such a sub-matrix, which may be a biological virus code, is fed to the neural network. Let Wi be the matrix of weights between the input sub-matrix and the hidden layer.
This vector has a size of 1xn and can be represented as a 1xn matrix. The output of hidden neuron h(i) can be calculated as follows [1-7]:

    h_i = g\left( \sum_{k=1}^{n} W_i(k)\, I(k) + b_i \right)    (4)

where g is the activation function and b_i is the bias of each hidden neuron i. Equation (4) represents the output of each hidden neuron for a particular sub-matrix I. It can be extended to the whole input matrix Z as follows [1-6]:

    h_i(u) = g\left( \sum_{k=-n/2}^{n/2} W_i(k)\, Z(u+k) + b_i \right)    (5)

Eq. (5) represents a cross correlation operation. Given any two functions f and d, their cross correlation can be obtained by [31]:

    d(x) \otimes f(x) = \sum_{n=-\infty}^{\infty} f(x+n)\, d(n)    (6)

Therefore, Eq. (5) may be written as follows [1-7]:

    h_i = g\left( W_i \otimes Z + b_i \right)    (7)

where h_i is the output of hidden neuron i, and h_i(u) is the activity of hidden unit i when the sliding window is located at position u, with u \in [N-n+1].

Now, the above cross correlation can be expressed in terms of the one dimensional Fast Fourier Transform as follows [1-7]:

    W_i \otimes Z = F^{-1}\left( F(Z) \cdot F^{*}(W_i) \right)    (8)

Hence, by evaluating this cross correlation, a speed up ratio can be obtained relative to conventional neural networks. Also, the final output of the neural network can be evaluated as follows:

    O(u) = g\left( \sum_{i=1}^{q} W_o(i)\, h_i(u) + b_o \right)    (9)

where q is the number of neurons in the hidden layer, O(u) is the output matrix (corresponding to the two output neurons) of the neural network when the sliding window is located at position u in the input matrix Z, and Wo is the weight matrix between the hidden and output layers.
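To make the frequency-domain testing step concrete, here is a minimal NumPy sketch of Eqs. (7)-(9). It assumes a single hidden layer of q neurons, tanh standing in for the unspecified activation g, and weight vectors zero-padded to length N as required in Section IV; the dimensions and names are illustrative, not taken from the paper.

```python
import numpy as np

def hidden_activations(Z, W_hidden, b_hidden):
    """Eq. (8): W_i (x) Z = F^-1( F(Z) . F*(W_i) ), evaluated for every
    window position u at once. Z is the 1xN sequence; W_hidden is q x n."""
    N = Z.size
    FZ = np.fft.fft(Z)
    H = []
    for w, b in zip(W_hidden, b_hidden):
        w_pad = np.zeros(N)
        w_pad[: w.size] = w                 # extend weights with N-n zeros
        corr = np.fft.ifft(FZ * np.conj(np.fft.fft(w_pad))).real
        H.append(np.tanh(corr + b))         # g applied per Eq. (7)
    return np.array(H)                      # q x N matrix of h_i(u)

def outputs(H, W_out, b_out):
    """Eq. (9): the two output neurons evaluated at every position u.
    Thresholding these gives the 10 (H1N1) / 01 (H1N5) decision."""
    return np.tanh(W_out @ H + b_out[:, None])

# Sanity check against the time-domain sliding window (CTDNN behaviour):
rng = np.random.default_rng(1)
n, N, q = 8, 64, 3
Z = rng.standard_normal(N)
W_h = rng.standard_normal((q, n))
b_h = rng.standard_normal(q)
H = hidden_activations(Z, W_h, b_h)
u = 5
direct = np.tanh(W_h @ Z[u:u + n] + b_h)    # one window, Eq. (5) directly
assert np.allclose(H[:, u], direct)
```

The final assert confirms that, away from the circular wrap-around region near the end of the sequence, the frequency-domain result h_i(u) agrees with the time-domain sliding window of Eq. (5).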
IV. COMPLEXITY ANALYSIS OF HSTDNNS FOR BIOLOGICAL VIRUS DETECTION

The complexity of cross correlation in the frequency domain can be analyzed as follows:

1- For a tested matrix of 1xN elements, the 1D-FFT requires a number of complex computation steps equal to Nlog2N [30]. Also, the same number of complex computation steps is required for computing the 1D-FFT of the weight matrix at each neuron in the hidden layer.

2- At each neuron in the hidden layer, the inverse 1D-FFT is computed. Therefore, q backward and (1+q) forward transforms have to be computed, and for a given matrix under test the total number of operations required to compute the 1D-FFTs is (2q+1)Nlog2N.

3- The number of computation steps required by HSTDNNs is counted in complex operations and must be converted into real ones. It is known that the one dimensional Fast Fourier Transform requires (N/2)log2N complex multiplications and Nlog2N complex additions [30]. Every complex multiplication is realized by six real floating point operations, and every complex addition is implemented by two real floating point operations. Therefore, the total number of computation steps required to obtain the 1D-FFT of a 1xN matrix is:

    \rho = 6\left( (N/2)\log_2 N \right) + 2\left( N\log_2 N \right)    (10)

which may be simplified to:

    \rho = 5N\log_2 N    (11)

4- Both the input and the weight matrices should be dot multiplied in the frequency domain. Thus, a number of complex computation steps equal to qN should be considered, which means 6qN real operations are added to the number of computation steps required by HSTDNNs.

5- In order to perform cross correlation in the frequency domain, the weight matrix must be extended to have the same size as the input matrix. So, a number of zeros equal to (N-n) must be added to the weight matrix. This requires a total of q(N-n) real computation steps for all neurons. Moreover, after computing the FFT of the weight matrix, the conjugate of this matrix must be obtained, adding qN real computation steps for all neurons. Also, N real computation steps are required to create the butterfly complex numbers (e^{-jk(2\pi n/N)}), where 0 < K < L. These (N/2) complex numbers are multiplied by the elements of the input matrix or by previous complex numbers during the computation of the FFT; creating a complex number requires two real floating point operations. Thus, the total number of computation steps required for HSTDNNs becomes:

    \sigma = (2q+1)(5N\log_2 N) + 6qN + q(N-n) + qN + N    (12)

which can be reformulated as:

    \sigma = (2q+1)(5N\log_2 N) + q(8N-n) + N    (13)

6- Using a sliding window of size 1xn over the same matrix of 1xN elements, q(2n-1)(N-n+1) computation steps are required when using CTDNNs for biological virus detection or processing (n) input data. The theoretical speed up factor \eta can be evaluated as follows:

    \eta = \frac{q(2n-1)(N-n+1)}{(2q+1)(5N\log_2 N) + q(8N-n) + N}    (14)
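The counting argument above is easy to check numerically. The short sketch below evaluates Eqs. (12)-(14); the value q = 30 for the number of hidden neurons is our inference from the table entries (the paper does not state q explicitly), and with n = 400 the printed ratios agree with the first rows of Table I to within rounding.

```python
import math

def ctdnn_steps(N, n, q):
    """Step 6: q(2n-1)(N-n+1) operations for the time-domain window."""
    return q * (2 * n - 1) * (N - n + 1)

def hstdnn_steps(N, n, q):
    """Eq. (13): (2q+1)(5N log2 N) + q(8N - n) + N real operations."""
    return (2 * q + 1) * 5 * N * math.log2(N) + q * (8 * N - n) + N

def speed_up(N, n, q):
    """Eq. (14): theoretical speed up factor eta."""
    return ctdnn_steps(N, n, q) / hstdnn_steps(N, n, q)

for N in (10000, 40000, 90000):
    print(N, round(speed_up(N, n=400, q=30), 4))
# Compare with the first three rows of Table I (5.3613, 4.8397, 4.5365).
```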
CTDNNs and HSTDNNs are shown in Figures 1 and 2, respectively.

Time delay neural networks accept serial input data of fixed size (n). Therefore, the number of input neurons equals (n). Instead of treating (n) inputs at a time, the proposed new approach is to collect all the incoming data together in a long vector (for example, 100xn). The input data is then tested by the time delay neural networks as a single pattern of length L (L = 100xn). Such a test is performed in the frequency domain as described before.

The theoretical speed up ratio for searching for a short code of length (n) in a long input vector of length (L) using time delay neural networks is listed in Tables I, II, and III. Also, the practical speed up ratio for manipulating matrices of different sizes (L) and different sized weight matrices (n), using a 2.7 GHz processor and MATLAB, is shown in Table IV.

An interesting point is that the memory capacity is reduced when using HSTDNNs, because the number of variables is reduced compared with CTDNNs.

V. CONCLUSION

To facilitate the investigation of patients and help overcome diseases, fast detection of biological viruses in DNA sequences has been presented. Missed DNA codes have been retrieved by using Hopfield neural networks. After that, a new approach for fast detection of biological viruses such as H1N1 and H1N5 in DNA sequences has been introduced. This strategy has been realized by using our design for HSTDNNs. Theoretical computations have shown that HSTDNNs require fewer computation steps than conventional ones. This has been achieved by applying cross correlation in the frequency domain between the input data and the weights of the neural networks. Simulation results using MATLAB have confirmed this proof. The proposed algorithm can be applied to detect other biological viruses in DNA sequences as well.

REFERENCES

[1] Hazem M. El-Bakry, and Nikos Mastorakis, "An Intelligent Approach for Fast Detection of Biological Viruses in DNA Sequence," Proc. of 10th WSEAS International Conference on Applications of Computer Engineering (ACE '11), Spain, March 24-26, 2011, pp. 237-244.
[2] Hazem M. El-Bakry, and Nikos Mastorakis, "A New Approach for Prediction by using Integrated Neural Networks," Proc. of 5th WSEAS International Conference on Computer Engineering and Applications (CEA '11), Puerto Morelos, Mexico, Jan. 29-31, 2011, pp. 17-28.
[3] Hazem M. El-Bakry, "Fast Virus Detection by using High Speed Time Delay Neural Networks," Journal of Computer Virology, vol. 6, no. 2, 2010, pp. 115-122.
[4] Hazem M. El-Bakry, "An Efficient Algorithm for Pattern Detection using Combined Classifiers and Data Fusion," Information Fusion Journal, vol. 11, 2010, pp. 133-148.
[5] Hazem M. El-Bakry, "A Novel High Speed Neural Model for Fast Pattern Recognition," Soft Computing Journal, vol. 14, no. 6, 2010, pp. 647-666.
[6] Hazem M. El-Bakry, and Nikos Mastorakis, "Fast Packet Detection by using High Speed Time Delay Neural Networks," Proc. of the 10th WSEAS Int. Conference on Multimedia Systems & Signal Processing, Hangzhou University, China, April 11-13, 2010, pp. 222-227.
[7] Hazem M. El-Bakry, "New Fast Principal Component Analysis for Real-Time Face Detection," MG&V Journal, vol. 18, no. 4, 2009, pp. 405-426.
[8] Hazem M. El-Bakry, and Mohamed Hamada, "High Speed Time Delay Neural Networks for Detecting DNA Coding Regions," Springer, Lecture Notes on Artificial Intelligence (LNAI 5711), 2009, pp. 334-342.
[9] Hazem M. El-Bakry, "New Faster Normalized Neural Networks for Sub-Matrix Detection using Cross Correlation in the Frequency Domain and Matrix Decomposition," Applied Soft Computing Journal, vol. 8, issue 2, March 2008, pp. 1131-1149.
[10] Hazem M. El-Bakry, "Face detection using fast neural networks and image decomposition," Neurocomputing Journal, vol. 48, 2002, pp. 1039-1046.
[11] Hazem M. El-Bakry, "Human Iris Detection Using Fast Cooperative Modular Neural Nets and Image Decomposition," Machine Graphics & Vision Journal (MG&V), vol. 11, no. 4, 2002, pp. 498-512.
[12] Hazem M. El-Bakry, "Automatic Human Face Recognition Using Modular Neural Networks," Machine Graphics & Vision Journal (MG&V), vol. 10, no. 1, 2001, pp. 47-73.
[13] Hazem M. El-Bakry, "A New Neural Design for Faster Pattern Detection Using Cross Correlation and Matrix Decomposition," Neural World Journal, vol. 19, no. 2, 2009, pp. 131-164.
[14] Hazem M. El-Bakry, and H. Stoyan, "FNNs for Code Detection in Sequential Data Using Neural Networks for Communication Applications," Proc. of the First International Conference on Cybernetics and Information Technologies, Systems and Applications: CITSA 2004, 21-25.
[15] Hazem M. El-Bakry, "New High Speed Time Delay Neural Networks Using Cross Correlation Performed in the Frequency Domain," Neurocomputing Journal, vol. 69, October 2006, pp. 2360-2363.
[16] Hazem M. El-Bakry, "A New High Speed Neural Model for Character Recognition Using Cross Correlation and Matrix Decomposition," International Journal of Signal Processing, vol. 2, no. 3, 2005, pp. 183-202.
[17] Hazem M. El-Bakry, "New High Speed Normalized Neural Networks for Fast Pattern Discovery on Web Pages," International Journal of Computer Science and Network Security, vol. 6, no. 2A, February 2006, pp. 142-152.
[18] Hazem M. El-Bakry, "Fast Iris Detection for Personal Verification Using Modular Neural Networks," Lecture Notes in Computer Science, Springer, vol. 2206, October 2001, pp. 269-283.
[19] Hazem M. El-Bakry, and Qiangfu Zhao, "Fast Normalized Neural Processors for Pattern Detection Based on Cross Correlation Implemented in the Frequency Domain," Journal of Research and Practice in Information Technology, vol. 38, no. 2, May 2006, pp. 151-170.
[20] Hazem M. El-Bakry, and Qiangfu Zhao, "High Speed Time Delay Neural Networks," International Journal of Neural Systems, vol. 15, no. 6, December 2005, pp. 445-455.
[21] Hazem M. El-Bakry, and Qiangfu Zhao, "Speeding-up Normalized Neural Networks for Face/Object Detection," Machine Graphics & Vision Journal (MG&V), vol. 14, no. 1, 2005, pp. 29-59.
[22] Hazem M. El-Bakry, and Qiangfu Zhao, "A New Technique for Fast Pattern Recognition Using Normalized Neural Networks," WSEAS Transactions on Information Science and Applications, issue 11, vol. 2, November 2005, pp. 1816-1835.
[23] Hazem M. El-Bakry, and Qiangfu Zhao, "Fast Complex Valued Time Delay Neural Networks," International Journal of Computational Intelligence, vol. 2, no. 1, 2005, pp. 16-26.
[24] Hazem M. El-Bakry, and Qiangfu Zhao, "Fast Pattern Detection Using Neural Networks Realized in Frequency Domain," Enformatika Transactions on Engineering, Computing, and Technology, February 25-27, 2005, pp. 89-92.
[25] Hazem M. El-Bakry, and Qiangfu Zhao, "Sub-Image Detection Using Fast Neural Processors and Image Decomposition," Enformatika Transactions on Engineering, Computing, and Technology, February 25-27, 2005, pp. 85-88.
[26] Hazem M. El-Bakry, and Qiangfu Zhao, "Face Detection Using Fast Neural Processors and Image Decomposition," International Journal of Computational Intelligence, vol. 1, no. 4, 2004, pp. 313-316.
[27] Hazem M. El-Bakry, and Qiangfu Zhao, "A Fast Neural Algorithm for Serial Code Detection in a Stream of Sequential Data," International Journal of Information Technology, vol. 2, no. 1, 2005, pp. 71-90.
[28] Hazem M. El-Bakry, and Nikos Mastorakis, "Fast Code Detection Using High Speed Time Delay Neural Networks," Lecture Notes in Computer Science, Springer, vol. 4493, Part III, May 2007, pp. 764-773.
[29] Hazem M. El-Bakry, and Nikos Mastorakis, "A New Fast Forecasting Technique using High Speed Neural Networks," WSEAS Transactions on Signal Processing, issue 10, vol. 4, October 2008, pp. 573-595.
[30] J. W. Cooley, and J. W. Tukey, "An algorithm for the machine calculation of complex Fourier series," Math. Comput., vol. 19, 1965, pp. 297-301.
[31] R. Klette, and P. Zamperoni, "Handbook of Image Processing Operators," John Wiley & Sons Ltd, 1996.
[32] http://www.worsleyschool.net/science/files/virus/page.html
[33] http://en.wikipedia.org/wiki/Virus
[34] http://medical-dictionary.thefreedictionary.com/Biological+virus
[35] http://www.emc.maricopa.edu/faculty/farabee/biobk/biobookdiversity_1.html
[36] http://www.learnartificialneuralnetworks.com/hopfield.html
[37] http://en.wikipedia.org/wiki/Hopfield_net
[38] J. J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities," Proceedings of the National Academy of Sciences of the USA, vol. 79, no. 8, April 1982, pp. 2554-2558.
[39] http://reference.wolfram.com/applications/neuralnetworks/NeuralNetworkTheory/2.7.0.html
[40] http://www.engineeringletters.com/issues_v14/issue_1/EL_14_1_23.pdf
[41] http://www.codeproject.com/KB/recipes/HopfieldNeuralNetwork.aspx
[42] http://web-us.com/brain/neur_hopfield.html
[43] http://www.heatonresearch.com/articles/2/page5.html
[44] Hongmei He, and Ondrej Sykora, "A Hopfield Neural Network Model for the Outerplanar Drawing Problem," International Journal of Computer Science, vol. 32, no. 4, 2006, available online: http://www.iaeng.org/IJCS/issues_v32/issue_4/IJCS_32_4_17.pdf
[45] S. Amari, "Learning Patterns and Pattern Sequences by Self-Organizing Nets of Threshold Elements," IEEE Transactions on Computers, vol. C-21, no. 11, November 1972, pp. 1197-1206.
[46] S. Amari, and K. Maginu, "Statistical Neurodynamics of Associative Memory," Neural Networks, vol. 1, 1988, pp. 63-73.
[47] D. Hebb, The Organization of Behavior, New York: John Wiley and Sons, 1949.
[48] J. Hopfield, "Neurons with Graded Response Have Collective Computational Properties Like Those of Two-State Neurons," Proceedings of the National Academy of Sciences USA, vol. 81, May 1984, pp. 3088-3092.
[49] J. Hopfield, and D. Tank, "Computing with neural circuits: A model," Science, vol. 233, 1986, pp. 625-633.
[50] I. Arizono, A. Yamamoto, and H. Ohta, "Scheduling for minimizing total actual flow time by neural networks," International Journal of Production Research, vol. 30, no. 3, March 1992, pp. 503-511.
[51] B. Lee, and B. Sheu, "Modified Hopfield Neural Networks for Retrieving the Optimal Solution," IEEE Transactions on Neural Networks, vol. 2, no. 1, January 1991, pp. 137-142.
[52] R. Lippmann, "An Introduction to Computing with Neural Nets," IEEE Acoustics, Speech and Signal Processing Magazine, April 1987, pp. 4-22.
[53] M. Lu, Y. Zhan, and G. Mu, "Bipolar Optical Neural Network with Adaptive Threshold," Optik, vol. 91, no. 4, 1992, pp. 178-182.
[54] W. McCulloch, and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," Bulletin of Mathematical Biophysics, vol. 5, 1943, pp. 115-133.
[55] F. Rosenblatt, "The perceptron: A probabilistic model for information storage and organization in the brain," Psychological Review, vol. 65, 1958, pp. 386-408.
[56] F. Rosenblatt, Principles of Neurodynamics, New York: Spartan Books, 1959.
[57] D. Schonfeld, "On the Hysteresis and Robustness of Hopfield Neural Networks," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 2, November 1993, pp. 745-748.

Figure 1. CTDNNs. (Diagram: input neurons I1, I2, ..., In receive serial input data 1:N in groups of (n) elements, shifted by a step of one element each time; cross correlation in the time domain between the (n) input data and the weights of the hidden layer; the hidden layer feeds two output neurons, O/P1 and O/P2.)




Figure 2. HSTDNNs. (Diagram: input neurons I1, I2, ..., IN receive the total (N) input data at once; cross correlation in the frequency domain between the total (N) input data and the weights of the hidden layer; the hidden layer feeds two output neurons, O/P1 and O/P2.)




TABLE I: THE THEORETICAL SPEED UP RATIO FOR DETECTING H1N1 OR H1N5 (LENGTH OF BIOLOGICAL VIRUS CODE = 400).

Length of serial data   Computation steps (CTDNNs)   Computation steps (HSTDNNs)   Speed up ratio
10000                   2.3014e+008                  4.2926e+007                   5.3613
40000                   0.9493e+009                  1.9614e+008                   4.8397
90000                   2.1478e+009                  4.7344e+008                   4.5365
160000                  3.8257e+009                  8.8219e+008                   4.3366
250000                  5.9830e+009                  1.4275e+009                   4.1912
360000                  8.6195e+009                  2.1134e+009                   4.0786
490000                  1.1735e+010                  2.9430e+009                   3.9876
640000                  1.5331e+010                  3.9192e+009                   3.9119


TABLE II: THE THEORETICAL SPEED UP RATIO FOR DETECTING H1N1 OR H1N5 (LENGTH OF BIOLOGICAL VIRUS CODE = 625).

Length of serial data   Computation steps (CTDNNs)   Computation steps (HSTDNNs)   Speed up ratio
10000                   3.5132e+008                  4.2919e+007                   8.1857
40000                   1.4754e+009                  1.9613e+008                   7.5226
90000                   3.3489e+009                  4.7343e+008                   7.0737
160000                  0.5972e+010                  8.8218e+008                   6.7694
250000                  0.9344e+010                  1.4275e+009                   6.5458
360000                  1.3466e+010                  2.1134e+009                   6.3717
490000                  1.8337e+010                  2.9430e+009                   6.2306
640000                  2.3958e+010                  3.9192e+009                   6.1129
TABLE III: THE THEORETICAL SPEED UP RATIO FOR DETECTING H1N1 OR H1N5 (LENGTH OF BIOLOGICAL VIRUS CODE = 900).

Length of serial data   Computation steps (CTDNNs)   Computation steps (HSTDNNs)   Speed up ratio
10000                   4.9115e+008                  4.2911e+007                   11.4467
40000                   2.1103e+009                  1.9612e+008                   10.7600
90000                   4.8088e+009                  4.7343e+008                   10.1575
160000                  0.8587e+010                  8.8217e+008                   9.7336
250000                  1.3444e+010                  1.4275e+009                   9.4178
360000                  1.9381e+010                  2.1134e+009                   9.1705
490000                  2.6397e+010                  2.9430e+009                   8.9693
640000                  3.4493e+010                  3.9192e+009                   8.8009


TABLE IV: PRACTICAL SPEED UP RATIO FOR DETECTING H1N1 OR H1N5.

Length of serial data   Speed up ratio (n=400)   Speed up ratio (n=625)   Speed up ratio (n=900)
10000                   8.94                     12.97                    17.61
40000                   8.60                     12.56                    17.22
90000                   8.33                     12.28                    16.80
160000                  8.07                     12.07                    16.53
250000                  7.95                     17.92                    16.30
360000                  7.79                     11.62                    16.14
490000                  7.64                     11.44                    16.00
640000                  7.04                     11.27                    15.89

				