# Toward an Automatic Parallel Tool for Solving Systems of Nonlinear Equations

Antonio M. Vidal
Jesús Peinado

Departamento de Sistemas Informáticos y Computación
Universidad Politécnica de Valencia
## Solving Systems of Nonlinear Equations

Given $F: \mathbb{R}^n \to \mathbb{R}^n$, find $x^* \in \mathbb{R}^n$ such that $F(x^*) = 0$.

Newton's iteration: $x_+ = x_c - J(x_c)^{-1} F(x_c)$
## Newton's Algorithm

1. Choose $x^{(0)}$
2. Evaluate $F(x^{(0)})$
3. While $\|F(x^{(i)})\| > bound$:
   - Compute the Jacobian matrix $J(x^{(i)})$
   - Solve $J(x^{(i)})\, s_i = -F(x^{(i)})$
   - $x^{(i+1)} = x^{(i)} + s_i$
   - Evaluate $F(x^{(i+1)})$
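The loop above can be sketched in a few lines of NumPy (a minimal dense-Jacobian sketch; the toy function `F`, its Jacobian `J`, and the starting point are assumptions chosen only for illustration):

```python
import numpy as np

def newton(F, J, x0, bound=1e-10, max_iter=50):
    """Newton's algorithm: solve J(x_i) s_i = -F(x_i), then update
    x_{i+1} = x_i + s_i until ||F(x_i)|| <= bound."""
    x = np.asarray(x0, dtype=float)
    Fx = F(x)
    for _ in range(max_iter):
        if np.linalg.norm(Fx) <= bound:
            break
        s = np.linalg.solve(J(x), -Fx)  # direct (LU) solve of the linear system
        x = x + s
        Fx = F(x)
    return x

# Toy 2x2 system: x^2 + y^2 = 2 and x = y, with root (1, 1)
F = lambda v: np.array([v[0]**2 + v[1]**2 - 2.0, v[0] - v[1]])
J = lambda v: np.array([[2 * v[0], 2 * v[1]], [1.0, -1.0]])
root = newton(F, J, [2.0, 0.5])
```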
## Methods to Solve Nonlinear Systems

- **Newton's methods:** solve the linear system with a direct method (LU, Cholesky, ...). Several variants: Newton, Shamanskii, Chord, ...

- **Quasi-Newton methods:** approximate the Jacobian matrix (Broyden's method, BFGS, ...):
  $B(x_c) \approx J(x_c)$, updated as $B(x_+) = B(x_c) + u v^T$

- **Inexact Newton methods:** solve the linear system with an iterative method (GMRES, Conjugate Gradient, ...), only to the accuracy
  $\|J(x_k)\, s_k + F(x_k)\|_2 \le \eta_k \|F(x_k)\|_2$
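The rank-one quasi-Newton update can be made concrete with Broyden's "good" formula, where $u = (y - B_c s)/(s^T s)$ and $v = s$ (a minimal NumPy sketch; the random test data below is illustrative only):

```python
import numpy as np

def broyden_update(B, s, y):
    """Broyden's rank-one update B+ = B + u v^T with
    u = (y - B s) / (s^T s) and v = s, so that B+ satisfies
    the secant condition B+ s = y."""
    u = (y - B @ s) / (s @ s)
    return B + np.outer(u, s)

# Any update built this way satisfies the secant condition:
rng = np.random.default_rng(0)
B = rng.standard_normal((3, 3))
s = rng.standard_normal(3)
y = rng.standard_normal(3)
B_plus = broyden_update(B, s, y)
```

Here $s = x_+ - x_c$ and $y = F(x_+) - F(x_c)$ in the actual iteration; the update costs only $O(n^2)$ per step instead of re-evaluating the Jacobian.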
## Difficulties for a Non-Expert Scientist

- Several methods to choose from
- Slow convergence
- Many trials are needed to find the best algorithm
- If parallelization is attempted, the possibilities increase dramatically: shared memory, distributed memory, message-passing environments, computational kernels, several parallel numerical libraries, ...
- Libraries provide no help for solving a nonlinear system end to end
## Objective

To build a software tool that, for every problem and transparently to the user, automatically gets the best out of a sequential or parallel machine when solving a nonlinear system.
## Work Done

- A set of parallel algorithms has been implemented: Newton, quasi-Newton, and inexact Newton algorithms, for symmetric and nonsymmetric Jacobian matrices
- The implementations are independent of the problem
- They have been tested on several problems of different kinds
- They have been developed using the support and philosophy of ScaLAPACK
- They can be seen as part of a more general software environment for message-passing machines
## ScaLAPACK

- Example of distribution for solving a linear system with Jacobian matrix $J$ and problem function $F$
- Programming model: SPMD
- Interconnection network: logical mesh
- Two-dimensional, block-cyclic data distribution
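The block-cyclic rule can be illustrated one grid dimension at a time: global index $i$, in blocks of size $nb$, belongs to process $(\lfloor i/nb \rfloor) \bmod p$. A toy sketch (not ScaLAPACK's actual API):

```python
def owner(i, nb, p):
    """Process coordinate (in one grid dimension) that owns global
    index i under a block-cyclic distribution with block size nb
    over p processes."""
    return (i // nb) % p

# 8 rows, block size 2, 2 process rows: blocks of 2 alternate
# between process rows 0 and 1
row_owners = [owner(i, nb=2, p=2) for i in range(8)]
```

Applying the same rule independently to rows and columns gives the two-dimensional distribution over the logical mesh.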
## Software Environment

The tool sits on top of a layered software stack:

- USER → Automatic Parallel Tool
- Numerical parallel algorithms
- ScaLAPACK (Scalable Linear Algebra Package) and MINPACK (minimization package)
- Global layer: PBLAS (Parallel BLAS); other packages, e.g. the CERFACS CG/GMRES iterative solvers
- Local layer: LAPACK (Linear Algebra Package), BLAS (Basic Linear Algebra Subroutines), BLACS (Basic Linear Algebra Communication Subroutines)
- Message-passing primitives (MPI, PVM, ...)
## Developing a Systematic Approach

How to choose the best method?

- Specification of the problem data:
  1. Starting point
  2. Function $F$
  3. Jacobian matrix $J$
  4. Structure of the Jacobian matrix (dense, sparse, banded, ...)
  5. Required precision
  6. Use of chaotic techniques
  7. Parallelization possibilities (function, Jacobian matrix, ...)

- Sometimes only the function is known: prospecting with a minimal, simple algorithm (Newton + finite differences, sequential) can be worthwhile
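The prospecting step needs only $F$: a forward-difference Jacobian costs $n$ extra function evaluations per iteration. A minimal sketch (the test function below is an illustrative assumption):

```python
import numpy as np

def fd_jacobian(F, x, h=1e-7):
    """Forward-difference approximation of the Jacobian, built column
    by column: J[:, j] ~ (F(x + h e_j) - F(x)) / h."""
    x = np.asarray(x, dtype=float)
    Fx = F(x)
    J = np.empty((Fx.size, x.size))
    for j in range(x.size):
        xh = x.copy()
        xh[j] += h  # perturb one coordinate at a time
        J[:, j] = (F(xh) - Fx) / h
    return J

# Compare with the analytic Jacobian of F(x, y) = (x^2 + y^2 - 2, x - y),
# which at (2, 0.5) is [[4, 1], [1, -1]]
F = lambda v: np.array([v[0]**2 + v[1]**2 - 2.0, v[0] - v[1]])
J_fd = fd_jacobian(F, [2.0, 0.5])
```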
*The methodology (1): general scheme.*
## Developing a Systematic Approach

| Method | flops |
| --- | --- |
| Newton | $C_C = k_N \left( C_E + C_J + \frac{2}{3} n^3 \right)$ |
| Shamanskii | $C_C = k_S \left( C_J + \frac{2}{3} n^3 + m (C_E + 2n^2) \right)$ |
| Chord | $C_C = C_J + \frac{2}{3} n^3 + k_C (C_E + 2n^2)$ |
| Newton-Cholesky | $C_C = k_{NCH} \left( C_E + C_J + \frac{n^3}{3} \right)$ |
| Broyden | $C_C = C_E + C_J + \frac{4}{3} n^3 + k_B (C_E + 29 n^2)$ |
| BFGS | $C_C = C_E + C_J + \frac{n^3}{3} + k_{BF} (2n^2 + C_E) + m \left( C_J + \frac{n^3}{3} \right) + (k_{BF} + m)\, 15 n^2$ |
| Newton-GMRES | $C_C = C_E + k_{NG} (C_E + C_J + 2 k_G n^2 m + C_E)$ |
| Newton-CG | $C_C = C_E + k_{NCG} (C_J + k_{CG} n^2 + C_E)$ |

$C_E$ = function evaluation cost; $C_J$ = Jacobian matrix evaluation cost
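The table can be read directly as code. As a sketch of just the Newton and Chord rows (the example costs $C_E \sim n^2$, $C_J \sim n^3$ and the iteration counts are assumptions for illustration), an expensive Jacobian favors Chord, which factorizes once, even if it needs many more iterations:

```python
def newton_cost(n, k_N, C_E, C_J):
    """Newton row: k_N iterations, each paying C_E + C_J plus a
    (2/3) n^3 direct solve."""
    return k_N * (C_E + C_J + 2 * n**3 / 3)

def chord_cost(n, k_C, C_E, C_J):
    """Chord row: pay C_J + (2/3) n^3 once (one factorization), then
    k_C iterations at C_E + 2 n^2 each."""
    return C_J + 2 * n**3 / 3 + k_C * (C_E + 2 * n**2)

# With an O(n^3) Jacobian, Chord at 20 iterations is still cheaper
# than Newton at 5 iterations:
n = 100
cheap = chord_cost(n, 20, C_E=n**2, C_J=n**3)
dear = newton_cost(n, 5, C_E=n**2, C_J=n**3)
```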
## Developing a Systematic Approach

- The function and the Jacobian matrix characterize the nonlinear system
- It is important to know the features of both: sparse or dense, how they are computed (sequentially or in parallel), structure, ...
- It is interesting to classify problems by their cost, especially to identify the best method (or rule out the worst) and to decide what must be parallelized

| F \ J | $O(n)$ | $O(n^2)$ | $O(n^3)$ | $O(n^4)$ | $> O(n^4)$ |
| --- | --- | --- | --- | --- | --- |
| $O(n)$ | P11 | P12 | P13 | P14 | P1+ |
| $O(n^2)$ | P21 | P22 | P23 | P24 | P2+ |
| $O(n^3)$ | P31 | P32 | P33 | P34 | P3+ |
| $O(n^4)$ | P41 | P42 | P43 | P44 | P4+ |
| $> O(n^4)$ | P+1 | P+2 | P+3 | P+4 | P++ |
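The $P_{ij}$ labels follow mechanically from the two cost exponents (a small sketch; exponents above 4 map to the `+` class):

```python
def classify(f_exp, j_exp):
    """Problem class P_ij from the table: f_exp and j_exp are the
    cost exponents of F and J; anything beyond O(n^4) becomes '+'."""
    tag = lambda e: "+" if e > 4 else str(e)
    return "P" + tag(f_exp) + tag(j_exp)

# The classes used in the worked examples of this talk:
labels = [classify(3, 3), classify(3, 4), classify(2, 2)]
```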
## Developing a Systematic Approach

- Once the best sequential option has been selected, the process can stop there
- If the best parallel algorithm is required, the following items must be analyzed:
  - Computer architecture parameters $(t_f, t, b)$
  - Programming environment: PVM/MPI, ...
  - Data distribution that yields the best parallelization
  - Cost of the parallel algorithms
## Developing a Systematic Approach

**Data distribution.** Depends on the parallel environment. In the case of ScaLAPACK: block-cyclic distribution; optimize the block size and the mesh size.

**Parallelization opportunities.** Function evaluation and/or computation of the Jacobian matrix: parallelize the most expensive operation!

**Cost of the parallel algorithms.** Use the cost table with the parameters of the parallel machine: $(t_f, t, b)$.
## Developing a Systematic Approach

Final decision for choosing the method. Encode the costs as: cost $< O(n^3)$ → 0; cost $\ge O(n^3)$ → 1.

| $C_E$ | $C_J$ | Advisable |
| --- | --- | --- |
| 0 | 0 | Choose by speed of convergence; if it is slow, choose Newton or Newton-GMRES |
| 0 | 1 | Avoid computing the Jacobian matrix: choose Broyden or use finite differences |
| 1 | 0 | Newton or Newton-GMRES are adequate; avoid computing the function |
| 1 | 1 | Try to keep the number of iterations small; use Broyden to avoid computing the Jacobian matrix |
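This kind of decision table is a four-entry lookup; a sketch, with the advice abbreviated:

```python
def advise(ce_high, cj_high):
    """Method advice from the table: the flags say whether the cost
    of evaluating F (resp. J) is >= O(n^3)."""
    table = {
        (0, 0): "choose by convergence speed; if slow, Newton or Newton-GMRES",
        (0, 1): "avoid the Jacobian: Broyden or finite differences",
        (1, 0): "Newton or Newton-GMRES; avoid extra function evaluations",
        (1, 1): "keep iterations few; Broyden to avoid the Jacobian",
    }
    return table[(int(ce_high), int(cj_high))]
```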
## Developing a Systematic Approach

Final decision for parallelization. No chance of parallelization → 0; chance of parallelization → 1.

| Fun. | Jac. | Advisable |
| --- | --- | --- |
| 0 | 0 | Try to do few iterations; use Broyden or Chord to avoid computing the Jacobian matrix |
| 0 | 1 | Newton or Newton-GMRES are adequate; do few iterations and avoid computing the function |
| 1 | 0 | Compute the Jacobian matrix few times; use Broyden or Chord if possible |
| 1 | 1 | Choose by speed of convergence; Newton or Newton-GMRES are adequate |
## Developing a Systematic Approach

Finish or feedback:

    IF the selected method is satisfactory
    THEN finish
    ELSE feed back

Sometimes bad results are obtained due to:
- No convergence
- High computational cost
- Unsatisfactory parallelization
*The methodology (12): outline of the guided process.*
## How Does It Work?

**Inverse symmetric Toeplitz eigenvalue problem**

- Well-known problem: starting point, function, and analytical Jacobian matrix (or a finite-difference approximation) are available
- Kind of problem:
  - $F \in O(n^3)$, $J \in O(n^3)$ → P33 (analytical Jacobian)
  - $F \in O(n^3)$, $J \in O(n^4)$ → P34 (finite-difference Jacobian)
- The cost of the Jacobian matrix is high: avoid computing it; use Chord or Broyden
- High chance of parallelization, even if finite differences are used
- If convergence is slow, use Broyden but insert some Newton iterations
## How Does It Work?

**Leakage minimization in a water distribution network**

- Well-known problem: starting point, function, and analytical Jacobian matrix (or a finite-difference approximation) are available
- Jacobian matrix: symmetric, positive definite
- Kind of problem: $F \in O(n^2)$, $J \in O(n^2)$ → P22
- Avoid methods with a high cost per iteration, such as Newton-Cholesky
- The computation of $F$ and $J$ can be parallelized
- Use Newton-CG (to speed up convergence) or BFGS
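For a symmetric positive definite Jacobian like this one, Newton-CG replaces the direct solve with conjugate-gradient inner iterations. A self-contained sketch (the toy system with SPD Jacobian is an assumption chosen for illustration, not the water-network model):

```python
import numpy as np

def cg(A, b, tol=1e-12, max_iter=200):
    """Conjugate gradient for symmetric positive definite A
    (the inner, iterative linear solver)."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def newton_cg(F, J, x0, bound=1e-8, max_iter=50):
    """Newton's outer iteration with J(x) s = -F(x) solved by CG."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) <= bound:
            break
        x = x + cg(J(x), -Fx)
    return x

# Toy system with a symmetric positive definite Jacobian:
# F(x) = A x + x^3 - b, so J(x) = A + diag(3 x^2) with A SPD
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
F = lambda v: A @ v + v**3 - b
J = lambda v: A + np.diag(3 * v**2)
root = newton_cg(F, J, [0.0, 0.0])
```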
## Conclusions

- Part of this work was done in the Ph.D. thesis of J. Peinado: "Resolución Paralela de Sistemas de Ecuaciones no Lineales" (Parallel Solution of Systems of Nonlinear Equations), Univ. Politécnica de Valencia, Sept. 2003
- All specifications and parallel algorithms have been developed
- The implementation stage of the automatic parallel tool starts in January 2004, within the CICYT project "Desarrollo y optimización de código paralelo para sistemas de Audio 3D" (Development and optimization of parallel code for 3D audio systems), TIC2003-08230-C02-02
