Docstoc

A On Computational Physics

Document Sample
A On Computational Physics Powered By Docstoc
					PHY400W — cp 2003                                                                  1


A On Computational Physics
      When a problem is clearly stated, it is of no further interest to
      the physicist.
                                                          — P. Debye


      Analysing a general potential system with two degrees of freedom
      is beyond the capabilities of modern science
                                                       — V.I. Arnol’d


A.1 Physics

Physics is the study of how the world works.

From experiment we derive some sort of intuition about how the world op-
erates, and try to distill this into some kind of physical law or model. From
then on, it’s just a matter of expressing this in mathematical terms, and
exploring the quantitative outcome of our intuition.

In a typical physics course, we learn how to express the ideas of physics
as, say, differential equations, which we can then solve. Indeed, much of
the physics course is devoted to all those lovely equations and their neat
closed-form (“analytical”) solutions: harmonic oscillators, point masses in
gravitational fields, hydrogen atoms, and so on. Such is the power of the
differential and integral calculus. It’s very easy to gain the impression that
all of the equations of physics is of this nature, and that it is just a measure
of our lack of mathematical sophistication that we can’t find an analytical
solution to some problem.

Unfortunately, this is not true. Most problems do not have nice analytical
solutions, and there is no hope of finding one. So we have to turn to numerical
solutions in order to solve our problems.

Arnol’d quotes the set of solvable problems in classical mechanics:


      one-dimensional problems, motion of a point in a central field,
      Eulerian and Lagrangian motions of a rigid body, the problem of
      two fixed centers, and motion along geodesics on the ellipsoid.
PHY400W — cp 2003                                                                    2


To this list we can add the instructive case of the Toda lattice. This was only
found to be solvable (integrable) after numerical solutions had hinted at it.

But let’s consider one of these integrable cases, the pendulum. This has the
equation of motion
                                   d2 θ
                                        = − sin θ
                                   dθ2
This is solvable only in that the solution can be expressed in terms of known
functions, the elliptic integrals. To evaluate these functions is still a problem.
But let’s take a step back: to evaluate sin θ is a problem. We don’t consider
it so these days, we just punch a few keys on a pocket calculator. (A few
decades ago we would have looked up the value in a set of tables, calculated
by a human). But our calculator (whether electronic or human) only does
arithmetic. In order to calculate the sine it must go through some arithmetic
procedure, a step by step recipe, to calculate the value required using arith-
metical operations. Developing such a step by step recipe, an algorithm, is
the basis of finding our solutions in computational physics. We might think
this is rather simple. After all, we can write a series expansion for the value
of our sine. But we want a solution in a finite time, so we would need to
truncate the series. This will incur a truncation error, so we will need to
make sure that it is small enough. Then we find that there are numerical
techniques to enhance the accuracy with fewer series terms, by slight mod-
ification of the coefficients. In fact, the algorithm used in our calculator is
quite far from the series that you might expect.


A.2 Computers and physics

The development of computers has been driven by the needs of physics.

By this I mean not just the collection and analysis of data, but the compu-
tation of mathematical results that cannot be obtained by analytical means:
in other words, number crunching. The most powerful and advanced systems
of the day have always been designed for this.

This means that programming for physics is a much more complicated task
than, say programming an operating system. We do have to worry about
efficiency and program size; and not only must the coding be correct, but
also the mathematical representation of the problem.
PHY400W — cp 2003                                                              3


A.3 What is computational physics?

There are two major aspects to computation in physics:


   • Obtaining solutions to known equations which cannot be solved ana-
     lytically.

   • Exploring model systems for which no solution is known.


These both exploit algorithms for numerical methods of solution on comput-
ers.

To apply these techniques we have to understand their limitations, e.g the
accuracy of approximations, the stability of results against numerical per-
turbations, and the limitations of the representation of real numbers on the
computer.

The algorithms then have to be translated into a list of instructions for
the computer. This usually involves a computer language in which these
instructions are given.
PHY400W — cp 2003                                                                  4


B Programming
       I don’t know what the language of the year 2000 will look like,
      but I know it will be called Fortran.
                                                     – C.A.R. Hoare


B.1 Choice of language

In order to represent a problem on the computer we require a programming
language which we can use to encode the processes we want to carry out.
For instance, we need to be able to tell the computer “add this number to
that variable” or “print out this value”.

Such languages are of course formal languages that have a precise grammar
and syntax. We can construct a program which is a list of sentences that
accord with these rules of grammar and syntax. The elements of this language
have some semantic meaning in terms of the numerical tasks we require to
solve.

The program has to be run or executed on the computer. There are two basic
ways of achieving this. The program can be interpreted by another program,
which carries out the tasks specified in the program. This process can be
rather slow. A more efficient scheme is to compile the program into machine
code (the code understood by the CPU). This machine code then directly
controls the hardware. (Typically, a program is compiled into object code,
which is then linked to other object files to form an executable file. Some of
these other object files may be in the form of libraries which hold commonly
used code.)

The language of computational physics has traditionally been Fortran, but
these days C and C++ are being used to increasing extents, partly due
to the availability of compilers rather than any real benefit. Indeed, array
operations are rather problematic in C-based languages due to the internal
representations, and C++ is only recently approaching the ease of use and
speed of Fortran, with the use of exotic template techniques that can defeat
many compilers.

In this course we will take a slightly easier route and give examples in Python.
Why use a barely known and obscure language like Python?
PHY400W — cp 2003                                                                  5


   • Python is free and freely available for many platforms.

   • Python is a modern language, with object oriented features. It is much
     easier to learn than C++.

   • Python is interpreted; this makes it easy to ‘see what happens if we. . . ’

   • Python is easily extensible and has a large variety of available exten-
     sions and libraries.

   • With the Numerical Python extension, Python has many of the array-
     handling features of Fortran 90/95.

   • The VPython extension gives us an easy way to produce graphs and
     animations.


Python is an interpreted language, which makes it somewhat simpler to code
and run a program (there are no intermediate compile/link steps).

Interpreted languages like Python do tend to produce slow programs, how-
ever. it might be profitable to write a program that take a long time to run
in some compiled language. Here are some comments on available languages.


Fortran The latest incarnation of Fortran is Fortran95 (standardised in
1995). Unfortunately Fortran compilers tend to be high-cost items. A sub-
set compiler, for a language called F, is available for free download from
www.meltingsand.com. This is suitable for writing new Fortran95 programs,
but not for compiling old FORTRAN77 code.

Fortran95 offers many benefits over languages like C which were not designed
for numeric computing. (Of course, these other languages do have some
features it would be nice to have in Fortran.) Some benefits of Fortran are:


   • Parameterized floating point precision.

   • Flexible array handling.

   • Arbitrary ranges in array subscripts.

   • Array slices.

   • Complex numbers.
PHY400W — cp 2003                                                                6


   • Intrinsic functions are part of the language, so greater optimisation is
     possible.
   • HPF extensions permit writing programs for parallel execution on sev-
     eral computers.
   • Arbitrary operators can be defined.

Fortran is not object-oriented.


C It’s better to start with C++ these days.


C++ C++ is a powerful and complex language which permits object-
oriented programming. (This is carefully phrased. C++ also permits non-OO
programming).

The gcc compiler is freely available for many platforms. Some benefits of
C++ are:

   • Object oriented features: classes, inheritance.
   • Templates.
   • The STL (Standard Template Library) is part of standard C++.

Efficient handling of matrices and arrays in numerical calculations seems to
require exotic template techniques which may not be handled well by many
compilers.


Java Java is compiled to run on a “Java virtual machine” which may in
turn recompile the code into native machine code (‘just in time” compilation).
Some benefits of Java are:

   • Object oriented features: classes, inheritance.
   • Eliminates some areas of C++ that can cause problems, e.g. pointers.
   • Large set of supporting libraries, including platform-independent GUI
     (graphical user interface) construction.
PHY400W — cp 2003                                                                7


Some disadvantages of Java are:


   • No overloading of operators. In particular, complex numbers become,
     well, complex.

   • Multidimensional arrays are not efficiently handled.

   • There are problems associated with the Java representation of floating
     point numbers.


Java is not recommended for numerical work.


B.2 Basics of programming

How do we transform a physics problem into something that can be solved
on a computer?

There are a number of steps in the process:


   • Analyse the problem to find a suitable numerical method of solution.

   • Derive an algorithm to perform the solution.

   • Write a computer program to embody the algorithm.

   • Get the program debugged and running.

   • Solve the problem!


An algorithm is essentially a set of steps that has to be performed in order
to accomplish a task — a recipe, if you like, that has to be followed. The
computer language that is used to translate this algorithm into a program
must offer useful facilities that match the algorithm. These are usually in the
form of data elements, control structures and input and output facilities. In
addition, the language should enable the proper structuring of the program.

A useful approach to designing an algorithm is that of structured program-
ming and step-wise refinement. An algorithm is broken down into a number
PHY400W — cp 2003                                                                 8


of broad steps, each of which can be represented by a procedure in a program-
ming language. Each of these can then be refined (by a recursive process)
until the algorithm is done. (This is also known as a top-down approach).

At a higher level, one can take a procedural or an object-oriented approach.
The latter attempts to view the problem as a set of objects which can be
manipulated to solve the problem. This approach tends to be rather vo-
luminous, but does limit the potential interactions between parts of large
programs. Most simple problems, such as we will be dealing with, are prob-
ably best approached from a straight-forward procedural approach.


B.3 Data elements

With numerical programming, the basic data elements are of course numbers.
In physics, we deal with various kinds of numbers: integers, real numbers and
complex numbers. Computers, and computer languages, attempt to supply
some realisations of these numbers for us to use, but there are of course major
differences from the idealised mathematical quantities.

Integers are available in some limited range, with a maximum value of 231 − 1
for “32-bit” integers.

The “real numbers” available on a computer have little resemblance to the
real numbers of mathematics. The range is limited, they are not complete and
continuous, and the spacing between available numbers varied from number
to number. The system will be described below.

Complex numbers are found in Fortran, but not by default in most other lan-
guages. The recently standardised C++ has a complex library, but complex
numbers are not part of the language.


B.4 Control structures

Computer languages provide statements to implement these. There are a
number of control structures that are useful:


Sequence Usually statements are executed in sequence. Some languages
    (e.g. HPF) can parallelize code so that the same code operates in
PHY400W — cp 2003                                                               9


     parallel on several processors.

Repetition These structures are generically known as loops. Fortran pro-
    vides do loops, and C for loops.

Decision Logical choice is implemented by if/then/else structures. Multiple
     choice is implemented as a case statement.


B.5 Higher-level structures

Programs are usually assembled out of higher level structures called proce-
dures. C provides a function for this purpose, while in C++ the basic unit of
structure is the class. A C program starts running in the main program. (A
C++ program is less obvious). Fortran has a specific program construct to
define where the program will start: it also provides two types of procedure,
and a module construct for providing cross-file structuring.
PHY400W — cp 2003                                                                  10


C Numbers
      The purpose of computing is insight, not numbers.
                                                  — R.W. Hamming


Physics is a quantitative science: numbers obviously play a large role.

But many of the mathematical ideas that are used in physics are there to
get away from numbers and arithmetic. Much of what we do in calculations
involves the integral and differential calculus; in fact, solving many problems
in physics means finding the solution to a differential equation.

Of course, the “physics” in the problem is in setting up the equations, but
to see if the physics is correct we still have to solve them.

In solving problems on the computer, we deal with a simplified number sys-
tem, with none of the topological niceties of the real numbers like continuity
and completeness. So notions of differentiability do no enter: the number sys-
tem is discrete. We have to fall back on arithmetic in order to do anything,
and everything has to be done by tedious arithmetic manipulations.

(What of symbolic packages like Mathematica and Maple: don’t these allow
us to use the familiar tools of the calculus? Yes, but they are just a handy
way of remembering what the integral, say, of a function is. Think of them
as a handy table of integrals.)


C.1 The computer number system

There are two types of number represented on the computer: integers and
floating point numbers, the counterparts of real numbers. These are usually
represented in a binary manner in the hardware.

The representations depend also on the hardware. The PC will be used as
the example; other types of hardware might have differences in the ranges of
these numbers.

The integers we will use are usually stored in four bytes (assuming we are
dealing with “32-bit” programs!), each byte consisting of eight bits. 31 bits
are used to represent an integer, and one bit for the sign. Thus integers in the
PHY400W — cp 2003                                                                   11


range −231 to 231 − 1 (-2147483648 to 2147483647) are represented. (There
are usually other integers available, e.g 16-bit numbers, and some languages
might offer signed and unsigned variants).

Real numbers are represented in the form of floating point numbers. These
are written in the form
                                sM B e−eo
where s is a sign bit, M is a mantissa, B is the base of the number system
(usually 2), e is an exponent and eo is an offset. The mantissa consists of a
certain number of significant digits (bits in a binary representation). There
is an implied decimal point at the start of the mantissa, and the exponent
is adjusted so that the leading digit (bit) is non-zero. (This adjustment
process is known as normalisation of the number; all floating point numbers
are normalised).

Thus, if we take a mantissa of 3 decimal digits and a signed exponent of 2
decimal digits as an example, a floating point number might be

                                 −0.123 × 1001


Clearly, there is only a finite number of such numbers. These are only ap-
proximations of real numbers, and properties of the reals like continuity are
not shared.

The floating point number will be stored in 32 or 64 bits (on the PC). One
bit is used for the sign, 23 or 52 for the mantissa and 8 or 11 for the exponent.
The leading bit is always set, and the exponent is adjusted while the bits of
the mantissa are shifted to achieve this. This process is known as normalisa-
tion. After an arithmetic operation, the floating point result is normalised.
If this cannot be done, an error condition (denormalised number) occurs.



                    Precision   sign   mantissa   exponent
                    single      1      23         8
                    double      1      52         11
                    extended    1      64         15


The range of representable numbers on the PC is approximately
PHY400W — cp 2003                                                               12



             Precision   minimum     maximum      sig. decimals
             single      1.2E-38     3.4E+38      6-9
             double      2.2E-308    1.8E+308     15-17
             extended    3.4E-4932   1.2E+4932    18-21


Even if there is no error condition, there can be problems with the result.
Most important is the disappearance of low-significance bits in the process
of subtracting two similar numbers.

We can get a feeling for this by considering base 10 with four digits in the
mantissa, i.e. take B=10 and, say, e0 = 50. Then the number a = 10.3213
can be represented as


                           + 5 2 1 0 3 2


Note that there has already been some truncation of the number to fit into
the floating point scheme.

The subtraction a = 10.3213 − 10.3122


             + 5 2 1 0 3 2 - + 5 2 1 0 3 1


results in the unnormalised number


                           + 5 2 0 0 0 1


The normalised result is then:


                           + 4 9 1 0 0 0


Note that significant digits have been lost from the right hand side. This can
be a major problem in many calculations.

				
DOCUMENT INFO