

G1BIQM/G13QTI        Introduction to Quantum Mechanics

1         The road to quantum mechanics
Modern quantum mechanics is the culmination of a series of experimental discoveries and
partially successful theories which arose in the period 1900-1926. When the catharsis eventually
came in 1926, an elegant self-consistent picture emerged, but in the period leading up to it
progress was haphazard and the participants were often confused and frustrated. A complete
and historically faithful account of this period, with all of its confusion and half-understood
theories, is not the most helpful way of understanding the foundations of quantum mechanics, at
least at a first pass.¹ We will therefore restrict our attention to a somewhat selective account of
the most important discoveries, capturing the essential elements necessary for an understanding
of the basis of quantum theory but passing over any technical details that are not strictly
necessary.
    Instead of following these developments in strict chronological order, it is helpful to separate
the main elements into two logical components,
        • particle-wave duality and

        • Bohr’s theory of Hydrogen (or “Old Quantum Mechanics”).
Particle-wave duality is the essential element we will need to “discover” Schrödinger’s equation,
which can be seen as the root of modern quantum mechanics. This discovery is not a derivation
however (and cannot be, since quantum mechanics is a fundamentally new theory). Its veracity
is checked by seeing how well it describes real-world observations, and this is where Bohr’s
theory will play its role in our logical development. Bohr had produced in 1913 a set of ad
hoc rules which could explain some atomic spectra (and not others). Schrödinger’s equation is
convincing in large part because Bohr’s theory emerges as a natural result of it. Even better,
problems which are impervious to Bohr’s incomplete theory become treatable and modern
quantum mechanics emerges as a complete and self-consistent theory of matter.

1.1        A new fundamental constant
The seed for quantum mechanics is conventionally held to be the theoretical explanation
by Planck in 1900 of blackbody radiation. This concerns measurements of the frequency-
distribution of light radiated from a hot object or from within a hot cavity. In qualitative
terms, the theory endeavours to explain why a hot object radiates heat in the form of infrared
radiation, something really hot glows red so that visible light is radiated, and something really,
really hot radiates light of all colours and appears “white hot”. Theoretical predictions based
on the existing theory of electromagnetism had produced nonsensical results. They had, for
example, predicted an “ultraviolet catastrophe,” whereby the light intensity decayed so slowly
with increasing frequency that the total predicted energy of the radiated light was infinite.
Unsurprisingly the theory failed to agree with experiment at high frequencies, though agreement
at lower frequencies was good.

¹ This is not to say that the topic is to be avoided. This is a fascinating episode in the history of science and
is well worth investigating, if only at a recreational level. Two books in particular can be recommended which
give very readable accounts of this period. One is the biography of Niels Bohr by Abraham Pais, entitled Niels
Bohr’s Times: in Physics, Philosophy and Polity. The other is The Making of the Atomic Bomb by Richard
Rhodes. A copy of each can be found in the George Green library and they are highly recommended as
bed-time reading.
    To understand what Planck did to get around this, we need to recall a couple of elements
of electromagnetic theory. The classical theory of light holds that it is a wave-field describing
oscillations in the electric and magnetic fields. The simplest plane-wave solutions are of the
form
                                   E(x, t) = Re E0 e^{i(k·x−ωt)}
                                   B(x, t) = Re B0 e^{i(k·x−ωt)}                                (1)
in which the wavevector is in the direction of propagation of the wave and is related to the
wavelength through |k| = 2π/λ. For the record, recall also that the circular frequency ω is
related to the ordinary frequency ν through ω = 2πν. In classical theory, the energy per unit
volume of such a field is of the form
                        energy / unit volume = const. × amplitude²                              (2)
where the constants depend on the units being used and need not concern us. The important
point is that we can set up wave fields with arbitrarily small energies simply by having arbitrarily
small wave amplitudes. Planck’s contribution is to suggest a radical departure from this classical
picture whose sole justification is initially that it leads to a prediction for blackbody radiation
consistent with experiment.
    His assertion is that, in blackbody radiation, there is a fundamental “quantum” of energy
for any given frequency given by
                                              E = hν                                            (3)
and that, effectively, the total energy in any given volume must be an integer multiple of this.
The quantity h here is a new fundamental constant, now referred to as Planck’s constant.
At very high frequencies the electromagnetic field is then faced with the option of having a
relatively large energy or having none at all, and is forced to opt for the latter. Effectively
then, the high frequencies are cut off and the ultraviolet catastrophe is avoided. A detailed
description of Planck’s analysis would take us too far off course and we will not pursue it here.
Instead let us note it as the first hint of a new theory and jump to 1905, when Einstein used
Planck’s formula to explain the so-called photoelectric effect and in so doing offered what is
effectively the modern interpretation of E = hν.

1.2    Photons
In the photoelectric effect, electrons escape from the surface of a metal after absorbing energy
from light shone on it. In order to escape the attraction of the surface, an electron has to acquire

an energy W which is called the work function and varies from metal to metal. Once again,
the classical formalism fails to explain experimental observations. A particular example is the
fact that if the frequency is too low, no electrons escape no matter how intense the light is —
in classical theory we could make light of any frequency sufficiently energetic to eject electrons
simply by making it more intense. Einstein’s assumptions were that light consisted of discrete
particle-like packets of energy called photons and that each photon carried an energy E = hν.
Photoelectric emission occurred when a photon collided with an electron and transferred all of
its energy to it. The electron can then escape with a kinetic energy
                                          Eesc = hν − W.
Note in particular that the electron cannot escape if hν < W . A complete explanation of the
photoelectric effect then becomes possible on the basis of this formula.
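A quick numeric sketch of this formula (the work function value below is an assumption, roughly that of sodium; the frequency is that of green light):

```python
# Photoelectric escape energy E_esc = h*nu - W (illustrative numbers).
h = 6.626e-34          # Planck's constant, J s
eV = 1.602e-19         # J per electron volt

W = 2.3 * eV           # assumed work function (roughly sodium), J
nu = 6.0e14            # frequency of green light, Hz

E_photon = h * nu
if E_photon > W:
    E_esc = E_photon - W
    print(f"electron escapes with {E_esc / eV:.2f} eV")
else:
    print("no emission: h*nu < W, however intense the light")

# Threshold frequency below which no electrons are ejected at all:
nu_threshold = W / h
print(f"threshold frequency ~ {nu_threshold:.2e} Hz")
```

Below the threshold frequency the intensity of the light is irrelevant, exactly as observed.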

Figure 1: The intensity of a classical wave field is interpreted in quantum mechanics as a measure
of the density of photons.

    We have therefore arrived at the first element in particle-wave duality. Light, previously
thought of as an entirely wave-like phenomenon, can display particle-like behaviour. These seem
like entirely different ideas, so how can they be consistent with one another, and in particular
how can (2) be compatible with (3)? The following is the modern interpretation. Imagine a
pulse of light, with a reasonably well-defined frequency and wavelength, as illustrated in Fig. 1.
Suppose now that we take a small region and blow it up as in the figure. What the classical field
represents is a coarse-grained or smoothed picture of what is actually a swarm of photons,
all moving in more or less the same direction. The energy density is then
                    energy / unit volume = hν × (# of photons / unit volume).
Notice that this can be made consistent with (2) if we suppose that the number of photons per
unit volume takes a similar form,
                    # of photons / unit volume = const. × amplitude²                            (4)

(with different constants of course). On macroscopic length scales or for high intensities the
number of photons in a typical region is quite large and the discreteness of the photons is unim-
portant in much the same way that molecules are unimportant in traditional fluid mechanics
— a purely classical coarse-grained theory then suffices. On microscopic scales however, such
as correspond to the absorption of a photon by an electron, the particle nature of light becomes
important — this is the domain of quantum mechanics.
    Let us anticipate somewhat and note what the final interpretation of (4) will be. We can
think of the quantum-mechanical limit as corresponding to the case where the photons are
sparse. We can still retain (4) in that case, but we must interpret the left-hand side as the
probability of finding a photon in a given region (which might be small). In particular, the
mathematics of the classical theory of light (Maxwell’s equations) will carry over to the quantum
theory but the interpretation of what the fields represent will be different.
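To get a feel for the coarse-grained limit, here is a rough order-of-magnitude estimate (illustrative numbers for a 1 mW helium-neon laser at 633 nm):

```python
# Photons per second from a 1 mW laser beam: E = h*nu = h*c/lambda, and
# the emission rate is power / (energy per photon).
h = 6.626e-34      # Planck's constant, J s
c = 3.0e8          # speed of light, m/s

power = 1e-3       # W
lam = 633e-9       # wavelength, m

E_photon = h * c / lam          # energy carried by one photon, J
rate = power / E_photon         # photons emitted per second
print(f"one photon carries {E_photon:.2e} J")
print(f"emission rate ~ {rate:.2e} photons/s")
```

With over 10¹⁵ photons per second, the discreteness is completely invisible and the classical wave description suffices.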

1.3    Photons have momentum
The next step in the story is almost a direct consequence of E = hν but did not come about until
the 1920s. Compton used the fact that photons should have momentum to explain observations
of γ-rays which had been scattered from free electrons. If we think of the photon as a particle
of zero mass, then in relativistic theory it should have a momentum related to its energy by

                                            E = cp.

Combining this with Planck’s formula we then note that the momentum should be
                                        p = hν/c = h/λ.                                         (5)
By asserting this logical extension of Planck’s formula Compton was able to explain observations
made of photons scattering off electrons.
    In “Compton scattering” a very high energy photon (or γ-ray) collides with a stationary
electron and is deflected. In so doing it donates some of its energy to the electron and its
wavelength increases. Using (5) and conservation of momentum and energy, Compton was able
to come up with an explicit formula for the wavelength change as a function of the angle of
deflection. The fact that this prediction agreed with observations firmly established the idea
that photons could be treated as particles with momentum as well as energy and that (5) gave
the momentum as a function of wavelength.
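Compton's explicit result is Δλ = (h/me c)(1 − cos θ), where θ is the deflection angle; a quick numeric sketch (rounded constants) shows the scale of the shift:

```python
import math

# Compton wavelength shift delta_lambda = (h / m_e c)(1 - cos theta);
# the prefactor h/(m_e c) is the so-called Compton wavelength, ~2.43 pm.
h = 6.626e-34       # Planck's constant, J s
m_e = 9.109e-31     # electron mass, kg
c = 3.0e8           # speed of light, m/s

lambda_C = h / (m_e * c)
for theta_deg in (0, 90, 180):
    theta = math.radians(theta_deg)
    shift = lambda_C * (1 - math.cos(theta))
    print(f"theta = {theta_deg:3d} deg: shift = {shift * 1e12:.2f} pm")
```

The shift is a few picometres, which is why γ-rays (whose wavelengths are comparably short) were needed to see the effect clearly.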
    Finally, we note that it has become customary to work with the following rescaling of
Planck’s constant,
                                           ℏ = h/2π
(pronounced h-bar) instead of with h itself. In terms of this constant, Planck’s formula and the
momentum equation can be written in the forms

                                           E = ℏω
                                           p = ℏk                                               (6)

respectively, where the momentum equation is now written in vector form associating a vector
momentum with a plane wave propagating in the direction of k.

1.4    If light can be particles, matter can be waves
De Broglie made the contribution (in his doctoral thesis) in 1924 of asking whether, if the rela-
tionships in (6) are natural for light, they might not work in the opposite direction for matter.
This would associate a plane wave with any particle propagating with a definite momentum
and energy. In other words, if light has a dual particle-wave nature, we might find the same
to hold for matter. Even though we used the specific properties of light to deduce p = ℏk as
a consequence of E = ℏω, de Broglie argued that from the point of view of relativistic theory
it seemed natural that the same formulas should hold without modification for matter (see
Appendix 1.8).
    Asserting wave-particle duality for matter as well as light is a fairly wild conjecture (so wild
and speculative that de Broglie was only grudgingly awarded a PhD for it), but experimental
confirmation came in 1926 with experiments by Davisson and Germer showing that interference
effects could be produced by passing electrons through crystals. The observation of interference is
the classic signature of waves and provides indisputable evidence of the wave-particle duality
of matter. It has been repeated many times since with different sorts of particles.
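As a numeric sketch (rounded constants), the de Broglie wavelength λ = h/p of an electron at the 54 eV energy Davisson and Germer used turns out to be comparable to the atomic spacing in a crystal, which is why crystals act as diffraction gratings for electrons:

```python
import math

# De Broglie wavelength of a 54 eV electron, with p = sqrt(2 m E)
# nonrelativistically.
h = 6.626e-34       # Planck's constant, J s
m_e = 9.109e-31     # electron mass, kg
eV = 1.602e-19      # J per electron volt

E = 54 * eV                      # kinetic energy, J
p = math.sqrt(2 * m_e * E)       # momentum, kg m/s
lam = h / p                      # de Broglie wavelength, m
print(f"lambda ~ {lam * 1e9:.3f} nm")   # comparable to crystal lattice spacings
```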
    Assuming that (6) can be used for matter as well as light is a crucial step in the development
of quantum mechanics. Once we make this leap we are pointed firmly in the direction of the
full theory and all the elements are in place to see where Schrödinger’s equation comes from.
First however we will step back and follow another thread in which equally stark deviations
from the classical picture were appearing.

1.5    Atomic spectra and Old Quantum Mechanics
It had been known since the nineteenth century that light emitted from excited atoms had
frequencies restricted to a set of well-defined values. These atomic spectra were characteristic
of each element and, once again, could not be explained with classical physics.
    The simplest spectrum belongs to the simplest element, Hydrogen. In 1913 Bohr produced
a complete and accurate account of the spectrum of Hydrogen using a set of ad hoc rules that
completely violated classical notions. His model for Hydrogen starts with an electron following
a circular orbit around a much heavier proton, attracted by the electrostatic coulomb force. He
then assumes that

   • The angular momentum of the electron about the centre is an integer multiple of ℏ,

                                                L = nℏ.

      This forces the orbit radius and energy to be restricted to discrete sets of values,

                                  rn = n² a0    and    En = −E0/(2n²)

      respectively. Explicit calculation gives

                              a0 = [4πε0] ℏ² / (me e²) ≈ 0.053 nm

      (called the Bohr radius) and

                              E0 = e² / ([4πε0] a0) ≈ 2 × 13.6 eV

      (exercise!). The term in square brackets is omitted if cgs units are used. We may say that
      the orbits are quantised.
   • Transitions between these quantised orbits occur in discrete jumps and when they do, a single
     photon is emitted which carries away all of the energy difference. The frequencies of the
     emitted light are therefore restricted to the values
                        hνn→m = En − Em = E0 ( 1/(2m²) − 1/(2n²) ).
The set of frequencies νn→m explains the hydrogen spectrum completely. It therefore seems
immediately clear that Bohr’s model carries a significant element of truth. However, many
aspects of it are disturbing. There is no reason to suppose that angular momentum should be
quantised other than that it leads to an explanation of Hydrogen. If an electron were orbiting
a proton in classical physics, it would be expected to lose energy to electromagnetic radiation
because it is an accelerating charge and would eventually fall into the centre. Bohr’s explanation
is simply that it doesn’t happen because it isn’t observed.
    Furthermore, while the theory can explain Hydrogen (and other single-electron ions) beau-
tifully, it is not at all clear how more complicated atoms could be treated, even in principle.
Classical orbits are not circular or even closed in such systems and a self-consistent calculation
of energies cannot be made. A theory which can only be applied to very simple systems is
clearly not completely satisfactory.
    Some generalisation was possible. Sommerfeld extended Bohr’s calculation to allow for ellip-
tical orbits (and got the same set of energies). More generally, it was found that one could
specify a set of rules for any system that was classically integrable or separable, by applying
the action quantisation condition
                                           ∮ p dq = nh
around any closed coordinate curve in phase space. This process is called old quantum mechan-
ics and generalises Bohr’s quantisation of angular momentum somewhat but it is still of no use
in understanding the majority of physical systems, which are not integrable or separable.
    Note that in light of de Broglie’s relationship, we can in retrospect give some intuitive expla-
nation of the action quantisation. ∮ p dq / h = ∮ λ⁻¹ dq simply counts the number of wavelengths
that fit into an orbit, which should be an integer if standing waves were set up. However Bohr
knew nothing of this since de Broglie’s assertion came much later.
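The standing-wave picture can be checked directly: with L = nℏ and rn = n² a0, the de Broglie wavelength λ = h/p fits around the orbit exactly n times.

```python
import math

# Check that 2 pi r_n / lambda_n = n for the Bohr orbits, using
# p_n = n hbar / r_n (from L = p r = n hbar) and lambda = h/p.
h = 6.626e-34              # Planck's constant, J s
hbar = h / (2 * math.pi)
a0 = 5.29e-11              # Bohr radius, m

for n in (1, 2, 3):
    r_n = n**2 * a0
    p_n = n * hbar / r_n               # momentum from the quantisation rule
    lam_n = h / p_n                    # de Broglie wavelength
    print(n, round(2 * math.pi * r_n / lam_n, 6))   # -> n exactly
```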

1.6     Appendix: notation for plane waves
The canonical example of a travelling wave is the function

                                   u(x, t) = u0 cos(kx − ωt)

describing a plane wave in one dimension. Here u0 is the amplitude telling us how large the
oscillations are. We refer to k as the wavenumber, which is related to the spatial period or
wavelength λ by
                                           k = 2π/λ
and ω is the frequency, which is related to the temporal period T by
                                           ω = 2π/T.
We have, of course,
                              u(x + λ, t) = u(x, t) = u(x, t + T ).
In the case of frequency the following version is also used
                                        ν = 1/T = ω/2π.
If we need to distinguish between ν and ω, we refer to ω as the circular frequency and to ν
simply as the frequency although in practice the word frequency is often used for both. For
some reason there is no symbol in common use for the spatial analogue 1/λ of ν.
    We can also write the travelling wave in the form

                                   u(x, t) = u0 cos k(x − ct)

where
                                           c = ω/k
is the phase velocity of the wave. Notice that, for example, crests of the wave corresponding to
k(x − ct) = 2πn move to the right with velocity c.
    It is often a huge advantage in manipulating plane waves to write them in the complex form

                                    u(x, t) = Re u0 e^{i(kx−ωt)}.

This is so common that one often thinks of a plane wave as the complex function

                                      u(x, t) = u0 e^{i(kx−ωt)}

with the understanding that at the end of a calculation physical answers are obtained by taking
the real part. Using the complex version means that differential relations are often quite simple.
One has
                                          ∂u/∂x = iku

for example. In classical problems, this use of complex notation is merely a device to ease
mathematical manipulation and the things functions like u(x, t) represent are ultimately real.
In quantum mechanics we will find the novel aspect that the function that we are trying to
calculate is genuinely complex. The whole theory would be extraordinarily unwieldy if we
tried to formulate it without complex numbers; this is unlike classical problems, where using
complex solutions is helpful but not absolutely necessary. In view of its relevance to quantum
theory, we will adopt the complex convention from now on.
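To see the differential relation ∂u/∂x = iku at work numerically, here is a small finite-difference check (the values of k, ω and u0 are arbitrary choices):

```python
import cmath

# Compare a central-difference estimate of du/dx against i*k*u for the
# complex plane wave u0 exp(i(kx - wt)) at a fixed time.
k, w, u0 = 2.0, 5.0, 1.3
t = 0.7

def u(x):
    return u0 * cmath.exp(1j * (k * x - w * t))

x0, dx = 0.4, 1e-6
numeric = (u(x0 + dx) - u(x0 - dx)) / (2 * dx)
exact = 1j * k * u(x0)
print(abs(numeric - exact) < 1e-6)   # the two agree to high accuracy
```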


                                     Figure 2: Plane waves.

   The world is three-dimensional of course so we need in general to calculate with functions
u(x, t) depending on x = (x, y, z). The three-dimensional version of the (complex) plane wave is
                    u(x, t) = u0 e^{i(k·x−ωt)} = u0 e^{i(kx x + ky y + kz z − ωt)}
where
                                        k = (kx , ky , kz )
is called the wave vector. This is called a plane wave because at any given instant it is constant
on the planes defined by the condition

                                          k · x = const.

It is not hard to show that u(x, t) has the same value on any two planes separated by a
(perpendicular) distance
                                         nλ = 2πn/k
where n is an integer and we call
                                           k = |k|

the wavenumber. The interpretation is that planes defined by the condition k · x = const.
represent wave fronts and that these propagate in the direction of the vector k with a phase
velocity c = ω/k, as in the one-dimensional case. The direction of the wavevector k therefore
tells us the direction in which wave fronts travel and its magnitude k tells us the speed at which
they do so. To
see this note that if x is on a given wavefront at time t = 0 and y is on the plane this wavefront
evolves into a time t later, then
                                         k · x = k · y − ωt,
                                            k · (y − x) = ωt.
This means that
                                            d⊥ = ct
where d⊥ is the projection of y − x on the direction of k. So c gives the rate of separation of
the planes containing x and y as claimed.


                                      Figure 3: How to get d⊥ .

1.7    The wave equation
The wave equation in one-dimension is
                                  ∂²u/∂t² − c² ∂²u/∂x² = 0.

It is easily verified that the plane wave
                                      u(x, t) = e^{i(kx−ωt)}
is a solution provided
                                             c=   .
In fact one can show that any function of the form
                                u(x, t) = f (x − ct) + g(x + ct)
is a solution but this more general form will not concern us here because it does not work for
other equations we are interested in (whereas plane waves do).
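A symbolic check (assuming sympy is available) confirms that the plane wave solves the one-dimensional wave equation precisely when ω = ck:

```python
import sympy as sp

# Substitute u = exp(i(kx - wt)) into u_tt - c^2 u_xx and see what
# condition on omega and k makes the residual vanish.
x, t, k, w, c = sp.symbols('x t k omega c', real=True)
u = sp.exp(sp.I * (k * x - w * t))

residual = sp.diff(u, t, 2) - c**2 * sp.diff(u, x, 2)
residual = sp.simplify(residual / u)     # factor out the plane wave itself
print(residual)                          # proportional to c^2 k^2 - omega^2
print(sp.simplify(residual.subs(w, c * k)))   # -> 0
```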
    In three dimensions the wave equation is
                 ∂²u/∂t² − c² ( ∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z² ) = ∂²u/∂t² − c²∇²u = 0.
Here it is easily verified that the three-dimensional plane-wave
                                      u(x, t) = e^{i(k·x−ωt)}
is a solution provided
                                             c=     .
These one and three-dimensional wave equations are the first port of call whenever we try to
understand wave problems. They are ubiquitous in applied maths and mathematical physics
and describe many physical wave problems, including waves on a string, sound waves and
electromagnetic waves (although, as in the case of electromagnetic waves, the solutions might
represent components of a vector rather than a complete scalar answer). They do not
apply directly to quantum mechanics but they do provide an important indication of how a
quantum-mechanical theory might work. In the next Chapter we will look for analogous partial
differential equations that are consistent with the constraints we can place on matter waves.

1.8    Appendix: de Broglie’s relation is natural in relativity
Given the association of a frequency ω with an energy E through Planck’s formula, the asso-
ciation between momentum and wavevector is very natural in light of the theory of relativity,
independently of any of the properties of light.
    Vectors in relativity often come as part of a four-component package, called four-vectors,
the most basic of which is the four-vector
                                           X = (x, ct)
representing the position of an event in spacetime or the relative displacement between two
events. Similar combinations are the four-momentum
                                           P = (p, E/c)

or the four-wavevector
                                              K = (k, ω/c).
Physical laws are naturally stated as relationships between four-vectors. If we are told that
energy is proportional to frequency, it then appears inevitable in relativity that the relationship
can be extended to the corresponding four-vectors. We should therefore expect
                                              P = ℏK
to hold as a matter of principle. Written in separate timelike and spacelike components, this is
simply (6) above.
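Spelled out in components, the four-vector relation packages both of Planck's and de Broglie's formulas at once:

```latex
P = \hbar K
\quad\Longleftrightarrow\quad
(\mathbf{p},\, E/c) = \hbar\,(\mathbf{k},\, \omega/c)
\quad\Longleftrightarrow\quad
\mathbf{p} = \hbar\mathbf{k}, \qquad E = \hbar\omega .
```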

2     Wave Mechanics
2.1    Discovering the Schrödinger equation
Once all the elements of particle-wave duality are in place, the next step is to try to come up
with an explicit partial differential equation describing the wave properties of massive particles
such as electrons. In particular, this equation would play the same role for electrons that
Maxwell’s equations (and the wave equation derived from them) play for photons.
   In fact, it is very useful to keep this analogy with the wave equation for light in mind as we
seek the wave equation for electrons. We already know the answer in that case. Each of the
components of E and B satisfies the wave equation
                     ∂²ψ/∂t² = c²∇²ψ = c² ( ∂²ψ/∂x² + ∂²ψ/∂y² + ∂²ψ/∂z² ).
How might we “deduce” this equation from the particle properties of photons, in such a way
that the case of electrons might be treated similarly?
   All we have to go on for the moment is that every plane-wave solution,
                                      ψ(x, t) = e^{i(k·x−ωt)}
is associated with a particle travelling in free space with momentum p = ℏk and energy E = ℏω.
In fact, we could rewrite the solution in terms of these variables as
                                    ψ(x, t) = e^{i(p·x−Et)/ℏ}.                                  (7)
If we are given a solution with well-defined momentum and energy we can read off the compo-
nents of momentum by applying the vector of operators
               p̂ = (p̂x , p̂y , p̂z ) = ( (ℏ/i) ∂/∂x , (ℏ/i) ∂/∂y , (ℏ/i) ∂/∂z ) = (ℏ/i) ∇
(giving p̂ψ = pψ) and similarly associate energy with the derivative
                                          E ∼ iℏ ∂/∂t

(so that iℏ ∂ψ/∂t = Eψ). We see now that the wave equation for light is simply a statement
that the energy-momentum equation for photons
                               E² = c²p² = c² (px² + py² + pz²)

should apply to the wave-field as an operator equation.
   Let us try to do the same thing for electrons. In the nonrelativistic limit, the energy-
momentum equation for a particle of mass m is
                                          E = p²/2m.
A plane wave solution of the form (7) should then be a solution of
                  iℏ ∂ψ/∂t = (1/2m) ( p̂x² + p̂y² + p̂z² ) ψ = −(ℏ²/2m) ∇²ψ.
Now we simply say that any wave-field (which we could express as a linear superposition
of plane-waves using Fourier transform methods) associated with an electron or any other
nonrelativistic particle with mass should be a solution of the same linear equation.
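As a symbolic check (assuming sympy is available), the plane wave (7) does solve this free-particle equation exactly when E = p²/2m:

```python
import sympy as sp

# Substitute psi = exp(i(px - Et)/hbar), with E = p^2/(2m), into the 1D
# free Schroedinger equation i hbar psi_t = -(hbar^2/2m) psi_xx.
x, t, p, m, hbar = sp.symbols('x t p m hbar', real=True, positive=True)
E = p**2 / (2 * m)
psi = sp.exp(sp.I * (p * x - E * t) / hbar)

lhs = sp.I * hbar * sp.diff(psi, t)
rhs = -(hbar**2 / (2 * m)) * sp.diff(psi, x, 2)
print(sp.simplify(lhs - rhs))    # -> 0
```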
   We have effectively written down Schrödinger’s equation. Before discussing its properties,
however, a generalisation is needed. All of our discussion so far has involved particles propa-
gating freely in space, without external forces acting on them. We will more generally expect
the energy-momentum relation to involve a potential function,
                                      E = p²/2m + V(x).
It seems natural to generalise the wave equation for particles in such cases to
                              iℏ ∂ψ/∂t = −(ℏ²/2m) ∇²ψ + V(x)ψ.
This is the Schrödinger equation. Note that we might write it in the form
                                         iℏ ∂ψ/∂t = Ĥψ
where Ĥ is the differential operator
                                    Ĥ = −(ℏ²/2m) ∇² + V(x)
which will be called the Hamiltonian operator, or Hamiltonian for short.
   Before discussing the solutions of this equation in more detail, some remarks are in order.
   • We’ve written the Schrödinger equation in scalar form, whereas we know that the equation
     governing the wave properties of photons had a vector character. Why do we assume that
     such a simple form holds when we already know the case of light to be more complicated?
     The simple answer is that we’re just guessing. We’ve written down the simplest equation
     we can and now hope that it corresponds to physical reality. Only after such a comparison
     is made does it really gain credibility (and it will).

   • The plane-wave solutions we’ve written down were complex. Normally such complex so-
     lutions are a mathematical device to simplify the calculation and to get physical solutions
     we must take the real or imaginary part (see (1), for example). The Schrödinger equation
     gives us, for the first time in nature, a theory whose solutions are intrinsically complex.
     This might be taken as the first indication that the physical interpretation of ψ(x, t),
     when it comes, will be novel.
   • Notice that, in the case of photons, h cancels when we turn the energy-momentum relation
     into a differential equation. This is one of the accidents that allows the wave theory of
     light to be formulated in such a way that quantum mechanics does not appear. Had the
     photon had mass, it seems like Maxwell’s equations might well have had h-dependent
     terms in them and quantum mechanics might have appeared at an earlier stage. There is
     a second accident, however, which is more subtle but probably more important. It is the
     nature of photons that we can pile lots of them into the same solution and effectively have
     the wavefield describing the density of large numbers of photons simultaneously. In this
     case the relevant fields, E and B, become physically measurable quantities. There is an
     exclusion principle in the case of electrons, however, stating that a given wave-field can
     only describe one electron at a time. It will never then be measurable as a true density
     of electrons and does not have a classical limit as a physical field. There are certain
     particles, called bosons, for which we can allow the same wavefunction to describe many
     individuals simultaneously. It turns out to be very hard experimentally, requiring very
     low temperatures in particular, but has recently become possible and is currently a hot
     topic (referred to as Bose-Einstein condensation).

2.2    Looking for solutions
It is conventional to call ψ(x, t) the wavefunction. We still have no idea (officially) what it
represents, but let us first satisfy ourselves that the Schrödinger equation is promising as a
physical theory. The way we do this is to note that when we look for solutions we automatically
arrive at a generalisation of old quantum mechanics. In particular, Schrödinger was able to apply
it to Hydrogen and, in very short order, was able to rederive Bohr’s quantisation conditions. Not
only that, but it becomes obvious how we might in principle try to solve any other problem,
even if in practice the general solution procedure might be very hard or intractable.
    Later in the module we will see the detailed solution of the Schrödinger equation for
Hydrogen but at this early stage it is not helpful to delve so much into technical detail. Instead
we will try to see what the solution strategy is and look in general terms at how the energy-
quantisation of old quantum mechanics emerges. The first step is to take advantage of the
fact that the system is time-independent (we assume, for example, that the potential does not
depend on t). If there is a nontrivial potential V (x) the plane waves

                                   ψ(x, t) = e^{i(k·x − ωt)}

are no longer solutions of the Schrödinger equation but we can still find solutions with a
harmonic time dependence

                                   e^{−iωt} = e^{−iEt/ħ}.

That is, we look for solutions which separate into functions of space and time, of the form

                                   ψ(x, t) = e^{−iEt/ħ} ϕ(x).

Substituting in the Schrödinger equation gives

                            Ĥϕ = −(ħ²/2m)∇²ϕ + V(x)ϕ = Eϕ.

This is called the time-independent Schrödinger equation. Notice that it has the form of an
eigenvalue problem. When we substitute forms for V (x) corresponding to various physical
problems and impose reasonable boundary conditions (usually ϕ(x) → 0 as |x| → ∞), we often
find solutions corresponding to a sequence of eigenvalues

                    Ĥϕₙ(x) = Eₙϕₙ(x),        n = 1, 2, . . . .                (8)

In particular Schrödinger found for the hydrogen atom that Ĥ had the eigenvalues

                    Eₙ = − me⁴/(32π²ε₀²ħ²n²),        n = 1, 2, . . . ,
which correspond to Bohr’s set of allowed energies. This is very strong evidence that the
Schrödinger equation is on the right track. Furthermore, we know now how to tackle any other
problem in principle — write down the Hamiltonian operator and look for its eigenvalues. These
eigenvalues are then interpreted physically as the values of energy that quantum mechanics
allows the system to have.
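The prescription “write down the Hamiltonian operator and look for its eigenvalues” can be tried directly on a computer. As an illustration (not part of the notes’ development), the sketch below discretises Ĥ on a grid with finite differences and diagonalises the resulting matrix, using the harmonic potential V(x) = x²/2 in units where ħ = m = 1; the grid size and box length are arbitrary choices.

```python
import numpy as np

# Finite-difference sketch of the eigenvalue problem H phi = E phi,
# with H = -(1/2) d^2/dx^2 + x^2/2 in units where hbar = m = omega = 1.
N, L = 1000, 20.0
x = np.linspace(-L/2, L/2, N)
dx = x[1] - x[0]

# -(1/2) phi'' -> tridiagonal matrix via (phi[i+1] - 2 phi[i] + phi[i-1]) / dx^2
H = (np.diag(1.0/dx**2 + x**2/2)
     + np.diag(-0.5/dx**2 * np.ones(N - 1), 1)
     + np.diag(-0.5/dx**2 * np.ones(N - 1), -1))

E = np.linalg.eigvalsh(H)[:4]   # lowest four eigenvalues
print(E)                         # close to the exact values 0.5, 1.5, 2.5, 3.5
```

The eigenvalues approach the exact harmonic-oscillator levels (n + 1/2) as the grid is refined, which is exactly the “sequence of eigenvalues” picture of equation (8).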
    Much of modern physics reduces in practice to solving (8). In this module we will soon see
how to solve it for some simple one-dimensional problems and the physically important cases of
the simple harmonic oscillator (which underlies much of condensed matter physics, the theory
of lasers and much more) and Hydrogen (which is at the basis of our understanding of atoms
in general and therefore all of chemistry in particular). Before embarking on that programme,
though, let us reconsider the Schrödinger equation in general terms and try to come up with
some form of interpretation for the wavefunction ψ(x, t).

2.3    The beginnings of an interpretation
Once we start finding solutions of the Schrödinger equation, we soon become utterly convinced
that it is “right” because it so effortlessly reproduces and generalises the quantisation rules of
old quantum mechanics. It should make us uneasy, however, that we still have not really said
what ψ(x, t) is supposed to represent. In looking for an interpretation, it is again useful to
make the analogy with light.
    Consider (4) once again. The square modulus of the wavefunction of light is like a density
(of photons). Let’s see if the same might not be true of

                             ρ(x, t) = |ψ(x, t)|2 = ψ ∗ (x, t) ψ(x, t),

which is the nearest analogy we have for electrons. An intriguing hint that this might work as
a density is that we can define a vector j(x, t) such that the following continuity equation is
                                     ∂ρ/∂t + ∇·j = 0.
The continuity equation arises whenever we have flow of some quantity in space. Think of a
fluid where ρ(x, t) represents the mass density. Then the current j(x, t) is a vector field directed
everywhere along the fluid flow whose magnitude is the rate at which fluid passes through unit
area normal to the flow. Physically, the continuity equation says that the rate of change of
density in some small region (∂ρ/∂t) is balanced by the net influx of fluid from outside (−∇·j).
The existence of a continuity equation involving ρ(x, t) = |ψ(x, t)|² in this way therefore suggests
that |ψ(x, t)|² is a density of some sort, but again it should be stressed that this is not a “proof”
in any sense, but simply an encouragement to press ahead and see what comes of it.
    To get a continuity equation, let us define

                    j = (ħ/2im) (ψ*∇ψ − ψ∇ψ*) = (ħ/m) Im(ψ*∇ψ).

Then, assuming ψ(x, t) satisfies the (time-dependent) Schrödinger equation, we have

        ∂(ψ*ψ)/∂t = ψ* ∂ψ/∂t + (∂ψ*/∂t) ψ
                  = (1/iħ) [ ψ* (−(ħ²/2m)∇²ψ + Vψ) − ψ (−(ħ²/2m)∇²ψ* + Vψ*) ]
                  = −(ħ/2im) (ψ*∇²ψ − ψ∇²ψ*).

Using the identity

                    ∇·(ϕ∇ψ − ψ∇ϕ) = ϕ∇²ψ − ψ∇²ϕ,

we then find that

                    ∂(ψ*ψ)/∂t = −(ħ/2im) ∇·(ψ*∇ψ − ψ∇ψ*),

which, together with the definition of j, is precisely the continuity equation we were aiming for.
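The vector identity used in this derivation can be checked symbolically. Here is a quick sketch using sympy with completely arbitrary functions ϕ and ψ of three variables:

```python
import sympy as sp

# Symbolic check of: div(phi grad(psi) - psi grad(phi)) = phi lap(psi) - psi lap(phi)
x, y, z = sp.symbols('x y z')
coords = (x, y, z)
phi = sp.Function('phi')(x, y, z)   # arbitrary smooth functions
psi = sp.Function('psi')(x, y, z)

grad = lambda f: [sp.diff(f, c) for c in coords]
div = lambda v: sum(sp.diff(v[i], coords[i]) for i in range(3))
lap = lambda f: div(grad(f))

lhs = div([phi*dp - psi*df for dp, df in zip(grad(psi), grad(phi))])
rhs = phi*lap(psi) - psi*lap(phi)
assert sp.simplify(lhs - rhs) == 0   # identity holds identically
```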
    Schrödinger found the fluid analogy suggested by the continuity equation so tempting that he
initially thought that the particle was genuinely smeared over space and that ρ(x, t) represented
a literal mass density. Such an interpretation is rapidly seen to be inconsistent with physical
reality, however. In practice, wavefunction solutions are often found to spread very rapidly in
space whereas electrons, when observed, always seem to be point-like. Such an interpretation
would not be consistent with the observation of particle tracks in cloud chambers, for example.

2.4    Born’s interpretation of the wavefunction
The final interpretation is usually credited to Born. It is now accepted that ρ(x, t) is a
probability density: the probability of finding the particle in a small volume dV around x at
time t is ρ(x, t) dV. Before looking for it, we can have no idea where the particle is, other than
that we are more likely to find it where ρ(x, t) is large. If we make the same measurement on
many identical systems, all described by the same wavefunction — extremely difficult in practice
but always possible as a thought experiment —
we will obtain a different result every time. Quantum mechanics will never tell us what happens
in any individual experiment. It only says what happens on average when the results of many
identical experiments are collated.
    Many people feel cheated by this interpretation and think that we should be able to give
a more complete description of reality. The interpretation above seems to work nonetheless —
no one has ever constructed an experiment for which it is inadequate — and however strange
it seems it does end up giving a logically consistent theory. We will develop this interpretation
and the formalism that goes with it in more detail later — the fully developed version is
often referred to as the “Copenhagen interpretation” (the people who developed it were often
associated with Bohr’s school in Copenhagen) or the “orthodox interpretation.”
    In quantum mechanics we deal with one electron at a time. In that case we demand that
the probability that the electron can be found somewhere is unity,

                             ∫ ρ(x, t) dV = ∫ |ψ(x, t)|² dV = 1.

Since the Schrödinger equation is linear, we are free to multiply any solution by a constant in
order to ensure that this condition is satisfied. This process is called normalisation, and the
wavefunction thus obtained is said to be normalised. Defining the inner product between two
arbitrary functions to be
                                ⟨ψ|ϕ⟩ = ∫ ψ*(x) ϕ(x) dV,
the normalisation condition can be written as

                                        ⟨ψ|ψ⟩ = 1.
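Numerically, normalisation is a one-line rescaling. A minimal sketch, with an arbitrary (complex) Gaussian trial function and a simple Riemann sum for the integral:

```python
import numpy as np

# Normalise a trial wavefunction on a grid (one dimension, values illustrative).
x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]
psi = (1 + 1j) * np.exp(-x**2)          # unnormalised; complex for generality

norm2 = np.sum(np.abs(psi)**2) * dx     # <psi|psi> before normalisation
psi = psi / np.sqrt(norm2)

print(np.sum(np.abs(psi)**2) * dx)      # now 1 to numerical accuracy
```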

Note that the quantity we think of as physical, ρ(x, t), does not change if we multiply the
wavefunction by a complex phase

                                  ψ(x, t) → e^{iθ} ψ(x, t).

This is an example of gauge-invariance. There is never any reason to choose one phase
convention over any other, and this indicates that ψ(x, t) is not itself a directly measurable
quantity.
We can only probe it indirectly by making measurements and making deductions about what
the density ρ(x, t) must have been like. (Note also that, after the measurement the state of the
system will have changed because the measurement process will have disturbed the system).
   Note that, in the case of a solution
                                  ψ(x, t) = e^{−iEt/ħ} ϕ(x)

obtained from the time-independent Schrödinger equation, the density

                                  ρ(x, t) = |ψ(x, t)|2 = |ϕ(x)|2

is independent of time. For this reason such solutions are often referred to as stationary states.
They allow a hand-waving argument for why an electron in an atom does not radiate away its
energy through acceleration and fall into the centre. The charge distribution −eρ(x) associated
with a stationary state is independent of time, even though it is associated with a classically
moving electron. In classical electromagnetism such a stationary charge distribution does not
radiate energy. This is yet another hole in Bohr’s formalism (sort of) filled in, though it should
be emphasised that a completely rational treatment of radiation needs a more sophisticated
treatment of light which we will not be going into.
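That the density of a stationary state never changes can be confirmed in a couple of lines; the values of E and ϕ(x) below are arbitrary illustrations (ħ = 1):

```python
import numpy as np

# |e^{-iEt} phi(x)|^2 is independent of t (hbar = 1, values illustrative).
x = np.linspace(-5.0, 5.0, 101)
phi = np.exp(-x**2 / 2)                  # some spatial profile
E = 1.3                                  # an arbitrary energy

for t in (0.0, 0.7, 2.5):
    psi = np.exp(-1j * E * t) * phi
    assert np.allclose(np.abs(psi)**2, np.abs(phi)**2)
```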

2.5    Appendix: the continuity equation
A very common situation in applied maths is where we follow the evolution in time of a density
ρ(x, t) of “stuff” where, depending on the application, “stuff” can represent mass, heat, electric
charge and so on. Whenever stuff is moving around we also have an associated current which is
represented mathematically by a vector field j(x, t). Its direction tells us the direction in which
stuff is moving at a given point and time and its magnitude |j| tells us how much stuff flows in
the direction of j per unit time per unit area perpendicular to j. A universal feature of such a
current-density combination is that, if the total amount of stuff is conserved, they satisfy the
continuity equation
                                  ∂ρ/∂t + ∇·j = 0.                                      (9)
The fact that we can get this equation in the quantum mechanical case is an important step in
interpreting the wavefunction and here we will see where it comes from in the general context.
    For simplicity, let us assume a one-dimensional model where ρ(x, t) is the amount of stuff
per unit length. Current can only flow in one direction in one dimension so j has only one
component, which we denote by j(x, t). This tells us the net amount of stuff flowing per unit
time past the point x. The current is positive if stuff is flowing to the right and negative if it
is flowing to the left. The amount of stuff in the interval (a, b) at time t is
                              N(t) = ∫_a^b ρ(x, t) dx.

If the interval (a, b) is kept fixed, the rate of change of this quantity is
                              dN/dt = ∫_a^b ∂ρ(x, t)/∂t dx.
This tells us how much stuff enters the interval (a, b) per unit time. If stuff is neither created
nor destroyed — that is, if there is a law of conservation of stuff — then this must be accounted
for by the net current entering the interval at a and leaving at b. That is
                              dN/dt = j(a) − j(b).
But we have,
                              j(a) − j(b) = −∫_a^b ∂j(x, t)/∂x dx.

So putting all this together means that
                     ∫_a^b ∂ρ(x, t)/∂t dx = −∫_a^b ∂j(x, t)/∂x dx.
Since this holds for any interval (a, b) we must have

                                  ∂ρ/∂t + ∂j/∂x = 0.
This is the continuity equation in one dimension.
   This generalises in three dimensions to
                      ∂ρ/∂t + ∂j_x/∂x + ∂j_y/∂y + ∂j_z/∂z = 0,

which is nothing but (9) in component form. The derivation in three dimensions follows the
same principle but we look at the rate of change of stuff in a volume rather than in an interval
and where we used simple integration above, we use vector calculus and the divergence theorem
to manipulate this rate of change.
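The bookkeeping above can be demonstrated with a discrete model: update ρ using the flux difference j(x) − j(x − dx), and the total amount of stuff stays fixed. This is a sketch for the simple advection current j = vρ on a periodic grid; all parameters are illustrative.

```python
import numpy as np

# Discrete continuity equation: rho_new[i] = rho[i] - (dt/dx) * (j[i] - j[i-1]),
# with current j = v * rho and periodic boundaries (parameters illustrative).
N, v, dx, dt = 200, 1.0, 0.05, 0.02
x = np.arange(N) * dx
rho = np.exp(-(x - 3.0)**2)              # initial blob of "stuff"
total0 = rho.sum() * dx                  # total amount at t = 0

for _ in range(100):
    j = v * rho                          # the current
    rho = rho - (dt / dx) * (j - np.roll(j, 1))

print(abs(rho.sum() * dx - total0))      # conserved, up to rounding error
```

The blob moves to the right, but because each cell loses exactly what its neighbour gains, the total is conserved to machine precision.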

3     Some one-dimensional problems
The big problem to be solved once the Schrödinger equation is written down is the hydrogen
atom. This allows us to reproduce Bohr’s results, but with greater understanding, and points the
way to solving more complicated problems. A full solution of Hydrogen requires more sophisti-
cated techniques than we have at our disposal at present, however, so we put this problem off
until later in the module. One-dimensional problems already exhibit the essential features we
need and we begin by solving some examples which can be solved by fairly elementary tech-
niques (ordinary differential equations with constant coefficients). Note that these problems
are often of interest in their own right and not merely as an academic exercise — problems such
as the ones we solve here are often used in modelling semiconductor devices and other systems
of practical interest.
    We start with the general one-dimensional problem. A particle moving under the influence
of a potential V (x) has the time-independent Schrödinger equation

                     −(ħ²/2m) d²ψ(x)/dx² + V(x)ψ(x) = Eψ(x).
Notice that we use the symbol ψ for the time-independent wavefunction in this Chapter. It is
convenient to rewrite this in the form
                     ψ''(x) + (2m(E − V(x))/ħ²) ψ(x) = 0.
Define the local wavenumber k(x) and momentum p(x) implicitly by
                     k(x)² = p(x)²/ħ² = 2m(E − V(x))/ħ².
Then the Schrödinger equation becomes

                     ψ''(x) + k(x)² ψ(x) = 0.
Quantum mechanics in one dimension therefore reduces to solving second-order linear differ-
ential equations. We can of course only find explicit solutions in special cases and we will
restrict ourselves in this Chapter to problems where k(x) is locally constant. Before tackling
these explicit problems, however, it is worth noting that the general qualitative features of the
solution can be neatly related to the behaviour of classical trajectories in various regions of the
line. We state here what the situation is without proof but note that these features will be
evident in the explicit problems we do later.
    The essential features of any solution ψ(x) near a given x can be read from the value of
k(x)2 there and in particular from its sign. We identify two kinds of behaviour.
    • The classically allowed region is defined by the condition E > V (x). Here there is sufficient
      energy for the particle to climb the potential and have enough left over for a positive
      kinetic energy. In particular we can find meaningful solutions of the classical equations of
      motion in which the particle moves here (hence the name). In the Schrödinger equation,
      we find that k(x)² > 0 in the classically allowed region, so k(x) is real. The solutions
      are oscillatory, as we will see shortly for the case where k is constant.

   • The classically forbidden region is defined by the condition E < V (x). There is not
     enough energy for the particle to enter such regions, which would necessitate a negative
     kinetic energy. In the Schrödinger equation, we have k(x)² < 0, so k(x) is imaginary.
     While this region is forbidden to classical solutions, there is nothing to prevent us from
     finding solutions to the Schrödinger equation there. We find merely that the solutions
     are of a different nature to those in the allowed region. Instead of oscillating they depend
     evanescently on x. In the case of constant potential, for example, we find that solutions
     in the forbidden region are real exponentials of the type e^{±κx}, where k = iκ.
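The two behaviours can be seen by integrating ψ'' + k²ψ = 0 with a constant k² of either sign; the following sketch uses scipy with illustrative values k² = ±4:

```python
import numpy as np
from scipy.integrate import solve_ivp

# psi'' = -k2 * psi with psi(0) = 1, psi'(0) = 0, for k2 of either sign.
def integrate(k2):
    rhs = lambda x, y: [y[1], -k2 * y[0]]
    return solve_ivp(rhs, (0.0, 5.0), [1.0, 0.0], max_step=0.01)

allowed = integrate(4.0)      # k^2 > 0: oscillatory, psi(x) = cos(2x)
forbidden = integrate(-4.0)   # k^2 < 0: evanescent growth, psi(x) = cosh(2x)

print(np.abs(allowed.y[0]).max())   # stays of order 1
print(forbidden.y[0][-1])           # grows to cosh(10), roughly 1.1e4
```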
    One of the surprising features of quantum mechanics is that, because ψ(x) need not vanish in
classically forbidden regions, we find a nonzero probability of finding the particle there. Physical
effects of this feature are referred to as tunnelling. The idea is that we imagine the particle
penetrating into the hillside of a potential as if through a tunnel. In particular, we often find
that there is a possibility that the particle can pass completely through a potential barrier even
though it does not have enough energy to go over the top. This is the essential mechanism for
atomic nuclei to decay, for example, and the unpredictability inherent in quantum mechanics
is at the root of our inability to know when any given nucleus will decay.
    Let us now look at some concrete examples.

3.1    A particle in a box
The most important feature of the Schrödinger equation is that it often leads to solutions only
for a discrete set of quantised energies. This will generally happen in problems where a particle
would classically be confined to a region of finite volume in space. The simplest model exhibiting
this behaviour corresponds to a particle being confined to a box in one dimension. We suppose
the box occupies the interval −a < x < a and that the potential vanishes in this interval so
that classically the particle moves as a free particle there.
    The particle is absolutely forbidden from leaving the box so we insist that the wavefunction
vanishes outside this interval. We also demand that the wavefunction be continuous. In other
words, ψ(x) should be a solution of the equation

                     ψ''(x) + k²ψ(x) = 0   for −a < x < a,                        (10)

where k = √(2mE)/ħ, subject to the boundary conditions

                     ψ(x) → 0 as x → ±a.                                          (11)

This problem is sometimes alternatively stated as solving the Schrödinger equation subject to
the infinite square-well potential

                     V(x) =  0   for |x| < a,
                             ∞   for |x| > a.
The infinite potential outside the box forces the wavefunction to vanish there and gives us the
boundary conditions (11) above.
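Equations (10) and (11) form a boundary-value problem, and a shooting method makes the quantisation mechanism vivid: integrate from x = −a with ψ(−a) = 0 and scan k until the solution also vanishes at x = +a. A sketch (a = 1 and the tolerances are illustrative):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

a = 1.0

def psi_at_a(k):
    # integrate psi'' = -k^2 psi from x = -a with psi = 0, psi' = 1
    rhs = lambda x, y: [y[1], -k**2 * y[0]]
    sol = solve_ivp(rhs, (-a, a), [0.0, 1.0], rtol=1e-10, atol=1e-12)
    return sol.y[0, -1]

# Quantised wavenumbers are the zeros of psi_at_a; the first lies near pi/2.
k1 = brentq(psi_at_a, 1.0, 2.0)
print(k1, np.pi / (2 * a))    # the two agree: k_1 = pi/(2a)
```

Only for these special values of k (and hence of the energy) does a solution satisfy both boundary conditions at once; for any other k the shot misses at x = +a.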

Figure 4: A schematic representation of a particle in a box. We may interpret the problem
as one with a potential which goes to ∞ outside the box. Since the graph is rectangular, the
potential is often referred to as an infinite square well potential.

    Equation (10) has the general solution

                            ψ(x) = A cos kx + B sin kx.

The boundary conditions give

                              0 = A cos ka + B sin ka,
                              0 = A cos ka − B sin ka,

so A cos ka = 0 = B sin ka and for nontrivial solutions we have either

                     A = 0 = sin ka    or    B = 0 = cos ka.

We find solutions only if

                               k = kₙ = nπ/(2a)

for positive integers n, and these are of the forms

                     ψ(x) = A cos kₙx    for n = 1, 3, 5, · · ·
                     ψ(x) = B sin kₙx    for n = 2, 4, 6, · · · .

In particular we get solutions only if the energy is one of the quantised values

                     Eₙ = ħ²kₙ²/(2m) = ħ²π²n²/(2mL²),

where we now use the length L = 2a of the box. Each of these is referred to as an energy level
and they are commonly represented graphically by horizontal lines on a vertical scale which
can be compared with the potential function V (x).

                  Figure 5: The energy levels for the particle in a box.

    It remains to determine the constants A and B in the wavefunction. These are chosen so
that
                ∫_{−∞}^{∞} |ψ(x)|² dx = ∫_{−a}^{a} |ψ(x)|² dx = 1

and it is easily seen that we can achieve this in each case by choosing A = B = √(2/L). The
normalised wavefunctions are therefore

                     ψₙ(x) = √(2/L) cos(nπx/L)    for n = 1, 3, 5, · · ·
                     ψₙ(x) = √(2/L) sin(nπx/L)    for n = 2, 4, 6, · · · .

A noteworthy feature of these solutions is that they alternate in symmetry. That is

                     ψₙ(−x) = (−1)^{n+1} ψₙ(x),                                   (12)

so the first state with n = 1 is even, the next with n = 2 is odd, and so on.

3.2    Particle in a finite square well

A second example of a confining potential is the finite square well defined by the conditions

                     V(x) =  −V₀    for |x| < a,
                              0     for |x| > a,

where V₀ is a constant, assumed positive in this section. If the previous problem is visualised as
a particle bouncing between brick walls, this one might be thought of as a particle confined by
penetrable walls — with enough energy it punches through and escapes (albeit with reduced
kinetic energy) but it is reflected if the energy is too low.
                  Figure 6: The finite square well, with V = 0 in the outer regions
                  I (x < −a) and III (x > a) and the well region II (−a < x < a).

   Here we consider energies in the range
                                             −V0 < E < 0
corresponding to confinement. The Schrödinger equation can be written

                     ψ''(x) + k²ψ(x) = 0   in |x| < a,
                     ψ''(x) − κ²ψ(x) = 0   in |x| > a,

where
                     k = √(2m(E + V₀))/ħ    and    κ = √(−2mE)/ħ.
As before we insist that the solution should be continuous. We also impose continuity on the
derivative, in particular at x = ±a where the potential is discontinuous — otherwise the current
would be discontinuous there, and we would have to account for the creation of particles at
x = ±a.
    We can find continuous solutions to the Schrödinger equation for any value of E. However,
only at a discrete set of values can we find solutions which are square-integrable. In the present
context, square-integrability means we insist that the wavefunction decays exponentially in the
forbidden regions I and III of figure 6. The solution is therefore of the form

                     ψ(x) =  C e^{κx}                 for x < −a,
                             A cos kx + B sin kx      for −a < x < a,
                             D e^{−κx}                for x > a.
Now, one of the exercises shows that since the potential is symmetric with respect to reflection
about the origin, V (−x) = V (x), the solution must be either an even or an odd function of x
(as in (12)). Let us consider the even case first. An even solution is of the form
                                        De              for x < −a,
                             ψ(x) =        A cos kx for −a < x < a,
                                                         for x > a.

Imposing continuity of the wavefunction and its derivative then gives

\[ \kappa a = ka \tan ka. \]

This transcendental equation has solutions only for a finite number of energies E_n, which are
the quantised energy levels of the system. We obtain a similar condition

\[ \kappa a = -ka \cot ka \]

for the energy levels corresponding to odd solutions.
    The details of this calculation are left as an exercise as outlined in the problem set. The
important point is that by demanding that the solution be square-integrable, effectively impos-
ing the condition that it decay at infinity, we find solutions only for a discrete set of energies.
This is a general feature of problems where the classical motion is confined to a region in space
of finite volume (or more properly of length in one dimension).
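The transcendental conditions above are easily solved numerically. The following sketch (Python; the dimensionless well depth z0 = a√(2mV0)/ħ and the function names are introduced for the illustration, not taken from the notes) finds the even-parity roots u = ka of κa = ka tan ka by bisection, using the constraint u² + (κa)² = z0².

```python
import math

def bisect(f, a, b, steps=200):
    """Simple bisection; assumes f(a) and f(b) have opposite signs."""
    fa = f(a)
    if fa * f(b) > 0:
        return None
    for _ in range(steps):
        m = 0.5 * (a + b)
        if fa * f(m) <= 0:
            b = m
        else:
            a, fa = m, f(m)
    return 0.5 * (a + b)

def even_levels(z0):
    """Roots u = ka of u*tan(u) = sqrt(z0^2 - u^2): the even-parity
    quantisation condition for a well of dimensionless depth z0."""
    f = lambda u: u * math.tan(u) - math.sqrt(max(z0**2 - u**2, 0.0))
    roots = []
    n = 0
    # u*tan(u) rises from 0 towards +infinity on each branch [n*pi, n*pi + pi/2)
    while n * math.pi < z0:
        lo = n * math.pi + 1e-9
        hi = min(n * math.pi + math.pi / 2 - 1e-9, z0)
        r = bisect(f, lo, hi) if lo < hi else None
        if r is not None:
            roots.append(r)
        n += 1
    return roots

print(even_levels(1.0))   # a single root: even a shallow well binds one state
print(even_levels(10.0))  # four even-parity roots for a deeper well
```

Each root u_n gives an energy E_n = −V0 + ħ²u_n²/(2ma²), so a deeper well (larger z0) supports more bound states, in line with the discreteness argument above.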

3.3    Scattering from a step
In the examples so far we have found that the energy is quantised when the classical motion is
confined to a finite region. The situation changes if we consider problems where the particle is
not confined classically. We illustrate such scattering problems with the case of a step potential
of the form
\[ V(x) = \begin{cases} 0 & \text{for } x < 0, \\ V_0 & \text{for } x > 0, \end{cases} \]
illustrated in figure 7.

[Figure 7: The step potential, and a schematic representation of the various coefficients in the general solution: a (right-going) and b (left-going) for x < 0; c (right-going) and d (left-going) for x > 0.]

    In this case classical motion is always unbound. If 0 < E < V0 , a particle coming from
the left is reflected at the step at x = 0 because it does not have enough energy to enter the
region x > 0, and returns to negative infinity. If E > V0 the particle has enough energy to
go everywhere. If it comes from the left, then on reaching x = 0 it loses kinetic energy and
slows down, but keeps going. Notice that classical particles are therefore either reflected or
transmitted at the step with probability one, depending on the value of E.
    In the wave-mechanical picture we solve the Schrödinger equation to find solutions of the form
\[ \psi(x) = \begin{cases} a e^{ikx} + b e^{-ikx} & \text{for } x < 0, \\ c e^{ik_0 x} + d e^{-ik_0 x} & \text{for } x > 0, \end{cases} \]
with
\[ k = \frac{\sqrt{2mE}}{\hbar} \quad\text{and}\quad k_0 = \frac{\sqrt{2m(E - V_0)}}{\hbar}. \]
We treat the case E > V0 here. As before, we impose continuity on the wavefunction and its
derivative. These conditions give, respectively,

                                          a+b = c+d
                                      k(a − b) = k0 (c − d).

It is convenient to represent this information using a transfer matrix, which tells us how to find
the coefficients for the solution on the right given the coefficients for the solution on the left,

\[ \begin{pmatrix} c \\ d \end{pmatrix} = \frac{1}{2k_0} \begin{pmatrix} k_0 + k & k_0 - k \\ k_0 - k & k_0 + k \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix}. \]

To specify the solutions any further we need to ask how the solutions behave at infinity.
    In the case of bound states, simply stating that the solution should not blow up as x → ±∞
is enough to determine the remaining coefficients (up to an overall normalisation constant) and
give quantisation of energy. In the present case of scattering states, we get equally valid solutions
no matter what the values of the remaining constants. Such solutions are not normalisable, so
effectively they represent an infinite number of particles, but we can still give physical meaning
to the solutions. They represent a steady stream of particles hitting the step, some going to the
right (as represented by the solutions eikx and eik0 x ) and some going to the left (as represented
by the solutions e−ikx and e−ik0 x ). Let us compute the current on the left-hand side where

\[ \psi(x) = a e^{ikx} + b e^{-ikx}. \]

\[ \begin{aligned} j(x) &= \frac{\hbar}{m} \operatorname{Im}\left[ \psi^*(x)\psi'(x) \right] \\ &= \frac{\hbar}{m} \operatorname{Im}\left[ ik\left( |a|^2 - |b|^2 + ab^* e^{2ikx} - a^* b e^{-2ikx} \right) \right] \\ &= \frac{\hbar k}{m}\left( |a|^2 - |b|^2 \right) \\ &= v\left( |a|^2 - |b|^2 \right) \end{aligned} \]

where v is the velocity of particles on the left. Then the part aeikx represents right-going
particles and is responsible for the positive current v|a|2 while be−ikx represents left-going par-
ticles and is responsible for the negative component −v|b|2 of the current. We get a similar
decomposition of the current on the right.
    A particularly interesting case is where we send a steady stream of particles from the left.
These hit the step and may be reflected back to the left or transmitted forward to the right.
The essential feature of this situation is that we do not have particles coming towards the step
from the right and d = 0 in the solution above. It is traditional to normalise such a scattering
state so that it is written in the form
\[ \psi(x) = \begin{cases} e^{ikx} + r e^{-ikx} & \text{for } x < 0, \\ t e^{ik_0 x} & \text{for } x > 0, \end{cases} \]
where r and t are referred to as the reflection and transmission coefficients respectively. These
coefficients are related to each other using the transfer matrix as follows

\[ \begin{pmatrix} t \\ 0 \end{pmatrix} = \frac{1}{2k_0} \begin{pmatrix} k_0 + k & k_0 - k \\ k_0 - k & k_0 + k \end{pmatrix} \begin{pmatrix} 1 \\ r \end{pmatrix} \]

and this gives in turn
\[ r = \frac{k - k_0}{k + k_0} \quad\text{and}\quad t = \frac{2k}{k + k_0}. \]
Notice that in contrast to the classical situation, where all particles are transmitted when
E > V0 , the quantum-mechanical solution indicates that some particles are reflected since the
reflection coefficient is nonzero. A calculation of the current as above yields
\[ j(x) = \begin{cases} (1 - |r|^2)\, v & \text{for } x < 0, \\ |t|^2\, v_0 & \text{for } x > 0, \end{cases} \]
where v = ħk/m is the velocity on the left and v0 = ħk0/m the velocity on the right. It is easy
to verify using the expressions for r and t above that the current is the same on both sides.
This allows us to write
\[ 1 = |r|^2 + \frac{v_0}{v}\,|t|^2. \]
We then interpret R = |r|² as the probability that a particle is reflected by the step and
T = (v0/v)|t|² as the probability that it is transmitted.
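These relations are simple to verify numerically; in the sketch below (Python, with arbitrary illustrative values of k and k0 satisfying E > V0) the identity R + T = 1 holds to machine precision.

```python
# Check current conservation at the step: (1 - |r|^2) v = |t|^2 v0,
# i.e. R + T = 1 with R = |r|^2 and T = (v0/v)|t|^2 = (k0/k)|t|^2.
k, k0 = 1.3, 0.7            # illustrative wavenumbers; E > V0 so both are real

r = (k - k0) / (k + k0)     # reflection amplitude
t = 2 * k / (k + k0)        # transmission amplitude

R = abs(r) ** 2
T = (k0 / k) * abs(t) ** 2  # velocity ratio v0/v = k0/k since v = hbar*k/m

print(R, T, R + T)          # R + T comes out as 1
```

Note that without the velocity ratio v0/v, the probabilities would not sum to one: t is an amplitude ratio, and the flux it carries is weighted by the speed on the right.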
    We conclude by noting briefly what features of the above solution are different if 0 < E < V0
(the details of this are left as an exercise). The solution on the left is the same as before but
we must substitute iκ = i√(2m(V0 − E))/ħ for k0 in the solution on the right. The expressions for
the reflection and transmission coefficients then become
\[ r = \frac{k - i\kappa}{k + i\kappa} \quad\text{and}\quad t = \frac{2k}{k + i\kappa}. \]
Notice in particular that
                                                 |r|2 = 1,

which indicates that all particles are reflected. They do not have enough energy classically to
enter the region x > 0, and quantum-mechanically the wavefunction is a decaying exponential
ψ(x) = t e^{−κx} there. While they can penetrate for a bit, even quantum-mechanically all particles
are eventually sent back. Note that the expressions we derived for the current cannot be easily
transcribed from the E > V0 case when x > 0 because when we take the imaginary part in the
formula for j(x), the replacement of k0 by iκ changes things. We find in fact that the current
vanishes on both sides in this case, which means that particles moving to the right are always
balanced by particles moving to the left.
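That |r| = 1 below the step follows because the numerator and denominator of r are complex conjugates of each other; a one-line numerical check (with arbitrary illustrative values of k and κ):

```python
# Below the step (0 < E < V0): r = (k - i*kappa)/(k + i*kappa) is a pure phase,
# so every particle is ultimately reflected, though with a phase shift.
k, kappa = 1.0, 0.8                      # illustrative real parameters
r = (k - 1j * kappa) / (k + 1j * kappa)
print(abs(r))                            # modulus 1 up to rounding
```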
    Finally, we finish by emphasising that in this scattering system we have found that the
energy is not quantised. Equally valid solutions can be found for all positive energies. This is
a general feature of problems in which particles can escape to infinity.

4     State space, observables and the formalism of quan-
      tum mechanics
We are now in a position to start describing the general principles of quantum mechanics.
This is usually done by formulating a series of postulates. These take the interpretations of
quantum mechanical results which have seemed “not unreasonable” to us as we have developed
the basic equations, and formulate them so that they are more precise and valid for general
situations. There is no proof of these postulates, of course, but they form a self-consistent
interpretation of quantum mechanics which, in the three-quarters of a century since its birth,
has proved consistent with every observation made of a quantum mechanical system.
    We first state them for a single particle moving in a potential. Once the mathematical tools
have been developed further, these can be generalised in a natural way to any quantum system.
Note that the order and numbering of the postulates below is not universal and we might well
find it useful to rearrange the contents when we consider more general systems later.

4.1    State space
We begin by describing two postulates which establish how we describe the state of a quantum-
mechanical system.

Postulate 1: The state of a one-particle system is represented by a wavefunction

                                              ψ(x, t),

depending on position x and in general evolving in time. Everything we can hope to know about
the system is determined from ψ(x, t).

Postulate 2: We associate a probability density

                                        ρ(x, t) = |ψ(x, t)|2

with ψ(x, t) so that the probability of finding the particle inside a region D of space is

\[ p(x \in D) = \int_D \rho(x, t)\, dV. \]

In much of what follows we consider wavefunctions at a fixed instant in time, so it may be
convenient to suppress time in the notation and refer simply to ψ(x) or even to ψ.
    We will consider states which describe a single particle (and will leave aside for now the
issue of scattering states of the type encountered in the previous Chapter). In that case we can
meaningfully associate a wavefunction ψ to the state of a particle if

\[ \|\psi\|^2 = \int |\psi|^2\, dV < \infty, \]

the integration being taken over all space. Such a wavefunction is said to be square-integrable
or normalisable and allows us to define a normalised wavefunction

\[ \hat{\psi} = \frac{\psi}{\|\psi\|} \]
for which the total probability of finding a particle somewhere in space is one. In addition we
might want to consider further conditions such as continuity or smoothness. It turns out that in
the first instance it is convenient to impose only the condition that the function be sufficiently
unpathological that it be integrable — we will not go into the technical details here — and
impose additional constraints such as continuity or differentiability as we need them later.
    The set of square-integrable (normalisable) wavefunctions is closed under addition and multi-
plication by complex constants. That is, if ψ and ϕ can describe a quantum state then so can
\[ \chi = \alpha\psi + \beta\varphi \]
for any complex constants α and β. So state space has the structure of a vector space. Furthermore
we can define on state space an inner product

\[ \langle \varphi | \psi \rangle = \int \varphi(x)^* \psi(x)\, dV. \]

This inner product plays a key role in quantum mechanics. The key properties of any inner
product are

   • Linearity: ⟨χ|αϕ + βψ⟩ = α⟨χ|ϕ⟩ + β⟨χ|ψ⟩

   • Conjugate symmetry: ⟨ϕ|ψ⟩* = ⟨ψ|ϕ⟩

   • Positivity and nondegeneracy: ⟨ψ|ψ⟩ ≥ 0, with equality iff ψ = 0.

So we have established that state space is naturally thought of as an inner product space, that
is, a vector space with an inner product. In fact it has the slightly stronger property of being a
Hilbert space. A Hilbert space is an inner product space with certain nice properties to do with

the convergence of sequences. We will never use these additional properties in this module,
but it is standard terminology in quantum mechanics to refer to state space simply as “Hilbert
space” and we will use the same terminology, if only to communicate with text books. In honour
of this terminology we often denote state space by the symbol H.
    A useful property of the inner product which follows from the conditions above is the
Schwarz inequality: |⟨ϕ|ψ⟩| ≤ ‖ψ‖ · ‖ϕ‖, with equality iff αϕ + βψ = 0 for some α and β,
where in general we denote, for any ψ, its norm by
\[ \|\psi\| = \sqrt{\langle \psi | \psi \rangle}. \]
In particular this means that ⟨ϕ|ψ⟩ is finite for any two square-integrable states ψ and ϕ.
   An important tool in the analysis of quantum-mechanical systems is the orthonormal basis.
An orthonormal basis is a sequence of functions,
                                    ϕn (x),   n = 1, 2, 3, · · · ,
which satisfy
\[ \langle \varphi_n | \varphi_m \rangle = \delta_{nm} \]
(so they are orthonormal) and which are such that any state ψ can be written as a linear
combination of them
\[ \psi = \sum_n c_n \varphi_n \]
(so they form a basis). In working with an orthonormal basis of functions it is useful to keep in
mind a simple geometrical analogy with unit vectors in ℝ³.
Example: Consider the expansion
                                        x = xi + yj + zk
of an arbitrary vector in ℝ³ in terms of the unit vectors i, j and k. The unit vectors are
orthonormal with respect to the dot product because
                     i · j = j · k = k · i = 0 and i · i = j · j = k · k = 1.
Given the vector x, we can find the components x, y and z by forming the projections
                           x = i · x,      y = j · x and z = k · x.

    A similar construction works for orthonormal bases in state space except that the components
are complex and there are more of them (infinitely many in fact). Given the expansion
ψ = Σₙ cₙϕₙ, we can compute the coefficients cₙ from the inner products
\[ \begin{aligned} \langle \varphi_n | \psi \rangle &= \Big\langle \varphi_n \,\Big|\, \sum_m c_m \varphi_m \Big\rangle \\ &= \sum_m c_m \langle \varphi_n | \varphi_m \rangle \\ &= \sum_m c_m \delta_{nm} \\ &= c_n. \end{aligned} \]

It is useful to note that we can therefore represent any state in the form

\[ \psi = \sum_n \varphi_n \langle \varphi_n | \psi \rangle. \]

Other useful identities follow similarly from the orthonormality of the basis. Let the states ψ
and χ have the expansions

\[ \psi = \sum_n c_n \varphi_n \quad\text{and}\quad \chi = \sum_n a_n \varphi_n. \]

Then their inner product can be written

\[ \begin{aligned} \langle \chi | \psi \rangle &= \Big\langle \sum_n a_n \varphi_n \,\Big|\, \sum_m c_m \varphi_m \Big\rangle \\ &= \sum_{n,m} a_n^* c_m \langle \varphi_n | \varphi_m \rangle \\ &= \sum_{n,m} a_n^* c_m \delta_{nm} \\ &= \sum_n a_n^* c_n \\ &= \sum_n \langle \chi | \varphi_n \rangle \langle \varphi_n | \psi \rangle, \end{aligned} \]

which is similar to the identity

\[ \mathbf{x}_1 \cdot \mathbf{x}_2 = x_1 x_2 + y_1 y_2 + z_1 z_2 \]

for 3D vectors, except that we have in addition the complex conjugation of a set of components.
A special case of this is that the squared norm of a state is
\[ \|\psi\|^2 = \langle \psi | \psi \rangle = \sum_n |c_n|^2 = \sum_n |\langle \varphi_n | \psi \rangle|^2, \]

which is analogous to
\[ |\mathbf{x}|^2 = x^2 + y^2 + z^2. \]

4.2    Observables, operators and the Hermitian conjugate
The term observable is used in quantum mechanics for any property of a system which we
might hope to measure and give physical meaning to (the practicality of any such observation
will not concern us). In practice, the observables we meet will be positions, momenta, energies
and things which are functions of these quantities. We have already seen that momentum

components and energy are associated in quantum mechanics with operators which act linearly
on wavefunctions. The vector of operators

\[ \hat{\mathbf{p}} = (\hat{p}_x, \hat{p}_y, \hat{p}_z) = \left( \frac{\hbar}{i}\frac{\partial}{\partial x}, \frac{\hbar}{i}\frac{\partial}{\partial y}, \frac{\hbar}{i}\frac{\partial}{\partial z} \right) = \frac{\hbar}{i}\nabla \]

corresponds to the three components of momentum and the Hamiltonian operator

\[ \hat{H} = -\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{x}) \]
corresponds to energy. The eigenvalues of Ĥ in particular corresponded to the energies “allowed”
by quantum mechanics. The next two postulates assert that this carries over to any observable.
Postulate 3: To every observable O there corresponds a Hermitian operator Ô acting on wave-
functions ψ(x).
Postulate 4: A measurement of the observable O can only give a value in the spectrum of Ô. If
Ô has discrete eigenvalues
\[ \hat{O}\varphi_n = \lambda_n \varphi_n, \]
the allowed values of O are therefore quantised. Furthermore if the particle is in a state described
by an eigenfunction ϕn , then a measurement must yield the value λn with certainty.

  We need to explain some terminology (particularly “Hermitian”) before these postulates fully
make sense, but let us begin by listing operators corresponding to some common observables:

   • Momentum: p̂ₓψ = (ħ/i) ∂ψ/∂x

   • Kinetic energy: T̂ψ = −(ħ²/2m)∇²ψ

   • Total energy: Ĥψ = (−(ħ²/2m)∇² + V(x))ψ

   • Potential energy: V̂ψ = V(x)ψ

   • Position: x̂ψ = xψ
These operators have in common the property that they are linear. That is,
\[ \hat{O}(\alpha\psi + \beta\varphi) = \alpha\hat{O}\psi + \beta\hat{O}\varphi \]
for all allowed states ψ and ϕ and complex constants α and β. They generalise the concept of a
linear transformation (or multiplication of a vector by a matrix) to infinite-dimensional Hilbert
spaces. These operators might involve differentiation and other operations that mean that they
are not defined for arbitrary states. In general we expect therefore to have to restrict their
action to a domain D ⊂ H. We generally hope that the domain is at least large enough that
we can approach any state arbitrarily closely while remaining within D, but even for common
operators D will be a proper subset of H.
Example: In one dimension the momentum operator
\[ \hat{p} = \frac{\hbar}{i}\frac{\partial}{\partial x} \]
acts on differentiable functions ψ(x) such that
\[ \|\hat{p}\psi\|^2 = \hbar^2 \int |\psi'(x)|^2\, dx < \infty. \]
We exclude from the domain of p̂ functions which are not differentiable and functions which
become unnormalisable when a derivative is taken.
The need to specify domains is a severe complication of dealing with infinite-dimensional spaces.
A careful treatment is outside the scope of the module and we will pass over the issue for the
most part. In many of our calculations the specification of the domains of operators such
as the Hamiltonian is hidden in our specification of the boundary conditions we impose on
wavefunctions. For example, when we quantised the infinite square well we restricted ourselves
to functions which were continuous and which vanished on the boundary.
    An important property of the operators that represent observables is that they are Hermitian,
and we will now describe what this property represents. An operator Ô is Hermitian if
\[ \langle \varphi | \hat{O}\psi \rangle = \langle \hat{O}\varphi | \psi \rangle \]
for all states ϕ and ψ in its domain. An important property of such operators is that they have
real eigenvalues. Let ϕn be a proper eigenstate of Ô, by which we mean that Ôϕn = λnϕn and
ϕn is a normalisable state in the domain of Ô. Then because Ô is Hermitian,
\[ \langle \varphi_n | \hat{O}\varphi_n \rangle = \langle \hat{O}\varphi_n | \varphi_n \rangle \]
and because ϕn is an eigenvector this becomes
\[ \langle \varphi_n | \lambda_n \varphi_n \rangle = \langle \lambda_n \varphi_n | \varphi_n \rangle \]
\[ \lambda_n \langle \varphi_n | \varphi_n \rangle = \lambda_n^* \langle \varphi_n | \varphi_n \rangle \]
and we find that λn = λn*. It is then reasonable to state that these eigenvalues are the results
that can be obtained from a measurement of the observable O. Had O not been Hermitian it
would in general have had complex eigenvalues and they could not have represented values of
an observable.
Example: In one dimension the position operator is Hermitian:
\[ \langle \varphi | \hat{x}\psi \rangle = \int \varphi(x)^* \big(x\psi(x)\big)\, dx = \int \big(x\varphi(x)\big)^* \psi(x)\, dx = \langle \hat{x}\varphi | \psi \rangle. \]

Example: In one dimension the momentum operator is Hermitian. Integrate by parts and use
the fact that if the states are normalisable then they vanish as x → ±∞:
\[ \begin{aligned} \langle \varphi | \hat{p}\psi \rangle &= \int_{-\infty}^{\infty} \varphi(x)^* \frac{\hbar}{i}\psi'(x)\, dx \\ &= \frac{\hbar}{i}\big[ \varphi(x)^* \psi(x) \big]_{-\infty}^{\infty} - \frac{\hbar}{i}\int_{-\infty}^{\infty} \varphi'(x)^* \psi(x)\, dx \\ &= \int_{-\infty}^{\infty} \left( \frac{\hbar}{i}\varphi'(x) \right)^{\!*} \psi(x)\, dx \\ &= \langle \hat{p}\varphi | \psi \rangle. \end{aligned} \]
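The integration-by-parts argument has a faithful discrete analogue: a central-difference derivative on a grid is exactly anti-symmetric when the states decay at the edges, so (ħ/i) d/dx is Hermitian in the discrete inner product too. A sketch (Python, with ħ set to 1 and arbitrary Gaussian test states chosen for the illustration):

```python
import math

hbar = 1.0
N, L = 4000, 40.0
dx = L / N
xs = [-L / 2 + i * dx for i in range(N)]

# two normalisable (rapidly decaying) test states
phi = [math.exp(-(x - 1.0) ** 2) for x in xs]
psi = [x * math.exp(-x ** 2) for x in xs]

def p_op(f):
    # momentum operator (hbar/i) d/dx via central differences; the wrap-around
    # terms are negligible because the states decay at the grid edges
    return [(hbar / 1j) * (f[(i + 1) % N] - f[(i - 1) % N]) / (2 * dx)
            for i in range(N)]

def inner(f, g):
    # discrete version of <f|g> = integral of f* g dx
    return sum(complex(a).conjugate() * b for a, b in zip(f, g)) * dx

lhs = inner(phi, p_op(psi))   # <phi | p psi>
rhs = inner(p_op(phi), psi)   # <p phi | psi>
print(abs(lhs - rhs))         # the two sides agree to machine precision
```

Here the boundary term of the integration by parts is replaced by the exact anti-symmetry of the central-difference matrix, which is why the agreement is to rounding error rather than to discretisation error.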

    The examples above illustrate one more complication that arises with infinite-dimensional
operators. Not all eigensolutions are proper. Consider the momentum operator in one dimen-
sion. We can easily write an equation
\[ \hat{p}\, e^{ip_0 x/\hbar} = p_0\, e^{ip_0 x/\hbar} \]
that has the structure of an eigenvalue equation, but notice that the eigenfunction e^{ip₀x/ħ} is not
normalisable and therefore not in the Hilbert space. If p₀ is a real number, however, then e^{ip₀x/ħ}
is “not too far” outside H — we can construct normalisable states which approximate e^{ip₀x/ħ}
and are approximate eigenfunctions of p̂. Such (real) numbers are included, along with the
proper eigenvalues, in the spectrum of an operator. An operator corresponding to an observable
will in general have a spectrum consisting of real numbers which represent the possible values
that result from a measurement of that observable. Proper eigenvalues make up the discrete
part of the spectrum — generally corresponding to isolated points on the real line. Improper
eigenvalues like p0 above make up the continuous spectrum and form a continuous rather than
a discrete set as the name suggests. A precise statement and demonstration of these facts is
beyond the scope of this module and we will simply assume that an operator representing an
observable has a real spectrum, possibly combining discrete and continuous parts, in which
discrete parts correspond to proper eigenvalues and continuous parts correspond to improper
eigenvalues as in the case of the momentum operator above.2
    We have established that Hermitian operators have real proper eigenvalues. We can also
show that the eigenstates corresponding to two such eigenvalues are orthogonal. Let
\[ \hat{O}\varphi_n = \lambda_n \varphi_n \quad\text{and}\quad \hat{O}\varphi_m = \lambda_m \varphi_m \]
   ² If we make certain claims about the domains of the Hermitian operator Ô then we can prove that the
spectrum is real. Operators for which this can be done are called self-adjoint. In many text books, the terms
Hermitian and self-adjoint are used interchangeably, but there is a difference in the domains they can have and
they are not exactly the same thing. The postulates should properly state that observables are represented by
self-adjoint operators but we will not make the distinction in this module and will use the terms interchangeably.

be two proper eigensolutions. Then
\[ \langle \varphi_n | \hat{O}\varphi_m \rangle = \langle \hat{O}\varphi_n | \varphi_m \rangle \]
\[ \Rightarrow \quad \lambda_m \langle \varphi_n | \varphi_m \rangle = \lambda_n^* \langle \varphi_n | \varphi_m \rangle \]
\[ \Rightarrow \quad (\lambda_m - \lambda_n) \langle \varphi_n | \varphi_m \rangle = 0, \]
using the fact that λn* = λn.

If the eigenvalues are distinct then this implies

\[ \langle \varphi_n | \varphi_m \rangle = 0 \]

and the eigenvectors are orthogonal as promised. Eigenvectors which correspond to the same
degenerate eigenvalue are not necessarily orthogonal from the outset. However, it is not difficult
to show that, given a set of eigenvectors with the same eigenvalue, we can choose linear
combinations of them which are orthogonal. This is useful for the following reason. Given a
Hermitian operator, we can choose the eigenvectors so that they form an orthonormal set
\[ \langle \varphi_n | \varphi_m \rangle = \delta_{nm}. \]
In quantum mechanics, if O has a purely discrete spectrum then the proper eigenvectors are
complete. That is, any state can be written as a linear combination of them and they form an
orthonormal basis — which we will often refer to as an eigenbasis.
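Both facts just proved (real eigenvalues, orthogonal eigenvectors for distinct eigenvalues) can be seen concretely in the simplest nontrivial case, a 2 × 2 Hermitian matrix; the entries below are arbitrary illustrative choices:

```python
import cmath

# Hermitian 2x2 matrix [[a, b], [conj(b), c]] with a, c real
a, c = 2.0, -1.0
b = 1.0 + 0.5j

tr = a + c
det = a * c - abs(b) ** 2
# discriminant tr^2 - 4 det = (a - c)^2 + 4|b|^2 >= 0, so both roots are real
disc = cmath.sqrt(tr ** 2 - 4 * det)
lam1 = (tr + disc) / 2
lam2 = (tr - disc) / 2
print(lam1.imag, lam2.imag)   # both vanish: the eigenvalues are real

# an eigenvector for eigenvalue lam solves the first row: (a - lam) v1 + b v2 = 0,
# so (v1, v2) = (b, lam - a) works
v1 = (b, lam1 - a)
v2 = (b, lam2 - a)

# eigenvectors for the two distinct eigenvalues are orthogonal: <v1|v2> = 0
ip = v1[0].conjugate() * v2[0] + v1[1].conjugate() * v2[1]
print(abs(ip))
```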
Example: Consider the space of functions ψ(θ) on the interval 0 < θ < 2π and let L̂ = −iħ ∂/∂θ
act on functions with a smooth periodic extension
\[ \psi(\theta + 2\pi) = \psi(\theta). \]
One can show that L̂ is Hermitian and that the eigenfunctions
\[ \varphi_m(\theta) = \frac{e^{im\theta}}{\sqrt{2\pi}}, \quad m = \cdots, -1, 0, 1, \cdots \]
form an orthonormal set (exercise). Then any function ψ(θ) can be represented as a linear
combination
\[ \psi(\theta) = \sum_{m=-\infty}^{\infty} c_m \varphi_m = \frac{1}{\sqrt{2\pi}} \sum_{m=-\infty}^{\infty} c_m e^{im\theta}, \]
with
\[ c_m = \langle \varphi_m | \psi \rangle = \frac{1}{\sqrt{2\pi}} \int_0^{2\pi} e^{-im\theta} \psi(\theta)\, d\theta. \]
In this case the eigenfunctions form an orthonormal basis as promised and representing arbitrary
functions as linear combinations of them recaptures the idea of Fourier series.
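As a concrete check of the expansion formula for cₘ, the sketch below (Python; the test function cos²θ is an arbitrary choice for the illustration) recovers the three nonvanishing coefficients of a trigonometric polynomial by numerical integration:

```python
import cmath, math

N = 2048
dtheta = 2 * math.pi / N
thetas = [i * dtheta for i in range(N)]

def psi(theta):
    # cos^2 = 1/2 + (e^{2i theta} + e^{-2i theta})/4, so only m = -2, 0, 2 survive
    return math.cos(theta) ** 2

def coeff(m):
    # c_m = <phi_m|psi> = (1/sqrt(2 pi)) * integral_0^{2 pi} e^{-i m theta} psi dtheta,
    # approximated by a rectangle rule (exact here for trigonometric polynomials)
    s = sum(cmath.exp(-1j * m * th) * psi(th) for th in thetas) * dtheta
    return s / math.sqrt(2 * math.pi)

for m in range(-3, 4):
    print(m, coeff(m))
# c_0 = sqrt(pi/2), c_{+-2} = sqrt(pi/2)/2, and all other coefficients vanish
```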
   Finally, related to the idea of a Hermitian operator is the Hermitian conjugate of an operator.
Let Â be an operator, not necessarily Hermitian. We say Â† is the Hermitian conjugate of Â if
\[ \langle \varphi | \hat{A}\psi \rangle = \langle \hat{A}^\dagger \varphi | \psi \rangle \]

for all suitable ψ and ϕ. Once again the situation is complicated by the need to specify domains.
This is an issue which we will pass over and account for by the use of the word “suitable”.
Example: Define a translation operator which acts on one-dimensional wavefunctions as follows:
\[ \hat{T}_a \psi(x) = \psi(x - a). \]
Then for any ϕ and ψ
\[ \begin{aligned} \langle \varphi | \hat{T}_a \psi \rangle &= \int \varphi(x)^* \psi(x - a)\, dx \\ &= \int \varphi(x' + a)^* \psi(x')\, dx' \qquad (x' = x - a) \\ &= \langle \hat{T}_{-a} \varphi | \psi \rangle \end{aligned} \]
and we find that T̂ₐ† = T̂₋ₐ.
   Notice that an operator is Hermitian if
\[ \hat{A}^\dagger = \hat{A}. \]

We list below some properties of the Hermitian conjugate. These can be shown without diffi-
culty if we assume that any time an operator acts on a state, the result is well defined (that
is, the state is in the domain of the operator).

   (i) (Â + B̂)† = Â† + B̂†

   (ii) (αÂ)† = α*Â†

so in particular (αÂ + βB̂)† = α*Â† + β*B̂†.

   (iii) (ÂB̂)† = B̂†Â† (since ⟨ϕ|ÂB̂ψ⟩ = ⟨Â†ϕ|B̂ψ⟩ = ⟨B̂†Â†ϕ|ψ⟩)

   (iv) (Â†)† = Â

   (v) (Âⁿ)† = (Â†)ⁿ

   (vi) (Σ_{n=0}^{N} cₙÂⁿ)† = Σ_{n=0}^{N} cₙ*(Â†)ⁿ

Finally we note that if Â and B̂ are Hermitian then so are

   (vii) Âⁿ

   (viii) Â + B̂

   (ix) Σₙ cₙÂⁿ for real cₙ.

4.3       Measurement and the evolution of the wavefunction
The last two postulates tell us what happens when we try to make a measurement of a quantum
system and how it evolves between measurements.
Postulate 5: If the observable O is measured then the result must be in the spectrum of Ô. If λn
is a proper eigenvalue, then the probability that λn is obtained when a system with wavefunction ψ
is observed is
\[ p_n = \frac{|\langle \psi | \varphi_n \rangle|^2}{\|\psi\|^2} = |\langle \hat{\psi} | \varphi_n \rangle|^2, \]
where ϕn is the corresponding eigenstate and ψ̂ is the normalised wavefunction corresponding to ψ.
If λn is obtained, then immediately after the measurement the particle is in a state corresponding
to ϕn .

    A full statement of the postulate would tell us how to deal with results in the continuous
spectrum, so that in particular it subsumes Postulate 2 as a special case. For now however
we restrict ourselves to the discrete spectrum. Let us assume in particular that O has a fully
discrete spectrum and that the proper eigenstates ϕn form a complete set. Then if we assume
that ψ = ψ̂ is normalised we can expand it as
\[ \psi = \sum_n c_n \varphi_n, \qquad \langle \psi | \psi \rangle = \sum_n |c_n|^2 = 1. \]
It seems natural therefore to interpret
\[ p_n = |c_n|^2 \]
as a probability and the postulate asserts that this is indeed what it is. Immediately after
the measurement, we know with certainty that the observable has the value λn . We can only
conclude that the state is then described by ϕn and not ψ. The interpretation is that in
observing the system we have interfered with it and changed the state. The process whereby ψ
changes to ϕn is called “reduction” or the “collapse of the wavefunction”. The essential inability
to detach the observer from the system being observed is a central feature of quantum mechanics
and is intimately connected with its inherent unpredictability.
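The collapse rule is easy to explore in a finite-dimensional model. The following sketch (a hypothetical three-state system, with a made-up observable and state) expands a normalised $\psi$ in the eigenbasis of an observable and reads off the measurement probabilities $p_n = |c_n|^2$.

```python
import numpy as np

# Hypothetical 3-state system: observable with eigenvalues 1, 2, 3.
O = np.diag([1.0, 2.0, 3.0])
eigvals, eigvecs = np.linalg.eigh(O)    # columns of eigvecs are the phi_n

psi = np.array([1.0, 1.0, 1.0j])
psi = psi / np.linalg.norm(psi)         # normalise the state

c = eigvecs.conj().T @ psi              # expansion coefficients c_n = <phi_n|psi>
p = np.abs(c) ** 2                      # measurement probabilities p_n = |c_n|^2
print(p)                                # nonnegative, summing to 1
```

After a measurement returning eigenvalue $\lambda_n$, the post-measurement state would simply be the column `eigvecs[:, n]`.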
    Even in the absence of measurement, an isolated quantum system evolves in time. The final
postulate tells us how this happens.

Postulate 6: The evolution in time of the wavefunction of an isolated quantum system is governed
by the Schrödinger equation
$$i\hbar\frac{\partial\psi}{\partial t} = \hat H\psi,$$
where $\hat H$ is a Hermitian Hamiltonian operator.

   Note that in particular, “isolated” here means that we do not make observations or measurements,
which would lead to a collapse of the wavefunction outside the remit of the Schrödinger
equation. Notice also that we have not specified the form of the Hamiltonian operator. For a
particle moving in a potential we have already seen that it is
$$\hat H = -\frac{\hbar^2}{2m}\nabla^2 + V(x),$$
but the postulate leaves open the possibility that we might deal with more general systems,
with other Hamiltonians. This is similar to classical mechanics where we state Newton’s laws of
motion without specifying what the forces between bodies are. Determining the explicit form
of a given force (such as the inverse square law of gravitation) is then a question of coming up
with a description of a given system rather than part of the framework of mechanics itself.
    Even if we don’t give a general prescription for H, the fact that it is Hermitian is an
important feature. It means that probabilities are conserved and the evolution is consistent with
the probabilistic interpretation we have given to wavefunction amplitudes. Consider for example
the evolution of $\langle\varphi|\psi\rangle$ where both $\varphi$ and $\psi$ evolve according to the Schrödinger equation,
$$\begin{aligned}
\frac{d}{dt}\langle\varphi|\psi\rangle
 &= \left\langle\frac{\partial\varphi}{\partial t}\Big|\psi\right\rangle + \left\langle\varphi\Big|\frac{\partial\psi}{\partial t}\right\rangle \\
 &= \left\langle\frac{1}{i\hbar}\hat H\varphi\Big|\psi\right\rangle + \left\langle\varphi\Big|\frac{1}{i\hbar}\hat H\psi\right\rangle \\
 &= \frac{1}{i\hbar}\left(\langle\varphi|\hat H\psi\rangle - \langle\hat H\varphi|\psi\rangle\right) \\
 &= 0.
\end{aligned}$$

In particular in the case ϕ = ψ we find that the total probability ψ|ψ of finding the particle
somewhere in space does not change with time, which is clearly necessary for the consistency
of our interpretation.
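This conservation of probability can be checked numerically. The sketch below (assuming $\hbar = 1$ and a randomly generated Hermitian matrix standing in for $\hat H$) builds the propagator $e^{-i\hat Ht}$ from the eigendecomposition of $\hat H$ and verifies that the norm of the state is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
# Random Hermitian "Hamiltonian" for a hypothetical 4-level system (hbar = 1).
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (M + M.conj().T) / 2

# Propagator U = exp(-iHt), built from the eigendecomposition of H.
t = 0.7
w, V = np.linalg.eigh(H)
U = V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T

psi0 = rng.normal(size=4) + 1j * rng.normal(size=4)
psi0 /= np.linalg.norm(psi0)
psit = U @ psi0

# Hermitian H gives unitary U, so <psi|psi> is conserved.
print(np.linalg.norm(psit))
```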

4.4    Expectation values and uncertainties
Suppose we are repeatedly able to prepare a quantum system in an identical state ψ and
make repeated observations of an observable O. The astonishing fact is that even though
the first postulate asserts that the system cannot be specified in any more detail and is fully
determined quantum-mechanically, we should expect to get different results each time. That is,
identical experiments conducted on an identical state will yield different results even when our
experimental techniques are perfectly accurate and we know the state with infinite precision.
   What quantum mechanics does allow us to predict are the statistics of such measurements.
Let us assume for simplicity that the spectrum of O is discrete and therefore that we can
construct an orthonormal basis of eigenfunctions $\varphi_n$ of $\hat O$. Let the system be in a normalised state
$$\psi = \sum_n c_n \varphi_n.$$
Since the probability that an eigenvalue $\lambda_n$ is obtained in a measurement is
$$p_n = |c_n|^2,$$
the average result obtained in the series of measurements of $O$, which in quantum mechanics
is called the expectation value and denoted by $\langle O\rangle$, is
$$\langle O\rangle = \sum_n \lambda_n p_n = \sum_n \lambda_n |c_n|^2.$$

This can neatly be expressed in terms of the wavefunction itself. Notice that
$$\hat O\psi = \sum_n c_n \hat O\varphi_n = \sum_n \lambda_n c_n \varphi_n,$$
so we can therefore write
$$\langle O\rangle = \sum_n c_n^*(\lambda_n c_n) = \langle\psi|\hat O\psi\rangle$$
and this can be computed without reference to the eigenbasis. In fact, the right-hand side can
be computed in general even if $\hat O$ does not have a discrete spectrum. One can show that
this matrix element expression gives the average result of measurement even in that case (see
appendices) and we will let this define the expectation value in general. In this context we take
comfort from the fact that, because $\hat O$ is Hermitian, the expectation value
$$\langle O\rangle = \langle\psi|\hat O\psi\rangle = \langle\hat O\psi|\psi\rangle = \langle O\rangle^*$$
is self-evidently real. Finally, if we prefer to leave open the possibility that we might work with
unnormalised wavefunctions, then we find the expectation value using
$$\langle O\rangle = \frac{\langle\psi|\hat O\psi\rangle}{\|\psi\|^2} = \frac{\langle\psi|\hat O\psi\rangle}{\langle\psi|\psi\rangle}.$$
   Even though we can readily compute the average result of these measurements given ψ, we
do not know in advance of any individual measurement what we will get and the outcome is
uncertain. We quantify the uncertainty by defining
$$\Delta O^2 = \langle(O - \langle O\rangle)^2\rangle.$$
That is, the uncertainty $\Delta O$, which measures how much our measurements are spread around
the average value, is the standard deviation of the results we find (and $\Delta O^2$ is their variance). Notice that we can simplify this as
$$\Delta O^2 = \langle(O - \langle O\rangle)^2\rangle = \langle O^2\rangle - 2\langle O\rangle\langle O\rangle + \langle O\rangle^2 = \langle O^2\rangle - \langle O\rangle^2.$$
Notice also that we can predict $O$ with certainty precisely when $\psi$ is an eigenstate of $\hat O$ (see problem sheets).

4.5    The commutator
An important difference between classical observables and their quantum operator counterparts
is that the operators do not commute. For example, for any state $\psi(x)$ in one dimension
$$\hat p\hat x\psi(x) = \frac{\hbar}{i}\frac{d}{dx}\big(x\psi(x)\big) = \frac{\hbar}{i}\psi(x) + \frac{\hbar}{i}x\psi'(x) \;\neq\; \hat x\hat p\psi(x) = \frac{\hbar}{i}x\psi'(x),$$
or, in other words,
$$\hat x\hat p \neq \hat p\hat x.$$
We formalise this difference by defining, for any two operators $\hat A$ and $\hat B$, the commutator
$$[\hat A,\hat B] = \hat A\hat B - \hat B\hat A.$$

Commutators are of primordial importance in quantum mechanics and their algebra plays a
role that is sometimes analogous to that played by calculus in classical mechanics. Indeed the
first formulations of quantum mechanics (such as matrix mechanics) were intimately connected
with this idea.
    A particularly important example is the commutator between position and momentum op-
erators. From the calculation above we can see that

$$[\hat x, \hat p] = i\hbar,$$
where the right-hand side is interpreted as the operator which multiplies a wavefunction by
$i\hbar$. Calculation of more complicated examples is often aided by making use of the following identities:

                      (a) $[\hat A, \alpha\hat B + \beta\hat C] = \alpha[\hat A,\hat B] + \beta[\hat A,\hat C]$,

                      (b) $[\hat A,\hat B] = -[\hat B,\hat A]$,

                      (c) $[\hat A,\hat B\hat C] = [\hat A,\hat B]\hat C + \hat B[\hat A,\hat C]$,

                      (d) $[\hat A,[\hat B,\hat C]] + [\hat B,[\hat C,\hat A]] + [\hat C,[\hat A,\hat B]] = 0$ (the Jacobi identity),

whose proof is either obvious or left as an exercise (see problem sheets). We note in particular
$$[\hat A,\hat B]^\dagger = (\hat A\hat B)^\dagger - (\hat B\hat A)^\dagger = \hat B^\dagger\hat A^\dagger - \hat A^\dagger\hat B^\dagger = [\hat B^\dagger,\hat A^\dagger],$$
so if $\hat A$ and $\hat B$ are Hermitian then so is
$$\hat C = \frac{1}{2i}[\hat A,\hat B].$$
In the case where $\hat A$ and $\hat B$ are the position and momentum operators, for example, $\hat C$ is $\hbar/2$ times the
identity operator.
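The identities above, and the Hermiticity of $\frac{1}{2i}[\hat A,\hat B]$ for Hermitian $\hat A,\hat B$, are easy to check numerically for matrices. A minimal sketch with randomly generated operators:

```python
import numpy as np

rng = np.random.default_rng(2)

def rand_op(n=4):
    # Arbitrary (generally non-Hermitian) complex matrix as a stand-in operator.
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

def comm(A, B):
    return A @ B - B @ A

A, B, C = rand_op(), rand_op(), rand_op()

# (b) antisymmetry and (d) the Jacobi identity:
assert np.allclose(comm(A, B), -comm(B, A))
jac = comm(A, comm(B, C)) + comm(B, comm(C, A)) + comm(C, comm(A, B))
assert np.allclose(jac, 0)

# For Hermitian A, B the combination (1/2i)[A, B] is again Hermitian.
Ah = (A + A.conj().T) / 2
Bh = (B + B.conj().T) / 2
Ch = comm(Ah, Bh) / (2j)
assert np.allclose(Ch, Ch.conj().T)
```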

4.6    Positive definite operators
With certain observables such as kinetic energy, we expect a measurement always to yield a
positive or at least a nonnegative result. It will be useful in later calculations to be able to
characterise the operators associated with such observables.
   A Hermitian operator $\hat A$ is said to be semipositive definite if, for all states $\psi$ in its domain,
$$\langle\psi|\hat A\psi\rangle \ge 0.$$
We sometimes write in this case
$$\hat A \ge 0.$$
If equality holds only when $\psi = 0$, then we say that $\hat A$ is positive definite. Notice that by
substituting an eigenfunction for $\psi$ we immediately find that any proper eigenvalue $\lambda_n$ of a
semipositive definite operator satisfies
$$\lambda_n \ge 0,$$
while the eigenvalues of a positive definite operator satisfy
$$\lambda_n > 0.$$

Therefore semipositivity or positivity is reflected in the allowed outcomes of a measurement of
the corresponding observable.
    In deducing that certain operators are (semi)positive, we make use of the following observations:

   • If $\hat A$ and $\hat B$ are (semi)positive, then so is $\hat A + \hat B$.

   • If $\hat A$ is any operator then $\hat A^\dagger\hat A$ is semipositive.

   • As a special case, we find that the square $\hat A^2$ of a Hermitian operator $\hat A$ is semipositive.

The first of these is obvious. For the second, note first that for any $\psi$ in an appropriate domain,
$$\langle\psi|\hat A^\dagger\hat A\psi\rangle = \langle\hat A\psi|\hat A\psi\rangle = \langle\hat A^\dagger\hat A\psi|\psi\rangle.$$
The outer forms tell us that $\hat A^\dagger\hat A$ is Hermitian. From the middle form, and from the property
$$\langle\hat A\psi|\hat A\psi\rangle \ge 0$$
of the inner product, we deduce that $\hat A^\dagger\hat A$ is semipositive definite as claimed. From these
properties we can quickly come up with a number of physical examples.
   • An operator $\hat V$ acting on three-dimensional wavefunctions by $\hat V\psi(x) = V(x)\psi(x)$ is semi-
     positive iff $V(x) \ge 0$ for all $x$.

   • The kinetic energy
$$\hat T_x = \frac{\hat p_x^2}{2m}$$
     in a single degree of freedom is semipositive.

   • The kinetic energy
$$\hat T = \frac{\hat p_x^2}{2m} + \frac{\hat p_y^2}{2m} + \frac{\hat p_z^2}{2m}$$
     in three degrees of freedom is semipositive.

   • If the potential energy $V$ is semipositive then so is the Hamiltonian
$$\hat H = \hat T + \hat V.$$

   • For any observable $O$ we have
$$\Delta O^2 = \langle(O - \langle O\rangle)^2\rangle \ge 0,$$
     so the definition of uncertainty makes sense.
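The claim that $\hat A^\dagger\hat A$ is semipositive can be illustrated numerically: for an arbitrary (not even Hermitian) matrix $A$, the eigenvalues of $A^\dagger A$ are all nonnegative. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
# Arbitrary complex matrix standing in for an arbitrary operator A.
A = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))

AdA = A.conj().T @ A                 # A^dagger A is automatically Hermitian ...
eigvals = np.linalg.eigvalsh(AdA)
print(eigvals.min())                 # ... and its eigenvalues are all >= 0
```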

4.7    The uncertainty principle
For any observable we can find some states for which the outcome of a measurement can be
predicted with certainty. A particular example is where we prepare the quantum system in a
proper eigenstate. Even if there are no proper eigensolutions, we can prepare states for which
the measurement can be predicted with arbitrary accuracy. For example, an observation of
momentum for a normalised state of the form
$$\varphi_{p_0}(x) = Ce^{ip_0x/\hbar - \epsilon x^2}$$
would with high probability give a result close to $p_0$ if $\epsilon$ is small. Likewise, for small $\epsilon$ the normalised state
$$\psi_{x_0}(x) = Ce^{-(x-x_0)^2/\epsilon}$$
is tightly peaked around $x = x_0$ and a measurement of position will give a result close to $x_0$.
The uncertainty principle tells us, however, that we cannot hope to know both position and
momentum arbitrarily accurately at the same time. That is, we can construct states for which
the uncertainty in either position or momentum can be made arbitrarily small but we cannot
make both uncertainties small for the same state.
    This can be stated compactly by writing an inequality
$$\Delta x\,\Delta p \ge \frac{\hbar}{2},$$
which must be satisfied by the uncertainties ∆x and ∆p in position and momentum for an
arbitrary state. If we can predict the momentum very accurately then the wavefunction must
be very extended spatially and we must be very uncertain of the particle’s position, and vice
versa. It is instructive to think about this for the states ϕp0 (x) and ψx0 (x) above.
   This incompatibility between position and momentum is the most famous expression of the
                                                                                       ˆ ˆ
uncertainty principle but it can be stated (and proved) rather more generally. If A and B are

any two observables, then for a given state ψ the uncertainties ∆A and ∆B are constrained by
the inequality
$$\Delta A\,\Delta B \ge |\langle C\rangle|, \qquad (13)$$
where $\hat C$ is the Hermitian operator
$$\hat C = \frac{1}{2i}[\hat A,\hat B].$$
To show this let us assume that
$$\langle A\rangle = 0 = \langle B\rangle.$$
(If this is not initially the case we can define new observables
$$A' = A - \langle A\rangle \qquad\text{and}\qquad B' = B - \langle B\rangle,$$
which have the same uncertainties and for which the assumption is true.) We now apply the
Schwarz inequality
$$\langle\varphi|\varphi\rangle\langle\chi|\chi\rangle \ge |\langle\varphi|\chi\rangle|^2$$
with
$$\varphi = \hat A\psi \qquad\text{and}\qquad \chi = \hat B\psi,$$
giving
$$\langle\hat A\psi|\hat A\psi\rangle\langle\hat B\psi|\hat B\psi\rangle \ge |\langle\hat A\psi|\hat B\psi\rangle|^2.$$
Because $\hat A$ and $\hat B$ are Hermitian we can write
$$\langle\psi|\hat A^2\psi\rangle\langle\psi|\hat B^2\psi\rangle \ge |\langle\psi|\hat A\hat B\psi\rangle|^2$$
and taking a square root gives
$$\Delta A\,\Delta B \ge |\langle\psi|\hat A\hat B\psi\rangle|. \qquad (14)$$
Now, in general $\hat A\hat B$ is not Hermitian, but we can write it in the form
$$\hat A\hat B = \hat X + i\hat Y,$$
where
$$\hat X = \frac{1}{2}\left(\hat A\hat B + \hat B\hat A\right) \qquad\text{and}\qquad \hat Y = -\frac{i}{2}\left(\hat A\hat B - \hat B\hat A\right)$$
are Hermitian. The square modulus of
$$\langle\psi|\hat A\hat B\psi\rangle = \langle X\rangle + i\langle Y\rangle$$
is then
$$\begin{aligned}
|\langle\psi|\hat A\hat B\psi\rangle|^2 &= \langle X\rangle^2 + \langle Y\rangle^2 \\
&\ge \langle Y\rangle^2 \\
&= \left\langle -\frac{i}{2}[\hat A,\hat B]\right\rangle^2 \\
&= \langle C\rangle^2.
\end{aligned}$$
Combined with (14) this gives the uncertainty principle (13) as required.
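The general inequality (13) can be spot-checked numerically. The sketch below (random Hermitian matrices and a random normalised state, all hypothetical) computes $\Delta A$, $\Delta B$ and $\langle C\rangle$ directly:

```python
import numpy as np

rng = np.random.default_rng(4)

def rand_herm(n=6):
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

A, B = rand_herm(), rand_herm()
psi = rng.normal(size=6) + 1j * rng.normal(size=6)
psi /= np.linalg.norm(psi)

def expect(O):
    return np.vdot(psi, O @ psi).real

# Uncertainties Delta A, Delta B from <O^2> - <O>^2.
dA = np.sqrt(expect(A @ A) - expect(A) ** 2)
dB = np.sqrt(expect(B @ B) - expect(B) ** 2)

C = (A @ B - B @ A) / (2j)          # C = (1/2i)[A, B], Hermitian
# The generalised uncertainty relation (13):
assert dA * dB >= abs(expect(C)) - 1e-12
```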

4.8    Ehrenfest’s Theorem
Since classical mechanics works so well in describing our everyday experience of the macroscopic
world, we should hope to see it emerge as a limiting case of quantum theory. How this happens
is not a simple matter, however, and even today is an active field of investigation. While a full
understanding of the classical limit is hard, we can relatively easily recover classical-looking
equations if we consider a limit $\hbar \to 0$ (or, more properly, a limit where properties of the
system with the same dimensions as $\hbar$ become large compared to $\hbar$). One of the most basic
of these correspondences is Ehrenfest’s Theorem, which says that the centre of a wavepacket
whose position and momentum are well localised (subject to the constraints imposed by the
uncertainty principle) follows a classical trajectory.
    We will state it in a somewhat more general form. If the wavefunction ψ evolves according
to the time-dependent Schrödinger equation, then the expectation value of any observable $A$
evolves according to
$$\frac{d\langle A\rangle}{dt} = \frac{1}{i\hbar}\left\langle[\hat A,\hat H]\right\rangle.$$
We prove this before interpreting it. We have

$$\begin{aligned}
\frac{d\langle A\rangle}{dt} &= \frac{d}{dt}\langle\psi|\hat A\psi\rangle \\
&= \left\langle\frac{\partial\psi}{\partial t}\Big|\hat A\psi\right\rangle + \left\langle\psi\Big|\hat A\frac{\partial\psi}{\partial t}\right\rangle \\
&= \left\langle\frac{1}{i\hbar}\hat H\psi\Big|\hat A\psi\right\rangle + \left\langle\psi\Big|\frac{1}{i\hbar}\hat A\hat H\psi\right\rangle \\
&= -\frac{1}{i\hbar}\langle\psi|\hat H\hat A\psi\rangle + \frac{1}{i\hbar}\langle\psi|\hat A\hat H\psi\rangle \\
&= \frac{1}{i\hbar}\langle\psi|[\hat A,\hat H]\psi\rangle,
\end{aligned}$$
as required (we have assumed that A does not depend explicitly on time). Let us apply this to
the one-dimensional Hamiltonian
$$\hat H = \frac{\hat p^2}{2m} + V(\hat x)$$
and the position and momentum observables. It can be shown (exercise) that the relevant
commutators are
$$\frac{1}{i\hbar}[\hat x,\hat H] = \frac{\hat p}{m}, \qquad \frac{1}{i\hbar}[\hat p,\hat H] = -V'(\hat x),$$

where the operator $V'(\hat x)$ acts by multiplication on the wavefunction. Notice that $F(x) = -V'(x)$
is the force acting on the particle and we can interpret the corresponding operator as
the force operator. Ehrenfest's theorem then says
$$\frac{d\langle x\rangle}{dt} = \frac{\langle p\rangle}{m}, \qquad \frac{d\langle p\rangle}{dt} = \langle F\rangle.$$
Notice that these look very much like the classical equations of motion for a particle moving
under the influence of an external force $F(x) = -V'(x)$. We have made no approximations so
far, but let us now imagine a wavepacket whose position and momentum are localised as far
as allowed by the uncertainty principle. For example, we might know each with uncertainties
satisfying $\Delta x\,\Delta p = \hbar/2$. If these uncertainties are small compared to the characteristic scales of
length and momentum for our system then we might approximate
$$\langle F\rangle \approx F(\langle x\rangle)$$

and Ehrenfest's theorem says that the average position $\langle x\rangle$ and momentum $\langle p\rangle$ follow classical trajectories.
    This is often taken to be a demonstration that the classical world emerges from quantum
theory in a limit $\hbar \to 0$. If we think of wavepackets as being slightly fuzzy particles, then as
long as the fuzziness is on a scale that is small compared to macroscopic scales they behave
effectively as classical particles. The small numerical value of $\hbar = 1.05\times10^{-27}\,\mathrm{g\,cm^2\,s^{-1}}$ in typical
macroscopic units makes this interpretation very tempting. It is misleading, however, because
wavepackets often spread very rapidly and even optimally localised wavepackets can spread to
macroscopic dimensions in a short space of time, especially if the dynamics is unstable. For
example, a rough calculation shows that a pencil standing on its tip, with uncertainties that are
minimised subject to the constraints of the uncertainty principle, will fall over on a time scale
of a few seconds. A similarly initialised chaotically tumbling moon will become completely
delocalised on a time scale of decades, which is a very short time compared to the lifetime of
such systems. We therefore need more to explain why the objects we see around us seem sharp
and behave classically.
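Ehrenfest's theorem itself (the exact statement, before any classical approximation) can be verified numerically by comparing a finite-difference derivative of $\langle A\rangle$ with $\frac{1}{i\hbar}\langle[\hat A,\hat H]\rangle$. The sketch below uses random matrices and $\hbar = 1$:

```python
import numpy as np

rng = np.random.default_rng(5)
n, hbar, dt = 5, 1.0, 1e-5

def rand_herm():
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

H, A = rand_herm(), rand_herm()
psi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi /= np.linalg.norm(psi)

w, V = np.linalg.eigh(H)
def evolve(phi, t):
    # Exact evolution under exp(-iHt/hbar) via the eigendecomposition of H.
    return V @ (np.exp(-1j * w * t / hbar) * (V.conj().T @ phi))

def expect(O, phi):
    return np.vdot(phi, O @ phi).real

# Central-difference d<A>/dt versus (1/i hbar)<[A, H]>.
lhs = (expect(A, evolve(psi, dt)) - expect(A, evolve(psi, -dt))) / (2 * dt)
rhs = (np.vdot(psi, (A @ H - H @ A) @ psi) / (1j * hbar)).real
print(abs(lhs - rhs))                 # small, limited only by the dt^2 error
```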

4.9    Appendix: finite-dimensional spaces
Everything we have done with operators on a Hilbert space can be specialised to matrices
acting on column vectors, regarded as elements of a finite-dimensional vector space. This
matrix notation is a convenient way of representing a state space with a finite basis

                                        ϕn ,    n = 1 · · · N.

It is conventional in the finite-dimensional case to replace these with the symbols

                                         ei ,   i = 1 · · · n,

however, and we will adopt that notation here.

    We consider in particular the $n$-dimensional vector space $V$ whose elements can be written
as column vectors,
$$v = \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}. \qquad (15)$$
Let us define, for each column vector $v$, an adjoint or Hermitian conjugate
$$v^\dagger = v^{*T} = (v_1^* \cdots v_n^*), \qquad (16)$$
which is the row vector formed by taking the complex conjugate of its transpose. Then we may write
$$\langle u|v\rangle = u^\dagger v = \sum_i u_i^* v_i \qquad (17)$$
for the inner product of any two vectors in V . It is easily seen that this has the properties
of linearity, conjugate symmetry and positivity demanded of an inner product. There is an
obvious orthonormal basis for V . The column vectors
$$e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \cdots \quad e_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}$$
form a basis for $V$ since we can express any vector in the form
$$v = \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix} = v_1 e_1 + \cdots + v_n e_n$$
and in fact are easily seen to form an orthonormal basis. (Note that we may in general want
to consider other orthonormal bases, however). Note that for any orthonormal basis we can
represent any vector u in the form
$$u = \sum_i c_i e_i, \qquad c_i = \langle e_i|u\rangle,$$
just as in the case of a general Hilbert space.
    A linear operator on V can always be represented by multiplication by a matrix. Let v be
a column vector with components vi as above. Then Av is the column vector with components
$$(Av)_i = \sum_j a_{ij} v_j,$$

where aij are the elements of the matrix A. Using the inner product, we can give a simple
expression for the elements of A in terms of its action on basis vectors, namely,
$$a_{ij} = \langle e_i|Ae_j\rangle.$$

This is easily seen by substituting the explicit forms given above for the standard basis vec-
tors $e_i$. In fact this lies behind the standard terminology in quantum mechanics that for any
wavefunctions $\varphi$ and $\psi$, the number $\langle\varphi|\hat A\psi\rangle$
is called a “matrix element” of the operator $\hat A$.
    As with the case of operators on H the adjoint or Hermitian conjugate of a matrix A is
defined by the property that
$$\langle u|Av\rangle = \langle A^\dagger u|v\rangle \qquad (18)$$
for all $u$ and $v$. It's not hard to see by substituting the standard basis vectors for $u$ and $v$ that
this means
$$A^\dagger = A^{*T}. \qquad (19)$$
That is, $A^\dagger$ is the transpose of the complex conjugate of $A$: if the elements of $A$ are $a_{ij}$, the
elements of $A^\dagger$ are $a_{ji}^*$. Hermitian matrices are those for which
$$A = A^\dagger. \qquad (20)$$

In the real case they are nothing other than the symmetric matrices. Following the discussion
of Hermitian operators, we can prove that the eigenvalues of a Hermitian matrix are real and
the corresponding eigenvectors may be chosen to form an orthonormal basis. In the finite-
dimensional case, however, we have the luxury of not having to specify domains or worry about
improper eigenvectors.
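In a numerical setting these facts are packaged in routines such as NumPy's `eigh`, which exploits Hermiticity to return real eigenvalues and an orthonormal eigenbasis. A short check with a random Hermitian matrix:

```python
import numpy as np

rng = np.random.default_rng(6)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = (M + M.conj().T) / 2                 # Hermitian matrix

lam, V = np.linalg.eigh(A)               # real eigenvalues, sorted ascending
# The eigenvector columns are orthonormal ...
assert np.allclose(V.conj().T @ V, np.eye(4))
# ... and diagonalise A:
assert np.allclose(V.conj().T @ A @ V, np.diag(lam))
```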

4.10     Appendix: the continuous spectrum
We have seen that even relatively common observables such as momentum may have improper
eigenfunctions. We outline briefly here how such observables are dealt with in quantum me-
chanics, although we will skim over all of the technical details.
    In general, we suppose that an observable $\hat O$ has a set of proper eigensolutions
$$\hat O\varphi_n = \lambda_n\varphi_n,$$
where $n$ runs over a discrete index set, sometimes finite and sometimes infinite. The proper
eigenvalues form the discrete or point spectrum of $\hat O$. In addition, $\hat O$ may also have improper eigensolutions
$$\hat O\varphi_\lambda = \lambda\varphi_\lambda$$
for $\lambda$ in some continuous subset of the real line, forming the continuous spectrum of $\hat O$. These
improper eigenfunctions are not normalisable,
$$\|\varphi_\lambda\| = \infty,$$
and are therefore not properly speaking in the Hilbert space $\mathcal H$, but they can be approximated
by normalisable states. For example, normalised states of the form
$$\varphi_{p_0}(x) = Ce^{ip_0x/\hbar - \epsilon x^2}$$
are approximate eigenfunctions of the momentum operator $\hat p$ if $p_0$ is real and $\epsilon > 0$ is small.
    Even though improper eigenfunctions are not in the Hilbert space proper, we assume in
quantum mechanics that we can use them to represent proper states by integrating over the
continuous label $\lambda$. In particular, we assume that any proper state $\psi$ can be represented in the form
$$\psi = \sum_n c_n\varphi_n + \int c(\lambda)\varphi_\lambda\, d\lambda,$$
where the integration is over the range of the continuous spectrum of $\hat O$. We further assume
that the improper states $\varphi_\lambda$ can be scaled so that if
$$\chi = \sum_n a_n\varphi_n + \int a(\lambda)\varphi_\lambda\, d\lambda$$
is any second state, then
$$\langle\chi|\psi\rangle = \sum_n a_n^* c_n + \int a(\lambda)^* c(\lambda)\, d\lambda$$
and in particular
$$\langle\psi|\psi\rangle = \sum_n |c_n|^2 + \int |c(\lambda)|^2\, d\lambda.$$
Notice that for any normalised state $\psi$,
$$\sum_n |c_n|^2 + \int |c(\lambda)|^2\, d\lambda = 1,$$

and we can then interpret the parts on the left as probabilities. We amend the fifth postulate
to say that if we make a measurement of the observable O we obtain a proper eigenvalue λn
with probability
$$p_n = |c_n|^2$$
and we obtain a result in the subset $I$ of the continuous spectrum of $O$ with probability
$$p(O \in I) = \int_I |c(\lambda)|^2\, d\lambda.$$

If we make a series of measurements O on quantum systems, all prepared in an identical state
ψ, then the average outcome is the expectation value of O,

$$\begin{aligned}
\langle O\rangle &= \sum_n \lambda_n|c_n|^2 + \int \lambda|c(\lambda)|^2\, d\lambda \\
&= \left\langle\psi\,\Big|\, \sum_n \lambda_n c_n\varphi_n + \int \lambda c(\lambda)\varphi_\lambda\, d\lambda\right\rangle \\
&= \langle\psi|\hat O\psi\rangle,
\end{aligned}$$

neatly generalising our previous results.
   We can think of our standard representation of the state as a function of position as a special
case of all this. Consider the case of one dimension. If we replace $\hat O$ by $\hat x$, then the expressions
above become our standard expressions for overlaps and so on with c(λ) replaced by ψ(x). In
particular, the second postulate effectively becomes a special case of the fifth postulate.
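The approximate momentum eigenfunctions $\varphi_{p_0}$ above can also be explored on a grid: for small $\epsilon$, the expectation value of $\hat p$ computed from the discretised wavefunction is close to $p_0$. A sketch ($\hbar = 1$, with the derivative taken by finite differences):

```python
import numpy as np

hbar, p0, eps = 1.0, 3.0, 0.05
x = np.linspace(-40, 40, 8001)
dx = x[1] - x[0]

# Plane wave modulated by a broad Gaussian: an approximate p-eigenstate.
phi = np.exp(1j * p0 * x / hbar - eps * x**2)
phi = phi / np.sqrt(np.sum(np.abs(phi)**2) * dx)   # normalise on the grid

# <p> = integral of phi^* (hbar/i) dphi/dx, via a central-difference derivative.
dphi = np.gradient(phi, x)
p_expect = (np.sum(phi.conj() * (hbar / 1j) * dphi) * dx).real
print(p_expect)                                    # close to p0 for small eps
```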

4.11    Appendix: compatible observables
We conclude our treatment of the general structure for quantum mechanics by discussing a
special case that plays an important role in the treatment of symmetries, angular momentum
and higher-dimensional problems.
   Two observables A and B are compatible if the corresponding operators commute,
$$[\hat A,\hat B] = 0.$$

In this case the Heisenberg uncertainty relation gives no constraint and we will indeed find that
it is possible to construct states for which we simultaneously know A and B with certainty.
These states are ones for which
$$\hat A\psi = a\psi \qquad\text{and}\qquad \hat B\psi = b\psi$$
for real numbers $a$ and $b$. We say that such states are simultaneous eigenfunctions of the
operators $\hat A$ and $\hat B$.
    We have discussed how eigenfunctions of operators corresponding to observables are conveniently
used as orthonormal bases for Hilbert space. There is a generalisation that can be
applied to compatible observables.
Simultaneous diagonalisation: If $\hat A$ and $\hat B$ commute and at least one of them has a
discrete spectrum then we may construct an orthonormal basis
$$\varphi_n, \quad n = 1, 2, \cdots$$
of simultaneous eigenfunctions,
$$\hat A\varphi_n = a_n\varphi_n, \qquad \hat B\varphi_n = b_n\varphi_n.$$

We begin by assuming that $\hat A$ has an orthonormal basis
$$\chi_n, \quad n = 1, 2, \cdots,$$
of eigenfunctions,
$$\hat A\chi_n = a_n\chi_n,$$
with the eigenvalues listed in increasing order,
$$a_1 \le a_2 \le a_3 \le \cdots.$$
Notice that
$$\hat A\hat B\chi_n = \hat B\hat A\chi_n = a_n\hat B\chi_n,$$
so for each $n$ either $\hat B\chi_n$ vanishes or it is an eigenfunction of $\hat A$ with eigenvalue $a_n$. The
eigenvalue $a_n$ is either degenerate or nondegenerate.

   If it is nondegenerate then we conclude that
$$\hat B\chi_n = b_n\chi_n$$
for some $b_n$ and $\chi_n$ is already an eigenfunction. We let
$$\varphi_n = \chi_n$$
in that case.
    If a_n is degenerate then we have a fight on our hands. Let

                                   a = a_n = \cdots = a_{n+d_n-1}

so the eigenvalue is d_n-fold degenerate. For n \le k \le n + d_n - 1, \hat B \chi_k either vanishes or is an
eigenvector of \hat A and either way we conclude that
                 \hat B \chi_k = linear combination of \chi_n , \cdots , \chi_{n+d_n-1}
                              = \sum_{i=n}^{n+d_n-1} B_{ik} \chi_i ,
where
                              B_{ik} = \langle \chi_i | \hat B \chi_k \rangle .
The coefficients B_{ik} for n \le i, k \le n + d_n - 1 form a d_n \times d_n Hermitian matrix. Let this matrix
have eigenvalues b_j and eigenvectors

                        ( c^{(j)}_1 , c^{(j)}_2 , \cdots , c^{(j)}_{d_n} )^T ,

with j = 1, 2, \cdots , d_n and define

                        \varphi_{n+j-1} = \sum_{i=1}^{d_n} c^{(j)}_i \chi_{n+i-1} .

Then \varphi_{n+j-1} is an eigenfunction of \hat A with eigenvalue a_n,

                        \hat A \varphi_{n+j-1} = a_n \varphi_{n+j-1} ,

and an eigenfunction of \hat B with eigenvalue b_j,

                        \hat B \varphi_{n+j-1} = b_j \varphi_{n+j-1} .

Up to a relabelling of the eigenvalues of \hat B, this is exactly what we set out to prove.

   In practice when faced with a basis of simultaneous eigenfunctions we often use a multiple
index instead of n. For example we might list the eigenfunctions as
                        \hat A \varphi_{lm} = a_l \varphi_{lm} ,       with a_1 < a_2 < \cdots

                        \hat B \varphi_{lm} = b_{lm} \varphi_{lm} ,    with m = 1, \cdots , d_l .
In situations where the two eigenvalues suffice to break the degeneracy of the eigenfunctions it
is also common to label them with the eigenvalues themselves, as in
                                   \hat A \varphi_{ab} = a \varphi_{ab}

                                   \hat B \varphi_{ab} = b \varphi_{ab} .
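The simultaneous-diagonalisation recipe above is easy to test numerically. The following is an illustrative sketch, not part of the development proper, and assumes numpy is available: we build two commuting Hermitian matrices with a shared eigenbasis, diagonalise A, then diagonalise the Hermitian block B_{ik} inside each degenerate eigenspace of A, exactly as in the proof.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random unitary U from the QR decomposition of a complex Gaussian matrix.
U, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))

# Two commuting Hermitian matrices with a shared eigenbasis; A has a
# doubly degenerate eigenvalue a = 1, which B breaks (b = 3 versus b = 4).
A = U @ np.diag([1.0, 1.0, 2.0, 5.0]) @ U.conj().T
B = U @ np.diag([3.0, 4.0, 7.0, 9.0]) @ U.conj().T
assert np.allclose(A @ B, B @ A)

# Step 1: orthonormal eigenbasis chi of A, eigenvalues in increasing order.
a, chi = np.linalg.eigh(A)

# Step 2: inside each degenerate eigenspace of A, diagonalise the
# Hermitian block B_ik = <chi_i | B chi_k> and rotate the basis.
phi = chi.astype(complex).copy()
start = 0
for end in range(1, len(a) + 1):
    if end == len(a) or not np.isclose(a[end], a[start]):
        block = phi[:, start:end]
        _, c = np.linalg.eigh(block.conj().T @ B @ block)
        phi[:, start:end] = block @ c
        start = end

# phi is an orthonormal basis of simultaneous eigenvectors of A and B.
assert np.allclose(phi.conj().T @ phi, np.eye(4))
for M in (A, B):
    D = phi.conj().T @ M @ phi
    assert np.allclose(D, np.diag(np.diag(D)), atol=1e-10)
```

Note that without the second step the eigh output within the degenerate eigenspace of A is an arbitrary orthonormal combination, so B would generally not come out diagonal.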

4.12     Appendix: bra-ket notation
In 1939 Paul Dirac suggested a notation for states and operators that has since become
very popular. It is useful because it helps us to “think quantum-mechanically”, although it is
derided in certain quarters for encouraging nonrigorous thinking. Even those who prefer
not to use it in their own work need to be familiar with its basics, however, since so much of
the writing about quantum mechanics is done in terms of it.
    Bra-ket notation takes advantage of the bracket structure

                                            ⟨ψ|ϕ⟩

of the inner product between two states ψ and ϕ. Instead of writing a state as a function ψ(x)
of position we write it as a “ket”
                                            |ψ⟩ .
The idea is that |ψ⟩ represents a “state vector” in the Hilbert space H. This state vector can
be represented concretely by a wavefunction ψ(x) but might equally be represented by the
components cn in an arbitrary orthonormal basis. We regard the ψ part simply as a label and
in fact can replace it by any symbol we think might be representative of the state. Common
alternatives are
                                    |α⟩ ,     |n⟩ ,     |λ⟩ ,
or even things like
                                       |↑⟩      and |+⟩ .
The important point is that whether we represent a state concretely by a wavefunction ψ(x) or
a set of components cn , we can form linear combinations such as

                                           α|ψ⟩ + β|ϕ⟩

and kets are elements of a vector space.

   An important element in bra-ket notation is that for every ket |ψ⟩ we define a corresponding “bra” ⟨ψ|.
This is defined to be an object which acts linearly on kets to give the complex number
                                          ⟨ψ| · |ϕ⟩ = ⟨ψ|ϕ⟩ .
(Bra-ket, get it?) The relation between a ket and its bra
                                              |ψ⟩ → ⟨ψ|
is the Hilbert-space analog of the relation between a column vector and a row vector

                 v = \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix} \quad\rightarrow\quad v^\dagger = ( v_1^* \; \cdots \; v_n^* ),

using the adjoint operation.
   A very powerful aspect of bra-ket notation is that we can also use it to make sense of combi-
nations such as
                                           |ψ⟩⟨ϕ|
as operators. If |χ⟩ is any third state we define the operator |ψ⟩⟨ϕ| acting on |χ⟩ to be
                             |ψ⟩⟨ϕ| · |χ⟩ = |ψ⟩⟨ϕ|χ⟩ = ( ⟨ϕ|χ⟩ ) |ψ⟩ .
This has use for example in stating that if there is an orthonormal basis consisting of |ϕn⟩’s,
                              \sum_n |\varphi_n\rangle\langle\varphi_n| = I.
This is known as a resolution of the identity and is a compact way of expressing the fact that
the states |ϕn⟩ form a complete set. To see this we note that if the |ϕn⟩’s are complete then
any vector |ψ⟩ can be expanded in the form
                       |\psi\rangle = \sum_n c_n |\varphi_n\rangle
                                    = \sum_n \langle\varphi_n|\psi\rangle \, |\varphi_n\rangle
                                    = \sum_n |\varphi_n\rangle\langle\varphi_n|\psi\rangle
                                    = \Big( \sum_n |\varphi_n\rangle\langle\varphi_n| \Big) |\psi\rangle .

Because in bra-ket notation the label inside the ket can be quite arbitrary and might not have
meaning as a state, when we write matrix elements we keep the operator outside the ket and
write, for example,
                                           ⟨ψ|A|ϕ⟩ ,
where in normal or “default” notation we would write
                                           ⟨ψ|Aϕ⟩ .
This is done so that an expression such as ⟨1|A|2⟩ can be given a sensible interpretation, whereas
⟨1|A2⟩ would be confusing. A slightly annoying aspect of all this is that we have no self-
contained means of representing
                                           ⟨Aψ|ϕ⟩
in bra-ket notation. We must write it instead in terms of the Hermitian conjugate operator
                                           ⟨ψ|A†|ϕ⟩ ,
and before using bra-ket notation we are therefore required to have established and be familiar
with the basic properties of Hermitian conjugates.
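In a finite-dimensional Hilbert space these rules are plain matrix algebra, which makes them easy to illustrate. The following sketch is our own and assumes numpy is available: kets are column vectors, bras are their conjugate transposes, a ket-bra is an outer product, and the resolution of the identity holds for any orthonormal basis.

```python
import numpy as np

# Kets as column vectors in C^3; the bra is the conjugate transpose.
ket_psi = np.array([[1.0], [1.0j], [0.0]]) / np.sqrt(2)
ket_phi = np.array([[0.0], [1.0], [1.0]]) / np.sqrt(2)
bra_psi = ket_psi.conj().T

# <psi|phi> : a bra acting on a ket gives a complex number.
inner = (bra_psi @ ket_phi)[0, 0]
assert np.isclose(inner, -0.5j)

# |psi><phi| : a ket-bra is an operator (here a 3x3 matrix), and
# (|psi><phi|) |chi> = <phi|chi> |psi>.
op = ket_psi @ ket_phi.conj().T
ket_chi = np.array([[1.0], [0.0], [1.0j]])
lhs = op @ ket_chi
rhs = (ket_phi.conj().T @ ket_chi)[0, 0] * ket_psi
assert np.allclose(lhs, rhs)

# Resolution of the identity: sum_n |phi_n><phi_n| = I for any
# orthonormal basis, here the columns of a random unitary matrix.
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
identity = sum(U[:, [n]] @ U[:, [n]].conj().T for n in range(3))
assert np.allclose(identity, np.eye(3))
```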

5     The harmonic oscillator


Figure 8: In the mechanical oscillator a spring exerts a restoring force F = −kx in opposition
to a displacement x. The corresponding potential is V(x) = \tfrac{1}{2} k x^2.

    We now find eigenvalues and eigenvectors of the harmonic oscillator Hamiltonian

                        \hat H = \frac{\hat p^2}{2m} + \frac{1}{2} k \hat x^2 .
This is the Hamiltonian for a one-dimensional particle of mass m, constrained by a spring with
spring constant k. Its importance extends far beyond this simple mechanical model, however.
Its eigensolutions form the basis for our understanding of a good deal of modern physics. They
provide the foundation for understanding vibrations in crystals (“phonons”), quantum optics
(lasers), molecular vibrations and much more. While we will not be directly concerned here

with these more esoteric applications of the model, it is nice to know that they are understood
using essentially the same analysis we are going to develop here.
    There are two ways to “solve” the harmonic oscillator in quantum mechanics. The first ap-
proach, which we can think of as the wave-mechanical picture, is to write down the Schrödinger
equation and solve it as a second order differential equation in terms of known special functions.
This would carry on the sort of approach we used in Chapter 3 and is an option available to us
no matter what one-dimensional potential we are faced with. It is a perfectly good way of doing
things but we will use a second approach, which we might think of as the matrix-mechanical
option. This way of doing things takes advantage of the special properties of the harmonic
oscillator to solve it using only operator algebra, replacing ordinary calculus with the calculus
of commutators. It hardly refers to differential equations. We prefer it because it is very elegant
and is a good way to gain insight into the quantum-mechanical way of doing things. In practical
terms the techniques learned here can also be used to solve other oscillator-type problems, such
as in quantum optics, which are less naturally formulated in terms of differential equations.

5.1    Overview and some definitions
The main ingredient we need for the operator approach is that the Hamiltonian can be written
as a quadratic expression in the operators \hat x and \hat p, which have the commutator

                                  [\hat x, \hat p] = i\hbar .

In order to simplify the manipulations involved we define the rescaled operators

                 \hat Q = (mk)^{1/4} \hat x      and      \hat P = (mk)^{-1/4} \hat p .

Then, letting \omega = \sqrt{k/m} denote the classical frequency of oscillation, we can write

                        \hat H = \tfrac{1}{2} \omega \left( \hat Q^2 + \hat P^2 \right)                 (21)

where the operators \hat Q and \hat P have the commutation relation

                                  [\hat Q, \hat P] = i\hbar .                                           (22)

Let us define

      \hat a = \frac{1}{\sqrt{2\hbar}} \left( \hat Q + i \hat P \right)      and      \hat a^\dagger = \frac{1}{\sqrt{2\hbar}} \left( \hat Q - i \hat P \right)

and
                                  \hat N = \hat a^\dagger \hat a .
For reasons that will become obvious we call \hat a the annihilation operator, its Hermitian conjugate
\hat a^\dagger the creation operator and \hat N the number operator.

   Note that

              \hat N = \frac{1}{2\hbar} \left( \hat Q - i \hat P \right) \left( \hat Q + i \hat P \right)
                     = \frac{1}{2\hbar} \left( \hat Q^2 + i(\hat Q \hat P - \hat P \hat Q) + \hat P^2 \right)
                     = \frac{1}{2\hbar} \left( \hat Q^2 + \hat P^2 \right) - \frac{1}{2}
                     = \frac{\hat H}{\hbar\omega} - \frac{1}{2} .

This means that we can write the Hamiltonian in the form

                        \hat H = \left( \hat N + \tfrac{1}{2} \right) \hbar\omega

and finding the eigensolutions of \hat N is the same as finding the eigensolutions of \hat H. In fact, if

                                  \hat N \varphi_n = n \varphi_n

are the eigensolutions of \hat N, then the eigensolutions of \hat H are

                \hat H \varphi_n = E_n \varphi_n ,      E_n = \left( n + \tfrac{1}{2} \right) \hbar\omega .
We will be able to prove the following.

Claim: The eigenvalues of \hat N are the nonnegative integers n = 0, 1, 2, · · ·.

We will furthermore be able to give a simple recipe for constructing the corresponding eigen-
functions and in so doing we will have given a complete solution of the harmonic-oscillator
problem.
    The upshot of all this is that the energy levels of the harmonic oscillator form a regularly-
spaced ladder, starting at the so-called zero-point energy E_0 = \tfrac{1}{2}\hbar\omega — the minimum energy
allowed for a harmonically confined particle — with a spacing \hbar\omega between neighbouring levels.
This contrasts with the infinite well, where we found ever increasing gaps as the quantum
number n increased, and with the hydrogen spectrum, where levels get closer as the quantum
number increases.

5.2    Technicalities
The main ingredients we need in order to demonstrate the claims of the previous section are
provided by the following observations.

Figure 9: The energy levels of the harmonic oscillator form an equally-spaced ladder, with
neighbouring levels separated by the fixed amount \hbar\omega.

   • \hat N = \hat a^\dagger \hat a is a self-adjoint semipositive operator and therefore its eigenvalues n must be
     real numbers such that n ≥ 0.

   • [\hat a, \hat a^\dagger] = 1.

   • If \hat N \varphi_n = n \varphi_n, then

                        \hat N \hat a \varphi_n = (n-1) \, \hat a \varphi_n                             (23)
                        \hat N \hat a^\dagger \varphi_n = (n+1) \, \hat a^\dagger \varphi_n .           (24)

The first point is covered by the discussion in section 4.6. The second comes simply from
inserting the definition of the creation and annihilation operators into the commutator:

        [\hat a, \hat a^\dagger] = \frac{1}{2\hbar} [\hat Q + i\hat P, \hat Q - i\hat P]
                                 = \frac{1}{2\hbar} \left( [\hat Q, \hat Q] - i[\hat Q, \hat P] + i[\hat P, \hat Q] + [\hat P, \hat P] \right)
                                 = \frac{1}{2\hbar} \left( 0 - i(i\hbar) + i(-i\hbar) + 0 \right)
                                 = 1 .

The third is the most interesting and to a certain extent gives the crux of the solution.
   Let \varphi_n be a proper eigenfunction of \hat N. Then, using

                  \hat a^\dagger \hat a = \hat a \hat a^\dagger + [\hat a^\dagger, \hat a] ,

we get
                  \hat N \hat a \varphi_n = \hat a^\dagger \hat a \, \hat a \varphi_n
                                          = \left( \hat a \hat a^\dagger \hat a + [\hat a^\dagger, \hat a] \hat a \right) \varphi_n
                                          = \left( \hat a \hat N - \hat a \right) \varphi_n
                                          = n \hat a \varphi_n - \hat a \varphi_n ,

from which we get (23). A similar calculation (exercise!) gives (24).
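Relations (23) and (24) can be checked directly on finite matrix representations of the operators. The sketch below is our own illustration, assuming numpy and the number-basis matrix elements \hat a \varphi_n = \sqrt{n}\,\varphi_{n-1} that are derived in the next section; on a truncated basis the relations hold exactly away from the truncation edge.

```python
import numpy as np

D = 6  # truncation dimension: number states |0>, ..., |D-1>

# Matrix representations in the number basis: a|n> = sqrt(n)|n-1>,
# so a has entries a[n-1, n] = sqrt(n); then N = adag a = diag(0, ..., D-1).
a = np.diag(np.sqrt(np.arange(1, D)), k=1)
adag = a.T
N = adag @ a
assert np.allclose(N, np.diag(np.arange(D)))

# Check (23): N (a phi_n) = (n-1) (a phi_n) for each number state phi_n.
for n in range(1, D):
    phi_n = np.eye(D)[:, n]
    assert np.allclose(N @ (a @ phi_n), (n - 1) * (a @ phi_n))

# Check (24): N (adag phi_n) = (n+1) (adag phi_n), staying away from
# the truncation edge n = D-1 (an artifact of the finite matrix).
for n in range(D - 1):
    phi_n = np.eye(D)[:, n]
    assert np.allclose(N @ (adag @ phi_n), (n + 1) * (adag @ phi_n))
```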

5.3      The main result
Now suppose we are supplied with a (normalised) proper eigensolution \hat N \varphi_n = n \varphi_n of \hat N. We
will postpone until the next section the question of whether such a solution exists or whether it
is nondegenerate if it does. We will show here that if such a solution exists, then it is necessarily
a member of an infinite sequence
                                        ϕ0 , ϕ 1 , · · · , ϕ n , · · · ,
labelled by the nonnegative integers n ≥ 0. In particular every eigenvalue of \hat N is a nonnegative
integer.
    The key point is (23). This tells us that either

                                  \hat a \varphi_n = 0

or
                                  \varphi_{n-1} = \frac{1}{\sqrt{n}} \, \hat a \varphi_n

is a second (normalised) eigenfunction of \hat N with eigenvalue n − 1. If the latter is the case we
apply \hat a once again and carrying on in this way we can generate a sequence of eigenfunctions

         \cdots \varphi_{n-2} \xleftarrow{\hat a} \varphi_{n-1} \xleftarrow{\hat a} \varphi_n

with decreasing eigenvalues
                                     ··· < n − 2 < n − 1 < n
of \hat N. Either this sequence terminates because \hat a \varphi_{n_0} = 0 for some n_0 = n - k or the sequence is
infinite. We can immediately exclude the latter because in that case negative eigenvalues would
eventually be generated whereas in the previous section we pointed out that \hat N is semipositive
definite. The sequence must therefore terminate. We have

                                  \hat a \varphi_{n_0} = 0 .

This is equivalent to
                 0 = \| \hat a \varphi_{n_0} \|^2
                   = \langle \hat a \varphi_{n_0} | \hat a \varphi_{n_0} \rangle
                   = \langle \varphi_{n_0} | \hat a^\dagger \hat a \varphi_{n_0} \rangle
                   = \langle \varphi_{n_0} | \hat N \varphi_{n_0} \rangle
                   = n_0            (since \langle \varphi_{n_0} | \varphi_{n_0} \rangle = 1).

Therefore the sequence terminates by having the eigenvalue n_0 = n - k vanish. This means
that all of the members of the sequence

         \varphi_0 \xleftarrow{\hat a} \varphi_1 \xleftarrow{\hat a} \cdots \xleftarrow{\hat a} \varphi_n

generated by applying the annihilation operator to \varphi_n, including \varphi_n itself, have nonnegative
integer eigenvalues.
    Not only that, but by applying the creation operator \hat a^\dagger to \varphi_n we can extend the sequence

         \varphi_n \xrightarrow{\hat a^\dagger} \varphi_{n+1} \xrightarrow{\hat a^\dagger} \varphi_{n+2} \cdots .
We can show that
                        \varphi_{n+1} = \frac{1}{\sqrt{n+1}} \, \hat a^\dagger \varphi_n

is a normalised eigenfunction with eigenvalue n + 1 (exercise!) and since

                        \| \hat a^\dagger \varphi_n \|^2 = n + 1 > 0

the sequence never terminates when extended in this direction.
    To summarise, we have shown that starting from any proper eigensolution we can generate
a sequence with eigenvalues n running over the nonnegative integers. As a byproduct of this
discussion we also have the identities

        \hat a \varphi_n = \begin{cases} 0 & \text{if } n = 0, \\ \sqrt{n} \, \varphi_{n-1} & \text{if } n > 0 \end{cases}                 (25)

and
        \hat a^\dagger \varphi_n = \sqrt{n+1} \, \varphi_{n+1}                                           (26)
which will prove useful later. It will also be useful to note that given \varphi_0 we can generate the
rest of the sequence by repeatedly using (26) and this gives

   \varphi_n = \frac{1}{\sqrt{n}} \, \hat a^\dagger \varphi_{n-1} = \frac{1}{\sqrt{n(n-1)}} \left( \hat a^\dagger \right)^2 \varphi_{n-2} = \cdots = \frac{1}{\sqrt{n!}} \left( \hat a^\dagger \right)^n \varphi_0            (27)

for a general eigenfunction.
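Formula (27) is easy to confirm on the truncated matrix representation of \hat a^\dagger in the number basis. This is our own sketch, assuming numpy; the matrix elements come from (26), and in this basis \varphi_n is simply the unit vector e_n.

```python
import numpy as np
from math import factorial

D = 8  # number states |0>, ..., |D-1>
a = np.diag(np.sqrt(np.arange(1, D)), k=1)   # a|n> = sqrt(n)|n-1>
adag = a.T                                    # adag|n> = sqrt(n+1)|n+1>
phi0 = np.eye(D)[:, 0]

# (27): phi_n = (adag)^n phi_0 / sqrt(n!), checked against the
# number states e_n of the truncated basis.
for n in range(D):
    phi_n = np.linalg.matrix_power(adag, n) @ phi0 / np.sqrt(factorial(n))
    assert np.allclose(phi_n, np.eye(D)[:, n])
```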

5.4     Concrete solutions
The final link in the calculation is to show that a starting solution from which we can deduce
the infinite sequence in the previous section actually exists and to investigate its degree of
degeneracy. To do this we must specify a bit more concretely how the creation and annihilation
operators, or equivalently the position and momentum operators, act on wavefunctions. Having
done this, we will be able to write concrete formulas for the eigenfunctions. This step will
depend on the particular nature of the oscillator problem, and might change depending on the
underlying physics.
   We will concentrate almost entirely on the basic mechanical oscillator, where \hat Q and \hat P act
on functions of one variable according to

     \hat Q \psi(x) = (mk)^{1/4} x \, \psi(x)      and      \hat P \psi(x) = -i\hbar \, (mk)^{-1/4} \psi'(x).
Given these definitions we must find a seed solution \varphi_n(x) to get things going. The simplest
case to consider seems to be n = 0. In that case we have the equation

                                  \hat a \varphi_0(x) = 0

to solve, which is a first order differential equation and simpler than the second order equation
obtained in the general case. In order to avoid being overwhelmed by physical constants when
we do this, let us define a new coordinate

             \xi = \hbar^{-1/2} (mk)^{1/4} x = \sqrt{\frac{m\omega}{\hbar}} \, x .
Then it is easy to verify that

         \hat Q = \sqrt{\hbar} \, \xi      and      \hat P = -i \sqrt{\hbar} \, \frac{d}{d\xi} ,

so that
   \hat a = \frac{1}{\sqrt{2}} \left( \xi + \frac{d}{d\xi} \right)      and      \hat a^\dagger = \frac{1}{\sqrt{2}} \left( \xi - \frac{d}{d\xi} \right) .

If in addition we define rescaled wavefunctions by

             \psi_n(\xi) = \left( \frac{m\omega}{\hbar} \right)^{-1/4} \varphi_n(x),
the normalisation condition

     \int_{-\infty}^{\infty} |\varphi_n(x)|^2 \, dx = \int_{-\infty}^{\infty} |\psi_n(\xi)|^2 \, d\xi = 1

remains simple. (Alternatively, we could be lazy and simply claim to choose units of length,
time and mass so that \hbar = m = \omega = 1. This would lead to the same sort of simplification.)
   With these conventions the seed equation is

         \hat a \psi_0(\xi) = \frac{1}{\sqrt{2}} \left( \xi + \frac{d}{d\xi} \right) \psi_0(\xi) = 0 .

This is a first-order separable ordinary differential equation whose solution

                        \psi_0(\xi) = C e^{-\xi^2/2} ,

where C is an integration constant, is easily found. We choose the constant C so that

             1 = \int_{-\infty}^{\infty} |\psi_0(\xi)|^2 \, d\xi = |C|^2 \sqrt{\pi}

and the choice C = \pi^{-1/4} gives the ground state

                        \psi_0(\xi) = \pi^{-1/4} e^{-\xi^2/2} .

Notice that we have found a solution, so it exists, and it is unique up to a multiplicative
constant, so the ground state is nondegenerate.
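It is worth sanity-checking the ground state numerically. This is our own sketch, assuming numpy: we verify the normalisation integral on a grid and that \hat a \psi_0 = \frac{1}{\sqrt 2}(\xi + d/d\xi)\psi_0 vanishes to the accuracy of a finite-difference derivative.

```python
import numpy as np

# Ground state psi_0(xi) = pi^(-1/4) exp(-xi^2/2) sampled on a grid.
xi = np.linspace(-8.0, 8.0, 4001)
dxi = xi[1] - xi[0]
psi0 = np.pi ** -0.25 * np.exp(-xi ** 2 / 2)

# Normalisation: the integral of |psi_0|^2 should equal 1.
norm = np.sum(psi0 ** 2) * dxi
assert np.isclose(norm, 1.0, atol=1e-6)

# Seed equation: a psi_0 = (xi + d/dxi) psi_0 / sqrt(2) = 0,
# checked with a central-difference derivative.
dpsi0 = np.gradient(psi0, xi)
residual = (xi * psi0 + dpsi0) / np.sqrt(2)
assert np.max(np.abs(residual)) < 1e-4
```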
    From the discussion in the previous section, we can extend these attributes to the other
members of the sequence (ψ0 (ξ), ψ1 (ξ), · · · , ψn (ξ), · · ·). We can also write explicit expressions
for them. For example, from (26),

         \psi_1(\xi) = \hat a^\dagger \psi_0(\xi)
                     = \frac{1}{\sqrt{2}} \left( \xi - \frac{d}{d\xi} \right) \pi^{-1/4} e^{-\xi^2/2}
                     = 2^{-1/2} \pi^{-1/4} (2\xi) \, e^{-\xi^2/2} .

The next (normalised) member of the sequence is

         \psi_2(\xi) = \frac{1}{\sqrt{2}} \, \hat a^\dagger \psi_1(\xi)
                     = 2^{-3/2} \pi^{-1/4} (4\xi^2 - 2) \, e^{-\xi^2/2} .

Carrying on like this we find in general the form

         \psi_n(\xi) = \frac{\pi^{-1/4}}{\sqrt{2^n n!}} \, H_n(\xi) \, e^{-\xi^2/2}

where H_n(\xi) is a polynomial in \xi of degree n. The polynomials H_n(\xi) are known as the Hermite
polynomials.
    Using (27) we can write

         \psi_n(\xi) = \frac{\pi^{-1/4}}{\sqrt{2^n n!}} \left( \xi - \frac{d}{d\xi} \right)^n e^{-\xi^2/2}

for a general eigenfunction, so in fact the Hermite polynomials can be simply defined by the
condition
         H_n(\xi) \, e^{-\xi^2/2} = \left( \xi - \frac{d}{d\xi} \right)^n e^{-\xi^2/2} .                 (28)

The calculations above have shown that
                                         H0 (ξ) = 1
                                         H1 (ξ) = 2ξ
                                         H2 (ξ) = 4ξ 2 − 2
are the first three Hermite polynomials.
    By manipulating these expressions, we can come up with alternative ways of generating the
Hermite polynomials and a couple are worth mentioning in particular. One can show that the
Hermite polynomials obey the recursion relation

         H_{n+1}(\xi) = 2\xi H_n(\xi) - H_n'(\xi)

and, together with the starting condition

                                  H_0(\xi) = 1 ,

this is enough to define them. Another alternative is provided by the recursion relation

         H_{n+1}(\xi) = 2\xi H_n(\xi) - 2n H_{n-1}(\xi) .
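The first recursion, H_{n+1} = 2\xi H_n - H_n', is easy to implement with polynomial coefficient arrays. This is a sketch of our own, assuming numpy, with coefficients stored lowest degree first; it reproduces the polynomials listed above and cross-checks the second recursion.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hermite polynomials as coefficient arrays [c0, c1, c2, ...],
# generated from the recursion H_{n+1} = 2 xi H_n - H_n'.
def hermite(n):
    H = np.array([1.0])                               # starting condition H_0 = 1
    for _ in range(n):
        two_xi_H = 2 * np.concatenate(([0.0], H))     # multiplying by 2 xi shifts coefficients up
        dH = P.polyder(H)                             # derivative H_n'
        dH = np.concatenate((dH, np.zeros(len(two_xi_H) - len(dH))))
        H = two_xi_H - dH
    return H

assert np.allclose(hermite(0), [1])                   # H_0 = 1
assert np.allclose(hermite(1), [0, 2])                # H_1 = 2 xi
assert np.allclose(hermite(2), [-2, 0, 4])            # H_2 = 4 xi^2 - 2
assert np.allclose(hermite(3), [0, -12, 0, 8])        # H_3 = 8 xi^3 - 12 xi

# Cross-check against the second recursion H_{n+1} = 2 xi H_n - 2n H_{n-1}.
for n in range(1, 6):
    alt = 2 * np.concatenate(([0.0], hermite(n)))
    alt[:n] -= 2 * n * hermite(n - 1)
    assert np.allclose(alt, hermite(n + 1))
```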
Details of these and other properties of Hermite polynomials are covered in the problems. There
are many more properties of Hermite polynomials than can be gone into here. The Hermite
polynomials have cropped up in several areas of applied maths and the standard reference
books will provide much more detail. It is not necessary to know systematically what these
additional features are, but it is useful to keep in mind that they exist and to be ready to delve
into the detailed references when necessary.
    All of these additional properties can be derived straightforwardly from (28), which we
obtained as a simple consequence of applying creation operators to the ground state. They
could also have been derived by writing out the Schrödinger equation as a differential equation
and solving it using the Frobenius method. You will find both approaches in the standard
textbooks, but the one adopted here has the advantage of needing less computation and being
more general in its application. The upshot is that using creation and annihilation operators
we are in a position to solve almost any problem involving harmonic oscillators efficiently and
elegantly.
5.5    Using creation and annihilation operators to get results
As a final example of the power of creation and annihilation operators, let us compute the
expectation value
         \langle x^2 \rangle = \frac{1}{\sqrt{mk}} \langle \hat Q^2 \rangle = \frac{1}{m\omega} \langle \hat Q^2 \rangle

for a particle in the state \varphi_n. It is useful to note that \hat Q and \hat P are expressed in terms of the
creation and annihilation operators according to

   \hat Q = \sqrt{\frac{\hbar}{2}} \left( \hat a^\dagger + \hat a \right)      and      \hat P = i \sqrt{\frac{\hbar}{2}} \left( \hat a^\dagger - \hat a \right).

Then, for the state \varphi_n,

   \langle \hat Q \rangle = \sqrt{\frac{\hbar}{2}} \, \langle \varphi_n | (\hat a^\dagger + \hat a) \varphi_n \rangle
                          = \sqrt{\frac{\hbar}{2}} \left( \sqrt{n+1} \, \langle \varphi_n | \varphi_{n+1} \rangle + \sqrt{n} \, \langle \varphi_n | \varphi_{n-1} \rangle \right)
                          = 0,

where we have used the fact that as distinct eigenfunctions of a Hermitian operator the states
\varphi_n form an orthonormal set. We can similarly show that

                                  \langle \hat P \rangle = 0 .
We also find
   \langle \hat Q^2 \rangle = \frac{\hbar}{2} \, \langle \varphi_n | (\hat a^\dagger + \hat a)^2 \varphi_n \rangle
                            = \frac{\hbar}{2} \, \langle \varphi_n | \left( (\hat a^\dagger)^2 + \hat a^\dagger \hat a + \hat a \hat a^\dagger + \hat a^2 \right) \varphi_n \rangle
                            = \frac{\hbar}{2} \, \langle \varphi_n | (\hat a^\dagger \hat a + \hat a \hat a^\dagger) \varphi_n \rangle
                            = \frac{\hbar}{2} \, (2n + 1)

and so
         \langle x^2 \rangle = \left( n + \frac{1}{2} \right) \frac{\hbar}{m\omega} .
It is not hard to get expectation values of other powers and of combinations of position and
momentum in this way.
    Had we tried to evaluate this expectation value using the explicit form of the eigenfunctions
we would have ended up with an integral of the form

   \langle x^2 \rangle = \frac{\hbar}{m\omega} \, \frac{1}{2^n n! \sqrt{\pi}} \int_{-\infty}^{\infty} \xi^2 \, [H_n(\xi)]^2 \, e^{-\xi^2} \, d\xi,

which looks like a lot more work. In fact, using the operator approach we never even had to
write the eigenfunctions explicitly. It is quite often the case with harmonic oscillators that we
can calculate without having to deal with the gory details of Hermite polynomials, or even to
explicitly acknowledge their existence.
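The whole computation of \langle x^2 \rangle can be mirrored with truncated matrices. This is our own sketch, assuming numpy and the convenient units \hbar = m = \omega = 1; the diagonal entries of the matrix for x^2 in the number basis reproduce the formula (n + 1/2)\hbar/m\omega.

```python
import numpy as np

hbar = m = omega = 1.0   # convenient units
D = 12                   # truncated number basis |0>, ..., |D-1>

a = np.diag(np.sqrt(np.arange(1, D)), k=1)
adag = a.T

# Q = sqrt(hbar/2)(adag + a), so the matrix for x^2 is Q^2 / (m omega).
Q = np.sqrt(hbar / 2) * (adag + a)
x2 = Q @ Q / (m * omega)

# <phi_n| Q |phi_n> = 0: Q has no diagonal part in the number basis.
assert np.allclose(np.diag(Q), 0.0)

# <phi_n| x^2 |phi_n> = (n + 1/2) hbar / (m omega), checked away from
# the truncation edge where the finite matrix is inexact.
for n in range(D - 1):
    assert np.isclose(x2[n, n], (n + 0.5) * hbar / (m * omega))
```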

6    Angular momentum
The need to deal with angular momentum in quantum mechanics arises when we try to solve
three-dimensional problems with spherical symmetry. Consider the Hamiltonian

         \hat H = -\frac{\hbar^2}{2M} \nabla^2 + V(r),

where V(r) is a central potential and we write the Laplacian in spherical polar coordinates

   \nabla^2 = \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial}{\partial r} \right) + \frac{1}{r^2 \sin\theta} \frac{\partial}{\partial\theta} \left( \sin\theta \frac{\partial}{\partial\theta} \right) + \frac{1}{r^2 \sin^2\theta} \frac{\partial^2}{\partial\phi^2} .

We use M for the mass because we want to reserve the symbol m for something else. It will be
useful to separate out the angular part of the Laplacian by writing

   \nabla^2 = \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial}{\partial r} \right) - \frac{1}{r^2} \Lambda^2 ,

where
   \Lambda^2 = - \left[ \frac{1}{\sin\theta} \frac{\partial}{\partial\theta} \left( \sin\theta \frac{\partial}{\partial\theta} \right) + \frac{1}{\sin^2\theta} \frac{\partial^2}{\partial\phi^2} \right]

(and yes, the minus signs are meant to be there).
   In the method of separation of variables we will look for eigenfunctions of the form

                                    ψ(r, θ, φ) = R(r)Y (θ, φ).

Then the Schrödinger equation

   \hat H \psi = E\psi      \Rightarrow      \nabla^2 (RY) + \frac{2M(E - V(r))}{\hbar^2} \, RY = 0

becomes
   \frac{Y}{r^2} \frac{d}{dr} \left( r^2 \frac{dR}{dr} \right) - \frac{R}{r^2} \Lambda^2 Y + \frac{2M(E - V(r))}{\hbar^2} \, RY = 0,

which decouples in the form

   \frac{ \dfrac{1}{r^2} \dfrac{d}{dr} \left( r^2 \dfrac{dR}{dr} \right) + \dfrac{2M(E - V(r))}{\hbar^2} \, R }{ \dfrac{1}{r^2} \, R } = \frac{\Lambda^2 Y}{Y} .
On the left of this equation is a function of r. On the right is a function of (θ, φ). The only
way we can get them to balance is if each is a constant, λ say. Therefore we look for solutions
                                     Λ2 Y (θ, φ) = λY (θ, φ).
Notice that this has the form of an eigenvalue equation. Once we have determined an eigenvalue
λ we can return to the radial part of the problem and solve
    \frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{2M(E - V(r))}{\hbar^2}\,R = \frac{\lambda}{r^2}\,R.
This can also be written in the form
    \frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{2M(E - V_{\rm eff}(r))}{\hbar^2}\,R = 0,

where

    V_{\rm eff}(r) = \frac{\hbar^2\lambda}{2Mr^2} + V(r).
This radial equation looks a bit like a one-dimensional Schrödinger equation with an effective
potential V_eff(r) where, in addition to the central potential V(r), we have a centrifugal potential
\hbar^2\lambda/2Mr^2 in which

    L^2 = \hbar^2\lambda

plays the role of the angular momentum squared. When we treat angular momentum quantum-
mechanically this is exactly the conclusion we will reach.
    We will show that the eigensolutions of \Lambda^2 are of the form

    \Lambda^2 Y(\theta, \phi) = l(l+1)\,Y(\theta, \phi), \qquad l = 0, 1, 2, \cdots,

where l is called the angular momentum quantum number. These solutions are (2l + 1)-fold degenerate
with the degenerate eigenfunctions for a given l naturally labelled by a quantum number m
running from −l to l. We write the eigenfunctions in the form

    Y_{lm}(\theta, \phi), \qquad m = -l, -l+1, \cdots, l-1, l.
They are normalised according to the convention

    \int_{S^2} |Y_{lm}|^2\, d\Omega = \int_0^{\pi}\!\!\int_0^{2\pi} |Y_{lm}(\theta, \phi)|^2 \sin\theta\, d\theta\, d\phi = 1,

where the expression at left will be our shorthand for an integral over the angular coordinates
with the usual limits, solid angle element d\Omega = \sin\theta\, d\theta\, d\phi, and S^2 indicating that we integrate
over the two-dimensional sphere.
   These functions are called spherical harmonics. They crop up all over applied maths. In
traditional fields they are usually described by solving differential equations. As with the
harmonic oscillator, however, in using quantum mechanics we have the option of tackling them
using operator methods. These are far more elegant and powerful, and being able to adopt this
point of view is a tremendous benefit of studying quantum mechanics.

6.1     Definitions and commutation relations
In classical mechanics, the angular momentum of a particle passing through the point x with
momentum p is defined to be the cross product
                                                 L = x × p.
In components this reads
                                          Lx = ypz − zpy
                                          Ly = zpx − xpz
                                          Lz = xpy − ypx .

In quantum mechanics, we define the obvious operator versions

    \hat{L}_x = \hat{y}\hat{p}_z - \hat{z}\hat{p}_y = \frac{\hbar}{i}\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right)

    \hat{L}_y = \hat{z}\hat{p}_x - \hat{x}\hat{p}_z = \frac{\hbar}{i}\left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right)

    \hat{L}_z = \hat{x}\hat{p}_y - \hat{y}\hat{p}_x = \frac{\hbar}{i}\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right).

Since x commutes with py and so on, there are no ordering ambiguities in these equations. We
also define the operator
    \hat{L}^2 = \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2,
representing the square magnitude of angular momentum.
   Having introduced these new operators, the first thing we should do is sort out their com-
mutation relations. These are very important and worth remembering:
    [\hat{L}_x, \hat{L}_y] = i\hbar\hat{L}_z

    [\hat{L}_y, \hat{L}_z] = i\hbar\hat{L}_x

    [\hat{L}_z, \hat{L}_x] = i\hbar\hat{L}_y

(notice that these are related to each other by cyclic permutation of x, y and z) and
    [\hat{L}_i, \hat{L}^2] = 0, \qquad i = x, y, z.

Deriving these is a useful bit of practice in manipulating commutators. We have
    [\hat{L}_x, \hat{L}_y] = [\hat{y}\hat{p}_z - \hat{z}\hat{p}_y,\ \hat{z}\hat{p}_x - \hat{x}\hat{p}_z]
                           = \hat{y}[\hat{p}_z, \hat{z}]\hat{p}_x + \hat{x}[\hat{z}, \hat{p}_z]\hat{p}_y
                           = \hat{y}(-i\hbar)\hat{p}_x + \hat{x}(i\hbar)\hat{p}_y
                           = i\hbar(\hat{x}\hat{p}_y - \hat{y}\hat{p}_x)
                           = i\hbar\hat{L}_z
and, using the identity [\hat{A}, \hat{B}\hat{C}] = [\hat{A}, \hat{B}]\hat{C} + \hat{B}[\hat{A}, \hat{C}] from section 4.5,

    [\hat{L}_x, \hat{L}^2] = [\hat{L}_x, \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2]
                           = [\hat{L}_x, \hat{L}_y^2] + [\hat{L}_x, \hat{L}_z^2]
                           = \hat{L}_y[\hat{L}_x, \hat{L}_y] + [\hat{L}_x, \hat{L}_y]\hat{L}_y + \hat{L}_z[\hat{L}_x, \hat{L}_z] + [\hat{L}_x, \hat{L}_z]\hat{L}_z
                           = i\hbar(\hat{L}_y\hat{L}_z + \hat{L}_z\hat{L}_y - \hat{L}_z\hat{L}_y - \hat{L}_y\hat{L}_z)
                           = 0

and similarly for their cyclic permutations.
    The fact that \hat{L}^2 commutes with \hat{L}_z (or any of the other components) is fundamental to
the theory of angular momentum. In general we say that any two observables for which the
corresponding operators commute are compatible. It means that we can find eigensolutions that
are simultaneously eigensolutions for the two operators.
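As a quick sanity check (not part of the notes themselves), the first commutation relation can be verified symbolically with sympy by applying the differential-operator forms of the components to an arbitrary test function:

```python
# Symbolic check that [Lx, Ly] = i*hbar*Lz for the differential-operator
# representations, applied to an arbitrary test function f(x, y, z).
import sympy as sp

x, y, z, hbar = sp.symbols('x y z hbar')
f = sp.Function('f')(x, y, z)

def Lx(g): return (hbar/sp.I) * (y*sp.diff(g, z) - z*sp.diff(g, y))
def Ly(g): return (hbar/sp.I) * (z*sp.diff(g, x) - x*sp.diff(g, z))
def Lz(g): return (hbar/sp.I) * (x*sp.diff(g, y) - y*sp.diff(g, x))

commutator = sp.expand(Lx(Ly(f)) - Ly(Lx(f)) - sp.I*hbar*Lz(f))
print(sp.simplify(commutator))  # 0
```

The cross terms with second derivatives of f cancel because mixed partial derivatives commute, leaving exactly i\hbar\hat{L}_z; the cyclic permutations can be checked the same way.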

6.2    Looking for simultaneous eigenfunctions of \hat{L}^2 and \hat{L}_z
The compatibility of \hat{L}^2 and \hat{L}_z means that it is natural to look for functions Y_{\lambda\mu} such that

    \hat{L}^2 Y_{\lambda\mu} = \lambda Y_{\lambda\mu}

    \hat{L}_z Y_{\lambda\mu} = \mu Y_{\lambda\mu}.

The fact that we single out \hat{L}_z over \hat{L}_x and \hat{L}_y here is purely a matter of convention. From the
observation that \hat{L}^2 - \hat{L}_z^2 = \hat{L}_x^2 + \hat{L}_y^2 is semipositive definite we have the constraint \lambda - \mu^2 \geq 0.
That is,
                                           \lambda \geq \mu^2 \geq 0.                                         (29)
This will be a useful constraint that will play a similar role to the one that the positivity of \hat{N}
played in the case of the harmonic oscillator.
   Now define the ladder operators
    \hat{L}_\pm = \hat{L}_x \pm i\hat{L}_y

and observe that

    \hat{L}_+^\dagger = \hat{L}_-.
From the commutation relations in the previous section we can quite easily show that
    [\hat{L}_z, \hat{L}_\pm] = \pm\hbar\hat{L}_\pm

    [\hat{L}^2, \hat{L}_\pm] = 0.

Notice now that
    \hat{L}_z\hat{L}_\pm Y_{\lambda\mu} = \hat{L}_\pm\hat{L}_z Y_{\lambda\mu} + [\hat{L}_z, \hat{L}_\pm]Y_{\lambda\mu}
                                        = \mu\hat{L}_\pm Y_{\lambda\mu} \pm \hbar\hat{L}_\pm Y_{\lambda\mu}
                                        = (\mu \pm \hbar)\hat{L}_\pm Y_{\lambda\mu},

    \hat{L}^2\hat{L}_\pm Y_{\lambda\mu} = \hat{L}_\pm\hat{L}^2 Y_{\lambda\mu}
                                        = \lambda\hat{L}_\pm Y_{\lambda\mu}.

Therefore either \hat{L}_\pm Y_{\lambda\mu} = 0 or \hat{L}_\pm Y_{\lambda\mu} is a new simultaneous eigenfunction of \hat{L}_z and \hat{L}^2 with
eigenvalues \mu \pm \hbar and \lambda respectively. We can investigate which is the case by evaluating the norm

    \|\hat{L}_\pm Y_{\lambda\mu}\|^2 = \langle Y_{\lambda\mu}|\hat{L}_\pm^\dagger\hat{L}_\pm Y_{\lambda\mu}\rangle = \langle Y_{\lambda\mu}|\hat{L}_\mp\hat{L}_\pm Y_{\lambda\mu}\rangle.
We have
    \hat{L}_-\hat{L}_+ = (\hat{L}_x - i\hat{L}_y)(\hat{L}_x + i\hat{L}_y)
                       = \hat{L}_x^2 + i[\hat{L}_x, \hat{L}_y] + \hat{L}_y^2
                       = \hat{L}^2 - \hat{L}_z^2 - \hbar\hat{L}_z.

This gives
    \|\hat{L}_+ Y_{\lambda\mu}\|^2 = \lambda - \mu^2 - \hbar\mu = \lambda - \mu(\mu + \hbar)                        (30)

and we similarly have

    \|\hat{L}_- Y_{\lambda\mu}\|^2 = \lambda - \mu(\mu - \hbar)                        (31)
(assuming that the eigenfunction Yλµ has been normalised).
    As we did for the harmonic oscillator, we can use \hat{L}_- and \hat{L}_+ to generate new eigenfunctions
of \hat{L}_z and \hat{L}^2 with respectively decreasing and increasing eigenvalues of \hat{L}_z and a fixed eigenvalue
\lambda of \hat{L}^2. If the sequence carried on indefinitely in either direction we would eventually violate
the constraint (29). It must therefore terminate in both directions and we have a finite sequence
of eigenfunctions

    Y_{\lambda\mu_{\min}} \xleftarrow{\hat{L}_-} \cdots \xleftarrow{\hat{L}_-} Y_{\lambda,\mu-\hbar} \xleftarrow{\hat{L}_-} Y_{\lambda\mu} \xrightarrow{\hat{L}_+} Y_{\lambda,\mu+\hbar} \xrightarrow{\hat{L}_+} \cdots \xrightarrow{\hat{L}_+} Y_{\lambda\mu_{\max}}

with \hat{L}_z-eigenvalues

    \mu_{\min} < \cdots < \mu - \hbar < \mu < \mu + \hbar < \cdots < \mu_{\max}.
From (30) and (31) the terminating values are respectively determined by

    \lambda = \mu_{\max}(\mu_{\max} + \hbar)

    \lambda = \mu_{\min}(\mu_{\min} - \hbar).

We know also that \mu_{\max} can be reached from \mu_{\min} by an integer multiple of \hbar, so

    \mu_{\max} = \mu_{\min} + k\hbar

for some integer k \geq 0. From

    \mu_{\max}(\mu_{\max} + \hbar) = \lambda = \mu_{\min}(\mu_{\min} - \hbar)

we then get

    \mu_{\max} = \frac{k}{2}\hbar, \qquad \lambda = \frac{k}{2}\left(\frac{k}{2} + 1\right)\hbar^2.
Let us denote

    l = \frac{k}{2}.
Then the eigenvalues of \hat{L}^2 are constrained to be of the form

    \lambda = l(l+1)\hbar^2                                 (32)

where 2l is a nonnegative integer, and once a single simultaneous eigenfunction is found we can
construct from it a sequence containing 2l + 1 members in which \lambda is fixed and the \hat{L}_z-eigenvalue is

    \mu = m\hbar                                    (33)

where m runs over the range

    m = -l, -l+1, \cdots, l-1, l.                     (34)
We call l the angular momentum quantum number and m the azimuthal quantum number or
magnetic quantum number (the latter because it plays an important role when atoms are placed
in magnetic fields).
    The standard convention is to label the eigenfunctions using the quantum numbers l and m
rather than the eigenvalues \lambda and \mu. From now on we will therefore write the eigenfunctions
in the form

    Y_{lm}(\theta, \phi)

rather than as Y_{\lambda\mu}.
    It is useful to give explicit formulas for the generation of new eigenfunctions from a seed
solution. The main thing to sort out is the normalisation. Recall from (30) and (31) that
    \|\hat{L}_\pm Y_{lm}\|^2 = \lambda - \mu(\mu \pm \hbar) = \bigl(l(l+1) - m(m \pm 1)\bigr)\hbar^2.

This means that we can write

    \hat{L}_\pm Y_{lm} = \begin{cases} 0 & \text{if } m = \pm l, \\ \hbar\sqrt{l(l+1) - m(m \pm 1)}\; Y_{l,m\pm 1} & \text{otherwise.} \end{cases}

6.3    Concrete solutions: spherical harmonics
We are now in a position that is familiar from the harmonic oscillator. We have shown that if
we are given one simultaneous eigenfunction then the eigenvalues are constrained by (32) and
(33) and we can generate from it a sequence of 2l + 1 eigenfunctions in which l is fixed and m
runs between −l and l in unit steps. In doing this we have used only the commutation relations
between the components of angular momentum. To determine whether such solutions exist,
and what their forms are if they do, we will start to use more explicitly the expressions we have
given for the angular momentum operators as differential operators.
    It will help a great deal to express them in terms of polar coordinates on the sphere. By
manipulating the definitions already given in terms of cartesian coordinates it is possible to
show that

    \hat{L}_x = i\hbar\left(\sin\phi\,\frac{\partial}{\partial\theta} + \cos\phi\cot\theta\,\frac{\partial}{\partial\phi}\right)

    \hat{L}_y = -i\hbar\left(\cos\phi\,\frac{\partial}{\partial\theta} - \sin\phi\cot\theta\,\frac{\partial}{\partial\phi}\right)

    \hat{L}_z = -i\hbar\,\frac{\partial}{\partial\phi}

    \hat{L}^2 = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right].
Notice in particular that
    \hat{L}^2 = \hbar^2\Lambda^2,
so in solving the angular momentum problem we will have gone a long way towards solving the
Schrödinger equation for central potentials. It is also worth recording that

    \hat{L}_\pm = \hbar e^{\pm i\phi}\left(\pm\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\phi}\right).

We will assume in this section that these operators act on functions ψ(θ, φ) on the sphere, on
which we define the inner product

    \langle\varphi|\psi\rangle = \int_{S^2} \varphi^*(\theta, \phi)\,\psi(\theta, \phi)\, d\Omega.

So even if we are motivated by a problem in three dimensions in this section we will ignore the
radial coordinate and restrict our attention to the polar coordinates (θ, φ).
   We begin by asking what sorts of functions can be eigenfunctions of \hat{L}_z. These satisfy

    \hat{L}_z Y_{lm}(\theta, \phi) = m\hbar\, Y_{lm}(\theta, \phi)
        \Rightarrow \frac{\partial Y_{lm}}{\partial\phi} = imY_{lm}
        \Rightarrow \frac{\partial}{\partial\phi}\left(e^{-im\phi}\, Y_{lm}\right) = 0
        \Rightarrow e^{-im\phi}\, Y_{lm} = \Theta(\theta)
        \Rightarrow Y_{lm}(\theta, \phi) = e^{im\phi}\,\Theta(\theta)

where Θ(θ) is some (as yet unknown) function of θ alone. If Ylm (θ, φ) is to be a single-valued
function on the sphere it must satisfy the boundary condition
                                    Y (θ, φ + 2π) = Y (θ, φ).
This implies that m must be an integer. From (34) we conclude in turn that l must be an
integer. This is more restrictive than the conclusions we reached in the previous section based
on the commutation relations alone. There 2l might have been an odd integer or, in the
parlance of quantum mechanics, l might have been a half-integer. The present case where
angular momentum operators act on functions on the sphere (or more generally on functions
on three-dimensional space) is known as the case of orbital angular momentum. We say that
orbital angular momentum must have integer quantum number l.
    Nature is not wasteful. If half-integer angular momentum is allowed in principle by the
commutation relations it might be no surprise that it pops up somewhere in quantum theory.
It turns out that elementary particles have a sort of internal angular momentum called spin for
which the operators have the commutation relations we used in the previous section but which
cannot be written in the same way as differential operators. For certain elementary particles
(the electron for example) this spin angular momentum is half-integer. The fact that we did not
exclude this possibility in the previous section was not therefore an oversight. In any case we
will be concerned exclusively with orbital angular momentum in this module so let’s continue
with that discussion.
    For a given (integer) value of l, we know that the top eigenfunction with m = l must be of
the form

    Y_{ll}(\theta, \phi) = \Theta(\theta)\, e^{il\phi}.

We know furthermore that

    0 = \hat{L}_+ Y_{ll}(\theta, \phi)
      = \hbar e^{i\phi}\left(\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\phi}\right)\Theta(\theta)\, e^{il\phi}
      = \hbar e^{i(l+1)\phi}\left(\Theta'(\theta) - l\cot\theta\,\Theta(\theta)\right).

Therefore

    \Theta'(\theta) = l\cot\theta\,\Theta(\theta),

which can be solved as a separable first-order differential equation to give

    \ln\Theta = l\int\cot\theta\, d\theta = l\ln|\sin\theta| + {\rm const.}
        \Rightarrow \Theta = C\sin^l\theta,
where C is an integration constant we will choose so that the eigenfunction is normalised. It’s
not particularly interesting to compute so we will simply quote the result,

    Y_{ll}(\theta, \phi) = \frac{(-1)^l}{2^l\, l!}\sqrt{\frac{(2l+1)!}{4\pi}}\,\sin^l\theta\, e^{il\phi},

where the (−1)l is a matter of convention.
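It is easy to confirm symbolically that this top state is indeed annihilated by the raising operator. The following sympy sketch (dropping the overall factor of \hbar and the normalisation, which play no role here) applies \hat{L}_+ = e^{i\phi}(\partial_\theta + i\cot\theta\,\partial_\phi) to \sin^l\theta\, e^{il\phi}:

```python
# Symbolic check that the top state Y_ll ∝ sin^l(theta) e^{i l phi} is
# annihilated by L+ = e^{i phi} (d/dtheta + i cot(theta) d/dphi),
# with the overall hbar and normalisation constant omitted.
import sympy as sp

theta, phi = sp.symbols('theta phi')
l = sp.symbols('l', positive=True)

Y = sp.sin(theta)**l * sp.exp(sp.I*l*phi)
Lplus_Y = sp.exp(sp.I*phi) * (sp.diff(Y, theta) + sp.I*sp.cot(theta)*sp.diff(Y, phi))
print(sp.simplify(Lplus_Y))  # 0
```

The two terms are l\cot\theta\, Y and -l\cot\theta\, Y respectively, exactly the cancellation used in the derivation above.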
   The other eigenfunctions could in principle be obtained by successively applying the ladder
operator L− to this. We will state without proof the form of a general eigenfunction. For
m ≥ 0,
    Y_{lm}(\theta, \phi) = (-1)^m\sqrt{\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}}\; P_l^m(\cos\theta)\, e^{im\phi},

where P_l^m(\cos\theta) is a special function known as the associated Legendre function. It can be
calculated from the Legendre polynomial P_l(u) using the expression

    P_l^m(\cos\theta) = \sin^m\theta\,\frac{d^m}{du^m}P_l(u), \qquad \text{where } u = \cos\theta.
The spherical harmonics with m < 0 are defined by
                                   Yl,−m (θ, φ) = (−1)m [Ylm (θ, φ)]∗
(the (−1)m is once again a matter of convention). It is also useful to note the inversion symmetry
                              Ylm (π − θ, φ + π) = (−1)l Ylm (θ, φ).
Notice that (π −θ, φ+π) is the point on the sphere antipodal to (θ, φ). If we have wavefunctions
of the form ψ(r, θ, φ) = R(r)Ylm (θ, φ), then this means they have the symmetry
                                        ψ(−x) = (−1)l ψ(x).
   The functions Ylm (θ, φ) are collectively called the spherical harmonics. They form an or-
thonormal set,
    \int_{S^2} [Y_{lm}(\theta, \phi)]^*\, Y_{l'm'}(\theta, \phi)\, d\Omega = \delta_{ll'}\delta_{mm'},
and they are complete, in the sense that any square-integrable function on the sphere can be
written as an expansion
    \psi(\theta, \phi) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} c_{lm}\, Y_{lm}(\theta, \phi),

    c_{lm} = \int_{S^2} [Y_{lm}(\theta, \phi)]^*\, \psi(\theta, \phi)\, d\Omega.

In this context they are important not just in quantum mechanics but in other areas of applied
maths where we try to represent functions of orientation or solve equations with spherical
symmetry.
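The orthonormality relation is easy to check numerically. The sketch below uses scipy's sph_harm (an illustration only; note that scipy's convention takes arguments in the order (m, l, azimuthal angle, polar angle), opposite to the (θ, φ) order used in these notes) with a simple midpoint-rule quadrature over the sphere:

```python
# Midpoint-rule quadrature check of the orthonormality relation
# ∫ [Ylm]* Yl'm' dΩ = δ_ll' δ_mm' over the two-sphere.
# NB scipy.special.sph_harm takes (m, l, azimuthal, polar).
import numpy as np
from scipy.special import sph_harm

def overlap(l1, m1, l2, m2, n=400):
    theta = (np.arange(n) + 0.5) * np.pi / n     # polar angle midpoints on (0, pi)
    phi = (np.arange(2*n) + 0.5) * np.pi / n     # azimuthal midpoints on (0, 2*pi)
    T, P = np.meshgrid(theta, phi, indexing='ij')
    f = np.conj(sph_harm(m1, l1, P, T)) * sph_harm(m2, l2, P, T) * np.sin(T)
    return f.sum() * (np.pi / n) ** 2            # cell area dθ dφ = (π/n)^2

print(abs(overlap(2, 1, 2, 1)))  # ≈ 1
print(abs(overlap(2, 1, 3, 1)))  # ≈ 0
```

The sin θ factor is the solid-angle element dΩ = sin θ dθ dφ from the normalisation convention above.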

6.4     Appendix: angular momentum operators in spherical coordinates
Here we show how the angular momentum operators can be written in terms of spherical polar
coordinates. We work in terms of the vector of operators \hat{\mathbf{L}} = (\hat{L}_x, \hat{L}_y, \hat{L}_z) and write

    \hat{\mathbf{L}}\psi = \frac{\hbar}{i}\,\mathbf{x}\times\nabla\psi.

Write the gradient using spherical polar coordinates
    \nabla\psi = \frac{\partial\psi}{\partial r}\,\mathbf{e}_r + \frac{1}{r}\frac{\partial\psi}{\partial\theta}\,\mathbf{e}_\theta + \frac{1}{r\sin\theta}\frac{\partial\psi}{\partial\phi}\,\mathbf{e}_\phi,
where e_r, e_θ and e_φ are respectively the unit vectors in the radial, polar and azimuthal directions. Writing

    \mathbf{x} = r\,\mathbf{e}_r,

the cross product can be computed as

    \mathbf{x}\times\nabla\psi = \begin{vmatrix} \mathbf{e}_r & \mathbf{e}_\theta & \mathbf{e}_\phi \\ r & 0 & 0 \\ \frac{\partial\psi}{\partial r} & \frac{1}{r}\frac{\partial\psi}{\partial\theta} & \frac{1}{r\sin\theta}\frac{\partial\psi}{\partial\phi} \end{vmatrix}
                               = -\frac{1}{\sin\theta}\frac{\partial\psi}{\partial\phi}\,\mathbf{e}_\theta + \frac{\partial\psi}{\partial\theta}\,\mathbf{e}_\phi.
Then

    \hat{L}_x\psi = \mathbf{e}_x\cdot\hat{\mathbf{L}}\psi
                  = \frac{\hbar}{i}\left(\frac{\partial\psi}{\partial\theta}\,\mathbf{e}_x\cdot\mathbf{e}_\phi - \frac{1}{\sin\theta}\frac{\partial\psi}{\partial\phi}\,\mathbf{e}_x\cdot\mathbf{e}_\theta\right)
                  = \frac{\hbar}{i}\left(\frac{\partial\psi}{\partial\theta}(-\sin\phi) - \frac{1}{\sin\theta}\frac{\partial\psi}{\partial\phi}\cos\phi\cos\theta\right)
                  = i\hbar\left(\sin\phi\,\frac{\partial}{\partial\theta} + \cos\phi\cot\theta\,\frac{\partial}{\partial\phi}\right)\psi

and the expressions claimed for the other components \hat{L}_y and \hat{L}_z can be shown similarly. For
\hat{L}^2 we simply square and add, giving

    \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2 = -\hbar^2\left(\frac{\partial^2}{\partial\theta^2} + \cot\theta\,\frac{\partial}{\partial\theta} + \cot^2\theta\,\frac{\partial^2}{\partial\phi^2} + \frac{\partial^2}{\partial\phi^2}\right)
        = -\hbar^2\left(\frac{\partial^2}{\partial\theta^2} + \cot\theta\,\frac{\partial}{\partial\theta} + {\rm cosec}^2\theta\,\frac{\partial^2}{\partial\phi^2}\right)
        = -\frac{\hbar^2}{\sin^2\theta}\left[\sin\theta\,\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{\partial^2}{\partial\phi^2}\right] = \hbar^2\Lambda^2

as required.
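A quick symbolic cross-check of this change of coordinates, for the simplest component \hat{L}_z: apply the cartesian form (x\,\partial_y - y\,\partial_x) to a test polynomial, substitute x = r\sin\theta\cos\phi etc., and compare with \partial_\phi acting on the same function written in polars (the common factor \hbar/i drops out of both sides). This is an illustrative sketch, not part of the notes:

```python
# Check that the cartesian and spherical-polar forms of Lz agree on a test
# function: (x d/dy - y d/dx) f should equal d/dphi of f in polars,
# with the common factor hbar/i dropped from both sides.
import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)
x, y, z = sp.symbols('x y z')
spherical = {x: r*sp.sin(th)*sp.cos(ph),
             y: r*sp.sin(th)*sp.sin(ph),
             z: r*sp.cos(th)}

f = x*y + z**2 * x                     # arbitrary polynomial test function
lhs = (x*sp.diff(f, y) - y*sp.diff(f, x)).subs(spherical)
rhs = sp.diff(f.subs(spherical), ph)
print(sp.simplify(lhs - rhs))  # 0
```

The same substitution trick verifies the \hat{L}_x and \hat{L}_y expressions, at the cost of longer trigonometric simplifications.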

7     Higher-dimensional problems
We will now look at how we might tackle the Schrödinger equation

    \left(-\frac{\hbar^2}{2M}\nabla^2 + V(\mathbf{x})\right)\psi(\mathbf{x}) = E\psi(\mathbf{x})
in higher dimensions. In general this is a difficult, even intractable problem, but we are lucky
that many important examples have enough symmetry that they are completely solvable. Our
ultimate goal in this module is to be able to solve Hydrogen, but we will begin by tackling
slightly more general problems.
    Our approach here will be to use the method of separation of variables, which we have al-
ready partially covered in the previous chapter. The three-dimensional problem with rotational
symmetry is the most important one for obvious reasons, but it is informative to begin with
problems separable in cartesian coordinates.

7.1    Separation in cartesian coordinates
The method of separation of variables is simplest in cartesian coordinates. It works when we
have a potential of the form
                               V (x, y, z) = u(x) + v(y) + w(z).
This is obviously a fairly special condition but one important problem for which we have this
is the three-dimensional harmonic oscillator
    V(x, y, z) = \frac{1}{2}kr^2 = \frac{1}{2}k(x^2 + y^2 + z^2).
Let’s illustrate what happens in this case.
   We look for eigenfunctions of the form
                                    ψ(x, y, z) = X(x)Y (y)Z(z).
On substituting into the Schrödinger equation in which the Laplacian is written in cartesian
coordinates

    \nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}

we get an equation

    -\frac{\hbar^2}{2M}\left(X''YZ + XY''Z + XYZ''\right) + \bigl(u(x) + v(y) + w(z)\bigr)XYZ = EXYZ

which on dividing by XYZ can be separated on the left into a function of x plus a function of
y plus a function of z,

    \frac{-\frac{\hbar^2}{2M}X'' + u(x)X}{X} + \frac{-\frac{\hbar^2}{2M}Y'' + v(y)Y}{Y} + \frac{-\frac{\hbar^2}{2M}Z'' + w(z)Z}{Z} = E.

The only way a function of x, a function of y and a function of z can add to give a constant is
if they are each individually constant. We therefore deduce that

    \frac{-\frac{\hbar^2}{2M}X'' + u(x)X}{X} = E_x, \qquad
    \frac{-\frac{\hbar^2}{2M}Y'' + v(y)Y}{Y} = E_y, \qquad
    \frac{-\frac{\hbar^2}{2M}Z'' + w(z)Z}{Z} = E_z,

    E_x + E_y + E_z = E.

We have therefore replaced the three-dimensional problem with three separate one-dimensional
problems.
    In the case of the three-dimensional harmonic oscillator we have already solved the one-
dimensional problems in Chapter 5. We find that the separation constants must be of the form
    E_x = \left(n + \frac{1}{2}\right)\hbar\omega, \qquad E_y = \left(m + \frac{1}{2}\right)\hbar\omega \qquad\text{and}\qquad E_z = \left(l + \frac{1}{2}\right)\hbar\omega,
where n, m and l are independent quantum numbers, each running from 0 to ∞. The corre-
sponding eigenfunctions are
                   X(x) = ϕn (x),        Y (y) = ϕm (y),       and       Z(z) = ϕl (z).
The three-dimensional eigenfunctions
                                   ψnml (x, y, z) = ϕn (x)ϕm (y)ϕl (z)
are therefore labelled by three separate quantum numbers and the corresponding energies are
    E_{nml} = \left(n + \frac{1}{2}\right)\hbar\omega + \left(m + \frac{1}{2}\right)\hbar\omega + \left(l + \frac{1}{2}\right)\hbar\omega = \left(n + m + l + \frac{3}{2}\right)\hbar\omega.
Notice that these levels are degenerate, a possibility that was excluded in one-dimensional
problems. For example, there are six ways to get the energy E = (2 + 3/2)\hbar\omega = (7/2)\hbar\omega: we
could have (n, m, l) = (2, 0, 0), (0, 2, 0), (0, 0, 2), (1, 1, 0), (1, 0, 1) or (0, 1, 1) and each of these
has a different eigenfunction \psi_{nml}(x, y, z). Any linear combination of them such as

    \psi(x, y, z) = \frac{1}{\sqrt{3}}\left(\psi_{110}(x, y, z) + \psi_{101}(x, y, z) + \psi_{011}(x, y, z)\right)

will also be an eigenfunction. The energy level (7/2)\hbar\omega therefore defines a 6-dimensional vector
space of eigenfunctions.
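The counting generalises: the degeneracy of the level E = (N + 3/2)\hbar\omega is the number of ways of writing N = n + m + l with nonnegative integers, which equals (N+1)(N+2)/2. A brute-force check (an illustration, not from the notes):

```python
# Count the triples (n, m, l) of nonnegative integers with n + m + l = N and
# compare against the closed form (N+1)(N+2)/2 for the oscillator degeneracy.
from itertools import product

def degeneracy(N):
    return sum(1 for n, m, l in product(range(N + 1), repeat=3) if n + m + l == N)

for N in range(5):
    assert degeneracy(N) == (N + 1) * (N + 2) // 2
print([degeneracy(N) for N in range(5)])  # [1, 3, 6, 10, 15]
```

The value 6 at N = 2 is exactly the six triples listed above for the level (7/2)\hbar\omega.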

7.2    Central potentials in two dimensions
In two dimensions the Laplacian is expressed in polar coordinates (r, θ) as

    \nabla^2 = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2}.
For central potentials V (r) we look for solutions of the form

                                      ψ(r, θ) = R(r)Θ(θ).

Plugging in and separating into functions of r and functions of θ gives

    \frac{\frac{1}{r}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \frac{2M(E - V(r))}{\hbar^2}\,R}{\frac{1}{r^2}\,R} = -\frac{\Theta''}{\Theta}.

A function of r can only equal a function of θ if they are both equal to the same constant, λ
say. The resulting angular equation

    -\Theta''(\theta) = \lambda\Theta(\theta)

has solutions subject to the boundary condition

                                          Θ(θ + 2π) = Θ(θ)

which are of the (unnormalised) form

                                              Θ(θ) = eimθ

where m is an integer. These angular functions are the two-dimensional equivalent of the
spherical harmonics. Compare the two dimensional equation

                                                λ = m2

with its three-dimensional equivalent λ = l(l + 1).
   Here we are more interested in the radial equation
    \frac{1}{r}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \frac{2M(E - V(r))}{\hbar^2}\,R = \frac{m^2}{r^2}\,R

or, expanding the derivative,

    R''(r) + \frac{1}{r}R'(r) + \left(\frac{2M(E - V(r))}{\hbar^2} - \frac{m^2}{r^2}\right)R(r) = 0.
As in the three-dimensional case, this is similar to the one-dimensional Schrödinger equation
with an effective potential
    V_{\rm eff}(r) = V(r) + \frac{m^2\hbar^2}{2Mr^2}.

The solution, of course, depends on the details of the potential V (r). Let's try a particular example.

Example A particle in a two-dimensional circular cavity.
This is a two-dimensional analog of the particle-in-a-box problem we solved in Chapter 3. We
might define the central potential

    V(r) = \begin{cases} 0 & \text{for } r < a, \\ \infty & \text{for } r > a. \end{cases}
Alternatively we write out the Sch¨dinger equation for a free particle in two dimensions and
impose the boundary condition
                                        ψ(a, θ) = 0.
                                        2M E
                                k2 =                and        x = kr
the radial equation can be written as

                          d²R/dx² + (1/x) dR/dx + ( 1 − m²/x² ) R = 0

and this is recognised as Bessel’s equation. The general solution is

                                 R(r) = AJm (kr) + BYm (kr),

where A and B are constants and Jm (x) and Ym (x) are Bessel functions in the usual notation.
To avoid singularities in the wavefunction at r = 0 we demand B = 0. The boundary condition
at r = a then gives
                                          Jm (ka) = 0.
For each m the zeros of the Bessel function — xnm with n = 1, 2, · · · — are tabulated in
reference books or readily available from numerical packages. We therefore get a solution to
the Schrödinger equation if
                                        k = knm = xnm /a
and we get the corresponding energy levels

                              Enm = ħ²knm² /(2M ) = ħ²xnm² /(2M a²).
The corresponding eigenfunctions are of the form

                                 ψnm (r, θ) = CJm (knm r)eimθ ,

where C is a normalisation constant.
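For concrete numbers, the zeros xnm are easy to obtain from a numerical package. The following is a minimal sketch, not part of the notes: it assumes SciPy is available and chooses units with ħ = M = a = 1, so that Enm = xnm²/2.

```python
import numpy as np
from scipy.special import jn_zeros  # zeros of the Bessel function J_m

# Units chosen so that hbar = M = a = 1, giving E_nm = x_nm^2 / 2,
# where x_nm is the n-th zero of J_m.
def cavity_levels(m_max=2, n_max=3):
    levels = {}
    for m in range(m_max + 1):
        for n, x in enumerate(jn_zeros(m, n_max), start=1):
            levels[(n, m)] = x**2 / 2
    return levels

levels = cavity_levels()
# The ground state comes from the first zero of J_0, x_10 ≈ 2.4048.
print(levels[(1, 0)])  # ≈ 2.8916
```

Note that the levels interleave between different m: the first excited state comes from the first zero of J₁, not the second zero of J₀.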

7.3    Central potentials in three dimensions

We have already established in the previous chapter that separated solutions of the Schrödinger
equation are of the form
                                   ψ(r, θ, φ) = R(r)Ylm (θ, φ)
where the Ylm ’s are the spherical harmonics and R(r) satisfies the radial equation

               (1/r²) d/dr ( r² dR/dr ) + ( 2M (E − V (r))/ħ² − l(l + 1)/r² ) R = 0

or equivalently

               R''(r) + (2/r) R'(r) + ( 2M (E − V (r))/ħ² − l(l + 1)/r² ) R = 0.

We have already remarked that this looks like a one-dimensional equation with the effective
potential
                              Veff (r) = V (r) + l(l + 1)ħ²/(2M r²).

We can make this analogy more precise as follows.

Figure 10: The three-dimensional radial equation can be mapped onto a one-dimensional prob-
lem on the half-line r > 0 with hard-wall boundary conditions at r = 0.

   Define the function
                                         u(r) = rR(r).
Substituting this into the radial equation leads to the equation

                        u''(r) + ( 2M (E − Veff (r))/ħ² ) u(r) = 0                        (35)
which is precisely the equation we had in one dimension. The boundary conditions are novel
however. Notice that
                                        R(r) = u(r)/r
is singular at r = 0 unless
                                    u(r) → 0 as r → 0.
Imposing this boundary condition, and solving (35) on the half-line 0 < r < ∞ is equivalent to
solving the problem of a particle moving in one dimension under the influence of the potential

                                                    Veff (r) for r > 0,
                                     v(r) =
                                                    ∞        for r < 0.
Alternatively it is the problem of a particle moving on the half-line 0 < r < ∞ under the
potential Veff (r) and with a hard wall at r = 0 (Figure 10).
   The normalisation conditions are also the natural ones we would have for a one-dimensional
problem:
                          ‖ψ‖² = ∫₀∞ ∫S² |ψ|² r² dr dΩ
                                = ∫₀∞ |R(r)|² r² dr · ∫S² |Ylm (θ, φ)|² dΩ
                                = ∫₀∞ |u(r)|² dr.

In addition, the probability that the particle is somewhere in the shell defined by a < r < b is

                                 P (r ∈ (a, b)) = ∫ab |u(r)|² dr

so |u(r)|² is a kind of radial probability density.

Example Spherical solutions of the three-dimensional harmonic oscillator.
If the potential is
                                        V (r) = ½kr²
then the spherically symmetric eigenstates with l = 0 lead to the equation

                                u'' + (2M/ħ²)( E − ½kr² ) u = 0
with the boundary conditions
                                             u → 0 as r → 0, ∞.
Except for the boundary condition at r = 0, we recognise here the Schrödinger equation for the
harmonic oscillator, which we have already solved.

   Imposing u → 0 as r → 0 means that we select the solutions of the harmonic oscillator with
odd values of the quantum number n, for which the eigenfunctions are odd. We have explicitly

             un (r) = ( √2/√(2ⁿ n!) ) (mω/(πħ))^(1/4) e−ξ²/2 Hn (ξ),   n = 1, 3, 5, · · · ,

where ξ = r√(mω/ħ). There is an extra factor of √2 because the normalisation condition is
different to the one used in Chapter 5 in that it involves integrating over the half-line only.
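The half-line picture can also be checked numerically. The sketch below (not from the notes) applies a shooting method to the l = 0 radial problem with M = ħ = ω = 1, where the equation reads u'' = (r² − 2E)u: integrate outward from u(0) = 0 and bisect on E, using the fact that the sign of u at large r flips as E crosses an eigenvalue. The lowest l = 0 level should come out near E = 3/2.

```python
# Shooting method for u''(r) = (r^2 - 2E) u(r) on the half-line,
# the l = 0 radial problem of the 3D harmonic oscillator (M = hbar = omega = 1).

def shoot(E, r_max=6.0, h=2e-3):
    """Integrate u'' = (r^2 - 2E) u from r = 0 with u(0) = 0, return u(r_max)."""
    r, u, v = 0.0, 0.0, 1.0          # u(0) = 0, u'(0) = 1 (overall scale is irrelevant)
    def f(r, u, v):                  # first-order system: (u', v') with v = u'
        return v, (r * r - 2.0 * E) * u
    while r < r_max:
        k1u, k1v = f(r, u, v)        # classical fourth-order Runge-Kutta step
        k2u, k2v = f(r + h/2, u + h/2*k1u, v + h/2*k1v)
        k3u, k3v = f(r + h/2, u + h/2*k2u, v + h/2*k2v)
        k4u, k4v = f(r + h, u + h*k3u, v + h*k3v)
        u += h/6 * (k1u + 2*k2u + 2*k3u + k4u)
        v += h/6 * (k1v + 2*k2v + 2*k3v + k4v)
        r += h
    return u

lo, hi = 1.0, 2.0                    # bracket: u(r_max) > 0 at E = 1, < 0 at E = 2
for _ in range(40):
    mid = 0.5 * (lo + hi)
    if shoot(mid) > 0:
        lo = mid
    else:
        hi = mid

print(0.5 * (lo + hi))               # ≈ 1.5, i.e. E_0 = (3/2) hbar omega
```

The truncation radius r_max = 6 is a pragmatic choice: the tail of the true eigenfunction there is of order e⁻¹⁸, so the eigenvalue shift it causes is negligible.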
   In particular the ground-state wavefunction is

                 ψ0 = (1/r) u1 (r) Y00 (θ, φ)
                    = (1/r) (mω/(πħ))^(1/4) e−mωr²/(2ħ) · 2r√(mω/ħ) · (1/√(4π))
                    = (mω/(πħ))^(3/4) e−mωr²/(2ħ) ,
a simple Gaussian. Let's verify that it is normalised:

                        ‖ψ0‖² = ∫₀∞ ∫S² |ψ0|² r² dr dΩ
                              = 4π ∫₀∞ |ψ0|² r² dr
                              = 4π (mω/(πħ))^(3/2) ∫₀∞ e−mωr²/ħ r² dr
                              = (4/√π) ∫₀∞ e−ξ² ξ² dξ
                              = 1.
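The final Gaussian integral can be confirmed symbolically. A quick check, not part of the notes, using SymPy:

```python
import sympy as sp

xi = sp.symbols('xi', positive=True)

# The radial integral appearing in the normalisation of psi_0:
# (4/sqrt(pi)) * Integral_0^oo exp(-xi^2) xi^2 dxi should equal 1.
integral = sp.integrate(sp.exp(-xi**2) * xi**2, (xi, 0, sp.oo))
print(integral)                                     # sqrt(pi)/4
print(sp.simplify(4 / sp.sqrt(sp.pi) * integral))   # 1
```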
The corresponding ground-state energy is

                                 E0 = (1 + ½) ħω = (3/2) ħω.
   Notice also that using
                                      r² = x² + y² + z²
in the exponential allows us to write
                                     ψ0 (x, y, z) = ϕ0 (x)ϕ0 (y)ϕ0 (z)
where ϕ0 (x) is the ground state of the one-dimensional harmonic oscillator as calculated in
Chapter 5. The ground state therefore also coincides with the eigenfunction denoted by
ψ000 (x, y, z) in section 7.1. In that case we found the same energy but explained it differently
— as the sum of the ground state energies for the x, y and z degrees of freedom
                              E0 = ½ħω + ½ħω + ½ħω = (3/2)ħω.
The end result is the same however.

7.4    Hydrogen
We will adopt a model for Hydrogen in which an electron of mass me and charge −e orbits
a fixed nucleus of charge Ze and where the force of attraction is derived from the Coulomb
potential
                                   V (r) = −Ze²/([4πε0 ]r).
For Hydrogen proper Z = 1, but it is simple to allow for the possibilities Z = 2, 3, · · · which
would describe ions He+ , Li++ and so on, which are “hydrogen-like ions”. The model of a fixed
centre is a bit of a simplification because, while heavy compared to the electron, nuclei do have
finite masses and can move around. It turns out that a slight modification of the calculation
we are about to perform allows this effect to be incorporated without approximation, but a
description of how this is done would be too much of a diversion so we will simply assert a
centre of infinite mass.
    The factor in square brackets is present if we use SI units but is absent in cgs units. SI units
are easier to compare with experiments, but cgs units are better in theoretical work because
they lead to simpler equations with fewer factors of 4π and suchlike. When we deal with atomic
problems it is usual in fact to make an even stronger simplification. In natural units we express
mass in units of the electron mass me , distances in units of the Bohr radius a0 and energy
in units of the quantity E0 = e2 /([4πε0 ]a0 ) defined in the first chapter. This is equivalent to
saying that we choose units in which

                                     ħ = e²/[4πε0 ] = me = 1.

It means that we can simplify the Schrödinger equation from

                          −(ħ²/(2me )) ∇²ψ − (Ze²/([4πε0 ]r)) ψ = Eψ

to
                                 −½ ∇²ψ − (Z/r) ψ = Eψ.
In natural units the Bohr energies are
                                       En = −Z²/(2n²).
Whenever we get an answer like this, appearing to give a physical answer as a dimensionless
quantity, in order to express the result in more familiar units we should simply remember that
the dimensionless number gives us the answer as a multiple of some fundamental unit, E0 in
the case of energy, a0 in the case of distance etc.
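As a worked instance of this conversion rule, here is a sketch (not part of the notes, using CODATA values for the constants) that recovers the familiar electron-volt figures from the natural-unit answers:

```python
import math

# CODATA values (SI): electron charge, vacuum permittivity, Bohr radius.
e = 1.602176634e-19        # C
eps0 = 8.8541878128e-12    # F/m
a0 = 5.29177210903e-11     # m

# The atomic energy unit E0 = e^2 / ([4 pi eps0] a0).
E0_joule = e**2 / (4 * math.pi * eps0 * a0)
E0_eV = E0_joule / e
print(E0_eV)               # ≈ 27.2 eV

# The hydrogen ground state is E_1 = -1/2 in natural units, so in eV:
print(-0.5 * E0_eV)        # ≈ -13.6 eV, the familiar ionisation energy
```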
    As we did for a general central potential in three dimensions, we look for an eigenfunction
of the form
                               ψ(r, θ, φ) = (u(r)/r) Ylm (θ, φ).

The radial equation for u(r) is

                       u''(r) + ( 2E + 2Z/r − l(l + 1)/r² ) u(r) = 0.

Introduce the new energy-controlling parameter

                                       n = Z/√(−2E)

(remember bound states have E < 0 and we hope to get the answer E = −1/(2n²) with integer
n for Z = 1) and the new radial coordinate

                                       ρ = (2Z/n) r.

Then the radial equation is

                      d²u/dρ² + ( −¼ + n/ρ − l(l + 1)/ρ² ) u = 0.
We will perform one more transformation to turn this equation into a standard form. To justify
a little why the transformation is chosen as it is, let us consider a couple of limits.

The limit ρ → 0. In this limit the leading terms in the radial equation are
                             d²u/dρ² − ( l(l + 1)/ρ² ) u = 0,
for which the solution behaves as
                                          u ∼ ρ^(l+1)

(there is also a solution ρ^(−l) that we eliminate because it diverges).

The limit ρ → ∞. In this limit the leading terms in the radial equation are
                                    d²u/dρ² − ¼ u = 0,
for which the solution behaves as
                                           u ∼ e−ρ/2 .
In this case we discard an exponentially growing solution if u is to be square-integrable.

   We make a transformation which incorporates both of these limits. We define w(ρ) by

                                     u = ρ^(l+1) e−ρ/2 w.

When the radial equation is worked out in terms of w we get

                     ρw''(ρ) + (2(l + 1) − ρ)w'(ρ) + (n − l − 1)w(ρ) = 0.

This equation is “well-known” and is called Laguerre’s equation.
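Verifying this transformation by hand is straightforward but tedious. As a sketch (not part of the notes), SymPy can confirm that substituting u = ρ^(l+1) e^(−ρ/2) w into the radial equation leaves exactly Laguerre's equation, for symbolic n and l:

```python
import sympy as sp

rho = sp.symbols('rho', positive=True)
n, l = sp.symbols('n l')
w = sp.Function('w')

# Substitute u = rho^(l+1) exp(-rho/2) w(rho) into the radial equation
# u'' + (-1/4 + n/rho - l(l+1)/rho^2) u = 0 ...
u = rho**(l + 1) * sp.exp(-rho / 2) * w(rho)
radial = sp.diff(u, rho, 2) + (sp.Rational(-1, 4) + n/rho - l*(l + 1)/rho**2) * u

# ... and compare with Laguerre's equation
# rho w'' + (2(l+1) - rho) w' + (n - l - 1) w = 0,
# after stripping the common factor rho^l exp(-rho/2).
laguerre = (rho * sp.diff(w(rho), rho, 2)
            + (2*(l + 1) - rho) * sp.diff(w(rho), rho)
            + (n - l - 1) * w(rho))

diff = sp.simplify(radial / (rho**l * sp.exp(-rho / 2)) - laguerre)
print(diff)   # 0
```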
   We can write simple solutions of this equation where n and l are integers. For example,

                          w=1          for n = l + 1 and l = 0, 1, 2, · · ·

                              w =2−ρ            for n = 2 and l = 0.
We will find that the most general solution which leads to a square-integrable wavefunction
occurs when n and l are integers and w(ρ) is a polynomial of degree n − l − 1.
   Look for a solution in series form
                                    w(ρ) = Σk≥0 ak ρ^k .

The individual terms in Laguerre's equation are

               ρw''(ρ) = Σk≥0 k(k − 1)ak ρ^(k−1) = Σk≥0 (k + 1)k ak+1 ρ^k
         (2l + 2)w'(ρ) = Σk≥0 2(l + 1)k ak ρ^(k−1) = Σk≥0 2(l + 1)(k + 1) ak+1 ρ^k
              −ρw'(ρ) = Σk≥0 (−k ak ) ρ^k
        (n − l − 1)w(ρ) = Σk≥0 (n − l − 1)ak ρ^k .

Adding up and equating coefficients gives

                       [k + 2(l + 1)](k + 1)ak+1 + (n − l − 1 − k)ak = 0                          (36)

                        ak+1 /ak = (1 + l + k − n) / ( (k + 1)[k + 2(l + 1)] ).
Notice that
                                       ak+1 /ak ∼ 1/k

for fixed l and n as k → ∞, which is the ratio we find in the Taylor series of e^ρ. From this one
can show that
                                    w(ρ) ∼ const. × e^ρ

unless the series terminates. This would lead to a u = ρ^(l+1) e−ρ/2 w that grows like e^(ρ/2)
as ρ → ∞ and is not square-integrable.
   We therefore arrive at the conclusion that the only solutions leading to square-integrable
wavefunctions are those for which the series terminates. This happens when

                                            k = n − l − 1.

From this we deduce that n is an integer which is strictly greater than l. We think of the
terminating value of k as a quantum number and denote it by N . By convention this is called
the radial quantum number and counts the zeros in the radial part of the wavefunction. We
find solutions with N = 0, 1, 2, · · · and we denote

                                          n = 1 + l + N,

which may take the values n = 1, 2, 3, · · · as the principal quantum number.
   In summary we have found that the eigenvalues are
                                       En = −Z²/(2n²)
where n is a positive integer. These eigenvalues are generally quite degenerate. For a fixed n
we can write eigenfunctions ψnlm for all values of l and m for which

                            l = 0, 1, · · · , n − 1 and     − l ≤ m ≤ l.

For the ground state with n = 1 there is a single such state, with l = m = 0. For n = 2 we
may take l = 0 or 1 and these have 1- and 3-fold degenerate spherical harmonics respectively.
So the first excited state is 4-fold degenerate. The degeneracy increases steadily with n in this
way: in general the level En is n²-fold degenerate, since Σl (2l + 1) = n² for l = 0, . . . , n − 1.
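The counting behind this pattern can be tabulated mechanically; a tiny sketch (the helper name is ours, not the notes'):

```python
# Degeneracy of the hydrogen level E_n: for each l = 0, ..., n-1 there are
# 2l + 1 values of m, so the total is sum_{l=0}^{n-1} (2l + 1) = n^2.
def degeneracy(n):
    return sum(2 * l + 1 for l in range(n))

for n in range(1, 5):
    print(n, degeneracy(n))   # 1 1 / 2 4 / 3 9 / 4 16
```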
   For given values of n and l, we can easily generate the polynomial w(ρ) using (36). We will
not discuss in great detail how the properties of these polynomials are developed but we will
quote some useful results. The eigenfunctions can be written

                 ψnlm (ρ, θ, φ) = const. × ρ^l e−ρ/2 L^(2l+1)_(n+l) (ρ) Ylm (θ, φ)

where the associated Laguerre polynomials are defined by

                                  L^q_p (ρ) = (d^q/dρ^q) Lp (ρ)

and the Laguerre polynomials Lp (ρ) are generated by

                                Lp (ρ) = e^ρ (d^p/dρ^p) ( ρ^p e−ρ ).
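The recurrence (36) does indeed generate these polynomials directly. As a sketch (not part of the notes, with a helper name of our choosing), using exact rational arithmetic and checked against the w = 2 − ρ example above:

```python
from fractions import Fraction

# Generate the coefficients a_0, ..., a_{n-l-1} of w(rho) from recurrence (36):
# a_{k+1} = -(n - l - 1 - k) a_k / ((k + 2(l + 1))(k + 1)),
# starting from a_0 = 1; the series terminates at k = n - l - 1.
def w_coefficients(n, l):
    a = [Fraction(1)]
    for k in range(n - l - 1):
        a.append(Fraction(-(n - l - 1 - k), (k + 2 * (l + 1)) * (k + 1)) * a[k])
    return a

print(w_coefficients(2, 0))   # coefficients 1, -1/2, i.e. w proportional to 2 - rho
print(w_coefficients(3, 0))   # the degree-2 polynomial for the 3S state
```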

Example: The wavefunction for states with n = l + 1 is of the form

                       ψnlm (ρ, θ, φ) = const. × ρ^(n−1) e−ρ/2 Ylm (θ, φ)
                                      = const. × e^(−ρ/2+(n−1) ln ρ) Ylm (θ, φ).

Along a radial line, this is greatest when

                      0 = d/dρ ( −ρ/2 + (n − 1) ln ρ ) = (n − 1)/ρ − 1/2,

that is, when
                                       ρ = 2(n − 1).
In the proper radial coordinate this is

                                 r = (n/(2Z)) ρ = n(n − 1)/Z.
When n is large, this tells us that the wavefunction is greatest on a shell of radius r ≈ n²/Z or,
in more general units, r ≈ n²a0 /Z, that is, n²/Z times the Bohr radius. Therefore the answer
has elements of the Bohr model in it but of course presents a much more sophisticated and
complete picture.
    These solutions are of such fundamental importance in atomic theory that special notation,
called spectroscopic notation, is in common usage. Letters are used to denote the lower values
of l: S for l = 0, P for l = 1, D for l = 2 and other letters for larger l. We might write

                                  ψ2S = const. × (2 − ρ)e−ρ/2

for a state with l = 0 and n = 2. The first few states are enumerated in the table below.

                 n    Possible angular momenta and states    degeneracies    total degeneracy
                 1    S   ψ1S                                1               1
                 2    S   ψ2S                                1
                      P   ψ2P                                3               4
                 3    S   ψ3S                                1
                      P   ψ3P                                3
                      D   ψ3D                                5               9

    For each n the S-states are spherically symmetric states depending only on r. The P states
are triply degenerate since the angular part of the wavefunction can be chosen to be Y1,−1 (θ, φ),
Y10 (θ, φ) or Y11 (θ, φ), or an arbitrary linear combination of all three. We are free to choose

                                                         cos θ
                           ψ2P = const. × ρe−ρ/2 ×       sin θ cos φ
                                                         sin θ sin φ

                                                         z
                                = const. × e−ρ/2 ×       x
                                                         y

for example. Five states could be labelled 3D, each of the form,

                              ψ3D = const. × ρ2 e−ρ/2 × Y2m (θ, φ)

and these could be written as an exponential of the radial coordinate multiplied by a quadratic
form in (x, y, z). The degeneracies and shape of these lower eigenfunctions play a very important
role in chemistry.

