Docstoc

The Promise of Functional Programming by Konrad Hinsen

Document Sample
The Promise of Functional Programming by Konrad Hinsen Powered By Docstoc
					             S    CIEntIfIC                       ProgrAmmIng
                                                   Editors: Konstantin Läufer, laufer@cs.luc.edu
                                                        Konrad Hinsen, hinsen@cnrs-orleans.fr




The Promises
of funcTional Programming
By Konrad Hinsen


Adopting a functional programming style could make your programs more robust, more compact, and more
easily parallelizable. However, mastering it requires some effort.




S       ince the early days of com-
        puting, software development
        techniques have changed
almost as much as computer tech-
nology itself. Ever more powerful
                                                   (Alonzo Church’s λ-calculus) and as a
                                                   programming technique in the 1950s
                                                   (John McCarthy’s Lisp language)—
                                                   functional programming. Although
                                                   functional programming has been
                                                                                                   probably Mathematica, although most
                                                                                                   Mathematica users don’t write com-
                                                                                                   plex functional programs. Another
                                                                                                   example of symbolic processing is the
                                                                                                   widely used fast Fourier transform li-
hardware made it possible to write                 very popular in computer science                brary, FFTW,1 which uses a function-
ever more complex software, which                  research for several decades, its use           al code optimizer written in OCaml
both required ever better develop-                 for writing real-life programs is rela-         to produce optimized C code for a
ment tools and techniques and made                 tively recent. Of the many reasons              Fourier transform of a given length.
their implementation possible. Pro-                for this, the two most important are            In numerical applications, functional
grammers thus moved from machine                   that functional programming is very             programming doesn’t yet play an im-
code to assembly languages and then                different from traditional program-             portant role. Perhaps the most ambi-
problem-oriented programming lan-                  ming (also referred to as imperative            tious project to introduce it into the
guages, which have evolved to inte-                programming) and thus requires a lot            number-crunching world was Law-
grate techniques such as structural                of learning and unlearning, and that            rence Livermore National Laborato-
and object-oriented programming.                   computer hardware implements the                ry’s Streams and Iteration in a Single
Another evolution went from mono-                  imperative programming model, so                Assignment Language (SISAL) proj-
lithic programs via separately compil-             imperative programs are easier to               ect, which started in 1983. SISAL is a
able modules and libraries to software             compile into efficient machine code             functional language with paralleliza-
component technologies. However, in                than functional programs. However,              tion support, designed specifically for
one respect, today’s popular program-              several clear signs indicate a growing          the needs of scientific applications.
ming techniques are still the same as              interest in functional programming              Unfortunately, it didn’t attract the
those the pioneers used: our programs              techniques—recent        programming            attention it deserved, and funding
consist of statements that modify data             languages (such as Sun’s Fortress or            stopped in 1996. Today, SISAL lives
stored in the computer’s memory until              Microsoft’s F#) have the explicit goal          on as an open source project (http://
that memory contains the desired re-               of supporting it. The reason is that            sorceforge.net/projects/sisal).
sult. This approach closely resembles              functional programming has several                 This article’s purpose is to explain
how a computer works at the hardware               advantages for concurrent and par-              what functional programming is and
level: the processor fetches data from             allel programming. Experience also              how it differs from traditional impera-
memory, performs elementary opera-                 suggests that functional programs are           tive programming. I also explain how
tions on it, and writes the result back            more robust and easier to test than             functional programming helps with
to a memory cell.                                  imperative ones.                                concurrent and parallel program-
   In fact, this approach is so com-                  In scientific computing, research-           ming. The language I use in the ex-
mon that most of you have probably                 ers have mainly used functional pro-            amples is Clojure, a modern dialect
never questioned it. And yet, an al-               gramming for symbolic processing. In            of Lisp (see the sidebar “Clojure and
ternative approach was developed                   fact, the most widely used functional           the Lisp Family”), but everything said
as mathematical theory in the 1930s                programming language in science is              here applies equally to other function-


86       Copublished by the IEEE CS and the AIP    1521-9615/09/$25.00 © 2009 IEEE                       Computing in SCienCe & engineering
                                              Clojure and the lisp Family

                                              t    he language used in the main text’s examples is Clojure, a modern dialect
                                                   of the Lisp language family. Lisp stands for “list processing,” which hints at
                                              the motivations of John mcCarthy, who developed the first Lisp language in
                                              the late 1950s for use in artificial intelligence research. today, the most widely
al languages, only the syntax will be         used Lisp dialects are Scheme and Common Lisp.
different. If you’re interested in func-          the Lisp language family’s distinctive feature is the principle that “code is
tional programming, you should also           data.” Lisp provides a syntax for a simple yet flexible data structure—nested lists.
read (or reread) Jerzy Karczmarczuk’s         It then defines how to interpret nested lists that follow specific conventions as
1999 article in this magazine,2 which         executable code. the advantage of expressing code in a data structure is that
illustrates how functional program-           Lisp code can easily generate (and then execute) other Lisp code, a fact that pro-
ming can yield elegant solutions to           grammers have exploited from the beginning through Lisp macros, the world’s
problems in mathematical modeling.            first meta-programming system and probably still its most powerful one.
                                                  Lisp has the reputation of being slow, but that isn’t true anymore. many
The Basics                                    modern compilers can produce code that can compete with C in performance
The fundamental principle of func-            when given appropriate optimization hints. However, it remains very difficult
tional programming is that you re-            to produce programs that are both efficient and portable between compilers
alize a computation by composing              and platforms.
functions. The word “function” is                 Clojure is a recent Lisp dialect that sets itself apart by three features: it has
used here in the mathematical sense—          four highly optimized data structures (lists, vectors, maps, and sets) designed
a mapping from input values to output         for pure functional programming, it offers extensive support for concurrency,
values—whereas what most program-             and it was designed for the Java Virtual machine with the goal of easy inter-
ming languages call “functions” are           operability with other JVm languages, including Java itself. for more informa-
subroutines that return a value. One          tion about Clojure, see its Web site at http://clojure.org.
important difference is that a func-
tion in the mathematical sense always
produces the same output when given
the same input. An operation such as         write a useful program in a functional       guage like Python and JavaScript). If n
“get the next line from a file” isn’t a      style, but keep reading.                     is zero, the return value of countdown
function because each time you call,            What replaces loops in functional         is (list 0), a list containing the sin-
it produces a different return value.        programs is recursion. A function            gle element 0. Otherwise (we’re look-
Another important difference is that         is called recursive if it calls itself—      ing at lines four to six now), the return
a mathematical function doesn’t “do”         directly or indirectly. Of course,           value is a list constructed by prepend-
anything other than return a value. It       calling itself makes sense only if the       ing n to the return value of count-
isn’t supposed to have side effects—for      arguments change between calls.              down for (- n 1), which is Clojure’s
example, it shouldn’t write anything to      Moreover, if you ever want to get out        way of writing n-1. The function call
a file or change a variable in memory.       of the call-yourself chain, a recursive      (countdown 1) thus returns (cons
   If a program is composed of func-         function must return without calling         1 (countdown 0)), which, after ex-
tions, and functions aren’t supposed         itself for some arguments. Let’s look        ecuting the recursive function call,
to change any variables, then what are       at a simple example of recursion:            becomes (cons 1 (list 0)). Look-
variables good for? Nothing, and that’s                                                   ing up the definitions of the functions
why functional programming doesn’t           (defn countdown [n]                          cons and list will tell you that the
have variables. This is probably the           (if (zero? n)                              final result is the two-element list (1
biggest surprise to those who discover           (list 0)                                 0). This simple example illustrates
functional programming because vari-             (cons n                                  how to use recursion for looping: with
ables are so very fundamental to our                   (countdown                         each recursive call, the argument be-
traditional ways of writing programs.                    (- n 1)))))                      comes smaller, up to the point where
The other missing fundamental con-                                                        it’s handled directly without any fur-
cept is loops. After all, what’s the point   These six lines define a function            ther recursion.
of a loop if nothing can change be-          countdown of a single argument n,               A closer look at the chain of recur-
tween iterations because there are no        which should be an integer (although         sive calls to countdown also reveals
variables? By now, you might be con-         this is neither said nor enforced, Clo-      why functional programming can live
vinced that it’s absolutely impossible to    jure being a dynamicallly typed lan-         without variables. At each recursive


July/AuguSt 2009                                                                                                                  87
                   SCIEntIfIC ProgrAmmIng




call, n decreases by one, which looks a       to the programmer. As is so often the      are called higher-order functions, as op-
bit as if n were a variable decremented       case, both approaches have their good      posed to first-order functions whose
from its initial value to zero; indeed,       and bad sides.                             arguments and return values are all
a compiler could transform the recur-                                                    standard data items. In mathematics,
sive function into a subroutine with a        Functional Abstractions                    you would use the term operator, an
loop over n for efficiency reasons. The       Abstractions are fundamental to writ-      example being the derivative operator
crucial difference is that n isn’t a refer-   ing nontrivial programs. They permit       that maps a function to its derivative,
ence to a piece of memory that could          expressing an algorithm in terms of        which is itself a function.
be modified at will, intentionally or         concepts that are useful in its context,     As a simple example, let’s consider
by mistake. In fact, as anyone with           rather than in terms of operations that    the following operations: calculating
debugging experience has learned the          the computer or the programming            the sum of a list of numbers, calculat-
hard way, the problem with variables          language already happens to provide.       ing the product of a list of numbers,
is often that looking at a variable’s         A programmer would write a least-          and finding the set of all items that oc-
value doesn’t tell you where that value       squares fit problem, for example, in       cur in a list of values. What these (and
came from. In contrast, you can al-           terms of linear-algebra operations—        many more) operations have in com-
ways trace the call chain that leads to       such as matrix multiplication and          mon is a simple algorithmic pattern:
n having a specific value at a specific       solving linear systems of equations—       you start with an initially “empty” ac-
point in the function countdown back          that work on an array data structure.      cumulator value (0, 1, the empty set)
to its beginning. The function call           Compilers and libraries (written by        and then iterate over a list, combining
chain is a complete description of the        you or by someone else) then trans-        at each step the current accumula-
data flow through the program, and            form the algorithm into something          tor value with one list element. The
it’s a very useful feature for verifying a    that a computer can handle.                combination operations for the three
program’s correctness.                           The abstractions provided by popu-      examples given are addition, multi-
   I hope I’ve convinced you that vari-       lar programming languages for sci-         plication, and adding an item to a set.
ables and loops aren’t as essential as        entific computing include basic data       This algorithmic pattern is known as
you might have thought. But what              structures (integer, real and complex      reduction; it’s implemented in Clojure
about other side effects? Is it really        numbers, arrays, and structures)           via the function reduce. We can thus
practical to work with programs that          and a notation for numerical expres-       write our three examples as
can’t write data to a file? Or, in fact,      sions that’s similar to mathemati-
produce any output? Of course, the            cal notation. Programmer-provided          (defn sum [numbers]
answer is no: the pure functional pro-        abstractions are mainly subroutines.         (reduce + 0 numbers))
grams I’ve described to this point will       Object-oriented languages add power-       (defn product [numbers]
just heat up your computer. You need          ful constructs for data abstraction: the     (reduce * 1 numbers))
side effects if you want your program         programmer can add problem-specific        (defn set-of-items [items]
to have any interaction with the real         data types to the general ones provid-       (reduce conj #{} items))
world. But you can—and should—                ed by compiler and runtime systems.
limit side effects to very few places in         In functional programming, al-          The first argument to reduce is the
a program.                                    gorithmic abstractions are the most        combination operation, which is a
   We can categorize functional pro-          prominent. A developer identifies and      function of two arguments. In Clojure,
gramming languages by their attitude          implements patterns that occur re-         we can simply use + and * for addition
to unavoidable side effects. Pure lan-        peatedly in algorithms in the form of      and multiplication because they’re
guages (such as Haskell) allow them           functions, which is made possible by       plain functions—there’s no special
only inside special language con-             the fact that functional programming       notation for mathematical operators.
structs, permitting the compiler to           languages consider functions to be         In the last example, conj is a function
verify the absence of accidental side         data. It’s possible, and even very com-    that adds an item to a collection.
effects everywhere else. Impure lan-          mon, to write functions that take oth-       It’s interesting to see how we could
guages (the majority) leave the respon-       er functions as parameters, returning      implement reduce if it weren’t provid-
sibility for the use of side effects fully    yet another function. Such functions       ed already. Here’s one way to write it:


88                                                                                              Computing in SCienCe & engineering
 other FunCtional languages                                        is possible only in a pure functional setting because side
                                                                   effects would otherwise occur at completely unpredictable

 W      e can group the most popular languages that sup-
        port functional programming into just a few families.
 the oldest one, the Lisp family, is described in the “Clojure
                                                                   times. Lazy evaluation makes it possible to work with infi-
                                                                   nite data structures, avoid unnecessary computations, and
                                                                   define control structures as simple functions. However, lazy
 and the Lisp family” sidebar.                                     evaluation also makes a program’s CPU time and memory
    the second group is the mL family, which first appeared        usage profile more difficult to understand and generally
 in the 1970s. today, its most prominent members are Stan-         leads to slower programs overall because of unavoidable
 dard mL and oCaml, but microsoft’s recently published f#          bookkeeping overhead.
 language might soon catch up with them. the mL languag-              two recent languages, Scala (for the Java Virtual ma-
 es differ from the Lisp family in two respects: they’re stati-    chine) and nemerle (for the .nEt platform), are hybrid
 cally typed, using the Hindley-milner inference algorithm to      languages that add functional programming features to
 permit the compiler to deduce the types of most functions         otherwise quite standard object-oriented languages. Sun’s
 from the way they’re used, and they propose a pattern-            new fortress language, whose main intended application
 matching syntax for defining functions that makes many            domain is high-performance computing, also proposes a
 definitions look similar to common mathematical notation.         mixture of functional and object-oriented features.
    the Haskell language is the result of a collective effort to      moving on to special-purpose languages, Wolfram’s
 define a common functional language for use in program-           computer algebra system mathematica is based on a
 ming language research. It’s similar in many respects to the      proprietary functional programming language. other
 mL family, the most important difference being lazy evalu-        computer algebra systems also propose functional features,
 ation: a data item (such as a list element) is evaluated only     although not always to the point of allowing a full func-
 when its value is actually required in a computation. this        tional programming style.




(defn my-reduce                              make-adder that takes a numerical            eral execution threads operating on
  [op initial items]                         argument x and returns a function of         the same data) and parallelism (the
  (if (empty? items)                         another numerical argument y that            division of a computational task into
    initial                                  adds x and y:                                multiple communicating processes
    (my-reduce                                                                            running in parallel) are two aspects
      op                                     (defn make-adder [x]                         of computing rapidly gaining in im-
      (op initial                              (fn [y] (+ x y)))                          portance. The main reason is that
          (first items))                                                                  single-processor performance is no
      (rest items))))                        Calling (make-adder 2) returns a             longer improving at the pace it used
                                             function that adds 2 to its argument.        to; instead, computers are becom-
This is again a recursive function. If       We can use this function just like any       ing more powerful by integrating
its input list items is empty, it just re-   other one, such as                           more computational cores. Exploiting
turns the initial value. This is the re-                                                  such machines requires concurrency,
cursion’s exit; otherwise, it applies the    ((make-adder 2) 3)                           parallelism, or both. Unfortunately,
function op to the initial value and the                                                  today’s mainstream techniques for
first element of the list ((op ini-          which yields 5. Note that the result         concurrent and parallel programming
tial (first items))) and feeds               of (make-adder 2) is a function              are difficult to learn and quite error-
the result to a recursive call on what’s     that stores an integer value (2) inter-      prone in practice.
left of the list after removing the first    nally that was passed as an argument            The big issue with concurrency is
element ((rest items)). Note that            to make-adder. A function that cap-          the difficulty of maintaining the data
Clojure doesn’t treat the argument op        tures a value in this way is called a clo-   in a coherent state: it must be impos-
in any special way merely because it’s       sure; it’s a widely used programming         sible for one execution thread to mod-
a function. Functions are perfectly          technique in functional programs. As         ify data that another one is accessing
ordinary data items, just like integers      you might have guessed, this is where        (reading or writing) at the same time.
and text strings.                            the Clojure language derived its name,       So, if several threads need to modify a
   Functions can also create and re-         with the “j” hinting at Java.                data item, they must do so in a coordi-
turn other functions, as illustrated in                                                   nated way. This is currently achieved
the following (somewhat contrived)           Concurrency and Parallelism                  via locks, but they’re difficult to use,
example, which defines a function            Concurrency (the existence of sev-           and their incorrect use can go un-


July/AuguSt 2009                                                                                                               89
                    SCIEntIfIC ProgrAmmIng




noticed for a long time before errors             Functional programming is fre-            while guaranteeing that the program’s
show up. As for parallelism, the main          quently cited as a promising tech-           result won’t change.
difficulties are identifying independent       nique in this context. Pure functional
computations inside a program and              code has no variables and thus no data
coordinating them with the required
communication operations such that
the resulting program always produces
                                               coherence issues or need for locking.
                                               Moreover, all data dependencies are
                                               explicit, making it possible to apply a
                                                                                            I  f this makes you hope that auto-
                                                                                               matic parallelization will be your
                                                                                            welcome gift once you succeed in
the correct result and does so efficient-      large number of program transforma-          entering the world of functional
ly for typical input parameters.               tions (with the goal of parallelization)     programming, you’re in for disap-
                                                                                            pointment. Although it’s true that
                                                                                            compilers for functional languages
                                                                                            could in principle transform a se-
                                                                                            rial into an equivalent parallel pro-
                                                                                            gram automatically, there remains
                                                                                            the problem of finding such a trans-
 The American Institute of Physics is a not-for-profit membership corporation chartered
 in New York State in 1931 for the purpose of promoting the advancement and diffusion
                                                                                            formation that actually yields an ef-
 of the knowledge of physics and its application to human welfare. Leading societies in     ficient program for a given parallel
 the fields of physics, astronomy, and related sciences are its members.                    computer and given input data. Com-
 In order to achieve its purpose, AIP serves physics and related fields of science and
                                                                                            piler technology isn’t yet up to this
 technology by serving its member societies, individual scientists, educators, students,    task, although this could well change
 R&D leaders, and the general public with programs, services, and publications—             in the future, in particular with par-
 information that matters. The Institute publishes its own scientific journals as well as   allelizing just-in-time compilers that
 those of its member societies; provides abstracting and indexing services; provides        have access to a program’s execution
 online database services; disseminates reliable information on physics to the public;
 collects and analyzes statistics on the profession and on physics education; encourages
                                                                                            time profile. For the near future, it’s
 and assists in the documentation and study of the history and philosophy of physics;       reasonable to expect compilers that
 cooperates with other organizations on educational projects at all levels; and collects    create parallel programs semiauto-
 and analyzes information on federal programs and budgets.                                  matically based on programmer-pro-
 The scientists represented by the Institute through its member societies number more       vided performance hints. This would
 than 134 000. In addition, approximately 6000 students in more than 700 colleges and       already be a significant step forward
 universities are members of the Institute’s Society of Physics Students, which includes    compared to today’s parallel pro-
 the honor society Sigma Pi Sigma. Industry is represented through the membership of 37     gramming techniques.
 Corporate Associates.

 Governing Board: Louis J. Lanzerotti (chair)*, Lila M. Adair, David E. Aspnes, Anthony     References
 Atchley*, Arthur Bienenstock, Charles W. Carter Jr*, Timothy A. Cohn*, Bruce H. Curran*,    1. m. frigo and S.g. Johnson, “the Design and
 Morton M. Denn*, Alexander Dickison, Michael D. Duncan, H. Frederick Dylla (ex                 Implementation of fftW3,” Proc. IEEE, vol.
 officio)*, Janet Fender, Judith Flippen-Anderson, Judy R. Franz*, Brian J. Fraser, Jaime       93, no. 2, 2005, pp. 216–231.
 Fucugauchi, John A. Graham, Timothy Grove, Mark Hamilton, Warren W. Hein*, William         2. J. Karczmarczuk, “Scientific Computation and
 Hendee, James Hollenhorst, Judy C. Holoviak, Leo Kadanoff, Angela R. Keyser, Timothy           functional Programming,” Computing in Sci-
 L. Killeen, Harvey Leff, Rudolf Ludeke*, Kevin B. Marvel*, Patricia Mooney, Cherry             ence & Eng., vol. 1, no. 3, 1999, pp. 64–72.
 Murray, Elizabeth A. Rogan*, Bahaa E. A. Saleh, Charles E. Schmid, Joseph Serene,
 Benjamin B. Snavely (ex officio)*, A. F. Spilhaus Jr, Gene Sprouse, Hervey (Peter)
 Stockman, Quinton L. Williams.
 *Members of the Executive Committee.                                                       Konrad Hinsen is a researcher at the Centre
                                                                                            de Biophysique moléculaire in orléans and at
 Management Committee: H. Frederick Dylla, Executive Director and CEO; Richard
                                                                                            the Synchrotron Soleil in Saint Aubin, france.
 Baccante, Treasurer and CFO; Theresa C. Braun, Vice President, Human Resources;
 Catherine O’Riordan, Vice President, Physics Resources; John S. Haynes, Vice               His research interests include protein struc-
 President, Publishing; Benjamin B. Snavely, Secretary.                                     ture and dynamics and scientific computing.
                                                                                            Hinsen has a PhD in theoretical physics from

                                w w w. aip.org                                              rWtH Aachen University, germany. Contact
                                                                                            him at hinsen@cnrs-orleans.fr.



90                                                                                                  Computing in SCienCe & engineering
                   This article was featured in




    For access to more content from the IEEE Computer Society,
                 see computingnow.computer.org.




Top articles, podcasts, and more.




                               computingnow.computer.org

				
DOCUMENT INFO
Categories:
Stats:
views:11
posted:7/11/2012
language:English
pages:6
Description: ogrammimg style could make your programs more robust,more compact,and more easily parallelizable.However,mastering it requires some effort.