Chapter 5 Programming Languages

					Chapter 5: Programming Languages
• Just as Operating Systems, Computer Hardware,
  and Algorithm Discovery processes have evolved,
  so have programming languages
• Here we study the evolution and classification of
  programming languages, and the constructs that
  we find in all (or most) programming languages
• In classifying the languages, we see different
  programming paradigms: imperative, functional,
  object-oriented and declarative
           History and Evolution
• Programs are represented in Machine Language
  – instructions are op codes and represented as binary
    (or hexadecimal) codes
  – data are represented either as register numbers or
    main memory locations
• First generation computers were most commonly
  programmed in this fashion
  – Very difficult -- keeping track of storage locations,
    knowing proper op codes
     • Programming required a lot of patience and led to many
       erroneous programs that either did not run at all or did not
       work properly
             Assembly Language
• During the mid 1950’s (still in the 1st generation),
  programmers realized that they could
     • substitute mnemonics (abbreviations) for op codes
     • and use labels for memory locations
  – Which made it easier to write a program
     • Mnemonics and labels make up assembly language
  – A program must now be translated from assembly to
    machine language via a process called assembling,
    using a program called an assembler
  – While assembly was developed during the first
    generation, it is sometimes erroneously referred to as a
    second-generation programming language
    Machine vs. Assembly Language
•     1556                 •     LD      R5, PRICE
      166D                       LD      R6, TAX
      5056                       ADDI R0, R5, R6
      306E                       ST      R0, TOTAL
      C000                       HLT
• This is the machine language “add” program from chapter 2
• Notice the use of LD for “load”, ADDI for “integer addition”,
  ST for “store” and HLT for “halt”
         Machine Translation
• Assembly language offered programmers an
  easier way to specify programs
  – But it also showed that machine translation,
    from one form to another, was possible
• Why not take this idea further?
  – Converting from Assembly to Machine
    language is a one-to-one mapping
     • one Assembly Language mnemonic becomes one
       Machine Language op code
     • could we specify a more elaborate instruction that is
       translated into several Machine Language
       instructions?
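
The one-to-one mapping can be sketched as a toy assembler in Python. This is only an illustration: the op code digits (1, 3, 5, C) and the addresses given to PRICE, TAX and TOTAL are assumptions read off the hexadecimal listing of the “add” program, not a full description of the machine.

```python
# Toy assembler: each mnemonic maps to exactly one op code.
# Op codes and symbol addresses are assumptions taken from the "add" listing.
OPCODES = {"LD": 0x1, "ST": 0x3, "ADDI": 0x5, "HLT": 0xC}
SYMBOLS = {"PRICE": 0x56, "TAX": 0x6D, "TOTAL": 0x6E}  # hypothetical label addresses

def reg(token):
    """Turn a register name such as 'R5' into its number."""
    return int(token.lstrip("R"), 16)

def assemble_line(line):
    """Translate one assembly instruction into a 4-digit hexadecimal word."""
    parts = line.replace(",", " ").split()
    mnemonic, operands = parts[0], parts[1:]
    op = OPCODES[mnemonic]
    if mnemonic in ("LD", "ST"):
        # op code, register, then the 8-bit address the label stands for
        word = (op << 12) | (reg(operands[0]) << 8) | SYMBOLS[operands[1]]
    elif mnemonic == "ADDI":
        # op code followed by three register numbers
        r, s, t = (reg(x) for x in operands)
        word = (op << 12) | (r << 8) | (s << 4) | t
    else:
        # HLT takes no operands
        word = op << 12
    return f"{word:04X}"

PROGRAM = ["LD R5, PRICE", "LD R6, TAX", "ADDI R0, R5, R6", "ST R0, TOTAL", "HLT"]
machine_code = [assemble_line(line) for line in PROGRAM]
```

Each assembly line yields exactly one machine word, which is why assembling is a simpler translation than compiling a high level language.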
       High Level Languages
– By the late 1950’s, this more complex translation
  idea was being explored
   • FORTRAN -- FORmula TRANslator was developed
     by IBM as the first high level language, to be used for
     scientific and mathematical applications
      – FORTRAN programs are expressed in a mathematical
        notation using English words and variables
      – A FORTRAN program is then compiled into assembly
        language by a compiler followed by assembling to machine
        language
   • COBOL -- COmmon Business Oriented Language
     was developed by a committee sponsored by the US
     Department of Defense for business applications
     Machine Independence
– Another advantage to high level programming
  languages is that they are machine independent
   • All you need is a compiler for the language on the
     given machine and a program written on another
     computer can be compiled on the given computer and
     run there
      – To help promote machine independence, organizations such
        as ANSI were created to form standardized languages
        across machine platforms (such as ANSI C)
      – Unfortunately, while this idea promoted High Level
        Languages, there are always certain machine dependent
        features such as I/O, memory management and interrupts,
        making it so that most high level programming languages
        are not truly machine independent
     More High Level Languages
– While the first high level languages were developed
  in the first generation, these languages have been
  dubbed third-generation languages
   • In the 1960’s and through the 1980’s, there have been a
     great number of languages, each provided for its own
     purpose
   • We study the evolution of these languages in 3336
   • A brief look follows
      – Algol -- European fusion of FORTRAN and COBOL
      – PL/I -- IBM fusion of FORTRAN, COBOL and Algol
      – LISP -- for AI research
      – C, C++ -- portable and system programming
      – Pascal and BASIC -- educational languages
      – Ada -- developed for the DoD
      – Simula -- simulation language
      – Smalltalk -- early Object-oriented language, descendent of Simula
      – Prolog -- declarative AI language
          Programming Paradigms
• Just as there has been a great proliferation of
  programming languages, there have been shifts
  in programming style and development
  – These shifts are characterized as different
    programming paradigms
     •   Imperative (or procedural) programming
     •   Functional programming
     •   Object-oriented programming
     •   Declarative programming
          – With each new paradigm come new languages that fit that
            paradigm
      Imperative Programming
• Basic idea is that code is used to manipulate
  data stored in memory (or elsewhere)
  – Follows the fetch-decode-execute cycle
     • Modular programming was developed as a proper
       way to perform imperative programming, and so it
       is often called procedural programming
  – Most languages fit this paradigm, at least to
    some extent:
     • machine, assembly, FORTRAN, COBOL, Algol,
       PL/I, C, Pascal, C++, Ada, BASIC
        Functional Programming
• Similar to Imperative programming, but based
  entirely around the mathematical notion of a
  function
  – Every instruction is a function that returns a value to
    be used by another function
     • Promotes modularity even more than imperative
        – a function is thought of as a black box and used by other
          functions without having to know how the function works
  – LISP is the predominant functional language, but
    others include ML and Miranda
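
A small sketch of the functional style, written in Python rather than LISP for readability. The function names are made up for illustration. Each function only returns a value for another function to use, and no variable is reassigned.

```python
from functools import reduce

def square(x):
    """A black box: callers only need to know what it returns."""
    return x * x

def compose(f, g):
    """Build a new function that applies g first and then f."""
    return lambda x: f(g(x))

def sum_of_squares(values):
    # map feeds each value through square; reduce folds the results together
    return reduce(lambda total, v: total + v, map(square, values), 0)

increment_then_square = compose(square, lambda x: x + 1)
```

Here `sum_of_squares([1, 2, 3])` gives 14 and `increment_then_square(2)` gives 9.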
   Object-Oriented Programming
– From AI research and simulation
   • it is important to model how things work
   • need a way to represent objects in the world
– an object will contain data and processes to
  manipulate that data
   • In C++, these are data members and member functions
   • In other languages, they might be referred to as attributes
     (or slots) and methods
– OOP languages slowly developed and are now
  contained in their own paradigm
   • Languages include C++, Java, Ada 95, Smalltalk, Common
     Lisp, Visual Basic although only Smalltalk and Java are
     true OOPLs
     Declarative Programming
– Rather than having a programmer write a program
  that manipulates data
– Have a programmer declare the things that must take
  place and have the language do the rest
   • Sound like magic? It isn’t
– Prolog is the most common declarative language and
  it works because it has two built-in processes,
  resolution and unification
   • The programmer declares statements of fact and Prolog
     uses the two processes to prove other facts
   • Less a programming language than a tool
Programming Language Concepts
• We will concentrate on imperative language
  concepts here
  – Features: declarative statements, imperative
    statements, comments
     • Declarative statements: variable and type declarations,
       modules (procedures and functions)
     • Imperative statements: executable statements that include
       assignment statements, I/O statements, and control
       statements (loops, selection, procedure/function calls,
       etc.)
     • Comments: non-executed statements
  Data Types and Structures
– Data types are built-in types and include
   •   real/float (different precisions)
   •   integer
   •   character and strings (in some languages)
   •   boolean
        – See figure 5.4 p. 239 for examples in C, C++, Java, Pascal
          and FORTRAN
– Data Structures are often built on top of the
  built-in types -- here, the programmer dictates
  the “shape” of the structure
   • arrays (homogeneous structures) and records
     (heterogeneous structures)
        – See figures 5.5 and 5.6 on pages 240-241
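
The difference between an array and a record can be sketched in Python. The `Employee` record and its fields are hypothetical. Note that a Python list does not enforce a single element type the way an array in C or Pascal does, so the homogeneity here is by convention.

```python
from dataclasses import dataclass

# Array: every element has the same type (homogeneous)
scores = [88, 92, 75]

# Record: named fields of different types (heterogeneous)
@dataclass
class Employee:  # hypothetical record type
    name: str
    age: int
    salary: float

worker = Employee(name="Ada", age=36, salary=52000.0)

# Array elements are reached by position, record fields by name
first_score = scores[0]
worker_age = worker.age
average = sum(scores) / len(scores)
```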
          Assignment Statements
• One of the most basic imperative statements is to
  assign a variable a value
  – This is also used in object-oriented programming to
    assign a data member a value
     • Typical form: variable <assignment symbol> expression
        – In Ada and Pascal, assignment symbol is “:=”, in FORTRAN, C,
          C++ and Java, the symbol is “=”, APL used “<--” which is more
        – Expressions may be arithmetic (numeric), relational or boolean, or
          may involve characters and strings (such as string concatenation)
            » Overloading allows symbols to be used by different types of
              instructions such as + for integer addition, real addition and
              string concatenation
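
Overloading of + can be seen directly in Python, where assignment uses “=” as in C. The `Money` class is a hypothetical user-defined type added to show that a programmer can extend the same symbol to new types.

```python
int_sum = 2 + 3            # integer addition
real_sum = 2.5 + 0.25      # real (floating point) addition
text = "high" + "-level"   # string concatenation

class Money:  # hypothetical user-defined type
    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # Called when + is applied to two Money values
        return Money(self.cents + other.cents)

total = Money(150) + Money(275)
```

The same symbol selects a different operation in each assignment, chosen by the types of its operands.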
            Control Statements
– Because of the fetch-decode-execute cycle, the PC is
  always incremented so that the next instruction
  fetched will be the next in sequence
   • Thus, the built-in control for a program is sequential
     execution of instructions
– Control statements allow the programmer to change
  the PC to some other, predetermined, location to alter
  the sequence
   • Types of control statements:
      – GO TO, procedure/function calls and recursion, iteration (loops),
        selection
      – Here, we will concentrate on the last two
      Selection & Iteration Statements
• Selection: Given the result of a condition, decide where to go
   • One way selection: If-Then statement
   • Two way selection: If-Then-Else statement
   • Multi-way selection: Case/Switch statement
       – See figure 5.7 p. 244
   – If-Then/If-Then-Else statements use boolean conditions
     (evaluate to TRUE or FALSE)
   – Multi-way might use boolean conditions, or ordinal
     expressions depending on language
• Iteration: Allows a body of code to be repeated
   – Counter-controlled loops (for loops)
      • Repeat a set number of times
      • In Ada, Pascal, sequence is set before the loop begins and
        cannot be altered, strictly increasing or decreasing by 1
      • In C, C++, Java, Algol, sequence can be altered during
        execution
          – see figure 5.8, p. 245
   – Logically-controlled loops
      • based on a condition which evaluates to TRUE or FALSE
          – While-Do are pre-test loops
          – Do-While are post-test loops
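
The selection and loop forms above can be sketched in Python. The function names are made up for illustration. Python has no Do-While statement, so the post-test loop is emulated with `while True` and a `break`.

```python
def classify(n):
    # Multi-way selection written as an if / elif / else chain
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"

def count_up(limit):
    # Counter-controlled loop: the range is fixed before the loop starts
    visited = []
    for i in range(1, limit + 1):
        visited.append(i)
    return visited

def pre_test(n):
    # While-Do: the condition is checked first, so the body may run zero times
    runs = 0
    while n > 0:
        n -= 1
        runs += 1
    return runs

def post_test(n):
    # Do-While (emulated): the body runs once before the condition is checked
    runs = 0
    while True:
        n -= 1
        runs += 1
        if not n > 0:
            break
    return runs
```

With a starting value of 0, `pre_test` runs its body zero times while `post_test` runs it once, which is the practical difference between the two loop forms.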
• Comments are discarded during compilation so
  they have no effect on the program
  – Comments are denoted through some special
    symbols
        –   Pascal: { … } or (* … *)
        –   C, C++, Java: /* … */ or // …
        –   BASIC: rem …
        –   FORTRAN: C (in the 1st column only)
  – Why comment?
     • We will see in Chapter 6 that commenting/documenting
       code is extremely important
        – Good comments lead to easier modification and maintenance of
• The idea behind a procedure is that it is a piece
  of code written independently of other program
  units to achieve modularity
  – There is some dependence in the intended use of the
    procedure -- why use it and how to use it
  – How to use it means what parameters to pass
  – A procedure call is a control statement that causes
    the current procedure to suspend, transferring control
    to the procedure until the procedure terminates and
    control returns to the location in the calling
    procedure where it left off
     • See figure 5.9, p. 248
           Procedure Terminology
– Calling Procedure: program unit that invokes the procedure
– Procedure Header: definition (using some reserved word) that
  starts the procedure
   • includes procedure name and parameters (optional)
– Local Variables: variables declared in the procedure and
  only accessible in the procedure
– Global Variables: variables used in the procedure but not
  defined there or passed as parameters
– Parameters: variables explicitly passed from calling
  procedure to called procedure
   • Formal Parameters: parameters in the procedure header
       – These are used as variables in the procedure
   • Actual Parameters: parameters in the procedure call list
– Parameter Passing Method: programming languages use
  different mechanisms for passing parameters
   • Pass by value (fig 5.10 p. 251)
   • Pass by reference (fig 5.11)
• Functions are another unit for modularity
  – Functions differ from procedures because they return
    a single value
     • No return parameters
        – parameters should only be passed by value, not reference
        – if a parameter is changed, it is known as a side effect and this is
          permissible in many languages such as Pascal and C, C++
     • Often used for mathematical computations
        – allows for nesting
     • One drawback of functions in most languages is that they
       are unable to return structures such as an array or record
        – in such cases, use procedures
        – All C, C++ modules are functions, but you can create procedure-
          like modules by specifying void as the return type for the function
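
A sketch of a side effect, in Python. Python always passes references to objects, so pass by value is only emulated here by copying the argument. The function names and the `cart` data are made up for illustration.

```python
def total_pure(prices):
    # Reads its parameter only; the caller's list is unchanged
    return sum(prices)

def total_by_value(prices):
    # Emulates pass by value: work on a private copy of the argument
    local = list(prices)
    local.append(0.0)
    return sum(local)

def total_with_side_effect(prices):
    # Changes the caller's list while computing the result: a side effect
    prices.append(0.0)
    return sum(prices)

cart = [1.5, 2.5]
first = total_pure(cart)
length_after_pure = len(cart)
second = total_by_value(cart)
length_after_copy = len(cart)
third = total_with_side_effect(cart)
length_after_side_effect = len(cart)
```

All three calls return the same total, but only the last one leaves the caller's list longer than it was, which is why side effects in functions are discouraged.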
     Input/Output Statements
• Another sequential operation
  – like assignment statements
  – most languages offer standard forms of I/O
     • input from keyboard
     • output to monitor
     • to redirect I/O, you have to alter the statements
  – Pascal: readln, read, writeln, write
  – C, C++: scanf, printf; C++ also has cin and cout,
    which are objects instead of built-in statements;
    Java uses the System.in and System.out objects
            Translation Process
– As said earlier, programs written in a high-level
  language must be translated into machine language
  before execution
   • Source program: program in the high-level language
   • Object program: program compiled into machine language
   • Translation process goes through 3 steps:
      – Lexical analysis: breaks program into distinct elements (tokens)
      – Parsing: identifies each instruction and each instruction component
      – Code generation: generates actual assembly (or machine) code
        from the parsed components
      – See figure 5.12 p. 255
   • We will now take a closer look at the parsing component
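
Before turning to parsing, the lexical analysis step can be sketched in Python. This is a minimal illustration for a tiny Pascal-like statement. The token names (NUMBER, IDENT, ASSIGN, OP) are assumptions, not the categories of any particular compiler.

```python
import re

# One named group per token category; whitespace is matched and then dropped
TOKEN_PATTERN = re.compile(r"""
    (?P<NUMBER>\d+(?:\.\d+)?)
  | (?P<IDENT>[A-Za-z_]\w*)
  | (?P<ASSIGN>:=)
  | (?P<OP>[+\-*/()<>=;])
  | (?P<SKIP>\s+)
""", re.VERBOSE)

def tokenize(source):
    """Break a source string into (category, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_PATTERN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

tokens = tokenize("total := price + 2*tax;")
```

The parser then works on this token list instead of on raw characters, which is also why spacing can be ignored in a free-format language.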
– Parsing is a syntactic exercise -- identifying the
  grammatical components of the statement
   • We do this in natural language, breaking a sentence into
     noun phrase, verb phrase, verb, auxiliary verbs,
     prepositional phrases, etc
– Complicating the process is that most programming
  languages are free-format instead of fixed-format
   • spacing and indentation are not critical and so are ignored
   • syntax of the language must use other forms of
      – keywords/reserved words
      – delineators for instructions (such as the “;” and “{ … }” symbols)
            The Parsing Process
• Parsing is performed by a parser
     • a pictorial representation of a parser is given by a syntax
       diagram, as shown in figures 5.13, p. 257 and 5.14, p. 258
• A parser is a program that recognizes statements
  in the language using the language’s grammar
  – The output of the parser is a structure that represents
    the statements through non-terminal symbols
    (categories) and terminal symbols (the actual
    statement components)
     • This can be represented pictorially through a parse tree as
       shown in figures 5.15, p. 259 and 5.16, p. 260
– A language is ambiguous if a statement could generate
  two or more parses
      – parse trees or interpretations
   • “The boy saw the man in the park with a telescope”
      – was the man in “the park with a telescope” or
      – did the boy see the “man in the park” by using a telescope?
   • A programming language must not be ambiguous or else the
     parser may not interpret an instruction correctly
      – If (X < Y) Then If (Y < Z) Then Writeln(X) Else Writeln(Y);
      – Which condition does the Else clause go with?
      – Programming Languages often have special mechanisms for such a case
        (endif statements)
           » In Pascal, the rule says that an Else clause is attached to the nearest
             condition (so in this case, the Else goes with Y<Z instead of X<Y)
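
The two readings of the nested If can be written out in Python, where indentation forces the programmer to pick one. The function names are made up for illustration, and each function returns what would have been written.

```python
def else_with_inner(x, y, z):
    # Else attached to the nearest condition (Y < Z), as the Pascal rule says
    output = []
    if x < y:
        if y < z:
            output.append(x)
        else:
            output.append(y)
    return output

def else_with_outer(x, y, z):
    # The other reading: Else attached to the first condition (X < Y)
    output = []
    if x < y:
        if y < z:
            output.append(x)
    else:
        output.append(y)
    return output
```

For X = 5, Y = 3, Z = 4 the first reading writes nothing while the second writes 3, so the two parses really are different programs.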
           Examples of Ambiguity
• Ambiguity in Expressions
  – X+Y*Z
     • Taken strictly Left-to-Right, we would calculate X+Y
       first, giving us the equivalent of (X+Y)*Z
  – We will explore in 3336 how a parser can properly
    evaluate such expressions
     • In some languages, they may resort to forcing the
       programmer to use parentheses to denote the order
       (precedence) of evaluation, other languages have
       better parsers
• Ambiguity in Type
  – The parser must derive the correct type of an expression
  – X=Y+Z
     • Arises if Y and Z are different types
  – This requires type coercion
     • Coercion may result in error
        – Recall the truncation error using floating point
          arithmetic from chapter 1
     • A language is strongly typed if all processes involving
       data can be performed without coercion
        – Java is one of the few strongly typed languages
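
The X+Y*Z problem can be sketched in Python with two small evaluators over a token list. This is a minimal illustration that handles only + and * on numbers. The two-level grammar in the comment is an assumption chosen for the example, not the grammar of any particular language.

```python
def evaluate_left_to_right(tokens):
    # Ignores precedence: applies each operator as soon as it is read
    result = tokens[0]
    for i in range(1, len(tokens), 2):
        op, operand = tokens[i], tokens[i + 1]
        result = result + operand if op == "+" else result * operand
    return result

def evaluate_with_precedence(tokens):
    # Grammar: expression -> term { "+" term } ; term -> number { "*" number }
    pos = 0

    def term():
        nonlocal pos
        value = tokens[pos]
        pos += 1
        # Multiplications are grouped inside a term, so they bind tighter than +
        while pos < len(tokens) and tokens[pos] == "*":
            value *= tokens[pos + 1]
            pos += 2
        return value

    total = term()
    while pos < len(tokens) and tokens[pos] == "+":
        pos += 1
        total += term()
    return total

expression = [2, "+", 3, "*", 4]
strict = evaluate_left_to_right(expression)
proper = evaluate_with_precedence(expression)
```

For 2+3*4 the strict left-to-right reading gives 20, the equivalent of (2+3)*4, while the grammar with a separate term level gives 14.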
       Compiling Vs. Optimizing
– A large concern in the 1950’s was that compiling would
  produce poor code
   • A compiled program would not be as efficient as a program
     written by a person directly in machine language
   • However, compilers are usually very good at producing
     efficient code
   • Today, as architectures become increasingly complex, it is
     important that compiled code take advantage of the
     architectural features
– Aside from code generation, some compilers perform
  code optimization
   • Using registers carefully, rearranging instructions, etc
      – Compiler Optimization is a topic for 4335 & 6335
                  Linking and Loading
• A compiled program is not ready for execution!
   – Most software today uses library modules written by
     others
      • A procedure call in your program (such as cout) could
        raise a run-time error
   – The linker is software that links several object
     programs together
      • your compiled program, compiled library modules and
        Operating System routines
      • this creates the load module, a consolidated program
        ready for execution
   – The loader is software (usually part of the OS) that loads
     the load module into memory for execution
      • This seems like a trivial task, but it has many
        repercussions in a multi-user and multi-tasking
        environment
          – Must ensure that the loaded module does not enter
            another process’s memory space
          – This information is stored in registers or in a
            memory map called a page table
          – Processes may be moved around in memory (called
            relocation)
      • See figure 5.18 p. 263 for linking/loading process
Object-Oriented Programming
– In Imperative Programming, the idea was to
  write procedures that manipulate data
– In OOP, the idea is to identify the data
  structures as objects
   • the data that makes up the objects (data members or
     attributes)
   • the relationship between objects (class/subclass)
   • the routines that are carried out on the objects
     (methods or member functions)
– Many languages contain OOP (C++, Common
  Lisp, Ada 95) although there are few pure OOP
  languages (Smalltalk, Java)
                          OO Features
• Encapsulation
   – In 2380 and 3333, you learn of abstract data types and
     the notion of information hiding
   – When you use a data structure, you should be able to use
     it without having to know how it is implemented
   – OOP provides the characteristic of encapsulation
       • Details are hidden from view so that other objects do
         not know how a given object is implemented
       • The definitions of the object (the data structure and
         the processes) are “wrapped” up in the object itself
• Message Passing
   – The main idea behind OOP is that objects communicate to
     each other through messages
   – This differs from the imperative approach where
     procedures call each other and pass parameters
       • Example: Resizing a window -- send the “resize”
         message from the “mouse object” to the “window object”
   – a message contains the specific function to be executed
     (the method) along with (optional) parameters
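
The resize example can be sketched in Python. The `Window` and `Mouse` classes and their methods are hypothetical. Python marks hidden details only by the leading-underscore convention, unlike the private sections of C++ or Java.

```python
class Window:  # hypothetical class for the resize example
    def __init__(self, width, height):
        # Leading underscores mark these as internal details by convention
        self._width = width
        self._height = height

    def resize(self, width, height):
        # The method run when the window receives the "resize" message
        self._width = max(1, width)
        self._height = max(1, height)

    def area(self):
        return self._width * self._height

class Mouse:  # hypothetical sender of the message
    def drag_corner(self, window, width, height):
        # Sends the "resize" message; Mouse never touches Window's fields
        window.resize(width, height)

window = Window(100, 50)
Mouse().drag_corner(window, 200, 0)
resized_area = window.area()
```

Because `Mouse` only sends the message, `Window` is free to decide how a resize is carried out, here by refusing a height below 1.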
                More OO Features
• Polymorphism
  – One object might wish to pass a message to another and
    include a parameter
  – But what if the first object does not know what type of
    parameter is expected?
  – Polymorphism allows that the message be interpretable no
    matter what type of parameter is passed
     • In a way, this is like operator overloading, here we are
       “message overloading”
• Inheritance
  – One of the advantages of OOP is that a subclass is able to
    inherit from its parent class any or all of its parent’s
    data members and member functions
  – This leads to code reusability and an easier way of
    defining objects
     • In C++, Java -- what is inherited can be controlled
     • In C++ and Common Lisp, multiple inheritance is
       allowable (a child may have more than 1 parent)
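
A sketch of inheritance in Python, with hypothetical shape classes. It also shows the closely related subtype form of polymorphism, where the same message is sent to objects of different classes and each responds with its own method. This is not the parameter-type form described above.

```python
class Shape:
    def __init__(self, name):
        self.name = name

    def area(self):
        raise NotImplementedError

    def describe(self):
        # Inherited unchanged by every subclass
        return f"{self.name} with area {self.area()}"

class Rectangle(Shape):
    def __init__(self, width, height):
        super().__init__("rectangle")
        self.width, self.height = width, height

    def area(self):
        return self.width * self.height

class Square(Rectangle):
    def __init__(self, side):
        # Reuses Rectangle's data members and its area method
        super().__init__(side, side)
        self.name = "square"

# The same "describe" message works on either kind of object
descriptions = [shape.describe() for shape in (Rectangle(2, 3), Square(4))]
```

`Square` defines no `area` or `describe` of its own, yet both work, which is the code reuse that inheritance provides.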
                    More on OOP
• C++ and Java are OOP languages and you will
  learn about them in 2320, 2330, 3333 and 3336
  – Here, we have just seen a brief overview
     • See the code on pages 266-267
        – we see class definitions and creating subclasses
     • See the code on page 268
        – Private: items that are hidden from view to promote information
          hiding
        – Public: items that are visible so that the programmer knows what
          the object can do
• Some languages provide concurrency
  – the ability to execute multiple procedures at the same
    time
     • parallel processing: procedures executed simultaneously
       on independent processors
     • concurrent processing: procedures executed in an
       overlapped fashion on the same processor
  – concurrent processes are referred to as threads
  – often useful in simulation, games and graphics
     • Languages that offer concurrency include Ada, Java,
       Concurrent Pascal, and Concurrent Lisp
     • We will skip this section; you will see concurrency in
       some detail in 3336
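
A minimal sketch of threads using Python's standard `threading` module. The `worker` procedure and the thread names are made up for illustration. Whether the threads truly run in parallel or only overlap on one processor depends on the interpreter and the machine.

```python
import threading

def worker(name, count, results, lock):
    # Each thread runs this procedure independently of the others
    subtotal = sum(range(count))
    with lock:  # protect the shared dictionary from simultaneous updates
        results[name] = subtotal

results = {}
lock = threading.Lock()
threads = [
    threading.Thread(target=worker, args=(f"t{i}", 1000 * (i + 1), results, lock))
    for i in range(3)
]
for t in threads:
    t.start()   # begin executing the procedure concurrently
for t in threads:
    t.join()    # wait until every thread has finished
```

The order in which the threads finish is not fixed, but after the `join` calls all three results are present.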
      Declarative Programming
• Based on logical deduction
  – modus ponens and modus tollens
     • All men are mortal, Socrates is a man, therefore Socrates
       is mortal
     • In Prolog:
        – mortal(X) :- man(X). man(socrates).
        – The query ?- mortal(X). would return X = socrates
  – Automatic problem solving just by providing
    declarations: facts and rules
     • Based on two principles, resolution and unification
  – Prolog (PROgramming in LOGic) is one of the few
    declarative languages
     • We study Logic in 4350 and Prolog in 3336 and 6351
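
The idea of declaring facts and rules and letting the system derive the rest can be sketched in Python. This is a simplified stand-in and not how Prolog works internally. It uses forward chaining over one-argument predicates instead of resolution and unification, and the data layout is an assumption made for the example.

```python
# Declarations only: one fact and one rule, mirroring the Socrates example
facts = {("man", "socrates")}
rules = [("mortal", "man")]   # reads as: mortal(X) :- man(X).

def derive(facts, rules):
    """Apply the rules repeatedly until no new fact can be added."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            for predicate, subject in list(known):
                if predicate == body and (head, subject) not in known:
                    known.add((head, subject))
                    changed = True
    return known

def query(predicate, known):
    """Return every subject for which the predicate holds."""
    return sorted(subject for p, subject in known if p == predicate)

mortals = query("mortal", derive(facts, rules))
```

The programmer never writes the step that concludes Socrates is mortal; the `derive` loop finds it from the declarations.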
