Type Checking in Cool


         Alex Aiken
 (Modified by Mooly Sagiv)
•   What is type checking
•   Simple type rules
•   Self_Type
•   Implementation
• What is a type
  – Varies from language to language
• Consensus
  – A set of values
  – A set of operations
• Classes
  – One instantiation of the modern notion of
    type
Why do we need type systems?
• Consider assembly code
  – add $r1, $r2, $r3
• What are the types of $r1, $r2, $r3?
       Types and Operations
• Certain operations are legal for values of
  each type
  – It does not make sense to add a function
    pointer and an integer in C
  – It does make sense to add two integers
  – But both have the same assembly language
    implementation
             Type Systems
• A language's type system specifies which
  operations are valid for which types
• The goal of type checking is to ensure that
  operations are used with the correct types
  – Enforces intended interpretation of values
    because nothing else will!
     Type Checking Overview
• Three kinds of languages
  – Statically typed: (Almost) all checking of types is done
    as part of compilation
     • Semantic Analysis
     • C, Java, Cool, ML
  – Dynamically typed: Almost all checking of types is
    done as part of program execution
     • Code generation
     • Scheme
  – Untyped
     • No type checking (Machine Code)
                 Type Wars
• Competing views on static vs. dynamic typing
• Static typing proponents say:
  – Static checking catches many programming errors
  – Prove properties of your code
  – Avoids the overhead of runtime type checks
• Dynamic typing proponents say:
  – Static type systems are restrictive
  – Rapid prototyping is difficult with static type systems
  – Static typing complicates the programming language and the
    compiler
  – Compiler optimizations can hide the cost of runtime type checks
          Type Wars (cont.)
• In practice, most code is written in
  statically typed languages with an escape
  mechanism
  – Unsafe casts in C, Java
  – union in C
• It is debatable whether this compromise
  represents the best or worst of both worlds
              Types Outline
• Type concepts in Cool
• Notation for type rules
  – Logical rules of inference
• Cool type rules
• General properties of type systems
               Cool Types
• The types are
  – Class names
• The user declares types for identifiers
• The semantic analysis infers types for
  expressions
  – Every expression has a unique type
Type Checking and Type Inference
• Type checking
  – The process of verifying fully typed programs
• Type inference
  – The process of filling in missing type
    information
• The two are different, but the terms are often
  used interchangeably
          Rules of Inference
• We have seen two examples of formal
  notations specifying parts of the compiler
  – Regular expressions
  – Context-free grammars
• Appropriate formalisms for static type
  checking
  – Syntax directed translations
  – Attribute grammars
  – Logical rules of inference
     Why Rules of Inference?
• Inference rules have the form:
  – “If Hypothesis is true, then Conclusion is true”
• Type checking computes via reasoning
  – “If E1 and E2 have certain types, then E1+E2
    has a certain type”
• Rules of inference are a compact notation for
  “If-Then” statements
 From English to an inference rule
• [Easy to read with practice]
• Start with a simplified system and
  gradually add features
• Building blocks
  – Symbol ∧ is 'and'
  – Symbol ⇒ is 'if-then'
  – Symbol x:T is 'x has type T'
  From English to an inference rule(2)

• If e1 has type Int and e2 has type Int,
   then e1 + e2 has type Int
• (e1 has type Int ∧ e2 has type Int) ⇒
     e1 + e2 has type Int
• (e1: Int ∧ e2: Int) ⇒ e1 + e2: Int
  From English to an inference rule(3)

• The statement
  (e1: Int ∧ e2: Int) ⇒ e1 + e2: Int
     is a special case of
  Hypothesis1 ∧ ... ∧ Hypothesisn ⇒ Conclusion
• This is an inference rule
   Notation for Inference Rules
• By tradition inference rules are written

    ⊢ Hypothesis1 ... ⊢ Hypothesisn
    -------------------------------
    ⊢ Conclusion

• Cool type rules have hypotheses and
  conclusions of the form
       ⊢ e: T
• ⊢ means “it is provable that ...”
                    Two Rules
  i is an integer
  ---------------                [Int]
  ⊢ i : Int

  ⊢ e1 : Int
  ⊢ e2 : Int
  ---------------                [Add]
  ⊢ e1 + e2 : Int
         Type Rules (cont.)
• These rules give templates describing how
  to type integers and + expressions
• By filling the templates, we can produce
  complete typings for expressions
                  Example: 1 + 2
  1 is an integer              2 is an integer
  ---------------  [Int]       ---------------  [Int]
  ⊢ 1 : Int                    ⊢ 2 : Int
  -----------------------------------------------  [Add]
                  ⊢ 1 + 2 : Int
                  Soundness
• A type system is sound if
  – Whenever ⊢ e : T
  – Then e evaluates to a value of type T
• We only want sound rules
  – But some sound rules are better than others:

                1 is an integer
                ---------------  [Strange]
                ⊢ 1 : Object
       Type Checking Proofs
• Type checking proves facts ⊢ e: T
  – Proof is on the structure of the AST
  – Proof has the shape of the AST
  – One type rule is used for each AST node
• In the type rule used for a node e:
  – Hypotheses are the proofs of the types of e's
    subexpressions
  – Conclusion is the type of e
• Bottom-up pass over the AST
Rules for Constants

  --------------                 [Bool]
  ⊢ false : Bool

  s is a string constant
  ----------------------         [String]
  ⊢ s : String
            Rules for New
• new T produces an object of type T
  – Ignore SELF_TYPE for now ...

              -----------
              ⊢ new T : T
      Two More Rules

  ⊢ e : Bool
  --------------
  ⊢ not e : Bool

  ⊢ e1 : Bool
  ⊢ e2 : T
  --------------------------------
  ⊢ while e1 loop e2 pool : Object
                      A Problem
• What is the type of a variable reference?

        x is an identifier
        ------------------
        ⊢ x : ?

• The local, structural rule does not carry
  enough information to give x a type
                  A Solution
• Put more information in the rules
• A type environment gives types for free
  variables
  – A type environment is a function from
    ObjectIdentifiers to Types
  – Symbol table
  – A variable is free in an expression if it is not
    defined within the expression
         Type Environments
• Let O be a function from ObjectIdentifiers
  to Types
• The sentence
    O ⊢ e: T
  is read: under the assumption that
  variables have the types given by O, it is
  provable that e has the type T
                Modified Rules
  i is an integer
  ---------------                [Int]
  O ⊢ i : Int

  O ⊢ e1 : Int
  O ⊢ e2 : Int
  -----------------              [Add]
  O ⊢ e1 + e2 : Int
           New Rules

  O(x) = T
  ---------                      [Var]
  O ⊢ x : T
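
As a rough illustration, the environment O can be modeled as a dictionary from identifier names to type names, and the variable rule becomes a lookup. The function name type_of_var and the sample environment are assumptions made for this sketch.

```python
def type_of_var(O, x):
    """[Var]: if O(x) = T then O |- x : T."""
    if x not in O:
        # No hypothesis O(x) = T is available, so nothing can be proved about x.
        raise TypeError(f"undeclared identifier {x}")
    return O[x]

O = {"x": "Int", "s": "String"}   # sample environment, assumed for the example
print(type_of_var(O, "x"))        # Int
```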
                        Let Rule

        O[T0/x] ⊢ e1 : T1
        -------------------------
        O ⊢ let x : T0 in e1 : T1

O[T/y] means O modified to return T on argument y
and to behave as O on all other arguments

The rule enforces the scope of the variable
• The type environment gives types to free
  identifiers in the current scope
• The type environment is passed down the
  AST from the root to the leaves
• Types are computed up the AST from the
  leaves towards the root
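
A minimal sketch of how the let rule threads the environment, assuming expressions are encoded as Python tuples (a made-up encoding for this example). O[T0/x] is built as a fresh dictionary, so the binding for x is visible only while checking the body.

```python
def check(O, e):
    """Return the type of expression e under environment O (a dict)."""
    kind = e[0]
    if kind == "int":                     # ("int", 3)
        return "Int"
    if kind == "var":                     # ("var", "x")
        if e[1] not in O:
            raise TypeError(f"undeclared identifier {e[1]}")
        return O[e[1]]
    if kind == "add":                     # ("add", e1, e2)
        if check(O, e[1]) == "Int" and check(O, e[2]) == "Int":
            return "Int"
        raise TypeError("operands of + must have type Int")
    if kind == "let":                     # ("let", "x", "T0", body)
        _, x, t0, body = e
        inner = {**O, x: t0}              # O[T0/x]: a copy, O itself is unchanged
        return check(inner, body)         # the type of the body flows back up
    raise TypeError(f"no type rule for {kind}")

body = ("add", ("var", "x"), ("int", 1))
print(check({}, ("let", "x", "Int", body)))   # Int
```

Checking body on its own under the empty environment fails, because x is free there and O gives it no type.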
    Let Rule with Initialization

            O ⊢ e0 : T0
            O[T0/x] ⊢ e1 : T1
            -------------------------------
            O ⊢ let x : T0 <- e0 in e1 : T1

• This is a weak rule: e0 must have exactly the declared type T0
• Define a relation ≤ on classes
  – X ≤ X
  – X ≤ Y if X inherits from Y
  – X ≤ Z if X ≤ Y and Y ≤ Z

        O ⊢ e0 : T
        T ≤ T0
        O[T0/x] ⊢ e1 : T1
        -------------------------------
        O ⊢ let x : T0 <- e0 in e1 : T1
• Both rules are sound but more programs
  typecheck with the second one
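
One possible way to decide X ≤ Y is to walk up the inheritance chain from X. The sketch below assumes a hypothetical parent table PARENT; the class names A and B are examples only.

```python
# Hypothetical inheritance table: each class maps to its parent, Object is the root.
PARENT = {"Object": None, "Int": "Object", "A": "Object", "B": "A"}

def conforms(x, y):
    """Return True if x <= y: x equals y or inherits from it, directly or not."""
    while x is not None:
        if x == y:
            return True
        x = PARENT[x]
    return False

def let_init_ok(declared, init_type):
    """Side condition T <= T0 of the second let rule."""
    return conforms(init_type, declared)

print(let_init_ok("A", "B"))   # True: B inherits from A
print(let_init_ok("B", "A"))   # False: A is not a subtype of B
```

The weak rule corresponds to replacing conforms with plain equality, which would reject the first call.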
• More uses of subtyping: assignment

       O(Id) = T0
       O ⊢ e1 : T1
       T1 ≤ T0
       -------------------
       O ⊢ Id <- e1 : T1
         Initialized Attributes
• Let Oc(x) = T for all attributes x:T in class C
• Attribute initialization is similar to let,
  except for the scope of names

        Oc(Id) = T0
        Oc ⊢ e1 : T1
        T1 ≤ T0
        -----------------------
        Oc ⊢ Id : T0 <- e1 ;
• Consider:
  if e0 then e1 else e2 fi
• The result can be either e1 or e2
• The type is either e1's type or e2's type
• The best we can do is the smallest
  supertype larger than the types of e1 and
  e2
        Least Upper Bounds
• lub(X, Y), the least upper bound of X and
  Y, is the class Z such that
  – X ≤ Z and Y ≤ Z
    (Z is an upper bound)
  – X ≤ Z' and Y ≤ Z' ⇒ Z ≤ Z'
    (Z is the least among the upper bounds)
• In Cool, the least upper bound of two
  types is their least common ancestor in the
  inheritance tree
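
A small sketch of lub as a least-common-ancestor search, assuming a hypothetical single-inheritance table PARENT rooted at Object. The helper names ancestors and lub are chosen for this example.

```python
from functools import reduce

# Hypothetical inheritance tree: B and C inherit from A; A and D inherit from Object.
PARENT = {"Object": None, "A": "Object", "B": "A", "C": "A", "D": "Object"}

def ancestors(t):
    """Return [t, parent(t), ..., Object]."""
    chain = []
    while t is not None:
        chain.append(t)
        t = PARENT[t]
    return chain

def lub(x, y):
    """Least common ancestor of x and y in the inheritance tree."""
    above_x = set(ancestors(x))
    for t in ancestors(y):          # walk upward from y
        if t in above_x:            # the first shared class is the least one
            return t
    raise ValueError("classes do not share a root")

print(lub("B", "C"))                    # A
print(reduce(lub, ["B", "C", "D"]))     # Object
```

Folding lub over a list with reduce gives the lub of several types at once, which is what a construct with many branches needs.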

O ⊢ e0 : Bool
O ⊢ e1 : T1
O ⊢ e2 : T2
------------------------------------------
O ⊢ if e0 then e1 else e2 fi : lub(T1, T2)
• The rule for case expressions takes lub
  over all branches
        O ⊢ e0 : T0
        O[T1/x1] ⊢ e1 : T'1
        ...
        O[Tn/xn] ⊢ en : T'n
        ---------------------------------------------------------------------
        O ⊢ case e0 of x1:T1 => e1; ...; xn:Tn => en; esac : lub(T'1, ..., T'n)
             Method Dispatch
• There is a problem with type checking
  method calls
       O ⊢ e0 : T0
       O ⊢ e1 : T1
       ...
       O ⊢ en : Tn
       -------------------------
       O ⊢ e0.f(e1, ..., en) : ?

• We need information about the formal
  parameters and return type of f
          Notes on dispatch
• In Cool, the method and object identifiers
  live in different name spaces
  – A method foo and an object foo can coexist in
    the same scope
  – In the type rules this is reflected by a separate
    mapping M for method signatures
    M(C, f) = (T1, ..., Tn, Tn+1)
    means that in class C there is a method f
    f(X1:T1, ..., Xn:Tn): Tn+1
The Dispatch Rule Revisited

  O, M ⊢ e0 : T0
  O, M ⊢ e1 : T1
  ...
  O, M ⊢ en : Tn
  M(T0, f) = (T'1, ..., T'n, T'n+1)
  Ti ≤ T'i for 1 ≤ i ≤ n
  ----------------------------------
  O, M ⊢ e0.f(e1, ..., en) : T'n+1
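
The side conditions of the dispatch rule can be sketched as follows. The tables PARENT and M, the method add, and the choice to resolve inherited methods by walking up the class chain are all assumptions made for this example.

```python
# Hypothetical class hierarchy and method environment.
PARENT = {"Object": None, "Int": "Object", "String": "Object",
          "Count": "Object", "Stock": "Count"}
M = {("Count", "add"): (["Int"], "Int")}      # assumed method add(n : Int) : Int

def conforms(x, y):
    """Return True if x <= y in the inheritance order."""
    while x is not None:
        if x == y:
            return True
        x = PARENT[x]
    return False

def lookup(cls, f):
    """M(cls, f): the signature of f in cls, including inherited methods."""
    while cls is not None:
        if (cls, f) in M:
            return M[(cls, f)]
        cls = PARENT[cls]
    return None

def dispatch_type(t0, f, arg_types):
    """Type of e0.f(e1, ..., en) given e0 : t0 and ei : arg_types[i]."""
    sig = lookup(t0, f)
    if sig is None:
        raise TypeError(f"class {t0} has no method {f}")
    formals, ret = sig
    if len(formals) != len(arg_types):
        raise TypeError("wrong number of arguments")
    for actual, formal in zip(arg_types, formals):
        if not conforms(actual, formal):      # Ti <= T'i
            raise TypeError(f"{actual} does not conform to {formal}")
    return ret                                # T'n+1

print(dispatch_type("Stock", "add", ["Int"]))   # Int
```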
            Static Dispatch
• A variation of normal dispatch
• The method is found in the class explicitly
  named by the programmer
• The inferred type of the dispatch
  expression must conform to the specified
  type
           Static Dispatch

O, M ⊢ e0 : T0
O, M ⊢ e1 : T1
...
O, M ⊢ en : Tn
T0 ≤ T
M(T, f) = (T'1, ..., T'n, T'n+1)
Ti ≤ T'i for 1 ≤ i ≤ n
------------------------------------
O, M ⊢ e0@T.f(e1, ..., en) : T'n+1
     The Method Environment
• The method environment must be added
  to all rules
• In most cases, M is passed but not
  actually used
   O, M ⊢ e1 : Int
   O, M ⊢ e2 : Int
   --------------------     [Add]
   O, M ⊢ e1 + e2 : Int
         More Environments
• For some cases involving SELF_TYPE,
  we need to know the class in which an
  expression appears
• The full environment for COOL
  – A mapping O giving types to object identifiers
  – A mapping M giving types to methods
  – The current class C
• The form of a sentence in the logic is
  O, M, C ⊢ e : T

   O, M, C ⊢ e1 : Int
   O, M, C ⊢ e2 : Int
   -----------------------          [Add]
   O, M, C ⊢ e1 + e2 : Int
           Effectiveness of
         Static Type Systems
• Static type systems detect common errors
• But some correct programs are disallowed
  – Some argue for dynamic checking instead
  – Others argue for more expressive type
    systems
• But more expressive type systems are
  more complex
    Dynamic and Static Types
• The dynamic type of an object is the class
  that is used in the new expression
  – a runtime notion
  – Even languages that are statically typed have
    dynamic types
• The static type of an expression captures
  all the dynamic types that the expression
  could have
  – A compile-time notion
    Dynamic and static types
• In early type systems the set of static
  types corresponds directly to the dynamic
  types
• Soundness theorem: for all expressions E
   dynamic_type(E) = static_type(E)
• Gets more complicated in advanced type
  systems
 Dynamic and Static Types in Cool
           class A { ... }
           class B inherits A { ... }
           class Main {
              x : A <- new A ;
              x <- new B ;
           }

• A variable of static type A can hold a
  value of static type B if B ≤ A
    Dynamic and Static Types
• Soundness of the Cool type system:
  – ∀ E. dynamic_type(E) ≤ static_type(E)
• Why is this ok?
  – All operations that can be used on an object
    of type C can be also used on an object of
    type C' ≤ C
  – Subclasses only add behavior (attributes or
    methods)
  – Methods can be redefined but with the same
    type
                     An Example
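class Count {
    i : Int <- 0;
    inc() : Count {
        {
            i <- i + 1;
            self;
        }
    };
};

class Stock inherits Count {
    name : String;
};

class Main {
    a : Stock <- (new Stock).inc();
    ... a.name ...
};

• (new Stock).inc() has static type Count, which does not
  conform to Stock, so the initialization of a is rejected
  even though inc() returns a Stock object at run time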
   SELF_TYPE to the Rescue
• We will extend the type system
• Insight
  – inc returns “self”
  – The return value has the same type as “self”
  – This can be Count or any subtype of Count
• Introduce the keyword SELF_TYPE to use
  for the return value of such functions
  – Need to modify the typing rules
              An Example (revisited)
class Count {
    i : Int <- 0;
    inc() : SELF_TYPE {
        {
            i <- i + 1;
            self;
        }
    };
};

class Stock inherits Count {
    name : String;
};

class Main {
    a : Stock <- (new Stock).inc();
    ... a.name ...
};
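
The slides do not spell out the modified dispatch rule. The sketch below assumes the convention of the Cool reference manual: when the declared return type is SELF_TYPE, the dispatch has the static type T0 of the receiver e0. The names dispatch_result and conforms, and the PARENT table, are hypothetical.

```python
# Hypothetical hierarchy for the Count / Stock example.
PARENT = {"Object": None, "Count": "Object", "Stock": "Count"}

def conforms(x, y):
    """Return True if x <= y in the inheritance order."""
    while x is not None:
        if x == y:
            return True
        x = PARENT[x]
    return False

def dispatch_result(t0, declared_return):
    """Static type of e0.f(...) when e0 : t0 and f declares declared_return."""
    # Assumed rule: SELF_TYPE stands for the static type of the receiver.
    return t0 if declared_return == "SELF_TYPE" else declared_return

# a : Stock <- (new Stock).inc()
print(conforms(dispatch_result("Stock", "Count"), "Stock"))       # False: rejected
print(conforms(dispatch_result("Stock", "SELF_TYPE"), "Stock"))   # True: accepted
```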
             Type Systems
• The rules in this lecture are Cool-specific
• A lot of theory about type systems
• General themes
  – Type rules are defined on the structure of
    expressions
  – Types of variables are modeled by an
    environment
    One Pass Type Checking
• COOL type checking can be implemented
  in a single traversal over the AST
• Type environment is passed down the tree
  – From parent to child
• Types are passed up the tree
  – From child to parent
Implementing Type Systems
O, M, C ⊢ e1 : Int
O, M, C ⊢ e2 : Int
-----------------------              [Add]
O, M, C ⊢ e1 + e2 : Int

 TypeCheck(Environment, e1 + e2) {
   T1 = TypeCheck(Environment, e1);
   T2 = TypeCheck(Environment, e2);
   check T1 == Int and T2 == Int;
   return Int;
 }
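
For readers who want to run the idea, here is one way the pseudocode might look in Python. The Environment class bundling O, M and C, the tuple encoding of expressions, and the name type_check are assumptions made for this sketch, not the course's actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Environment:
    O: dict = field(default_factory=dict)   # object identifiers -> types
    M: dict = field(default_factory=dict)   # (class, method) -> signature
    C: str = "Main"                         # current class

def type_check(env, e):
    """Single traversal: env is passed down the tree, types are returned upward."""
    kind = e[0]
    if kind == "int":                        # ("int", 2)
        return "Int"
    if kind == "id":                         # ("id", "x")
        if e[1] not in env.O:
            raise TypeError(f"undeclared identifier {e[1]}")
        return env.O[e[1]]
    if kind == "+":                          # ("+", e1, e2)
        t1 = type_check(env, e[1])
        t2 = type_check(env, e[2])
        if t1 != "Int" or t2 != "Int":
            raise TypeError("operands of + must have type Int")
        return "Int"
    raise TypeError(f"no type rule for {kind}")

env = Environment(O={"x": "Int"})
print(type_check(env, ("+", ("id", "x"), ("int", 2))))   # Int
```

M and C are carried along but unused by these three rules, matching the remark that M is passed in most rules without being consulted.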
