					          Compilation 2010


Domain-Specific Languages
          and
   Syntax Extensions



            Jan Midtgaard
        Michael I. Schwartzbach
           Aarhus University
                            GPL Problem Solving

  The General Purpose Language (GPL) approach:
       •   analyze the problem domain
       •   express the conceptual model as an OO/… design
       •   program a framework/library
       •   express concrete application as framework/library client
  Pros:
       • predictable and familiar result
       • (relatively) low cost of implementation
  Cons:
       • difficult to fully exploit domain-specific knowledge
       • only available to general programmers
Domain-Specific Languages                                             2
                            DSL Problem Solving

  The DSL approach:
       • analyze the problem domain
       • express the conceptual model as a language design
       • implement a compiler or interpreter
  Pros:
       • possible to exploit all domain-specific knowledge
       • also available to domain experts
  Cons:
       • (relatively) high cost of implementation
       • risk of Babylonian confusion
       • hard to combine several DSLs, or a DSL and a GPL, developed
         this way
                            Variations of DSLs

  A stand-alone DSL:
       • a novel language with unique syntax and features
       • example: LaTeX
  An embedded DSL:
       • an existing GPL extended with DSL features
       • example: JSP
  An external DSL:
       • a stand-alone DSL invoked from a GPL
       • example: SQL invoked from Java (JDBC)



                            From DSL to GPL

  A stand-alone DSL may evolve into a GPL:
       •   Fortran: Formula Translation
       •   Algol: Algorithmic Language
       •   Cobol: Common Business Oriented Language
       •   Lisp: List Processing Language
       •   Simula: Simulation Language
       •   ML: Meta Language


  A (successful) DSL design should plan for growth


          Using Domain-Specific Knowledge

  Domain-specific syntax:
       • domain-specific syntax clarifies the behavior
       • directly denote high-level concepts
  Domain-specific analysis:
       • consider global properties of the application
  Domain-specific optimization:
       • exploit domain-specific analysis results


  GPL frameworks cannot provide these benefits


                 The Joos Peephole Language

  A stand-alone DSL:
       • no general-purpose computing is required
  Domain concepts:
       • bytecodes
       • patterns
       • templates
  Implemented using:
       • a parser
       • a static checker
       • an interpreter

                     DSL Syntax for Peepholes

      pattern dup_istore_pop x:
        x ~ dup
            istore (i0)
            pop
      -> 3 istore (i0)




                            GPL Syntax Alternative

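 // requires java.util.ArrayList
 boolean dup_istore_pop(InstructionList x) {
   // match the three-instruction window: dup; istore i0; pop
   if (is_dup(x) && is_istore(x.next) && is_pop(x.next.next)) {
     int i0 = (int) x.next.getArg();
     // ArrayList.add returns boolean, so build the replacement list first
     ArrayList replacement = new ArrayList();
     replacement.add(new Iistore(i0));
     x = replace(x, 3, replacement);
     return true;
   }
   return false;
 }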


  Much harder to write correctly
  Fixed implementation strategy



                   DSL Analysis for Peepholes

  Formal type and scope rules:
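       |- E: bytecodes[Γ→Γ']   |- P[Γ'→Γ'']          |- E: boolean[Γ→Γ']
       ------------------------------------          ---------------------
            |- E ~ P: boolean[Γ→Γ'']                  |- ! E: boolean[Γ→Γ']


       |- E1: boolean[Γ→Γ']   |- E2: boolean[Γ'→Γ'']
       ---------------------------------------------
               |- E1 && E2: boolean[Γ→Γ'']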



  This is checked by a phase in the DSL interpreter




                      GPL Analysis Alternative

  Lots of yellow PostIt notes:




  These cannot be checked by the Java compiler


                            The JWIG Language

  An embedded DSL (in Java):
       • lots of general-purpose computing is required
  Domain concepts:
       • XML templates
       • Web services
       • sessions
  Implemented using:
       • a syntax extension
       • a static analysis
       • a framework

                            DSL Syntax for JWIG

      public class test extends Service {
          String userid;
          public class Login extends Session {
            XML wrap = [[<html>
                           <body bgcolor="yellow">
                             <[contents]>
                           </body>
                         </html>]];
            public void main() {
              XML login = [[<form>
                             Userid: <input type="text" name="userid">
                             <input type="submit"/>
                           </form>]];
              show wrap<[contents = login];
              userid = receive userid;
              show wrap<[contents = "Welcome "+userid];
            }
          }
      }

                            GPL Syntax Alternative

 XML login = XML.make("<form>\n" +
     "Userid: <input type=\"text\" name=\"userid\">\n" +
     "<input type=\"submit\"/>\n" +
     "</form>");
 show(wrap.plug("contents",login));
 userid = receive("userid");


  The DSL syntax maps directly to method calls in
   an underlying Java framework
  Avoiding escapes makes the syntax more legible
  But this is just a thin layer of syntactic sugar
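  To illustrate what a plug operation does at the string level, here is a small C++ sketch. The function `plug` and the `<[name]>` gap convention are modeled on the slide's example, but the code is a hypothetical stand-in and not JWIG's actual API.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replaces every gap written as <[name]> in `tmpl`
// with `value`. This mimics the idea of plugging, not JWIG's implementation.
std::string plug(std::string tmpl, const std::string& name, const std::string& value) {
    const std::string gap = "<[" + name + "]>";
    std::size_t pos = 0;
    while ((pos = tmpl.find(gap, pos)) != std::string::npos) {
        tmpl.replace(pos, gap.size(), value);
        pos += value.size();  // continue searching after the inserted text
    }
    return tmpl;
}
```

  Note that plugging a gap that does not exist silently returns the template unchanged. A plain library call cannot flag that mistake at compile time.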



                            DSL Analysis for JWIG

  A static analysis that at compile time guarantees:
       • only well-formed and valid XML is ever generated
       • only existing form fields are ever received
       • only existing gaps are ever plugged


  This is a DSL analysis that is performed on the
   resulting compiled class files




                  JWIG Implementation Model



   [Diagram: JWIG syntax --jwigc--> Java syntax --javac--> .class files --jwiga--> analysis results;
    the JWIG framework is shown as a separate component]




                            Syntax Extensions

  Programmers may want to extend the syntax of
   their programming language:
       •   introduce domain-specific syntax
       •   abbreviate common idioms
       •   define language extensions
       •   ensure consistency


  Such extensions are introduced through macros




                            Macros

  Macros are as old as programming
  Are used as an orthogonal abstraction mechanism
  Two different flavors:
       • lexical macros
       • syntactic macros

  Dictionary definition:
       Main Entry: ²macro
       Pronunciation: 'ma-(")krO
       Function: noun
       Inflected Form(s): plural macros
       Etymology: short for macroinstruction
       Date: 1959
       “a single computer instruction that stands for a sequence of operations”



                            Lexical Macros

  Operate on sequences of tokens
  Are handled by a preprocessor
  Are independent of the host language syntax

  Examples:
       • CPP
       • TeX




                     CPP - The C Preprocessor

  Integrated into C compilers
  Also works as a stand-alone expander

  Intercepts directives such as:
       •   #define
       •   #undef
       •   #ifdef
       •   #if
       •   #include



                            Lexical Macro Example

  CPP macro to square a number:
      #define square(X) X * X

      square(z + 1)   →   z + 1 * z + 1




                            Lexical Macro Example

  CPP macro to square a number:
      #define square(X) X * X

      square(z + 1)   →   z + (1 * z) + 1


  Adding parentheses as a hack:

      #define square(X) (X) * (X)

      square(z + 1)   →   (z + 1)*(z + 1)
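  The effect of each variant can be checked with a short C++ sketch. The macro and function names below are made up for illustration. The third variant, which also parenthesizes the whole body, goes beyond the slide and is included as an assumption about what a more robust definition would look like.

```cpp
// Naive version from the slide: the argument is pasted in without parentheses.
#define SQUARE_NAIVE(X) X * X
// Parenthesized arguments, as in the slide's fix.
#define SQUARE_ARGS(X) (X) * (X)
// Assumption beyond the slide: also parenthesize the whole body.
#define SQUARE_FULL(X) ((X) * (X))

int naive(int z)      { return SQUARE_NAIVE(z + 1); }       // z + 1 * z + 1
int args_only(int z)  { return SQUARE_ARGS(z + 1); }        // (z + 1) * (z + 1)
int in_context(int z) { return 100 / SQUARE_ARGS(z + 1); }  // 100 / (z + 1) * (z + 1)
int full(int z)       { return 100 / SQUARE_FULL(z + 1); }  // 100 / ((z + 1) * (z + 1))
```

  With z = 4, `naive` yields 9 rather than 25, and `in_context` yields 100 rather than 4. Parenthesizing only the arguments therefore still fails when the expansion lands inside a larger expression.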




                            Parsing Problem

      #define swap(X,Y) { int t=X; X=Y; Y=t; }

      if (a > b) swap(a,b); else b=0;


     *** test.c:3: parse error before 'else'




                            Parsing Problem Hack

 #define swap(X,Y) do { int t=X; X=Y; Y=t; } while (0)

 if (a > b) swap(a,b); else b=0;
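  The do/while wrapper works because it turns the macro body into a single statement that still expects the trailing semicolon, so the `else` stays attached to the `if`. A minimal C++ sketch follows; the helper function `order_or_reset` is a hypothetical name used only to host the slide's statement.

```cpp
// The slide's fix: do { ... } while (0) is one statement that still needs
// the semicolon written after the macro invocation.
#define SWAP(X, Y) do { int t = X; X = Y; Y = t; } while (0)

// Mirrors the slide's statement: swap when a > b, otherwise set b to 0.
void order_or_reset(int& a, int& b) {
    if (a > b) SWAP(a, b); else b = 0;   // parses as one if/else statement
}
```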




                            Expansion Time

 #define A 87
 #define B A
 #undef A
 #define A 42

 B   →   ???


  Eager expansion (definition time):
      B   →   87

  Lazy expansion (invocation time):
      B   →   A   →   42
  CPP is lazy
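  The lazy behavior can be observed directly. In this small sketch the function name `value_of_b` is an illustrative choice; the four directives are the ones from the slide.

```cpp
#define A 87
#define B A      // B's body is stored as the token A, not as 87
#undef A
#define A 42

// B is expanded here, at the point of use, where A currently means 42.
int value_of_b() { return B; }

// Clean up so the single-letter names do not leak into later code.
#undef A
#undef B
```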
                            Expansion Order

 #define id(X) X
 #define one(X) id(X)
 #define two a,b

 one(two)   →   ???


  Inner ("call-by-value"):
      one(two)   →   one(a,b)   →   *** arity error 'one'


  Outer ("call-by-name"):
      one(two)   →   id(two)   →   two   →   a,b


                       Expansion Order in CPP

  CPP uses a pragmatic "argument prescan":
      one(two)   →   id(a,b)   →   *** arity error 'id'


  Useful for composing macros:
      #define succ(X) ((X)+1)
      #define call7(X) X(7)

      call7(succ)   →   succ(7)   →   ((7)+1)
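  The composition example compiles as written. The sketch below wraps it in a function (the name `composed` is an illustrative choice) so that the result can be inspected.

```cpp
#define succ(X) ((X) + 1)
#define call7(X) X(7)

// call7(succ) -> succ(7) -> ((7) + 1)
// The bare name succ is not followed by '(' when it is passed as an argument,
// so it is only expanded after substitution produces succ(7).
int composed() { return call7(succ); }
```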




                            Recursive Expansion

 #define x 1+x

 x   →   ???


  Definition time:
      *** recursive definition

  Invocation time:
      x   →   1+x   →   1+1+x   →   1+1+1+x   →   ...




                  Recursive Expansion in CPP

  CPP uses a pragmatic "intercept-and-ignore":
      int x = 2;
      #define x 1+x

      x   →   1+x



  Maintain a stack of macro invocations
  Ignore invocations of macros already on the stack

  At runtime the expression x evaluates to 3
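  A self-referential macro of this kind can be tried out as follows. The function name `expanded_x` is an illustrative choice, and the final `#undef` is added only so that later code can name the variable again.

```cpp
int x = 2;
#define x 1 + x   // the inner x is not expanded again, so it names the variable

// Expands to: return 1 + x;  where x is the global variable above.
int expanded_x() { return x; }

#undef x          // from here on, x refers to the variable directly
```

  The variable itself is never modified; only the expression written as `x` changes meaning while the macro is defined.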

                            TeX Macros

 \def \vector #1[#2..#3] {
   $({#1}_{#2},\ldots,{#1}_{#3})$
 }

 \vector \phi[0..n-1]
     →   $({\phi}_{0},\ldots,{\phi}_{n-1})$



     Flexible invocation syntax
     Parsing ambiguities (chooses shortest invocation)
     Expansion is lazy and outer
     Recursion is permitted (conditions allowed)
                            Syntactic Macros

  Operate on sequences of ASTs
  Are handled by the parser
  Are integrated with the host language syntax

  Examples:
       • C++ templates
       • Jakarta Tool Suite




                            C++ Templates

  Integrated into C++ compilers
  Are intended as a genericity mechanism
  But are often used as a macro language

  Macros accept ASTs for:
       • identifiers
       • constants
       • types
  The result is always an AST for a declaration


                      Syntactic Macro Example

 template <class T>
   T GetMax(T x, T y) { return (x>y?x:y); }

 int i,j;
 int max = GetMax <int> (i,j);


  Template bodies are parsed at definition time
   (unlike CPP macros)
  Templates are syntactically expanded
  Heavy use of templates yields bloated code
   (unlike Java generics, which are not macros)
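  One visible consequence of working on parsed, typed arguments rather than on tokens is that a template evaluates each argument once. The sketch below contrasts `GetMax` with a lexical macro of the same intent. `MAX_MACRO`, `with_template`, and `with_macro` are hypothetical names added for the comparison.

```cpp
template <class T>
T GetMax(T x, T y) { return (x > y ? x : y); }

// For comparison only: a lexical macro with the same intent.
#define MAX_MACRO(X, Y) ((X) > (Y) ? (X) : (Y))

// The template receives the value of i++ once.
int with_template(int& i, int j) { return GetMax<int>(i++, j); }

// The macro pastes the text i++ twice, so it may be evaluated twice.
int with_macro(int& i, int j) { return MAX_MACRO(i++, j); }
```

  Starting from i = 5 and j = 3, the template version returns 5 and leaves i at 6, while the macro version returns 6 and leaves i at 7.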

                            Metaprogramming

  C++ templates:
       • perform compile-time constant folding of arguments
       • allow multiple template definitions and pattern matching
  This combination enables metaprogramming:
       • Turing-complete computations during compilation
  Template libraries exist for:
       •   booleans
       •   control structures
       •   functions
       •   variables
       •   data structures
                   Metaprogramming Example

 template <int X, int Y>
   struct pow { static const int n=X*pow<X,Y-1>::n; };

 template <int X>
   struct pow<X,0> { static const int n = 1; };

 const int z = pow<5,3>::n;


  The value 125 is assigned to z at compile time
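  That the computation really happens during compilation can be confirmed with `static_assert` (available from C++11). The sketch below repeats the slide's definition under the name `power`, an arbitrary renaming chosen so that it cannot collide with the C library function `pow`. The array `table` is likewise a made-up example.

```cpp
// Same structure as the slide's pow, renamed to avoid the C library's pow.
template <int X, int Y>
struct power { static const int n = X * power<X, Y - 1>::n; };

template <int X>
struct power<X, 0> { static const int n = 1; };

// These checks are evaluated by the compiler, so a successful build
// already shows that the values were computed at compile time.
static_assert(power<5, 3>::n == 125, "5^3 should be 125");
static_assert(power<2, 10>::n == 1024, "2^10 should be 1024");

// The result is usable wherever a constant expression is required.
int table[power<2, 3>::n];   // array of 8 ints
```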




        Metaprogramming for Specialization

 template <int I>
   inline float dot(float *a, float *b)
     { return dot<I-1>(a,b) + a[I]*b[I]; }
 template <>
   inline float dot<0>(float *a, float *b)
     { return a[0]*b[0]; }

 float x[3], y[3];
 float z = dot<2>(x,y);



 →   float z = x[0]*y[0] + x[1]*y[1] + x[2]*y[2];




  The overhead of control structures is removed
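  A compilable version of the specialization example is sketched below. The wrapper `dot3` is a hypothetical name, and `const` has been added to the pointer parameters as a stylistic assumption. Whether the calls are actually inlined into a single expression is up to the compiler's optimizer.

```cpp
// Recursive case: dot<I> adds the I-th product to dot<I-1>.
template <int I>
inline float dot(const float* a, const float* b) {
    return dot<I - 1>(a, b) + a[I] * b[I];
}

// Base case: the recursion stops at index 0.
template <>
inline float dot<0>(const float* a, const float* b) {
    return a[0] * b[0];
}

// dot<2> covers indices 0..2, i.e. a three-element dot product with no loop.
float dot3(const float* a, const float* b) { return dot<2>(a, b); }
```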

                              Jakarta Tool Suite

  JTS extends Java with simple syntactic macros
  Macros accept ASTs for:
       •   AST_QualifiedName
       •   AST_Exp
       •   AST_Stm
       •   AST_FieldDecl
       •   AST_Class
       •   AST_TypeName

  The result is an AST specified as:
       •   exp{     ...     }exp
       •   stm{     ...     }stm
       •   mth{     ...     }mth
       •   cls{     ...     }cls


                            Hygienic Macros

  macro swap(AST_QualifiedName x, AST_QualifiedName y)
    local temp
    stm{ int temp = x; x = y; y = temp; }stm

  int temp = 42;
  int tump = 87;
  #swap(temp,tump);


   Potential name clash problem:
       int temp = temp; temp = tump; tump = temp;
   But local names are renamed uniquely:
       int temp143 = temp; temp = tump; tump = temp143;
   Hygienic macros are available in Scheme and in various macro
    extensions of Java such as JSE, …
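  CPP offers no hygiene, but the renaming idea can be approximated by hand. The sketch below derives the temporary's name from `__LINE__`. All macro names and the `swap_tmp_` prefix are hypothetical, and unlike true hygiene this only makes a clash unlikely rather than impossible.

```cpp
// Helper macros that paste tokens after __LINE__ has been expanded.
#define CONCAT_INNER(A, B) A##B
#define CONCAT(A, B) CONCAT_INNER(A, B)

// The temporary gets a name derived from the invocation's line number,
// for example swap_tmp_17, instead of a fixed name such as temp.
#define SWAP_IMPL(X, Y, TMP) do { int TMP = (X); (X) = (Y); (Y) = TMP; } while (0)
#define SWAP(X, Y) SWAP_IMPL(X, Y, CONCAT(swap_tmp_, __LINE__))

// Swaps two variables deliberately named as in the slide's example.
void swap_demo(int& temp, int& tump) {
    SWAP(temp, tump);
}
```

  With a fixed temporary named `temp`, the same invocation would capture the caller's variable, which is exactly the clash shown on the slide.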
