


                          Memo on SMLFormat
                         Wonseok Chae and Matthias Blume {wchae, blume}@tti-c.org
                                          Toyota Technological Institute at Chicago


1. About this document

   SMLFormat, which is developed as part of the SML# Project, is a pretty printer
   generator for Standard ML. This document describes our experience installing
   and using SMLFormat. A more official manual for SML# is available at the Project
   Home (http://www.pllab.riec.tohoku.ac.jp/smlsharp/).

   Most of the examples and explanations in this memo come from
   Yamatodani Kiyoshi’s “SMLFormat: Pretty Printer for Standard ML” at
   http://www.pllab.riec.tohoku.ac.jp/smlsharp/SMLFormat/doc/OVERVIEW_en.txt

2. Basic idea

   The task of implementing pretty printers for intermediate languages in a compiler
   is often considered boring, but compiler developers cannot live without them.
   SMLFormat can save us this tedious work by automatically generating formatter
   functions (or pretty printers) for SML source files from minimal annotations. The
   following diagram shows the required steps: 1) annotate type/datatype declarations
   with format comments, 2) generate formatter functions by running smlformat, and
   3) obtain an SML source file with formatters. Other programs can then call the
   generated format_exp function to print values of datatype exp:

      exp.ppg                                      exp.ppg.sml
      (*% *)                                       (*% *)
      datatype exp =                               datatype exp =
       (*%@format (value) value *)    smlformat     (*%@format (value) value *)
        Int of int                   ---------->     Int of int

                                                   fun format_exp x =
                                                       case x of
                                                         Int x => …



   NOTE: We believe the name of extension, ppg, comes from pretty printer
       generator.
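
   To make step 3 concrete, here is a minimal hand-written sketch of what a
   formatter for datatype exp amounts to. It is a stand-in under stated
   assumptions, not actual smlformat output: it returns a plain string, whereas
   generated formatters build SMLFormat format expressions. The name printed is
   ours and purely illustrative.

```sml
(* Hand-written stand-in for a generated formatter.
   Assumption: the result is a plain string; the code that smlformat
   generates builds format expressions instead. *)
datatype exp = Int of int

fun format_exp x =
  case x of
    Int value => Int.toString value

(* A client calls the formatter like any other function. *)
val printed = format_exp (Int 42)
val () = print (printed ^ "\n")
```

   The point of the sketch is only the shape: one case clause per constructor,
   with the bound variable taken from the type pattern in the format comment.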




W. Chae & M. Blume                         1                                    12/1/2006


3. Install

   Our computing environment is the following:
       - Mac OS X Darwin 8.8.0
       - SML/NJ v110.58

   NOTE: Any recent version of SML/NJ and Linux-like systems should work the
       same.

   Download:  SMLFormat-0.10.tar.gz from
              http://www.pllab.riec.tohoku.ac.jp/smlsharp/?Download

   Install:   1. untar

                 [wchae@hana SML]$ tar xvfz SMLFormat-0.10.tar.gz

                 NOTE: “README.txt” under SMLFormat-0.10, which is created
                     by the first step, describes the next steps.

              2. run ./configure

                 [wchae@hana SMLFormat-0.10]$ ./configure
                 *** NOTICE: --with-smlunit option is not specified.
                 ***     generation for SMLUnit is skipped.

                 NOTE: At this time, we did not install SMLUnit, which is
                     another gem provided by the SML# Project.

              3. make

                 [wchae@hana SMLFormat-0.10]$ make


   In our case, running ‘make’ failed to build “smlformat”. The following changes
   were our workarounds to make it build:

   1) Modify configure (SMLFormat-0.10/configure)

       88 ####################################################
       89
       90 HEAPNAME="$pwd/bin/smlformat-heap"
       91
       92 case "$host_os" in
       93 *cygwin*)
       94 SML=sml.bat
       95 SMLCM=sml-cm.bat
       96 MAINFUNC='(fn (p,_::ps)=> Main.main(p,ps)
                             | (p,nil)=> Main.main(p,nil))'
       97 HEAPNAME="$HEAPNAME.x86-win32"
       98 HEAPNAME=`cygpath -am "$HEAPNAME"`
       99 ;;
      100 *)
      101 SML=sml
    - 102 SMLCM=sml-cm
    + 102 SMLCM=sml
      103 MAINFUNC=Main.main
      104 ;;
      105 esac
      106

      NOTE: We couldn’t check whether a cygwin version had sml-cm.bat or not.

   2) Modify Makefile.in (SMLFormat-0.10/Makefile.in)

       36 heap:
       37       cd generator/main && \
     - 38       echo "CM.make();" \
     + 38       echo "CM.make \"sources.cm\";" \
       39           "SMLofNJ.exportFn(\"$(HEAPNAME)\",$(MAINFUNC));" \
       40       | $(SML)
       41
       42 test:
       43       cd formatlib/test && \
       44       echo "SMLofNJ.Internals.GC.messages false;" \
       45           "CM.make();" \
       46           "TestMain.test();" \
       47       | $(SML_CMD)

   3) Modify sources.cm (SMLFormat-0.10/generator/main/sources.cm)

        1 Group is
        2     Utility.sml
        3     ….
       14     MLPARSER.sig
       15     MLParser.sml
       16
     - 17     ml-yacc-lib.cm
     + 17     $/ml-yacc-lib.cm
     + 18     $/basis.cm
     + 19     $/smlnj-lib.cm

   4) Modify sources.cm (SMLFormat-0.10/formatlib/main/sources.cm)






        1 Group
        2 is
      + 3     $/basis.cm
      + 4     $/smlnj-lib.cm
        3     FORMAT_EXPRESSION.sig
        4     FormatExpression.sml
        5     PRINTER_PARAMETER.sig
        6     PrinterParameter.sml
        7     PRETTYPRINTER.sig
        8     PrettyPrinter.sml
        9     PREPROCESSOR.sig
       10     PreProcessor.sml
       11     SMLFORMAT.sig
       12     SMLFormat.sml
       13     BASIC_FORMATTERS.sig
       14     BasicFormatters.sml

      NOTE: smlformat can be built without this modification, but later, whenever
          we use the formatters generated by smlformat, we would see many
          “unbound structure” errors.

   5) Modify MLParser.sml (SMLFormat-0.10/generator/main/MLParser.sml)

      130
    - 131     fun getLine length = TextIO.inputLine sourceStream
    + 131     fun getLine length = case TextIO.inputLine sourceStream of
                                       NONE => ""
                                     | SOME st => st


      NOTE: The type of TextIO.inputLine was recently changed from
          instream -> string to instream -> string option.

   6) Modify ml.lex (SMLFormat-0.10/generator/main/ml.lex)

       46 local
       47 fun cvt radix (s, i) =
     - 48       #1(valOf(Int.scan radix Substring.getc (Substring.triml i
                                                       (Substring.all s))))
     + 48       #1(valOf(Int.scan radix Substring.getc (Substring.triml i
                                                          (Substring.full s))))
       49 in
       50 val atoi = cvt StringCvt.DEC
       51 val xtoi = cvt StringCvt.HEX
       52 end (* local *)






      NOTE: Since SML/NJ v110.67, the deprecated function Substring.all has been
          removed.

   7) Repeat steps 2 and 3 of the install procedure above (./configure and make).
      The executable file called “smlformat” will be created under
      SMLFormat-0.10/bin.
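
   The two Basis Library adaptations from steps 5 and 6 can be tried in isolation.
   The following is a small sketch, assuming only the current Basis Library; the
   names getLineOrEmpty, ins, line1, line2, atEnd, dec, and hex, as well as the
   sample inputs, are ours and do not appear in the SMLFormat sources.

```sml
(* Step 5: adapter that hides the option result of TextIO.inputLine. *)
fun getLineOrEmpty (stream : TextIO.instream) : string =
  case TextIO.inputLine stream of
    NONE => ""          (* end of input *)
  | SOME line => line   (* includes the trailing newline *)

(* Step 6: the same conversion helper, written with Substring.full. *)
local
  fun cvt radix (s, i) =
    #1 (valOf (Int.scan radix Substring.getc
                 (Substring.triml i (Substring.full s))))
in
  val atoi = cvt StringCvt.DEC
  val xtoi = cvt StringCvt.HEX
end

(* Hypothetical usage with made-up inputs. *)
val ins   = TextIO.openString "first\nsecond\n"
val line1 = getLineOrEmpty ins
val line2 = getLineOrEmpty ins
val atEnd = getLineOrEmpty ins      (* "" once the input is exhausted *)
val dec   = atoi ("x42", 1)         (* skip one character, then scan decimal *)
val hex   = xtoi ("ff", 0)          (* scan hexadecimal from the start *)
```

   Mapping NONE to the empty string preserves the old convention that an empty
   result signals end of input, which is what the lexer driver expects.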

4. Test

   The current distribution of SMLFormat-0.10.tar.gz has several examples under
   SMLFormat-0.10/example. This section shows how to compile and run them. For the
   same reason as before, we need to modify sources.cm under each example. For
   instance, if we want to test an example under SMLFormat-0.10/example/MLParser,
   then we should modify sources.cm under SMLFormat-0.10/example/MLParser:

   - sources.cm (SMLFormat-0.10/example/MLParser/sources.cm)
        1 Group is
        2 (*
        3 to build this example, see ../../README.txt
        4 *)
      - 5      ../../smlpplib.cm
      + 5      ../../smlformatlib.cm
        6 (*
        7      Absyn.ppg:ppg
        8 *)
        9      Absyn.ppg.sml
       10 (*
       11      AbsynUseDefault.ppg:ppg
       12 *)
       13      ml.lex
       14      ml.grm
     - 15      ml-yacc-lib.cm
     + 15      $/ml-yacc-lib.cm
     + 16      $/basis.cm
     + 17      $/smlnj-lib.cm

   In addition to this modification, we need to replace all occurrences of SMLPP with
   SMLFormat in both Absyn.ppg (3 times) and Main.sml (4 times). Otherwise, the error
   “unbound structure: SMLPP in path” is reported. We guess that a late change to the
   structure name before distribution of this package caused this trouble, and we
   expect later versions to fix it. For example, SMLPP should be replaced with
   SMLFormat in Absyn.ppg:

   - Absyn.ppg (SMLFormat-0.10/example/MLParser/Absyn.ppg)
       84 fun formatPrependedOpt (formatter, prefixIfSome) =
     - 85       fn NONE => [SMLPP.FormatExpression.Term (0, "")]
     + 85       fn NONE => [SMLFormat.FormatExpression.Term (0, "")]
       86        | SOME value => prefixIfSome @ (formatter value)

   You also need to change Main.sml:

   - Main.sml (SMLFormat-0.10/example/MLParser/Main.sml)
      211       fun getLine length =
    + 212          case
      212         (
      213           if #promptMode arg
      214           then
      215             if !(#doFirstLinePrompt arg)
      216             then
      217               (
      218                 #doFirstLinePrompt arg := false;
      219                 print firstLinePrompt;
      220                 TextIO.flushOut TextIO.stdOut
      221               )
      222             else
      223               (
      224                 print secondLinePrompt;
      225                 TextIO.flushOut TextIO.stdOut
      226               )
      227           else ();
      228           TextIO.inputLine (#stream arg)
      229         )
    + 231          of NONE => "" | SOME st => st

      NOTE: Same trouble with TextIO.inputLine. All SMLPPs in Main.sml should
           be replaced with SMLFormat.

   The last change is to replace Substring.all with Substring.full in ml.lex:

   - ml.lex (SMLFormat-0.10/example/MLParser/ml.lex)
        41 local
        42 fun cvt radix (s, i) =
      - 43    #1(valOf(Int.scan radix Substring.getc (Substring.triml i
                                                     (Substring.all s))))
      + 43    #1(valOf(Int.scan radix Substring.getc (Substring.triml i
                                                        (Substring.full s))))
        44 in

      NOTE: Same trouble with Substring.all.






   The next step is to generate formatters from the ppg files and run sml to call
   Main.main(). These steps, which are also described in README.txt in the same
   directory, are as follows:

   1) Generate a formatter

      [wchae@hana MLParser]$ ../../bin/smlformat Absyn.ppg

      NOTE: This step creates Absyn.ppg.sml which is declared in “sources.cm”

   2) Test under smlnj (the lines following the “-” and “->” prompts are what we
      typed in the sml session)

       [wchae@hana MLParser]$ sml
       Standard ML of New Jersey v110.58 [built: Fri Sep 29 10:10:32 2006]
       - CM.make "sources.cm";
       …
       val it = true : bool
       - Main.main ();
       ->case x of SOME y =>(y + 1) * 2 | NONE => let val a=1 in f(g(h)) end;

       val it =
          case x
            of SOME y => y + 1 * 2
             | NONE =>
               let val a = 1 in f (g h) end
       ->

      NOTE: You can find more interesting sample texts in SampleInput.txt under
            SMLFormat-0.10/example/MLParser.






5. Quick guide

   This section will describe how to annotate format comments. There are two kinds
   of format comments:


      foo.ppg:

      (*%                                   <- type declaration header comment
       *)

      datatype exp =

       (*%                                  <- type expression comment
        * @format ( pattern ) template         pattern:  type pattern
        *)                                     template: format template

            Int of …


      1) Type declaration header comment – It instructs smlformat to generate
         formatters optionally with additional information. If this declaration is
         omitted, smlformat will not generate any formatter even when there are
         type expression comments.

      2) Type expression comment – It describes how to print the specific type; it
         consists of two components – type patterns and format templates.
             i. Type pattern: The type pattern corresponds to a type expression.
                 It is called a pattern because it behaves like a pattern in a
                 function definition. When generating formatter functions,
                 smlformat translates type patterns into formal arguments of the
                 formatters and matches them against the constructor arguments
                 in the type definition. In the following diagram, for example,
                 x in the type pattern matches int:

                              (*% @format ( x ) "Int" + x *)
                                            |
                                            |  pattern matched
                                            v
                                     Int of int

               ii. Format template: It specifies the format to be used in pretty printer
                   (i.e., formatter) with format expression which will be covered by the
                   next section. In the above example, a formatter is generated to print
                   the string “Int” followed by the value of x.
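
   The translation described above can be sketched by hand. The following is a
   toy model, not the code smlformat generates: the datatype fexp and the function
   render are our own simplified stand-ins for SMLFormat's format expressions and
   pretty printer, and we assume the formatter for int simply prints the number.

```sml
datatype exp = Int of int

(* Toy stand-in for format expressions: a string term or a space indicator. *)
datatype fexp = Term of string | Space

(* Conceptual result of the comment  @format (x) "Int" + x  on  Int of int:
   the type pattern x becomes the bound variable of the clause, and the
   template becomes the list on the right-hand side. *)
fun format_exp (Int x) = [Term "Int", Space, Term (Int.toString x)]

(* Toy renderer: a space indicator prints as one blank. *)
fun render fs =
  String.concat (List.map (fn Term s => s | Space => " ") fs)

val shown = render (format_exp (Int 7))   (* "Int 7" *)
```

   Separating the formatter (which builds a list) from the renderer (which lays it
   out) mirrors the split between generated formatters and the pretty printer
   library, in a much reduced form.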






6. Format expression

   A format expression is constructed by combining double-quote-enclosed strings
   and formatters. The following list explains these formatters:

   NOTE: ‘A =(n)=> B’ implies translation of a list of format expressions A into the
        string B (optionally with the number of columns.)

      1) ░ (white space): It concatenates two strings.

          ex. "Norwegian"░"Wood" => "NorwegianWood"

          NOTE: No space is inserted between the two strings.

      2) + (space indicator): It inserts white space.

          ex. "Norwegian" + "Wood" => "Norwegian Wood"

      3) number (preferred newline indicator): It specifies where to insert
         newline by giving priority. If the total length of a string exceeds the
         specified number of columns, a newline will be added according to the
         given priority. Smaller values mean higher priority.

          ex. "Hear" 1 "The" 2 "Wind" 3 "Sing"
              =(15)=> HearTheWindSing
              =(12)=> Hear
                      TheWindSing
              =(10)=> Hear
                      The
                      WindSing

          ex. "Hear" 1 "The" 1 "Wind" 3 "Sing"
              =(15)=> HearTheWindSing
              =(12)=> Hear
                      The
                      WindSing

          NOTE: If a newline is needed at one indicator, newlines are inserted at
               all indicators with the same priority. In the above example, even
               though TheWindSing fits in 12 columns, a newline is inserted
               between The and WindSing because the newline between Hear and The
               is required and both indicators have the same priority.

      4) d (deferred newline indicator): This is the newline indicator with the
         lowest priority. A newline is inserted only if the output does not fit.
         Moreover, unlike preferred newline indicators, there is no
         precedence (or dependency) between deferred newline indicators.

         ex. "Hear" 1 "The" d "Wind" d "Sing"
             =(15)=> HearTheWindSing
             =(10)=> Hear
                     TheWind
                     Sing

         NOTE: Even though the second d inserts a newline, the first does not.
         NOTE: d is reserved for this purpose, so we cannot use d as a name in a
              type pattern.

      5) { and } (guards): Braces delimit the scope of priorities.

         ex. "A" +1 "Wild" +2 "Sheep" +1 "Chase"
             =(15)=> A
                     Wild Sheep
                     Chase

         ex. { "A" +1 "Wild" +2 "Sheep" } +1 "Chase"
             =(15)=> A Wild Sheep
                     Chase

         NOTE: Guards delimit the scope of priorities locally, so even if they
              have the same value, a priority outside a guard is higher than one
              inside it.

      6) number [ and ] (indentation stack): Brackets preceded by a number define
         an indentation level that applies when a newline is inserted. The number
         specifies how many spaces to indent.

         ex. "The" +1 "Wind-up" +2 "Bird" +1 "Chronicle"
             =(15)=> The
                     Wind-up Bird
                     Chronicle

         ex. "The" 5[ +1 "Wind-up" +2 "Bird" ] +1 "Chronicle"
             =(15)=> The
                          Wind-up Bird
                     Chronicle

         ex. "The" 5[ +1 "Wind-up" 5[ +2 "Bird" ]] +1 "Chronicle"
             =(5)=> The
                         Wind-up
                              Bird
                    Chronicle

         ex. "The" 5[ +1 "Wind-up" ~5[ +2 "Bird" ]] +1 "Chronicle"
             =(5)=> The
                         Wind-up
                    Bird
                    Chronicle

         NOTE: “~” can be used for negative indentation.

         ex. "A" +d { "Wild" 5[ +1 "Sheep" +1 "Chase" ]}
             =(15)=> A Wild
                            Sheep
                            Chase

         NOTE (guards and indentation): The guard starts at Wild, so the base
              indentation starts at the position of Wild, not that of A.



      7) Ln, Rn, or Nn (association indicator): They specify the associativity and
         precedence. If necessary, parentheses will be inserted.

         ex. L1{ L1 { "Dance" + "Dance"} + "Dance" }
             => Dance Dance Dance

         ex. L2{ L1 { "Dance" + "Dance"} + "Dance" }
             => (Dance Dance) Dance

         ex. R2 { "Dance" + R1{"Dance" + "Dance"}}
             => Dance (Dance Dance)

      8) ! (cut): It suppresses the insertion of parentheses regardless of
         association indicators.

         ex. L1 { "The" + R1{"Elephant" + "Vanishes"}}
             => The (Elephant Vanishes)

         ex. L1 { "The" + !R1{"Elephant" + "Vanishes"}}
             => The Elephant Vanishes
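
   The preferred newline rule from item 3 can be modeled in a few lines. The
   following is a toy sketch under our own assumptions, not SMLFormat's
   algorithm: it handles only strings and preferred newline indicators (no space
   indicators, deferred newlines, guards, or indentation), and all names in it
   (item, Str, Break, layout, and so on) are hypothetical.

```sml
(* Break p is a preferred newline indicator; smaller p = higher priority. *)
datatype item = Str of string | Break of int

(* Width of a segment when printed on one line. *)
fun width items =
  List.foldl (fn (Str s, n) => n + String.size s | (Break _, n) => n) 0 items

(* Highest priority (smallest number) present in a segment, if any. *)
fun topPriority items =
  List.foldl (fn (Break p, NONE) => SOME p
               | (Break p, SOME q) => SOME (Int.min (p, q))
               | (Str _, acc) => acc) NONE items

(* Split a segment at every indicator with priority p. *)
fun splitAt p items =
  let
    fun go ([], cur, acc) = List.rev (List.rev cur :: acc)
      | go (Break q :: rest, cur, acc) =
          if q = p then go (rest, [], List.rev cur :: acc)
          else go (rest, Break q :: cur, acc)
      | go (x :: rest, cur, acc) = go (rest, x :: cur, acc)
  in
    go (items, [], [])
  end

fun flatten items =
  String.concat (List.mapPartial (fn Str s => SOME s | Break _ => NONE) items)

(* If the segment fits, print it on one line; otherwise break at all
   indicators of the highest priority and lay out each part again. *)
fun layout cols items =
  if width items <= cols then [flatten items]
  else case topPriority items of
         NONE => [flatten items]
       | SOME p => List.concat (List.map (layout cols) (splitAt p items))

val ex1 = [Str "Hear", Break 1, Str "The", Break 2, Str "Wind", Break 3, Str "Sing"]
val ex2 = [Str "Hear", Break 1, Str "The", Break 1, Str "Wind", Break 3, Str "Sing"]
val lines12 = layout 12 ex2   (* ["Hear", "The", "WindSing"] *)
```

   Breaking at every indicator of the same priority at once is what produces the
   behavior described in the NOTE of item 3: in ex2 the line is split after The
   even though TheWindSing alone would fit in 12 columns.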






7. Usage patterns

   This section describes frequently used annotation patterns for reference:

      1) Type

                 (*% *)
                 type number = (*% @format (value) value*) int

      2) Datatype

                 (*% *)
                 datatype value =
                          (*% @format (value) "vInt(" value ")" *) vInt of int
                        | (*% @format (value) "vStr(" value ")" *) vStr of string

      3) Tuple

                 (*% *)
                 type region = (*% @format (left * right)
                                 *         "left=" + left + ", right=" + right
                                 *)
                               int * int

      4) Record

                 (*% *)
                 type person = (*% @format ({name, age})
                                *           "{name=" name ", age=" age "}"
                                *)
                              {name:string, age:int}

      or

                 (*% *)
                 type person = (*% @format ({name:n, age:a})
                                *           "{name=" n ", age=" a "}"
                                *)
                              {name:string, age:int}






      5) List

       1 (*% *)
       2 datatype item = (*% @format (value) value *)
       3                  Item of string
       4                | (*% @format (value values) values(value) ("," +1) *)
       5                  Items of item list

      NOTE: In the definition of Items (line 5), item matches value while the list
           type constructor matches values in the type pattern. In the format
           template, values is applied to value with an additional format
           expression ("," +1), which specifies the separator “,” followed by a
           space (or a newline) between items. Thus, if item is Items
           [Item "A", Item "B", Item "C"], then the output will be "A, B, C"
           when it fits on one line.

                 4 (*% @format ( value values ) values(value) ("," +1) *)
                                 |     |
                                 |     |        pattern matched
                                 v     v
                 5 Items of      item  list



      6) List & Record

       1        (*% *)
       2        datatype books = (*%
       3                          * @format (bk bks) bks(bk) (";" +)
       4                          * @format:bk ({title:t, year:y})
       5                          *               "(title=" t ",year=" y ")"
       6                          *)
       7                          BOOKS of {title:string, year:int} list

      NOTE: The second format (line 4) is called a "local format tag" because
           such tags describe a locally defined type pattern (for example, bk),
           while the first, ordinary format is called the "primary format tag".

                     3    * @format (bk bks) bks(bk) (";" +)
                     4    * @format:bk ({title:t, year:y})
                     5    *               "(title=" t ",year=" y ")"
                     6    *)
                     7    BOOKS of {title:string, year:int} list

                     pattern matched:  bk  <->  {title:string, year:int}
                                       bks <->  list
                                       t   <->  title
                                       y   <->  year

           When the type pattern bk bks in the primary format tag is matched
           against {title:string, year:int} list, bk matches {title:string,
           year:int} and bks matches list. Then, by using the local format tag
           (i.e., @format:bk ...), t and y in {title:t, year:y} match title and
           year, respectively. If books is BOOKS [{title="Sputnik Sweetheart",
           year=1999}, {title="Kafka on the Shore", year=2002}], for example,
           then the output will be
           (title=Sputnik Sweetheart,year=1999); (title=Kafka on the
           Shore,year=2002).

      7) Datatype replication or custom formatter

       1        (*%
       2         * @formatter (Label.label) Label.format_label
       3         *)
       4         datatype value = (*% @format (value) value *)
       5                          LABEL of Label.label

      NOTE: In this example, Label.format_label function is set to format the
          datatype of Label.label (line 2). Label.format_label may be another
          generated formatter function or we may need to modify Label structure
          to have it export a custom format_label function. An alternative is the
          following:

              fun format_label l =
                  let val s = Label.name l
                  in [SMLFormat.FormatExpression.Term (String.size s, s)]
                  end

              (*% @formatter (Label.label) format_label *)
              datatype value = (*% @format (value) value *)
                              LABEL of Label.label

      NOTE: The first example specifies a formatter function defined elsewhere,
          while this example provides a custom formatter within the same
          structure (or file). A custom formatter for a type t should have the
          type t -> FormatExpression.expression list (here, t is Label.label).

      8) Default format tag

              (*% *)
              datatype value = vInt of int
                              | vStr of string

      NOTE: This may look like an error because there are no type expression
          comments at all. However, smlformat auto-generates default format tags
          for any missing comment. The auto-generated formatter for the above
          example looks very similar to the one in example 2.
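
   As a cross-check of the List & Record pattern (item 6), the following
   hand-written sketch produces the output stated there for the two sample books.
   It is an assumption-based equivalent, not generated code: it works on plain
   strings, ignores line breaking, and the function names format_bk and
   format_books are ours.

```sml
datatype books = BOOKS of {title : string, year : int} list

(* Corresponds to the local format tag:
   @format:bk ({title:t, year:y}) "(title=" t ",year=" y ")" *)
fun format_bk {title = t, year = y} =
  "(title=" ^ t ^ ",year=" ^ Int.toString y ^ ")"

(* Corresponds to the primary format tag: bks(bk) (";" +),
   i.e. apply the element formatter and separate elements by ";" and a space. *)
fun format_books (BOOKS bks) =
  String.concatWith "; " (List.map format_bk bks)

val sample =
  BOOKS [{title = "Sputnik Sweetheart", year = 1999},
         {title = "Kafka on the Shore", year = 2002}]

val () = print (format_books sample ^ "\n")
```

   Writing this by hand for every datatype is exactly the tedious work that the
   format comments replace.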





8. Case Study

   The current MLPolyR compiler uses at least eight intermediate languages (namely Ast,
   Absyn, Lambda, ANF, Closed ANF, Cluster, BBTree, and TraceTree). Three of them
   currently have pretty printers. This section shows how the SMLFormat library can be
   used in the MLPolyR compiler. We compare this with the current implementation.

      source  --parse-->           ast
              --elaborate-->       absyn
              --translate-->       lambda
              --convert-->         anf
              --optimize-->        anf
              --closure convert--> closed
              --clusterify-->      cluster
              --optimize-->        cluster
              --treeify-->         bbtree
              --trace schedule-->  tracetree
              --code generate-->   asm


   For demonstration purposes, only the Lambda intermediate language (whose structure
   is defined in lambda.sml) will be shown in this case study. However, the final goal
   is to apply SMLFormat to all intermediate languages.

      1) Preparation for adaptation (without using CM functionalities)
         Let us assume that we already installed smlformat and smlformatlib in the
         following locations:

               smlformat         /Users/wchae/working/SML/SMLFormat-0.10/bin
               smlformatlib      /Users/wchae/working/SML/SMLFormat-0.10/smlformatlib.cm
               mlpolyr           /Users/wchae/working/MLPolyR/

          Then, change Makefile to make smlformat work:

          - Makefile (/Users/wchae/working/MLPolyR/Makefile)
         + SMLFORMAT=/Users/wchae/working/SML/SMLFormat-0.10/bin/smlformat

         - all: runtime compiler
         + all: format runtime compiler
                   @echo all done

         + format:
         +         $(SMLFORMAT) *.ppg

           runtime:
                   (cd rt; $(MAKE); cd ..)

           compiler:
                   ml-build mlpolyr.cm Main.main mlpolyr

         Next, create lambda.ppg (which will be reviewed in the next section) based on
         lambda.sml and modify mlpolyr.cm to include smlformatlib.cm and
         lambda.ppg.sml (which smlformat generates) instead of lambda.sml:

         - mlpolyr.cm (/Users/wchae/working/MLPolyR/mlpolyr.cm)
          (* mlpolyr.cm
           *
           * CM description file for MLPolyR compiler.
           *
           * Copyright (c) 2005 by Matthias Blume (blume@tti-c.org)
           *)
          Library
          …
          is
               …
          +    /Users/wchae/working/SML/SMLFormat-0.10/smlformatlib.cm

          -    lambda.sml
          +    lambda.ppg.sml
                 …

      2) Preparation for adaptation (utilizing CM functionalities)
         Alternatively, we can teach CM how to deal with ppg files directly. CM can
         play the role of make in SML programming. There are many ways to do so,
         but one particular way (which does not require global modification to the
         SML/NJ installation itself) is presented here.

         First, we need to create “smlformat-tool.cm” and “smlformat.sml” which
         register ppg files for CM to recognize and handle. For example, we might put
         them into /Users/wchae/working/MLPolyR/tools.

         -     smlformat-tool.cm (tools/smlformat-tool.cm)
                Library
                       structure SMLFormatTool
                is
                       $smlnj/cm/tools.cm
                       smlformat.sml

         -     smlformat.sml (tools/smlformat.sml)
                structure SMLFormatTool = struct
                   val _ = Tools.registerStdShellCmdTool
                     { tool = "SMLFormat",
                       class = "smlformat",
                       suffixes = ["ppg"],
                       cmdStdPath = fn () => ("smlformat_path/smlformat", []),
                       template = NONE,
                       extensionStyle =
                         Tools.EXTEND [("sml", SOME "sml", fn too => too)],
                       dflopts = [] }
                end

         NOTE: smlformat should be in the path, or you should write the full path,
              such as “/Users/wchae/working/SML/SMLFormat-0.10/bin/smlformat”, in
              cmdStdPath. Alternatively, you can register the directory containing
              smlformat in SML/NJ’s lib/pathconfig file. We are also considering
              moving smlformatlib into the mlpolyr directories. This might cause
              a problem when we distribute the mlpolyr compiler, so we need to
              check the license of SMLFormat and smlformatlib.

         Then, modify mlpolyr.cm to include the above smlformat-tool.cm and list all
         ppg files.

         -   mlpolyr.cm (/Users/wchae/working/MLPolyR/mlpolyr.cm)
              (* mlpolyr.cm
               *
               * CM description file for MLPolyR compiler.
               *
               * Copyright (c) 2005 by Matthias Blume (blume@tti-c.org)
               *)
              Library
              …
              is
                   $/basis.cm
                   $/ml-yacc-lib.cm
                   $/smlnj-lib.cm
                   $smlnj/viscomp/basics.cm
                   $smlnj-tdp/back-trace.cm

        +          /Users/wchae/working/SML/SMLFormat-0.10/smlformatlib.cm

        +          tools/smlformat-tool.cm : tool
        +          ppg : suffix (smlformat)
                   …
                   label.sml
        -          lambda.sml
        +          lambda.ppg
                   …





           When you run CM.make, you can see the following messages, which confirm
           that CM now recognizes the ppg extension.

[attempting to load plugin (7011-export.cm):(mlpolyr.cm):tools/smlformat-tool.cm]
[scanning (7011-export.cm):(mlpolyr.cm):tools/smlformat-tool.cm]
[plugin (7011-export.cm):(mlpolyr.cm):tools/smlformat-tool.cm loaded successfully]
[/Users/wchae/working/SML/SMLFormat-0.10/bin/smlformat lambda.ppg]
[parsing (mlpolyr.cm):lambda.ppg.sml]

           NOTE: Which way is better? With CM handling ppg files, you do not need to
                distinguish ppg files from their generated sml files. In fact, you do
                not even need to know that the generated files exist as long as they
                have no errors. For example, whenever we use ml-lex and ml-yacc, we
                focus on the input files (i.e., foo.lex and foo.grm), not on the
                output files (i.e., foo.lex.sml and foo.grm.sml). The same practice
                should be applied to ppg files.

           NOTE: This way to teach CM to handle external tools is only applied locally,
                for the current project. If you want to install ppg support globally, you
                should consult the CM manual.

       3) lambda.ppg
          The following is the complete lambda.ppg file, which defines structure
          Lambda with format comments. A more detailed explanation follows the listing:

    - lambda.ppg (/Users/wchae/working/MLPolyR/lambda.ppg)
          1 (* lambda.sml
          2 *
          3 * The Lambda intermediate language of the MLPolyR compiler.
          4 * Lambda is the output of the Translate phase.
          5 *
          6 * Copyright (c) 2005 by Matthias Blume (blume@tti-c.org)
          7 *)
          8 structure Lambda = struct
          9 (* util for smlformat *)
         10 fun plus pri = [SMLFormat.FormatExpression.Indicator(
         11                {space = true,newline = SOME{priority =
                            SMLFormat.FormatExpression.Preferred(pri)}})]
         12 fun output s = [SMLFormat.FormatExpression.Term
                                                                 (String.size s, s)]
         13
         14 fun format_lvar' l = output (LVar.toString l)
         15 fun format_label l = output (Label.name l)
         16 fun format_integer i = output (LiteralData.toString i)
         17 fun format_bool' b =
         18       let val bstr = if b then "=" else ""
       19     in output bstr
       20     end
       21   fun format_args (formatter, prefix) =
       22     fn [] => output "()"
       23       | ls => List.concat [(SMLFormat.BasicFormatters.format_list
                                     (formatter, prefix)) ls]
       24
       25   (*% @formatter (LVar.lvar) format_lvar'
       26    * @formatter (Purity.purity) Purity.format_purity
       27    *)
       28   type lvar = (*% @format (value) value *) LVar.lvar
       29   type purity = (*% @format (value) value *) Purity.purity
       30
       31   (*% @formatter (Label.label) format_label
       32    * @formatter (LiteralData.integer) format_integer
       33    *)
       34   datatype value =
       35       (*% @format (value) value *)
       36       VAR of lvar
       37     | (*% @format (value) value *)
       38       LABEL of Label.label
       39     | (*% @format (value) value *)
       40       INT of LiteralData.integer
       41
       42   (*% @formatter (Oper.arithop) Oper.format_arithop
       43    * @formatter (Oper.cmpop) Oper.format_cmpop
       44    * @formatter (purity) Purity.format_purity
       45    * @formatter (Bool') format_bool'
       46    * @formatter (Args) format_args
       47    *)
       48   datatype exp =
       49       (*% @format (value) value *)
       50       VALUE of value
       51     | (*% @format (v * e1 * e2)
       52        *   !N0{ "let" {2[ + v + "=" ~2[+2 e1]]} +1
       53        *      { "in" 2[ +1 e2 ]} +1 "end" }
       54        *)
       55       LET of lvar * exp * exp
       56     | (*% @format (f fl * e)
       57        *   !N0{ "fix" + fl(f)(+1)
       58        *     +1 "in-fix"
       59        *     2[ +1 {e} ]
       60        *     +1 "end-fix" }
       61        *)
       62       FIX of function list * exp
       63     | (*% @format (aop * e1 * e2) N0{ aop 2[ + e1 ] 2[ +1 e2 ]} *)
       64       ARITH of Oper.arithop * exp * exp
       65     | (*% @format ({purity:p, len, slices:s sl})
       66        *    !N0{ "record" p + "(" len ")"
       67        *      +1 "{" 1[sl(s)("," +2)] +1 "}" }
       68        * @format:s(value) value
       69        *)
       70       RECORD of { purity: purity, len: exp, slices: slice list }
       71     | (*% @format (e1 * e2 * p)
       72        *    !N0{ "select" p
       73        *      2[ +d e2 ]
       74        *      2[ +2 "from" +d e1 ] }
       75        *)
       76       SELECT of exp * exp * purity
       77     | (*% @format (e1 * e2 * e3)
       78        *    !N0{ "update" 2[ +1 e2 ]
       79        *      +1 "to" + 2[ +1 e3 ]
       80        *      +1 "in" + 2[ +1 e1 ] }
       81        *)
       82       UPDATE of exp * exp * exp
       83     | (*% @format (cop * e1 * e2 * et * ef)
       84        *    !N0{ "if" + cop { 5[ + e1 +1 e2 ] }
       85        *      +1 "then" 2[ +d {et}]
       86        *      +1 "else" 2[ +d {ef}] }
       87        *)
       88       CMP of Oper.cmpop * exp * exp * exp * exp
       89     | (*% @format (p * e * arg args:Args)
       90        *    L10{ "app" p { + 2[e] }
       91        *        2[ +1 {args(arg)(+d)} ] }
       92        *)
       93       APP of purity * exp * exp list
       94   and slice =
       95       (*% @format (value) value *)
       96       SGT of exp
       97     | (*% @format ({base, start, stop})
       98        *    !N0{ base + "{" + start + "..." + stop + "}" }
       99        *)
      100       SEQ of { base: exp, start: exp, stop: exp }
      101   (* the boolean flag, when set to true, is a strong
      102    * hint to have this function inlined *)
      103   withtype function =
      104       (*% @format (f * v vl * e * inl:Bool')
      105        *    !N0{ f + "(" vl(v)(",") ")" + inl + "=" 0[+1 e ] }
      106        *)
      107       lvar * lvar list * exp * bool
      108 end




      4) Rationale
         This section describes our experience with format comment annotations in
         some detail:

             •  Type definitions
              9 (* util for smlformat *)
             12 fun output s = [SMLFormat.FormatExpression.Term
                                                              (String.size s, s)]
             13 …
             14 fun format_lvar' l = output (LVar.toString l)
             15 fun format_label l = output (Label.name l)
             16 fun format_integer i = output (LiteralData.toString i)
             17 fun format_bool' b =
             18       let val bstr = if b then "=" else ""
             19       in output bstr
             20       end
             …
             25 (*% @formatter (LVar.lvar) format_lvar'
             26    * @formatter (Purity.purity) Purity.format_purity
             27    *)
             28 type lvar = (*% @format (value) value *) LVar.lvar
             29 type purity = (*% @format (value) value *) Purity.purity

         A formatter declaration states which formatting function is used for each type
         (lines 25-26). In line 25, format_lvar', which is defined in line 14, is set
         for LVar.lvar, while Purity.format_purity, which is presumably defined in
         purity.ppg, is set for Purity.purity. In our experience, annotating a simple
         datatype with SMLFormat is too burdensome: it requires renaming the file as
         well as annotating the datatype. Therefore, when a structure already provides
         a string conversion (i.e., a toString function), we defined a custom formatter
         instead of creating another ppg file. That explains the first three custom
         formatters (lines 14-16) in lambda.ppg above. In the fourth case, we wanted
         to print something only when the value is true, so we needed the custom
         formatter format_bool' (lines 17-20).

         Sometimes there is no choice: if we do not have the source code or cannot
         change the source files (for example, the basis library), we must provide
         custom formatters ourselves. Even when source files are available, you may
         still need to modify the generated foo.ppg.sml (or the foo.ppg) file to add
         the signatures of the generated formatters.

         One more problem we want to mention is choosing the right name for a
         custom formatter. When we wrote the formatter for LVar.lvar (line 28), we
         first named it format_lvar (line 14), following the convention. However,
         smlformat also generated its own format_lvar from the comment (line 28),
         and that function called format_lvar whenever it saw LVar.lvar (as
         instructed by line 25). The result was a non-terminating program, and it
         took us a while to work out what was going on. This is why the custom
         formatter in the listing is named format_lvar'.
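
         The clash can be reproduced outside SMLFormat. The sketch below assumes
         that the generated formatter for type lvar simply forwards to whichever
         function is registered for LVar.lvar. The LVar structure is a hypothetical
         stand-in, and plain strings take the place of SMLFormat's format expressions.

```sml
(* Hypothetical stand-in for the real LVar structure. *)
structure LVar = struct
  type lvar = int
  fun toString (v : lvar) = "v_" ^ Int.toString v
end

(* Custom formatter. The prime keeps its name distinct from the
 * formatter that smlformat derives from "type lvar". *)
fun format_lvar' (l : LVar.lvar) : string = LVar.toString l

(* Assumed shape of the generated formatter: it forwards to the
 * function registered for LVar.lvar. *)
fun format_lvar (x : LVar.lvar) : string = format_lvar' x

(* Had the custom formatter itself been called format_lvar, the
 * generated definition would read
 *     fun format_lvar x = format_lvar x
 * and, because "fun" bindings are recursive in SML, it would call
 * itself forever instead of reaching the custom formatter. *)
val shown = format_lvar 7
```

         With distinct names the call terminates and reaches LVar.toString.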

             • Expressions
           17 fun format_bool' b =
           18      let val bstr = if b then "=" else ""
           19      in output bstr
           20      end
           21 fun format_args (formatter, prefix) =
           22      fn [] => output "()"
           23        | ls => List.concat [(SMLFormat.BasicFormatters.format_list
                                              (formatter, prefix)) ls]
           …
           42 (*% @formatter (Oper.arithop) Oper.format_arithop
           43   * @formatter (Oper.cmpop) Oper.format_cmpop
           44   * @formatter (purity) Purity.format_purity
           45   * @formatter (Bool') format_bool'
           46   * @formatter (Args) format_args
           47   *)
           ...
           89    | (*% @format (p * e * arg args:Args)
           90       *     L10{ "app" p { + 2[e] }
           91       *        2[ +1 {args(arg)(+d)} ] }
           92       *)
           93      APP of purity * exp * exp list
           …
          103 withtype function =
          104       (*% @format (f * v vl * e * inl:Bool')
          105        *     !N0{ f + "(" vl(v)(",") ")" + inl + "=" 0[+1 e ] }
          106        *)
          107       lvar * lvar list * exp * bool

         In line 93, exp list represents the actual arguments of a function application.
         In the Lambda language this list can be empty, which should be interpreted
         as a unit expression, (). However, the default format_list formatter does not
         treat the nil case specially. To print () properly, we defined the format_args
         formatter (lines 21-23) and declared the exp list to be of type Args (line 89),
         which line 46 maps to format_args.

         The type pattern inl (line 104) has type bool (line 107), so smlformat would
         normally call the default format_bool function. However, we wanted to print
         “=” when inl was true and nothing at all when it was false. To do that, we
         annotated inl as Bool' (line 104) and set Bool' to use the format_bool'
         formatter (lines 17-20) in place of the default format_bool (line 45).




      5) Hand-made Pretty Printer
         For comparison, we include the current hand-written implementation of the
         pretty printer:

         - prlambda.sml (/Users/wchae/working/MLPolyR/prlambda.sml)
               1 (* prlambda.sml
               2 *
               3 * Pretty-printing Lambda.
               4 *
               5 * Copyright (c) 2005 by Matthias Blume (blume@tti-c.org)
               6 *)
               7 structure PrintLambda : sig
               8
               9 val exp : (string -> unit) -> Lambda.exp -> unit
              10 val function : (string -> unit) -> Lambda.function -> unit
              11 val value : (string -> unit) -> Lambda.value -> unit
              12
              13 end = struct
              14
              15 structure L = Lambda
              16 structure O = Oper
              17
              18 fun var pr v = pr (LVar.toString v)
              19
              20 fun value pr (L.VAR v) = var pr v
              21      | value pr (L.LABEL l) = pr (Label.name l)
              22      | value pr (L.INT i) = pr (LiteralData.toString i)
              23
              24 fun list pr f [] = ()
              25      | list pr f [x] = f x
              26      | list pr f (h :: t) = (f h; pr ","; list pr f t)
              27
              28 fun spaces pr 0 = ()
              29      | spaces pr n = (pr " "; spaces pr (n-1))
              30
              31 fun function0 indent pr (f, vl, e, inl) =
              32        (pr "\n"; spaces pr indent;
              33         var pr f;
              34         pr " ("; list pr (var pr) vl; pr ")=";
              35         if inl then pr "=" else ();
              36         exp0 (indent+1) pr e)
              37
              38 and exp0 indent pr e =
              39        let fun purity Purity.Pure = ()
              40              | purity Purity.Impure = pr "!"
              41            fun ar () = pr " <- "
            42       fun indentation () = spaces pr indent
            43       val var = fn v => var pr v
            44       val value = fn x => value pr x
            45       fun newl s = (pr "\n"; indentation (); pr s)
            46       fun arithop aop = pr (O.astring aop)
            47       fun cmpop cop = pr (O.cstring cop)
            48
            49       fun exp e =
            50         (case e of
            51             L.VALUE x => (pr "VALUE ("; value x; pr ")")
            52           | L.LET (v, e1, e2) =>
            53              (pr "LET "; var v; pr " = "; exp0 (indent+1) pr e1;
            54               newl "IN "; exp0 (indent+1) pr e2;
            55               newl "END")
            56           | L.FIX (fl, e) =>
            57              (pr "FIX";
            58               app (function0 (indent+1) pr) fl;
            59               indentation ();
            60               newl "IN-FIX";
            61               exp0 (indent+1) pr e;
            62               newl "END-FIX")
            63           | L.ARITH (aop, e1, e2) =>
            64              (arithop aop;
            65               exp0 (indent+1) pr e1;
            66               exp0 (indent+1) pr e2)
            67           | L.RECORD {purity=p, len=x, slices=sl} =>
            68              (pr "RECORD "; purity p; pr "("; exp x; pr ")";
            69               newl "["; list pr slice sl;
            70               newl "]")
            71           | L.SELECT (e1, e2, p) =>
            72              (pr "SELECT "; purity p; exp0 (indent+1) pr e2;
            73               newl "FROM "; exp0 (indent+1) pr e1)
            74           | L.UPDATE (e1, e2, e3) =>
            75              (pr "UPDATE "; exp0 (indent+1) pr e2;
            76               newl "TO"; exp0 (indent+1) pr e3;
            77               newl "IN"; exp0 (indent+1) pr e1)
            78           | L.CMP (cop, e1, e2, et, ef) =>
            79              (pr "IF "; cmpop cop; exp0 (indent+1) pr e1;
                             exp0 (indent+1) pr e2         ;
            80               newl "THEN"; exp0 (indent+1) pr et;
            81               newl "ELSE"; exp0 (indent+1) pr ef)
            82           | L.APP (p, e, el) =>
            83              (pr "APP"; purity p;
            84               exp0 (indent+1) pr e;
                             newl "("; list pr (exp0 (indent+1) pr) el; newl ")"))
            85
             86        and expl e = (newl ""; exp e)
             87
             88        and slice (L.SGT e) = (newl "SGT "; exp0 (indent+1) pr e)
             89           | slice (L.SEQ { base, start, stop }) =
             90              (newl "SEQ "; exp base;
             91               newl "FROM"; exp0 (indent+1) pr start;
             92               newl "TO"; exp0 (indent+1) pr stop)
             93
             94      in expl e
             95      end
             96
             97 fun function pr f = (function0 0 pr f; pr "\n")
             98 fun exp pr e = exp0 0 pr e
              99
             100 end

         NOTE: Source lines of code (SLOC) is not a good measure, but it gives a rough
             idea. SMLFormat requires lambda.ppg (108 SLOC) and generates
             lambda.ppg.sml (151 SLOC), while the hand-made approach requires
             lambda.sml (34 SLOC) and prlambda.sml (100 SLOC), 134 SLOC in total.
             It is too early to say which approach wins. However, the advantage of
             keeping definition and formatting together in the same file seems to
             compensate for all the trouble we mentioned in the previous section.

      6) Build
         In the current implementation, the MLPolyR compiler calls the hand-made
         pretty printer, PrintLambda.function, in compile.sml. To call the generated
         formatter function, Lambda.format_function, instead, line 53 must be replaced:

         - compile.sml (/Users/wchae/working/MLPolyR/compile.sml)
             …
             (* before: *)
             53       val _ = PrintLambda.function print lambda
             (* after: *)
             53       val parameter =
             54          {
             55            spaceString = " ",
             56            newlineString = "\n",
             57            columns = 40
             58          }
             59       val _ = print (SMLFormat.prettyPrint parameter
                                           (Lambda.format_function lambda))

         NOTE: SMLFormat.prettyPrint takes one more argument, of type
              PrinterParameter.printerParameter, which defines the number of
              columns, etc.




      7) Test
         The following simple example is used for demonstration:

         -      fact.mlpr
             let fun fact n =
                   if n > 0 then n * fact (n-1)
                            else 1
             in fact 5
             end

         The result from smlformat is more compact because of its more flexible
         newline policy: newlines are inserted only when needed. In the hand-made
         pretty printer, newlines are inserted explicitly. Compare the results:

                               smlformat                   Hand-made pretty printer
             main_35 () =                                 main_35 ()=
             fix fact_32 (n_33) =                          fix
                let n_34 = n_33                              fact_32 (n_33)=
                in                                             let n_34 =
                  if > n_34 0                                    n_33
                  then (* n_34 app! fact_32 (- n_34 2))        in
                  else 2                                         if >
                end                                                n_34
             in-fix                                                0
               app! fact_32 10                                   then
             end-fix                                               *
                                                                     n_34
                                                                     app!
                                                                       fact_32
                                                                     (
                                                                       -
                                                                         n_34
                                                                         2
                                                                     )
                                                                 else
                                                                   2
                                                               end
                                                           in-fix
                                                             app!
                                                               fact_32
                                                             (
                                                               10
                                                             )
                                                           end-fix
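
         The difference between the two newline policies can be sketched in a few
         lines. This is not SMLFormat's algorithm, only an illustration of the idea,
         with a hypothetical function layout: a group of words stays on one line if it
         fits into the given number of columns and is broken up otherwise.

```sml
(* Lay out a group of words: keep them on one line if that fits in
 * `columns`, otherwise put each word on its own line.  The hand-made
 * printer corresponds to always taking the second branch. *)
fun layout (columns : int) (words : string list) : string =
  let
    val oneLine = String.concatWith " " words
  in
    if String.size oneLine <= columns
    then oneLine
    else String.concatWith "\n" words
  end

(* The same group under a wide and a narrow column limit. *)
val wide   = layout 40 ["if", ">", "n_34", "0"]
val narrow = layout 10 ["if", ">", "n_34", "0"]
```

         With 40 columns the group stays on one line, as in the smlformat column
         above. With 10 columns every word ends up on its own line.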




9. Summary (or Wish list?)

   While working with SMLFormat, we ran into several minor problems that we want
   to point out. We describe our workaround for each problem and, where appropriate,
   suggest an extension that would improve the current implementation of
   SMLFormat:

    1) Update SMLFormat to work with the latest SML/NJ compiler, so that the hacks
       mentioned in chapter 2 are no longer required.

    2) Support an explicit newline indicator. In SMLFormat, a newline is inserted
       only if required. However, some situations require a newline every time. For
       example, when we worked with one of MLPolyR’s intermediate languages, we
       got the following formatting result:

       l_2_area(idx_138,idx_139,idx_140,rec_141):
           rec_130 <- rec_141 idx_129 <- idx_140 idx_128 <- idx_139 idx_127
       <- idx_138 goto l_1_area

      What we expected is the following:

       l_2_area(idx_138,idx_139,idx_140,rec_141):
          rec_130 <- rec_141
          idx_129 <- idx_140
          idx_128 <- idx_139
          idx_127 <- idx_138
          goto l_1_area

      Here one of the most compelling features (making the output compact; see the
      case study) worked against us. To get what we wanted, we added a bogus run of
      spaces to the annotation, so that the total length of a line just exceeds the
      maximum:

            | (*% @format (le * e * p)
               *  !N0 { { le + "<-" + e} "                                       " +1 p}
               *)
              MOVE of lexp * exp * preblock (* assignment *)

      We are not sure whether there is a nicer way to deal with this problem; if there
      is not, an explicit newline indicator is needed.

    3) Support a “don’t care” term (i.e., a wildcard) in type patterns. We worked with
       datatypes that carry type information, such as BINOPexp in the following
       example:

            | (*% @format (b * eLeft * eRight * typ )
               *  !N0{ {eLeft} + b + {eRight} }
               *)
               BINOPexp of binop * exp * exp * typ

      But we did not want to print the typ field. We did not even want to make up a
      name for it, so we used “_” as in SML, but that was rejected by smlformat.
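
      For comparison, this is what the wildcard looks like in a hand-written
      formatter. A minimal sketch, with hypothetical constructors for binop and typ
      and with strings standing in for format expressions:

```sml
(* Hypothetical, cut-down versions of the types involved. *)
datatype binop = PLUS | TIMES
datatype typ = INTty | BOOLty
datatype exp = INTexp of int
             | BINOPexp of binop * exp * exp * typ

fun format_binop PLUS = "+"
  | format_binop TIMES = "*"

(* The typ component is matched by the wildcard "_": it is neither
 * named nor printed. *)
fun format_exp (INTexp i) = Int.toString i
  | format_exp (BINOPexp (b, eLeft, eRight, _)) =
      format_exp eLeft ^ " " ^ format_binop b ^ " " ^ format_exp eRight

val sample = format_exp (BINOPexp (PLUS, INTexp 1, INTexp 2, INTty))
```

      This is the behavior we would like a type pattern such as
      (b * eLeft * eRight * _) to have.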

   4) Support signatures. One of the modules defining an intermediate language had a
      signature. We annotated the implementation in the usual way, as follows:

          …
          15 structure FunctionClusters : sig
          16
          17 type clusters = { entrylabel: Label.label,
                                  clusters : Closed.cluster list }
          18
          19 val clusterify : Closed.program -> clusters
          20
          21 end = struct
          22
          …
          29 (*% @formatter (Label.label) format_label
          30    * @formatter (Closed.cluster) Closed.format_cluster
          31    *)
          32 type clusters =
          33       (*% @format ({entrylabel, clusters:c cl})
          34        *    "**ENTRYPOINT:" + entrylabel "                                    "
          35        *    +1 cl(c)(+1)
          36        *)
          37       { entrylabel: Label.label, clusters: Closed.cluster list }
          38

      However, the result was somewhat unexpected. format_clusters was generated as
      usual, but the signature was not modified, so the compiler reported the error
      “unbound variable or constructor: format_clusters in path
      FunctionClusters.format_clusters”.
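
      The error is ordinary SML behavior: a signature ascription hides every
      component that the signature does not mention. A minimal sketch, with
      hypothetical names (ClustersImpl, CLUSTERS, CLUSTERS_WITH_FORMAT) and with
      strings standing in for labels, clusters and format expressions. The
      workaround assumed here is to add the formatter to the signature by hand:

```sml
(* Implementation, including a formatter like the one smlformat
 * would generate (written by hand here, returning a string). *)
structure ClustersImpl = struct
  type clusters = { entrylabel : string, clusters : string list }
  fun format_clusters ({ entrylabel, clusters } : clusters) : string =
    "**ENTRYPOINT: " ^ entrylabel ^ "\n" ^ String.concatWith "\n" clusters
end

(* Signature as originally written: it does not mention the formatter. *)
signature CLUSTERS = sig
  type clusters = { entrylabel : string, clusters : string list }
end

(* Signature extended by hand with the formatter. *)
signature CLUSTERS_WITH_FORMAT = sig
  include CLUSTERS
  val format_clusters : clusters -> string
end

structure Hidden  : CLUSTERS             = ClustersImpl
structure Visible : CLUSTERS_WITH_FORMAT = ClustersImpl

(* Hidden.format_clusters is rejected by the compiler as unbound;
 * Visible.format_clusters is accessible. *)
val out = Visible.format_clusters
            { entrylabel = "main", clusters = ["c1", "c2"] }
```

      If smlformat updated the signature itself, this manual step would not be
      necessary.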

   5) Work around an awkward recursive situation. When we worked with the Ast
      intermediate language, a generated formatting function had a type error, and it
      took us a while to understand the problem. Here we explain it with a simpler
      example:

           1      (*% *)
           2      datatype exp = (*% @format "ONE" *) ONE
           3                    | (*% @format (exp field)
           4                       * "MANY" + field(exp)
           5                       *)
           6                      MANY of exp field
           7
           8    and pat = (*% @format "PONE" *) PONE
           9            | (*% @format (pat field)
          10               * "PMANY" + field(pat)
          11               *)
          12              PMANY of pat field
          13
          14     withtype 'a field = (*% @format (i * a) "(" i "," a ")" *) (int * 'a )

      Smlformat generates three formatting functions (format_exp, format_pat and
      format_field), but format_field does not type-check: the compiler infers
      (exp -> expression list) -> int * exp -> expression list for format_field
      instead of ('a -> expression list) -> int * 'a -> expression list. The reason is
      that SML lacks polymorphic recursion, so a function cannot be used at two
      different types inside its own recursive group. The following declarations
      illustrate the difference:

       expression                type

       fun h x = x               val h = fn : 'a -> 'a

       fun f x = h 5             val f = fn : 'a -> int
       and g x = h "hello"       val g = fn : 'a -> string

       fun f x = h 5             Error: operator and operand don't agree [literal]
       and g x = h "hello"         operator domain: int
       and h x = x                 operand:         string
                                   in expression:
                                     h "hello"

      How did we solve it? We realized that 'a field did not need to be defined
      mutually recursively, so we changed the definition and eliminated the
      unnecessary recursion.
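
      A sketch of this kind of fix, under the assumption that moving the field type
      and its formatter out of the recursive group is enough. Strings stand in for
      format expressions, and the formatter bodies are written by hand rather than
      generated:

```sml
(* The field type is declared before the datatypes instead of being
 * attached to them with "withtype". *)
type 'a field = int * 'a

datatype exp = ONE | MANY of exp field
     and pat = PONE | PMANY of pat field

(* Defined on its own, format_field is generalized to a polymorphic
 * type before the recursive group below uses it. *)
fun format_field (fmt : 'a -> string) ((i, a) : 'a field) : string =
  "(" ^ Int.toString i ^ "," ^ fmt a ^ ")"

(* Inside this group format_field is used at two different types,
 * exp and pat, which is fine because it is no longer a member. *)
fun format_exp ONE = "ONE"
  | format_exp (MANY f) = "MANY " ^ format_field format_exp f
and format_pat PONE = "PONE"
  | format_pat (PMANY f) = "PMANY " ^ format_field format_pat f

val e = format_exp (MANY (1, ONE))
val p = format_pat (PMANY (2, PONE))
```

      The same declarations with format_field joined to the group by "and" are
      rejected, for the reason shown in the table above.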

   6) Handle d in type pattern. When we worked with the Absyn intermediate language,
      we struggled with an error that the following annotation produced:

          datatype exp =
            (*% @format (def defs * exp)
             *      !N0{ {"let" 2[ +1 defs(def)(+1) ]} +1
             *            {"in" 2[ +2 exp ]} +1 "end" }
             * @format:def( d * td ) d
             *)
            LETexp of (def * Types.depth) list * exp

      Only after some time did we understand what the error message “syntax error:
      deleting FORMATINDICATOR ASTERISK ID” meant: d, which we used as a name in
      the type pattern, is also a format indicator. How about a more meaningful
      error message such as “Do not use FORMATINDICATOR in type pattern”? Or
      do not use d as a format indicator.



   7) Provide better diagnostics when a header comment is missing. Smlformat has
      several nice features, for example the smart newline indicator and the smart
      default format tag. However, these smart features can sometimes hide bugs.
      The following example shows one such case:

       1 (*%
       2 *)
       3 datatype exp =
       4          …
       5      and lexp =
       6               (* @format (value) "M[" value "]" *) MEM of exp
       7             | (* @format (value) value *) TEMP of temp
       8             | (* @format "$allocptr" *) ALLOCPTR

      It took us a long time to notice that “%” was missing in the type expression
      comments (lines 6, 7 and 8). To understand the potential problem, consider
      the following examples:

                         foo.ppg                               foo.ppg.sml

       (*% *)                                    (*% *)
       type item =                               type item =
       (*% @format (value) value *) int          (*% @format (value) value *) int
                                                 fun format_item x = …

       (*% *)                                    (*% *)
       type item = int                           type item = int
                                                 fun format_item x = …

       (*% *)                                    (*% *)
       type item =                               type item =
       (* @format (value) value *) int           (*% @format (value) value *) int
                                                 fun format_item x = …

       type item =                               type item =
       (*% @format (value) value *) int          (*% @format (value) value *) int

      The first example shows the general case, which has both a header comment and
      a type expression comment. The second example does not define any type
      expression comment, but the default mechanism still generates a reasonable
      formatter. However, this nice feature also applies to the third example, which
      has an ill-formed type expression comment (that is, “(*” should be followed by
      “%”), so it can hide bugs. The fourth example (we call it a dangling comment)
      has no header comment, and no formatter is generated for it. This happens
      really often, so some attention is required here: the header comment normally
      does not contain any formatter, so it is usually empty and is easily omitted by
      accident, which causes a lot of trouble later.

      Therefore, we would like to see warnings or notices whenever default formats
      are used or dangling comments appear. At the very least, we need a verbose
      option that tells us what smlformat is doing.




   8) Handle the nil case in lists. When we extended the MLPolyR language to have a
      reccases list in the FUNdef definition, we found that most of the time the
      reccases list is empty:

           | (*% @format (fun funs * rec recs * r)
              *  !N0{ "fun" 2[ +d funs(fun)( "|" +1 ) ]
                        +1 "with cases" + 2[ recs(rec)( "and" +1 ) ] }
              *)
             FUNdef of function list * reccases list * region

      In this situation we wanted to suppress the “with cases” output. This turned out
      not to be easy: there is a way to define the separator between list items, but
      no way to distinguish the empty list from non-empty lists. Consequently, we
      added a custom formatter to do the job:

           9    (* util for smlformat *)
          10     fun plus pri = [SMLFormat.FormatExpression.Indicator
          11                      …
          12     fun output s = [SMLFormat.FormatExpression.Term …
          13
          …
          20     fun formatRecc (formatter, prefix) =
          21        fn [] => output ""
          22         | ls => (* +1 "with cases" +1 + ...*)
          23           List.concat [plus 1, output "with cases", plus 2,
          24              (SMLFormat.BasicFormatters.format_list
                                                           (formatter, prefix)) ls]
          …
          68     (*% @formatter (prependedOpt) formatPrependedOpt
          69      * @formatter (Purity.purity) Purity.format_purity
          70      * @formatter (Recc) formatRecc
          71      *)
          72     datatype exp =
          …
         142     and def =
         …
         145       | (*% @format (fun funs * rec recs:Recc * r)
         146          *  !N0{ "fun" 2[ +d funs(fun)( "|" +1 ) ]
         147                2[ recs(rec)( "and" +1 ) ] }
         148          *)
         149         FUNdef of function list * reccases list * region
         …

      We treated rec recs as type Recc (line 145), which is set to use the formatRecc
      custom formatter (line 70). This formatter handles the empty list. We ran into
      this kind of problem in several places: for example, we wanted to print … only
      if a value was true, or to … only if an option had some value. We wonder
      whether our solution is the best one can do here. What if we could have basic
      control expressions in format templates? Going further, we could have real
      pattern matching in type patterns. That way, we could remove the custom
      formatter formatRecc:

          145       | (*% @format (fun funs * [] * r)
          146          *       !N0{ "fun" 2[ +d funs(fun)( "|" +1 ) ] }
          147          * @format (fun funs * rec recs * r)
          148          *      !N0{ "fun" 2[ +d funs(fun)( "|" +1 ) ]
          149                +1 "with cases" + 2[ recs(rec)( "and" +1 ) ] }
          150          *)
          151         FUNdef of function list * reccases list * region


      If the reccases list is empty, the first template (line 146) prints nothing for
      it; otherwise the second template (lines 148-149) prints the normal case. This
      mirrors how we handle a list by pattern matching in SML.
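
      The empty versus non-empty distinction that formatRecc makes can be sketched in
      plain SML, independently of SMLFormat. This is only an illustration over
      strings: formatCases and the separator strings are hypothetical names chosen
      for this sketch, not part of SMLFormat or of our compiler.

```sml
(* Hypothetical sketch: format a list of cases as a string.
   The "with cases" header is emitted only when the list is non-empty,
   which is the distinction a separator-only list template cannot express. *)
fun formatCases (formatter : 'a -> string) (cases : 'a list) : string =
  case cases of
      [] => ""
    | _  => " with cases " ^ String.concatWith " and " (map formatter cases)

val empty    = formatCases Int.toString []        (* "" *)
val nonEmpty = formatCases Int.toString [1, 2]    (* " with cases 1 and 2" *)
```

      The two clauses of the case expression correspond to the two @format
      templates proposed above.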

   9) Make type declaration header comments more flexible. At times the current
      comment syntax, especially for formatter declarations, felt too rigid. For
      example, we thought the following three styles of comments should be treated
      the same way, but they were not: only the first example produced the right
      formatting functions. Allowing header comments in the style of the second
      example, however, might conflict with the default format tag: smlformat could
      not decide where to generate formatting functions.

                             foo.ppg                                 foo.ppg.sml
       (*% @formatter (Item.item) Item.format_item *)       …
       type item = (*% @format (value) value *)             type item = … Item.item
                                                            fun format_item x = …
       (*% @formatter (Item.item) List.format_item *)       …
       type litm = (*% @format (value) value *) List.item   type litm = … List.item
                                                            fun format_litm x = …
       (*% @formatter (Item.item) Item.format_item          …
        * @formatter (List.item) List.format_item
        *)
       type item = (*% @format (value) value *) Item.item   type item = … Item.item
       type litm = (*% @format (value) value *) List.item   fun format_item x = …

                                                            type litm = … List.item
       (*% @formatter (Item.item) Item.format_item *)       t1.ppg:7.32-7.38 formatter of
       (*% @formatter (List.item) List.format_item *)       type not found:Item.item
       type item = (*% @format (value) value *) Item.item
       type litm = (*% @format (value) value *) List.item

  10) Pretty-print the generated code. Smlformat is a code generator, like lex or
      yacc, and we often had to read its awkward generated code. Here is some
      smlformat output that we encountered:

        | LET x => (case x of (v, e1, e2) =>
        (List.concat[[SMLFormat.FormatExpression.Guard (SOME{cut = true, strength = 0,
        direction = SMLFormat.FormatExpression.Neutral}, List.concat
        [[SMLFormat.FormatExpression.Term(3,"LET")],
        [SMLFormat.FormatExpression.Indicator({space = true, newline = NONE})],
        format_lvar(v),[SMLFormat.FormatExpression.Indicator({space = true, newline =
        NONE})],[SMLFormat.FormatExpression.Term(1,"=")],
        [SMLFormat.FormatExpression.StartOfIndent(2)],
        [SMLFormat.FormatExpression.Indicator({space = true, newline = SOME{priority =
        SMLFormat.FormatExpression.Preferred(1)}})],
        format_exp(e1),[SMLFormat.FormatExpression.EndOfIndent],
        [SMLFormat.FormatExpression.Indicator({space = true, newline = SOME{priority =
        SMLFormat.FormatExpression.Preferred(1)}})],
        [SMLFormat.FormatExpression.Guard (NONE, List.concat
        [[SMLFormat.FormatExpression.Term(2,"IN")],
        [SMLFormat.FormatExpression.StartOfIndent(2)],
        [SMLFormat.FormatExpression.Indicator({space = true, newline = SOME{priority =
        SMLFormat.FormatExpression.Preferred(1)}})],format_exp(e2),
        [SMLFormat.FormatExpression.EndOfIndent]])],
        [SMLFormat.FormatExpression.Indicator({space = true, newline = SOME{priority =
        SMLFormat.FormatExpression.Preferred(1)}})],
        [SMLFormat.FormatExpression.Term(3,"END")]])]]))
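
      For comparison, here is a hand-formatted sketch of the same LET case, showing
      the kind of layout we would have preferred. It rests on several assumptions:
      the structure FE is a stand-in that declares only the constructors visible in
      the listing above (the real SMLFormat.FormatExpression may differ), space and
      break are hypothetical helpers, and format_lvar and format_exp are stubbed
      over strings so that the fragment is self-contained.

```sml
(* Stand-in for SMLFormat.FormatExpression: only the constructors that appear
   in the generated code above. The real library's definitions may differ. *)
structure FE =
struct
  datatype priority = Preferred of int
  datatype assocDirection = Neutral
  datatype expression =
      Term of int * string
    | Indicator of {space : bool, newline : {priority : priority} option}
    | StartOfIndent of int
    | EndOfIndent
    | Guard of {cut : bool, strength : int, direction : assocDirection} option
               * expression list
end

(* Hypothetical helpers naming the two indicators that recur in the listing. *)
val space = [FE.Indicator {space = true, newline = NONE}]
fun break n =
  [FE.Indicator {space = true, newline = SOME {priority = FE.Preferred n}}]

(* Stubs standing in for the generated formatters of the sub-terms. *)
fun format_lvar (v : string) = [FE.Term (size v, v)]
fun format_exp (e : string) = [FE.Term (size e, e)]

(* The LET case, laid out by hand; same element sequence as the listing. *)
fun format_let (v, e1, e2) =
  [FE.Guard (SOME {cut = true, strength = 0, direction = FE.Neutral},
     List.concat
       [ [FE.Term (3, "LET")], space, format_lvar v, space,
         [FE.Term (1, "=")],
         [FE.StartOfIndent 2], break 1, format_exp e1, [FE.EndOfIndent],
         break 1,
         [FE.Guard (NONE,
            List.concat
              [ [FE.Term (2, "IN")],
                [FE.StartOfIndent 2], break 1, format_exp e2,
                [FE.EndOfIndent] ])],
         break 1,
         [FE.Term (3, "END")] ])]
```

      A short structure alias and named indicators already remove most of the
      noise; a generator could emit something similar.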

  11) Specify a style of parentheses. While we can define the associativity and its
      length, we cannot choose among various styles (e.g., [ ] or { } instead of
      ordinary ( ) ).

10. Acknowledgments
    We would like to thank Atsushi Ohori for introducing us to SMLFormat during his
    visit to TTI-C and UofC. Yamatodani Kiyoshi and the SML# Project members did a
    great job! This work was done within the MLPolyR project.



