Flex


        A lexical analyzer generator
         generates a scanner procedure directly from
          regular expressions and user-written procedures
        Steps to using flex
    1.    Create a description (rules) file for flex to
          operate on
    2.    Run flex on the input file. flex produces a C file
          called lex.yy.c containing the scanning function yylex().
    3.    Run the C compiler on lex.yy.c to produce a
          lexical analyzer

Flex Files and Procedure


    Rule file (*.l)   ->  flex compiler       ->  lex.yy.c (scanner in C code)
    lex.yy.c          ->  C compiler (-lfl)   ->  scanner.exe
    Test file         ->  scanner.exe         ->  tokens


Flex Programs
The flex input file consists of three sections
    separated by a line with just %%


                    %{
                    auxiliary declarations
                    %}
                    regular definitions
                    %%
                    translation rules
                    %%
                    auxiliary procedures
Regular Expression Definitions Section
   The definitions section contains
    declarations of simple name definitions
    to simplify the scanner specification.
   Name definitions have the form:
        name definition
   Example:
        DIGIT       [0-9]
        ID          [a-z][a-z0-9]*


Translation Rules Section
            P1         action1
            P2         action2
                 ...
            Pn         actionn

   where each Pi is a regular expression and each
   actioni is a C program segment executed when Pi matches



Auxiliary Procedure Section

   The auxiliary procedures section is copied verbatim to lex.yy.c.
   This section is optional;
       if it is missing, the second %% in the input file
        may be omitted.
   In the definitions and rules sections, any
    indented text or text
       enclosed in %{ and %}
       is copied to the output (with the %{ and %} removed).


Rules
   Look for the longest token (longest match)
       e.g. a number: consume all of its digits
   If several patterns match the longest token,
    choose the first-listed pattern
       e.g. keywords listed before identifiers
   List frequently occurring patterns first
       e.g. white space
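
The first two rules above can be sketched in plain C. This is a hand-written illustration, not the code flex generates: the matcher functions, the token names (T_IF, T_ID, ...) and next_token() are all assumptions made for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Token codes used only in this sketch. */
enum { T_NONE, T_WS, T_IF, T_ID, T_NUMBER };

/* Each matcher returns the length of the longest prefix of s it accepts. */
static size_t match_ws(const char *s) {
    size_t n = 0;
    while (s[n] == ' ' || s[n] == '\t' || s[n] == '\n') n++;
    return n;
}

static size_t match_if(const char *s) {
    return strncmp(s, "if", 2) == 0 ? 2 : 0;
}

static size_t match_id(const char *s) {
    size_t n = 0;
    if (!isalpha((unsigned char)s[0])) return 0;
    while (isalnum((unsigned char)s[n])) n++;
    return n;
}

static size_t match_number(const char *s) {
    size_t n = 0;
    while (isdigit((unsigned char)s[n])) n++;
    return n;
}

/* Rules in listing order: white space first, keyword before identifier. */
static const struct { int token; size_t (*match)(const char *); } rules[] = {
    { T_WS, match_ws }, { T_IF, match_if },
    { T_ID, match_id }, { T_NUMBER, match_number },
};

/* Pick the longest match; on a tie the earlier rule wins (strict '>'). */
int next_token(const char *s, size_t *len) {
    assert(s != NULL && len != NULL);
    int best = T_NONE;
    *len = 0;
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = rules[i].match(s);
        if (n > *len) { *len = n; best = rules[i].token; }
    }
    return best;
}
```

With this ordering, the input "if(" is reported as the keyword (both patterns match two characters, and the keyword is listed first), while "iffy" is reported as an identifier because the identifier pattern matches more characters.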



Rules
   View keywords as exceptions to the rule
    of identifiers
       construct a keyword table
   Lookahead operator: r1/r2 - match a string
    in r1 only if it is followed by a string in r2
       DO 5 I = 1.25     (Fortran assignment to the variable DO5I)
       DO 5 I = 1,25     (Fortran DO loop)
       DO/({letter}|{digit})*=({letter}|{digit})*,
       (the pattern assumes blanks have already been removed)
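
The keyword-table idea above might look roughly like this in C. It is a minimal sketch: the token codes and the function name classify_word() are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes for this sketch. */
enum { TOK_ID = 256, TOK_IF, TOK_ELSE, TOK_WHILE };

/* The keyword table: every reserved word and the token it stands for. */
static const struct { const char *lexeme; int token; } keywords[] = {
    { "if", TOK_IF }, { "else", TOK_ELSE }, { "while", TOK_WHILE },
};

/* Classify a lexeme that the identifier pattern has already matched:
   a table hit means keyword, anything else is an ordinary identifier. */
int classify_word(const char *lexeme) {
    assert(lexeme != NULL);
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(lexeme, keywords[i].lexeme) == 0)
            return keywords[i].token;
    return TOK_ID;
}
```

With such a table, the rules file needs only the identifier pattern, whose action could return classify_word(yytext) instead of listing one rule per keyword.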



Functions and Variables
   yylex()
     a function implementing the lexical analyzer and returning
      the token matched

   yytext
     a global pointer variable pointing to the lexeme matched



   yyleng
     a global variable giving the length of the lexeme matched



   yylval
     an external global variable storing the attribute of the token




Example
   %{
   /* token codes and attribute values */
   #define LE         25
   ...
   /* helpers defined in the auxiliary procedures section */
   int install_id(void);
   int install_num(void);
   %}
   delim      [ \t\n]
   ws         {delim}+
   letter     [A-Za-z]
   digit      [0-9]
   id         {letter}({letter}|{digit})*
   number     {digit}+(\.{digit}+)?(E[+\-]?{digit}+)?
   %%
   {ws}                 { /* no action and no return */ }
   if                   {return (IF);}
   else                 {return (ELSE);}
   {id}                 {yylval=install_id(); return (ID);}
   {number}             {yylval=install_num(); return (NUMBER);}
   "<="                 {yylval=LE; return (RELOP);}
   "=="                 {yylval=EQ; return (RELOP);}
    ...
   <<EOF>>              {return 0; /* 0 signals end of input; do not redefine stdio's EOF */}
   %%
   int install_id(void) { ... }
   int install_num(void) { ... }
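
The bodies of install_id() and install_num() are elided in the example. As an illustration only, and not the author's implementation, a helper of this kind might store each lexeme in a table and return its index as the token attribute. The name install_lexeme(), the table sizes and the linear search are all assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 64   /* assumed capacity of the table */
#define MAX_LEXEME  32   /* assumed maximum lexeme length, including '\0' */

static char symtab[MAX_SYMBOLS][MAX_LEXEME];
static int  nsymbols = 0;

/* Return the table index of the lexeme, adding it on first sight.
   Returns -1 if the lexeme is too long or the table is full. */
int install_lexeme(const char *lexeme, size_t len) {
    assert(lexeme != NULL);
    if (len >= MAX_LEXEME) return -1;
    for (int i = 0; i < nsymbols; i++)
        if (strlen(symtab[i]) == len && memcmp(symtab[i], lexeme, len) == 0)
            return i;                       /* already installed */
    if (nsymbols == MAX_SYMBOLS) return -1;
    memcpy(symtab[nsymbols], lexeme, len);  /* copy: the scanner reuses its buffer */
    symtab[nsymbols][len] = '\0';
    return nsymbols++;
}
```

Inside a flex action, such a helper could be called as install_lexeme(yytext, yyleng), since yytext only points at the lexeme until the next token is scanned.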



Lexical Error Recovery

   Error: none of the patterns matches a prefix
    of the remaining input
   Panic mode error recovery
       delete successive characters from the remaining
        input until the pattern-matching can continue
   Error repair:
       delete an extraneous character
       insert a missing character
       replace an incorrect character
       transpose two adjacent characters
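
Panic mode recovery, as described above, can be sketched as a small C function. The test for "pattern matching can continue" is simplified here to "the next character is a letter, digit or blank"; that test and the names can_start_token() and panic_skip() are assumptions of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Stand-in for "some pattern matches here": in this sketch a token
   can start with a letter, a digit or white space. */
static int can_start_token(char c) {
    return isalnum((unsigned char)c) || c == ' ' || c == '\t' || c == '\n';
}

/* Panic mode: return how many characters must be deleted from the
   front of s before pattern matching can continue. */
size_t panic_skip(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0' && !can_start_token(s[n]))
        n++;
    return n;
}
```

In a flex rules file the same effect is commonly obtained with a final catch-all rule for a single character (the pattern .), whose action reports the error and lets scanning resume at the next character.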
Maintaining Line Number
   Flex can maintain the number of the
    current line in the global variable yylineno
    when the following option is given

       %option yylineno

    in the first (definitions) section
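
What this option does for the scanner can be pictured as counting newlines in every matched lexeme. The sketch below is hand-written and is not flex's generated code; line_number and count_lines() are hypothetical names standing in for yylineno and flex's internal bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

/* Plays the role of yylineno in this sketch; lines are numbered from 1. */
static int line_number = 1;

/* Call once per matched lexeme: every newline in the lexeme
   moves the counter on to the next line. */
void count_lines(const char *lexeme, size_t len) {
    assert(lexeme != NULL);
    for (size_t i = 0; i < len; i++)
        if (lexeme[i] == '\n')
            line_number++;
}
```

Without the option, the same bookkeeping can be done by hand in the action of a rule that matches \n.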



Flex : Regular Expression
x        match the character 'x'
.        any character (byte) except newline
[xyz]    a "character class"; in this case, the pattern
         matches either an 'x', a 'y', or a 'z'
[abj-oZ]  a "character class" with a range in it; matches
         an 'a', a 'b', any letter from 'j' through 'o',
         or a 'Z'
[^A-Z]   a "negated character class", i.e., any character
         but those in the class. In this case, any
         character EXCEPT an uppercase letter.
[^A-Z\n] any character EXCEPT an uppercase letter or
         a newline




Flex : Regular Expression
r*          zero or more r's, where r is any regular expression
r+           one or more r's
r?           zero or one r's (that is, "an optional r")
r{2,5}      anywhere from two to five r's
r{2,}        two or more r's
r{4}         exactly 4 r's
{name}       the expansion of the "name" definition
            (see above)
"[xyz]\"foo" the literal string: [xyz]"foo
\X          if X is an 'a', 'b', 'f', 'n', 'r', 't', or 'v',
             then the ANSI-C interpretation of \x.
             Otherwise, a literal 'X' (used to escape
             operators such as '*')



Flex : Regular Expression

\0     a NUL character (ASCII code 0)
\123   the character with octal value 123
\x2a   the character with hexadecimal value 2a
(r)    match an r; parentheses are used to override
       precedence (see below)
rs     the regular expression r followed by the
       regular expression s; called "concatenation"
r|s    either an r or an s
^r     an r, but only at the beginning of a line (i.e.,
       when just starting to scan, or right after a
       newline has been scanned).
r$     an r, but only at the end of a line (i.e., just
        before a newline). Equivalent to "r/\n".
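
The escape entries in the tables above (\X, \0, \123, \x2a) can be illustrated with a small decoder in C. This is a sketch, not flex's own code: decode_escape() is a hypothetical name, and the limits of three octal digits and two hexadecimal digits are assumptions of the sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Decode one escape. s points just past the backslash. The number of
   characters consumed is stored in *used; the character value is returned. */
int decode_escape(const char *s, size_t *used) {
    assert(s != NULL && used != NULL && s[0] != '\0');
    int value = 0;
    size_t n = 0;
    if (s[0] >= '0' && s[0] <= '7') {
        /* \0, \123: up to three octal digits */
        while (n < 3 && s[n] >= '0' && s[n] <= '7')
            value = value * 8 + (s[n++] - '0');
    } else if (s[0] == 'x' && isxdigit((unsigned char)s[1])) {
        /* \x2a: up to two hexadecimal digits after the x */
        n = 1;
        while (n < 3 && isxdigit((unsigned char)s[n])) {
            int c = tolower((unsigned char)s[n++]);
            value = value * 16 + (isdigit(c) ? c - '0' : c - 'a' + 10);
        }
    } else {
        n = 1;
        switch (s[0]) {
        case 'a': value = '\a'; break;
        case 'b': value = '\b'; break;
        case 'f': value = '\f'; break;
        case 'n': value = '\n'; break;
        case 'r': value = '\r'; break;
        case 't': value = '\t'; break;
        case 'v': value = '\v'; break;
        default:  value = (unsigned char)s[0]; break;  /* literal X, e.g. \* */
        }
    }
    *used = n;
    return value;
}
```

For example, the text after the backslash in \123 decodes to the character with octal value 123, and \* decodes to a literal '*', which is how an operator character is escaped in a pattern.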
Execute Flex

   Create a directory in Cygwin
       Example: /usr/src/compiler
   Download calc.l or c.l
   Execute flex
       flex calc.l
   lex.yy.c
       will be generated


