Regular expressions

          Regular Expression

• A regular expression (RE) over an alphabet Σ is
  defined inductively
  a       an ordinary character
          from Σ
  ε       the empty string



         Regular Expression


R|S     = either R or S
RS      = R followed by S
          (concatenation)
R*      = concatenation of R
          zero or more times
          (R* = ε | R | RR | RRR | ...)


        RE Extensions



R?   = ε | R (zero or one R)
R+   = RR* (one or more R)




          RE Extensions


[abc] = a|b|c (any of the listed characters)
[a-z] = a|b|...|z (range)
[^ab] = c|d|... (anything but ‘a’ or ‘b’)
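
The shorthands on these two slides add no expressive power: each one expands into union, concatenation and star. A minimal sketch that compares each shorthand with its expansion using Python's `re` module; the helper name `matches`, the sample alphabet `abcde` and the range `[a-d]` are assumptions made for illustration, and ε is written as an empty alternative because `re` has no ε symbol.

```python
import re
from itertools import product

def matches(pattern, s):
    """True if the whole string s is in the language of pattern."""
    return re.fullmatch(pattern, s) is not None

# shorthand -> expansion that uses only |, concatenation and *
equivalents = {
    "a?": "(|a)",        # R? = ε | R
    "a+": "aa*",         # R+ = RR*
    "[abc]": "a|b|c",
    "[a-d]": "a|b|c|d",
}

# every string over {a, b, c, d, e} of length 0 to 3
samples = ["".join(t) for n in range(4) for t in product("abcde", repeat=n)]

for short, expanded in equivalents.items():
    same = all(matches(short, s) == matches(expanded, s) for s in samples)
    print(f"{short:6} vs {expanded:8} agree on all samples: {same}")

# [^ab] accepts any single character other than 'a' and 'b'
print([s for s in "abcde" if matches("[^ab]", s)])
```

The comparison is only over short sample strings, so it illustrates the identities rather than proving them.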
         Regular Expression

RE        Strings in L(R)
a         “a”
ab        “ab”
a|b       “a” “b”
(ab)*     “” “ab” “abab” ...
(a|ε)b    “ab” “b”
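
The rows of this table can be checked mechanically. A minimal sketch with Python's `re` module, assuming that ε is written as an empty alternative, so `(a|)b` stands for (a|ε)b; `in_language` is a hypothetical helper name.

```python
import re

def in_language(pattern, s):
    """True if the whole string s belongs to L(pattern)."""
    return re.fullmatch(pattern, s) is not None

# (RE, some strings expected in L(RE))
table = [
    ("a", ["a"]),
    ("ab", ["ab"]),
    ("a|b", ["a", "b"]),
    ("(ab)*", ["", "ab", "abab"]),
    ("(a|)b", ["ab", "b"]),
]

for pattern, strings in table:
    for s in strings:
        print(pattern, repr(s), in_language(pattern, s))
```

`re.fullmatch` is used rather than `re.search` because membership in L(R) concerns the whole string, not a substring.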
          Example: integers

• integer: a non-empty string
          of digits
• digit   = ‘0’|‘1’|‘2’|‘3’|‘4’|
            ‘5’|‘6’|‘7’|‘8’|‘9’
• integer = digit digit*
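
As a sketch of how these two definitions compose, the snippet below builds the pattern for `integer` from the pattern for `digit` by plain string concatenation and tests it with Python's `re` module. The function name `is_integer` is an assumption for illustration.

```python
import re

# digit = '0'|'1'|...|'9', written out as an explicit alternation
digit = "(" + "|".join("0123456789") + ")"
# integer = digit digit*
integer = digit + digit + "*"

def is_integer(s):
    """True if s is a non-empty string of digits."""
    return re.fullmatch(integer, s) is not None

print(is_integer("2013"))  # True
print(is_integer(""))      # False: at least one digit is required
print(is_integer("12a"))   # False
```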

        Example: identifiers

• identifier:
  string of letters or digits
  starting with a letter
• C identifier:
    [a-zA-Z_][a-zA-Z0-9_]*
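
The C identifier pattern can be used unchanged in Python's `re` module. A small sketch; `is_c_identifier` is a hypothetical helper, and the check is purely lexical, so it does not reject reserved words such as `while`.

```python
import re

C_IDENTIFIER = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")

def is_c_identifier(s):
    """True if s has the lexical shape of a C identifier."""
    return C_IDENTIFIER.fullmatch(s) is not None

for candidate in ["x", "_tmp1", "count_2", "2fast", "a-b", ""]:
    print(repr(candidate), is_c_identifier(candidate))
```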

                   Regular Definitions
• Writing a regular expression for some languages can be
  difficult, because the expression can become quite
  complex. In those cases, we may use regular definitions.
• We can give names to regular expressions, and we can use
  these names as symbols to define other regular
  expressions.

• A regular definition is a sequence of definitions of
  the form:
  d1 → r1          where each di is a distinct name and
  d2 → r2          each ri is a regular expression over the symbols in
      ...          Σ ∪ {d1, d2, ..., di-1}
  dn → rn
   Specification of Patterns for Tokens: Regular
                     Definitions

• Example:

  letter → A | B | … | Z | a | b | … | z
   digit → 0 | 1 | … | 9
      id → letter ( letter | digit )*

• digits → digit digit*




                    Regular Definitions (cont.)
• Ex: Identifiers in Pascal
       letter → A | B | ... | Z | a | b | ... | z
       digit → 0 | 1 | ... | 9
       id → letter (letter | digit)*
   – If we try to write the regular expression representing identifiers without using
     regular definitions, that regular expression will be complex.
       (A|...|Z|a|...|z) ( (A|...|Z|a|...|z) | (0|...|9) )*


• Ex: Unsigned numbers in Pascal
       digit → 0 | 1 | ... | 9
       digits → digit+
       opt-fraction → ( . digits )?
       opt-exponent → ( E (+|-)? digits )?
       unsigned-num → digits opt-fraction opt-exponent
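
Regular definitions map naturally onto named string constants, where each name is built only from names defined before it. A sketch of the unsigned-number definition in Python; the underscore names (`opt_fraction` and so on) are adaptations of the slide's hyphenated names, and `is_unsigned_num` is a hypothetical helper.

```python
import re

# Each definition refers only to names introduced above it.
digit = r"[0-9]"
digits = rf"{digit}+"
opt_fraction = rf"(\.{digits})?"
opt_exponent = rf"(E[+-]?{digits})?"
unsigned_num = rf"{digits}{opt_fraction}{opt_exponent}"

def is_unsigned_num(s):
    """True if s matches the unsigned-num definition."""
    return re.fullmatch(unsigned_num, s) is not None

for s in ["42", "3.14", "6E23", "1.5E-3", ".5", "1.", "E5"]:
    print(s, is_unsigned_num(s))
```

The last three inputs are rejected because the definition requires digits before the point, after the point, and before the exponent.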




Specification of Patterns for Tokens: Notational
                     Shorthand

• The following shorthands are often used:
  – + one or more instances
  – ? zero or one instance

        r+ = rr*
        r? = r | ε
     [a-z] = a | b | c | … | z

• Examples:
  digit → [0-9]
  num → digit+ (. digit+)? ( E (+|-)? digit+ )?

                     Definition


• For primitive regular expressions:

                  L(Ø) = Ø

                  L(ε) = {ε}

                  L(a) = {a}
              Definition (continued)


• For regular expressions r1 and r2:

          L(r1 + r2) = L(r1) ∪ L(r2)

          L(r1 · r2) = L(r1) L(r2)

          L(r1*) = (L(r1))*

          L((r1)) = L(r1)
              Concatenation of Languages

• If L1 and L2 are languages, we can define the
  concatenation
  L1L2 = {w | w = xy, x ∈ L1, y ∈ L2}
• Examples:
  – {ab, ba}{cd, dc} = {abcd, abdc, bacd, badc}
  – Ø{ab} = Ø
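
For finite languages, the definition of concatenation is a one-line set comprehension. A minimal sketch in Python, with languages represented as sets of strings; the function name `concat` is an assumption.

```python
def concat(l1, l2):
    """L1L2 = {xy | x in L1, y in L2} for finite languages given as sets."""
    return {x + y for x in l1 for y in l2}

print(sorted(concat({"ab", "ba"}, {"cd", "dc"})))  # ['abcd', 'abdc', 'bacd', 'badc']
print(concat(set(), {"ab"}))                       # set(): Ø concatenated with anything is Ø
```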
                     Kleene Closure

• L* = ∪ (i = 0 to ∞) L^i
     = L^0 ∪ L^1 ∪ L^2 ∪ …
• Examples:
  – {ab, ba}* = {ε, ab, ba, abab, abba, …}
  – Ø* = {ε}
  – {ε}* = {ε}
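
L* is usually infinite, so a program can only enumerate it up to a length bound. The sketch below does that for a finite language given as a Python set; the function name `kleene_star` and the bound `max_len` are assumptions introduced for the example.

```python
def kleene_star(lang, max_len):
    """Return the strings of L* that have length <= max_len."""
    result = {""}      # L^0 = {ε}
    frontier = {""}    # strings added in the previous round
    while frontier:
        # append one more factor from L; keep only new, short-enough strings
        longer = {x + y for x in frontier for y in lang if len(x + y) <= max_len}
        frontier = longer - result
        result |= frontier
    return result

print(sorted(kleene_star({"ab", "ba"}, 4), key=lambda w: (len(w), w)))
print(kleene_star(set(), 4))   # Ø* = {ε}
print(kleene_star({""}, 4))    # {ε}* = {ε}
```

The loop stops once a round adds nothing new, which must happen because only finitely many strings fit under the length bound.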
                       Example


• Regular expression     r = (0 + 1)* 00 (0 + 1)*


      L(r) = { all strings with at least
               two consecutive 0s }




                       Example


• Regular expression     r = (1 + 01)* (0 + ε)


      L(r) = { all strings without
               two consecutive 0s }




           Equivalent Regular Expressions


• Definition:

•   Regular expressions r1 and r2
    are equivalent if L(r1) = L(r2)




                       Example

•   L = { all strings without
          two consecutive 0s }

        r1 = (1 + 01)* (0 + ε)
        r2 = (1* 011*)* (0 + ε) + 1* (0 + ε)

        L(r1) = L(r2) = L, so r1 and r2 are
        equivalent regular expressions
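
Equivalence of two regular expressions can be sanity-checked by brute force on all short strings. The sketch below is a test, not a proof: it compares r1, r2 and the property "contains no 00" on every binary string up to a chosen length. The translation to Python `re` syntax (`|` for the slide's `+`, an empty alternative for ε) and the names `strings_up_to` and `agree` are assumptions.

```python
import re
from itertools import product

# '+' on the slide is union, written '|' here; ε is an empty alternative
r1 = r"(1|01)*(0|)"
r2 = r"(1*011*)*(0|)|1*(0|)"

def strings_up_to(n, alphabet="01"):
    """Yield every string over alphabet with length 0..n."""
    for k in range(n + 1):
        for t in product(alphabet, repeat=k):
            yield "".join(t)

def agree(n):
    """True if r1, r2 and 'no two consecutive 0s' agree on all strings of length <= n."""
    for w in strings_up_to(n):
        in_r1 = re.fullmatch(r1, w) is not None
        in_r2 = re.fullmatch(r2, w) is not None
        in_l = "00" not in w
        if not (in_r1 == in_r2 == in_l):
            return False
    return True

print(agree(10))
```

A single disagreeing string would disprove equivalence, while agreement up to length 10 only makes it plausible.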
                         Assignment
• Σ = {0, 1}
• What is the language for
  – 0*1*
• What is the regular expression for
  –   {w | w has at least one 1}
  –   {w | w starts and ends with same symbol}
  –   {w | |w|  5}
  –   {w | every 3rd position of w is 1}
  –   L+ = L^1 ∪ L^2 ∪ …
  –   L? (means an optional L)
Regular Expressions and Regular Languages




                Theorem


Languages generated by regular expressions = Regular languages




  Standard Representations
    of Regular Languages

[Figure: Regular Languages and their standard representations: FAs, NFAs, Regular Expressions]


Elementary Questions about Regular Languages



            Membership Question
Question:   Given regular language L
            and string w
            how can we check if w ∈ L?



Answer:     Take the DFA that accepts L
            and check if w is accepted
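
The membership test amounts to running the DFA on w one symbol at a time. A minimal sketch in Python; the dictionary layout and the example automaton `NO_00` (a DFA for the language of strings without two consecutive 0s used earlier in these slides) are assumptions made for illustration.

```python
def accepts(dfa, w):
    """Run the DFA on w and report whether it stops in a final state."""
    state = dfa["start"]
    for symbol in w:
        state = dfa["delta"][(state, symbol)]
    return state in dfa["finals"]

# q0: last symbol was not 0, q1: last symbol was 0, dead: "00" has been seen
NO_00 = {
    "start": "q0",
    "finals": {"q0", "q1"},
    "delta": {
        ("q0", "0"): "q1",     ("q0", "1"): "q0",
        ("q1", "0"): "dead",   ("q1", "1"): "q0",
        ("dead", "0"): "dead", ("dead", "1"): "dead",
    },
}

print(accepts(NO_00, "10101"))  # True
print(accepts(NO_00, "1001"))   # False
```

The check takes one table lookup per input symbol, so it is linear in the length of w.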

[Figure: the DFA is run on input w. If it stops in a final state, w ∈ L; otherwise w ∉ L]


Question:    Given regular language L
             how can we check
             if L is empty: (L = Ø)?




Answer:     Take the DFA that accepts     L

            Check if there is any path from
            the initial state to a final state
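
The emptiness check is a graph reachability problem on the DFA's transition graph. A sketch using breadth-first search; the function names and the two small example transition tables are assumptions for illustration.

```python
from collections import deque

def reachable_states(start, delta):
    """All states reachable from start; delta maps (state, symbol) -> state."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for (src, _symbol), dst in delta.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return seen

def is_empty(start, finals, delta):
    """L is empty exactly when no final state is reachable from start."""
    return not (reachable_states(start, delta) & set(finals))

# The final state q1 exists but cannot be reached from q0.
unreachable = {("q0", "0"): "q0", ("q0", "1"): "q0",
               ("q1", "0"): "q1", ("q1", "1"): "q1"}
print(is_empty("q0", {"q1"}, unreachable))  # True

# Redirecting one transition makes q1 reachable.
reachable = dict(unreachable)
reachable[("q0", "1")] = "q1"
print(is_empty("q0", {"q1"}, reachable))    # False
```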
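[Figure: a DFA in which a final state is reachable from the initial state, so L ≠ Ø, and a DFA in which no final state is reachable, so L = Ø]
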

Question:   Given regular language   L
            how can we check
            if L is finite?




Answer: Take the DFA that accepts        L

        Check if there is a walk with a cycle
        from the initial state to a final state;
        L is finite exactly when no such walk exists
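
One way to carry out this check, sketched below under stated assumptions: keep only the states that are reachable from the initial state and can also reach a final state, then look for a cycle among them with a depth-first search. The function name `is_finite` and the example automata are hypothetical, and the recursive search assumes a DFA small enough not to hit Python's recursion limit.

```python
def is_finite(start, finals, delta):
    """delta maps (state, symbol) -> state. True if the accepted language is finite."""
    edges, reverse = {}, {}
    for (src, _symbol), dst in delta.items():
        edges.setdefault(src, set()).add(dst)
        reverse.setdefault(dst, set()).add(src)

    def closure(sources, graph):
        seen, stack = set(sources), list(sources)
        while stack:
            s = stack.pop()
            for t in graph.get(s, ()):
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    # states that lie on some walk from the initial state to a final state
    useful = closure({start}, edges) & closure(set(finals), reverse)

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the DFS stack, finished
    colour = {s: WHITE for s in useful}

    def has_cycle(s):
        colour[s] = GREY
        for t in edges.get(s, ()):
            if t not in useful:
                continue
            if colour[t] == GREY or (colour[t] == WHITE and has_cycle(t)):
                return True
        colour[s] = BLACK
        return False

    return not any(colour[s] == WHITE and has_cycle(s) for s in useful)

# Accepts exactly {"ab"}; "dead" loops forever but cannot reach a final state.
only_ab = {
    ("s0", "a"): "s1",     ("s0", "b"): "dead",
    ("s1", "a"): "dead",   ("s1", "b"): "s2",
    ("s2", "a"): "dead",   ("s2", "b"): "dead",
    ("dead", "a"): "dead", ("dead", "b"): "dead",
}
print(is_finite("s0", {"s2"}, only_ab))   # True

# A loop on s1 gives the language a a* b, which is infinite.
with_loop = dict(only_ab)
with_loop[("s1", "a")] = "s1"
print(is_finite("s0", {"s2"}, with_loop))  # False
```

Restricting the search to useful states matters: the self-loop on `dead` is a cycle, but it lies on no accepting walk and so does not make the language infinite.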
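[Figure: a DFA with a cycle on a walk from the initial state to a final state (L is infinite), and a DFA with no such cycle (L is finite)]
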

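              From RE to ε-NFA



• For every regular expression R, we can
  construct an ε-NFA A such that L(A) = L(R).
• Proof by structural induction:

        Ø:   [figure: automaton with no path to a final state]

        ε:   [figure: automaton accepting only the empty string]

        a:   [figure: a single transition labeled a from the start state to the final state]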
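       From RE to ε-NFA


R+S:   a new start state has ε-transitions to the start states of R and S;
       the final states of R and S have ε-transitions to a new final state

RS:    the final state of R has an ε-transition to the start state of S

R*:    ε-transitions enter R from a new start state and leave R to a new
       final state; one ε-transition loops from the end of R back to its
       start, and one bypasses R entirely (zero repetitions)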
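
The inductive construction can be sketched as a short recursive function. The code below is one possible rendering, not necessarily the exact automata drawn on the slides: regular expressions are given as nested tuples (a representation assumed here), every fragment has one start and one final state, and the label `None` stands for ε. The names `build`, `eps_closure` and `accepts` are hypothetical.

```python
from itertools import count

def build(regex):
    """Return (start, final, delta); delta[(state, label)] is a set of states."""
    fresh = count()
    delta = {}

    def edge(src, label, dst):
        delta.setdefault((src, label), set()).add(dst)

    def go(node):
        kind = node[0]
        if kind == "concat":             # RS: final of R --ε--> start of S
            s1, f1 = go(node[1])
            s2, f2 = go(node[2])
            edge(f1, None, s2)
            return s1, f2
        start, final = next(fresh), next(fresh)
        if kind == "empty":              # Ø: no path from start to final
            pass
        elif kind == "eps":
            edge(start, None, final)
        elif kind == "sym":
            edge(start, node[1], final)
        elif kind == "union":            # R+S: branch into both, join afterwards
            for sub in node[1:]:
                s, f = go(sub)
                edge(start, None, s)
                edge(f, None, final)
        elif kind == "star":
            s, f = go(node[1])
            edge(start, None, s)         # enter R
            edge(f, None, s)             # repeat R
            edge(f, None, final)         # leave R
            edge(start, None, final)     # zero repetitions
        else:
            raise ValueError(f"unknown node kind: {kind}")
        return start, final

    start, final = go(regex)
    return start, final, delta

def eps_closure(states, delta):
    """All states reachable from states using ε-transitions only."""
    seen, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, None), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def accepts(nfa, w):
    """Simulate the ε-NFA on w by tracking the set of possible states."""
    start, final, delta = nfa
    current = eps_closure({start}, delta)
    for symbol in w:
        moved = set()
        for s in current:
            moved |= delta.get((s, symbol), set())
        current = eps_closure(moved, delta)
    return final in current

# (0+1)*1(0+1): the second-to-last symbol must be 1
zero_or_one = ("union", ("sym", "0"), ("sym", "1"))
example = ("concat", ("star", zero_or_one), ("concat", ("sym", "1"), zero_or_one))
nfa = build(example)
print(accepts(nfa, "0110"))  # True
print(accepts(nfa, "0101"))  # False
```

Keeping exactly one start and one final state per fragment is what lets each case of the induction wire sub-automata together with ε-transitions alone.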
                    Example: (0+1)*1(0+1)

[Figure: ε-NFA for (0+1)*1(0+1), assembled from the ε-NFAs for (0+1)* and (0+1), which are joined through a transition labeled 1]

Example : (a+b)*aba

				
Posted 1/14/2013, Mohammad Fasihuddin