L08 by wuzhengqin

									Regular Expressions

     Lisa Higham
• Automata (hardware)
  • good for recognizing (deciding) a given language
  • does the machine accept a given string?

• Software models
  • good for generating strings in a language
  • produce strings that are in a given language

For regular languages there are two simple, natural ways
to generate strings:

     regular grammars (later)

     regular expressions (today)

Regular expressions are defined inductively.

• say what the basic building blocks are

• say how to combine building blocks to get to
more elaborate members




Let Σ be an alphabet.

Basis:
  • each symbol a in Σ is a regular expression
  • ∅ is a regular expression

Inductive part:
  If α and β are regular expressions over Σ, then:
  • (α + β) is a regular expression
  • (αβ) is a regular expression
  • (α)* is a regular expression

Closure:
  Nothing else is a regular expression.
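
The inductive definition can be mirrored by a small data type: one case per basis rule and one per inductive rule. The following is a minimal sketch, not part of the lecture; the class names (Empty, Sym, Plus, Concat, Star) and the helper show are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Empty:
    """Basis: the expression ∅."""


@dataclass(frozen=True)
class Sym:
    """Basis: a single symbol a of the alphabet."""
    char: str


@dataclass(frozen=True)
class Plus:
    """Inductive part: (α + β)."""
    left: "Regex"
    right: "Regex"


@dataclass(frozen=True)
class Concat:
    """Inductive part: (αβ)."""
    left: "Regex"
    right: "Regex"


@dataclass(frozen=True)
class Star:
    """Inductive part: (α)*."""
    inner: "Regex"


# Closure: a Regex is one of the five cases above and nothing else.
Regex = Union[Empty, Sym, Plus, Concat, Star]


def show(r: Regex) -> str:
    """Render r fully parenthesized, following the definition literally."""
    if isinstance(r, Empty):
        return "∅"
    if isinstance(r, Sym):
        return r.char
    if isinstance(r, Plus):
        return f"({show(r.left)} + {show(r.right)})"
    if isinstance(r, Concat):
        return f"({show(r.left)}{show(r.right)})"
    if isinstance(r, Star):
        return f"({show(r.inner)})*"
    raise TypeError(f"not a regular expression: {r!r}")


a, b = Sym("a"), Sym("b")
example = Concat(Concat(Plus(a, b), Concat(a, a)), Plus(a, b))
print(show(example))
```

Because every value is built only from these constructors, the "nothing else" clause is enforced by construction.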
Regular expressions represent languages
  Regular Expression   Language represented

  ∅                    ∅

  a                    {a}

  (α + β)              L (α) ∪ L (β)

  (αβ)                 L (α) L (β)

  (α)*                 L (α)*
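
The table reads as a recursive function from expressions to languages. Since L (α)* is usually infinite, the sketch below only computes the strings of length at most n. The tuple encoding ("empty",), ("sym", a), ("union", r, s), ("concat", r, s), ("star", r) and the function name lang are assumptions made for this illustration.

```python
def lang(r, n):
    """Return the set of strings in L(r) that have length at most n."""
    kind = r[0]
    if kind == "empty":                      # ∅ represents the empty language
        return set()
    if kind == "sym":                        # a represents {a}
        return {r[1]} if n >= 1 else set()
    if kind == "union":                      # L(α) ∪ L(β)
        return lang(r[1], n) | lang(r[2], n)
    if kind == "concat":                     # L(α) L(β)
        return {x + y
                for x in lang(r[1], n)
                for y in lang(r[2], n)
                if len(x + y) <= n}
    if kind == "star":                       # L(α)*
        base = lang(r[1], n) - {""}          # ε adds nothing new under *
        result = {""}
        frontier = {""}
        while frontier:
            # Extend the newest strings by one more factor from L(α).
            frontier = {x + y
                        for x in frontier
                        for y in base
                        if len(x + y) <= n} - result
            result |= frontier
        return result
    raise ValueError(f"unknown expression: {r!r}")


a, b = ("sym", "a"), ("sym", "b")
print(sorted(lang(("star", ("union", a, b)), 2)))
```

Note that lang(("star", ("empty",)), n) returns {""}: the star of the empty language contains exactly the empty string.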
       Regular expressions examples
Let Σ = {a,b}

Regular Expression        Language represented

( ((a+b)(aa)) (a+b) )
                           ({a} ∪ {b}){aa}({a} ∪ {b})

                           = Σ {aa} Σ

i.e. the set of all strings of length four whose
middle two symbols are “a”
      Regular expressions examples
Let Σ = {a,b}

 ( ( ( ((a+b)*) (aa) ) (a+b)) + (∅)* )

     ({a} ∪ {b})*{aa}({a} ∪ {b}) ∪ {ε}

         = Σ* {aa} Σ ∪ {ε}

 i.e. the set of all strings that end in aab or aaa, together with
 the empty string.
  Shortcuts and Simplifications
1. Use precedence to remove parentheses (highest precedence first):
  *    star
  ·    concatenation
  +    union

  Similar to arithmetic: xy + 7 means ((xy) + 7)

( ( ( ((a+(b+c))*) (aa) ) (a+b)) + (∅)* )
       becomes
(a+b+c)* aa (a+b) + ∅*
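
The precedence rule can be sketched as a printer that adds parentheses only where a lower-precedence operator sits inside a higher-precedence one. This is an illustrative sketch: the tuple encoding of expressions and the names PREC and pretty are assumptions, and dropping the parentheses in a+(b+c) relies on union being associative.

```python
# Higher number = binds tighter. Symbols and ∅ never need parentheses.
PREC = {"union": 1, "concat": 2, "star": 3, "sym": 4, "empty": 4}


def pretty(r, context=0):
    """Render r with as few parentheses as the precedence rules allow.

    `context` is the precedence required by the surrounding operator;
    r is parenthesized only if its own precedence is lower.
    """
    kind = r[0]
    if kind == "sym":
        text = r[1]
    elif kind == "empty":
        text = "∅"
    elif kind == "union":
        text = pretty(r[1], 1) + "+" + pretty(r[2], 1)
    elif kind == "concat":
        text = pretty(r[1], 2) + pretty(r[2], 2)
    elif kind == "star":
        # The operand of * must be a single symbol or a parenthesized group.
        text = pretty(r[1], 4) + "*"
    else:
        raise ValueError(f"unknown expression: {r!r}")
    return "(" + text + ")" if PREC[kind] < context else text


a, b, c = ("sym", "a"), ("sym", "b"), ("sym", "c")
full = ("union",
        ("concat",
         ("concat", ("star", ("union", a, ("union", b, c))), ("concat", a, a)),
         ("union", a, b)),
        ("star", ("empty",)))
print(pretty(full))
```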
   Shortcuts and Simplifications
2. Abbreviations:

   ε abbreviates ∅*

   Σ abbreviates (a + b + c + d) for Σ = {a, b, c, d}

         (a+b+c+d)* aa (a+b) + ∅*
         becomes
         Σ* aa (a+b) + ε
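
Since ε and Σ are only abbreviations, they can always be expanded back into the core syntax by plain textual substitution. A minimal sketch, assuming the alphabet symbols themselves are ordinary characters other than Σ and ε; the function name expand is hypothetical.

```python
def expand(expr, alphabet):
    """Replace the abbreviations Σ and ε by the expressions they stand for."""
    sigma = "(" + " + ".join(alphabet) + ")"   # Σ abbreviates (a + b + ...)
    return expr.replace("Σ", sigma).replace("ε", "∅*")


print(expand("Σ* aa (a+b) + ε", ["a", "b", "c", "d"]))
```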

  Shortcuts and Simplifications
3. Cautions:
   Regular Expression   Simplified
     Rε                   R
     R+∅                  R
     R∅                   ∅

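Each of these simplifications follows from the way regular expressions represent languages (union for +, concatenation of languages for juxtaposition). A short derivation, written here in LaTeX for precision:

```latex
\begin{align*}
L(R\varepsilon)  &= L(R)\,\{\varepsilon\} = L(R)
  && \text{appending the empty string changes nothing} \\
L(R + \emptyset) &= L(R) \cup \emptyset = L(R)
  && \text{the empty language adds no strings} \\
L(R\,\emptyset)  &= L(R)\,\emptyset = \emptyset
  && \text{there is no second factor to concatenate with}
\end{align*}
```

The caution is that ε and ∅ behave differently: concatenating with ε leaves R unchanged, while concatenating with ∅ wipes out the whole language.
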
   Regular expressions: more examples
Let Σ = {a,b}

Let A = {w in {a,b}* | w contains abba}

reg. exp. for A:       Σ* abba Σ*

Let B = {w in {a,b}* | |w| = 7 and the middle symbol of w is b}

reg. exp. for B:       ΣΣΣ b ΣΣΣ
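
The expressions for A and B can be sanity-checked by brute force over all short strings. The sketch below translates the notation of these slides into the syntax of Python's re module, where union is written | rather than +. The helpers to_python_re and words are hypothetical, and the translation assumes alphabet symbols that need no escaping (such as a and b).

```python
import re
from itertools import product


def to_python_re(expr, alphabet):
    """Translate slide notation (Σ, ε, + for union) into Python re syntax."""
    sigma = "(?:" + "|".join(alphabet) + ")"      # Σ as a non-capturing group
    out = expr.replace(" ", "")
    return out.replace("Σ", sigma).replace("ε", "(?:)").replace("+", "|")


def words(alphabet, max_len):
    """Yield every string over the alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


A = re.compile(to_python_re("Σ* abba Σ*", "ab"))
B = re.compile(to_python_re("ΣΣΣ b ΣΣΣ", "ab"))

for w in words("ab", 8):
    # fullmatch asks whether the whole string is in the language.
    assert (A.fullmatch(w) is not None) == ("abba" in w)
    assert (B.fullmatch(w) is not None) == (len(w) == 7 and w[3] == "b")
```

A bounded check like this is evidence, not a proof, but it catches most mistakes in a hand-written expression.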
    Regular expressions are strings
A regular expression over alphabet Σ is itself a string
over the new alphabet Σ’:

     Σ’ = Σ ∪ { +, *, (, ) } ∪ { ∅, ε, Σ }

But not all strings over this alphabet are regular
expressions! So we could define a new language:

R(Σ) = { w in Σ’* | w is a valid regular expression }

Challenge question: Is R(Σ) a regular language?
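
Membership in R(Σ) can at least be decided by a program that follows the inductive definition directly. The sketch below is one possible checker, under these assumptions: it accepts only the fully parenthesized core forms a, ∅, (α + β), (αβ) and (α)*, ignores spaces, and does not handle the abbreviations or the precedence shortcuts. The name is_valid_regex is hypothetical.

```python
def is_valid_regex(text, alphabet):
    """Decide whether text is a fully parenthesized regular expression."""
    s = text.replace(" ", "")
    symbols = set(alphabet)
    pos = 0

    def parse():
        """Try to consume one regular expression starting at s[pos]."""
        nonlocal pos
        if pos >= len(s):
            return False
        ch = s[pos]
        if ch in symbols or ch == "∅":          # basis cases
            pos += 1
            return True
        if ch != "(":
            return False
        pos += 1                                 # consume "("
        if not parse():                          # first operand α
            return False
        if pos < len(s) and s[pos] == ")":       # form (α)*
            pos += 1
            if pos < len(s) and s[pos] == "*":
                pos += 1
                return True
            return False
        if pos < len(s) and s[pos] == "+":       # form (α + β)
            pos += 1
        if not parse():                          # second operand β
            return False
        if pos < len(s) and s[pos] == ")":       # closes (α + β) or (αβ)
            pos += 1
            return True
        return False

    # Valid only if one expression covers the entire string.
    return parse() and pos == len(s)


print(is_valid_regex("( ((a+b)(aa)) (a+b) )", "ab"))
```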
   Regular expressions capture
       regular languages

Theorem:
  A language L is regular iff there
 exists a regular expression R such
 that
                      L = L (R)


    Regular expressions capture
        regular languages
Recall
• Definition:
A language L is regular if there exists a
  DFA that accepts exactly the strings in L.

• Theorem:
Any NFA has an equivalent DFA.

    Regular expressions capture
        regular languages
Thus:
• A language L is regular if there exists
  a DFA that accepts exactly the strings
  in L.

• A language L is regular if there exists
  an NFA that accepts exactly the
  strings in L.
    Regular expressions capture
        regular languages
A language L is regular iff
• there exists a DFA, M, satisfying
                   L = L (M)
• there exists an NFA, N, satisfying
                   L = L (N)
• there exists a regular expression R
  satisfying
                   L = L (R)
    Regular expressions capture
        regular languages
Proof strategy:
• Given regular expression R construct an
  NFA, N, satisfying
                  L (N) = L (R)

• Given DFA, M, construct regular
  expression R satisfying
                 L (M) = L (R)
      Every regular expression
       corresponds to an NFA

Part 1: Given regular expression R construct
 an NFA, N, satisfying L (N) = L (R).

Idea: mimic the inductive definition of
  regular expressions.



          Base Case
1. If R = a in Σ, construct N:

   [Diagram: N has a start state with a single transition labeled a
   leading to an accepting state.]

              L (N) = L (R)
          Base Case
2. If R = ∅, construct N:

   [Diagram: N has a start state and no accepting state.]

             L (N) = L (R)
         Inductive Case
Let α and β be regular expressions and
let Nα and Nβ be NFAs satisfying
L (Nα) = L (α) and L (Nβ) = L (β).

Suppose R = (α + β).

   We need to construct NR satisfying:
             L (NR) = L (R).
         Inductive Case 1.     R = (α + β) :

   [Diagram: NR has a new start state with ε-transitions to the
   start states of Nα and Nβ.]

             L (NR) = L (R).
     Inductive Case 2.     R = (αβ) :

   [Diagram: NR connects Nα to Nβ, so that a run passes through Nα
   and then through Nβ.]

         L (NR) = L (R).
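     Inductive Case 3.     R = (α)* :

   [Diagram: NR wraps Nα with ε-transitions, including a new start
   state, so that Nα can be traversed zero or more times.]

         L (NR) = L (R).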
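
The base cases and the three inductive cases can be sketched as one recursive function that builds an NFA with ε-transitions. The diagrams are not reproduced in this copy, so the wiring below follows the standard textbook construction and may differ in detail from the slides. The tuple encoding of expressions, the NFA class, and the helpers fresh, merge, build, eps_closure and accepts are all hypothetical names; the empty string "" is used as the label of an ε-transition.

```python
import itertools

_ids = itertools.count()


def fresh():
    """Return a state name that has not been used before."""
    return next(_ids)


class NFA:
    def __init__(self, start, accepting, delta):
        self.start = start          # single start state
        self.accepting = accepting  # set of accepting states
        self.delta = delta          # dict: (state, label) -> set of states


def merge(*deltas):
    """Combine transition tables, uniting target sets on shared keys."""
    out = {}
    for d in deltas:
        for key, targets in d.items():
            out.setdefault(key, set()).update(targets)
    return out


def build(r):
    """Construct an NFA N with L(N) = L(r), mirroring the definition of r."""
    kind = r[0]
    if kind == "sym":                      # base case 1: R = a
        s, f = fresh(), fresh()
        return NFA(s, {f}, {(s, r[1]): {f}})
    if kind == "empty":                    # base case 2: R = ∅
        return NFA(fresh(), set(), {})
    if kind == "union":                    # inductive case 1: R = (α + β)
        n1, n2 = build(r[1]), build(r[2])
        s = fresh()
        delta = merge(n1.delta, n2.delta, {(s, ""): {n1.start, n2.start}})
        return NFA(s, n1.accepting | n2.accepting, delta)
    if kind == "concat":                   # inductive case 2: R = (αβ)
        n1, n2 = build(r[1]), build(r[2])
        link = {(f, ""): {n2.start} for f in n1.accepting}
        return NFA(n1.start, set(n2.accepting), merge(n1.delta, n2.delta, link))
    if kind == "star":                     # inductive case 3: R = (α)*
        n1 = build(r[1])
        s = fresh()                        # new start state, also accepting
        loop = {(f, ""): {n1.start} for f in n1.accepting}
        delta = merge(n1.delta, loop, {(s, ""): {n1.start}})
        return NFA(s, n1.accepting | {s}, delta)
    raise ValueError(f"unknown expression: {r!r}")


def eps_closure(nfa, states):
    """All states reachable from `states` using only ε-transitions."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for t in nfa.delta.get((q, ""), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def accepts(nfa, w):
    """Simulate the NFA on w by tracking the set of reachable states."""
    current = eps_closure(nfa, {nfa.start})
    for ch in w:
        moved = set()
        for q in current:
            moved |= nfa.delta.get((q, ch), set())
        current = eps_closure(nfa, moved)
    return bool(current & nfa.accepting)


a, b = ("sym", "a"), ("sym", "b")
# (a+b)* aa (a+b) + ∅*, the earlier example over {a, b}
example = ("union",
           ("concat",
            ("concat", ("star", ("union", a, b)), ("concat", a, a)),
            ("union", a, b)),
           ("star", ("empty",)))
N = build(example)
print(accepts(N, "baab"), accepts(N, "ab"), accepts(N, ""))
```

Each branch of build uses only the automata returned for the sub-expressions, which is exactly the shape of the induction in the proof.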