					Lecture 1
                 Theory of Computation

                       Planning Schedule
Week 1-2 : Introduction, Finite Automata, Regular Expression
Week 3-4 : Different Kinds of FA, Properties of Regular Sets
                Quiz 1 – Open book and notes
Week 6-7 : Context-Free Grammars, Normal Forms
Week 8-9 : Pushdown Automata, Properties of Context-Free
           Languages
                              Quiz 2
Week 10-11 : Turing Machines, Undecidability
Week 12-13 : Chomsky Hierarchy, LR(k) Grammars, Parsing
                       Final Examination

    Formal Languages and Automata Theory

Assessment Scheme
  Quiz 1 (12.5%)
  Quiz 2 (12.5%)
  Final Examination (40%)
  Homework (25%)
  Class Participation (10%)

For a student to pass this course, he/she must obtain
an average score of not less than 50% in the
examination and quizzes.

  General Topics


 Automata Theory
 Grammars and Languages
 Complexities




        Why Automata Theory?

To study abstract computing devices which are closely
related to today’s computers. A simple example of a finite
state machine:

                          1
        start --> (off) -----> (on)
                  (off) <----- (on)
                          1

  There are many different kinds of machines.

        Another Example


  [Diagram: a three-state machine with states off (start), off
  and on; the transitions between them are labeled 0 and 1.]

                  When will this be on?
             Try 100, 1001, 1000, 111, 00, …

       Grammar and Languages

Grammars and languages are closely related to
automata theory and are the basis of many important
software components like:
  Compilers and interpreters
  Text editors and processors
  Text searching
  System verification




                Complexities

Study the limits of computations. What can a computer
do? What kinds of problems can be solved with a
computer? What kinds of problems can be solved
efficiently?

Can you write a program in C which can check if
another C program will terminate?




Preliminaries

    Alphabets
    Strings
    Languages
    Problems




                  Alphabets


 An alphabet is a finite set of symbols.
 Σ is usually used to denote an alphabet.
 Examples:
    Σ = {0,1}, the set of binary digits.
    Σ = {a, b, … , z}, the set of all lower-case letters.
    Σ = {(, )}, the set of open and close parentheses.

                      Strings

 A string is a finite sequence of symbols from some
  alphabet.
 Examples:
    0011 and 11 are strings from Σ = {0,1}
    abc and bbb are strings from Σ = {a, b, … , z}
    (()(())) and )(() are strings from Σ = {(, )}

                         Strings

   Empty string: ε
   Length of string: |0010| = 4, |aa| = 2, |ε| = 0
   Prefix of string: ε, a, aa, aaa, aaab and aaabc are the prefixes of aaabc
   Suffix of string: ε, c, bc, abc, aabc and aaabc are the suffixes of aaabc
   Substring of string: aab, ab and abc are some substrings of aaabc

                      Strings


   Concatenation: ω = abd, α = ce, ωα = abdce
   Exponentiation: ω = abd, ω^3 = abdabdabd, ω^0 = ε
   Reversal: ω = abd, ω^R = dba
   Σ^k = set of all k-length strings formed by the symbols
    in Σ
    e.g., Σ = {a,b}, Σ^2 = {ab, ba, aa, bb}, Σ^0 = {ε}
    What is Σ^1? Is Σ^1 different from Σ? How?

                     Strings


 Kleene Closure Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ … = ∪_{k≥0} Σ^k
  e.g., Σ = {a, b}, Σ* = {ε, a, b, ab, aa, ba, bb, aaa, aab,
  abb, … } is the set of all strings formed by a’s and
  b’s.
 Σ+ = Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ … = ∪_{k>0} Σ^k
  i.e., Σ* without the empty string.




                     Languages

 A language is a set of strings over an alphabet.
 Examples:
    Σ = {(, )}, L1 = {(), )(, (())} is a language over Σ. The
     set L2 of all strings with balanced left and right
     parentheses is also a language over Σ.
    Σ = {a, b, c, … , z}, the set L of all legal English
     words is a language over Σ.
    The set {ε} is a language over any alphabet.
     What is the difference between ∅ and {ε}?

                 Languages
 Other Examples:
   Σ = {0, 1}, L = {0^n 1^n | n ≥ 1} is a language over Σ
    consisting of the strings {01, 0011, 000111, … }
   Σ = {0, 1}, L = {0^i 1^j | j ≥ i ≥ 0} is a language over Σ
    consisting of the strings with some 0’s
    (possibly none) followed by at least as many
    1’s.

                     Problems

 In automata theory, a problem is the question of
  deciding whether a given string is a member of some
  particular language.

 This formulation is general enough to capture the
  difficulty levels of all problems.
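
As an illustration of a "problem" in this sense, here is a minimal sketch of two membership deciders for the example languages {0^n 1^n | n ≥ 1} and {0^i 1^j | j ≥ i ≥ 0}. The function names are invented for this example.

```python
def in_0n1n(s):
    """Decide membership in {0^n 1^n | n >= 1}."""
    n = len(s) // 2
    return n >= 1 and s == "0" * n + "1" * n

def in_0i1j(s):
    """Decide membership in {0^i 1^j | j >= i >= 0}."""
    i = len(s) - len(s.lstrip("0"))   # number of leading 0's
    rest = s[i:]                      # must consist of 1's only
    return set(rest) <= {"1"} and len(rest) >= i

print(in_0n1n("0011"), in_0n1n("001"))   # True False
print(in_0i1j("011"), in_0i1j("001"))    # True False
```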




             Class Exercise

Let the alphabet Σ = {(, )}. Let L be a language over Σ
such that L contains all the strings with balanced
parentheses, e.g., (()(())), (), ε, etc.

Write the pseudocode of a program which can
decide whether an input string is in L.

Lecture 2
           Finite Automata
     ( or Finite State Machines)

 This is the simplest kind of machine.
 We will study 3 types of Finite Automata:
    Deterministic Finite Automata (DFA)
    Non-deterministic Finite Automata (NFA)
    Finite Automata with ε-transitions (ε-NFA)

Deterministic Finite Automata (DFA)

We have seen a simple example before:

                        1
      start --> (off) -----> (on)
                (off) <----- (on)
                        1

     There are some states and transitions (edges)
     between the states. The edge labels tell when
     we can move from one state to another.

         Definition of DFA


A DFA is a 5-tuple (Q, Σ, δ, q0, F) where
  Q is a finite set of states
  Σ is a finite input alphabet
  δ is the transition function mapping Q × Σ to Q
  q0 in Q is the initial state (only one)
  F ⊆ Q is a set of final states (zero or more)

         Definition of DFA

For example:

                        1
      start --> (off) -----> (on)
                (off) <----- (on)
                        1

  Q is the set of states: {on, off}
  Σ is the set of input symbols: {1}
  δ is the transitions: off --1--> on; on --1--> off
  q0 is the initial state: off
  F is the set of final states: {on}

             Definition of DFA

Another Example:

   [Diagram: a DFA with states q0 (start), q1 and q2; the
   edges between them are labeled 0 and 1.]

   We use a double circle to specify a final state.
   What are Q, Σ, δ, q0 and F in this DFA?

              Transition Table

We can also use a table to specify the transitions.
   For the previous example, the DFA is (Q, Σ, δ, q0, F) where
   Q = {q0,q1,q2}, Σ = {0,1}, F = {q2} and δ is such that

                           Inputs
         States       0              1
            q0        q1             q0
            q1        q2             q0
            q2        q1             q0
  Note that there is one transition for each input
  symbol from each state.

                  DFA Example

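   Consider the DFA M = (Q, Σ, δ, q0, F) where Q =
   {q0,q1,q2,q3}, Σ = {0,1}, F = {q0} and δ is:

                  Inputs
     States     0          1
        q0      q2         q1
        q1      q3         q0
        q2      q0         q3
        q3      q1         q2

   OR, equivalently, as a transition diagram:
   [Diagram: q0 (start) and q1 on the top row, q2 and q3 on the
   bottom row; edges labeled 1 run between q0 and q1 and between
   q2 and q3, edges labeled 0 run between q0 and q2 and between
   q1 and q3.]

We can use a transition table or a transition diagram to specify
the transitions. What input can take you to the final state in M?
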
          Language of a DFA

Given a DFA M, the language accepted (or
recognized) by M is the set of all strings which,
starting from the initial state, will reach one of the
final states after the whole string is read.
For example, the language accepted by the
previous example is the set of all strings of 0’s and 1’s
with an even number of 0’s and an even number of 1’s.
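
The acceptance condition can be sketched in a few lines of code. The snippet below is an illustration only: it stores δ as a Python dictionary (a representation chosen for this example, not prescribed by the slides) and runs the four-state DFA M of the previous example.

```python
def dfa_accepts(delta, start, finals, word):
    """Run a DFA given as a dict (state, symbol) -> state on the whole word."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals

# Transition table of the four-state example M (Q = {q0..q3}, F = {q0}).
delta = {
    ("q0", "0"): "q2", ("q0", "1"): "q1",
    ("q1", "0"): "q3", ("q1", "1"): "q0",
    ("q2", "0"): "q0", ("q2", "1"): "q3",
    ("q3", "0"): "q1", ("q3", "1"): "q2",
}

print(dfa_accepts(delta, "q0", {"q0"}, "0110"))  # True: two 0's, two 1's
print(dfa_accepts(delta, "q0", {"q0"}, "010"))   # False: one 1
```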




             Class Discussion
   [Diagram: three DFA transition diagrams over {0, 1}. The first
   two have states q0 (start) and q1; the third has states q0
   (start), q1 and q2.]

What are the languages accepted by these DFAs?

            Class Discussion


Construct a DFA which accepts a language L
over Σ = {0, 1} such that:
 (a) L contains “010” and “1” only.
 (b) L is the set of all strings ending with “00”.
 (c) L is the set of all strings containing no
   consecutive “1”s nor consecutive “0”s.

       Non-deterministic FA (NFA)

 For each state, zero, one or more transitions are
  allowed on the same input symbol.
 An input is accepted if there is a path leading to a final
  state.




              An Example of NFA
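In this NFA (Q, Σ, δ, q0, F), Q = {q0,q1,q2}, Σ = {0,1},
F = {q2} and δ is:

                 Inputs
     States     0          1
        q0      ∅        {q1,q2}
        q1     {q1}      {q2}
        q2     {q0}       ∅

  OR, equivalently, as a transition diagram:
  [Diagram: q0 (start) has edges labeled 1 to q1 and to q2; q1
  has a loop labeled 0 and an edge labeled 1 to q2; q2 has an
  edge labeled 0 back to q0.]

  Note that each transition can lead to a set of states,
  which can be empty.
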
           Language of an NFA
Given an NFA M, the language recognized by M is
the set of all strings which, starting from the initial
state, have at least one path reaching a final state
after the whole string is read.
Consider the previous example:
  For input “101”, one path is q0 → q1 → q1 → q2 and
  another one is q0 → q2 → q0 → q1. Since q2 is a final
  state, “101” is accepted. For input “1010”,
  none of its paths can reach a final state, so it is
  rejected.
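
One way to check "at least one path" without enumerating paths is to track the set of all states the NFA could currently be in. The sketch below does this for the example NFA; the dictionary representation and the name nfa_accepts are choices made for this illustration.

```python
def nfa_accepts(delta, start, finals, word):
    """Track every state the NFA could be in; missing entries mean the empty set."""
    current = {start}
    for symbol in word:
        current = {t for s in current for t in delta.get((s, symbol), set())}
    return bool(current & finals)

# δ of the example NFA (Q = {q0, q1, q2}, F = {q2}).
delta = {
    ("q0", "1"): {"q1", "q2"},
    ("q1", "0"): {"q1"}, ("q1", "1"): {"q2"},
    ("q2", "0"): {"q0"},
}

print(nfa_accepts(delta, "q0", {"q2"}, "101"))   # True
print(nfa_accepts(delta, "q0", {"q2"}, "1010"))  # False
```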

              Class Discussion

Consider the language L which consists of all the
strings over Σ = {0, 1} such that the second last symbol
is a “1”.
(a) Construct a DFA for L.
(b) Construct an NFA for L.

Is NFA more powerful than DFA?

Lecture 3
        More Examples of NFA


  [Diagram: an NFA with states q0 (start), q1 and q2 whose
  edges are labeled a and b.]

  [Diagram: an NFA with states q0 (start) to q7 that reads
  numbers; its edges are labeled 0-9, “.”, “E” and “+,-”.]

      DFA and NFA

So, is NFA more powerful than DFA?
   NO! NFA is equivalent to DFA.

     DFA → NFA : trivial
     NFA → DFA : constructive proof

     Constructing DFA from NFA
Given any NFA M = (Q, Σ, δ, q0, F) recognizing a language
L over Σ, we can construct a DFA N = (Q’, Σ, δ’, q0’, F’)
which also recognizes L:
 • Q’ = set of subsets of Q
   e.g., if Q = {q0, q1}, Q’ = {{}, {q0}, {q1}, {q0, q1}}
 • q0’ = {q0}
 • F’ = set of all states in Q’ containing a final state of M
 • δ’({q1, q2, …, qi}, a) = δ(q1,a) ∪ δ(q2,a) ∪ ... ∪ δ(qi,a)
   (both {q1, q2, …, qi} and the resulting union are states in N)

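  An Example of NFA → DFA
Consider a simple NFA:

  [Diagram: an NFA with states q0 (start) and q1 whose edges
  are labeled 0 and 1.]

Construct a corresponding DFA:

  [Diagram: a DFA with states {q0} (start), {q1}, {q0, q1} and
  {}, whose edges are labeled 0 and 1.]
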
    NFA with ε-Transitions (ε-NFA)

 There exist ε-transitions which allow state changes
  without consuming any input symbol.
 Similar to NFA, an input is accepted if there is a path
  leading from the start state to a final state after the
  whole string is read.

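          Examples of ε-NFA

  [Diagram: an ε-NFA with states q0 (start), q1, q2 and q3 in
  a row; its edges include ones labeled c and a,b.]

  [Diagram: an ε-NFA with two branches leaving q0:]
    q0 (start) has a loop labeled a-z
    q0 --ε--> q1 --w--> q2 --e--> q3 --b--> q4
    q0 --ε--> q5 --e--> q6 --b--> q7 --a--> q8 --y--> q9
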
                        ε-Closures
In an ε-NFA, the ECLOSE(q) of a state q is the
set of states that can be reached from q by
following a path whose edges are all labeled by
ε.

  [Diagram: an ε-NFA with states q0 (start), q1, q2 and q3
  whose edges are labeled a, b and ε.]

                         ECLOSE(q1) = {q1, q3}
                         ECLOSE(q2) = {q1, q2, q3}
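
ECLOSE can be computed by a simple graph search over the ε-arcs. The following is a sketch under an assumption: the ε-arcs q1 → q3 and q2 → q1 used below are not read off the slide, they are merely chosen to be consistent with the two ECLOSE values stated above.

```python
def eclose(eps, state):
    """Return ECLOSE(state): every state reachable via ε-arcs only (state included)."""
    closure = {state}
    stack = [state]
    while stack:
        s = stack.pop()
        for t in eps.get(s, set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

# Assumed ε-arcs (state -> set of ε-successors), for illustration only.
eps = {"q1": {"q3"}, "q2": {"q1"}}

print(sorted(eclose(eps, "q1")))  # ['q1', 'q3']
print(sorted(eclose(eps, "q2")))  # ['q1', 'q2', 'q3']
```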

 DFA, NFA and ε-NFA

   DFA      NFA      ε-NFA

Lecture 4
An Informal Proof of Correctness
 Each state in the DFA represents a set of states in
  the original NFA.
 After reading an input string ω, the DFA is in one
  state that represents the set of states the original
  NFA would be in after reading ω.
 Since any state in the DFA that includes a final
  state of the NFA is a final state, the DFA and the
  NFA will accept the same set of strings.




          Notes on NFA → DFA

Sometimes we do not need to consider all possible
subsets of the states in the original NFA (especially
when the original NFA is complicated). We can
construct the states in the DFA one by one, starting
from the initial state {q0} where q0 is the initial state of
the original NFA.
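
Building the DFA states one by one can be sketched as a worklist algorithm. This is an illustrative sketch, not the course's reference implementation: the function name, the dictionary encoding of δ and the use of frozenset for DFA states are assumptions made here. It is applied to the three-state example NFA seen earlier (F = {q2}).

```python
def subset_construction(delta, start, finals, alphabet):
    """Build only the DFA states reachable from {start}.

    delta maps (state, symbol) -> set of NFA states; missing entries mean ∅.
    Returns (dfa_delta, dfa_start, dfa_finals) with frozensets as DFA states.
    """
    dfa_start = frozenset({start})
    dfa_delta, seen, todo = {}, {dfa_start}, [dfa_start]
    while todo:
        subset = todo.pop()
        for a in sorted(alphabet):
            # δ'(subset, a) is the union of δ(q, a) over all q in subset
            target = frozenset(t for s in subset for t in delta.get((s, a), set()))
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    dfa_finals = {subset for subset in seen if subset & finals}
    return dfa_delta, dfa_start, dfa_finals

nfa_delta = {
    ("q0", "1"): {"q1", "q2"},
    ("q1", "0"): {"q1"}, ("q1", "1"): {"q2"},
    ("q2", "0"): {"q0"},
}
dfa_delta, dfa_start, dfa_finals = subset_construction(nfa_delta, "q0", {"q2"}, {"0", "1"})
print(len({state for (state, _) in dfa_delta}))  # 6 of the 8 possible subsets
```

For this NFA only six of the eight subsets are reachable, so the other two never need to be constructed.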




ε-NFA and NFA

     NFA → ε-NFA : trivial
     ε-NFA → NFA : constructive proof

                ε-NFA → NFA
Given any ε-NFA M = (Q, Σ, δ, q0, F) recognizing a
language L over Σ, we can construct an NFA
N = (Q, Σ, δ’, q0, F’) which also recognizes L:
   qj ∈ δ’(qi, a) iff there is a path from qi to qj using
   exactly one arc labeled ‘a’ and zero or more arcs
   labeled ‘ε’ in M.
   F’ = F ∪ {q0} if a final state is reachable from q0
   using ε-transitions in M. Otherwise, F’ = F.

     An Example of ε-NFA → NFA

  [Diagram: an ε-NFA M and an equivalent NFA N, each with
  states q0 (start), q1 and q2 and edges labeled a and b; M
  also has an ε-arc.]

An Informal Proof of Correctness
In the ε-NFA, the acceptance of the string
                            a1 a2 … an
causes it to go through a sequence of states:
                   q0 → q1 → q2 → … → qm
with m ≥ n. In general, some ε-transitions may be included in
the sequence:
   q0 --ε--> ... --a1--> qi1 --ε--> ... --a2--> qi2 → … → qin
In the NFA, each sequence of states of the form:
   qik --ε--> ... --a(k+1)--> ... --ε--> qi(k+1)
will be represented by a single transition
   qik --a(k+1)--> qi(k+1)

               ε-NFA → NFA
The transition function δ’(qi, a) in the NFA can also
be constructed systematically:
 Find ECLOSE(qi) in M.
 Find the set of states S reachable from
  the states in ECLOSE(qi) using exactly
  one ‘a’-transition in M.
 Find the union of ECLOSE(qk) over all qk ∈ S;
  this union is δ’(qi, a).
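
The three steps above can be sketched as follows. Everything concrete here is an assumption made for illustration: the function names, the dictionary encoding, and the small ε-NFA for a*b* (q0 loops on a, q0 --ε--> q1, q1 loops on b, F = {q1}), which does not come from the slides.

```python
def eclose(eps, state):
    """All states reachable from state via ε-arcs only (state included)."""
    closure, stack = {state}, [state]
    while stack:
        for t in eps.get(stack.pop(), set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def remove_epsilon(states, alphabet, delta, eps, start, finals):
    """Return (new_delta, new_finals) of an NFA without ε-arcs."""
    new_delta = {}
    for q in states:
        closure = eclose(eps, q)                       # step 1
        for a in alphabet:
            step = {t for s in closure                 # step 2
                    for t in delta.get((s, a), set())}
            new_delta[(q, a)] = {u for t in step       # step 3
                                 for u in eclose(eps, t)}
    new_finals = set(finals)
    if eclose(eps, start) & finals:                    # F' = F ∪ {q0} case
        new_finals.add(start)
    return new_delta, new_finals

# Hypothetical ε-NFA for a*b*.
delta = {("q0", "a"): {"q0"}, ("q1", "b"): {"q1"}}
eps = {"q0": {"q1"}}
new_delta, new_finals = remove_epsilon({"q0", "q1"}, {"a", "b"}, delta, eps, "q0", {"q1"})
print(sorted(new_delta[("q0", "a")]))  # ['q0', 'q1']
print(sorted(new_finals))              # ['q0', 'q1']
```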


                 Class Exercise
Construct an NFA equivalent to this ε-NFA:

  [Diagram: an ε-NFA with states q0, q1, q2 and q3 whose
  edges include labels a, b and c.]

       Regular Expression (RE)
Let Σ be an alphabet, a RE over Σ can be defined
recursively as:
•   ∅ = {} is a RE
•   ε = {ε} is a RE
•   ∀ a ∈ Σ, a = {a} is a RE
•   If r, s are RE over Σ denoting the sets R,
    S over Σ, then (r+s), (rs) and (r*) are RE
    over Σ denoting R ∪ S, RS and R*
    respectively.

        Regular Expression (RE)
Note:
• If R is a set of strings, R* denotes the
  set of all strings formed by
  concatenating zero or more strings
  from R.
• We can neglect the parentheses
  assuming that * has a higher
  precedence than concatenation and
  concatenation has a higher
  precedence than +
  e.g., ((0(1*)) + 1) = 01* + 1

                Examples of RE
 01* = {0, 01, 011, 0111, …..}
 (01*)(01) = {001, 0101, 01101, 011101, …..}
 (0+1)* = {0, 1, 00, 01, 10, 11, …..}, i.e., all strings of
  0 and 1
 (0+1)*00(0+1)* = {00, 1001, …..}, i.e., all 0 and 1
  strings containing a “00”




             More Examples of RE

 (1+10)* = all strings starting with “1” and containing no
  “00”
 (0+1)*011 = all strings ending with “011”
 0*1* = all strings with no “0” after “1”
 00*11* = all strings with at least one “0” and one “1”,
  and no “0” after “1”
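
Descriptions like these can be checked by brute force on short strings. The sketch below uses Python's re module, in which the union written "+" on the slides is written "|"; the helper name language is made up for this example.

```python
import re
from itertools import product

def language(pattern, max_len, alphabet="01"):
    """All strings over the alphabet, up to max_len, fully matched by pattern."""
    return {
        "".join(w)
        for n in range(max_len + 1)
        for w in product(alphabet, repeat=n)
        if re.fullmatch(pattern, "".join(w))
    }

print(sorted(language(r"(0|1)*011", 4)))   # ['0011', '011', '1011']
print("" in language(r"(1|10)*", 4))       # True: the empty string is included
```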




           Class Discussion

What languages do the following RE represent?
• (1+01+001)*(ε+0+00)
• ((0+1)(0+1))*+((0+1)(0+1)(0+1))*

             Class Discussion

Construct a RE over Σ = {0,1} for each of the
following languages:
• The set of strings that do not contain two
  consecutive “0”s.
• The set of strings in which no prefix has at least two
  more “0”s than “1”s, nor at least two more “1”s than “0”s.

Lecture 5
    DFA, NFA, ε-NFA and RE


RE can describe all the languages represented by
a DFA, NFA or ε-NFA.


  DFA             NFA             ε-NFA      RE

 ε-NFA and RE

     ε-NFA → RE : constructive proof
     RE → ε-NFA : constructive proof

          Example of RE → ε-NFA
 ε-NFA for 0
          Start --> q0 --0--> q1
 ε-NFA for 1
          Start --> q0 --1--> q1
 ε-NFA for 0+1
                     --ε--> q2 --0--> q3 --ε-->
          Start --> q0                          q1
                     --ε--> q4 --1--> q5 --ε-->

        RE → ε-NFA (Base Cases)
    ∅ = {} is a RE,
                   Start --> q0            ε-NFA for ∅
                   (q0 is not a final state)
    ε = {ε} is a RE,
                   Start --> q0            ε-NFA for ε
                   (q0 is also the final state)
    ∀ a ∈ Σ, a = {a} is a RE,
           Start --> q0 --a--> q1          ε-NFA for a

  RE → ε-NFA (Inductive Cases)
  If r and s are RE represented by ε-NFA Mr and Ms
  respectively, the ε-NFA for r+s, rs and r* can be
  constructed as:

                 --ε--> [ Mr ] --ε-->
   Start --> q0                        q1      ε-NFA for r+s
                 --ε--> [ Ms ] --ε-->

        q0 is connected to        All final states of Mr and
        the start states of Mr    Ms are connected to q1,
        and Ms, labeled by ε.     labeled by ε. q1 becomes
                                  the only final state.

      RE → ε-NFA (Inductive Cases)
   Start --> q0 --ε--> [ Mr ] --ε--> [ Ms ] --ε--> q1      ε-NFA for rs

    q0 is connected to    Final states of Mr are   Final states of Ms are
    the start state of    connected to the start   connected to q1, labeled
    Mr, labeled by ε.     state of Ms, labeled     by ε. q1 becomes the
                          by ε.                    only final state.

   Start --> q0 --ε--> [ Mr ] --ε--> q1                    ε-NFA for r*
   (two further ε-arcs in the diagram allow Mr to be skipped
    and to be repeated)

                 q0 is connected to      Final states of Mr are
                 the start state of      connected to q1, labeled
                 Mr, labeled by ε.       by ε. q1 becomes the
                                         only final state.

                Class Exercise
Construct an ε-NFA for the RE 01*(00+1) over {0,1}

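                   DFA → RE
How to construct a RE for the following DFA?

   [Diagram: a DFA with states q0 and q1; q0 has a loop
   labeled 0 and an edge labeled 1 to q1; q1 has a loop
   labeled 1 and an edge labeled 0 to q0.]

 • By observation: (0+1)*0 + 0*
 • A systematic method?
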
                  DFA → RE
We define a term R_{ij}^{k} to denote the set of all strings which
take the DFA M from qi to qj with intermediate states
going through q0, q1, q2, … or qk only.
e.g., for the two-state DFA above (q0 loops on 0 and goes to q1
on 1; q1 loops on 1 and goes to q0 on 0):

R_{00}^{-1} = {ε, 0},  R_{00}^{0} = {ε, 0, 00, 000, ...} = 0*
R_{01}^{-1} = {1},     R_{01}^{0} = {1, 01, 001, 0001, ...} = 0*1
Note that 101 ∉ R_{01}^{0}. Why?

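                  DFA → RE
  R_{ij}^{-1} = {a | δ(qi, a) = qj}          if i ≠ j
  R_{ij}^{-1} = {a | δ(qi, a) = qj} ∪ {ε}    if i = j
  R_{ij}^{k}  = R_{ij}^{k-1} ∪ R_{ik}^{k-1} (R_{kk}^{k-1})* R_{kj}^{k-1}

  [Diagram: a path in M from qi to qj that passes through qk.]

We can recursively build the RE for R_{ij}^{0}, R_{ij}^{1},
R_{ij}^{2}, …, R_{ij}^{n}.
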
                       DFA → RE
   Let r_{ij}^{k} be a RE for R_{ij}^{k}. Compute r_{ij}^{-1} ∀ i,j = 0…n,
    then r_{ij}^{0} ∀ i,j = 0…n, then r_{ij}^{1} ∀ i,j = 0…n, …, until
    r_{ij}^{n} ∀ i,j = 0…n according to the formula:

       r_{ij}^{k} = r_{ij}^{k-1} + r_{ik}^{k-1} (r_{kk}^{k-1})* r_{kj}^{k-1}

   The RE for the language accepted by M:

       r_{0 j1}^{n} + r_{0 j2}^{n} + … + r_{0 jp}^{n}

    where F = {qj1, qj2, …, qjp}
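
The recurrence can be turned into a short program. The sketch below is an illustration under several assumptions: states are numbered 0..n-1 with state 0 as the start state, the output uses Python's re syntax ("|" for union, the empty pattern for ε), None stands for ∅, and no simplification is attempted, so the resulting expression is correct but verbose. It is tried on the two-state DFA above, taking q0 as the final state (an assumption consistent with the RE found "by observation").

```python
import re

def dfa_to_regex(n_states, alphabet, delta, finals):
    """Return a Python regex for the DFA's language, or None for ∅."""
    def union(x, y):
        if x is None:
            return y
        if y is None:
            return x
        return f"(?:{x}|{y})"

    def concat(*parts):
        return None if any(p is None for p in parts) else "".join(parts)

    def star(x):
        return "" if x is None or x == "" else f"(?:{x})*"

    # k = -1: single symbols, plus ε (the empty pattern) when i == j
    r = [[None] * n_states for _ in range(n_states)]
    for i in range(n_states):
        for j in range(n_states):
            for a in alphabet:
                if delta[(i, a)] == j:
                    r[i][j] = union(r[i][j], re.escape(a))
            if i == j:
                r[i][j] = union(r[i][j], "")
    # k = 0 .. n-1: r_ij^k = r_ij^(k-1) + r_ik^(k-1) (r_kk^(k-1))* r_kj^(k-1)
    for k in range(n_states):
        r = [[union(r[i][j], concat(r[i][k], star(r[k][k]), r[k][j]))
              for j in range(n_states)] for i in range(n_states)]
    result = None
    for j in sorted(finals):
        result = union(result, r[0][j])
    return result

delta = {(0, "0"): 0, (0, "1"): 1, (1, "1"): 1, (1, "0"): 0}
pattern = dfa_to_regex(2, "01", delta, {0})
print(bool(re.fullmatch(pattern, "1010")))  # True: ends with 0
print(bool(re.fullmatch(pattern, "101")))   # False: ends with 1
```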

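        Example of DFA → RE

   [Diagram: the two-state DFA with states q0 and q1 from the
   previous slides.]

R_{00}^{-1}      R_{00}^{0}      R_{00}^{1}

R_{01}^{-1}      R_{01}^{0}      R_{01}^{1}

R_{10}^{-1}      R_{10}^{0}      R_{10}^{1}

R_{11}^{-1}      R_{11}^{0}      R_{11}^{1}
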
Lecture 6
         More Examples on RE
Construct a RE for each of the following languages
over the alphabet Σ = {0,1}:
  The set of all strings ending with “00”
  The set of all strings with 3 consecutive 0’s
  The set of all strings beginning with “1” which,
   when interpreted as a binary number, are divisible by
   5.
  The set of all strings with a “1” at the 5th position
   from the right.
  The set of all strings not containing 101 as a
   sub-string.

Algebraic Rules for RE

     Commutative Rule
     Associative Rule
     Distributive Rule
     Identity




Commutative and Associative Rules
Let L, M and N be regular expressions. Which of
the following are correct?
  Commutative Rules:
    L + M = M + L
    LM = ML
  Associative Rules:
    (L + M) + N = L + (M + N)
    (LM)N = L(MN)

           Distributive Rules


Which of the following are correct?
  Left Distributive Rules:
    L(M + N) = LM + LN
    L + (MN) = (L + M)(L + N)
  Right Distributive Rules:
    (M + N)L = ML + NL
    (MN) + L = (M + L)(N + L)

                      Identities

  ∅ is the identity for union:
      ∅ + L = L + ∅ = L
  ε is the identity for concatenation:
      εL = Lε = L

Other Rules for Closures

     (L*)* = L*
     L^+ = LL* = L*L
     L* = L^+ + ε
     ∅* = ε
     ε* = ε

      Other Algebraic Rules
Which of the following are correct, and why?
  (L + M)* = (L*M*)*
  L + ML = (L + M)L
(To prove, we need to show that any string
generated by the RE on the right can also be
generated by the RE on the left, and vice versa. To
disprove, we need to find a counter-example.)

           Applications of RE

Two common applications of RE:
 Lexical analysis in compilers
 Finding patterns in text

             Lexical Analyzer


 Recognize “tokens” in a program source code.
 The tokens can be variable names, reserved
  words, operators, numbers, … etc.
 Each kind of token can be specified as an RE,
  e.g., a variable name is of the form
  [A-Za-z][A-Za-z0-9]*. We can then construct an ε-NFA to
  recognize it automatically.

                Lexical Analyzer

 By putting all these ε-NFA’s together, we obtain one
  which can recognize all different kinds of tokens in the
  input string.
 We can then convert this ε-NFA to NFA and then to
  DFA, and implement this DFA as a deterministic
  program - the lexical analyzer.
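
As a rough illustration of the idea, the sketch below combines one RE per token kind into a single pattern and scans the input with it. The token kinds and their patterns are hypothetical, chosen only for this example, and Python's re module uses a backtracking matcher rather than an explicit DFA, so this shows the specification side of a lexical analyzer, not the automaton construction itself.

```python
import re

TOKEN_SPEC = [  # hypothetical token kinds, for illustration only
    ("NUMBER", r"[0-9]+"),
    ("NAME", r"[A-Za-z][A-Za-z0-9]*"),
    ("OP", r"[+\-*/=]"),
    ("SKIP", r"\s+"),
]
# One named group per token kind, joined by "|" (union).
MASTER = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))

def tokenize(text):
    """Split text into (kind, lexeme) pairs, dropping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize("x1 = y + 42"))
# [('NAME', 'x1'), ('OP', '='), ('NAME', 'y'), ('OP', '+'), ('NUMBER', '42')]
```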




                 Text Search
 “grep” in Unix stands for “Global (search for)
  Regular Expression and Print”.
 Unix has its own notations for regular
  expressions:
    Dot “.” stands for “any character”
    [a1a2…ak] stands for a1+a2+…+ak, e.g.,
     [bcd12] stands for b+c+d+1+2
    [x-y] stands for all characters from x to y in the
     ASCII sequence.


                 Text Search
    | means “or”, i.e., + in our normal notation.
    * means “Kleene star”, as in our normal notation.
    ? means “zero or one”, e.g., R? is ε + R
    + means “one or more”, e.g., R+ is RR*
    {n} means “n copies of”, e.g., R{5} is RRRRR
   (You can find out more by “man grep”, “man regex”)
 We can use these notations to search for string
  patterns in text.

                 Text Search
 For example, credit card numbers:
    ‘[0-9]{16}’
    ‘[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{4}’
 For example, phone numbers:
    ‘[0-9]{8}’
    ‘[0-9]{3}-[0-9]{5}’
    ‘852-[0-9]{8}’
    ‘852-[0-9]{3}-[0-9]{5}’


                                             92
Lecture 7
  DFA, NFA, ε-NFA and RE



DFA, NFA, ε-NFA and RE are equivalent.


[Figure: conversions among DFA, NFA, ε-NFA and RE]



      A language describable by them is called a
                Regular Language

                                                          94
           Class Discussion


Can you draw a DFA which accepts the language
{a^n b^n | n ∈ N} over the alphabet Σ = {a, b}?




                                                95
      Limitations of FA

Many languages are non-regular:
•   {a^n b^n | n ∈ N}
•   {0^(i^2) | i ∈ N}
•   {0^p | p is prime}
•   set of well-formed parentheses
•   set of palindromes
•   …




                                     96
              Why Impossible?
 We want to prove that L = {a^n b^n | n ∈ N} is non-regular.
 Proof by contradiction:

 • Assume that there is a DFA M which
   recognizes L. Let n be the number of states in M.

 • Consider the accepting run of M on the input a^r b^r,
   where r ≥ n:

   q0 -a-> q1 -a-> q2 -a-> … -a-> q_r -b-> q_(r+1) -b-> … -b-> q_(2r)



                                                           97
              Why Impossible?
• Since r ≥ n and M has only n states, at least one state
  must be visited twice during the first r
  transitions. Let this state be visited at the i-th and the
  j-th steps, where j > i, so that qi = qj.

  [Figure: a path in M from q0 to the final state qk ∈ F, with a loop at qi = qj]

• By skipping the loop, a^(r-(j-i)) b^r should also be
  accepted by M, but this is a contradiction since
  a^(r-(j-i)) b^r ∉ L.
                                                            98
Lecture 8
            Another Example

We want to prove that L = {1^(k^2) | k > 0} is non-regular.
Proof by contradiction:
• Assume that there is a DFA M which
  recognizes L. Let n be the number of states in M.
• M should also accept the string 1^(n^2):

           q0 -1-> q1 -1-> q2 -1-> … -1-> q_(n^2)




                                                          100
           Another Example
• Since n^2 ≥ n and M has only n states, there
  must be at least two equal states from q0 to q_(n^2)
  (indeed, already among q0, …, qn).
  Let them be qi and qj, where 0 < j - i = m ≤ n.

  [Figure: a path in M from q0 to q_(n^2) ∈ F, with a loop of length m at qi = qj]

• By repeating the loop one more time, 1^(n^2+m) is
  also accepted by M, which is a contradiction,
  since n^2 + m cannot be a square: the next
  square after n^2 is (n+1)^2, but n^2 < n^2 + m < (n+1)^2.
                                                      101
          Yet Another Example

We want to prove that L = {1^p | p is a prime} is non-regular.
Proof by contradiction:
• Assume that there is a DFA M which
  recognizes L. Let n be the number of states in M.
• From number theory, we know that there are infinitely
  many primes, so there exists a prime p ≥ n.

            q0 -1-> q1 -1-> q2 -1-> … -1-> q_p



                                                        102
         Yet Another Example
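• Since p ≥ n and M has only n states, there
  must be at least two equal states from q0 to
  q_p. Let them be qi and qj, where j - i = m > 0.

  [Figure: a path in M from q0 to q_p ∈ F, with a loop of length m at qi = qj]

• By traversing the loop (p-m) times in total, 1^((p-m)m + (p-m))
  = 1^((p-m)(m+1)) is also accepted by M, which is
  a contradiction since (p-m)(m+1) is not a
  prime. (This needs p - m ≥ 2, which we can ensure by
  choosing the prime p ≥ n + 2 and taking the repeated
  states among q0, …, qn, so that m ≤ n.)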
                                                     103
Pumping Lemma for Regular Set


Let L be a regular language. Then there is a constant n
such that if z is any word in L and |z| ≥ n, we can write
z = uvw in such a way that |uv| ≤ n, |v| ≥ 1, and for all
i ≥ 0, u v^i w is in L. Furthermore, n is no greater than the
number of states of the smallest FA accepting L.




                                                              104
         Proof of Pumping Lemma
• If L is a regular language, there are DFAs which
  recognize L. Let n be the number of states in the smallest
  one, M.
• If z is a word in L with |z| = k ≥ n:

           q0 -z(1)-> q1 -z(2)-> q2 -z(3)-> … -z(k)-> qk

  where z(i) is the i-th symbol of the string z.
• Since k ≥ n and M has only n states, there must be at
  least one repeated state from q0 to qk. Let qi be the
  first such repeated state.
                                                              105
      Proof of Pumping Lemma

                     qi

                                         qk  F
                          a path in M
           q0

Let u be the string obtained by traversing from q0
to qi, v be the string obtained by traversing the
loop once (so |v| ≥ 1), and w be the rest of z. In the
traversal from q0 to qi and then through the loop once
back to qi, nothing except qi repeats. Thus |uv| ≤ n.
By traversing the loop 0 or more times, we obtain u v^i w
for all i ≥ 0, and they should all be accepted by the DFA,
i.e., they are in L.
                                                         106
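
The splitting z = uvw used in the proof can be sketched in code: run the DFA, remember where each state was first visited, and cut the input at the first repeated state. The two-state DFA below (an even number of a's) and the function names are assumptions chosen for illustration.

```python
def pump_split(delta, start, z):
    """Split z into (u, v, w) at the first repeated state of the DFA run."""
    seen = {start: 0}  # state -> position at which it was first visited
    state = start
    for pos, ch in enumerate(z, start=1):
        state = delta[(state, ch)]
        if state in seen:  # the run has just closed a loop
            i = seen[state]
            return z[:i], z[i:pos], z[pos:]
        seen[state] = pos
    return None  # no state repeats (only possible when |z| < number of states)


def accepts(delta, start, finals, z):
    state = start
    for ch in z:
        state = delta[(state, ch)]
    return state in finals


# Hypothetical 2-state DFA over {a, b}: accepts strings with an even number of a's.
delta = {("even", "a"): "odd", ("even", "b"): "even",
         ("odd", "a"): "even", ("odd", "b"): "odd"}

u, v, w = pump_split(delta, "even", "abab")
print(u, v, w)
print(all(accepts(delta, "even", {"even"}, u + v * i + w) for i in range(5)))
```

Here |uv| = 2 does not exceed the number of states, and every pumped string u v^i w stays in the language, as the lemma predicts.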
           Class Discussion
Prove that the set L = {ww | w ∈ (0+1)*} is non-regular.




                                               107
              Closure Properties
Regular sets are said to be closed under an operation op if
the application of op to a regular set will result in a regular
set.

For example, if the union of two regular sets will result in a
regular set, regular sets are said to be closed under union.




                                                                  108
            Closure Properties

Are regular sets closed under:
  Union ?
  Concatenation ?
  Kleene star ?
Why?




                                 109
            Closure Properties

Are regular sets closed under complementation?
If A is a regular set over Σ, is A’ = Σ* - A regular? Why?
If A is regular, there exists a DFA M recognizing A. Given
M, we can construct a DFA M’ for A’ by copying M to M’
except that all final states in M are changed to non-final,
and all non-final states to final.
Can we apply this construction to an NFA? Why?
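
The construction of M’ is short enough to sketch directly. The tuple layout (states, alphabet, delta, start, finals) and the example DFA are assumptions for this illustration; the DFA is assumed to be complete, i.e., delta is defined for every state and symbol.

```python
def complement(dfa):
    """Return M' : the same DFA with final and non-final states swapped."""
    states, alphabet, delta, start, finals = dfa
    return (states, alphabet, delta, start, states - finals)


def accepts(dfa, word):
    states, alphabet, delta, start, finals = dfa
    state = start
    for ch in word:
        state = delta[(state, ch)]
    return state in finals


# Hypothetical DFA over {a, b}: accepts strings with an even number of a's.
M = ({"even", "odd"}, {"a", "b"},
     {("even", "a"): "odd", ("even", "b"): "even",
      ("odd", "a"): "even", ("odd", "b"): "odd"},
     "even", {"even"})

M_prime = complement(M)
for word in ["", "a", "ab", "aba", "bb"]:
    print(repr(word), accepts(M, word), accepts(M_prime, word))
```

Every word is accepted by exactly one of M and M’, because a DFA has exactly one run on each input.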




                                                              110
         Closure Properties
Are regular sets closed under intersection?
If A and B are regular sets, is C = A ∩ B regular?
Why?

                C = A ∩ B = (A’ ∪ B’)’

Since regular sets are closed under union and
complementation, they are also closed under
intersection.



                                                   111
                Substitution

A substitution f maps each symbol in Σ onto a Δ-language
(a subset of Δ*).

For example,
  Σ = {0,1}, Δ = {a, b}
  f(0) = a + b, f(1) = b* is a substitution
  Then, f(ε) = ε, f(01) = (a+b)b*,
          f(0*1*) = (a+b)*(b*)* = (a+b)*
          f(L) = ∪_{w ∈ L} f(w)


                                                   112
             Closure Properties

Regular sets are closed under substitution, i.e. if the
original language over Σ is regular and all the Δ-languages
are regular, the resultant language after
substitution is also regular. Why?




                                                          113
Closure Properties (Summary)

   Regular sets are closed under:
    Union
    Concatenation
    Kleene Star
    Complementation
    Intersection
    Substitution



                                    114
         Equivalence of FA’s

     M1 --Minimize--> M1’
     M2 --Minimize--> M2’
     Is M1’ ≅ M2’ ???
There is a unique minimum state DFA for
every regular set (unique up to isomorphism,
i.e., the states may have different names but
the structures are the same)
                                                115
          Minimizing a DFA

Two steps:
– Removing states which are
  inaccessible from the start state.
– Combining states which are
  equivalent.
Two states p and q are equivalent iff
  for any string ω, if p --ω--> r and q --ω--> s, then r
  is final iff s is final.
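
One common way to find the equivalent states is to mark the pairs that are NOT equivalent and keep what is left. The sketch below uses this pair-marking idea; it is one possible approach, not necessarily the one used in the slides, and the three-state DFA is a made-up example. It assumes inaccessible states have already been removed.

```python
from itertools import combinations


def equivalent_pairs(states, alphabet, delta, finals):
    """Return the set of unordered pairs of equivalent states."""
    all_pairs = {frozenset(p) for p in combinations(states, 2)}
    # A final and a non-final state are distinguished by the empty string.
    marked = {pair for pair in all_pairs
              if len(pair & finals) == 1}
    changed = True
    while changed:
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            for a in alphabet:
                succ = frozenset((delta[(p, a)], delta[(q, a)]))
                # If some symbol leads to a distinguishable pair, so is (p, q).
                if len(succ) == 2 and succ in marked:
                    marked.add(pair)
                    changed = True
                    break
    return all_pairs - marked


# Hypothetical DFA over {0, 1} accepting strings that contain at least one 1.
states = ["A", "B", "C"]
finals = {"B", "C"}
delta = {("A", "0"): "A", ("A", "1"): "B",
         ("B", "0"): "C", ("B", "1"): "B",
         ("C", "0"): "B", ("C", "1"): "C"}

print(equivalent_pairs(states, ["0", "1"], delta, finals))
```

States B and C come out as equivalent, so they can be combined into one state.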
                                            116
        Example of FA Minimization
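[Figure: a DFA with states a, b, c, d, e, f, g, h over {0, 1} (start state a), shown next to a triangular table of state pairs (rows b to h, columns a to g) used to mark distinguishable states.]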
                                                              117
  Example of FA Minimization
We finally get:
[Figure: the minimized DFA over {0, 1} with states [a,e] (start), [b,h], [c], [d,f] and [g].]



                                                              118
                          Class Discussion
  Minimize the following DFA:

[Figure: a DFA with states a, b, c, d, e, f, g over {0, 1}.]

                                                  119
Lecture 9
Many languages are not regular, e.g., balanced
parentheses, begin-end matches, {0^n 1^n | n > 0},
{1^(n^2) | n > 0}, …

In this second part, we will study another class of
languages higher in the hierarchy that can
describe a larger set of languages.




                                                      121
 Context Free Grammar (CFG)
CFGs were originally invented for describing natural
languages:
    <sentence> → <noun phrase><verb phrase>
    <noun phrase> → <adjective><noun phrase>
    <noun phrase> → <noun>
    <noun> → boy
    <adjective> → little
<sentence>, <noun phrase>, <verb phrase>, <adjective>
and <noun> are called non-terminals or variables. “boy”
and “little” are called terminals. The rules are called
productions.


                                                          122
     Another Example of CFG


<expression> → <expression> + <expression>
<expression> → <expression> * <expression>
<expression> → (<expression>)
<expression> → id

Variables: <expression>
Terminals: +, *, (, ), id




                                             123
        Formal Definition of CFG
A CFG is denoted by a 4-tuple:
                        (V, T, P, S)
where V is a set of variables (non-terminals)
        T is a set of terminals
        P is a set of productions (rules) of the form:
                          A → α
            where A is a variable and α ∈ (V ∪ T)*
        S is a special variable called the start symbol



                                                         124
           Production Rules
A set of productions:
                        A → α1
                        A → α2
                          …..
                        A → αk
can be written as A → α1 | α2 | ….. | αk
The previous example for <expression> can be
written as E → E+E | E*E | (E) | id



                                               125
                  Derivation
Let's look at the CFG G: E → E+E | E*E | (E) | id
A derivation:
        E ⇒ E * E
          ⇒ (E) * E
          ⇒ (E) * id
          ⇒ ( E + E ) * id
          ⇒ ( E + id ) * id
          ⇒ ( id + id ) * id
We say α ⇒ β if β can be obtained from α by applying a
production once. We say α ⇒* β if β can be obtained from α
by applying productions zero or more times. We write ⇒_G
(or ⇒*_G) to specify clearly which grammar we are using.
                                                            126
Context Free Languages (CFL)
The language generated by a CFG G, denoted by
L(G), is:
           L(G) = { ω | ω ∈ T* and S ⇒*_G ω }
Therefore,
  • every string in L(G) consists solely of terminals, and
  • every string in L(G) can be derived from S.
L(G) is called a context free language (CFL).



                                                    127
           Examples of CFG
Example 1
How can we represent the language {a^n b^n | n > 0}?
Consider the following CFG G1:
    S → aSb
    S → ab
In this example, variables are {S}, terminals are {a,b},
start symbol is S and productions are the above two
rules.
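
A derivation in G1 can be simulated by repeatedly rewriting the single S in the current string. The sketch below is only an illustration; the function name derive is an assumption of this example.

```python
def derive(n):
    """Return the derivation steps for a^n b^n (n >= 1) in grammar G1."""
    form = "S"
    steps = [form]
    for _ in range(n - 1):
        form = form.replace("S", "aSb")  # Rule 1: S -> aSb
        steps.append(form)
    form = form.replace("S", "ab")       # Rule 2: S -> ab
    steps.append(form)
    return steps


print(" => ".join(derive(2)))
print(" => ".join(derive(3)))
```

Rule 1 is used n - 1 times and Rule 2 exactly once, which is why every generated string has the same number of a's and b's.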




                                                           128
         Examples of CFG
Example 1
Rules: (1) S → aSb   (2) S → ab

How to generate aabb?
S ⇒ aSb             (use Rule 1)
  ⇒ aabb            (use Rule 2)

How to generate aaabbb?
S ⇒ aSb             (use Rule 1)
  ⇒ aaSbb           (use Rule 1)
  ⇒ aaabbb          (use Rule 2)


                                             129
                     CFG

Notice that:
 We must start with the start symbol
 We can use any production any number of times.
 The final string can only contain terminals.




                                                   130
             Examples of CFG
Example 2
How to represent the language of balanced
parentheses, i.e., {ε, (), (()), ()(), (()()), …..}?
Consider the following CFG G2:
    S → SS
    S → (S)
    S → ε
In this example, variables are {S}, terminals are {(, )},
start symbol is S and productions are the above three
rules.


                                                            131
           Examples of CFG
Example 2
Rules: (1) S → SS   (2) S → (S)   (3) S → ε

How to generate ()?
 S ⇒ (S)               (use Rule 2)
   ⇒ ()                (use Rule 3)
How to generate (()())?
 S ⇒ (S)               (use Rule 2)
   ⇒ (SS)              (use Rule 1)
   ⇒ ((S)S)            (use Rule 2)
   ⇒ ((S)(S))          (use Rule 2)
   ⇒ (()(S))           (use Rule 3)
   ⇒ (()())            (use Rule 3)

                                                 132
             Examples of CFG
Example 3
How can we represent all the arithmetic expressions
with “plus”, “minus”, “parentheses” and variables x, y
and z, i.e., {x, y, z, x+y, x-y, y+x, x+(y-z), ….. }?
Consider the following CFG G3:
    E → E + E     E → (E)     V → x     V → z
    E → E - E     E → V       V → y
In this example, variables are {E, V}, terminals are {+, -,
(, ), x, y, z}, start symbol is E and productions are the
above seven rules.


                                                              133
                       CFG
Example 3
Rules: (1) E → E + E  (2) E → E - E  (3) E → (E)  (4) E → V
       (5) V → x      (6) V → y      (7) V → z
Note that we can also write the productions as:
       E → E + E | E - E | (E) | V
       V → x | y | z

How to generate x+(y-z)?
 E ⇒ E + E         (use Rule 1)
   ⇒ V + E         (use Rule 4)
   ⇒ x + E         (use Rule 5)
   ⇒ x + (E)       (use Rule 3)
   ⇒ x + (E - E)   (use Rule 2)
   ⇒ x + (V - E)   (use Rule 4)
   ⇒ x + (y - E)   (use Rule 6)
   ⇒ x + (y - V)   (use Rule 4)
   ⇒ x + (y - z)   (use Rule 7)

                                                                  134
           Class Discussion

Write a CFG for the set of all palindromes over
the alphabet Σ = {a, b}.




                                                  135
            Class Discussion

• Consider the grammar:
     S → aB | bA
     A → a | aS | bAA
     B → b | bS | aBB
• Generate 4 strings from this grammar.
• Do you know what language this CFG
  represents?




                                            136
Lecture 10
                 Derivation

Consider the context free grammar G3:
 E → E + E         E → (E)
 E → E - E         V → x
 E → V             V → y
                   V → z
How to derive the string x+(y-z)?



                                        138
  Parse Tree (Derivation Tree)
We can represent the derivation with a tree.

Grammar:
  E → E + E | E - E | (E) | V
  V → x | y | z

Derivation:
  E ⇒ E + E
    ⇒ V + E
    ⇒ x + E
    ⇒ x + (E)
    ⇒ x + (E - E)
    ⇒ x + (V - E)
    ⇒ x + (y - E)
    ⇒ x + (y - V)
    ⇒ x + (y - z)

Parse tree:
            E
          / | \
         E  +  E
         |   / | \
         V  (  E  )
         |   / | \
         x  E  -  E
            |     |
            V     V
            |     |
            y     z

We can get x+(y-z) by reading the leaves from left to
right. This is called the yield of the tree.
                                                           139
                 Parse Tree
Formally, let G = (V, T, P, S) be a CFG. A parse tree
for G must be such that:
• Every vertex has a label from V ∪ T ∪ {ε}.
• The root is labeled S.
• The label of any internal vertex is in V.
• If a vertex is labeled ε, it must be a leaf and
  has no sibling.
• If a vertex is labeled A and its children are
  labeled X1, X2, ..., Xk from left to right, then
  A → X1 X2 … Xk is a production in P.


                                                        140
            Left Derivation
Always derive the leftmost variable first:
  E ⇒ E + E
    ⇒ V + E
    ⇒ x + E
    ⇒ x + (E)
    ⇒ x + (E - E)
    ⇒ x + (V - E)
    ⇒ x + (y - E)
    ⇒ x + (y - V)
    ⇒ x + (y - z)

[Figure: the parse tree for x + (y - z)]
                                               141
           Right Derivation
Always derive the rightmost variable first:
  E ⇒ E + E
    ⇒ E + (E)
    ⇒ E + (E - E)
    ⇒ E + (E - V)
    ⇒ E + (E - z)
    ⇒ E + (V - z)
    ⇒ E + (y - z)
    ⇒ V + (y - z)
    ⇒ x + (y - z)

[Figure: the parse tree for x + (y - z)]
                                              142
                  Ambiguity
• Each parse tree has one unique leftmost
  derivation and one unique rightmost derivation.
• A grammar is ambiguous if some string in its language
  has more than one parse tree, i.e., it has more
  than one leftmost derivation (or more than one
  rightmost derivation).




                                                      143
                        Ambiguity
Consider the following grammar G:
               S → AS | a | b
               A → SS | ab
An output string can have more than one parse tree.
Consider the string abb generated by G:

        S                     S
       / \                   / \
      A   S                 A   S
     / \  |                / \  |
    S   S b               a   b b
    |   |
    a   b
                                                       144
                    Ambiguity
As another example, consider the following grammar:
              E → E + E | E * E | (E) | x | y | z

There are 2 leftmost derivations for x + y + z:

  E ⇒ E + E                    E ⇒ E + E
    ⇒ x + E                      ⇒ E + E + E
    ⇒ x + E + E                  ⇒ x + E + E
    ⇒ x + y + E                  ⇒ x + y + E
    ⇒ x + y + z                  ⇒ x + y + z

[Figure: the two corresponding parse trees, one grouping x + (y + z) and the other (x + y) + z]


                                                      145
Ambiguity of if-then-else statement
  <if-stat> → if <cond> then <stat>
  <if-stat> → if <cond> then <stat> else <stat>
  <cond>    → P | Q
  <stat>    → <if-stat> | R | S

  Consider the following if-statement:
            if P then if Q then R else S
  This statement is ambiguous since it can have two
  parse trees. What are they?


                                                      146
        Simplification of CFG
There are 3 kinds of productions or variables that
we can remove:
• Useless variables
• ε-productions, e.g., A → ε
• Unit productions, e.g., A → B




                                                     147
            Useless Variables
A variable X is useless if:
• (Type-1) X does not generate any string
  of terminals, or
• (Type-2) the start symbol, S, cannot
  generate X.
That is, X is useful only if
                  S ⇒* αXβ ⇒* ω
  where α, β ∈ (V ∪ T)*, X ∈ V and ω ∈ T*.

                                            148
Removal of Type-1 Variables

1 Mark each production of the form:
        X → ω          where ω ∈ T*
2 Repeat
    Mark X → α where α consists of
    terminals or variables which are on
    the left side of some marked
    productions.
  Until no new production is marked.
3 Remove all unmarked productions.
                                          149
     Removal of Type-1 Variables
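For example, starting from:

   S → ABE
   A → a
   B → ε
   C → ED
   D → BC
   E → b

the productions A → a, B → ε and E → b are marked first,
then S → ABE. C → ED and D → BC are never marked, so
they are removed, leaving:

   S → ABE
   A → a
   B → ε
   E → b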


                                   150
 Removal of Type-2 Variables
1 Mark each production of the form:
     S → α          where α ∈ (V ∪ T)*
2 Repeat
    Mark X → β where X appears on
    the right side of some marked
    productions.
  Until no new production is marked.
3 Remove all unmarked productions.
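
The two marking procedures (Type-1, then Type-2) can be sketched as follows. The representation is an assumption made for this example: a grammar is a dict from a variable to a list of right-hand sides, variables are single upper-case letters, terminals are lower-case letters, and ε is the empty string.

```python
def generating(grammar):
    """Variables that can generate some string of terminals (Type-1 check)."""
    gen = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in gen:
                continue
            # A body qualifies if every symbol is a terminal or already generating.
            if any(all(s.islower() or s in gen for s in body) for body in bodies):
                gen.add(head)
                changed = True
    return gen


def reachable(grammar, start):
    """Variables that the start symbol can reach (Type-2 check)."""
    reach, todo = {start}, [start]
    while todo:
        head = todo.pop()
        for body in grammar.get(head, []):
            for s in body:
                if s.isupper() and s not in reach:
                    reach.add(s)
                    todo.append(s)
    return reach


def remove_useless(grammar, start):
    gen = generating(grammar)
    # Type-1: drop productions that mention a non-generating variable.
    g1 = {h: [b for b in bodies if all(s.islower() or s in gen for s in b)]
          for h, bodies in grammar.items() if h in gen}
    # Type-2: drop productions whose left side is unreachable from the start symbol.
    reach = reachable(g1, start)
    return {h: bodies for h, bodies in g1.items() if h in reach}


g = {"S": ["ABE"], "A": ["a"], "B": [""], "C": ["ED"], "D": ["BC"], "E": ["b"]}
print(remove_useless(g, "S"))
```

The sketch removes Type-1 variables first and Type-2 variables second.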
                                       151
     Removal of Type-2 Variables
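For example, starting from:

   S → AAb
   S → AB
   A → a
   B → b | bB
   C → A | B

the S-productions are marked first, then the productions
for A and B. C never appears on the right side of a marked
production, so C → A | B is removed, leaving:

   S → AAb
   S → AB
   A → a
   B → b | bB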



                                          152
                Class Exercise
Consider the following grammar:
     S → A               A → a
     A → BC              C → Ab
     A → aA              D → aD
• Remove type-1 and then type-2 variables.
• Remove type-2 and then type-1 variables.
• Why are they different? Which is correct?
  S -> AB | a
  A -> aA
  B -> b
                                              153
             ε-Productions
A variable A is nullable if:
                        A ⇒* ε
For example:
        S → ABCD
        A → a
        B → ε
        C → ED | ε      B, C and D are nullable
        D → BC
        E → b



                                                 154
           Nullable Variables


How to find nullable variables?
1 Mark all variables A for which there
  exists a production of the form A → ε.
2 Repeat
    Mark X for which there exists X → α
    and all symbols in α have been
    marked.
  Until no new variable is marked.
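
The marking procedure can be sketched as a small fixed-point loop. As an assumption of this example, a grammar is a dict from a variable to a list of right-hand sides, with ε written as the empty string.

```python
def nullable(grammar):
    """Return the set of nullable variables of the grammar."""
    # Step 1: variables with a production A -> ε.
    null = {head for head, bodies in grammar.items() if "" in bodies}
    # Step 2: repeat until no new variable is marked.
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in null:
                continue
            if any(body and all(s in null for s in body) for body in bodies):
                null.add(head)
                changed = True
    return null


g = {"S": ["ABCD"], "A": ["a"], "B": [""], "C": ["ED", ""],
     "D": ["BC"], "E": ["b"]}
print(sorted(nullable(g)))
```

On the example grammar this marks B and C in step 1 and D in step 2; S is not nullable because A is not.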
                                           155
    Removal of ε-Productions
We can remove all ε-productions (except S → ε
if S is nullable) by replacing some productions.
E.g., if X1 and X3 are nullable, we should
replace A → X1X2X3X4 by:
A → X1X2X3X4 | X2X3X4 | X1X2X4 | X2X4
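
The replacement rule amounts to listing every way of leaving out nullable symbols. A minimal sketch, assuming the right-hand side is given as a list of symbol names and the nullable set is already known (the function name expand is hypothetical):

```python
from itertools import product


def expand(body, nullable):
    """All versions of `body` with any subset of nullable symbols omitted."""
    # Each nullable symbol may be kept or dropped; other symbols must be kept.
    choices = [((s,), ()) if s in nullable else ((s,),) for s in body]
    results = []
    for combo in product(*choices):
        new_body = tuple(sym for part in combo for sym in part)
        # Skip the empty right-hand side, which would be an ε-production.
        if new_body and new_body not in results:
            results.append(new_body)
    return results


for rhs in expand(["X1", "X2", "X3", "X4"], {"X1", "X3"}):
    print("A ->", " ".join(rhs))
```

This yields the same four alternatives as above, possibly in a different order.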




                                                   156
  Removal of ε-Productions
For example:
Assume that B, C and D are nullable:
    S → ABCD
    A → a
    B → ε
    C → ED | ε
    D → BC
    E → b



                                        157
           Unit Productions
A production of the form A → B where both A and B
are variables is called a unit production.
If A1 → A2 → A3 → … → Ak → α, where A1, A2, ..., Ak ∈ V
and α ∈ (V ∪ T)* \ V, we can add the productions A1 → α,
A2 → α, …, Ak → α and remove all the unit
productions.
If there is a cycle of unit productions A1 → A2 → A3
→ … → Ak → A1, all occurrences of A1, A2, A3, …, Ak can
be replaced by any one of them.




                                                        158
 Removal of Unit Productions
For example:
  S → A | B
  A → B | C | aB | b
  B → C
  C → B | Aa




                               159
            Class Exercise
Consider the grammar:
                S → ABC
                A → aA | B
                B → ε | B
                C → c | cC
Remove the ε-productions and all the unit
productions.




                                            160
Lecture 11
    Chomsky Normal Form (CNF)

A CFG is in Chomsky Normal Form if all its productions are
of the form:
              A → BC or
              A → a
where A, B, C ∈ V and a ∈ T. Also, S → ε may be one of the
productions.




                                                        162
             Examples of CNF

Example 1:   S → AB
             A → BC | CC | a
             B → CB | b
             C → c
Example 2:   S → AB | BC | AC | ε
             A → BC | a
             B → AC | b
             C → AB | c




                                    163
                      CNF
Can every context free grammar be expressed in
Chomsky Normal Form?

Consider the following simple grammar:
                      A → cA | a
                      B → ABC | b
                      C → c
How to convert this grammar to CNF?




                                                        164
            Conversion into CNF
Step 1: Convert every production into either:
         A → B1B2…Bn or
         A → a
   e.g. A → bCDeF becomes:
         A → BCDEF
         B → b
         E → e




                                                165
           Conversion into CNF
Step 2: Convert each production of the form A → B1B2…Bn (n > 2)
   into productions of the form A → C1C2:
   e.g. A → BCDEF becomes:
         A → BX
         X → CY
         Y → DZ
         Z → EF
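
Step 2 can be sketched as a loop that peels off one symbol at a time and introduces a new variable for the rest. The function name binarize and the iterator of fresh variable names are assumptions of this example; the fresh names must not already occur in the grammar.

```python
def binarize(head, body, fresh):
    """Split head -> B1 B2 ... Bn (n > 2) into a chain of two-symbol productions."""
    rules = []
    while len(body) > 2:
        new_var = next(fresh)                     # variable for the remaining symbols
        rules.append((head, [body[0], new_var]))
        head, body = new_var, body[1:]
    rules.append((head, list(body)))
    return rules


for left, right in binarize("A", list("BCDEF"), iter("XYZ")):
    print(left, "->", "".join(right))
```

A production with n symbols on the right needs n - 2 new variables, here X, Y and Z.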




                                                       166
            Class Exercise
Convert the following CFG into Chomsky Normal
Form:
               S → ε
               S → ABBA
               B → bCb
               A → a
               C → c




                                                167
   Greibach Normal Form (GNF)

A CFG is in Greibach Normal Form if all its productions are
of the form:
              A → aα
where A ∈ V, a ∈ T and α ∈ V*. Also, S → ε may be one of
the productions.




                                                          168
             Examples of GNF

Example 1:   S → aABC
             A → aA | a
             B → bB | b
             C → cC | c
Example 2:   S → bAB | ε
             A → aBAA | aAAB | a
             B → bABB | bBBA | b




                                   169
                      GNF
Can every context free grammar be expressed in
Greibach Normal Form?

The transformation can be done by a combination of
substitution and removal of left recursion.




                                                        170
              Substitution
Replace:
     A → αBβ
     B → B1 | B2 | … | Bn
by:
     A → αB1β | αB2β | … | αBnβ

where         A, B ∈ V
              α, β, Bi ∈ (V ∪ T)*




                                  171
           Substitution
For example:
     S → TT
     T → S | aSb | c
can be replaced by:
     S → ST | aSbT | cT
     T → S | aSb | c




                          172
            Conversion to GNF

Our goal is to remove all the productions of the form:
                A → Bα
where A, B ∈ V and α ∈ V*, by substitution, to convert the
productions to GNF. However, we need to take care of
productions of the form:
                A → Aα
This is called left recursion.




                                                            173
   Removal of Left Recursion
Replace:
     X → β1 | β2 | … | βn
     X → Xα1 | Xα2 | … | Xαm
where X ∈ V and αi, βi ∈ (V ∪ T)*, such that βi does
not start with X,
by:
     X → β1 | β2 | … | βn
     X → β1Z | β2Z | … | βnZ
     Z → α1 | α2 | … | αm
     Z → α1Z | α2Z | … | αmZ
where Z is a new variable.




                                                       174
   Removal of Left Recursion
For example:
     E → E+T
     E → T
can be replaced by:
     E → T | TZ
     Z → +T | +TZ




                               175
   Removal of Left Recursion
Another example:
     S → c | Sa | SbA
     A → c
can be replaced by:
     S → c | cZ
     Z → a | bA | aZ | bAZ
     A → c
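
The replacement rule for left recursion can be sketched in a few lines. This is a minimal illustration which assumes that every symbol is a single character, that right-hand sides are plain strings, and that the new variable name z is not used elsewhere in the grammar.

```python
def remove_left_recursion(x, bodies, z):
    """Rewrite the productions of x so that none of them starts with x."""
    alphas = [b[1:] for b in bodies if b.startswith(x)]   # from X -> X alpha
    betas = [b for b in bodies if not b.startswith(x)]    # from X -> beta
    if not alphas:
        return {x: bodies}                                # nothing to do
    return {
        x: betas + [b + z for b in betas],                # X -> beta | beta Z
        z: alphas + [a + z for a in alphas],              # Z -> alpha | alpha Z
    }


print(remove_left_recursion("E", ["E+T", "T"], "Z"))
print(remove_left_recursion("S", ["c", "Sa", "SbA"], "Z"))
```

Both calls reproduce the two examples above.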




                               176
          Conversion into GNF

Assume that the grammar is given originally in Chomsky
Normal Form. For example:
     A → BC
     B → CA | b
     C → AB | a




                                                         177
          Conversion into GNF

Step 1: Rename variables and impose an arbitrary order on
them. Starting from:
     A → BC
     B → CA | b
     C → AB | a
we obtain:
     A1 → A2A3
     A2 → A3A1 | b
     A3 → A1A2 | a




                                                        178
           Conversion into GNF

Step 2: Transform the resulting CFG into one in which all
productions of the form Ai → Ajα have i < j. This is
done by substitution and removal of left recursion.
Starting from:
     A1 → A2A3
     A2 → A3A1 | b
     A3 → A1A2 | a
substitution into the A3-production gives:
     A1 → A2A3
     A2 → A3A1 | b
     A3 → A3A1A3A2 | bA3A2 | a




                                                            179
           Conversion into GNF

Step 2 (cont’d): Remove any left recursion.
Starting from:
     A1 → A2A3
     A2 → A3A1 | b
     A3 → A3A1A3A2 | bA3A2 | a
we obtain:
     A1 → A2A3
     A2 → A3A1 | b
     A3 → bA3A2 | a | bA3A2B3 | aB3
     B3 → A1A3A2 | A1A3A2B3




                                                             180
                      Conversion into GNF
Step 3: Substitute in the backward direction, i.e. Ak,
Ak-1, …, A1 and then the B’s.
Starting from:
     A1 → A2A3
     A2 → A3A1 | b
     A3 → bA3A2 | a | bA3A2B3 | aB3
     B3 → A1A3A2 | A1A3A2B3
we obtain:
     A1 → bA3A2A1A3 | aA1A3 | bA3A2B3A1A3 |
          aB3A1A3 | bA3
     A2 → bA3A2A1 | aA1 | bA3A2B3A1 | aB3A1 | b
     A3 → bA3A2 | a | bA3A2B3 | aB3
     B3 → bA3A2A1A3A3A2 | aA1A3A3A2 |
          bA3A2B3A1A3A3A2 | aB3A1A3A3A2 |
          bA3A3A2 | bA3A2A1A3A3A2B3 |
          aA1A3A3A2B3 | bA3A2B3A1A3A3A2B3 |
          aB3A1A3A3A2B3 | bA3A3A2B3
                                                                       181
Lecture 12
             Class Exercises

Convert the following grammar to GNF:
          S → XaS | YbS | ε
          X → Yb | a
          Y → Xa | b




                                        183
What have we covered for CFL?
• Grammar specification
• Derivation tree
• Simplification of CFGs:
   – Removal of useless symbols, ε-productions
     and unit productions.
• Chomsky Normal Form
• Greibach Normal Form



                                                 184
        Pushdown Automata


L = {w c w^R | w ∈ (a+b)*} is a CFL:

          S → aSa | bSb | c

How to write a program to recognize L?




                                         185
              Pushdown Automata

It’s easier to use a stack:

    Input: abbacabba   (scan from left to right)

    Before seeing a “c”:
    - Push a 0 whenever we see an “a”.
    - Push a 1 whenever we see a “b”.

    Stack after reading “abb” (top first): 1, 1, 0

                                                           186
          Pushdown Automata
    Input: abbacabba   (scan from left to right)

    Stack after reading “abba” (top first): 0, 1, 1, 0

    After seeing a “c”:
    - Pop whenever we see an “a” and
      check if the stack top is a 0.
    - Pop whenever we see a “b” and
      check if the stack top is a 1.

Accept if all the checks succeed and the stack
is empty at the end.
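
The stack procedure above translates almost line by line into a short program. This is a sketch only; the function name accepts_wcwr is an assumption of this example, and a Python list plays the role of the stack.

```python
def accepts_wcwr(text):
    """Recognize {w c w^R | w in (a+b)*} with an explicit stack."""
    stack = []
    seen_c = False
    for ch in text:
        if ch == "c":
            if seen_c:            # a second "c" is not allowed
                return False
            seen_c = True
        elif ch in ("a", "b"):
            bit = 0 if ch == "a" else 1
            if not seen_c:
                stack.append(bit)                 # before "c": push
            elif not stack or stack.pop() != bit:
                return False                      # after "c": pop and compare
        else:
            return False          # symbol outside the alphabet
    return seen_c and not stack   # a "c" was seen and the stack is empty


print(accepts_wcwr("abbacabba"), accepts_wcwr("abcab"))
```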
                                                             187
          Pushdown Automata
A pushdown automaton (PDA) M is a system
(Q, Σ, Γ, δ, q0, Z0, F):
   • Q is a finite set of states;
   • Σ is an alphabet, called the input alphabet;
   • Γ is an alphabet, called the stack alphabet;
   • q0 in Q is the initial state;
   • Z0 in Γ is the start symbol;
   • F ⊆ Q is a set of final states;
   • δ is a mapping from Q × (Σ ∪ {ε}) × Γ to finite
     subsets of Q × Γ*.
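
One possible way to hold the seven components in a program is sketched below. The class name PDA, its field names, the validate method and the use of the empty string for ε are all assumptions made for illustration. Stack symbols are assumed to be single characters, so that a string of stack symbols can be an ordinary Python string.

```python
from dataclasses import dataclass

EPSILON = ""  # assumption: the empty string stands for ε


@dataclass
class PDA:
    states: set           # Q
    input_alphabet: set   # Σ
    stack_alphabet: set   # Γ
    delta: dict           # δ: (q, a, Z) -> set of (p, γ), γ written top first
    start_state: str      # q0
    start_symbol: str     # Z0
    final_states: set     # F

    def validate(self) -> None:
        """Raise ValueError if the components do not fit the definition."""
        if self.start_state not in self.states:
            raise ValueError("q0 must belong to Q")
        if self.start_symbol not in self.stack_alphabet:
            raise ValueError("Z0 must belong to Γ")
        if not self.final_states <= self.states:
            raise ValueError("F must be a subset of Q")
        for (q, a, top), moves in self.delta.items():
            if q not in self.states or top not in self.stack_alphabet:
                raise ValueError(f"bad argument of δ: {(q, a, top)}")
            if a != EPSILON and a not in self.input_alphabet:
                raise ValueError(f"{a!r} is neither ε nor a symbol of Σ")
            for p, gamma in moves:
                if p not in self.states:
                    raise ValueError(f"unknown state in move {(p, gamma)}")
                if any(s not in self.stack_alphabet for s in gamma):
                    raise ValueError(f"unknown stack symbol in move {(p, gamma)}")
```

Using a dictionary whose values are sets mirrors the definition directly: a missing key plays the role of the empty set.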

         Pushdown Automata
There are three things in a PDA:
• A state (initially q0)
• A stack (initially holding only Z0)
• An input tape, with an input head that scans from left to right

                    Moves
The mapping δ defines the change from one
configuration to another:
      δ(q, a, Z) = {(p1, γ1), (p2, γ2), …, (pm, γm)}
where q and the pi’s are states, a is an input symbol, Z is
a stack symbol and the γi’s are strings of stack symbols,
means that the PDA in state q, with input symbol a
and Z at the top of the stack, can, for any i, enter state pi,
replace Z by the string γi and advance the input head
one symbol.

                     Moves

      δ(q, ε, Z) = {(p1, γ1), (p2, γ2), …, (pm, γm)}

means that the PDA in state q with Z at the top of the
stack can, independent of the input being scanned and
for any i, enter state pi and replace Z by the string γi;
the input head does not advance.
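
The two kinds of moves can be sketched as one function that lists every configuration reachable in a single step. The name step and the data layout are assumptions for illustration: δ is a dictionary from (state, input symbol or "", stack top) to a set of (state, string) pairs, "" stands for ε, and a stack is a string with its top symbol first.

```python
def step(delta, config):
    """Yield every configuration reachable from config in one move."""
    state, remaining, stack = config
    if not stack:
        return                      # no move is defined on an empty stack
    top, below = stack[0], stack[1:]
    # Moves that read one input symbol and advance the input head.
    if remaining:
        for p, gamma in delta.get((state, remaining[0], top), ()):
            yield (p, remaining[1:], gamma + below)
    # ε-moves: the input head stays where it is.
    for p, gamma in delta.get((state, "", top), ()):
        yield (p, remaining, gamma + below)


# A two-entry table, made up only to show the call.
example = {("q0", "a", "S"): {("q0", "0S")}, ("q1", "", "S"): {("q1", "")}}
print(list(step(example, ("q0", "ab", "S"))))  # [('q0', 'b', '0S')]
print(list(step(example, ("q1", "ab", "S"))))  # [('q1', 'ab', '')]
```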




         Acceptance by PDA
The language accepted is the set of all input strings
for which some sequence of moves causes the PDA to
empty its stack after scanning the whole input,

                        OR

the set of all input strings for which some sequence
of moves causes the PDA to enter a final state after
scanning the whole input.

                 An Example
The pushdown automaton M = (Q, Σ, Γ, δ, q0, Z0, F)
recognizes L = {wcw^R | w ∈ (a+b)*} where
  Q = {q0, q1};
  Σ = {a, b, c};
  Γ = {0, 1, S};
  q0 is the initial state;
  Z0 = S is the start symbol;
  F = {} (M accepts by empty stack).

                 An Example
The mapping δ is defined as follows:
δ(q0, a, 0) = {(q0, 00)}         δ(q0, c, 0) = {(q1, 0)}
δ(q0, a, 1) = {(q0, 01)}         δ(q0, c, 1) = {(q1, 1)}
δ(q0, a, S) = {(q0, 0S)}         δ(q0, c, S) = {(q1, S)}
δ(q0, b, 0) = {(q0, 10)}         δ(q1, a, 0) = {(q1, ε)}
δ(q0, b, 1) = {(q0, 11)}         δ(q1, b, 1) = {(q1, ε)}
δ(q0, b, S) = {(q0, 1S)}         δ(q1, ε, S) = {(q1, ε)}
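
Because at most one move applies in every configuration of this PDA, it can be run by simply following that move. The sketch below encodes the table as a Python dictionary, with "" standing for ε and each stack string written top first. The names DELTA, trace and accepts are assumptions for illustration, and the loop assumes the machine never makes ε-moves forever, which holds for this table.

```python
DELTA = {
    ("q0", "a", "0"): {("q0", "00")}, ("q0", "c", "0"): {("q1", "0")},
    ("q0", "a", "1"): {("q0", "01")}, ("q0", "c", "1"): {("q1", "1")},
    ("q0", "a", "S"): {("q0", "0S")}, ("q0", "c", "S"): {("q1", "S")},
    ("q0", "b", "0"): {("q0", "10")}, ("q1", "a", "0"): {("q1", "")},
    ("q0", "b", "1"): {("q0", "11")}, ("q1", "b", "1"): {("q1", "")},
    ("q0", "b", "S"): {("q0", "1S")}, ("q1", "", "S"): {("q1", "")},
}


def trace(delta, state, stack, word):
    """Follow the single applicable move until none applies; return all configurations."""
    configs = [(state, word, stack)]
    while stack:
        top = stack[0]
        if (state, "", top) in delta:                   # ε-move
            key = (state, "", top)
        elif word and (state, word[0], top) in delta:   # move on the next input symbol
            key = (state, word[0], top)
            word = word[1:]
        else:
            break                                       # stuck: no move applies
        (state, pushed), = delta[key]                   # exactly one choice
        stack = pushed + stack[1:]
        configs.append((state, word, stack))
    return configs


def accepts(delta, start_state, start_symbol, word):
    """Acceptance by empty stack: input consumed and stack empty at the end."""
    _, rest, stack = trace(delta, start_state, start_symbol, word)[-1]
    return rest == "" and stack == ""


for config in trace(DELTA, "q0", "S", "abbacabba"):
    print(config)
print(accepts(DELTA, "q0", "S", "abbacabba"))  # True
```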




                          An Example
           Consider an input abbacabba:

Initial state: q0
Input tape: abbacabba (input head on the first symbol)
Stack: S

The mapping δ is the one defined above.
This PDA is deterministic.

                Class Exercise
Try to trace the configurations for the inputs:
  aabbcbbaa
  aabcaabb

using the mapping δ of the example above.

            Another Example
The pushdown automaton M = (Q, Σ, Γ, δ, q0, Z0, F)
recognizes L1 = {ww^R | w ∈ (a+b)*} where
  Q = {q0, q1};
  Σ = {a, b};
  Γ = {0, 1, S};
  q0 is the initial state;
  Z0 = S is the start symbol;
  F = {} (M accepts by empty stack).

             Another Example
The mapping δ is defined as follows:
δ(q0, a, 0) = {(q0, 00), (q1, ε)}    δ(q1, a, 0) = {(q1, ε)}
δ(q0, a, 1) = {(q0, 01)}             δ(q1, b, 1) = {(q1, ε)}
δ(q0, a, S) = {(q0, 0S)}             δ(q1, ε, S) = {(q1, ε)}
δ(q0, b, 0) = {(q0, 10)}             δ(q0, ε, S) = {(q1, ε)}
δ(q0, b, 1) = {(q0, 11), (q1, ε)}
δ(q0, b, S) = {(q0, 1S)}
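
Here some configurations offer two moves, so a program has to try both. The sketch below does a breadth-first search over configurations and returns one accepting sequence if there is one. The names DELTA and accepting_run are assumptions for illustration, "" stands for ε, and stacks are strings with the top symbol first. The search terminates for this table because its ε-moves only pop; that is not guaranteed for an arbitrary PDA.

```python
from collections import deque

DELTA = {
    ("q0", "a", "0"): {("q0", "00"), ("q1", "")},
    ("q0", "a", "1"): {("q0", "01")},
    ("q0", "a", "S"): {("q0", "0S")},
    ("q0", "b", "0"): {("q0", "10")},
    ("q0", "b", "1"): {("q0", "11"), ("q1", "")},
    ("q0", "b", "S"): {("q0", "1S")},
    ("q1", "a", "0"): {("q1", "")},
    ("q1", "b", "1"): {("q1", "")},
    ("q1", "", "S"): {("q1", "")},
    ("q0", "", "S"): {("q1", "")},
}


def accepting_run(delta, start_state, start_symbol, word):
    """Return a list of configurations ending with empty input and empty stack, or None."""
    start = (start_state, word, start_symbol)
    parent = {start: None}            # also serves as the set of visited configurations
    queue = deque([start])
    while queue:
        config = queue.popleft()
        state, rest, stack = config
        if not rest and not stack:    # accepted by empty stack
            run = []
            while config is not None:
                run.append(config)
                config = parent[config]
            return run[::-1]
        if not stack:
            continue                  # stack emptied too early: dead end
        top, below = stack[0], stack[1:]
        successors = []
        if rest:
            successors += [(p, rest[1:], g + below)
                           for p, g in delta.get((state, rest[0], top), ())]
        successors += [(p, rest, g + below)
                       for p, g in delta.get((state, "", top), ())]
        for nxt in successors:
            if nxt not in parent:
                parent[nxt] = config
                queue.append(nxt)
    return None


print(accepting_run(DELTA, "q0", "S", "abba"))
print(accepting_run(DELTA, "q0", "S", "aba"))   # None
```

In the run printed for abba, the switch from q0 to q1 happens exactly at the middle of the input; every other choice of switching point leads to a dead end.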




                    Another Example
           Consider an input abbaabba:

Initial state: q0
Input tape: abbaabba (input head on the first symbol)
Stack: S

The mapping δ is the one defined above.
This PDA is nondeterministic: in state q0 it has to guess
where the middle of the input is.

Lecture 13
           Deterministic and
         Non-deterministic PDA

A PDA is deterministic if both of the following are true:


• For each q in Q and Z in Γ, whenever δ(q, ε, Z) is
  nonempty, then δ(q, a, Z) is empty for all a in Σ.

• For no q in Q, Z in Γ and a in Σ ∪ {ε} does δ(q, a,
  Z) contain more than one element.
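
The two conditions can be checked mechanically. In this sketch the function name is_deterministic is an assumption, δ is a dictionary from (state, input symbol or "", stack top) to a set of moves, and "" stands for ε.

```python
def is_deterministic(delta, input_alphabet):
    """Check the two conditions for a deterministic PDA."""
    for (state, symbol, top), moves in delta.items():
        # Second condition: never more than one choice.
        if len(moves) > 1:
            return False
        # First condition: an ε-move rules out every input move on the same (q, Z).
        if symbol == "" and moves:
            if any(delta.get((state, a, top)) for a in input_alphabet):
                return False
    return True


# Small tables made up for the demonstration.
one_choice = {("q1", "a", "0"): {("q1", "")}, ("q1", "", "S"): {("q1", "")}}
two_choices = {("q0", "a", "0"): {("q0", "00"), ("q1", "")}}
eps_conflict = {("q0", "", "S"): {("q1", "")}, ("q0", "a", "S"): {("q0", "0S")}}
print(is_deterministic(one_choice, {"a", "b"}))    # True
print(is_deterministic(two_choices, {"a", "b"}))   # False
print(is_deterministic(eps_conflict, {"a", "b"}))  # False
```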



          Deterministic and
        Non-deterministic PDA
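Deterministic PDAs can accept only a proper subset of the CFLs; e.g.,
L1 = {ww^R | w ∈ (a+b)*} cannot be accepted by any deterministic PDA.

Therefore, PDAs and DPDAs are not equivalent: nondeterministic
PDAs accept strictly more languages.

Unless otherwise stated, we assume that a PDA is
nondeterministic.
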
            Group Discussion

Can you define a PDA for the language:
                  L3 = {a^n b^n | n ≥ 0}




  Instantaneous Description


An instantaneous description (ID) describes the
configuration:
                          (q, w, γ)
where q is a state, w is the string of remaining
input and γ is the string of symbols on the stack
(top symbol first), and:
                (q, w, γ) ⊢_M (q1, w1, γ1)

if we can move from the configuration (q, w, γ) to the
configuration (q1, w1, γ1) in one step.

      Instantaneous Description

If δ(q, a, Z) contains (p, β), where a is an input symbol or ε:
                     (q, aw, Zα) ⊢_M (p, w, βα)
Note that the subscript M can be dropped if it is clear which PDA we are
referring to.

⊢*_M denotes the reflexive and transitive closure of ⊢_M.
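
As a worked illustration of this notation, here is the sequence of IDs for the short input aca on the PDA for {wcw^R} defined earlier, written in LaTeX (the align* environment assumes the amsmath package). Each step names the transition that justifies it.

```latex
\begin{align*}
(q_0,\ aca,\ S) &\vdash (q_0,\ ca,\ 0S)
    && \text{by } \delta(q_0, a, S) = \{(q_0, 0S)\} \\
  &\vdash (q_1,\ a,\ 0S)
    && \text{by } \delta(q_0, c, 0) = \{(q_1, 0)\} \\
  &\vdash (q_1,\ \varepsilon,\ S)
    && \text{by } \delta(q_1, a, 0) = \{(q_1, \varepsilon)\} \\
  &\vdash (q_1,\ \varepsilon,\ \varepsilon)
    && \text{by } \delta(q_1, \varepsilon, S) = \{(q_1, \varepsilon)\}
\end{align*}
```

Chaining the four steps gives (q0, aca, S) ⊢* (q1, ε, ε): the input is consumed and the stack is empty.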




                   Acceptance

• A pushdown automaton can accept either by empty
  stack or by final state.
• For acceptance by empty stack, the set of final states F
  is taken to be empty.
• Acceptance by empty stack and acceptance by final
  state are equivalent.
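
Written with instantaneous descriptions, the two notions of acceptance are usually stated as follows for M = (Q, Σ, Γ, δ, q0, Z0, F). N(M) is the language accepted by empty stack and L(M) the language accepted by final state. The LaTeX below is a standard textbook formulation, not copied from these slides, and assumes the amsmath package.

```latex
\begin{align*}
N(M) &= \{\, w \in \Sigma^* \mid (q_0, w, Z_0) \vdash^* (p, \varepsilon, \varepsilon)
        \text{ for some } p \in Q \,\} \\
L(M) &= \{\, w \in \Sigma^* \mid (q_0, w, Z_0) \vdash^* (p, \varepsilon, \gamma)
        \text{ for some } p \in F,\ \gamma \in \Gamma^* \,\}
\end{align*}
```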




  Acceptance by Empty Stack
Any PDA M1 that accepts by empty stack can be
converted to a PDA M2 that accepts by final
state, i.e., N(M1) = L(M2).

• Put a new start symbol X0 at the bottom of the stack.
• Simulate exactly the operations of M1 above X0.
• Remove X0 and enter a final state when X0 is
  seen again.

   Acceptance by Empty Stack
Let M1 = (Q, Σ, Γ, δ, q0, Z0, F) accept by empty stack.
Construct M2 = (Q ∪ {q0’, qf}, Σ, Γ ∪ {X0}, δ’, q0’, X0, {qf})
that accepts by final states as follows:

• δ’(q0’, ε, X0) = {(q0, Z0X0)}
• δ’(q, a, Z) includes all the elements of δ(q, a, Z)
  for all q in Q, a in Σ ∪ {ε} and Z in Γ.
• For all q in Q, δ’(q, ε, X0) contains (qf, ε).
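
The three rules translate almost line by line into code. In this sketch the function name, the parameter names and the default names "q0'", "qf" and "X" are assumptions for illustration. δ is a dictionary whose values are sets, "" stands for ε, and stack symbols are single characters, so Z0 and X0 are written Z and X. The new names are assumed not to clash with anything in M1.

```python
def empty_stack_to_final_state(delta, states, q0, z0,
                               new_start="q0'", final="qf", bottom="X"):
    """Build δ' of M2 (final-state acceptance) from δ of M1 (empty-stack acceptance)."""
    # Rule 2: keep every move of M1.
    new_delta = {key: set(moves) for key, moves in delta.items()}
    # Rule 1: first put Z0 on top of the new bottom marker.
    new_delta[(new_start, "", bottom)] = {(q0, z0 + bottom)}
    # Rule 3: seeing the marker again means M1 has emptied its stack.
    for q in states:
        new_delta[(q, "", bottom)] = {(final, "")}
    return new_delta, new_start, bottom, {final}


# A small M1, made up for the demonstration.
m1 = {("q0", "a", "Z"): {("q0", "Z"), ("q0", "")},
      ("q0", "b", "Z"): {("q0", "Z")}}
delta2, start2, bottom2, finals2 = empty_stack_to_final_state(m1, {"q0"}, "q0", "Z")
for key in sorted(delta2):
    print(key, "->", sorted(delta2[key]))
```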




                   An Example
For example, let M1 = (Q, Σ, Γ, δ, q0, Z0, F) where
 Q = {q0};
 Σ = {a, b};
 Γ = {Z0};
 F = {};
δ(q0, a, Z0) = {(q0, Z0), (q0, ε)}
δ(q0, b, Z0) = {(q0, Z0)}

What is M2?

                  An Example
          M1                                 M2
Q = {q0};                            Q = {q0, q0’, qf};
Σ = {a, b};                          Σ = {a, b};
Γ = {Z0};                            Γ = {Z0, X0};
F = {};                              F = {qf};
δ(q0, a, Z0) = {(q0, Z0), (q0, ε)}   δ’(q0’, ε, X0) = {(q0, Z0X0)}
δ(q0, b, Z0) = {(q0, Z0)}            δ’(q0, a, Z0) = {(q0, Z0), (q0, ε)}
                                     δ’(q0, b, Z0) = {(q0, Z0)}
                                     δ’(q0, ε, X0) = {(qf, ε)}

  Acceptance by Final States
Any PDA M1 that accepts by final state can be
converted to a PDA M2 that accepts by empty
stack, i.e., L(M1) = N(M2).

• Put a new start symbol X0 at the bottom of the stack,
  so that M2 does not accept by accident when M1
  empties its own stack.
• Simulate exactly the operations of M1 above X0.
• Pop the whole stack when entering a final state.

    Acceptance by Final States
Let M1 = (Q, Σ, Γ, δ, q0, Z0, F) accept by final states.
Construct M2 = (Q ∪ {q0’, qe}, Σ, Γ ∪ {X0}, δ’, q0’, X0, ∅) that
accepts by empty stack as follows:

• δ’(q0’, ε, X0) = {(q0, Z0X0)}
• δ’(q, a, Z) includes all the elements of δ(q, a, Z)
  for all q in Q, a in Σ ∪ {ε} and Z in Γ.
• For all q in F and Z in Γ ∪ {X0}, δ’(q, ε, Z) contains
  (qe, ε).
• For all Z in Γ ∪ {X0}, δ’(qe, ε, Z) contains (qe, ε).
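
This construction can be sketched in the same style. The function name, the parameter names and the default names "q0'", "qe" and "X" are assumptions for illustration. As before, δ is a dictionary whose values are sets, "" stands for ε, and stack symbols are single characters, so Z0 and X0 are written Z and X.

```python
def final_state_to_empty_stack(delta, stack_alphabet, final_states, q0, z0,
                               new_start="q0'", eraser="qe", bottom="X"):
    """Build δ' of M2 (empty-stack acceptance) from δ of M1 (final-state acceptance)."""
    # Rule 2: keep every move of M1.
    new_delta = {key: set(moves) for key, moves in delta.items()}
    # Rule 1: first put Z0 on top of the new bottom marker.
    new_delta[(new_start, "", bottom)] = {(q0, z0 + bottom)}
    for z in set(stack_alphabet) | {bottom}:
        # Rule 3: from a final state, M2 may pop and switch to the erasing state.
        for q in final_states:
            new_delta.setdefault((q, "", z), set()).add((eraser, ""))
        # Rule 4: the erasing state pops whatever is left.
        new_delta[(eraser, "", z)] = {(eraser, "")}
    return new_delta, new_start, bottom


# A small M1 with F = {q1}, made up for the demonstration.
m1 = {("q0", "a", "Z"): {("q1", "0")}, ("q1", "a", "0"): {("q1", "0")},
      ("q0", "b", "Z"): {("q0", "0")}, ("q1", "b", "0"): {("q1", "0")}}
delta2, start2, bottom2 = final_state_to_empty_stack(m1, {"0", "Z"}, {"q1"}, "q0", "Z")
for key in sorted(delta2):
    print(key, "->", sorted(delta2[key]))
```

Rule 3 uses setdefault because M1 may already have ε-moves out of a final state; those must be kept alongside the new move to qe.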


                  An Example
For example, let M1 = (Q, Σ, Γ, δ, q0, Z0, F) where
 Q = {q0, q1};
 Σ = {a, b};
 Γ = {0, Z0};
 F = {q1};
δ(q0, a, Z0) = {(q1, 0)}          δ(q1, a, 0) = {(q1, 0)}
δ(q0, b, Z0) = {(q0, 0)}          δ(q1, b, 0) = {(q1, 0)}


What is M2?

                           An Example
          M1                                     M2
Q = {q0, q1};              Q = {q0, q1, q0’, qe};
Σ = {a, b};                Σ = {a, b};
Γ = {0, Z0};               Γ = {0, Z0, X0};
F = {q1};                  F = {};
δ(q0, a, Z0) = {(q1, 0)}   δ’(q0’, ε, X0) = {(q0, Z0X0)}
δ(q1, a, 0) = {(q1, 0)}    δ’(q0, a, Z0) = {(q1, 0)}    δ’(q1, ε, Z0) = {(qe, ε)}
δ(q0, b, Z0) = {(q0, 0)}   δ’(q1, a, 0) = {(q1, 0)}     δ’(q1, ε, 0) = {(qe, ε)}
δ(q1, b, 0) = {(q1, 0)}    δ’(q0, b, Z0) = {(q0, 0)}    δ’(qe, ε, X0) = {(qe, ε)}
                           δ’(q1, b, 0) = {(q1, 0)}     δ’(qe, ε, Z0) = {(qe, ε)}
                           δ’(q1, ε, X0) = {(qe, ε)}    δ’(qe, ε, 0) = {(qe, ε)}


				
Shared by: fanzhongqing. Posted: 5/19/2012. Pages: 214.