                                 Push-Down Automata

                 COMP2600 — Formal Methods for Software Engineering

                                       Clem Baker-Finch

                                  Australian National University
                                       Semester 2, 2008

        General Structure of Automata

        [Figure: an input tape holding a0 a1 a2 ... an with a read head, attached to a finite state control and an auxiliary memory.]

        The input tape is a sequence of tokens.
        Each time a symbol is processed the read head advances.

        The auxiliary memory is usually a linear organisation (e.g. a stack).
        The memory alphabet is usually Vt ∪ Vn (the terminal and non-terminal symbols).

        The finite state control can be in any one of a finite number of states.

COMP 2600 — Push-down Automata                                                        1   COMP 2600 — Push-down Automata                                                           3

        Languages and Automata

        Recall that to define a language we can either:

         1. Give a set of rules (i.e. a grammar) to produce all the legal strings
             (sentences) of the language.

         2. Provide a machine (i.e. an algorithm) to recognise all the sentences of
             the language.

        There is a close relationship between the two approaches. Commonly we
        define a language by giving a grammar and then base parsers (or compilers)
        on the corresponding machine.

        The machines are automata like Turing machines, but restricted in ways
        that correspond to the levels of the Chomsky hierarchy.

        General Automata ctd

        Each action of the machine may change the FSC state, change the auxiliary
        memory, and advance to the next input symbol.

        The action of the machine depends on the current FSC state, the current
        input symbol, and the current memory symbol(s).

        The machine starts in some particular start state (q0), with the read head at
        the first input symbol (a0) and with the memory empty.

        A machine accepts an input string as a sentence of the language if it
        reaches a goal state with the input exhausted.

        Automata and Grammars

        The kind of auxiliary memory in a machine determines the class of
        languages that the machine can recognise:

                       Language Class      Memory
                       regular             none
                       context-free        stack
                       context-sensitive   tape (bounded by input length)
                       unrestricted        unbounded tape

        We have already looked at Finite State Automata (i.e. automata without
        memory) and their relation to regular languages.

        We now consider Push-Down Automata (i.e. automata with stack memory)
        and their relation to context-free grammars and languages.

        PDAs ctd

        Each action of the machine may involve changing the FSC state, pushing
        or popping the stack, and advancing to the next input symbol.

        The action of the machine may depend on the current FSC state, the
        current input symbol, and the current top-of-stack symbol.

        The machine accepts an input string if it reaches a specified goal state, with
        the input exhausted and the stack empty.


        Push-down Automata — PDA

        [Figure: an input tape holding a0 a1 a2 ... an with a read head, attached to a finite state control and a stack (symbols zk, ..., z2 shown).]

        Example

               { a^n b^n | n ∈ N }

        Recall that this language cannot be recognised by a FSA (because there
        can only be a finite number of states). But it can be recognised by a PDA.

        Ad hoc design:

          • phase 1: (state q1) push each a onto the stack
          • phase 2: (state q2) pop one a for each b on the input
          • finalise: if the stack is empty and the input is exhausted in the goal state
            (q3), accept the string.
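The ad hoc design can be sketched directly with a Python list as the stack. This is only an illustration: the function name `accepts_anbn` is made up here, and the sketch assumes n ≥ 1, matching the transition table given for this example, which needs a first a to leave q0.

```python
def accepts_anbn(tokens: str) -> bool:
    """Ad hoc stack-based check for strings of the form a^n b^n (assuming n >= 1)."""
    stack = []
    i = 0
    # Phase 1 (state q1): push each leading a.
    while i < len(tokens) and tokens[i] == "a":
        stack.append("a")
        i += 1
    if not stack:
        return False  # no leading a, so phase 1 never started
    # Phase 2 (state q2): pop one a for each b on the input.
    while i < len(tokens) and tokens[i] == "b" and stack:
        stack.pop()
        i += 1
    # Finalise (state q3): accept only with empty stack and exhausted input.
    return not stack and i == len(tokens)


print(accepts_anbn("aaabbb"))
print(accepts_anbn("aaba"))
```

The two phases never interleave: once the first b is seen, any later a leaves input unread and the string is rejected.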

        Example ctd

        PDA transitions modify the stack as well as change the FSC state, so we
        write transitions as a function δ of type:

               δ : (state, input token, tos) → (state, string)

        Here tos is the top-of-stack symbol. The string in the result gives the
        symbols that replace the top-of-stack symbol. (This notational device makes
        it simple to specify pushes and pops in a uniform way: replacing a by aa
        pushes an a, and replacing a by ε pops it.)

        To simplify (the notation for) testing for empty stack, assume a marker
        symbol Z is initially on the stack.

        Example ctd — PDA Trace

        PDA configurations can be written as a triple (state, remaining input, stack)
        with the top of stack to the left.

               (q0, aaabbb, Z) ⇒ (q1, aabbb, aZ)
                               ⇒ (q1, abbb, aaZ)
                               ⇒ (q1, bbb, aaaZ)
                               ⇒ (q2, bb, aaZ)
                               ⇒ (q2, b, aZ)
                               ⇒ (q2, ε, Z)
                               ⇒ (q3, ε, ε)

        The machine halts in the goal state with input exhausted, so the string is
        accepted.


        Example ctd

        PDA to recognise a^n b^n:

          δ(q0, a, Z) = q1/aZ          ···   push first a
          δ(q1, a, a) = q1/aa          ···   push a's
          δ(q1, b, a) = q2/ε           ···   start popping a's
          δ(q2, b, a) = q2/ε           ···   pop a's
          δ(q2, ε, Z) = q3/ε           ···   accept

        Example ctd — Rejection

        The string aaba should be rejected by the PDA:

               (q0, aaba, Z) ⇒ (q1, aba, aZ)
                             ⇒ (q1, ba, aaZ)
                             ⇒ (q2, a, aZ)
                             ⇒ ???

        No transition applies, and the PDA is “stuck” without reaching a goal state.
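The transition table and the two traces can be reproduced with a small table-driven simulator. This is a sketch under stated assumptions, not part of the course material: the names `DELTA` and `run` are hypothetical, the empty string `""` stands for ε, stacks are strings with the top at the left, and the loop only suits deterministic tables like this one (it prefers a transition that reads input over an ε-move).

```python
# (state, input token or "" for ε, top of stack) -> (new state, replacement string)
DELTA = {
    ("q0", "a", "Z"): ("q1", "aZ"),  # push first a
    ("q1", "a", "a"): ("q1", "aa"),  # push a's
    ("q1", "b", "a"): ("q2", ""),    # start popping a's
    ("q2", "b", "a"): ("q2", ""),    # pop a's
    ("q2", "", "Z"): ("q3", ""),     # accept
}


def run(tokens, start="q0", goal="q3"):
    """Return (accepted, trace); trace lists configurations (state, input, stack)."""
    state, rest, stack = start, tokens, "Z"
    trace = [(state, rest, stack)]
    while stack:
        top = stack[0]
        if rest and (state, rest[0], top) in DELTA:
            state, replacement = DELTA[(state, rest[0], top)]
            rest = rest[1:]  # the read head advances
        elif (state, "", top) in DELTA:
            state, replacement = DELTA[(state, "", top)]
        else:
            break  # no transition applies: the PDA is stuck
        stack = replacement + stack[1:]
        trace.append((state, rest, stack))
    accepted = state == goal and rest == "" and stack == ""
    return accepted, trace


for word in ("aaabbb", "aaba"):
    ok, trace = run(word)
    print(word, "accepted" if ok else "rejected")
    for configuration in trace:
        print("   ", configuration)
```

Acceptance is checked after the loop, so a run that gets stuck, or that empties the stack with input left over, is rejected.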

        Grammars and PDAs

        Theorem

        The class of languages recognised by PDAs is exactly the class of
        context-free languages.

        We will only justify this result in one direction: for any CFG, there is a
        corresponding PDA.

        This is the most interesting direction since it is the basis of automatically
        deriving parsers from grammars.

        From CFG to PDA, ctd

         3. Initialise the process by pushing S onto the stack. For start symbol S:

                    δ(q0, ε, Z) = q1/SZ

         4. For termination, add the transition:

                    δ(q1, ε, Z) = q2/ε

        In general we get a non-deterministic PDA since there may be several
        productions for each non-terminal.

        Unfortunately, there is no general algorithm for obtaining a deterministic PDA
        from a non-deterministic one; some context-free languages are not recognised
        by any deterministic PDA.


        From CFG to PDA

        The translation uses three states: q0 (initial), q1 (processing), q2 (goal).

         1. For all terminal symbols t, pop the stack if it matches the input:

                    δ(q1, t, t) = q1/ε

         2. If a non-terminal is on top of stack, expand it to one of its right-hand
             sides. For all productions A → α:

                    δ(q1, ε, A) = q1/α

                                                                            continued...

        Example — Derive a PDA for a CFG

               E → T | E + T
               T → F | T ∗ F
               F → id | (E)

        1. Match and pop terminals:

               δ(q1, +, +) = q1/ε
               δ(q1, ∗, ∗) = q1/ε
               δ(q1, id, id) = q1/ε
               δ(q1, (, () = q1/ε
               δ(q1, ), )) = q1/ε
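The four translation rules (match terminals, expand non-terminals, initialise, terminate) are mechanical enough to write as a function. The sketch below is illustrative only: `cfg_to_pda`, `GRAMMAR` and `TERMINALS` are names invented here, symbols are Python strings, `""` stands for ε, ASCII `*` stands for ∗, and each key maps to a list of results because the PDA is non-deterministic.

```python
def cfg_to_pda(productions, start, terminals):
    """Build transitions (state, input, top of stack) -> [(state, pushed symbols), ...]."""
    delta = {}

    def add(key, value):
        delta.setdefault(key, []).append(value)

    # 1. Match and pop terminals.
    for t in terminals:
        add(("q1", t, t), ("q1", ()))
    # 2. Expand a non-terminal on top of the stack to one of its right-hand sides.
    for lhs, rhs in productions:
        add(("q1", "", lhs), ("q1", tuple(rhs)))
    # 3. Initialise: push the start symbol above the marker Z.
    add(("q0", "", "Z"), ("q1", (start, "Z")))
    # 4. Terminate: pop the marker and move to the goal state.
    add(("q1", "", "Z"), ("q2", ()))
    return delta


GRAMMAR = [
    ("E", ["T"]), ("E", ["E", "+", "T"]),
    ("T", ["F"]), ("T", ["T", "*", "F"]),
    ("F", ["id"]), ("F", ["(", "E", ")"]),
]
TERMINALS = ["+", "*", "id", "(", ")"]

delta = cfg_to_pda(GRAMMAR, "E", TERMINALS)
for key, results in delta.items():
    for state, pushed in results:
        print(key, "->", state, "/", " ".join(pushed) or "ε")
```

Each non-terminal of the example grammar ends up with two entries under one key, which is exactly where the non-determinism of the resulting PDA comes from.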

        CFG to PDA ctd

        2. Expand non-terminals:

               δ(q1, ε, E) = q1/T
               δ(q1, ε, E) = q1/E + T
               δ(q1, ε, T) = q1/F
               δ(q1, ε, T) = q1/T ∗ F
               δ(q1, ε, F) = q1/id
               δ(q1, ε, F) = q1/(E)

        3,4. Initiate and terminate:

               δ(q0, ε, Z) = q1/EZ
               δ(q1, ε, Z) = q2/ε

        Example Parse, ctd

        Notes:

          • The parse was guided through the non-determinism (by me, the Oracle)
            to always make the correct choice towards a successful parse.
          • In practical terms states q0 and q2 and the initialisation and termination
            transitions are unnecessary.
          • The stack always contains the unmatched part of the sentential form.
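Without an Oracle, one way to handle the non-determinism is to explore every choice. The sketch below does a breadth-first search over configurations (position in the input, stack) for the expression grammar. It is an illustration under assumptions, not the course's method: the names `GRAMMAR` and `accepts` are invented here, ASCII `*` stands for ∗, states q0 and q2 and the marker Z are left out, and the pruning rule relies on this grammar having no ε-productions, so every stack symbol must still consume at least one token.

```python
from collections import deque

GRAMMAR = {
    "E": [("T",), ("E", "+", "T")],
    "T": [("F",), ("T", "*", "F")],
    "F": [("id",), ("(", "E", ")")],
}


def accepts(tokens, start="E"):
    """Search all expand/match choices; True if some run empties input and stack."""
    tokens = tuple(tokens)
    first = (0, (start,))
    seen = {first}
    queue = deque([first])
    while queue:
        pos, stack = queue.popleft()
        if not stack:
            if pos == len(tokens):
                return True  # input exhausted and stack empty
            continue
        top, rest = stack[0], stack[1:]
        if top in GRAMMAR:
            # Expand: try every right-hand side (the non-deterministic choice).
            successors = [(pos, rhs + rest) for rhs in GRAMMAR[top]]
        elif pos < len(tokens) and tokens[pos] == top:
            # Match and pop a terminal.
            successors = [(pos + 1, rest)]
        else:
            successors = []  # stuck
        for conf in successors:
            # Prune stacks longer than the remaining input; this also stops
            # the left-recursive productions from expanding forever.
            if len(conf[1]) <= len(tokens) - conf[0] and conf not in seen:
                seen.add(conf)
                queue.append(conf)
    return False


print(accepts(["id", "*", "id"]))
print(accepts(["id", "*"]))
```

The search is exhaustive rather than efficient; practical parsers instead restrict the grammar so that the next choice can be made deterministically.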


        Example Parse

               (q0, id ∗ id, ε) ⇒ (q1, id ∗ id, E)
                                ⇒ (q1, id ∗ id, T)
                                ⇒ (q1, id ∗ id, T ∗ F)
                                ⇒ (q1, id ∗ id, F ∗ F)
                                ⇒ (q1, id ∗ id, id ∗ F)
                                ⇒ (q1, ∗ id, ∗ F)
                                ⇒ (q1, id, F)
                                ⇒ (q1, id, id)
                                ⇒ (q1, ε, ε)
                                ⇒ (q2, ε, ε)
                                ⇒ accept

        The bottom-of-stack marker Z is left implicit in this trace.

        A Context-Sensitive Language

        Just for completeness, a brief look at context-sensitive languages.

        The following language is not context-free:

               { a^n b^n c^n | n ∈ N }

        Intuitively, we can imagine a CFG generating either the ab pairs or the bc
        pairs, but this language requires us to keep the generation process in step
        at two different points in the sentential forms.

        A context-sensitive grammar is on the next slide. Each production is of the
        form

               αAβ → αγβ

        That is, A can go to γ provided it is in the context α _ β.

               S → aRc
               R → aRT | b
               bTc → bbcc
               bTT → bbUT
               UT → UU
               UUc → VUc → Vcc
               UV → VV
               bVc → bbcc
               bVV → bbWV
               WV → WW
               WWc → TWc → Tcc
               WT → TT

        CSGs and Automata

        The automata that recognise the languages of CSGs have a tape memory whose
        length is bounded by a linear function of the length of the input ...


        The trick is to use non-terminals as markers and to shift and convert them to
        tokens. For example:

               S → aRc
                 → aaRTc
                 → aaaRTTc
                 → aaabTTc
                 → aaabbUTc
                 → aaabbUUc
                 → aaabbVUc
                 → aaabbVcc
                 → aaabbbccc
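The derivation can be replayed as plain string rewriting, which makes it easy to check each step against the grammar. In this sketch the names `STEPS` and `derive` are invented for illustration, sentential forms are Python strings, and each production is applied once at the leftmost occurrence of its left-hand side (an assumption that happens to match every step of this derivation).

```python
# Productions used in the derivation, in order, as (left-hand side, right-hand side).
STEPS = [
    ("S", "aRc"),
    ("R", "aRT"),
    ("R", "aRT"),
    ("R", "b"),
    ("bTT", "bbUT"),
    ("UT", "UU"),
    ("UUc", "VUc"),
    ("VUc", "Vcc"),
    ("bVc", "bbcc"),
]


def derive(start, steps):
    """Apply each production once, at the leftmost occurrence of its left-hand side."""
    forms = [start]
    for lhs, rhs in steps:
        current = forms[-1]
        if lhs not in current:
            raise ValueError(f"{lhs} does not occur in {current}")
        forms.append(current.replace(lhs, rhs, 1))
    return forms


forms = derive("S", STEPS)
print(" → ".join(forms))
```

Only the first four steps rewrite a single symbol regardless of its neighbours; the remaining steps depend on the surrounding context, which is what a context-free grammar cannot express.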
