									          From regular expressions to automata

   Marcello Bonsangue1,2                 Jan Rutten1,3              Alexandra Silva1

                        1 Centrum  voor Wiskunde en Informatica
                              2 LIACS  - Leiden University
                            3 Vrije Universiteit Amsterdam



                                    January 2009




Alexandra Silva (CWI)        From regular expressions to automata             January 2009   1/8
Motivation




Context: Regular expressions
Goal: Decide whether r1 = r2.
Usual approach:




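From Regular Expressions to Deterministic Automata

     Direct method: Brzozowski derivatives

     Example: (ab + b)∗ ba. The states are the derivatives of the expression.
     The initial state is (ab + b)∗ ba and the only accepting state is
     b(ab + b)∗ ba + 1.

          State                  on a                   on b
          (ab + b)∗ ba           b(ab + b)∗ ba          (ab + b)∗ ba + a
          (ab + b)∗ ba + a       b(ab + b)∗ ba + 1      (ab + b)∗ ba + a
          b(ab + b)∗ ba          0                      (ab + b)∗ ba
          b(ab + b)∗ ba + 1      0                      (ab + b)∗ ba

Problems:
 1   Comparing derivatives is very expensive.
 2   The method does not scale well (cf. CIRC/KAT).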
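
The direct method can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in CIRC: the tuple encoding and the names alt, cat, star, deriv and build_dfa are hypothetical, and sums are stored as frozensets so that derivatives are compared modulo ACI.

```python
# Regular expressions as nested tuples (hypothetical encoding).
# Sums are frozensets, so they are compared modulo ACI.
ZERO, ONE = ("0",), ("1",)

def sym(c):
    return ("sym", c)

def alt(*rs):
    items = set()
    for r in rs:
        if r[0] == "alt":
            items |= r[1]          # flatten nested sums
        elif r != ZERO:
            items.add(r)           # r + 0 = r
    if not items:
        return ZERO
    if len(items) == 1:
        return next(iter(items))
    return ("alt", frozenset(items))

def cat(r, s):
    if r == ZERO or s == ZERO:
        return ZERO
    if r == ONE:
        return s
    if s == ONE:
        return r
    return ("cat", r, s)

def star(r):
    return ("star", r)

def nullable(r):
    tag = r[0]
    if tag in ("1", "star"):
        return True
    if tag in ("0", "sym"):
        return False
    if tag == "alt":
        return any(nullable(x) for x in r[1])
    return nullable(r[1]) and nullable(r[2])   # cat

def deriv(r, a):
    """Brzozowski derivative of r with respect to the letter a."""
    tag = r[0]
    if tag in ("0", "1"):
        return ZERO
    if tag == "sym":
        return ONE if r[1] == a else ZERO
    if tag == "alt":
        return alt(*(deriv(x, a) for x in r[1]))
    if tag == "star":
        return cat(deriv(r[1], a), r)
    left = cat(deriv(r[1], a), r[2])           # cat
    return alt(left, deriv(r[2], a)) if nullable(r[1]) else left

def build_dfa(r, alphabet):
    """Explore all derivatives of r; each distinct derivative is a state."""
    states, todo, delta = {r}, [r], {}
    while todo:
        q = todo.pop()
        for a in alphabet:
            d = deriv(q, a)
            delta[(q, a)] = d
            if d not in states:
                states.add(d)
                todo.append(d)
    return states, delta, {q for q in states if nullable(q)}

# (ab + b)* ba
E = cat(star(alt(cat(sym("a"), sym("b")), sym("b"))), cat(sym("b"), sym("a")))
states, delta, accepting = build_dfa(E, "ab")

def accepts(word):
    q = E
    for c in word:
        q = delta[(q, c)]
    return q in accepting
```

On this example the exploration finds five states: the four derivatives listed above plus the dead state 0. The cost named under Problems shows up in the test d not in states, which compares whole expressions.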

More efficient algorithms


 1   Partial derivative (Antimirov)
 2   Continuation automaton (Berry-Sethi)
 3   Position automaton (Berry-Sethi, Glushkov, McNaughton-Yamada)




Continuation/Position automaton

    Basic idea: Assume that all occurrences of letters are different.

                            (ab + b)∗ ba → (a1 b2 + b3)∗ b4 a5


    Let the magic begin: example on the board.
    Key: The states are known from the beginning!

Theorem
Let all symbols in E be distinct. Given any symbol a, for all strings w,
(wa)^{-1} E is either 0 or unique modulo ACI (associativity, commutativity
and idempotence of +).

    Nice property: all the transitions entering a state have the same
    label.
    Not so nice: the continuations need to be computed explicitly.
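
The marking step can be illustrated with a small Python sketch. It assumes the expression is given as a plain string in which every alphabetic character is a letter of the alphabet. The function name mark and the returned dictionary are hypothetical, chosen only for this example.

```python
def mark(regex):
    """Number every letter occurrence from left to right.

    Returns the marked expression and a map position -> original letter.
    Assumption: every alphabetic character in `regex` is a letter.
    """
    out, letter_of, pos = [], {}, 0
    for ch in regex:
        if ch.isalpha():
            pos += 1
            letter_of[pos] = ch
            out.append(f"{ch}{pos}")
        else:
            out.append(ch)
    return "".join(out), letter_of

marked, letter_of = mark("(ab+b)*ba")
# One state per position, plus an extra initial state 0.
states = [0] + sorted(letter_of)
```

Here marked is "(a1b2+b3)*b4a5" and states is [0, 1, 2, 3, 4, 5], so the state set is fixed before any derivative or continuation is computed.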

Position automaton

Definition
Let E be a regular expression and Ē the corresponding marked
expression.
                 first(E)     = {i | a_i w ∈ L(Ē)}
                 follow(E, i) = {j | u a_i a_j v ∈ L(Ē)}
                 last(E)      = {i | w a_i ∈ L(Ē)}
The position automaton for E is defined as:

                       A_pos(E) = (pos(E), Σ, δ_pos, 0, last(E))

where
                      δ_pos = {(i, a, j) | j ∈ follow(E, i), a = a_j}

and, by the usual convention, 0 is the initial state with
follow(E, 0) = first(E).

Remark: Berry-Sethi provide an efficient algorithm to compute follow.
Example on the board.
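
The definition can be tried out with the following Python sketch. It computes first, follow and last by a plain structural recursion over the marked expression. This is not the efficient Berry-Sethi algorithm mentioned in the remark. The tuple encoding and the names analyse and position_automaton are assumptions made for this example. State 0 is added as initial state with follow(E, 0) = first(E), and is made final when the expression accepts the empty word.

```python
from collections import defaultdict

def analyse(e, follow, letter_of):
    """Return (nullable, first, last) of e; fill follow and letter_of."""
    tag = e[0]
    if tag == "pos":                       # ("pos", i, letter)
        _, i, letter = e
        letter_of[i] = letter
        return False, {i}, {i}
    if tag == "alt":
        n1, f1, l1 = analyse(e[1], follow, letter_of)
        n2, f2, l2 = analyse(e[2], follow, letter_of)
        return n1 or n2, f1 | f2, l1 | l2
    if tag == "cat":
        n1, f1, l1 = analyse(e[1], follow, letter_of)
        n2, f2, l2 = analyse(e[2], follow, letter_of)
        for i in l1:                       # last of left meets first of right
            follow[i] |= f2
        first = (f1 | f2) if n1 else f1
        last = (l1 | l2) if n2 else l2
        return n1 and n2, first, last
    if tag == "star":
        _, f, l = analyse(e[1], follow, letter_of)
        for i in l:                        # loop back to the start
            follow[i] |= f
        return True, f, l
    raise ValueError(tag)

def position_automaton(e):
    follow, letter_of = defaultdict(set), {}
    nullable, first, last = analyse(e, follow, letter_of)
    follow[0] = set(first)                 # convention: follow(E, 0) = first(E)
    states = {0} | set(letter_of)
    delta = {(i, letter_of[j], j) for i in states for j in follow[i]}
    final = set(last) | ({0} if nullable else set())
    return states, delta, 0, final

def p(i, letter):
    return ("pos", i, letter)

# (a1 b2 + b3)* b4 a5
marked = ("cat",
          ("star", ("alt", ("cat", p(1, "a"), p(2, "b")), p(3, "b"))),
          ("cat", p(4, "b"), p(5, "a")))
states, delta, initial, final = position_automaton(marked)
```

For this expression the sketch yields six states, eleven transitions and the single final state 5. Every transition into state j carries the letter at position j.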

Language equivalence for NFA


Definition (Q1 ≡ Q2)
Let A1 = (Σ, S1, I1, δ1, F1) and A2 = (Σ, S2, I2, δ2, F2) be NFAs.
For Q1 ⊆ S1 and Q2 ⊆ S2, Q1 ≡ Q2 iff
  1   Q1 ∩ F1 = ∅ ⇔ Q2 ∩ F2 = ∅
  2   δ1[Q1](a) ≡ δ2[Q2](a), for all a ∈ Σ.

Theorem
                              L(A1) = L(A2) ⇔ I1 ≡ I2

Worst case = determinization + bisimilarity
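
A minimal Python sketch of this check follows. It explores pairs (Q1, Q2) of state sets on the fly, starting from (I1, I2), and fails as soon as condition 1 is violated. The encoding of an NFA as a triple (initial states, transition dictionary, final states) and the names step and equivalent are assumptions made for this example. The two automata are the position automaton and the derivative automaton of (ab + b)∗ ba, written out by hand.

```python
from collections import deque

def step(delta, Q, a):
    """delta[Q](a): all states reachable from a state in Q by letter a."""
    return frozenset(t for q in Q for t in delta.get((q, a), ()))

def equivalent(nfa1, nfa2, alphabet):
    """Each NFA is (initial, delta, final); delta maps (state, letter)
    to a set of states, and final is a set."""
    I1, d1, F1 = nfa1
    I2, d2, F2 = nfa2
    start = (frozenset(I1), frozenset(I2))
    seen, todo = {start}, deque([start])
    while todo:
        Q1, Q2 = todo.popleft()
        if bool(Q1 & F1) != bool(Q2 & F2):     # condition 1 fails
            return False
        for a in alphabet:                     # condition 2: check successors
            nxt = (step(d1, Q1, a), step(d2, Q2, a))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True

# Position automaton of (a1 b2 + b3)* b4 a5.
pos_nfa = ({0},
           {(0, "a"): {1}, (0, "b"): {3, 4}, (1, "b"): {2},
            (2, "a"): {1}, (2, "b"): {3, 4},
            (3, "a"): {1}, (3, "b"): {3, 4}, (4, "a"): {5}},
           {5})

# Derivative automaton of (ab + b)* ba, dead state omitted.
# 0: (ab+b)*ba   1: b(ab+b)*ba   2: (ab+b)*ba + a   3: b(ab+b)*ba + 1
der_dfa = ({0},
           {(0, "a"): {1}, (0, "b"): {2}, (1, "b"): {0},
            (2, "a"): {3}, (2, "b"): {2}, (3, "b"): {0}},
           {3})

same = equivalent(pos_nfa, der_dfa, "ab")
# Changing the final states of the second automaton breaks the equivalence.
different = equivalent(pos_nfa, (der_dfa[0], der_dfa[1], {2}), "ab")
```

The set seen can hold exponentially many pairs, which is the determinization part of the worst case. Only the pairs reachable from (I1, I2) are ever built.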




Conclusions




Can we implement these algorithms in CIRC?
Can we extend them to KAT?



