From regular expressions to automata

Marcello Bonsangue (1,2), Jan Rutten (1,3), Alexandra Silva (1)
(1) Centrum voor Wiskunde en Informatica
(2) LIACS - Leiden University
(3) Vrije Universiteit Amsterdam
January 2009

Alexandra Silva (CWI) From regular expressions to automata January 2009

Motivation

Context: Regular expressions
Goal: Decide $r_1 = r_2$.
Usual approach:

From regular expressions to deterministic automata

Direct method: Brzozowski derivatives

$(ab + b)^* ba$

[Diagram: automaton whose states are the derivatives $(ab + b)^* ba$, $(ab + b)^* ba + a$, $b(ab + b)^* ba$ and $b(ab + b)^* ba + 1$, with transitions labelled $a$ and $b$.]

Problems:
1. Comparing derivatives is very expensive.
2. The method does not scale well (cf. Circ/KAT).

More efficient algorithms

1. Partial derivatives (Antimirov)
2. Continuation automaton (Berry-Sethi)
3. Position automaton (Berry-Sethi, Glushkov, McNaughton-Yamada)

Continuation/Position automaton

Basic idea: Assume that all occurrences of letters are different.

$(ab + b)^* ba \to (a_1 b_2 + b_3)^* b_4 a_5$

Let the magic begin: example on the board.

Key: The states are known from the beginning!

Theorem. Let all symbols in $E$ be distinct. Given any symbol $a$, for all strings $w$, $(wa)^{-1} E$ is either $0$ or unique modulo ACI (associativity, commutativity and idempotence of $+$).

Nice property: all the transitions entering a state have the same label.
Not so nice: the continuations need to be computed explicitly.
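
The theorem above is phrased in terms of word derivatives $(wa)^{-1} E$. As a minimal sketch of how such derivatives are computed, the following Python fragment implements Brzozowski derivatives for the running example $(ab + b)^* ba$. The tuple encoding and all function names are assumptions made for illustration and do not come from the slides. The smart constructors only simplify with the units 0 and 1 and with $r + r = r$, which is weaker than full ACI normalisation.

```python
# Regular expressions as nested tuples (an encoding assumed for this sketch).
ZERO, ONE = ("zero",), ("one",)

def sym(a):
    return ("sym", a)

def alt(r, s):
    # r + s, simplified with 0 + s = s, r + 0 = r and r + r = r
    if r == ZERO:
        return s
    if s == ZERO or r == s:
        return r
    return ("alt", r, s)

def cat(r, s):
    # r . s, simplified with 0 . s = 0 = r . 0 and 1 . s = s, r . 1 = r
    if r == ZERO or s == ZERO:
        return ZERO
    if r == ONE:
        return s
    if s == ONE:
        return r
    return ("cat", r, s)

def star(r):
    return ("star", r)

def nullable(r):
    # True iff the empty word belongs to L(r)
    kind = r[0]
    if kind in ("one", "star"):
        return True
    if kind in ("zero", "sym"):
        return False
    if kind == "alt":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # cat

def deriv(r, a):
    # Brzozowski derivative of r with respect to the letter a
    kind = r[0]
    if kind in ("zero", "one"):
        return ZERO
    if kind == "sym":
        return ONE if r[1] == a else ZERO
    if kind == "alt":
        return alt(deriv(r[1], a), deriv(r[2], a))
    if kind == "star":
        return cat(deriv(r[1], a), r)
    left = cat(deriv(r[1], a), r[2])  # cat case
    return alt(left, deriv(r[2], a)) if nullable(r[1]) else left

def matches(r, word):
    # Derive letter by letter, then test whether the result accepts the empty word
    for a in word:
        r = deriv(r, a)
    return nullable(r)

# The running example (ab + b)* ba
E = cat(star(alt(cat(sym("a"), sym("b")), sym("b"))), cat(sym("b"), sym("a")))
```

With this encoding, `deriv(E, "b")` is the expression `E + a`, which corresponds to the state $(ab + b)^* ba + a$ in the diagram of the Brzozowski slide. Deciding when two derivatives denote the same state is the expensive comparison that slide lists as its first problem.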
Position automaton

Definition. Let $E$ be a regular expression and $\overline{E}$ the corresponding marked expression.

$\mathrm{first}(E) = \{ i \mid a_i w \in L(\overline{E}) \}$
$\mathrm{follow}(E, i) = \{ j \mid u a_i a_j v \in L(\overline{E}) \}$
$\mathrm{last}(E) = \{ i \mid w a_i \in L(\overline{E}) \}$

The position automaton for $E$ is defined as
$A_{pos}(E) = (\mathrm{pos}(E), \Sigma, \delta_{pos}, 0, \mathrm{last}(E))$
where $\delta_{pos} = \{ (i, a, j) \mid j \in \mathrm{follow}(E, i),\ a = a_j \}$.

Remark: Berry-Sethi provide an efficient algorithm to compute follow.

Example on the board.
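
The position automaton definition (first, follow, last and $\delta_{pos}$) can be sketched in Python as follows. This is a rough illustration under stated assumptions, not Berry-Sethi's efficient algorithm: the sets are computed with the standard inductive rules over the syntax tree. The tuple encoding, the function names, the convention follow(E, 0) = first(E) for the initial state 0, and the rule that state 0 is accepting when the empty word is in the language are all assumptions of this sketch rather than statements from the slides.

```python
def sym(letter, pos):
    # A marked symbol: the letter together with its position index
    return ("sym", letter, pos)

def positions(e):
    # Yield (position, letter) for every marked symbol in e
    if e[0] == "sym":
        yield e[2], e[1]
    else:
        for sub in e[1:]:
            yield from positions(sub)

def analyse(e, follow):
    # Return (nullable, first, last) of e and fill the follow table in place
    kind = e[0]
    if kind == "sym":
        follow.setdefault(e[2], set())
        return False, {e[2]}, {e[2]}
    if kind == "star":
        _, first, last = analyse(e[1], follow)
        for i in last:              # the loop: last positions are followed by first ones
            follow[i] |= first
        return True, first, last
    n1, f1, l1 = analyse(e[1], follow)
    n2, f2, l2 = analyse(e[2], follow)
    if kind == "alt":
        return n1 or n2, f1 | f2, l1 | l2
    for i in l1:                    # cat: last of the left part is followed by first of the right
        follow[i] |= f2
    return n1 and n2, f1 | (f2 if n1 else set()), l2 | (l1 if n2 else set())

def position_automaton(e):
    follow = {}
    nullable, first, last = analyse(e, follow)
    letter = dict(positions(e))
    follow[0] = set(first)          # assumed convention: follow(E, 0) = first(E)
    finals = set(last) | ({0} if nullable else set())
    delta = {(i, letter[j], j) for i, js in follow.items() for j in js}
    return delta, finals

def accepts(delta, finals, word):
    # Run the NFA from the initial state 0
    current = {0}
    for a in word:
        current = {j for (i, b, j) in delta if i in current and b == a}
    return bool(current & finals)

# Marked running example (a1 b2 + b3)* b4 a5
F = ("alt", ("cat", sym("a", 1), sym("b", 2)), sym("b", 3))
E = ("cat", ("cat", ("star", F), sym("b", 4)), sym("a", 5))
delta, finals = position_automaton(E)
```

Every transition into state $j$ carries the letter $a_j$, so all transitions entering a state share the same label, and the state set is simply the positions 0 to 5 of the marked expression.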
Language equivalence for NFA

Definition ($Q_1 \equiv Q_2$). Let $A_1 = (\Sigma, S_1, I_1, \delta_1, F_1)$ and $A_2 = (\Sigma, S_2, I_2, \delta_2, F_2)$ be NFAs. For $Q_1 \subseteq S_1$ and $Q_2 \subseteq S_2$, $Q_1 \equiv Q_2$ iff
1. $Q_1 \cap F_1 = \emptyset \Leftrightarrow Q_2 \cap F_2 = \emptyset$
2. $\delta_1[Q_1](a) \equiv \delta_2[Q_2](a)$, for all $a \in \Sigma$.

Theorem. $L(A_1) = L(A_2) \Leftrightarrow I_1 \equiv I_2$

Worst case = determinization + bisimilarity

Conclusions

Can we implement these algorithms in CIRC?
Can we extend them to KAT?
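
As a rough indication of what an implementation involves, the $Q_1 \equiv Q_2$ definition from the "Language equivalence for NFA" slide can be read as a worklist procedure over pairs of state sets. The sketch below is in Python, not CIRC. The representation of an NFA as a triple (initial states, transition dictionary, final states) and the function names are assumptions made for illustration. Pairs are explored on the fly, so the full subset construction is only reached in the worst case.

```python
from collections import deque

def step(delta, states, a):
    # delta[Q](a): all states reachable from the set Q by reading the letter a
    return frozenset(t for s in states for t in delta.get((s, a), ()))

def equivalent(nfa1, nfa2, alphabet):
    (init1, delta1, final1), (init2, delta2, final2) = nfa1, nfa2
    start = (frozenset(init1), frozenset(init2))
    seen, todo = {start}, deque([start])
    while todo:
        q1, q2 = todo.popleft()
        # Condition 1: Q1 meets F1 exactly when Q2 meets F2
        if bool(q1 & final1) != bool(q2 & final2):
            return False
        # Condition 2: the a-successors must be related again, for every letter a
        for a in alphabet:
            nxt = (step(delta1, q1, a), step(delta2, q2, a))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True

# Hypothetical example automata over the alphabet {a}
loop = ({0}, {(0, "a"): {0}}, {0})                           # accepts a*
two_final = ({0}, {(0, "a"): {1}, (1, "a"): {0}}, {0, 1})    # also accepts a*
even = ({0}, {(0, "a"): {1}, (1, "a"): {0}}, {0})            # accepts (aa)*
```

Here `equivalent(loop, two_final, "a")` holds while `equivalent(loop, even, "a")` fails on the pair reached after reading a single `a`.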