Posted on 2/13/2012. Public Domain.
Regular Expressions
Lisa Higham

• Automata (hardware)
  • good for recognizing (deciding) a given language
  • does the machine accept a given string?
• Software models
  • good for generating strings in a language
  • produce strings that are in a given language

For regular languages there are two simple, natural ways to generate strings:
• regular grammars (later)
• regular expressions (today)

Regular expressions are defined inductively:
• say what the basic building blocks are
• say how to combine building blocks to get more elaborate members

Let Σ be an alphabet.
Basis:
• each symbol a in Σ is a regular expression
• ∅ is a regular expression
Inductive part: If α and β are regular expressions over Σ, then:
• (α + β) is a regular expression
• (αβ) is a regular expression
• (α)* is a regular expression
Closure: Nothing else is a regular expression.

Regular expressions represent languages

  Regular Expression    Language represented
  a                     {a}
  ∅                     ∅
  (α + β)               L(α) ∪ L(β)
  (αβ)                  L(α) L(β)
  (α)*                  L(α)*

Regular expressions examples
Let Σ = {a,b}

  Regular Expression:    ( ((a+b)(aa)) (a+b) )
  Language represented:  ({a} ∪ {b}){aa}({a} ∪ {b}) = Σ{aa}Σ
  i.e. the set of all strings of length four such that the middle two symbols are both "a"

  Regular Expression:    ( ( ( ((a+b)*) (aa) ) (a+b)) + (∅)* )
  Language represented:  ({a} ∪ {b})*{aa}({a} ∪ {b}) ∪ {ε} = Σ*{aa}Σ ∪ {ε}
  i.e. the set of all strings that end in aab or aaa, or the empty string

Shortcuts and Simplifications
1. Use precedence to remove parentheses (highest precedence first):
     *   star
         concatenation
     +   union
   Similar to arithmetic: xy + 7 means ((xy) + 7)
     ( ( ( ((a+(b+c))*) (aa) ) (a+b)) + (∅)* )
   becomes
     (a+b+c)* aa (a+b) + ∅*

2. Abbreviations:
     ε abbreviates ∅*
     Σ abbreviates (a + b + c + d) for Σ = {a, b, c, d}
     (a+b+c+d)* aa (a+b) + ∅*
   becomes
     Σ* aa (a+b) + ε

3. Cautions:
     Regular Expression    Simplified
     R ε                   R
     R + ∅                 R
     R ∅                   ∅

Regular expressions: more examples
Let Σ = {a,b}
Let A = { w in {a,b}* | w contains abba }; reg. exp.
for A: Σ* abba Σ*
Let B = { w in {a,b}* | |w| = 7 and the middle symbol of w is b }; reg. exp. for B: ΣΣΣ b ΣΣΣ

Regular expressions are strings
A regular expression over alphabet Σ is itself a string over the new alphabet Σ':
  Σ' = Σ ∪ { +, *, (, ) } ∪ { ∅, ε, Σ }
But not all strings over this alphabet are regular expressions!
So we could define a new language:
  R(Σ) = { w in Σ'* | w is a valid regular expression }
Challenge question: Is R(Σ) a regular language?

Regular expressions capture regular languages
Theorem: A language L is regular iff there exists a regular expression R such that L = L(R).

Recall
• Definition: A language L is regular if there exists a DFA that accepts exactly the strings in L.
• Theorem: Any NFA has an equivalent DFA.

Thus:
• A language L is regular if there exists a DFA that accepts exactly the strings in L.
• A language L is regular if there exists an NFA that accepts exactly the strings in L.

A language L is regular iff any one of the following equivalent conditions holds:
• there exists a DFA, M, satisfying L = L(M)
• there exists an NFA, N, satisfying L = L(N)
• there exists a regular expression R satisfying L = L(R)

Proof strategy:
• Given a regular expression R, construct an NFA, N, satisfying L(N) = L(R)
• Given a DFA, M, construct a regular expression R satisfying L(M) = L(R)

Every regular expression corresponds to an NFA
Part 1: Given a regular expression R, construct an NFA, N, satisfying L(N) = L(R).
Idea: mimic the inductive definition of regular expressions.

Base Case 1. If R = a in Σ, construct N:
  [Figure: NFA N with a single transition labeled a]
  L(N) = L(R)

Base Case 2. If R = ∅, construct N:
  [Figure: NFA N]
  L(N) = L(R)

Inductive Case
Let α and β be regular expressions and let N_α and N_β be NFAs satisfying
  L(N_α) = L(α) and L(N_β) = L(β)
Suppose R = (α + β). We need to construct N_R satisfying L(N_R) = L(R).

Inductive Case 1. R = (α + β):
  [Figure: N_R built from N_α and N_β]
  L(N_R) = L(R)

Inductive Case 2. R = (αβ):
  [Figure: N_R built from N_α and N_β]
  L(N_R) = L(R)

Inductive Case 3. R = α*:
  [Figure: N_R built from N_α]
  L(N_R) = L(R)
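
The base cases and the three inductive cases above can be tried out in code. The following Python sketch is one possible realisation, not the construction drawn on the slides, whose diagrams are not reproduced here. It assumes the common wiring with ε-transitions: a new start state for union, ε-edges from the accepting states of the first machine to the start of the second for concatenation, and a new accepting start state plus back-edges for star. The tuple encoding of regular expressions and the names `build`, `accepts`, `closure`, `EPS`, `SIGMA` and `EPSILON` are all hypothetical choices made for this illustration.

```python
from functools import reduce
from itertools import count

EPS = None          # label used for ε-transitions (an assumption of this sketch)
_fresh = count()    # supplies unused state names 0, 1, 2, ...

# Regular expressions as nested tuples (hypothetical encoding).
def sym(a): return ("sym", a)
EMPTY = ("empty",)                       # the regular expression ∅
def union(x, y): return ("union", x, y)  # (α + β)
def star(x): return ("star", x)          # (α)*
def cat(*parts):                         # (αβ), folded left for several parts
    return reduce(lambda x, y: ("cat", x, y), parts)

def build(r):
    """Return an NFA (start, accepting_states, edges) with L(N) = L(r)."""
    kind = r[0]
    if kind == "sym":                    # Base Case 1: R = a
        s, t = next(_fresh), next(_fresh)
        return s, {t}, [(s, r[1], t)]
    if kind == "empty":                  # Base Case 2: R = ∅, no accepting state
        return next(_fresh), set(), []
    if kind == "union":                  # Inductive Case 1
        (s1, f1, e1), (s2, f2, e2) = build(r[1]), build(r[2])
        s = next(_fresh)
        return s, f1 | f2, e1 + e2 + [(s, EPS, s1), (s, EPS, s2)]
    if kind == "cat":                    # Inductive Case 2
        (s1, f1, e1), (s2, f2, e2) = build(r[1]), build(r[2])
        return s1, f2, e1 + e2 + [(f, EPS, s2) for f in f1]
    if kind == "star":                   # Inductive Case 3
        s1, f1, e1 = build(r[1])
        s = next(_fresh)
        return s, f1 | {s}, e1 + [(s, EPS, s1)] + [(f, EPS, s1) for f in f1]
    raise ValueError(f"unknown node: {r!r}")

def closure(states, edges):
    """All states reachable from `states` using only ε-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for src, label, dst in edges:
            if src == q and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(nfa, w):
    """Simulate the NFA on string w by tracking the set of current states."""
    start, finals, edges = nfa
    current = closure({start}, edges)
    for ch in w:
        moved = {dst for src, label, dst in edges
                 if src in current and label == ch}
        current = closure(moved, edges)
    return bool(current & finals)

# The examples from the slides, over Σ = {a, b}.
SIGMA = union(sym("a"), sym("b"))        # Σ abbreviates (a + b)
EPSILON = star(EMPTY)                    # ε abbreviates ∅*
four = cat(SIGMA, sym("a"), sym("a"), SIGMA)                      # Σ aa Σ
ends = union(cat(star(SIGMA), sym("a"), sym("a"), SIGMA), EPSILON)  # Σ* aa (a+b) + ε
has_abba = cat(star(SIGMA), sym("a"), sym("b"), sym("b"), sym("a"), star(SIGMA))

print(accepts(build(four), "baab"))        # True
print(accepts(build(ends), ""))            # True
print(accepts(build(has_abba), "abab"))    # False
```

Each call to `build` allocates fresh states for every node of the expression tree, so a sub-expression such as `SIGMA` can appear several times without its copies sharing states. The simulation scans the whole edge list at every step, which keeps the sketch short but is not meant to be efficient.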