
ppt by zhangyun


Regular Languages, Regular Operations
September 11, 2001




Agenda
 Today
     Regular languages
        Finite languages are regular
     Regular operations on languages
        Union ()
        Concatenation ()
        Kleene star (*)
 For next time:
     Read 1.3 and handout on minimization
 Thursday, 9/20 (revised): HW1 collected

      Definition of Regular
           Language
Recall the definition of a regular language:
DEF: The language accepted by an FA M is
  the set of all strings which are accepted by M
  and is denoted by L(M). A language is regular
  if it is accepted by some FA.
We would like to understand what types of
  languages are regular. Languages of this
  type are amenable to super-fast recognition
  of their elements.
It would be nice to know, for example, which of
  the following are regular:
        Language Examples
     Unary prime numbers:
       { 11, 111, 11111, 1111111, 11111111111, … }
       = { 1^2, 1^3, 1^5, 1^7, 1^11, 1^13, … }
       = { 1^p | p is a prime number }
     Unary squares:
       { ε, 1, 1^4, 1^9, 1^16, 1^25, 1^36, … }
       = { 1^n | n is a perfect square }
     Palindromic bit strings:
       { ε, 0, 1, 00, 11, 000, 010, 101, 111, … }
       = { x ∈ {0,1}* | x = x^R }
(Here 1^n denotes the string of n 1's, ε the empty
 string, and x^R the reverse of x.)
We will explore whether or not these are regular in
 future lectures.
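
As a rough illustration of what membership in each of these three languages means, here is a minimal Python sketch. The function names are made up for this example, and the code only tests membership; it says nothing about whether the languages are regular.

```python
from math import isqrt  # Python 3.8+


def is_unary_prime(s: str) -> bool:
    """True if s = 1^p for some prime p."""
    if set(s) - {"1"}:          # any symbol other than '1' disqualifies s
        return False
    p = len(s)
    return p >= 2 and all(p % d for d in range(2, isqrt(p) + 1))


def is_unary_square(s: str) -> bool:
    """True if s = 1^n for some perfect square n (n = 0 gives the empty string)."""
    if set(s) - {"1"}:
        return False
    n = len(s)
    return isqrt(n) ** 2 == n


def is_palindromic_bit_string(s: str) -> bool:
    """True if s is over {0,1} and equals its own reverse."""
    return set(s) <= {"0", "1"} and s == s[::-1]
```

For instance, `is_unary_square("")` holds because 0 is a perfect square, which is why ε appears in the second language but not in the first.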
         Finite Languages
All the previous examples had the following
   property in common: infinite cardinality
NOTE: The strings which made up the
   language were finite (as they always will be
   in this course); however, the collection of
   such strings was infinite.
Before looking at infinite languages, should
   definitely look at finite languages.


 Languages of Cardinality 1
Q: Is the singleton language containing
 one string regular? For example, is
                { banana }
 regular?




 Languages of Cardinality 1
A: Yes.




Q: What’s wrong with this example?


 Languages of Cardinality 1
A: Nothing, really. This is an example of a
  nondeterministic FA. This turns out to be the
  most concise way to encapsulate the
  language { banana }


But we will deal with nondeterminism in coming
  lectures. So:
Q: Is there a way of fixing this and making it
  deterministic?
  Languages of Cardinality 1
A: Yes, just add a fail state q7; i.e., add a
  state that absorbs every string different from
  “banana” for all eternity, unless the string
  happens to be one of the proper prefixes of
  “banana”: {ε, b, ba, ban, bana, banan}.
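
The deterministic version can be sketched in Python as follows. The state numbering is an assumption made for this example (state i means "the first i letters of banana have been read", so q6 accepts and q7 is the fail state); the slide's diagram may label states differently.

```python
WORD = "banana"
ACCEPT = len(WORD)      # q6: all six letters have been read
FAIL = len(WORD) + 1    # q7: the fail (sink) state


def step(state: int, ch: str) -> int:
    """Transition function of the DFA for { banana }."""
    if state < len(WORD) and ch == WORD[state]:
        return state + 1    # one more letter of "banana" matched
    return FAIL             # wrong letter, extra letter, or already failed


def accepts(s: str) -> bool:
    state = 0               # q0: nothing read yet
    for ch in s:
        state = step(state, ch)
    return state == ACCEPT
```

The proper prefixes of banana end in states q0 through q5, which are neither accepting nor the fail state; every other string except banana itself ends in q7.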




          Two Strings
Q: How about two strings? For example
          { banana, nab } ?




           Two Strings
A: Just add another route:




 Arbitrary Finite Number of
           Strings
Q1: How about more? For example
     { banana, nab, ban, babba } ?

Q2: Or less (the empty set):
                Ø = {} ?



   Arbitrary Finite Number of
             Strings
A1:




 Arbitrary Finite Number of
 Strings: Empty Language
A2: Build a 1-state automaton whose
 set of accept states F is empty!




  Arbitrary Finite Number of
            Strings
THM: All finite languages are regular.
Proof : Can always construct a tree whose
  leaves are word-ending. In our example the
  tree is (* marks the end of a word):

      root ─┬─ b ─ a ─┬─ n* ─ a ─ n ─ a*
            │         └─ b ─ b ─ a*
            └─ n ─ a ─ b*


Now make word endings into accept states, add
  a fail sink-state and add links to the fail state
  to finish the construction.
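
The tree construction in the proof can be sketched in Python as below. This is an illustrative sketch, not code from the course: `build_trie_dfa`, `accepts` and the use of `-1` for the fail sink-state are choices made for this example.

```python
FAIL = -1  # the fail sink-state (hypothetical encoding)


def build_trie_dfa(words):
    """Return (delta, accept) for a tree-shaped DFA; state 0 is the root."""
    delta = {}       # (state, symbol) -> state; missing entries lead to FAIL
    accept = set()   # states where some word ends
    next_state = 1
    for w in words:
        state = 0
        for ch in w:
            if (state, ch) not in delta:      # grow a new branch of the tree
                delta[(state, ch)] = next_state
                next_state += 1
            state = delta[(state, ch)]
        accept.add(state)                     # word ending becomes an accept state
    return delta, accept


def accepts(dfa, s):
    delta, accept = dfa
    state = 0
    for ch in s:
        # FAIL has no outgoing entries, so once reached it is never left
        state = delta.get((state, ch), FAIL)
    return state in accept


dfa = build_trie_dfa(["banana", "nab", "ban", "babba"])
```

With an empty word list the accept set is empty, so every string is rejected, matching the empty-language case.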
       Infinite Cardinality
Q: Are all regular languages finite?




       Infinite Cardinality
A: No! Many infinite languages are regular.
Common Mistake 1: The strings of regular
  languages are finite, therefore the regular
  languages must be finite.
Common Mistake 2: Regular languages are, by
  definition, accepted by finite automata,
  therefore regular languages are finite.
Q: Give an example of an infinite but regular
  language.

           Infinite Cardinality
   Strings with an even number of b’s



   Simplest example is Σ*


   Many, many more
Home exercise: think of a criterion for
 non-finiteness
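
A two-state automaton is enough for the even-number-of-b's language, even though the language is infinite. The sketch below assumes that every symbol other than b leaves the state unchanged, since the alphabet is not spelled out here.

```python
def even_number_of_bs(s: str) -> bool:
    """Simulate a 2-state DFA: state 0 = even count of b's so far, 1 = odd."""
    state = 0                  # start state, also the only accept state
    for ch in s:
        if ch == "b":
            state = 1 - state  # reading a 'b' flips the parity
        # any other symbol loops back to the same state (assumption)
    return state == 0
```

The automaton stores only the parity, never the count, which is why finitely many states suffice for arbitrarily long strings.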


         Regular Operations
You may have come across the regular
   operations when doing advanced searches
   utilizing programs such as emacs, egrep,
   perl, python, etc. There are three basic
   operations we will work with:
1. Union
2. Concatenation
3. Kleene-star
And a fourth definable in terms of the previous:
4. Kleene-plus

         Regular Operations –
          Summarizing Table
Operation       Symbol   UNIX version       Meaning
Union           ∪        |                  Match one of the patterns
Concatenation   ∘        implicit in UNIX   Match patterns in sequence
Kleene-star     *        *                  Match pattern 0 or more times
Kleene-plus     +        +                  Match pattern 1 or more times

Regular operations - Union
UNIX: to search for all lines containing
 vowels in a text one could use the
 command
      egrep -i 'a|e|i|o|u'
Here the pattern “vowel” is matched by
 any line containing one of a, e, i, o or u.
Q: What is a string pattern?

          String Patterns
A: A good way to define a pattern is as a
  set of strings, i.e. a language. The
  language for a given pattern is the set
  of all strings satisfying the predicate of
  the pattern.
EG: vowel-pattern =
  { the set of strings which
      contain at least one of: a e i o u }

      UNIX patterns vs.
    Computability patterns
In UNIX, a pattern is implicitly assumed
  to occur as a substring of the matched
  strings.
In our course, however, a pattern needs
  to specify the whole string, and not just
  a substring.
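
The difference between the two conventions can be seen with Python's `re` module, where `search` looks for the pattern anywhere in the string and `fullmatch` requires the pattern to describe the whole string. The helper names below are made up for this sketch.

```python
import re

# The vowel pattern, case-insensitive like `egrep -i`
VOWEL = re.compile(r"a|e|i|o|u", re.IGNORECASE)


def unix_style_match(line: str) -> bool:
    """UNIX convention: the pattern may occur as a substring."""
    return VOWEL.search(line) is not None


def whole_string_match(s: str) -> bool:
    """Course convention: the pattern must describe the entire string."""
    return VOWEL.fullmatch(s) is not None
```

Under the UNIX convention "banana" matches the vowel pattern; under the course convention only the single-letter strings a, e, i, o, u (in either case here) do.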



Regular operations - Union
Computability: union is exactly what we
  expect. If you have patterns
A = {aardvark}, B = {bobcat},
  C = {chimpanzee}
union the patterns together to get
AB C = {aardvark, bobcat,
               chimpanzee}

     Regular operations -
       Concatenation
UNIX: to search for all consecutive
 double occurrences of vowels, use:
 egrep -i '(a|e|i|o|u)(a|e|i|o|u)'
Here the pattern “vowel” has been
 repeated. Parentheses have been
 introduced to specify where exactly in
 the pattern the concatenation is
 occurring.
     Regular operations -
       Concatenation
Computability. Consider the previous
  result:
L = {aardvark, bobcat, chimpanzee}

Q: What language results when we
 concatenate L with itself obtaining
                LL ?

        Regular operations -
          Concatenation
A: LL =
{aardvark, bobcat, chimpanzee}{aardvark, bobcat, chimpanzee}

                             =
{aardvarkaardvark, aardvarkbobcat, aardvarkchimpanzee,
 bobcataardvark, bobcatbobcat, bobcatchimpanzee,
 chimpanzeeaardvark, chimpanzeebobcat, chimpanzeechimpanzee}
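
For finite languages, union and concatenation can be computed directly with Python sets of strings. This is a small sketch; `concat` is a helper name chosen for this example.

```python
def concat(X, Y):
    """Concatenation of languages: every x in X followed by every y in Y."""
    return {x + y for x in X for y in Y}


A, B, C = {"aardvark"}, {"bobcat"}, {"chimpanzee"}
L = A | B | C        # union of the three singleton languages
LL = concat(L, L)    # L concatenated with itself
```

Here `LL` has 3 × 3 = 9 elements because no two pairs happen to produce the same string; in general the concatenation can be smaller than the product of the sizes.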

Q1: What is L{} ?
Q2: What is LØ ?
      Algebra of Languages
A1: L{} = L. In general, {} is the identity
  in the “algebra” of languages. I.e., if we
  think of concatenation as being like
  multiplication, {} acts like the number 1.
A2: LØ = Ø. Opposite to {}, Ø acts like the
  number zero obliterating everything it is
  concatenated with.
Note: We can carry on the analogy between
  numbers and languages. Addition becomes
  union, multiplication becomes concatenation.
  This forms a so-called “algebra”.
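
These identities can be checked on small finite languages with Python sets. The sketch below uses the empty Python string for ε and the empty set for Ø; the variable names are chosen for this example. It also checks one distributive law, the analogue of a(b + c) = ab + ac.

```python
def concat(X, Y):
    """Concatenation of two languages given as sets of strings."""
    return {x + y for x in X for y in Y}


L = {"aardvark", "bobcat", "chimpanzee"}
EPSILON_LANG = {""}   # the language containing only the empty string
EMPTY_LANG = set()    # the empty language

# the language {ε} behaves like 1, the empty language behaves like 0
identity_holds = concat(L, EPSILON_LANG) == L == concat(EPSILON_LANG, L)
zero_holds = concat(L, EMPTY_LANG) == EMPTY_LANG == concat(EMPTY_LANG, L)

# concatenation distributes over union
A, B, C = {"a", "ab"}, {"b"}, {"", "c"}
distributes = concat(A, B | C) == concat(A, B) | concat(A, C)
```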
Regular operations – Kleene-*
UNIX: search for lines consisting purely of
 vowels (including the empty line):
         egrep -i '^(a|e|i|o|u)*$'
NOTE: ^ and $ are special symbols in UNIX
 regular expressions which respectively anchor
 the pattern at the beginning and end of a
 line. The trick above can be used to convert
 any Computability regular expression into an
 equivalent UNIX form.

Regular operations – Kleene-*
Computability: Suppose we have a
 language
             B = { ba, na }

Q: What is the language B * ?



Regular operations – Kleene-*
A:
B* = { ba, na }* =
{ ε,
   ba, na,
   baba, bana, naba, nana,
   bababa, babana, banaba, banana,
      nababa, nabana, nanaba, nanana,
   babababa, bababana, … }
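
Since B* is infinite, a program can only list it up to some length bound. The following sketch does that; `star_up_to` and its `max_len` parameter are names invented for this example.

```python
def star_up_to(B, max_len):
    """All strings of B* whose length is at most max_len."""
    result = {""}      # zero repetitions: the empty string is always in B*
    frontier = {""}    # strings found in the previous round
    while frontier:
        # append one more element of B to each newly found string
        frontier = {w + b for w in frontier for b in B
                    if len(w + b) <= max_len} - result
        result |= frontier
    return result


B = {"ba", "na"}
listing = sorted(star_up_to(B, 6), key=lambda w: (len(w), w))
```

With the bound 6 the listing contains 1 + 2 + 4 + 8 = 15 strings, from the empty string up to the eight 6-letter strings such as banana and nanana.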

Regular operations – Kleene-+
Kleene-+ is just like Kleene-* except that the
  pattern is forced to occur at least once.
UNIX: search for lines consisting purely of
  vowels (not including the empty line):
         egrep -i '^(a|e|i|o|u)+$'
Computability: B+ = { ba, na }+ =
{ ba, na,
  baba, bana, naba, nana,
  bababa, babana, banaba, banana,
     nababa, nabana, nanaba, nanana,
  babababa, bababana, … }
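
Kleene-plus can be defined from the other operations as B+ = B B*. A length-bounded sketch of that definition follows; the helper names and the `max_len` bound are assumptions made for this example.

```python
def star_up_to(B, max_len):
    """All strings of B* whose length is at most max_len."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = {w + b for w in frontier for b in B
                    if len(w + b) <= max_len} - result
        result |= frontier
    return result


def plus_up_to(B, max_len):
    """All strings of B+ = B B* whose length is at most max_len."""
    return {b + w for b in B for w in star_up_to(B, max_len)
            if len(b + w) <= max_len}


B = {"ba", "na"}
```

For this B the result is B* without the empty string. In general B+ contains the empty string exactly when B itself does.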
Generating the Regular
Languages
The real reason that regular languages are
  called regular is the following:
THM: The regular languages are all those
  languages which can be generated starting
  from the finite languages by applying the
  regular operations.
This will be proved in the coming lectures.
Q: Can we start with even more basic
  languages than arbitrary finite languages?

Generating the Regular
Languages
A: Yes. We can start with languages consisting
  of single strings which are themselves just a
  single character. These are the “atomic”
  regular languages.
EG: To generate the finite language
               L = { banana, nab }
we can start with the atomic languages
           A = {a}, B = {b}, N = {n}.
Then we can express L as:
     L = (B A N A N A)  (N A B )
Blackboard Exercises
Express the DFA patterns from the
  previous board-exercises using regular
  operations in both UNIX-style and
  Computability-style.



