Properties of Regular Languages

					CS 3240 – Chapter 4
   Closure Properties

   Algorithms for Elementary Questions:
     Is a given word, w, in L?
     Is L empty, finite or infinite?
     Are L1 and L2 the same set?

   Detecting non-regular languages

                 CS 3240 - Properties of Regular Languages   2
   Closure of operations
     If x and y are in the same set, is x op y also?
     Example: The integers are closed under addition
      ▪ They are not closed under division
   Regular languages are closed under
    all the typical set operations!
     (Though not literally everything: infinite unions, for example)


   Regular languages are closed under:
     Kleene Star (*)
     Union (+)
     Concatenation (xy)
     (By definition!)
   They are also closed under:
     Complement (reverse the acceptability of every state in a DFA)
     Intersection
     Set difference
     Reversal (already proved in homework #12, 2.3)
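
A minimal sketch of the complement construction, assuming a complete DFA stored as a tuple (states, start, accepting, delta). The representation and the example machine ends_in_a are illustrative assumptions, not part of the course material.

```python
# Assumed representation: a complete DFA is (states, start, accepting, delta),
# where delta[(state, symbol)] gives the next state.

def accepts(dfa, word):
    """Run word through the DFA and report whether it ends in an accepting state."""
    states, start, accepting, delta = dfa
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting


def complement(dfa):
    """Same states and edges; only the acceptability of each state is reversed."""
    states, start, accepting, delta = dfa
    return (states, start, states - accepting, delta)


# Hypothetical example: strings over {a, b} that end in 'a'.
ends_in_a = (
    {"q0", "q1"},
    "q0",
    {"q1"},
    {("q0", "a"): "q1", ("q0", "b"): "q0",
     ("q1", "a"): "q1", ("q1", "b"): "q0"},
)
not_ends_in_a = complement(ends_in_a)
```

The swap only works on a deterministic, complete machine; on an NFA, reversing acceptability does not in general give the complement.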

   Proof from set theory:
     L1 ∩ L2 = (L1’ ∪ L2’)’
     Since complement and union are closed,
      intersection must be also! QED




• Note how the intersection is never shaded
• L1’ ∪ L2’ shades everything but where they overlap
• Therefore, (L1’ ∪ L2’)’ is the overlap (intersection)


   A – B:
     Everything that is in A but not in B
   A – B = A ∩ B’
     We have already shown that regular languages
     are closed under intersection and complement.
     QED




   Start with a composite start state:
     Consisting of the two start states
   Follow all out-edges simultaneously
     As we did for NFA-to-DFA conversion
   A state containing any original final state is a final state
    in the result for union
     Because one of the machines accepts there
   A state containing an original final state from each
    original machine is a final state in the result for
    intersection
     Because both of the machines accept there
   How would you construct the difference machine?
[Figure: two finite automata over {a, b}. Double-a has states x1 (start, marked -), x2, and x3 (accepting, marked +). EVEN-EVEN has states y1, y2, y3, and y4.]

xi, yi   a        b
x1, y1   x2, y3   x1, y2
x2, y3   x3, y1   x1, y4
x1, y2   x2, y4   x1, y1
x3, y1   x3, y3   x3, y2
x1, y4   x2, y2   x1, y3
x2, y4   x3, y2   x1, y3
x3, y3   x3, y1   x3, y4
x3, y2   x3, y4   x3, y1
x2, y2   x3, y4   x1, y1
x1, y3   x2, y1   x1, y4
x3, y4   x3, y2   x3, y3
x2, y1   x3, y3   x1, y2

For union: assign accepting states where either original xi or yi accepts.
For intersection: assign accepting states only where both xi and yi accept
simultaneously. No need to compute (L1’ ∪ L2’)’ !
For difference (L1 – L2): assign accepting states where xi accepts and yi
does not.
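
The table can be generated mechanically. The sketch below follows both machines in lockstep from the composite start state and assigns accepting states by the three rules. The x and y transitions are read off the table; treating y1 as both the start and the accepting state of EVEN-EVEN is an assumption, since the figure's markers did not survive. The function names are illustrative.

```python
# Double-a: x1 is the start state and x3 the accepting state.
X_DELTA = {("x1", "a"): "x2", ("x1", "b"): "x1",
           ("x2", "a"): "x3", ("x2", "b"): "x1",
           ("x3", "a"): "x3", ("x3", "b"): "x3"}
X_START, X_FINAL = "x1", {"x3"}

# EVEN-EVEN: y1 as start and accepting state is an assumption.
Y_DELTA = {("y1", "a"): "y3", ("y1", "b"): "y2",
           ("y2", "a"): "y4", ("y2", "b"): "y1",
           ("y3", "a"): "y1", ("y3", "b"): "y4",
           ("y4", "a"): "y2", ("y4", "b"): "y3"}
Y_START, Y_FINAL = "y1", {"y1"}

RULES = {
    "union": lambda fx, fy: fx or fy,
    "intersection": lambda fx, fy: fx and fy,
    "difference": lambda fx, fy: fx and not fy,
}


def product_machine(mode):
    """Build the reachable composite machine for the chosen operation."""
    start = (X_START, Y_START)
    delta, seen, todo = {}, {start}, [start]
    while todo:
        x, y = todo.pop()
        for symbol in "ab":
            # Follow the out-edges of both machines simultaneously.
            nxt = (X_DELTA[(x, symbol)], Y_DELTA[(y, symbol)])
            delta[((x, y), symbol)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    rule = RULES[mode]
    accepting = {(x, y) for (x, y) in seen if rule(x in X_FINAL, y in Y_FINAL)}
    return start, accepting, delta, seen


def accepts(machine, word):
    """Run word through a composite machine built by product_machine."""
    start, accepting, delta, _ = machine
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting
```

All three machines share the same 12 composite states and edges; only the set of accepting states changes.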
The resulting machine…
[Figure: state diagram of the resulting 12-state machine, with states numbered 1 through 12 and edges labeled a and b.]

   Given a word w, and a regular language, L,
    can we answer the question:
     Is w ∊ L?


   You tell me…
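
One possible answer, sketched under the assumption that L is given as a complete DFA: run w through the machine, one transition per character, and check where it stops. The dictionary layout is an illustrative choice; the transitions are the x-components of the Double-a table.

```python
def accepts(start, accepting, delta, word):
    """Return True if the DFA ends in an accepting state after reading word."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting


# Double-a: x1 is the start state, x3 the accepting state.
DOUBLE_A = {("x1", "a"): "x2", ("x1", "b"): "x1",
            ("x2", "a"): "x3", ("x2", "b"): "x1",
            ("x3", "a"): "x3", ("x3", "b"): "x3"}

print(accepts("x1", {"x3"}, DOUBLE_A, "baab"))  # True: contains "aa"
print(accepts("x1", {"x3"}, DOUBLE_A, "abab"))  # False: no "aa"
```

The running time is proportional to the length of w.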




   Is L empty? A graph theory problem:
     Find a path from the start to a final state in the
      associated FA
   Algorithm:
    “mark” the start state
    repeat:
      mark any state with an incoming edge from a previously
       marked state
    until an accepting state is marked or no new states were
      marked at all
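
The marking algorithm above can be sketched as follows. The dictionary representation of the edges and the three-state example are assumptions made for illustration.

```python
def is_empty(start, accepting, delta):
    """Return True if no accepting state can be reached from the start state."""
    marked = {start}
    changed = True
    while changed:
        if marked & accepting:
            return False  # an accepting state is marked, so some word is accepted
        changed = False
        for (source, _symbol), target in delta.items():
            # Mark any state with an incoming edge from a marked state.
            if source in marked and target not in marked:
                marked.add(target)
                changed = True
    return True  # no new states were marked and none of them accepts


# Hypothetical machine: state u accepts but cannot be reached from s.
EDGES = {("s", "a"): "t", ("t", "a"): "s", ("u", "a"): "u"}
```

With this machine, is_empty("s", {"u"}, EDGES) reports an empty language, while making t accepting does not.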
   Attempt to convert the associated FA to a
    regular expression
     By the state bypass and elimination algorithm
   If you get a regular expression other than ∅, then
    some string is accepted




   Suppose a minimal machine, M, for the
    language L has p states
   If M accepts any non-empty words at all, it
    must accept one of length ≤ p
     Why?
   So…
     Systematically try all possible strings in Σ* of
      length 1 through p. If none are accepted, then no
      non-empty strings at all are in L.
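
A brute-force sketch of this procedure, assuming a complete DFA stored as a transition dictionary (an illustrative representation). It tries every string of length 1 through p, so the cost grows like |Σ|^p and it is only practical for small machines.

```python
from itertools import product


def accepts(start, accepting, delta, word):
    """Run word (any sequence of symbols) through the DFA."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting


def accepts_some_nonempty_word(start, accepting, delta, alphabet, p):
    """Try all strings of length 1 through p, where p is the number of states."""
    for length in range(1, p + 1):
        for letters in product(alphabet, repeat=length):
            if accepts(start, accepting, delta, letters):
                return True
    return False


# Double-a machine (3 states): x1 is the start state, x3 the accepting state.
DOUBLE_A = {("x1", "a"): "x2", ("x1", "b"): "x1",
            ("x2", "a"): "x3", ("x2", "b"): "x1",
            ("x3", "a"): "x3", ("x3", "b"): "x3"}
```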
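   To decide whether L is infinite, convert its machine to a
    regular expression
   It is infinite iff it has a star over a sub-expression that
    matches a non-empty string (assuming no ∅ appears in the expression)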
   

   Another way:
     A language is infinite if there is a cycle in an
      accepting path
     A (tedious) graph theory problem
    
 Suppose L’s minimal machine, M, has p states
 Any path of length p has (or is) a cycle
     And any cycle must have or be a cycle of length p or less
     Because a state is revisited after at most p characters
 So, infinite languages have a machine with at least
  one cycle of length p or less in an accepting path*
 And all non-empty languages have a string of length
  p or less (already showed that)…


   Let m denote the length of a cycle in an accepting path
     We know m ≤ p
   Let k be the length of a string in L, with k ≤ p, whose path touches that cycle
     There has to be one if the language is infinite!
   Then strings of length k + im are accepted, i ≥ 0
     By traversing the cycle i times
   Each extra trip around the cycle adds only m ≤ p characters
     So the lengths k + im cannot skip over the range p through 2p-1
   So, there must be some i such that p ≤ k + im < 2p
   Procedure: Test all strings of length p through 2p-1
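
The procedure can be sketched directly, assuming a complete DFA with p states stored as a transition dictionary. The representation and the four-state machine for the finite language {ab} are illustrative assumptions.

```python
from itertools import product


def accepts(start, accepting, delta, word):
    """Run word (any sequence of symbols) through the DFA."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting


def is_infinite(start, accepting, delta, alphabet, p):
    """Test every string of length p through 2p-1, where p is the state count."""
    for length in range(p, 2 * p):
        for letters in product(alphabet, repeat=length):
            if accepts(start, accepting, delta, letters):
                return True
    return False


# Double-a (3 states) accepts infinitely many strings.
DOUBLE_A = {("x1", "a"): "x2", ("x1", "b"): "x1",
            ("x2", "a"): "x3", ("x2", "b"): "x1",
            ("x3", "a"): "x3", ("x3", "b"): "x3"}

# Hypothetical 4-state machine accepting only the string "ab".
ONLY_AB = {("q0", "a"): "q1", ("q0", "b"): "dead",
           ("q1", "a"): "dead", ("q1", "b"): "q2",
           ("q2", "a"): "dead", ("q2", "b"): "dead",
           ("dead", "a"): "dead", ("dead", "b"): "dead"}
```

As with the emptiness test by enumeration, the number of strings tried is exponential in p, so this is a decision procedure rather than an efficient one.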

   Are L1 and L2 equal? That is, are they the same set of strings?
   Set-theoretic argument:
     Two sets are equal if their symmetric difference is
      empty (denoted by A ∆ B or A ⊖ B)
     A ∆ B = (A ∪ B) – (A ∩ B) = (A – B) ∪ (B – A)
   But A – B = A ∩ B’, and B – A = B ∩ A’

   So L1 = L2 iff (L1 ∩ L2’) ∪ (L1’ ∩ L2) = ∅
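
A sketch of this test, assuming both languages are given as complete DFAs of the form (start, accepting, delta). It walks the composite machine for the symmetric difference and reports a mismatch as soon as it reaches a pair of states where exactly one machine accepts. The three example machines are hypothetical.

```python
def equivalent(dfa1, dfa2, alphabet):
    """True iff the symmetric-difference machine has no reachable accepting state."""
    (s1, f1, d1), (s2, f2, d2) = dfa1, dfa2
    seen, todo = {(s1, s2)}, [(s1, s2)]
    while todo:
        p, q = todo.pop()
        if (p in f1) != (q in f2):
            return False  # some string is in exactly one of the two languages
        for symbol in alphabet:
            nxt = (d1[(p, symbol)], d2[(q, symbol)])
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True


# Hypothetical machines over {a}: two for "even number of a's", one for "multiple of 4".
MOD4 = {(i, "a"): (i + 1) % 4 for i in range(4)}
even_two_states = ("e", {"e"}, {("e", "a"): "o", ("o", "a"): "e"})
even_four_states = (0, {0, 2}, MOD4)
multiple_of_four = (0, {0}, MOD4)
```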

   Not all languages are regular
   We need to recognize whether languages are
    regular or not
     We don’t want to waste time using regular
     language processing techniques where they don’t
     apply




   Consider a^n b^n
   ab is regular
   ab + aabb = a^n b^n, 1 ≤ n ≤ 2, is regular
   Any finite language is regular (why?)
   But a^n b^n, n ≥ 0 is not regular (why not?)

   How do we prove it’s not regular!?!


   Finite Automata don’t have unlimited
    counting capability
     They only have a fixed number of states
   Intuitively, we see that an automaton can’t
    keep track of counts for a^n b^n where n is
    arbitrarily large

   But intuition is often faulty. We need a proof!

   Any accepted string of length p (the number
    of states) or greater forces a cycle in an
    accepting path.
   In other words, at least one state is visited a
    second time
     And that “revisit” must happen within the first p
      characters of the string
      ▪ Because that’s when the (p+1)th state is entered
     This could be any state (start, final, other)
   Consider a^k b^k, where k is greater than the number of
    states in a supposed DFA accepting exactly a^n b^n, n ≥ 0
     Before the first b is encountered, a state has been visited at least
      twice (because there are more a’s than states)
     Suppose the length of the associated cycle is m
     Then the string a^(k+im) b^k is also accepted, for every i ≥ 0!
     But for i ≥ 1 it has more a’s than b’s, so it is not in the language
   This contradicts the existence of a DFA that accepts a^n b^n
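
The argument applies to any DFA at all, which a short experiment can illustrate. The sketch below finds the first revisited state while reading a's, then checks that adding one trip around that cycle leaves the machine in the same final state. The four-state EVEN-EVEN machine stands in for the "supposed DFA"; that choice, the assumption that y1 is its start state, and the function names are illustrative.

```python
def run(start, delta, word):
    """Return the state reached after reading word."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state


def cycle_on_as(start, delta):
    """Return (k0, m): the state reached after k0 a's is re-entered m a's later."""
    first_seen = {}
    state, count = start, 0
    while state not in first_seen:
        first_seen[state] = count
        state = delta[(state, "a")]
        count += 1
    return first_seen[state], count - first_seen[state]


# EVEN-EVEN transitions, with y1 assumed to be the start state.
EVEN_EVEN = {("y1", "a"): "y3", ("y1", "b"): "y2",
             ("y2", "a"): "y4", ("y2", "b"): "y1",
             ("y3", "a"): "y1", ("y3", "b"): "y4",
             ("y4", "a"): "y2", ("y4", "b"): "y3"}

k = 5  # more a's than the machine has states
k0, m = cycle_on_as("y1", EVEN_EVEN)
same_state = (run("y1", EVEN_EVEN, "a" * k + "b" * k)
              == run("y1", EVEN_EVEN, "a" * (k + m) + "b" * k))
```

Here same_state is True: the machine cannot tell a^k b^k from a^(k+m) b^k, so no choice of accepting states lets it accept one and reject the other.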




The first “revisit”




   For every infinite regular language, L, there is
    a number, p, such that for all strings, s, in L,
    where |s| ≥ p, you can partition s into three
    concatenated substrings, xyz, such that:
    1. |y| > 0
    2. |xy| ≤ p
    3. xy^i z ∈ L for every i ≥ 0 (that is, xy*z ⊆ L)



   You can only use the pumping lemma to show
    that a language is not regular
     By showing it fails the “pumping” conditions of
      infinite regular languages
     Note: Some non-regular languages pump!
   The trick is to find a convenient string
     Usually the condition |xy| ≤ p is also key
     Sometimes pumping down (i = 0) is easiest


   Consider the string a^p b^p
     It is in the language a^n b^n
     It is long enough (≥ p in length)
   Now let a^p b^p = xyz
     Remember |xy| ≤ p
     What can you conclude about y?




   You can treat proving a language non-regular
    as a “game”:
    1. You pick a string, s, in L, where |s| ≥ p
      ▪ You may pick any such string; choose wisely!
    2. Opponent picks x, y, and z
      ▪ But must obey |xy| ≤ p and |y| > 0
    3. You show it can’t be “pumped”
      ▪ Because a pumped string falls “outside” the language
   Must anticipate all possible partitions xyz
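
The game can be played out by machine for a fixed p. The sketch below assumes the language a^n b^n and the string a^p b^p, enumerates every partition the opponent is allowed to pick, and checks that each one has a pumped string outside the language. The function names and the cutoff max_i are illustrative; checking a few values of p only illustrates the argument and is not a proof for all p.

```python
def in_language(s):
    """Membership in { a^n b^n : n >= 0 }."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


def opponent_splits(s, p):
    """Every partition s = xyz with |xy| <= p and |y| > 0."""
    for xy_len in range(1, min(p, len(s)) + 1):
        for x_len in range(xy_len):
            yield s[:x_len], s[x_len:xy_len], s[xy_len:]


def you_win(p, max_i=3):
    """True if every legal partition of a^p b^p fails to pump for some i <= max_i."""
    s = "a" * p + "b" * p
    for x, y, z in opponent_splits(s, p):
        if all(in_language(x + y * i + z) for i in range(max_i + 1)):
            return False  # this partition pumps, so the chosen string was a bad pick
    return True
```

Because |xy| ≤ p forces y to consist of a's only, pumping down (i = 0) already breaks every partition.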
   a^i b^j, i > j
   PALINDROME
     w = w^R (same backwards and forwards)
   ww
     Equal halves
   PRIME (a^m where m is prime)
   SQUARE (a^m where m is a perfect square)


   Strings with equal number of a’s and b’s
   NOTPRIME




   NOTPRIME is pumpable, if the condition |xy| ≤ p is set aside!
   Let y = the whole string (a^(km), with k, m ≥ 2)
   The number of a’s will always be a multiple
    of km, hence not prime
     Note: zero is not a prime number
   This does not violate the pumping lemma
     The pumping lemma draws no conclusion about
     non-regular languages
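
A small check of the arithmetic, assuming y is the whole string a^(km) with k, m ≥ 2, so that pumping i times gives a string of length i·k·m. The helper names are illustrative.

```python
def is_prime(n):
    """Trial division; 0 and 1 are not prime."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def pumped_lengths(k, m, max_i):
    """Lengths of y^i for y = a^(k*m), i = 0 .. max_i."""
    return [i * k * m for i in range(max_i + 1)]


# Example with k = 2, m = 3: lengths 0, 6, 12, ...
lengths = pumped_lengths(2, 3, 50)
none_prime = not any(is_prime(n) for n in lengths)
```

Every length is either zero or a multiple of km with k, m ≥ 2, so none of them is prime and each pumped string stays in NOTPRIME.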
