Covering problems from a formal language point of view
Marcella ANSELMO
Maria MADONIA


Ravello, 19-21 September 2003   Covering Problems from a Formal Language Point of View
                                                M. Anselmo - M. Madonia
                               Covering a word
Covering a word w with words in a set X:
[Figure: w drawn as a line, overlaid with overlapping segments, each labelled ∈ X]

Covering = concatenations + overlaps

Example: X = ab + ba, w = abababa

[Figure: a b a b a b a, with the covering words of X drawn over the letters]
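
The idea "concatenations + overlaps" can be turned into a small membership test. This is a minimal sketch, not the authors' construction: it assumes X is a finite set of non-empty words and views a covering as a chain of occurrences of words of X in w, each starting no later than the position reached so far. The name `is_covered` is hypothetical, and treating the empty word as covered is a convention chosen here.

```python
def is_covered(w, X):
    """Return True if w can be covered by overlapping or adjacent occurrences of words of X."""
    reach = 0  # every position of w before `reach` is already covered
    for q in range(len(w)):
        if q > reach:
            # position `reach` cannot be covered by any occurrence
            return False
        for x in X:
            # an occurrence of x starting at q extends the covered prefix
            if x and w.startswith(x, q):
                reach = max(reach, q + len(x))
    # convention of this sketch: the empty word counts as covered
    return reach >= len(w)


X = {"ab", "ba"}
print(is_covered("abababa", X))
print(is_covered("aa", X))
```

With X = ab + ba the word abababa is accepted, while aa is rejected because neither ab nor ba occurs in it.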

                         Why study covering?

 • Molecular biology: manipulating DNA molecules (e.g. fragment assembly)

 • Data compression

 • Computer-assisted music analysis

                                       Literature
• Apostolico, Ehrenfeucht (1993); Brodal, Pedersen (2000); Moore, Smyth (1995):
  w is ‘quasiperiodic’, x is a ‘cover’ of w
• Iliopoulos, Moore, Park (1993): x ‘covers’ w
• Iliopoulos, Smyth (1998): ‘set of k-covers’ of w
• Sim, Iliopoulos, Park, Smyth (2001): p ‘approximate period’ of w

All algorithmic problems! (given w, find an ‘optimal’ X)
                Formal language point of view

A formal language point of view is needed!
Madonia, Salemi, Sportelli (1999) [MSS99]:

If X ⊆ A*,
Xcov = set of words ‘covered’ by words in X
                           also
Xcov = (X, A*)↑, the set of words with a z-decomposition over (X, A*)

       Here: coverings are not simple generalizations of
                    z-decompositions!
                                 Formal Definition




red(w) = canonical representative of the class of w in the free group


Def. A covering (over X) of w in A* is δ = (w1, …, wn) s.t.
       1. n is odd;
          for any odd i, wi ∈ X;
          for any even i, wi ∈ Ā*, where Ā is the set of formal inverses ā of the letters a ∈ A
       2. red(w1 … wn) = w
       3. for any i, red(w1 … wi) is a prefix of w
Example: X = ab + ba, w = ababab.

δ = (ab, b̄, ba, λ, ba, ā, ab) is a covering of w over X:
     1. n is odd; for any odd i, wi ∈ X;
        for any even i, wi ∈ Ā*
     2. red(ab b̄ ba λ ba ā ab) = ababab
     3. for any i, red(w1 … wi) is a prefix of w

[Figure: δ drawn as steps to the right and to the left over a b a b a b]
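
The three conditions of the definition can be checked mechanically. The sketch below is only an illustration under stated assumptions: an inverse letter ā is encoded as the uppercase letter `A` (so the alphabet must consist of lowercase letters), λ is the empty string, and the names `red` and `is_covering` are chosen here for convenience.

```python
def red(word):
    """Free-group reduction: cancel adjacent pairs x x^-1 and x^-1 x.

    Assumption of this sketch: lowercase = letter, uppercase = its formal inverse.
    """
    stack = []
    for c in word:
        # same letter in opposite case means a letter next to its inverse
        if stack and stack[-1] != c and stack[-1].lower() == c.lower():
            stack.pop()
        else:
            stack.append(c)
    return "".join(stack)


def is_covering(delta, X, w):
    """Check conditions 1-3 of the definition for the tuple delta."""
    if len(delta) % 2 == 0:           # condition 1: n is odd
        return False
    acc = ""
    for idx, step in enumerate(delta):
        if idx % 2 == 0:              # odd i in the slides' 1-based numbering
            if step not in X:
                return False
        elif not all(c.isupper() for c in step):
            return False              # even i: a word over inverse letters only
        acc += step
        if not w.startswith(red(acc)):
            return False              # condition 3: every reduced prefix is a prefix of w
    return red(acc) == w              # condition 2


X = {"ab", "ba"}
delta = ("ab", "B", "ba", "", "ba", "A", "ab")
print(is_covering(delta, X, "ababab"))
```

The stack-based reduction is the usual way to compute a reduced word in one pass; checking every prefix keeps the walk inside w, as required by condition 3.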


                    Concatenation, zig-zag, covering

Concatenation    submonoid        X*
Zig-zag          z-submonoid      X↑
Covering         cov-submonoid    Xcov


 cov-submonoid                      z-submonoid                           submonoid

                          Splicing systems for Xcov
X, finite
S, splicing system s.t. L(S) = # Xcov $
COV2(X) =
Start with: # x $, x ∈ X or x ∈ COV2(X)
Rules: (λ, x, $), x ∈ X
       (#, x, x3$), x = x1x2, x2x3 ∈ X

Example: X = ab + ba, w = #ababaab$ ∈ L(S)

[Figure: splicing derivation of #ababaab$ using x = ab and x = ba, through the words #abab$, #aba$, #ababa$ and #baab$]
                     Coding problems [MSS99]
 How many coverings does a word have?
 Example: X = ab + ba, w = ababab ∈ Xcov
     • w has many different coverings over X:

δ1 = (ab, λ, ab, λ, ab)
δ2 = (ab, b̄, ba, λ, ba, ā, ab)
δ3 = (ab, b̄, ba, ā, ab, λ, ab)
δ4 = (ab, λ, ab, b̄, ba, ā, ab)
δ5 = (ab, b̄, ba, ā, ab, b̄, ba, ā, ab)
                       Covering codes [MSS99]

X ⊆ A* is a covering code if any word in A* has at
   most one minimal covering (over X).
Example: X = ab + ba is not a covering code
        (remember δ1, δ2)

Example: X = aabab + abb is a covering code

Example: X= ab+a + a                  is a covering code



                                  Cov-freeness
  Let M ⊆ A* be a cov-submonoid.
  cov-G(M) is the minimal X ⊆ A* such that M = Xcov.
  M is cov-free if cov-G(M) is a covering code.

  Fact:     M free   ⟺ M stable (well-known)
            M z-free ⟺ M z-stable (known)

  Question: M cov-free ⟺ M ‘cov-stable’?
We want ‘cov-stability’ = a global notion equivalent to cov-freeness.

           Toward a cov-stability definition (I)

stable         u, w, uv, vw ∈ M implies v ∈ M

z-stable       w, vw ∈ M, uv, u ∈ Z-p-s(uvw)
               implies v ∈ Z-p-s(uvw)
cov-stable?    w, vw, uvx, uy ∈ M, for λ ≤ x < w
               and λ ≤ y < vw, implies vx ∈ M?
               Not always!
               Example:
               X = abcd + bcde + cdef + defg

              Toward a cov-stability definition (II)
Main observation in the classical proof of (stable implies free):
     • x minimal word with 2 different factorizations:
       the last step in one factorization differs from the last step
       in the other factorization
New situation with covering:
[Figure: the words u and w]

So we have to study the case v = λ.
Example: X = abc + bcd + cde
                                    Cov-stability

Def. M is cov-stable if, whenever
w, vw, uvx, uy ∈ M, for λ ≤ x < w and λ ≤ y < vw:

1. If v ≠ λ, then vz ∈ M, for some λ ≤ z < w.
   Moreover vx ∈ M if |y| < |v|.
2. If v = λ, u ≠ λ and |x| > |y|, then t ∈ M,
   for some t proper suffix of ux.
Remark: cov-stable implies stable

                             Cov-stable iff cov-free

      Theorem: Let M be a covering submonoid. Then
               M is cov-stable ⟺ M is cov-free.

Proof: many cases and sub-cases (as in the definition!)



                                 Some consequences
Fact 1: (cov-free ∩ cov-free) ≠ cov-free
Fact 2: cov-free implies free (not vice versa)
Fact 3: cov-free implies very pure (not vice versa)
Fact 4: M covering submonoid, X = cov-G(M).
        M cov-free implies X* free.

Fact 5: [Diagram: cov-free, z-free, free]
Remark: Covering is not a simple generalization of
       z-decomposition!
Cov-maximality and cov-completeness
Let X ⊆ A* be a covering code.
X is cov-complete if Fact(Xcov) = A*.
X is cov-maximal if X ⊆ X1, with X1 a covering code, implies X = X1.
Fact:         X cov-complete                    X cov-maximal
Remark [MSS99]:
X cov-complete implies X infinite (unless X = A)
Remark: complete implies cov-complete (not vice versa)
        maximal implies cov-maximal (not vice versa)

 Example: X=ab+a +a
                   Counting minimal coverings
  X ⊆ A*, regular language

  covX : w ↦ number of minimal coverings of w

  [Figure: A, a 1DFA recognizing X]

  [Figure: B, a 2FA recognizing Xcov]

 Remark: B counts all coverings of w ∈ Xcov

    Remark on minimal coverings

 Remark: In minimal coverings, there are no 2 steps to the left
 under the same occurrence of a letter.

 Crossing sequences in B for minimal coverings of w:
 [Figure: crossing sequences along w, each passing through state 1]


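                   A 1NFA for covX
CS3 = crossing sequences of length ≤ 3 in which state 1 does not occur twice
δ(cs, a) = cs’ if cs matches cs’ on a
C = (CS3, (1), δ, (1))

Example: X = ab + ba
[Figure: automaton A with states 1, 2, 3, 4 and transitions 1 -a-> 2, 2 -b-> 4, 1 -b-> 3, 3 -a-> 4]
[Figure: automaton C, whose states are crossing sequences over the states 1, 2, 3 and whose transitions are labelled a and b]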

                                   Some remarks


•   Language recognized by C = Xcov

•   X regular implies Xcov regular

•   Behaviour of C is covX

•   X regular implies covX rational

•   X covering code iff C unambiguous (decidable)
    (different proof in [MSS99])

                     Conclusions and future work

• A formal language point of view is needed

• Covering is not a generalization of zig-zag (or z-decomposition):
  many new problems and results

• Further problems:
    ✓ covering codes: measure
    ✓ special cases: |X| = 1, X ⊆ A^k
    ✓ suggestions …


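w is ‘quasiperiodic’, x is a ‘cover’ of w:
[Figure: w covered by overlapping occurrences of x]

x ‘covers’ w:
[Figure: w covered by overlapping occurrences of x]

‘set of k-covers’ of w, X ⊆ A^k:
[Figure: w covered by overlapping words, each ∈ X]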

Example: X = ab + ba

[Figure: a b a b a b with a covering drawn over it]            w = ababab ∈ Xcov

[Figure: a b a b a b with a z-decomposition drawn over it]     w = ababab ∈ (X, A*)↑



 Xcov = (ab + ba+ aba + bab)*
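
The identity above can be sanity-checked by brute force. This is a bounded check rather than a proof, and it rests on assumptions made here: coverings are modelled as chains of overlapping or adjacent occurrences of words of X, the empty word counts as covered, and `is_covered` is a hypothetical helper name.

```python
import re
from itertools import product


def is_covered(w, X):
    """True if w is covered by overlapping or adjacent occurrences of words of X."""
    reach = 0  # positions before `reach` are covered
    for q in range(len(w)):
        if q > reach:
            return False
        for x in X:
            if x and w.startswith(x, q):
                reach = max(reach, q + len(x))
    return reach >= len(w)  # convention: the empty word is covered


X = {"ab", "ba"}
pattern = re.compile(r"(?:ab|ba|aba|bab)*")

# Compare both descriptions of Xcov on every word over {a, b} of length at most 8.
mismatches = []
for n in range(9):
    for letters in product("ab", repeat=n):
        word = "".join(letters)
        if is_covered(word, X) != bool(pattern.fullmatch(word)):
            mismatches.append(word)

print(mismatches)
```

An empty list means the two descriptions agree on all words up to the chosen length.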

[Figure: δ1 and δ2 drawn over a b a b a b]

 All the steps to the right are needed for covering w:
 δ1, δ2 are minimal coverings!



[Figure: δ3, δ4 and δ5 drawn over a b a b a b, with some steps in blue]

All blue steps are useless for covering w:
δ3, δ4, δ5 are not minimal.
We count only minimal coverings.
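
As a rough illustration of the counting, the sketch below enumerates sets of occurrences of words of X in w in which no occurrence is useless. This is an assumption made here, reading "minimal" as "every step to the right is needed"; it is a brute-force stand-in, not the automaton construction of the talk, and all function names are hypothetical.

```python
from itertools import combinations


def occurrences(w, X):
    """All occurrences of words of X in w, as half-open intervals (start, end)."""
    return [(q, q + len(x))
            for q in range(len(w))
            for x in sorted(X)
            if x and w.startswith(x, q)]


def covers(intervals, n):
    """True if the intervals together contain every position 0..n-1."""
    covered = set()
    for start, end in intervals:
        covered.update(range(start, end))
    return len(covered) == n


def count_irredundant_covers(w, X):
    """Count sets of occurrences that cover w and from which no occurrence can be dropped."""
    occ = occurrences(w, X)
    n = len(w)
    count = 0
    for k in range(1, len(occ) + 1):
        for chosen in combinations(occ, k):
            if covers(chosen, n) and all(
                not covers(chosen[:i] + chosen[i + 1:], n)
                for i in range(len(chosen))
            ):
                count += 1
    return count


print(count_irredundant_covers("ababab", {"ab", "ba"}))
```

For X = ab + ba and w = ababab the two sets found correspond to δ1 and δ2. The enumeration is exponential in the number of occurrences, so it is only meant for small examples.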

Toward a cov-stability definition (I)
stable       u, w, uv, vw ∈ M implies v ∈ M

[Figure: a word split into u, v, w]

z-stable     w, vw ∈ M, uv, u ∈ Z-prefix-strict(uvw)
             implies v ∈ Z-prefix-strict(uvw)

[Figure: a word split into u, v, w]


Example: X = abcd + bcde + cdef + defg, M = Xcov

[Figure: a b c d e f g, with the factor vx marked]

Set u = ab, v = c, w = defg, x = de, y = cd.
Then w, vw, uvx, uy ∈ M but vx = cde ∉ M.
• Note vz = cdef ∈ M, λ ≤ z < w.
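
The memberships claimed in this example can be checked with a short script. It is a sketch under an assumption made here: membership in M = Xcov is tested by chaining overlapping or adjacent occurrences of words of X, and `is_covered` is a hypothetical helper name.

```python
def is_covered(w, X):
    """True if w is covered by overlapping or adjacent occurrences of words of X."""
    reach = 0  # positions before `reach` are covered
    for q in range(len(w)):
        if q > reach:
            return False
        for x in X:
            if x and w.startswith(x, q):
                reach = max(reach, q + len(x))
    return reach >= len(w)


X = {"abcd", "bcde", "cdef", "defg"}
u, v, w, x, y = "ab", "c", "defg", "de", "cd"

# Hypotheses of the tentative cov-stability condition: w, vw, uvx, uy in M.
hypotheses_hold = all(is_covered(t, X) for t in (w, v + w, u + v + x, u + y))
vx_in_M = is_covered(v + x, X)       # vx = cde
vz_in_M = is_covered(v + "def", X)   # vz = cdef, with z = def a proper prefix of w

print(hypotheses_hold, vx_in_M, vz_in_M)
```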

Example: X = abc + bcd + cde, M = Xcov

[Figure: a b c d e, with u, x and w marked]

Set u = ab, v = λ, w = cde, x = cd, y = c.

Then w, vw, uvx, uy ∈ M but there is no λ ≤ z < w with vz ∈ M.

• Note bcd ∈ M, with bcd a proper suffix of ux.
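
The same kind of check applies to this second example. Again this is only a sketch: membership in M = Xcov is tested by chaining occurrences of words of X, `is_covered` is a hypothetical helper, and only the non-empty proper prefixes z of w are tested.

```python
def is_covered(w, X):
    """True if w is covered by overlapping or adjacent occurrences of words of X."""
    reach = 0  # positions before `reach` are covered
    for q in range(len(w)):
        if q > reach:
            return False
        for x in X:
            if x and w.startswith(x, q):
                reach = max(reach, q + len(x))
    return reach >= len(w)


X = {"abc", "bcd", "cde"}
u, v, w, x, y = "ab", "", "cde", "cd", "c"   # v is the empty word

hypotheses_hold = all(is_covered(t, X) for t in (w, v + w, u + v + x, u + y))

# Non-empty proper prefixes z of w: here "c" and "cd".
prefixes = [w[:k] for k in range(1, len(w))]
some_vz_in_M = any(is_covered(v + z, X) for z in prefixes)

# bcd is a proper suffix of ux = abcd and belongs to M.
t = "bcd"
t_ok = (u + x).endswith(t) and t != u + x and is_covered(t, X)

print(hypotheses_hold, some_vz_in_M, t_ok)
```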

                                       Case 1.

v ≠ λ, |y| ≥ |v|:  vz ∈ M, λ ≤ z < w
[Figure: the word u v w, with x and y marked]

v ≠ λ, |y| < |v|:  vx ∈ M
[Figure: the word u v w, with y and x marked]
                                      Case 2.

v = λ, u ≠ λ, |x| > |y|:  t ∈ M, t proper suffix of ux
[Figure: the word u w, with x and y marked]

