Pattern representation & the future of pattern recognition

					     Pattern representation
                &
the future of pattern recognition



             Lev Goldfarb
                ETS group
       Faculty of Computer Science
                   UNB
           Fredericton, Canada
                                       Outline
1.   The wisdom of modern physicists                        (3 slides)
2.   The maturity of a science                              (1 slide)
3.   The currently prevailing wisdom in our field           (1 slide)
4.   Why should this be the guiding wisdom?                 (4 slides)
5.   Are we mature enough for the task?                     (2 slides)
6.   The (social) reason for the status quo                 (1 slide)
7.   The forgotten history: syntactic pattern recognition   (8 slides)
8.   Syntactic pattern recognition: the unrealized hopes    (4 slides)
9.   How should we apply the wisdom of physicists?          (2 slides)
10. ETS formalism: its inspiration                          (2 slides)
11. ETS formalism: temporal information                     (3 slides)
12. ETS formalism                                           (13 slides)
13. ETS formalism: representational completeness            (1 slide)
14. ETS formalism: the intelligent process                  (1 slide)
15. Learning without representation?                        (2 slides)
16. Conclusion                                              (3 slides)
         [     The wisdom of modern physicists
From: Freeman Dyson, Innovations in Physics, Scientific American, September 1958:


        A few months ago two of the great historical figures of European
    physics, Werner Heisenberg and Wolfgang Pauli, believed that they had
    made an essential step forward in the direction of a theory of elementary
    particles. Pauli happened to be passing through New York, and was
    prevailed upon to give a lecture explaining the new ideas to an audience
    that included Niels Bohr, who had been mentor to both . . . in their days
    of glory thirty years earlier when they made their great discoveries. Pauli
    spoke for an hour, and then there was a general discussion during which
    he was criticized sharply by the younger generation. Finally, Bohr was
    called on to make a speech summing up the argument. “We are all
    agreed,” he said, “that your theory is crazy. The question which divides
    us is whether it is crazy enough to have a chance of being correct.
    My own feeling is that it is not crazy enough.”

                              Lev Goldfarb, ICPR, Aug. 2004                  3
             The wisdom of modern physicists
(the quote continues)


        The objection that they are not crazy enough applies to all the
    attempts which have so far been launched at a radically new theory of
    elementary particles. It applies especially to crackpots. Most of the
    crackpot papers that are submitted to the Physical Review are rejected,
    not because it is impossible to understand them, but because it is
    possible. Those that are impossible to understand are usually
    published. When the great innovation appears, it will almost certainly
    be in a muddled, incomplete, and confusing form. To the discoverer
    himself it will be only half-understood. To everybody else it will be a
    mystery. For any speculation that does not at first glance look crazy,
    there is no hope.



                                                                 4
         The wisdom of modern physicists                    ]

Why did two of the 20th century's leading physicists behave so
“childishly”?


Their wisdom is this:
judging by past experience, radical novelty in a proposed physical
model (of a unified field theory) is a necessary prerequisite for it
to be a serious “contender”.




                                                                 5
                 The maturity of a science
Why did I begin the talk with the above quote?


I want to draw your attention to one very important informal fact:

the maturity of a science is reflected in the ability of its
practitioners to estimate the quality of the match between
the reality and its model.


This is one of the main messages I would like you to keep in mind,
and I hope we will discuss it in this workshop.


How is our field doing in this respect?
                                                                 6
 The currently prevalent wisdom in our field

• The main (subconscious?) postulate: Completely rely on
statistical models (and, therefore, on the vector space formalism).

• In accordance with the above postulate, expect that there exist
some new, statistically “profound” models/algorithms that would
do a satisfactory job.

• Innovate by taking an apparently structural representation,
converting it to a numeric one (to “tame” its structural elements), and
reducing the problem to the more familiar statistical setting (in the
process, misleading yourself and others into believing that one is
actually dealing with the structural representation).

                                                                 7
    [   Why should this be the guiding wisdom?
Indeed, why should we confine ourselves to the statistical framework?

In their 1974 book, two of the present leaders in the field of statistical
pattern recognition, Vapnik and Chervonenkis, expressed similar (if
somewhat rhetorical) doubts:

   In those days it appeared that the pattern recognition problem
   carried within itself the beginnings of some new idea, which is in
   no way based on the system of old concepts; researchers wanted to
   find new formulations, not to reduce the problem to already known
   mathematical schemes. In this sense the reduction of the pattern
   recognition problem to the scheme of average risk minimization
   rouses some disappointment. True, there are attempts to understand
   the problem in a more complex formulation . . . . As yet, however,
   such attempts are extremely rare.
                                                                     8
      Why should this be the guiding wisdom?

(the quote continues)



      Now, many years after the period of ‘pattern recognition
   romantics’, it is difficult to estimate what this problem formalization
   brought. It is possible that the desire to find a rigorous formulation
   led scientists to restrict the meaningful problem solution of which
   was attempted at the beginning of the ’60s.




                                                                 9
      Why should this be the guiding wisdom?


From the preface of Probability, Statistics and Truth (1957), by one
of the 20th century's pioneers of modern probability and statistics,
Richard von Mises:

    The stated purpose of these [mentioned earlier] investigations is to
    create a theory of induction or ‘inductive logic’. According to the
    basic viewpoint of this book, the theory of probability in its
    application to reality is itself an inductive science; its results and
    formulas cannot serve to found the inductive processes as such . . ..




                                                                10
    Why should this be the guiding wisdom?                              ]
From: A. N. Kolmogorov (one of the founders of modern probability
theory), Logical basis for information theory and probability theory,
IEEE IT-14 (1968):

   The preceding rather superficial discourse should prove two general theses:

   (1) Basic information theory concepts must and can be founded without
       recourse to the probability theory . . . .
   (2) Introduced in this manner, information theory concepts can form the basis
       of the concept random, which [would then] naturally suggest that the
       random is the absence of periodicity.



                                                                11
        [ Are we mature enough for the task?
Why do present-day physicists feel compelled to venture (and on a big
scale) into such highly speculative theories as “string theories”, while we are
infatuated with the “good old” statistics that simply cannot address the
qualitative side of modeling, i.e. the structure of the model itself?

Do we understand that an adequate modeling of information processes cannot
succeed in the same manner as, for example, the modeling of flight has
succeeded (i.e. without capturing the essence of the corresponding biological
processes)?

In particular, do we understand that without producing reasonably good models
of the information processes in nature we will not succeed in developing
satisfactory information systems?



                                                                12
       Are we mature enough for the task?                  ]

God forbid if we are not:
from the very beginning of our science, we have been faced with modeling
much more abstract processes than physicists have ever been

=> when modeling information processes, we need even greater
imagination than physicists do (who, as I mentioned above, are ahead
of us in many ways).

It seems quite obvious to me that without some radically new insights
we are not going to get to any “promised land”.

                                                                13
           The (social) reason for the status quo


The above prevalent wisdom has not always been as popular as it is
today.

One of the main reasons for the status quo is the forgotten part of our
history, due to the emergence during the last 15-20 years of two “new”
popular areas, neural networks and machine learning. Both of them
deal with the same subject matter as pattern recognition but started,
basically, all over again, eventually rediscovering the importance of
symbolic representations.
(In contrast to pattern recognition, their professional milieu is no longer
engineering, but psychological and computational/statistical, respectively,
although both of them have attracted many young physicists.)
                                                                14
[   The forgotten history: syntactic pattern recognition


• In North America, one of the few early general texts on pattern
recognition, Pattern Recognition Principles (1974), by Tou and
Gonzalez, had the last chapter (chapter 8) titled “Syntactic Pattern
Recognition” and considered “structural” pattern representations.


• Among the English-language books that came out in the ’70s and ’80s and
were devoted entirely to this topic, we had those by Fu (Syntactic Pattern
Recognition and Applications), Grenander (Lectures in Pattern
Theory), Gonzalez and Thomason (Syntactic Pattern Recognition),
Watanabe (Pattern Recognition) and several others.


                                                                15
The forgotten history: syntactic pattern recognition



 In the resulting excitement and during the making of so many
 careers in the above two new sister areas, some of the important
 lessons learned in syntactic/structural pattern recognition were lost,
 i.e. the critical role of (non-vector) pattern representations and
 formalisms was overlooked.




                                                                16
The forgotten history: syntactic pattern recognition

Syntactic pattern recognition
• Pioneers: Eden, Narasimhan, Ledley (who published their initial work
in the early ’60s), and others
• King-Sun Fu (of Purdue University; he was also instrumental in the founding
of IAPR and was its first president) mounted a productive and influential applied
scientific program to shift emphasis from the vector space based
representation to other, “structural”, forms of representation,
predominantly those associated with formal grammars and their various
generalizations.
Fu began his career in statistical pattern recognition; later, in the ’70s and early ’80s,
he was largely responsible for the creation of a burgeoning subfield of syntactic
pattern recognition, and his untimely death in 1985 had a big impact on the vitality of
this subfield.
                                                                17
18
The forgotten history: syntactic pattern recognition



Narasimhan (1964):


   The aim of any adequate recognition procedure should not be
   merely to arrive at a “yes”, “no”, “don’t know” decision but to
   produce a structural description of the input picture.




                                                                19
The forgotten history: syntactic pattern recognition



There are applications of syntactic pattern recognition to almost
any field, from seismic oil exploration to speech recognition, from
face recognition to fingerprint recognition.




                                                                20
The forgotten history: syntactic pattern recognition



The main overlooked lesson from syntactic pattern recognition
(already noted by its pioneers) is this:


even in this incomplete form, “structural” pattern and class
representations have substantial advantages over their vector space
counterparts, from both applied and theoretical points of view.
(However, see slides 23-26).




                                                                21
                                                                          ]




Compared to the vector space representation of the digitized image, under
symbolic representation, one moves immediately into a more meaningful, higher
level representation, with a generative class description.
[    Syntactic pattern recognition: the unrealized hopes

    What is the problem, then? Why have these advantages not yet
    materialized in a more apparent manner?
                           ______________________________


    Fundamental inadequacy of the (conventional) string representation

    Take a string: afdbaaccbdfaddbbcacbffacda

    => no temporal information is represented in it (i.e. how the string
    was “formed”)

    => exponentially many candidate operations to consider that could
    have been involved in the generation of the string
                                                                   23
Syntactic pattern recognition: the unrealized hopes


 => given a training set of strings, the inductive learning process
 simply cannot reliably recover the set of “generative operations”,
 i.e. the class description

 => basic inadequacy of the underlying formal structure of the
 conventional syntactic (and, similarly, all computational) formalisms: the
 “link” between the class and object representations is too weak.




                                                                24
Syntactic pattern recognition: the unrealized hopes


Thus, the conventional string is not an adequate/reliable form of
representation: there are just too many formative object histories that
are “hidden” behind this representation.



(The related observation applies to graphs and various numeric
representations.)
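
The point about “hidden” formative histories can be made concrete with a rough count. The sketch below is only an illustration under a strong simplifying assumption that is not part of the talk: the sole generative operation is the insertion of a single character at any position. The function name count_histories is hypothetical. Even under this one-operation assumption, a string with all distinct characters of length n already has n! formative histories.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_histories(s: str) -> int:
    """Count distinct chains "" -> ... -> s in which every step inserts one character.

    Assumption (not from the talk): single-character insertion is the only
    generative operation. Two histories are the same if they pass through the
    same sequence of intermediate strings.
    """
    if not s:
        return 1
    # Undo the last insertion in every possible way. Identical predecessors
    # (e.g. deleting either 'a' from "aa") are collapsed by the set.
    predecessors = {s[:i] + s[i + 1:] for i in range(len(s))}
    return sum(count_histories(p) for p in predecessors)


if __name__ == "__main__":
    # First eight characters of the string shown on the slide.
    print(count_histories("afdbaacc"))
```

The string itself records none of this: every one of the counted histories ends in exactly the same sequence of characters.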




                                                                25
                                                                               ]




A somewhat obvious problem, which is a consequence of the above fundamental
inadequacy, is the presence of the second (“spurious”) alphabet of the non-terminals.
                   We need to be wise


About 2500 years ago Democritus wrote:

“Fools can learn from their own experience; the wise learn from
the experience of others.” 

                          ____________


So, let’s try to be wise and learn as much as we can from the
experience of physicists, mathematicians, and biologists.


                                                                27
[   So how should we apply the wisdom of physicists?


Going back to slide 5, since an incremental wisdom has not really
worked for our field, how should we interpret “a radical novelty” for
our needs?

I suggest that we should interpret it in two (equally important) ways,
both pointing towards radically new forms of representation.




                                                                28
How should we apply the wisdom of mathematicians?


First: why the representation?


If we interpret correctly, from the applied point of view, the wisdom
of modern mathematics, we would immediately accept that, from the
representational point of view,

• the data operations that are not derivatives/compositions of the
basic operations (specified by the underlying axiomatic structure of
the “data space”) cannot be inductively recovered/discovered.



                                                                29
 How should we apply the wisdom of physicists?                         ]
 Second:

• we should demand from the model a radical explanatory novelty

   we should expect it to offer some basic insights into the nature of
   information processes in the Universe


• we should demand radical novelty in its formal structure

    we should expect it to embody a radically new formal structure



                                                                30
                ETS formalism: its inspiration

From the very beginning, the ETS framework has been inspired by
the formal/esthetical beauty and power of a dynamic (and
generalized) version of the generative grammar model:

to support an evolving concept of class, one needs an evolving set of
transformations that captures the class description and also modifies
the corresponding (evolving) mathematical structure on the
representation “space”.


(In that sense, if besides ETS there is another formal realization of this vision, it
should definitely be investigated.)


                                                                31
             ETS formalism: its inspiration

In mathematics, so far, we have been dealing with various static
(abstract) structures.
For example, in group theory, which does study the subgroup lattice
of a given group, there are, quite naturally, no expectations that one
subgroup is obtained by modifying another one.
Even in a more “continuous” setting of a topological space, there are
again no expectations that a topological structure itself is evolving.
                          -----------------------
In contrast, in ETS formalism, some of the central building blocks of
the formal structure, the set of transformations, are being modified on
the basis of the inductive experience.

                                                                32
   [     ETS formalism: temporal information

Thus, it should not come as a surprise that, when we came to the
formalization stage about 5 years ago, we had to begin literally
from scratch.


The main difficulties have been (and will continue to be)
associated with the need to introduce temporal information
into a structural representation, i.e. with the concept of an object’s
formative/generative structural history.


And it is precisely this feature that characterizes the radical
departure from all known mathematical paradigms.

                                                                33
             ETS formalism: temporal information




Event environment versus object environment:
In State 1, three unbonded oxygen atoms are shown. After the first “real” event has occurred,
OA and OB become bonded, and the corresponding “ideal” event (primitive p1) is depicted on
the right.
         ETS formalism: temporal information                            ]
From Edward Witten, Universe on a String, Astronomy, June 2002:


Note how one event (particle on the left or string on the right) is immediately
followed by two events (two particles/strings).
[      ETS formalism: (class) primitive transformations
(Figure: a primitive transformation, with labels “initial sites”, “time”, and “terminal sites”)

• Think of a primitive as an “elementary” process that transforms the initial “objects” into
terminal ones: it is a symbolic “notation” of a typically nontrivial process (structured event).
• The circle and the square denote two site types: letters {a, b} and {x, y} are names of the
variables that are allowed to vary over non-overlapping sets of numeric labels.
• Brackets [ ] signify that we are, in fact, dealing with a class of (original) primitives, where
each original primitive carries concrete numeric labels.
ETS formalism: structs (segments of formative history)




(Figure labels: “number 3”; “representations of a more general structural object”)

Each $p_i$ denotes an ETS primitive transformation (the order in which the primitive
transformations are applied is captured in the representation).
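
A struct can be pictured, very roughly, as a time-ordered record of primitive applications. The sketch below is a toy encoding and not the formal ETS definition: the names Primitive, Event, Struct, and natural_number are hypothetical, sites are reduced to simple counts, and encoding the number 3 as three chained applications of a single "succ" primitive is an assumption suggested by the figure label and the reference to the Peano axioms in the conclusion.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Primitive:
    """A named elementary process (hypothetical toy model: sites are just counts)."""
    name: str
    n_initial: int   # number of initial sites
    n_terminal: int  # number of terminal sites


@dataclass(frozen=True)
class Event:
    """One application of a primitive, attached to earlier events in the history."""
    primitive: Primitive
    attached_to: Tuple[int, ...]


@dataclass
class Struct:
    """A segment of formative history: events are stored in temporal order."""
    events: List[Event] = field(default_factory=list)

    def apply(self, primitive: Primitive, attached_to: Tuple[int, ...] = ()) -> int:
        # An event may only be attached to events that already happened.
        for i in attached_to:
            if not 0 <= i < len(self.events):
                raise ValueError("can only attach to an earlier event")
        if len(attached_to) > primitive.n_initial:
            raise ValueError("more attachments than initial sites")
        self.events.append(Event(primitive, tuple(attached_to)))
        return len(self.events) - 1

    def history(self) -> List[str]:
        return [e.primitive.name for e in self.events]


def natural_number(n: int) -> Struct:
    """Assumed encoding: the number n is n chained applications of 'succ'."""
    succ = Primitive("succ", n_initial=1, n_terminal=1)
    s = Struct()
    prev: Optional[int] = None
    for _ in range(n):
        prev = s.apply(succ, () if prev is None else (prev,))
    return s


if __name__ == "__main__":
    print(natural_number(3).history())
```

Unlike a plain string of symbols, the record keeps both the order of the events and what each event was attached to.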
           ETS formalism: extructs (contexts)




• Examples of extructs: heavy lines identify the interface sites and crosses identify
detached sites.
• Contexts should be thought of as parts of the formative history that are necessary
for the presence of the (immediately following) “important” segments of history.
 ETS formalism: transformation



(Figure: a transformation as a context/body pair, shown both as the formal definition and as the “assembled” transform)
ETS formalism: a supertransformation



A supertransform, $\boldsymbol{\tau}$ (bold tau), is a generalization of the
concept of transformation, and it can be thought of as an abstraction of
the set of several “closely related” and inductively acquired transforms.

Here, all contexts have the same interface sites and all bodies have
the same initial and terminal sites.




                                                   40
ETS formalism: class supertransform
      (structural class representation)




The class supertransform, $[\boldsymbol{\tau}]$, is obtained on the basis
of the supertransform, by abstracting away the supertransform’s site
labels.




                                                        41
ETS formalism: (single level) class representation


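Class representation (associated with a class supertransform $[\boldsymbol{\tau}]$) is
defined as a pair

$$\mathrm{CLASS}[\boldsymbol{\tau}] = \big( [\boldsymbol{\tau}],\ CB_{\boldsymbol{\tau}} \big),$$

where $CB_{\boldsymbol{\tau}}$ is the context-body association strength scheme,
or simply class weight scheme:

$$CB_{\boldsymbol{\tau}} : \{\, t \mid t \in \boldsymbol{\tau} \,\} \to \mathbb{R}^{+}.$$


(Obviously, $[\boldsymbol{\tau}]$ is the main, “structural”, part of the representation.)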
                                                                   42
  ETS formalism: (structural) description of a
         single representational level



A transformation system TS is simply a finite set of class
supertransforms:

$$TS = \{\, [\boldsymbol{\tau}_1], [\boldsymbol{\tau}_2], \ldots, [\boldsymbol{\tau}_m] \,\}.$$
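
As a data structure, the two definitions above (a class representation as a pair of a structural part and a weight scheme, and a transformation system as a finite set of structural parts) might be sketched as follows. This is a minimal sketch under assumptions: a transform is reduced to an opaque (context, body) pair of labels, the abstraction of site labels is ignored, weights are taken to be strictly positive, and the names ClassRepresentation and Transform as well as the example labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple

# Assumption: a transform is just an opaque (context, body) pair of labels.
Transform = Tuple[str, str]


@dataclass
class ClassRepresentation:
    """Pair (class supertransform, weight scheme CB), in a toy encoding."""
    supertransform: FrozenSet[Transform]   # the "structural" part
    weights: Dict[Transform, float]        # CB: transform -> positive real

    def __post_init__(self) -> None:
        # CB must be defined on exactly the transforms of the supertransform.
        if set(self.weights) != set(self.supertransform):
            raise ValueError("weight scheme must cover exactly the transforms")
        # Assumption: R+ is read as the strictly positive reals.
        if any(w <= 0 for w in self.weights.values()):
            raise ValueError("weights are assumed to be strictly positive")


if __name__ == "__main__":
    first = ClassRepresentation(
        supertransform=frozenset({("ctx_a", "body_1"), ("ctx_b", "body_1")}),
        weights={("ctx_a", "body_1"): 0.7, ("ctx_b", "body_1"): 0.3},
    )
    second = ClassRepresentation(
        supertransform=frozenset({("ctx_c", "body_2")}),
        weights={("ctx_c", "body_2"): 1.0},
    )
    # A transformation system: a finite set of class supertransforms.
    transformation_system = {first.supertransform, second.supertransform}
    print(len(transformation_system))
```

Keeping the weights outside the structural part mirrors the remark that the class supertransform is the main part of the representation, while the weight scheme is numeric bookkeeping attached to it.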




                                                                43
ETS formalism: transition to the next level
                       (a tentative form)




 For each class supertransform in a transformation system, we choose
 a canonical supertransform (shown on the left) and construct the
 corresponding next-level primitive (shown on the right).
ETS formalism: transition to the next level

Simplified multi-level ETS representation with different time scales
for each level.

Two consecutive levels are shown. The time scale for the higher level
is measured in coarser units: $t'_0$ corresponds to $t_0$, $t'_1$
corresponds to $t_2$, and $t'_2$ corresponds to $t_5$.

The contexts of the transformations are not identified.
         ETS formalism: multi-level view




A multi-level representational tower with a single-level sensor at level 0.
                                                                          46
ETS formalism: multi-level view of class representation



                                  Pyramid view (partial) of a k-th
                                  level class supertransform:
                                  the pyramid is formed by all
                                  subordinate class supertransforms.




                                                            47
  ETS model basics: the evolution of a class                         ]

Since any class is specified by a finite set of weighted k-th level
(for some k) transformations, the class evolution is readily
understood via modification of the set of transformations
(structural change) and/or their weights (quantitative change).
And this is exactly what you will observe in the functioning of
the ETS intelligent process, discussed in the next talk.
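
The two kinds of class evolution named above can be told apart in a few lines. The sketch below is not the ETS intelligent process, which is the subject of the next talk and is not reproduced here. It assumes a class is just a mapping from transforms (opaque labels) to positive weights, and the function names add_transform and adjust_weight are hypothetical.

```python
from typing import Dict, Tuple

# Assumption: a transform is an opaque (context, body) pair of labels,
# and a class is a mapping from its transforms to positive weights.
Transform = Tuple[str, str]
WeightedClass = Dict[Transform, float]


def add_transform(cls: WeightedClass, t: Transform, weight: float) -> WeightedClass:
    """Structural change: the set of transformations itself grows."""
    if t in cls:
        raise ValueError("transform already belongs to the class")
    if weight <= 0:
        raise ValueError("weights are assumed to be strictly positive")
    new_cls = dict(cls)
    new_cls[t] = weight
    return new_cls


def adjust_weight(cls: WeightedClass, t: Transform, factor: float) -> WeightedClass:
    """Quantitative change: same transformations, one weight rescaled."""
    if t not in cls:
        raise KeyError(t)
    if factor <= 0:
        raise ValueError("factor must be positive to keep the weight positive")
    new_cls = dict(cls)
    new_cls[t] = cls[t] * factor
    return new_cls


if __name__ == "__main__":
    c0 = {("ctx_a", "body_1"): 1.0}
    c1 = add_transform(c0, ("ctx_b", "body_1"), 0.5)   # structure changes
    c2 = adjust_weight(c1, ("ctx_a", "body_1"), 2.0)   # only a weight changes
    print(sorted(c1), c2[("ctx_a", "body_1")])
```

Both functions return a new mapping instead of modifying the old one, so successive stages of the class remain available for comparison.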




                                                                48
               The proceedings cover page




This is “Metamorphosis III” by Escher, which was chosen for the
cover page of the proceedings as intimating an evolution of a class.




                                                                49
ETS formalism: representational completeness



A most distinguishing feature of this formalism is its unprecedented
representational completeness and explicitness.


This representational completeness radically changes the formal
side of the modeling (the corresponding “future mathematics”).




                                                                50
       ETS formalism: the intelligent process
In particular, in pattern recognition, the nature of basic algorithms changes
radically, as you will see from the next talk: the processing is now basically
concerned with a careful (algorithmic) “examination” of the input data and
“recording” of the results of such examination in the ETS language.


This is the job of the intelligent process (which includes the learning and
recognition stages):

• the modification of the structural memory (at various levels), i.e. of the class
supertransforms, and occasionally, introduction of new levels
• the modification of the numeric memory (at various levels), which is needed,
at present, to record the statistics related to various observed recurring
associations (between various primitives as well as between the contexts and
the bodies).

                                                                51
   [      Learning without representation?
(See also Section 6 of my introductory paper in the proceedings).




• Within the vector space (VS) formalism, because of a millennia-old
tradition, it appears that representation is “easy” but learning is
“difficult”.


• Within the ETS framework, the present expectation is that
representation is difficult but learning is easy. (Not quite true.)


                                                                52
        Learning without representation?                                           ]
However, since at the end of the VS learning process we have
• no reliable/meaningful object or class representations
  => no “transferable knowledge” about the class objects is gained

and as a consequence

• the results of (carefully crafted) VS learning algorithms can hardly be used for any
information processing needs other than “classification”,

it is very misleading to call such a process “inductive learning”.
After all, induction is the only candidate we have for the central
intelligent process.



                                                                53
                 About our resources


Currently, the (financial) resources of our ETS group are
absolutely negligible, which is why it is not surprising, given
the scale of the undertaking, that the more “substantial”
applications of the ETS formalism are still ahead.




                                                                54
                 Inductive informatics

Nevertheless, we are an optimistic lot and have big plans: 

we even introduced the term inductive informatics for the
new science emerging around the development and various
applications of the ETS formalism (this is to honor the ideas
of the true prophet of modern science, Francis Bacon)

moreover, on the philosophical side, as many scientists and non-
scientists have noted, numeric formalisms have failed “to reveal
the unity of nature”; so we believe that, as a general “symbolic”
formalism, ETS, when developed, promises to fit the bill.


                                                                55
                     [      Conclusion


Going back to slide 6, I would like you to keep in mind the question
raised there.


How mature are we as a science: are we capable of estimating the
quality of the match between the (inductive) information processes
as they exist in nature and our models of them?


Let’s come back to this question periodically and during the
workshop’s concluding open discussion.


                                                                56
                                  Conclusion
• Radical explanatory novelty of the ETS model

  the relationship between an object and its class
  the nature of class and class description
  the temporal nature of class and object descriptions
  the nature of multilevel class description
  the nature of class evolution

• Radical formal/structural novelty of the ETS model

  the first temporal form of structural representation (generalizing the Peano axioms)
  its unprecedented representational completeness/transparency
  still ahead: the mathematics of such structural representations
  (the concept of class, rather than that of a set, will be the pivotal point)
                          Conclusion                ]
• Undoubtedly, the golden age of our field is still ahead, and it appears
that its arrival depends on the choice of the “right” representational
formalism.

• The very heart of any scientific enterprise is the construction of a
fundamentally new model and, much more rarely, of a new formalism.

• I do hope that some of you will participate in this most exciting
development.




                                                                58