Basic Pattern Recognition Concept by izy20048


Basic Pattern Recognition Concept
Xiaojun Qi

Concepts of Pattern Recognition
• Pattern: A pattern is the description of an object.
• According to the nature of the patterns to be recognized, we may divide our acts of recognition into two major types:
  – The recognition of concrete items
  – The recognition of abstract items

• When a person perceives a pattern, he makes an inductive inference and associates this perception with some general concepts or clues which he has derived from his past experience.
• Thus, the problem of pattern recognition may be regarded as one of discriminating the input data, not between individual patterns but between populations, via the search for features or invariant attributes among members of a population.

• The study of pattern recognition problems may be logically divided into two major categories:
  – The study of the pattern recognition capability of human beings and other living organisms. (Psychology, Physiology, and Biology)
  – The development of theory and techniques for the design of devices capable of performing a given recognition task for a specific application. (Engineering, Computer, and Information Science)

• Pattern recognition can be defined as the categorization of input data into identifiable classes via the extraction of significant features or attributes of the data from a background of irrelevant detail.

Task                      Input Data                   Output Response
Character Recognition     Optical signals or strokes   Name of character
Speech Recognition        Acoustic waveforms           Name of word
Speaker Recognition       Voice                        Name of speaker
Weather Prediction        Weather maps                 Weather forecast
Medical Diagnosis         Symptoms                     Disease
Stock Market Prediction   Financial news and charts    Predicted market ups and downs

• Pattern Class: A category determined by some given common attributes.
• Pattern: The description of any member of a category representing a pattern class. When a set of patterns falling into disjoint classes is available, it is desired to categorize these patterns into their respective classes through the use of some automatic device.
• The basic functions of a pattern recognition system are to detect and extract common features from the patterns describing the objects that belong to the same pattern class, and to recognize this pattern in any new environment and classify it as a member of one of the pattern classes under consideration.

Fundamental Problems in Pattern Recognition System Design
• The first problem concerns the representation of input data which can be measured from the objects to be recognized.
  – The pattern vectors contain all the measured information available about the patterns. The measurements performed on the objects of a pattern class may be regarded as a coding process which consists of assigning to each pattern characteristic a symbol from the alphabet set.
  – When the measurements yield information in the form of real numbers, it is often useful to think of a pattern vector as a point in an n-dimensional Euclidean space.
  – The set of patterns belonging to the same class corresponds to an ensemble of points scattered within some region of the measurement space.

• The second problem concerns the extraction of characteristic features or attributes from the received input data and the reduction of the dimensionality of pattern vectors. (This is often referred to as the preprocessing and feature extraction problem.)
  – The features of a pattern class are the characterizing attributes common to all patterns belonging to that class. Such features are often referred to as intraset features.
  – The features which represent the differences between pattern classes may be referred to as the interset features. The elements of intraset features which are common to all pattern classes under consideration carry no discriminatory information and can be ignored.
  – The extraction of features has been recognized as an important problem in the design of pattern recognition systems.

• The third problem involves the determination of optimum decision procedures, which are needed in the identification and classification process.
  – If complete a priori knowledge about the patterns to be recognized is available, the decision functions may be determined with precision on the basis of this information.
  – If only qualitative knowledge about the patterns is available, reasonable guesses of the forms of the decision functions can be made and adjusted as necessary.
  – If there exists little, if any, a priori knowledge about the patterns to be recognized, a training or learning procedure is needed.
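
The distinction between intraset and interset features can be illustrated with a small sketch. It assumes, purely for illustration, that each pattern is a set of named attributes; the attribute names and the two toy classes are hypothetical and not taken from the slides.

```python
# Hypothetical toy data: each pattern is the set of attributes observed on it.
classes = {
    "A": [
        {"closed_loop", "stroke", "vertical_bar"},
        {"closed_loop", "stroke", "vertical_bar", "serif"},
    ],
    "B": [
        {"stroke", "vertical_bar"},
        {"stroke", "vertical_bar", "serif"},
    ],
}


def intraset_features(patterns):
    """Attributes common to every pattern of a single class."""
    return set.intersection(*patterns)


def discriminating_features(classes):
    """Remove intraset features shared by all classes.

    Features present in every class carry no discriminatory
    information, so only the remainder is kept per class.
    """
    intraset = {name: intraset_features(p) for name, p in classes.items()}
    shared_by_all = set.intersection(*intraset.values())
    return {name: feats - shared_by_all for name, feats in intraset.items()}


reduced = discriminating_features(classes)
print(reduced)
```

In this toy case "stroke" and "vertical_bar" appear in both classes and are discarded, which also reduces the dimensionality of the description.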

• The patterns to be recognized and classified by an automatic pattern recognition system must possess a set of measurable characteristics.
• Correct recognition will depend on
  – The amount of discriminating information contained in the measurements;
  – The effective utilization of this information.

Design Concepts and Methodologies
• Membership-roster Concept
  – Characterization of a pattern class by a roster of its members suggests automatic pattern recognition by template matching.
  – The membership-roster approach will work satisfactorily under the condition of nearly perfect pattern samples.

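
Template matching under the membership-roster concept can be sketched as follows. The 3x3 binary templates, the class labels, and the `max_mismatches` tolerance are assumptions made for illustration only; the slides do not prescribe a particular matching measure.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical roster: each class lists its stored member templates (3x3 bitmaps).
ROSTER = {
    "T": [("111", "010", "010")],
    "L": [("100", "100", "111")],
}


def classify_by_template(pattern, roster, max_mismatches=0):
    """Return the label of the closest stored template, or None if none is close enough."""
    flat = "".join(pattern)
    best_label, best_dist = None, None
    for label, members in roster.items():
        for template in members:
            d = hamming(flat, "".join(template))
            if best_dist is None or d < best_dist:
                best_label, best_dist = label, d
    if best_dist is not None and best_dist <= max_mismatches:
        return best_label
    return None


clean_t = ("111", "010", "010")
noisy_t = ("111", "010", "011")  # one corrupted pixel
print(classify_by_template(clean_t, ROSTER))
print(classify_by_template(noisy_t, ROSTER))
print(classify_by_template(noisy_t, ROSTER, max_mismatches=1))
```

With zero tolerance the noisy sample is rejected, which reflects why the approach needs nearly perfect pattern samples.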
• Common-property Concept
  – Characterization of a pattern class by common properties shared by all of its members suggests automatic pattern recognition via the detection and processing of similar features.
  – The basic assumption in this method is that the patterns belonging to the same class possess certain common properties or attributes which reflect similarities among these patterns.

• Advantages of the Common-property Concept over the Membership-roster Concept:
  – The storage requirement for the features of a pattern class is much less severe than that for all the patterns in the class.
  – Significant pattern variations cannot be tolerated in template matching. If all the features of a class can be determined from sample patterns, the recognition process reduces simply to feature matching.

• Clustering Concept
  – When the patterns of a class are vectors whose components are real numbers, a pattern class can be characterized by its clustering properties in the pattern space.
  – If the classes are characterized by clusters which are far apart, simple recognition schemes such as the minimum-distance classifiers may be successfully employed.
  – When the clusters overlap, it becomes necessary to utilize more sophisticated techniques for partitioning the pattern space.

• Overlapping clusters are the result of:
  – A deficiency in observed information;
  – The presence of measurement noise.
• The degree of overlapping can often be minimized by:
  – Increasing the number and the quality of measurements performed on the patterns of a class.
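
A minimum-distance classifier of the kind mentioned under the clustering concept can be sketched as below. It assumes each class is represented by the mean of its training vectors and that Euclidean distance is used; the training points and class labels are made up for illustration.

```python
import math


def class_means(training):
    """Compute the mean vector (cluster center) of each class."""
    means = {}
    for label, vectors in training.items():
        n = len(vectors)
        means[label] = tuple(sum(col) / n for col in zip(*vectors))
    return means


def min_distance_classify(x, means):
    """Assign x to the class whose mean is nearest in Euclidean distance."""
    return min(means, key=lambda label: math.dist(x, means[label]))


# Hypothetical, well-separated clusters in a 2-dimensional pattern space.
training = {
    "w1": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "w2": [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)],
}
means = class_means(training)
print(min_distance_classify((1.0, 2.0), means))
print(min_distance_classify((4.0, 5.0), means))
```

This works here because the two clusters are far apart; with overlapping clusters the nearest mean can be the wrong class.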

• The basic design concepts for automatic pattern recognition described above may be implemented by three principal categories of methodology:
  – Heuristic;
  – Mathematical;
  – Linguistic or syntactic.

• Heuristic Methods: The heuristic approach is based on human intuition and experience, making use of the membership-roster and common-property concepts.
  – A system designed using this principle generally consists of a set of ad hoc procedures developed for specialized recognition tasks.
  – Decision is based on ad hoc rules.
  – Example: Character recognition (detection of features such as the number and sequence of particular strokes)

• Mathematical Methods: The mathematical approach is based on classification rules which are formulated and derived in a mathematical framework, making use of the common-property and clustering concepts.
  – Deterministic approach:
     • Does not explicitly employ the statistical properties of the pattern classes.
  – Statistical approach:
     • It is formulated and derived in a statistical framework.
     • Example: Bayes classification rule and its variations. This rule yields an optimum classifier when the probability density function of each pattern population and the probability of occurrence of each pattern class are known.

• Linguistic (Syntactic) Methods: Characterization of patterns by primitive elements (subpatterns) and their relationships suggests automatic pattern recognition by the linguistic or syntactic approach, making use of the common-property concept.
  – A pattern can be described by a hierarchical structure of subpatterns analogous to the syntactic structure of languages. This permits application of formal language theory to the pattern recognition problem.
  – This approach is particularly useful in dealing with patterns which cannot be conveniently described by numerical measurements or are so complex that local features cannot be identified and global properties must be used.
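
The Bayes classification rule named under the statistical approach can be sketched for a single real-valued measurement. The sketch assumes the class-conditional densities are Gaussian with known mean and standard deviation; this choice of density, the parameter values, and the labels `w1` and `w2` are illustrative assumptions rather than anything stated in the slides.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def bayes_classify(x, classes):
    """Pick the class maximizing p(x | class) * P(class).

    classes maps a label to (prior, mean, std).
    """
    def score(label):
        prior, mean, std = classes[label]
        return prior * gaussian_pdf(x, mean, std)

    return max(classes, key=score)


# Hypothetical class descriptions: (prior probability, mean, standard deviation).
equal_priors = {"w1": (0.5, 0.0, 1.0), "w2": (0.5, 4.0, 1.0)}
skewed_priors = {"w1": (0.9, 0.0, 1.0), "w2": (0.1, 4.0, 1.0)}

print(bayes_classify(2.3, equal_priors))
print(bayes_classify(2.3, skewed_priors))
```

The same measurement is assigned to different classes once the priors change, which shows why both the densities and the class probabilities must be known for the rule to be optimum.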

• In a supervised learning environment, the system is taught to recognize patterns by means of various adaptive schemes. The essentials of this approach are a set of training patterns of known classification and the implementation of an appropriate learning procedure.
• The unsupervised pattern recognition techniques are applicable to situations where only a set of training patterns of unknown classification is available.

Examples of Automatic Pattern Recognition Systems
• Character Recognition:
  – Technique Used: Rather than being compared with pre-stored patterns, hand-printed characters are analyzed as combinations of common features, such as curved lines, vertical and horizontal lines, corners, and intersections.
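
As one possible instance of a supervised learning procedure, the sketch below trains a two-class perceptron on training patterns of known classification. The perceptron is a choice made here for illustration (the slides only speak of "various adaptive schemes"), and the training points are invented.

```python
def train_perceptron(samples, epochs=100, rate=1.0):
    """Learn weights w and bias b from (vector, label) pairs, label in {+1, -1}."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in samples:
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified (or on the boundary): adapt
                w = [wi + rate * y * xi for wi, xi in zip(w, x)]
                b += rate * y
                errors += 1
        if errors == 0:  # every training pattern is classified correctly
            break
    return w, b


def predict(x, w, b):
    """Classify x with the learned linear decision function."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Hypothetical, linearly separable training patterns of known classification.
samples = [
    ((0.0, 0.0), -1), ((1.0, 0.0), -1), ((0.0, 1.0), -1),
    ((3.0, 3.0), 1), ((4.0, 3.0), 1), ((3.0, 4.0), 1),
]
w, b = train_perceptron(samples)
print(w, b)
print(predict((5.0, 5.0), w, b), predict((0.5, 0.5), w, b))
```

The procedure only terminates early when the classes are linearly separable; otherwise it stops after the fixed number of epochs.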

• Automatic Classification of Remotely Sensed Data:
  – Examples: Land use, crop inventory, crop-disease detection, forestry, monitoring of air and water quality, geological and geographical studies, and weather prediction, plus a score of other applications of environmental significance.
  – Technique Used: Bayes classifier

• Biomedical Applications:
  – Technique Used: Pattern primitives, such as long arcs, short arcs, and semi-straight segments, which characterize the chromosome boundaries are defined. When combined, these primitives form a string or symbol sentence which can be associated with a so-called pattern grammar. There is one grammar for each type (class) of chromosome.

• Fingerprint Recognition:
  – Technique Used: Tentative minutiae are detected and their precise locations and angles are recorded.

• Nuclear Reactor Component Surveillance:
  – Technique Used:
     • Detect the clusters of pattern vectors by iterative applications of a cluster-seeking algorithm.
     • The data cluster centers and associated descriptive parameters, such as cluster variances, can then be used as templates against which measurements are compared at any given time in order to determine the status of the plant.
     • Significant deviations from the pre-established characteristic normal behavior are flagged as indications of abnormal operating conditions.
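
The surveillance idea of comparing measurements against cluster centers and variances can be sketched as follows. The sketch assumes the cluster-seeking step has already grouped the normal-operation samples, uses a variance-normalized distance, and flags a measurement when that distance exceeds a threshold of 3; the distance measure, the threshold, and the sample values are all illustrative assumptions.

```python
import math


def summarize_cluster(samples):
    """Return (center, per-component variance) of one cluster of pattern vectors."""
    n = len(samples)
    columns = list(zip(*samples))
    center = [sum(col) / n for col in columns]
    variance = [sum((v - c) ** 2 for v in col) / n for col, c in zip(columns, center)]
    return center, variance


def normalized_distance(x, center, variance):
    """Distance from x to the center, each component scaled by its standard deviation."""
    return math.sqrt(sum((xi - c) ** 2 / v for xi, c, v in zip(x, center, variance)))


def is_abnormal(x, templates, threshold=3.0):
    """Flag x if it is far from every stored normal-behavior template."""
    return all(normalized_distance(x, c, v) > threshold for c, v in templates)


# Hypothetical measurements recorded during normal operation (one cluster).
normal_samples = [(10.0, 1.0), (10.2, 1.1), (9.8, 0.9), (10.0, 1.0)]
templates = [summarize_cluster(normal_samples)]

print(is_abnormal((10.1, 1.05), templates))
print(is_abnormal((12.0, 1.0), templates))
```

Components with zero variance would need special handling, which this sketch omits.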

A Simple Pattern Recognition Model
• A simple scheme for pattern recognition consists of two basic components:
  – Sensor: It is a device which converts a physical sample to be recognized into a set of quantities which characterize the sample.
  – Categorizer: It is a device which assigns each of its admissible inputs to one of a finite number of classes or categories by computing a set of decision functions.

• We assume that the a priori probabilities for the occurrence of each class are equal, that is, it is just as likely that x comes from one class as from another.
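
The sensor and categorizer pair can be sketched in a few lines. The sketch assumes a "physical sample" is a record with a width and a height, and that the categorizer uses one linear decision function per class and picks the class with the largest value; the sample format, the class names, and the weights are hypothetical.

```python
def sensor(sample):
    """Convert a physical sample into the set of quantities that characterize it."""
    return (sample["width"], sample["height"])


def make_linear_decision_function(weights, bias):
    """Build d(x) = weights . x + bias for one class."""
    def d(x):
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return d


def categorize(x, decision_functions):
    """Assign x to the class whose decision function is largest."""
    return max(decision_functions, key=lambda label: decision_functions[label](x))


# Hypothetical decision functions for two classes.
decision_functions = {
    "wide": make_linear_decision_function((1.0, -1.0), 0.0),
    "tall": make_linear_decision_function((-1.0, 1.0), 0.0),
}

x = sensor({"width": 2.0, "height": 5.0})
print(categorize(x, decision_functions))
```

Because no class is favored by a bias term here, the sketch is consistent with treating the classes as equally likely a priori.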
