                Large Scale AFIS Engineering

                          Rajiv Khanna
                        29 November 2000

                Overview

                   AFIS Applications
                   Functionality
                   Performance parameters
                   Systems engineering relationships
                   Test and measurement
                   Modeling

                AFIS Applications

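          [Diagram: Wanted Person, Missing Person, and Felon records are entered into the AFIS. An encounter triggers a search, and the AFIS returns a response.]

          Response contents by record type:
           Wanted Person
             – Where
             – Why
             – Cautions
           Missing Person
             – Who
             – Where
           Felon
             – Criminal Record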

                Large Scale AFIS Functional
                Architecture

          [Diagram: processing flow]
          Fingerprint(s) -> Feature Extraction -> Classification (File Partitioning) -> Prescreen Matcher -> Secondary Matcher -> Decision Logic

                Feature Extraction

                 Locate and encode features
                  from fingerprint images
                 Features are used by
                   – Classifier
                   – Matchers
                 Direct relationship to image
                  quality
                   – Produce image quality
                      measurements
                   – Small correlation to
                      matcher performance

                Fingerprint Classification
                (File Partitioning)
                 Top-down representation of fingerprints
                 Uses global features, typically
                    – Core-delta locations
                    – Ridge counts
                    – Central ridge structure
                 Limit searches to similar classes
                 Three types of Classifiers
                    – Syntactic: Representation and logical rules
                    – Statistical: Global feature measurements
                    – Hybrid

                 [Figure: pie chart of the share of fingerprints in each class. Labels legible in the chart: Fully Referenced, Right Loop, Right Loop/Whorl, Whorl.]

                Prescreen Matcher

                   Coarse (low resolution) matcher
                   Input: Relatively large candidate list & search print
                   Output: Filtered candidate list for secondary matcher
                   Benefits:
                     – Can use older technology
                     – Requires fewer computer resources per candidate than the
                       secondary matcher
                     – Can minimize system computer requirements

                Secondary Matcher

                 High resolution matcher
                   – More features
                   – Higher discrimination
                 Input: Filtered candidate list & search print
                 Output: Similarity measure of each candidate to the search print
                 Benefits:
                   – Provides detailed matching
                   – Improves performance
                 Costs:
                   – Requires more computer resources per candidate
                   – Reduces performance if not used properly
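
The prescreen and secondary matchers form a two-stage cascade: a cheap matcher filters the candidate list, and a costly matcher scores only the survivors. The following is a minimal Python sketch of that idea. The function names, the threshold, and the toy scoring functions are illustrative assumptions and do not come from the presentation.

```python
from typing import Callable, Dict, List

# A matcher maps (search_print, candidate) to a similarity score.
Score = Callable[[str, str], float]

def cascade(search_print: str,
            candidates: List[str],
            coarse_score: Score,
            fine_score: Score,
            prescreen_threshold: float) -> Dict[str, float]:
    """Filter candidates with a cheap matcher, then score survivors with a costly one."""
    # Prescreen: keep only candidates whose coarse score clears the threshold.
    filtered = [c for c in candidates
                if coarse_score(search_print, c) >= prescreen_threshold]
    # Secondary: compute a detailed similarity measure for each survivor.
    return {c: fine_score(search_print, c) for c in filtered}

# Toy stand-ins for real matchers (assumption): strings play the role of prints.
def toy_coarse(a: str, b: str) -> float:
    # Coarse: number of distinct characters the two strings share.
    return float(len(set(a) & set(b)))

def toy_fine(a: str, b: str) -> float:
    # Fine: number of positions where the two strings agree.
    return float(sum(x == y for x, y in zip(a, b)))

scores = cascade("abcd", ["abcf", "wxyz", "abzz"], toy_coarse, toy_fine, 2.0)
print(scores)
```

In this toy run the candidate "wxyz" is dropped by the prescreen, so the fine matcher is only called twice. Raising the prescreen threshold saves more secondary-matcher work but risks discarding a true mate.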

                Decision Logic
                 Combine results for the final report
                   – Accumulate results from distributed processes
                   – Fuse results from
                       • Different matchers
                       • Multiple fingerprints
                 Typically at the end of the processing thread
                 May be embedded in components
                 [Figure: scatter plot, "Primary and Secondary Matcher Scores". X axis: Primary Matcher Score (0 to 7000). Y axis: Secondary Matcher Score (0 to 5000). Legend: True No-hits, True Hits, False Alarms, Miss.]
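
One way to picture the fusion step is a score-level combination followed by a threshold. The sketch below is only an illustration: the presentation does not state a fusion rule, so the averaging over fingers, the equal weights, and the threshold value are all assumptions, as are the function and candidate names.

```python
from typing import Dict, List, Tuple

def fuse_candidate(primary: List[float], secondary: List[float],
                   w_primary: float = 0.5, w_secondary: float = 0.5) -> float:
    """Fuse per-finger scores from two matchers into one candidate score."""
    # Multiple fingerprints: average each matcher's scores over the fingers.
    p = sum(primary) / len(primary)
    s = sum(secondary) / len(secondary)
    # Different matchers: weighted sum of the two averaged scores (assumed rule).
    return w_primary * p + w_secondary * s

def decide(results: Dict[str, Tuple[List[float], List[float]]],
           threshold: float) -> List[Tuple[str, float]]:
    """Return candidates whose fused score meets the threshold, best first."""
    fused = {cid: fuse_candidate(p, s) for cid, (p, s) in results.items()}
    hits = [(cid, score) for cid, score in fused.items() if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)

# Hypothetical accumulated results: candidate -> (primary scores, secondary scores).
results = {
    "cand_A": ([5200.0, 4800.0], [3900.0, 4100.0]),  # strong on both matchers
    "cand_B": ([1500.0, 1700.0], [600.0, 400.0]),    # weak on both matchers
}
report = decide(results, threshold=3000.0)
print(report)
```

An empty report corresponds to a no-hit response. Where the threshold sits determines the balance between false alarms and misses.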

                System Performance Parameter
                Definitions
                 System Reliability (R)
                   – Chance that the system will report a correct match given
                     that there is one in the file
                 System Selectivity (S)
                   – Average number of false candidates per search
                     expected to be reported by the system
                 False Alarm Rate (RFA)
                   – Chance that the system will report an incorrect match
                 Standard Error Margin
                   – Confidence interval for measurements

                Internal Performance Parameter
                Definitions
                 Conditional Reliability (Rk)
                   – Chance that the kth stage will pass a correct match given
                      that it passed at previous stages
                   – Output/Input relationship for true matches
                 Filter Rate (Fk)
                   – Expected percentage of input false candidates passed
                      (output) by the kth stage
                   – Output/Input relationship for candidate matches

                Some System Engineering
                Relationships
                 System Reliability is the product of the conditional reliabilities:
                      $R_{System} = R_1 \cdot R_2 \cdots R_k$
                 System [average] Filter Rate is the product of the stage filter rates:
                      $F_{System} = F_1 \cdot F_2 \cdots F_k$
                 System Selectivity is the product of file size ($f$) and system filter rate ($F_{System}$):
                      $S = f \cdot F_{System}$
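
The three product relationships can be checked numerically. The sketch below uses a hypothetical three-stage system. The stage values and the file size are made-up numbers chosen for easy arithmetic, not measurements from the presentation.

```python
from math import prod
from typing import Sequence

def system_reliability(stage_reliabilities: Sequence[float]) -> float:
    """Product of the conditional reliabilities R_1 ... R_k."""
    return prod(stage_reliabilities)

def system_filter_rate(stage_filter_rates: Sequence[float]) -> float:
    """Product of the stage filter rates F_1 ... F_k."""
    return prod(stage_filter_rates)

def system_selectivity(file_size: int, stage_filter_rates: Sequence[float]) -> float:
    """File size times the system filter rate."""
    return file_size * system_filter_rate(stage_filter_rates)

# Hypothetical stage values (assumptions for illustration only).
stage_r = [0.99, 0.98, 0.995]
stage_f = [0.01, 0.001, 0.01]

r_system = system_reliability(stage_r)
f_system = system_filter_rate(stage_f)
s_system = system_selectivity(10_000_000, stage_f)
print(r_system, f_system, s_system)
```

With these numbers the system passes about one false candidate per search against a 10 million record file, and no stage's reliability is exceeded by the system reliability.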

                Benchmarking AFIS

                 Benchmarking is a process to measure AFIS performance
                   – Collect representative data
                      • Background File
                      • Mated Pairs
                          – File Fingerprints
                          – Search Fingerprints
                      • Un-mated Search prints
                   – Load File
                      • Background File
                      • File prints with mated search prints
                   – Run Search prints against the file as a benchmark
                 Measure performance parameters

                Measuring Reliability

                 R is the probability that the system will report the correct match given that it is in the file
                 Measured with mated pairs
                 Use a relative frequency approach to estimate R
                 There is a trade-off between Reliability and Selectivity

                 $\hat{R} = \frac{1}{N} \sum_{i=1}^{N} R_i$
                 where
                   $R_i = 1$ if the search results in the correct candidate, and $R_i = 0$ if the search result is incorrect
                   $i$ is an index to the searches
                   $N$ is the number of searches with mates in the file.

                 Confidence Interval:
                 $\hat{R} \pm z_{\alpha/2} \sqrt{\frac{\hat{R}(1 - \hat{R})}{N}}$
                 where $z_{\alpha/2}$ is the number of standard deviations from the mean for the confidence interval.
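
The relative frequency estimate and its error margin can be computed directly from a list of 0/1 search outcomes. This is a minimal sketch using the normal approximation stated on the slide. The function name, the 95% default confidence level, and the sample outcomes are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist
from typing import Sequence, Tuple

def estimate_reliability(outcomes: Sequence[int],
                         confidence: float = 0.95) -> Tuple[float, float]:
    """Return (R_hat, error margin) from 0/1 mated-search outcomes."""
    n = len(outcomes)
    r_hat = sum(outcomes) / n  # fraction of mated searches that found the mate
    # z_{alpha/2}: two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(r_hat * (1 - r_hat) / n)
    return r_hat, margin

# Hypothetical benchmark: 1000 mated searches, 950 of which reported the mate.
outcomes = [1] * 950 + [0] * 50
r_hat, margin = estimate_reliability(outcomes)
print(f"R = {r_hat:.2%} +/- {margin:.2%}")
```

The margin shrinks with the square root of the number of mated searches, so halving it requires roughly four times as many mated pairs.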

                Measuring Selectivity
                 S is the number of false candidates per search expected to be reported by the system
                 Use the average to estimate S
                 Expect Selectivity to increase as file size grows
                   – Use projection models
                   – May need to adjust system to reduce selectivity
                   – Adjustment will lower Reliability

                 $\hat{S} = \frac{1}{N} \sum_{i=1}^{N} S_i$
                 where
                   $S_i$ is the number of false candidates reported for the $i$th search
                   $N$ is the number of searches.

                 Confidence Interval:
                 $\hat{\sigma}_S^2 = \frac{1}{N-1} \sum_{i=1}^{N} (S_i - \hat{S})^2$
                 $\hat{S} \pm \frac{\hat{\sigma}_S \, z_{\alpha/2}}{\sqrt{N}}$
                 where $z_{\alpha/2}$ is the number of standard deviations from the mean for the confidence interval.
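
Selectivity is a sample mean, so its error margin uses the sample standard deviation rather than the binomial form. A minimal sketch follows. The function name, the 95% default, and the per-search counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev
from typing import Sequence, Tuple

def estimate_selectivity(false_counts: Sequence[int],
                         confidence: float = 0.95) -> Tuple[float, float]:
    """Return (S_hat, error margin) from per-search false-candidate counts."""
    n = len(false_counts)
    s_hat = mean(false_counts)
    sigma = stdev(false_counts)  # sample standard deviation (N - 1 denominator)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return s_hat, z * sigma / sqrt(n)

# Hypothetical counts of false candidates reported on ten searches.
counts = [0, 1, 0, 2, 0, 0, 1, 0, 0, 0]
s_hat, s_margin = estimate_selectivity(counts)
print(f"S = {s_hat:.2f} +/- {s_margin:.2f}")
```

With only ten searches the margin is about as large as the estimate itself, which illustrates why a benchmark needs many searches before a selectivity figure is meaningful.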

                Measuring False Alarm Rate
                 RFA is the probability that the system will report an incorrect match
                 Applies to systems that have a binary outcome, e.g. hit/no-hit report
                 Use a relative frequency approach to estimate RFA

                 $\hat{P}_{FA} = \frac{O}{C}$
                 $\hat{R}_{FA} = f \cdot \hat{P}_{FA}$
                 where
                   $O$ is the number of observed false candidates
                   $C$ is the number of finger compares
                   $f$ is the file size.

                 Confidence Interval:
                 $\hat{P}_{FA} \pm z_{\alpha/2} \sqrt{\frac{\hat{P}_{FA}(1 - \hat{P}_{FA})}{C}}$
                 where $z_{\alpha/2}$ is the number of standard deviations from the mean for the confidence interval.
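
The false alarm estimate works in two steps: a per-compare probability from observed counts, then a scaling by file size. The sketch below follows those two formulas. The counts and file size are hypothetical, and the margin is computed for the per-compare probability because the slide's interval is written in terms of the number of compares.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple

def estimate_false_alarm_rate(observed_false: int, compares: int, file_size: int,
                              confidence: float = 0.95) -> Tuple[float, float, float]:
    """Return (P_FA_hat, R_FA_hat, error margin on P_FA_hat)."""
    p_fa = observed_false / compares  # false candidates per finger compare
    r_fa = file_size * p_fa           # per-search rate scaled by file size
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_margin = z * sqrt(p_fa * (1 - p_fa) / compares)
    return p_fa, r_fa, p_margin

# Hypothetical benchmark: 20 false candidates over 200 million compares,
# projected onto a file of 100,000 records.
p_fa, r_fa, p_margin = estimate_false_alarm_rate(
    observed_false=20, compares=200_000_000, file_size=100_000)
print(p_fa, r_fa, p_margin)
```

Because false candidates are rare events, the relative margin is driven mostly by the small observed count (20 here) rather than by the very large number of compares.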

                Measuring Filter Rate
                 Fk is the percentage of non-mating input candidates expected to be passed by the kth stage
                 Use the average to estimate Fk

                 $F_k = \frac{1}{N} \sum_{i=1}^{N} F_i$
                 where
                   $F_i$ is the filter rate for a search
                   $N$ is the number of searches at stage $k$.

                 Confidence Interval:
                 $\hat{\sigma}_F^2 = \frac{1}{N-1} \sum_{i=1}^{N} (F_i - F_k)^2$
                 $F_k \pm \frac{\hat{\sigma}_F \, z_{\alpha/2}}{\sqrt{N}}$
                 where $z_{\alpha/2}$ is the number of standard deviations from the mean for the confidence interval.
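
Each per-search filter rate is a ratio of non-mates passed to non-mates received, and the stage filter rate is the average of those ratios. The sketch below shows that calculation. The function name, the input layout of (received, passed) pairs, and the sample counts are assumptions.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev
from typing import Sequence, Tuple

def stage_filter_rate(per_search: Sequence[Tuple[int, int]],
                      confidence: float = 0.95) -> Tuple[float, float]:
    """Return (F_k, error margin) from (non-mates in, non-mates out) per search."""
    # F_i: fraction of non-mating input candidates the stage passed on search i.
    rates = [passed / received for received, passed in per_search]
    f_k = mean(rates)
    sigma = stdev(rates)  # sample standard deviation (N - 1 denominator)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return f_k, z * sigma / sqrt(len(rates))

# Hypothetical stage log: (non-mates received, non-mates passed) for four searches.
searches = [(1000, 12), (1000, 8), (2000, 20), (500, 5)]
f_k, f_margin = stage_filter_rate(searches)
print(f"F_k = {f_k:.4f} +/- {f_margin:.4f}")
```

Averaging the per-search ratios, as the slide's formula does, weights every search equally regardless of how many candidates it received.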

                Projecting Selectivity

                 Larger file makes discriminating between mate and non-mate fingerprints harder
                 More high-score non-mates
                 Method 1: Traditional Statistics Model (shown)
                   – Divide measured selectivity by test file size
                   – Multiply by target file size
                 Method 2: Apply extreme value statistics model

                 $\hat{S}_P = f_{target} \left( \frac{\hat{S}}{f_{test}} \right)$
                 where
                   $\hat{S}$ is the measured selectivity
                   $f_{target}$ is the expected system file size
                   $f_{test}$ is the test file size
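
Method 1 is a linear scaling, which a few lines of Python make concrete. In the sketch below, the 70K test file and 40M target file echo the sizes named in the projection chart title, while the measured selectivity of 0.001 is a made-up input. The extreme value model of Method 2 is not shown.

```python
def project_selectivity(measured_selectivity: float,
                        test_file_size: int,
                        target_file_size: int) -> float:
    """Method 1: scale measured selectivity linearly with file size."""
    per_record_rate = measured_selectivity / test_file_size
    return target_file_size * per_record_rate

# Hypothetical measurement: 0.001 false candidates per search on a 70K test file.
projected = project_selectivity(0.001, 70_000, 40_000_000)
print(f"Projected selectivity at 40M: {projected:.3f}")
```

Under this model, growing the file by a factor of about 571 grows the expected number of false candidates per search by the same factor, so a small benchmark figure can become significant at full scale.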

                Selectivity Projection from 70K to 40M File

                 [Figure: Projected Selectivity (log scale, 0.01 to 10000) versus Threshold Score (900 to 2200). Series: DS1 Selectivity, T= 0.05, 40M; DS1 Selectivity, T= 0.01, 40M.]

                Measuring Reliability as a Function of Score

                 [Figure: Reliability (80% to 100%) versus Threshold Score (1000 to 3500). Series: DS1 Reliability, T= 0.05; DS1 Reliability, T= 0.1; each with a polynomial trend line.]

                Projected Reliability and Selectivity

                 [Figure: Reliability (90% to 97%) versus Projected Selectivity (log scale, 0.001 to 1). Series: 60M File, 40M File, 20M File, 10M File, 5M File.]

                Some [Textbook] Systems Engineering
                 Prove that a minimum conditional reliability of at least 95% is needed to achieve 95% system reliability.

                 Let Rmin be the minimum conditional reliability:

                   $R_{min} \le R_1, R_2, R_3, R_4, \ldots, R_k$

                 Recall the product equation for conditional reliability:

                   $R_{System} = R_1 \cdot R_2 \cdot R_3 \cdots R_k \le R_{min}$ since $R_i \le 1$

                 So a 95% system reliability requires Rmin of at least 95%. If Rmin is exactly 95%, the other conditional reliabilities must all equal 1 (which is hard to do).

                 Corollary: in practice, Rmin > System Reliability Requirement
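
A quick numerical check makes the corollary concrete. The sketch below assumes, purely for illustration, that all stages have the same conditional reliability, and solves the product equation for that common value. The four-stage count and the sample stage values are hypothetical.

```python
from math import prod

def required_stage_reliability(system_target: float, stages: int) -> float:
    """Common per-stage reliability r such that r ** stages equals the target."""
    return system_target ** (1.0 / stages)

# Four equal stages aiming at 95% system reliability (assumed stage count).
r4 = required_stage_reliability(0.95, 4)
print(f"Each of 4 stages needs about {r4:.4%}")

# One stage at exactly 95% and three near-perfect stages still fall short.
shortfall = prod([0.95, 0.999, 0.999, 0.999])
print(f"System reliability with Rmin = 95%: {shortfall:.4%}")
```

With four equal stages, each one needs roughly 98.7% conditional reliability, well above the 95% system requirement.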

                Conditional Reliability Measurements
                                             RSSP      RM1       RM2       RMM       System Reliability

                      Ten-print Rolled Ink   99.84%    99.07%    99.94%    99.94%    98.96%
                       Error Margins          0.19%     0.44%     0.11%     0.11%     0.47%

                      Two-print Rolled Ink   99.23%    98.79%   100.00%    99.94%    98.74%
                       Error Margins          0.40%     0.50%     0.00%     0.11%     0.51%

                      Two-print Flat LS      99.88%    91.95%    99.22%    98.50%    89.87%
                       Error Margins          0.17%     1.30%     0.42%     0.58%     1.44%

                      Two-print Flat LS      99.88%    91.90%    99.35%    98.04%    89.51%
                       Error Margins          0.17%     1.31%     0.38%     0.66%     1.47%

                      Note: Reliabilities were computed to machine precision and rounded.
                      Multiplying the reported values will introduce rounding errors.

                Computer Resources Are Modeled as Linear with File Size

                 [Figure: Relative Computer Resources (0 to 16) versus File Size in millions (0 to 60). Series: Selectivity 0.01-0.02; Selectivity 0.02-0.03; Selectivity 0.06-0.07.]
                 Key Assumptions: scalable architectures, small fixed resources, and today’s technology

                Additional Fingers Reduce Computer Resources and Risks

                 [Figure: Relative Computer Resources (0 to 12) versus Number of Fingers (0 to 12). Series: Low Selectivity < 0.01; Selectivity 0.01-0.02; Selectivity 0.02-0.03; Selectivity 0.06-0.07; Extrapolations from File Size Model.]
                 Additional Assumption: multiple-flat livescan searches are similar to their rolled counterparts

                Summary

                 Proposed a generic AFIS architecture and function
                  definitions
                 Presented performance parameters
                 Methods for measurement
                 Systems engineering
                 Modeling

				