
              J M KUYL
   Department of Chemical Pathology
       NHLS Universitas & UFS
Physicists have a long tradition of building their own
equipment, and are often fascinated by its
mechanics. Biologists' fascination is primarily with
the mechanics of nature and, for many, the
machines themselves are simply tools –
complicated 'black boxes' that produce the results
they need. It doesn't help that the tools biologists
are using may have been designed by physicists,
and that the two groups tend to use different jargon.

Nature 2007; 447: 116
• Quantitative analytical methods have become more
  reliable and more standardized.
• Emphasis has moved away from method development to
  the selection and evaluation of those commercially
  available methods that best suit a particular laboratory.
• Commercial kit methods are ready for implementation in
  the laboratory, often in a “closed” analytical system on a
  dedicated instrument.
• Furthermore, method evaluation is a costly exercise in
  terms of reagents, specimens, and labour and time of
  the professionals doing the evaluating.
• If not done properly, it wastes laboratory revenue and
  time and, if the method is accepted, might lead to errors in
  medical decisions based on the results the method
  generates on patient samples.
Generally, laboratories are so concerned with
getting methods up and running that
little time or thought is given to
selection and evaluation studies.
• The most common scenario is the
  implementation of readily available commercial
  kit methods, often in a “closed” analytical system
  on a dedicated instrument.

• When a new clinical analyzer is included in the
  overall evaluation process, various instrumental
  parameters also require evaluation. Information
  on most of these parameters should be available
  from the instrument manufacturer, who should
  also be able to furnish information on what user
  studies to conduct in estimating these
  parameters for an individual analyzer.
[Flow diagram. Box labels: Establish need; Method selection;
 Definition of quality; Submission of specimen; Quality control practices]

Reasons for Selecting a New Method

•   to improve accuracy and/or precision over
    existing methods
•   to reduce reagent cost
•   to reduce labour cost
•   new analyzer or instrument
•   to measure a new analyte
•   Evaluation of need
•   Application characteristics
•   Method characteristics
•   Analytical performance characteristics
       Scopes of Method Evaluation

• Evaluation is the determination of the analytical performance
  characteristics of a new method.

• Validation is confirmation by examination and provision of
  objective evidence that the particular requirements for a specific
  intended use can be consistently fulfilled.

• Verification is confirmation by examination of objective evidence
  that specified requirements have been fulfilled.

• Demonstration is a minimum evaluation for a laboratory to use
  to show that it is able to obtain expected results by following the
  manufacturer's instructions. This is appropriate for test systems
  whose performance characteristics have been well studied and
     Method Evaluation and Validation
• Main purpose is error assessment.
• The laboratory must demonstrate, prior to reporting patient
  test results, that it can obtain performance
  specifications for accuracy, precision, and
  reportable range of patient test results
  comparable to those established by the
  manufacturer.
• The laboratory must also verify that the
  manufacturer's reference range is appropriate
  for the laboratory's population.
An Overview of Qualitative Terms and Quantitative Measures Related
                     to Method Performance
Qualitative Concept                    Quantitative Measure

Trueness                               Bias
Closeness of agreement of mean value   A measure of the systematic error
with “true value”

Precision                              Imprecision (sd)
Repeatability (within run)             A measure of the dispersion of random
Intermediate precision (long term)     errors
Reproducibility (interlaboratory)

Accuracy                               Error of measurement
Closeness of agreement of a single     Comprises both random and systematic
measurement with “true value”          influences
Total Analytical error TEA.

   TEA = RE + SE



Constant and proportional errors.
                Analytical Sensitivity
• Several terms describe the different aspects of the
  minimum analytical sensitivity of a method.
• Limit of absence (LoA) is the lowest concentration of
  analyte that the method can differentiate from zero.
• Limit of detection (LoD) is the minimum concentration
  of analyte whose presence can be qualitatively
  detected under defined conditions.
• Functional sensitivity or limit of quantification (LoQ)
  is the minimum concentration of analyte whose presence
  can be quantitatively measured reliably under defined
  conditions. It is the concentration at which the CV = 20%.
Illustration of different aspects of analytical sensitivity
or detection limits.
     Random Analytical Error (RE)
Factors contributing to random analytical error (RE) are those
  that affect the reproducibility of measurement. These include:
• instability of the instrument,
• variations in the temperature,
• variations in the reagents and calibrators (and calibration-
  curve stability),
• variability in handling techniques such as pipetting, mixing,
  and timing, and
• variability in operators.
These factors superimpose their effects on each other at
  different times. Some cause rapid fluctuations, and others
  occur over a longer time. Thus RE has different components
  of variation that are related to the actual laboratory setting.
    Random Analytical Error (RE)

• Within-run component of variation (σ_wr)

• Within-day, between-run variation (σ_br)

• Between-day component of variation (σ_bd)
  Within-run component of variation (σ_wr)

is caused by specific steps in the procedure:

1. sampling

2. pipetting precision

3. short-term variations in temperature, and

4. stability of the instrument.
 Within-day, between-run variation (σ_br)

is caused by:
1. instability of calibration curve
2. differences in recalibration that occur
   throughout the day,
3. longer term variations in the instrument,
4. small changes in the condition of the
   calibrator and reagents,
5. changes in the condition of the laboratory
   during the day, and
6. fatigue of the laboratory staff.
Between-day component of variation (σ_bd)
is caused by:
1. daily variations in the instrument,
2. changes in calibrators and reagents
   (especially if new vials are opened each day),
3. changes in staff from day to day.
4. Although not a true random component of
   variation, any drift in the stability of the
   calibration curve over time greatly affects the
   bd as well.
Total Variance of a Method (σ_t²)

 σ_t² = σ_wr² + σ_br² + σ_bd²

         RE = σ_t
       Familiarization with the method

• It is essential that operators of the method become
  thoroughly familiar with the details of the method and
  instrument operation before the collection of any data
  that will be used to characterize the method's
  performance.
• Familiarization may include training by the manufacturer.
• It should be of sufficient duration that, at its completion,
  the operators can perform all aspects of the method or
  instrument operation comfortably.
Experiments for Estimating Analytical Errors
The importance of daily examination and plotting of
 comparison-of-method data cannot be overemphasized,
 and the data must be carefully examined for outliers.

Definition of an outlier from a regression line:

               | y_i – Y_i | > 4 • s_y,x

where y_i is the observed result, Y_i is the value predicted by the
regression line, and s_y,x is the standard error of the estimate.
Outlier specimens must be detected immediately and
reanalyzed by both methods so that the repeat results can correct
or confirm the outlier.
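
The outlier rule above can be sketched in Python as follows. This is an illustrative implementation under stated assumptions, not the author's procedure: the function names are hypothetical, the line is fitted by ordinary least squares with the suspect points included, and the 40 data points are simulated.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares line; returns slope, intercept and s_y,x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(rss / (n - 2))  # standard error of the estimate
    return slope, intercept, s_yx

def find_outliers(x, y, k=4.0):
    """Indices of points whose residual from the line exceeds k * s_y,x."""
    slope, intercept, s_yx = fit_line(x, y)
    return [i for i, (xi, yi) in enumerate(zip(x, y))
            if abs(yi - (intercept + slope * xi)) > k * s_yx]

# Simulated data: 40 specimens, comparative method (x) vs test method (y),
# with small alternating scatter and one deliberately discrepant result.
x = [2.0 + 0.3 * i for i in range(40)]
y = [xi + (0.05 if i % 2 == 0 else -0.05) for i, xi in enumerate(x)]
y[20] += 1.0  # simulated discrepant specimen

print(find_outliers(x, y))
```

Because the suspect point contributes to s_y,x itself, no residual can exceed sqrt(N - 2) times s_y,x, so with this fitting choice the 4 • s_y,x rule can only trigger when more than 18 data points are available.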
An example evaluation study: Cholesterol in serum.

 Step 1: Analytical needs
 Rapid procedure with a turnaround time of  30 min
   suitable for lipid clinic requirement. Short turnaround
   time means that patients do not have to come back for
   treatment based on lipid-profile results.
 A sample volume of  200 µL.
 Analytical range of 0 to 20 mmol/L.
 High through-put.
 Analytical goals
 An example evaluation study: Cholesterol in serum.
                                   Analytical Goals

Analyte       Acceptable      Decision level   Allowable    Maximum sd     Medically based
              performance     XC               error        (CLIA 88)      maximum sd
              criteria                         (CLIA 88)    (CV%)          (Fraser) (CV%)
              (CLIA 88)
Albumin       ± 10%           35 g/L           3.5          0.9 (2.6%)     0.5 (1.43%)
Cholesterol   ± 10%           5.2 mmol/L       0.52         0.13 (2.5%)    0.14 (2.7%)
Creatinine    ± 15%           88 µmol/L        26           7.0 (8%)       1.8 (2.0%)
                              265 µmol/L       40           9.7 (3.7%)     6.2 (2.3%)
Glucose       ± 10%           2.75 mmol/L      0.33         0.08 (2.9%)    0.06 (2.2%)
                              6.9 mmol/L       0.69         0.18 (2.6%)    0.15 (2.2%)
                              11.0 mmol/L      1.10                        0.24 (2.2%)
Hb A1C                        7.0%             0.35%        0.14%
K             ± 0.5 mmol/L    3.0 mmol/L       0.50         0.12 (4%)      0.07 (2.33%)
                              6.0 mmol/L       0.50         0.12 (2%)      0.14 (2.33%)
ALP           ± 30%           150 U/L          45           11 (7.3%)      5.1 (3.4%)
CK            ± 30%           200 U/L          60           15 (7.5%)      40 (20%)
An example evaluation study: Cholesterol in serum.

  Step 2: Quality goals
  Medical decision (XC) levels of interest for cholesterol analysis
    are taken as 4.5 mmol/L (levels below this indicate low risk of
    CVD) and 6.0 mmol/L (levels above this indicate high risk and
    should be actively treated with cholesterol-lowering drugs).

  Precision goals for cholesterol are defined to be 0.12 mmol/L at
    4.5 mmol/L and 0.15 mmol/L at 6.0 mmol/L (2.5%).

  Total error goals (TEA) are 0.45 mmol/L at 4.5 mmol/L and 0.60
    mmol/L at 6.0 mmol/L (10%).
Total Analytical Error (TEA) for Cholesterol

 TEA = RE + SE

 10% = 2.5% + 7.5%, i.e. RE = 2.5%, SE = 7.5%, TEA = 10%
An example evaluation study: Cholesterol in serum.

  Step 3: Method selection

  Existing laboratory analyzer Beckman-Coulter LX20

  Cholesterol kit specifically designed for this analyzer.

  Senior operator who is familiar with this particular analyzer
    and is available to do the evaluation.
An example evaluation study: Cholesterol in serum.

  Step 4: Test material selection
     Synchron 1: mean [cholesterol] 2.71 mmol/L,
     Synchron 2: mean [cholesterol] 4.19 mmol/L, and
     Synchron 3: mean [cholesterol] 5.82 mmol/L.

  Pooled patient serum two levels A and B – matrix closest to
    real patient serum.

  20 Patient serum samples to be run in parallel with existing
    laboratory method.
An example evaluation study: Cholesterol in serum.

 Step 5: Within-run imprecision
 Performed by analyzing 6 aliquots of Synchron 1, 2, and
 3 and Pool A and B within a run.

                Mean (mmol/L)   sd (mmol/L)       RE %

   Synchron 1       2.69           0.028          1.04
   Synchron 2       4.21           0.042          1.00
   Synchron 3       5.80           0.073          1.26
   Pool A           4.89           0.057          1.17
   Pool B           6.54           0.109          1.67
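
The RE % column is the coefficient of variation, 100 x sd / mean. The short Python sketch below recomputes it from the means and sds in the table above; the dictionary and function names are illustrative only.

```python
# Mean and sd (mmol/L) for each material, as listed in the table above.
within_run = {
    "Synchron 1": (2.69, 0.028),
    "Synchron 2": (4.21, 0.042),
    "Synchron 3": (5.80, 0.073),
    "Pool A": (4.89, 0.057),
    "Pool B": (6.54, 0.109),
}

def cv_percent(mean, sd):
    """Coefficient of variation in percent (reported as RE % in the table)."""
    return 100.0 * sd / mean

for name, (mean, sd) in within_run.items():
    print(f"{name}: RE = {cv_percent(mean, sd):.2f} %")
```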
An example evaluation study: Cholesterol in serum.

 Step 5a: Within-run imprecision
 Testing for acceptable performance
                      RE against Maximum allowable CV%

    CLIA 88: 2.5% > Synchron 1: 1.04% < Fraser: 2.7%
    CLIA 88: 2.5% > Synchron 2: 1.00% < Fraser: 2.7%
    CLIA 88: 2.5% > Synchron 3: 1.26% < Fraser: 2.7%
    CLIA 88: 2.5% > Pool A: 1.17% < Fraser: 2.7%
    CLIA 88: 2.5% > Pool B: 1.67% < Fraser: 2.7%

 Proceed with step 5b.
An example evaluation study: Cholesterol in serum
Step 5b: Within-run imprecision
Testing for acceptable performance
                    RE against TEA

             If 4 x RE > TEA reject method

             If 4 x RE < TEA proceed with step 6

With the TEA = 10% for cholesterol, the within-run imprecision
of synchron 1, 2, 3 and pool A and B each passes the test.

Proceed to step 6.
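
The two acceptance tests of steps 5a and 5b can be sketched together in Python. The limits (2.5%, 2.7%, 10%) and the RE values are those quoted in the slides; the function and variable names are hypothetical.

```python
MAX_CV_CLIA = 2.5    # % maximum allowable CV for cholesterol, CLIA 88
MAX_CV_FRASER = 2.7  # % medically based maximum CV, Fraser
TEA = 10.0           # % allowable total error for cholesterol

# Within-run RE (%) from step 5.
within_run_re = {
    "Synchron 1": 1.04,
    "Synchron 2": 1.00,
    "Synchron 3": 1.26,
    "Pool A": 1.17,
    "Pool B": 1.67,
}

def passes_step_5a(re_pct):
    """RE must be below both maximum allowable CVs."""
    return re_pct < MAX_CV_CLIA and re_pct < MAX_CV_FRASER

def passes_step_5b(re_pct, k=4):
    """Reject the method if k x RE exceeds TEA."""
    return k * re_pct < TEA

for name, re_pct in within_run_re.items():
    verdict = "pass" if passes_step_5a(re_pct) and passes_step_5b(re_pct) else "reject"
    print(f"{name}: RE = {re_pct:.2f} %, 4 x RE = {4 * re_pct:.2f} % -> {verdict}")
```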
An example evaluation study: Cholesterol in serum.

Step 6: Between-run (day-to-day) precision
Performed by analyzing aliquots of pool A and B for 20 days

            Mean       sd         RE %          4 x RE %
          (mmol/L)   (mmol/L)

Pool A      4.93      0.098     1.99 < 2.5     7.96 < 10

Pool B      6.49      0.135     2.08 < 2.5     8.32 < 10
An example evaluation study: Cholesterol in serum
 Step 7: SD has confidence intervals
 Factors for computing one-sided confidence intervals
   for standard deviation.

       Degrees of freedom
            (N – 1)           A0.05         A0.95

              1             0.5103        15.947
              5             0.6721         2.089
             10             0.7391         1.593
             15             0.7747         1.437
             20             0.7979         1.358
 An example evaluation study: Cholesterol in serum
 Step 7: Confidence-interval estimate of random error, REU
        and REL; N = 20

          Mean       sd        sdU =        sdL =        REU =      REL =
        (mmol/L)   (mmol/L)   sd x A0.95   sd x A0.05   4 x sdU    4 x sdL
Pool A    4.93      0.098      0.133        0.078        0.532      0.312
Pool B    6.49      0.135      0.183        0.108        0.732      0.432

       REU pool A (0.532) > 0.493 and REU pool B (0.732) > 0.649, where
       0.493 and 0.649 mmol/L are 10% (the TEA) of the respective pool means.
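
A minimal Python sketch of this calculation follows. It uses the tabulated factors for 20 degrees of freedom, which reproduce the slide's numbers, and rounds the sd limits to three decimals before multiplying by 4, as the slide appears to do. Comparing REU with 10% of the pool mean is an interpretation of the 0.493 and 0.649 figures; the names are illustrative.

```python
A_LOWER = 0.7979  # A0.05 for 20 degrees of freedom (from the table of factors)
A_UPPER = 1.358   # A0.95 for 20 degrees of freedom
TEA_FRACTION = 0.10  # allowable total error for cholesterol, 10%

# Between-run mean and sd (mmol/L) from step 6.
pools = {"Pool A": (4.93, 0.098), "Pool B": (6.49, 0.135)}

def re_limits(sd):
    """Return (REU, REL): 4 x the upper and lower confidence limits of sd."""
    sd_upper = round(sd * A_UPPER, 3)
    sd_lower = round(sd * A_LOWER, 3)
    return 4 * sd_upper, 4 * sd_lower

for name, (mean, sd) in pools.items():
    re_u, re_l = re_limits(sd)
    tea = TEA_FRACTION * mean
    relation = ">" if re_u > tea else "<="
    print(f"{name}: REL = {re_l:.3f}, REU = {re_u:.3f} {relation} TEA = {tea:.3f} mmol/L")
```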
 An example evaluation study: Cholesterol in serum
Step 8: Validation of linearity or reportable range
Obtained pool C by combining all serum samples with
  [cholesterol] > 15 mmol/L.
Prepared the following samples:

   Sample 1         Specially prepared with [cholesterol]  0

   Sample 2         3 parts sample 1 + 1 part pool A
   Sample 3         Pool A
   Sample 4         Pool B

   Sample 5         2 parts sample 1 + 2 parts pool C

   Sample 6         Pool C
An example evaluation study: Cholesterol in serum
Step 8: Validation of linearity or reportable range
Pools analyzed by the Abell-Kendall method (reference method)


   Pool A                        4.88

   Pool B                        6.52

   Pool C                        16.7
An example evaluation study: Cholesterol in serum
Step 8: Validation of linearity or reportable range
Samples 1, 2, 3, 4, 5, and 6 were analyzed in triplicate in a single run in
  random order.

                 Theoretical (X)         Mean (Y)          Bias, mmol/L (%)

Sample 1                 0                 0.035            +0.035 (N/A)

Sample 2               1.22                1.967            -0.024 (-2.0)

Sample 3               4.88                4.846            -0.034 (-0.7)

Sample 4               6.52                 6.47            -0.05 (-0.77)

Sample 5               8.09                 7.99             -0.1 (-1.24)

Sample 6               16.7                16.35             -0.35 (-2.1)
[Figure: Reportable range of serum [cholesterol]. Method (Y) vs
 theoretical (X) [cholesterol], mmol/L; Y = 0.9565 X + 0.3125, R = 0.9989]
An example evaluation study: Cholesterol in serum

 Step 9: Estimation of SE from the linearity study which is a
   comparison of the method against reference method.
   The following statistics were obtained by linear
   regression analysis:

   Y = 0.956 X + 0.313 mmol/L;  SY,X = 0.294

   Mean X = 6.235;  Mean Y = 6.276

 Bias = | mean Y – mean X| = 0.041 mmol/L

 This is the estimate of SE at the mean of the data.
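
The regression statistics can be recomputed with a short Python sketch, taking the theoretical (X) and mean (Y) values exactly as printed in the step 8 linearity table. Ordinary least squares is assumed here (the slides do not name the regression technique), and the variable names are illustrative.

```python
import math

# Theoretical (X) and measured mean (Y) values, mmol/L, as printed in step 8.
theoretical = [0.0, 1.22, 4.88, 6.52, 8.09, 16.7]
measured = [0.035, 1.967, 4.846, 6.47, 7.99, 16.35]

n = len(theoretical)
mean_x = sum(theoretical) / n
mean_y = sum(measured) / n

sxx = sum((x - mean_x) ** 2 for x in theoretical)
syy = sum((y - mean_y) ** 2 for y in measured)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(theoretical, measured))

slope = sxy / sxx
intercept = mean_y - slope * mean_x
r = sxy / math.sqrt(sxx * syy)

# Standard error of the estimate (SY,X) with n - 2 degrees of freedom.
rss = sum((y - (intercept + slope * x)) ** 2
          for x, y in zip(theoretical, measured))
s_yx = math.sqrt(rss / (n - 2))

# Systematic error at the mean of the data.
bias = abs(mean_y - mean_x)

print(f"Y = {slope:.4f} X + {intercept:.4f}, R = {r:.4f}, SY,X = {s_yx:.3f}")
print(f"Mean X = {mean_x:.3f}, Mean Y = {mean_y:.3f}, bias = {bias:.3f} mmol/L")
```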
An example evaluation study: Cholesterol in serum

Step 9: Point estimate of SE at medical decision levels (XC).
For XC = 4.5 mmol/L, YC = 4.615 mmol/L
      SE1 = | YC – XC | = 0.115 mmol/L
  Because SE1 < TEA = 0.45 mmol/L,
  SE1 is acceptable.

For XC = 6.0 mmol/L, YC = 6.049 mmol/L
      SE2 = | YC – XC | = 0.049 mmol/L
Because SE2 < TEA = 0.6 mmol/L,
SE2 is acceptable
An example evaluation study: Cholesterol in serum
Step 10: Point estimate of TE
Criteria for acceptable performance:
      TEA > TE = 3 x sd + | YC – XC |
For XC1 = 4.5 mmol/L, YC1 = 4.615 mmol/L and sd = 0.098
TE1 = 3 x 0.098 + 0.115 = 0.409 mmol/L < 0.45 mmol/L
                   Performance acceptable

For XC2 = 6.0 mmol/L, YC2 = 6.049 mmol/L and sd = 0.135
TE2 = 3 x 0.135 + 0.049 = 0.454 mmol/L < 0.6 mmol/L
                   Performance acceptable
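
The point estimates of SE (step 9) and TE (step 10) can be reproduced with the sketch below, using the rounded regression line Y = 0.956 X + 0.313 and the between-run sds from step 6. The function names are hypothetical.

```python
SLOPE = 0.956
INTERCEPT = 0.313  # mmol/L

# (decision level XC, between-run sd, allowable total error TEA), all in mmol/L
decision_levels = [
    (4.5, 0.098, 0.45),
    (6.0, 0.135, 0.60),
]

def systematic_error(xc):
    """SE = |YC - XC|, with YC predicted from the regression line."""
    yc = SLOPE * xc + INTERCEPT
    return abs(yc - xc)

def total_error(xc, sd, k=3):
    """TE = k x sd + |YC - XC| (k = 3 in step 10)."""
    return k * sd + systematic_error(xc)

for xc, sd, tea in decision_levels:
    se = systematic_error(xc)
    te = total_error(xc, sd)
    verdict = "acceptable" if te < tea else "not acceptable"
    print(f"XC = {xc}: SE = {se:.3f}, TE = {te:.3f}, TEA = {tea:.2f} -> {verdict}")
```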
An example evaluation study: Cholesterol in serum
Step 11: Medical decision chart

                                  XC1     XC2

  Level mmol/L                     4.5     6.0
  TEA mmol/L                      0.45    0.60
  SE mmol/L                       0.115   0.049
  RE mmol/L                       0.098   0.135
  RE as % of TEA                  21.8    22.5
  SE as % of TEA                  25.6     8.2
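
The two percentage rows in the table above can be recomputed as follows. This is a small illustrative Python sketch; the data structure and function name are assumptions, and the input values are those listed in the table.

```python
# Values in mmol/L from the table above.
decision_levels = {
    "XC1": {"level": 4.5, "tea": 0.45, "se": 0.115, "re": 0.098},
    "XC2": {"level": 6.0, "tea": 0.60, "se": 0.049, "re": 0.135},
}

def percent_of_tea(value, tea):
    """Express an error component as a percentage of the allowable total error."""
    return 100.0 * value / tea

for name, d in decision_levels.items():
    re_pct = percent_of_tea(d["re"], d["tea"])
    se_pct = percent_of_tea(d["se"], d["tea"])
    print(f"{name}: RE = {re_pct:.1f} % of TEA, SE = {se_pct:.1f} % of TEA")
```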
[Figure: Medical decision chart showing the operating points for XC1 and XC2]


              Use of method decision chart.
A method with:

1. Unacceptable performance does not meet the requirement for
   quality, even when the method is working properly. Not acceptable
   for routine operation.

2. Marginal performance provides the desired quality when everything
    is working correctly. But, difficult to manage in routine operation,
    requires total QC strategy, well-trained operators, aggressive
    preventive maintenance, etc.

3. Good performance meets requirement for quality and can be well-
    managed in routine service. Requires multirule procedure with 4-6
    control measurements per run.

4. Six sigma or excellent performance is clearly acceptable and easy
    to manage in routine service and can be controlled
A comparison of methods experiment is performed to estimate
   inaccuracy or systematic error.

This is performed by analyzing patient samples by the new method
  (test method) and a comparative method, then estimating the
  systematic errors (SE) on the basis of the differences observed
  between the methods.

The systematic differences at the critical medical decision
  concentrations are the errors of interest.

When possible, a “reference method” should be chosen for the
 comparative method.

Any differences between a test method and a reference method
  are assigned to the test method.
[Figure: Cholesterol methods comparison plot, N = 20. Test method (mmol/L)
 vs comparative method (mmol/L); y = 1.0032x - 0.0233, R = 0.999]
[Figure: Bland-Altman difference plot. % difference vs [cholesterol], mmol/L]
   Interpretation of comparison of
           methods study.

The differences are relatively small, not more than
  2.2% across the concentration range of 2.0 –
  15.0 mmol/L.

The two methods have the same relative accuracy.

One method can be substituted for the other.
 Recommended Minimum Studies for comparison
          of methods experiment.

1. Select 40 patient specimens to cover the full working
   range of the method.
2. Analyze 8 specimens a day within 2 hours by the test
   and comparative methods.
3. Graph results immediately on a difference plot and
   inspect for discrepancies.
4. Reanalyze specimens that give discrepant results.
5. Continue the experiment for 5 days if no discrepant
   results are observed.
 6. Continue for another 5 days if discrepancies are
    observed during the first 5 days.

 7. Prepare a comparison plot of all the data to assess the
    range, outliers, and linearity.

 8. Calculate the correlation coefficient and, if it is 0.99 or
    greater, calculate simple linear regression statistics and
    estimate the systematic error at the medical decision
    levels.

 9. Use the medical decision chart to combine the estimates
    of SE and RE and make judgment on the total error
    observed for the method.
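
As an illustration of step 8, the sketch below applies the regression line reported on the cholesterol comparison plot (y = 1.0032x - 0.0233, R = 0.999) to estimate the systematic error at the two medical decision levels used in the worked example. The function name and the way the 0.99 threshold is checked are assumptions made for illustration.

```python
R = 0.999
SLOPE = 1.0032
INTERCEPT = -0.0233  # mmol/L
DECISION_LEVELS = [4.5, 6.0]  # mmol/L

def systematic_error(xc, slope=SLOPE, intercept=INTERCEPT):
    """SE at decision level xc: |predicted test result - comparative result|."""
    return abs(slope * xc + intercept - xc)

if R >= 0.99:
    # Range of data is wide enough for simple linear regression estimates.
    for xc in DECISION_LEVELS:
        se = systematic_error(xc)
        print(f"XC = {xc} mmol/L: SE = {se:.4f} mmol/L ({100 * se / xc:.2f} %)")
else:
    print("Correlation below 0.99: simple linear regression not recommended.")
```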
   NATURE, 18 September 2003
• Monkeys reject unequal pay.
   - Sarah Brosnan and Frans de Waal
• Working for peanuts.
  - Paul Smaglik