					         The Western Australian Centre for Evidence Based Nursing and Midwifery
                  A Collaborating Centre of the Joanna Briggs Institute



By:   Robin Watts
      Jeanette Robertson
      Vivien Hewitt
      Gaby Haddow
      with review panel members

Neonatal Hypoglycaemia: the diagnostic accuracy of point-of-care testing methods

Whilst undertaking a systematic review of the midwifery/nursing management of hypoglycaemia in
healthy full term neonates (see protocol Watts, Robertson, & Haddow, 2002 for details), the reliability
of point-of-care testing methods was identified as an issue (World Health Organisation, 1997). Given
the widespread use of screening for hypoglycaemia in both pre and full term infants using point-of-care
technology, investigation of the diagnostic accuracy of this type of testing is warranted as a further
systematic review.

Portable analysers provide a ‘local’ (at the point-of-care), rapid and often lower-cost alternative to
sending samples to a pathology laboratory for analysis. A variety of different types of analysers, using
either spectrophotometric or electrochemical methods, are available for blood glucose
determination and screening in the neonate. Many of these analysers were initially developed for adult
use and, although sufficiently accurate for purposes such as blood glucose estimation in adult diabetics,
their reliability when testing neonates is yet to be adequately proved, particularly for very low values of
blood glucose.

Based on studies and related literature available up to 1995, the World Health Organisation Guidelines
(1997) do not recommend screening, defined as the scheduled measurement of blood glucose, of
healthy full term babies or large for gestational age (LGA) infants, except those born to diabetic
mothers. Of the two reasons given for adopting this position, one is the lack of reliable point-of-care
methodology (the other is that no diagnostic blood glucose concentration can be set). Hawdon (2000)
supports this position, stating that there are no indications for routine blood glucose monitoring in
healthy full term neonates. However there is a need in other groups of neonates where “single or serial
measurements of circulating blood glucose levels are essential to guide management” (Hawdon,
2000:1). Among these risk groups she included moderately preterm or growth retarded babies, any
baby presenting with an acute illness, and those cared for in neonatal units with known co-existing
clinical complications. The accuracy of these measurements is vital to detect and manage problems
with blood glucose homeostasis as quickly as possible.

Despite the WHO’s recommendation, a survey of Western Australian hospitals with maternity units
indicated that screening of full term babies is commonplace in nurseries. Possible reasons for this are
the availability of screening technology; hospital or medical protocols adopted for risk management,
i.e. to protect against litigation if the child suffers any future health problems resulting from
hypoglycaemia; or lack of knowledge as to the effectiveness of screening in this group and/or the
accuracy of point-of-care testing technology. A decade ago Reynolds and Davies (1993) were of the
view that cotside
measuring of blood glucose levels had become routine and that the convenience of the measurement
had prevailed over other considerations.

Although screening of risk groups is based on evidence of effectiveness of rapid treatment to prevent
serious sequelae, laboratory testing is required for confirmation of diagnosis. However results from
laboratory tests can take up to an hour to become available. Consequently point-of-care testing is used
with these at-risk infants for rapid results and treatment if required.

Thomas et al. (2000) indicated that the usual management of point-of-care testing in neonates is to
confirm cot-side glucose readings of less than 2.6 mmol/L with laboratory analysis. If, in their unit, the
point-of-care analyser records a result less than 2.0 mmol/L, a glucose infusion is commenced (this
intervention level varies with neonatal unit protocols) while the result of the laboratory test is awaited.
If the point-of-care results are confirmed, a crucial hour of treatment has not been lost. If however
satisfactory glucose levels are reported by the laboratory test, the infant has undergone unnecessary
invasive procedures, been exposed to possible infection, pain and blood loss, plus a possible negative
impact on bonding and delay in initiating breast feeding. Overestimation of blood glucose levels by
point-of-care tests is equally concerning as this could well result in a failure to treat when in fact the
infant is hypoglycaemic.

The objective of this review is to determine from the available evidence the diagnostic accuracy of
point-of-care technology commonly used in estimating blood glucose levels in newborn infants.
The specific question being asked is –

Are the results from studies of point-of-care testing by various instrumentation of a sufficient level of
diagnostic accuracy to be used in point-of-care screening for hypoglycaemia in newborns?

[Bruns defines ‘diagnostic accuracy’ as: “the ability of a test to identify a condition of interest. In this
context ‘accuracy’ refers to the amount of agreement between the studied test(s) and the reference
standard” (2003: 19)]

Criteria for considering studies for this review
Types of studies
The review will only include prospective cross sectional/comparative (correlational) studies from 1995
–2004 (1995 has been selected as the start point as studies from this year on were not included in the
WHO review).

Exclusions: case control studies (Lijmer, et al, 1999)

Types of participants
Studies that include live born infants of any gestation in any hospital setting, including those admitted
to newborn and special care nurseries, and neonatal intensive care units.

Exclusions: studies of infants in settings other than hospitals e.g. home, medical clinics.

Types of index tests
All point-of-care diagnostic tests for hypoglycaemia in neonates using any form of minimally invasive
technology. (Examples of ‘minimally invasive’ are heel prick and blood drawn from a line already in situ.)

Sirkin et al. explained glucose detection and measurement in the following way:
        “When blood is placed on a test strip, a chemical reaction occurs. A measurement of these
        changes is interpreted as a glucose value. Systems measure either voltage changes, as is the case
        with electrochemical biosensor technology, or colour changes measured by reflectance or
        absorption photometry.” (2002:105)

For inclusion in this systematic review studies must have compared the index test against a recognised
reference standard e.g. the hexokinase method (Schlebusch et al, 1998).

Types of outcome measures

The outcome of interest is the diagnostic accuracy of the test. Accuracy can be expressed through
correlation of the index test with the reference standard, sensitivity and specificity, predictive values,
diagnostic likelihood ratios and the area under a receiver operator characteristic (ROC) curve (Bossuyt,
et al., 2003:2). Each measure of accuracy should be used in combination with its complementary
measure. The different measures of accuracy are explained in Appendix I.

In addition the feasibility of the test (e.g. cost, ease of use, training required) will be taken into account.

Search strategy
A number of electronic databases will be searched to locate relevant studies in this subject area.
Databases to be searched will include:
           •   CINAHL (1982 to most recent week 2 2004)
           •   Cochrane Library
                    § Cochrane Database of Systematic Reviews
                    § Database of Abstracts of Reviews of Effects
                    § Cochrane Central Register of Controlled Trials (CENTRAL)
           •   Centre for Reviews and Dissemination databases
                    § NHS Economic Evaluation Database
                    § Health Technology Assessment Database
           •   TRIP database
           •   MEDLINE/PubMed (1996 to March Week 2 2004)
           •   Australian Medical Index
           •   Current Contents
           •   ProQuest 5000
           •   Science Direct
           •   Ingenta
           •   InfoTrac

Individual search strategies will be developed for each database, adopting the different terminology of
index thesauri if available. As it has been established that difficulties can arise when using the MeSH
terms to locate studies (Dickersin, Scherer, & Lefebvre, 1994), the search terms used to locate studies
for the review will be drawn from the natural language terms of the topic as well as the controlled
language indexing terms used by different databases, as applicable. It has been noted that in a selected
set of Medline journals covering publications between 1992 and 1995 the use of the MeSH heading
‘Sensitivity and Specificity’ identified only 51% of all studies on diagnostic accuracy (Bossuyt, et al., 2003).
Example of Medline search:

#1    *Blood Glucose/an [analysis]
#2    exp *Hypoglycemia/di [Diagnosis]
#3    (hypoglycaem$ or hypoglycem$).mp.
#4    2 or 3
#5    1 and 4
#6     limit 5 to human
#7     limit 6 to newborn infant <birth to 1 month>
#8     limit 7 to yr=1995-2004

If a search of a database that uses MeSH indexing requires defining further because of the general
nature of articles located, the search will be narrowed using MeSH terms such as:
   •   sensitivity AND specificity
   •   diagnostic
   •   ROC
   •   diagnosis, differential [NHMRC, 2000].
Searches will also be conducted to locate relevant unpublished materials, such as conference papers,
research reports, and dissertations. The sources searched to locate unpublished studies will include:
   •   Dissertation Abstracts
   •   Index to Theses
   •   conference proceedings
   •   research and clinical trials registers
   •   WWW sites of relevant associations
   •   direct communication with neonatal and midwifery organisations, and neonatal nurse or
       midwife researchers.
Journals relevant to the topic and accessible in Western Australian libraries or online will be hand-
searched to ensure useful studies that have not been listed in the major indexing services are located.
For highly relevant journals hand-searches will be conducted for all available issues for 2001-2004.
Reference lists of all studies and review papers will be examined to identify additional research studies.

Full text versions of the studies located will be used for the initial assessment against the inclusion and
exclusion criteria. Where abstracts in English exist for articles in Spanish, Italian or German, these will
be reviewed (translation resources only being available in these languages). Two reviewers will
conduct this assessment independently. Bibliographic details of the studies located will be organised
using the Endnote software program.

Methods of the review
All studies that meet the inclusion criteria (see Appendix II) will be assessed for methodological
quality using a checklist and accompanying notes to guide the reviewers (Appendix III). This checklist
is based on the work of Reid, Lachs and Feinstein (1995), NHMRC (2000) and the Evidence Based
Medicine Working Group (2003). Based on the results obtained by Juni, Witschi, Bloch and Egger
(1999) a scored scale will not be used. Rather the quality assessment will focus on key components of
the research design. In respect to studies of diagnostic tests, Lijmer, Mol, Heisterkamp, Bonsel, Prins,
van der Meulen, and Bossuyt (1999) identified these key components as cohort not case-control
studies, description of study population, and description and application of reference test. Reviewers
will be asked to pilot the checklist and suggested modifications will be made to ensure clarity. The
discussion arising from the piloting of the checklist contributes to the reviewers’ training.

Two reviewers will independently assess all studies included in the review and any disagreements
between reviewers will be resolved by discussion with a third reviewer.

Data extraction
Two reviewers will extract data independently, using a tool designed for the purpose. A third reviewer
will be asked to resolve any differences if the initial reviewers cannot reach agreement. The data
extraction tool will be pilot tested before use.

Data synthesis
The method of summarising and synthesising the data will be as outlined by the NHMRC (2000) for
systematic reviews of diagnostic tests.

Using the Meta-Test or AccuROC software, a summary table will be compiled showing the sensitivity,
specificity and confidence intervals for each study. Sensitivity will be plotted against specificity with a
receiver operating characteristic (ROC) curve and statistical significance tested.
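
The summary table described above can be sketched in a few lines of Python. This is an illustration only: the protocol names Meta-Test or AccuROC as the software, and the Wilson score interval used here is an assumption, since the protocol does not state which interval method will be applied. The study names and counts are hypothetical.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

def summary_rows(studies):
    """Build one row per study: sensitivity, specificity and their intervals."""
    rows = []
    for name, (tp, fp, fn, tn) in studies.items():
        sens = tp / (tp + fn)
        spec = tn / (fp + tn)
        rows.append((name, sens, wilson_interval(tp, tp + fn),
                     spec, wilson_interval(tn, fp + tn)))
    return rows

# Hypothetical (TP, FP, FN, TN) counts, not taken from any real study.
studies = {"Study A": (18, 12, 2, 68), "Study B": (9, 20, 6, 65)}

for name, sens, sens_ci, spec, spec_ci in summary_rows(studies):
    print(f"{name}: sensitivity {sens:.2f} ({sens_ci[0]:.2f}-{sens_ci[1]:.2f}), "
          f"specificity {spec:.2f} ({spec_ci[0]:.2f}-{spec_ci[1]:.2f})")
```

Small studies give wide intervals, which is one reason the intervals are reported alongside each point estimate.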

Data will be combined to produce summary receiver operating characteristic (SROC) curves and
likelihood ratios generated.

An ROC curve is a “graph of the sensitivity (true-positive rate) on the vertical axis against the
false-positive rate (1 - specificity) on the horizontal axis. One overall measure of the test’s accuracy is the
space under the ROC curve, where a value of 0.5 is obtained if the test does no better than chance and a
value of 1 is obtained if the test is perfect” (Irwig et al., 1994:669).
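
As a rough illustration of the definition just quoted, the sketch below builds ROC points from paired readings and estimates the area under the curve with the trapezoidal rule. It assumes that an index-test reading at or below the cut-off counts as a positive (hypoglycaemic) result; the readings and reference classifications are hypothetical.

```python
def roc_points(readings, has_condition):
    """Return (false-positive rate, sensitivity) pairs, one per cut-off.

    readings: index-test glucose values (mmol/L); a reading at or below the
    cut-off is treated as a positive result (an assumption of this sketch).
    has_condition: True where the reference standard found hypoglycaemia.
    """
    n_pos = sum(has_condition)
    n_neg = len(has_condition) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need infants both with and without the condition")
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(readings)):
        tp = sum(1 for r, c in zip(readings, has_condition) if c and r <= cutoff)
        fp = sum(1 for r, c in zip(readings, has_condition) if not c and r <= cutoff)
        points.append((fp / n_neg, tp / n_pos))
    return points

def area_under_curve(points):
    """Trapezoidal area under the ROC points."""
    pts = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

# Hypothetical paired data, not taken from any study.
readings = [1.4, 1.8, 2.1, 2.4, 2.5, 2.9, 3.2, 3.6, 4.0, 4.4]
has_condition = [True, True, True, False, True,
                 False, False, False, False, False]

auc = area_under_curve(roc_points(readings, has_condition))
print(f"Area under the ROC curve: {auc:.3f}")
```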

A likelihood ratio is “the ratio of the probability of a particular test result in people with disease to the
probability of the same test result in people without diseases” (Irwig et al., 1994).
If tests can be compared i.e. those studies that do both tests, the specificities and sensitivities will be
plotted using different symbols against the ‘common’ SROC (Loy, Irwig, Katelaris, & Talley, 1996).
To assess heterogeneity i.e. whether the test performance characteristics vary by study quality or
population and test characteristics (Moons, van Es, Deckers, Habbema, & Grobbee, 1997, NHMRC
2000), the data for subgroups (full term and preterm infants) defined by each important criterion for
study quality will be plotted and how they fall around the common regression will be examined. To
test for significance each feature will be added individually in the SROC model.
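
One way a common SROC regression can be fitted is sketched below. The protocol does not name the model, so this assumes the Moses-Littenberg approach: for each study, D = logit(TPR) - logit(FPR) is regressed on S = logit(TPR) + logit(FPR), giving D = a + bS. The 0.5 continuity correction, the function names and the study counts are all assumptions made for illustration.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def study_point(tp, fp, fn, tn):
    """Return (S, D) for one study, adding 0.5 to each cell to avoid zeros."""
    tpr = (tp + 0.5) / (tp + fn + 1.0)
    fpr = (fp + 0.5) / (fp + tn + 1.0)
    return logit(tpr) + logit(fpr), logit(tpr) - logit(fpr)

def fit_sroc(tables):
    """Ordinary least squares fit of D = a + b*S across studies."""
    pts = [study_point(*t) for t in tables]
    n = len(pts)
    mean_s = sum(s for s, _ in pts) / n
    mean_d = sum(d for _, d in pts) / n
    sxx = sum((s - mean_s) ** 2 for s, _ in pts)
    if sxx == 0:
        raise ValueError("S does not vary across studies")
    b = sum((s - mean_s) * (d - mean_d) for s, d in pts) / sxx
    a = mean_d - b * mean_s
    return a, b

def sroc_sensitivity(a, b, fpr):
    """Sensitivity on the fitted summary curve at a given false-positive rate."""
    if b == 1:
        raise ValueError("curve is undefined when b equals 1")
    lt = (a + (1 + b) * logit(fpr)) / (1 - b)
    return 1 / (1 + math.exp(-lt))

# Hypothetical (TP, FP, FN, TN) tables, not taken from any real study.
tables = [(18, 12, 2, 68), (9, 20, 6, 65), (45, 5, 5, 45), (48, 15, 2, 35)]
a, b = fit_sroc(tables)
print(f"a = {a:.3f}, b = {b:.3f}")
print(f"Summary sensitivity at FPR 0.10: {sroc_sensitivity(a, b, 0.10):.3f}")
```

Under this assumed model, a study feature such as gestation could be examined by adding it as a further covariate to the same regression.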

The outcome of this review will be presented as a guideline and placed on the Joanna Briggs Institute


References
Bossuyt, P.M. et al. (2003). Toward complete and accurate reporting of studies of diagnostic accuracy:
        The STARD initiative. Clinical Chemistry, 49 (1): 1-6.
Bruns, D.E. (2003). The STARD initiative and the reporting of studies of diagnostic accuracy. Clinical
        Chemistry, 49 (1): 19-20.
Dickersin, K., Scherer, R., & Lefebvre, C. (1994). Identifying relevant studies for systematic reviews.
        BMJ, 309 (6964), 1286-1291.
Evidence Based Medicine Working Group. (2003). Evidence Based Medicine Toolkit. Retrieved
        January 29 2003. URL:
Hawdon, J. M. (2000). Neonatal Metabolic Monitoring. Unpublished: AS131 Radiometer Copenhagen.
Irwig, L., Tosteson, A. N. A., Gatsonis, C., Lau, J., Colditz, G., Chalmers, T. C., et al. (1994).
        Guidelines for meta-analysis evaluating diagnostic tests. Annals of Internal Medicine, 120(8),
Juni, P., Witschi, A., Bloch, R., & Egger, M. (1999). The hazards of scoring the quality of clinical trials
        for meta-analysis. JAMA, The Journal of the American Medical Association, 282(11), 1054-
Lijmer, J. G., Mol, B. W., Heisterkamp, S., Bonsel, G. J., Prins, M. H., van der Meulen, J. H. P., and
        Bossuyt, P.M. (1999). Empirical evidence of design-related bias in studies of diagnostic tests.
        JAMA, The Journal of the American Medical Association, 282(11), 1061-1066.

Loy, C. T., Irwig, L. M., Katelaris, P. H., & Talley, N. J. (1996). Do commercial serological kits for
        Helicobacter pylori infection differ in accuracy? A meta-analysis. American Journal of
        Gastroenterology, 91(6), 1138-1144.
Moons, K. G., van Es, G. A., Deckers, J. W., Habbema, J. D., & Grobbee, D. E. (1997). Limitations of
        sensitivity, specificity, likelihood ratio and Bayes' Theorem in assessing diagnostic
        probabilities: A clinical example. Epidemiology, 8, 12-17.
NHMRC. (2000). How to Review the Evidence: Systematic Identification and Review of the Scientific
        Literature. Canberra: Commonwealth of Australia.
Reid, M. C., Lachs, M. S., & Feinstein, A. R. (1995). Use of methodological standards in diagnostic
        test research: Getting better but still not good. JAMA, The Journal of the American Medical
        Association, 274(8), 645-651.
Reynolds, G.J. & Davies, S. (1993). A clinical audit of cotside blood glucose measurement in the
        detection of neonatal hypoglycaemia. Journal of Paediatric Child Health, 29: 289-291.
Schlebusch, H., Niesen, M., Sorger, M., Paffenholz, I. & Fahnenstich, H. (1998). Blood glucose
        determinations in newborns: Four instruments compared. Pediatric Pathology & Laboratory
        Medicine, 18: 41-48.
Sirkin, A., Jalloh, T. & Lee, L. (2002). Selecting an accurate point-of-care testing system: Clinical and
        technical issues and implications in neonatal blood glucose monitoring. Journal of Specialists in
        Pediatric Nursing, 7(3): 104-112.
Thomas, C.L., Critchley, L. & Davies, M.W. (2000). Determining the best method for first-line
        assessment of neonatal blood glucose levels. Journal of Paediatric Child Health, 36: 343-348.
Watts, R., Robertson, J., & Haddow, G. (2002). Midwifery/Nursing Management of Hypoglycaemia in
        Healthy Full Term Neonates. Unpublished: WA Centre for Evidence Based Nursing and
World Health Organisation. (1997). Hypoglycaemia of the Newborn: Review of the Literature. Geneva:
        World Health Organization.


Appendix I

Sensitivity                                complements       Specificity
Positive predictive value                  complements       Negative predictive value
Positive diagnostic likelihood ratio       complements       Negative diagnostic likelihood ratio

Confidence intervals can be calculated to reflect the statistical significance of each accuracy measure.

                                    Table 1: CALCULATIONS OF ACCURACY

                                                     Reference Test Results
                                                        +                 -
   DIAGNOSTIC TEST RESULT (blood         + (<2.0       TP                 FP
                                         -             FN                 TN

                                  TP = number of true positive specimens
                                  FP = number of false positive specimens
                                  FN = number of false negative specimens
                                  TN = number of true negative specimens

The sensitivity of the test is the probability that it will produce a true positive result when used on a
newborn with hypoglycaemia (as compared to a reference or "gold standard"). After inserting the test
results into a table set up like Table 1, the sensitivity of a test can be determined by calculating:

                                   Sensitivity = TP / (TP + FN)


The specificity of the test is the probability that it will produce a true negative result when used on a
newborn without hypoglycaemia (as determined by a reference or "gold standard"). After inserting the
test results into a table set up like Table 1, the specificity of a test can be determined by calculating:

                                   Specificity = TN / (FP + TN)


The positive predictive value of the test is the probability that a newborn has hypoglycaemia when a
positive test result is observed. In practice, predictive values should only be calculated from cohort
studies or studies that legitimately reflect the number of newborns in that population who have
hypoglycaemia at that time. This is because predictive values are inherently dependent upon the
prevalence of hypoglycaemia. After inserting results into a table set up like Table 1, the positive
predictive value of a test can be determined by calculating:

                                   Positive predictive value = TP / (TP + FP)

The negative predictive value of the test is the probability that a newborn does not have hypoglycaemia
when a negative test result is observed. This measure of accuracy should only be used if prevalence is
available from the data. (See note in positive predictive value definition.) After inserting test results
into a table set up like Table 1, the negative predictive value of a test can be determined by calculating:

                                   Negative predictive value = TN / (FN + TN)
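
The four measures defined above can be sketched in Python as follows. The counts are hypothetical, and the second function is an added illustration (via Bayes' theorem) of why predictive values depend on prevalence while sensitivity and specificity do not.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Sensitivity, specificity and predictive values from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (fn + tn),
    }

def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value implied by a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical counts: 20 of 100 newborns are hypoglycaemic by the reference test.
measures = accuracy_measures(tp=18, fp=12, fn=2, tn=68)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")

# Same test characteristics, lower prevalence: the predictive value falls.
for prevalence in (0.20, 0.05):
    ppv = ppv_at_prevalence(measures["sensitivity"],
                            measures["specificity"], prevalence)
    print(f"prevalence {prevalence:.2f}: PPV {ppv:.3f}")
```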


Diagnostic likelihood ratios (DLR) are not yet commonly reported in peer-reviewed literature or in
marketing information provided by test manufacturers, but they can be a valuable tool for comparing
the accuracy of several tests to the gold standard, and they are not dependent upon the prevalence of
hypoglycaemia.

The positive DLR represents the ratio of the probability that a positive test result will be observed in a
population of newborns with hypoglycaemia to the probability that the same result will be observed
among a population of newborns without hypoglycaemia. After inserting test results into a table set up
like Table 1, the positive DLR of a test can be determined by calculating:

                              Positive DLR = [TP / (TP + FN)] / [FP / (FP + TN)]

Or it can also be expressed as:

                              Positive DLR = sensitivity / (1 - specificity)

Useful tests will, therefore, have larger positive DLRs and less useful tests will have smaller positive
DLRs. For example, a positive diagnostic likelihood ratio equal to 5.0 means that for every 1% of
newborns without hypoglycaemia that test as positive, 5% of the newborns with hypoglycaemia will
test as positive.

The negative DLR represents the ratio of the probability that a negative test result will be observed in a
population of newborns with hypoglycaemia to the probability that the same result will be observed
among a population of newborns without hypoglycaemia. After inserting the test results into a table set
up like Table 1, the negative DLR for a test can be determined by calculating:

                              Negative DLR = [FN / (TP + FN)] / [TN / (FP + TN)]

Or it can also be expressed as:

                              Negative DLR = false negative rate / true negative rate
                                           = (1 - sensitivity) / specificity

Useful tests will, therefore, have negative DLRs close to 0, and less useful tests will have higher
negative DLRs. As an example, a negative diagnostic likelihood ratio equal to 2.5 means that for
every 1% of newborns without hypoglycaemia that test as negative, 2.5% of the newborns with
hypoglycaemia will test as negative.
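
A minimal Python sketch of both likelihood ratios, using the same hypothetical 2x2 counts throughout, is given here. The function name is illustrative only.

```python
def likelihood_ratios(tp, fp, fn, tn):
    """Return (positive DLR, negative DLR) from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (fp + tn)
    if specificity == 1:
        raise ValueError("positive DLR is undefined when specificity is 1")
    if specificity == 0:
        raise ValueError("negative DLR is undefined when specificity is 0")
    positive_dlr = sensitivity / (1 - specificity)
    negative_dlr = (1 - sensitivity) / specificity
    return positive_dlr, negative_dlr

# Hypothetical counts, not taken from any study.
pos_dlr, neg_dlr = likelihood_ratios(tp=18, fp=12, fn=2, tn=68)
print(f"Positive DLR: {pos_dlr:.2f}")
print(f"Negative DLR: {neg_dlr:.2f}")
```

Because both ratios are built only from sensitivity and specificity, changing the proportion of hypoglycaemic newborns in the sample leaves them unchanged.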




Appendix II

                          Criteria                           Satisfied               Comments
Type of study
    •     Cross-Sectional /Comparative/
          Correlational (same blood from child)

Type of Participant
    •     Live born infants of any gestation
          (including those admitted to a special care
          baby unit or similar)

Type of Diagnostic Test
    •     Photometry

              - wipe off
              - non-wipe off

    •     Electrochemistry

    •     Other

Reference standard used

     • Hexokinase

     • Other

Outcome measures
    •     Correlation with reference standard
    •     Specificity and sensitivity
    •     Predictive values
    •     Diagnostic likelihood ratios
    •     ROC curve
    •     Feasibility

Other Comments

______________________________________________ _______________           __________________________________


Appendix III

                      Quality Appraisal Checklist for Diagnostic Tests. Part I

Article Details
                                                                                       Reviewer’s Initials. --------

Title: -------------------------------------------------------------------------------------------------------------

Year: -----------------     Source: -------------------------------------------------------------------------------

Author: -----------------------------------------------------------------------------------------------------------

1.       Is this a case control study?
                                                                             Yes       No        Unclear

                                                                        If YES exclude study from review

2.       Was the same reference standard used to verify both positive and negative test results?

                                                                             Yes       No        Unclear

                                                                        If NO exclude study from review

3.       Is there a description of the test(s)?
                                                                             Yes       No        Unclear

                                                                        If NO exclude study from review

4.       Is information provided about three of the following characteristics of the population being tested?
                      § Sex distribution
                      § Age distribution
                      § Range of symptoms/Disease stage
                      § Eligibility criteria for subjects
                                                                        Yes        No      Unclear

                                                                        If NO exclude study from review

 If the study has met all the above criteria, please continue with the quality appraisal on the next

                      Quality Appraisal Checklist for Diagnostic Tests. Part II

5.       Is the research question clearly defined?                                        Yes         No

6.       Did the sample include an appropriate spectrum of infants e.g. sex               Yes         No           Not clear
         distribution, symptoms?

7.       Were participants in the sample selected consecutively?                          Yes         No           Not clear

8.       Was the same reference standard used as the control when two or                  Yes         No           Not         Not
         more instruments were compared?                                                                           clear       relevant

9.       Was the reference standard used acceptable?                                      Yes         No           Not clear

10.      Were the results of the study tests and reference standard                       Yes         No           Not clear
         assessed independently?

11.      Is there adequate information to determine the precision of                      Yes         No           Not clear
         results? E.g. clear definitions of test results, confidence intervals.

12.      Is there adequate information to reproduce the study?                            Yes         No           Not clear

13.      Are all the participants accounted for and test results reported?                Yes         No           Not clear

14.      Are the findings clinically relevant?                                            Yes         No           Somewhat

Total responses                                                                             Yes          No           Not clear

Decision to Include

More information required


Comments --------------------------------------------------------------------------------------------------------------------







