					BME 1450 Term Paper                                                                                                                   1

             The BBB Locomotor Scale to Assess Motor
              Function in Spinal Cord Transected Rats
                                                              Howard Kim

   Abstract—The Basso Beattie and Bresnahan (BBB) locomotor            [2]. Measuring functionality in these animals requires an assay
scale is one of the most widely used scales to assess motor function   that is sensitive to very small incremental measurements of
recovery in spinal cord injured rats. However, certain deficits        motor ability. The Basso Beattie and Bresnahan (BBB)
exist in the interval nature of the scale, particularly at the lower
                                                                       locomotor rating scale [4] is almost exclusively used in this
end where scores of 2 and 3 are rarely assigned. This study was
performed to investigate how this potential flaw may affect            situation. Indeed, the BBB scale is the most popular method
significance testing. A BBB dataset was simulated for three            for tracking functional outcome in moderate or mild injury
experimental groups: a control and two treatment groups.               models as well. The BBB rating scale is an ordinal listing of
Significance testing was performed using the standard BBB scale        definitions based on improved functional activity. Scores are
and a modified BBB scale that compensates for the scale deficit.       assessed from 0 to 21, with higher scores representing a higher
We conclude that while the modified BBB scale may be more
                                                                       level of motor function.
scientifically valid, it may also lead to a decrease in
statistical power.

   Index Terms—BBB scale, functional outcome, spinal cord                               II. ASSESSING BBB SCORES
injury, transection injury
                                                                         A. Procedure
                                                                         The experimental protocol for assessing BBB scores is well
                        I. INTRODUCTION                                defined [4]. The testing apparatus is an open-field box
                                                                       approximately 90 cm in diameter. The walls surrounding the
SPINAL cord injury (SCI) is a devastating condition that
    generally causes victims to lose some or all sensory and
motor function below the site of injury. There are currently
                                                                       perimeter should be between 7 and 10 cm tall. Rats are placed
                                                                       in the box and observed by two examiners who assess the
over 25 000 people living with SCI in North America, with an           motor function based on movements in hindlimb joints, paw
incidence of 11 000 new injuries per year [1]. These                   placement during stepping, weight support, forelimb-hindlimb
conditions are generally permanent due to the very limited             coordination, etc. The lower half of the BBB scale is
innate regenerative response of the human central nervous              explicitly defined in Table 1.
system. There is currently no clinically available treatment to           The observation takes place for 4 minutes during which the
regenerate or cure SCI. However, there is a large research             left and right sides of the subject are given independent scores.
community addressing this problem with many promising                  The test can be videotaped for later analysis. This is generally
results.                                                               done at several time points (e.g., once per week for 12 weeks)
   In the experimental treatment of spinal cord injury, the most       to generate a trend of functional recovery over time.
relevant endpoint is improved functional outcome. Often, the            B. Sources of Variability
type(s) of behavioral test depends on the injury model that is
used. Moderate or mild injuries can be simulated using                    As with any study involving animals, there will be a great
devices such as the NYU weight drop, calibrated clip, OSU              amount of variability between subjects within any given
impactor, etc., which will lead to a compression-type injury to        treatment. However, it is important to minimize these
the spinal cord [2]. This results in the formation of a central        variations whenever possible [5].
cavity but also leads to many spared axons along the outer rim            One of the largest sources of variation is from the surgical
of the cord. Subsequently, these animals tend to have limited          procedure. It is important that the same surgeon is responsible
to substantial use of their hindlimbs due to the survival of           for every procedure. Special consideration must be taken to
axonal pathways. Many tests are sensitive to this region of            ensure that injury is performed with as much consistency as
functionality including inclined plane, grid walk, and foot print      possible between subjects.
analysis, among others [3].                                               There is also a wide variation in the natural response of
   However, in very severe models of injury such as spinal             individual animals to spinal cord injury. This is less of an
cord transection, there is very little resultant functional ability.   issue with transection injuries because there should be no
This model is used primarily to study regeneration                     sparing of any axonal tracts [6]. Nevertheless, all subjects
mechanisms due to less ambiguity in interpretation of results          should be comparable at the outset in terms of
                                                                       sex, strain, age, and weight, and should be kept in similar

                             TABLE I                                          environmental conditions (food, water, caging, etc.).
                                                                                 A third source of variation comes from the observers and
  Score                        Operational Definition                         their level of experience with BBB testing [7]. There should
                                                                              be at least one experienced observer assessing BBB scores.
 0            No observable hindlimb (HL) movement
 1            Slight movement of one or two joints (ankle, hip or knee) of    Observers should be blinded to the treatment of the animal as
              the HL                                                          well as to the animal’s previous scores. The same observers
  2           Extensive movement of one joint OR extensive movement in        should be used throughout the study.
              one joint and slight movement in one other joint
  3           Extensive movement in two joints                                 C. Special Consideration on the Low Range
  4           Slight movement of all three joints
  5           Slight movement of two joints and extensive movement of            The BBB scale was first developed using moderate and mild
              the third
  6           Extensive movement of two joints and slight movement of the
                                                                              models of rat spinal cord injury [4]. Therefore, the very
              third                                                           bottom scores on the scale were developed with intuitive ideas
  7           Extensive movement of all three joints                          of incremental increase in function. Unfortunately, it has since
  8           Sweeping with no weight support OR plantar placement of the     been demonstrated that there exists a discontinuity in the scale
              paw with no weight support
  9           Plantar placement of the paw with weight support in stance      [8]. Indeed, a large study which included two of the original
              only (i.e., when stationary) OR occasional, frequent, or        inventors of the BBB scale, Beattie and Bresnahan,
              consistent weight supported dorsal stepping and no plantar      acknowledged this deficit and devised a means of compensation
              stepping
  10          Occasional weight supported plantar steps, no forelimb (FL)-    [8]. As illustrated in figure 1, they found evidence that scores
              HL coordination                                                 of 2 and 3 are rarely awarded. Taking a careful look at Table
  11          Frequent to consistent weight supported plantar steps and no    1, this may seem logical. Scores of 2 and 3 correspond to
              FL-HL coordination
  12          Frequent to consistent weight supported plantar steps and
                                                                              being able to extensively move at least one hindlimb joint
              occasional FL-HL coordination                                   (ankle, knee, or hip) while, on the same leg, not being able to
  13          Frequent to consistent weight supported plantar steps and       move another. It is certainly not unreasonable to think that the
              frequent FL-HL coordination
                                                                              three hindlimb joints are closely related in motility.
    Definitions are as follows:
Slight: partial joint movement through less than half of the range of joint      In an attempt to compensate, a post-hoc transformation was
motion.                                                                       suggested where scores of 2-4 are binned together [8]. This
Extensive: movement through more than half of the range of joint motion       transformation leads to a scale that is more interval in nature
Sweeping: rhythmic movement of HL in which all three joints are extended,
then fully flex and extend again                                              [8]. Interval scaling is a prerequisite basis for the parametric
Weight Support: contraction of the extensor muscles of the HL during          statistics often used to analyze BBB data. The purpose of this
plantar placement of the paw, or elevation of hindquarter                     paper is to see how a modified BBB scale for severe injury
Plantar Stepping: the paw is in plantar contact with weight support and
then the HL is advanced forward and plantar contact with weight support is reestablished.
                                                                              will affect statistical significance testing.
Dorsal Stepping: weight is supported through the dorsal surface of the paw
at some point in the step cycle.                                                               III. STATISTICAL MODELING
FL-HL Coordination: for every FL step, an HL step is taken and the HLs alternate.
Occasional: ≤50%                                                                A. Methods
Frequent: 51-94%                                                                 Data sets were simulated for three experimental groups
Consistent: 95-100%
                                                                              representing transection only (control), treatment A, and
    This table is adapted from [4].
                                                                              treatment B. BBB scores were assigned to left and right legs
                                                                              of ten subjects per group. Scores of 2 and 3 were assigned
                                                                              no more than 10% of the time. To yield a single score for a
                                                                              subject, the average of the scores of the left and right leg was
                                                                              used. The means of the groups were compared using one-way
                                                                              analysis of variance (ANOVA).           Post-hoc testing was
                                                                              performed using Bonferroni’s test and Tukey HSD to look for
                                                                              significance between individual groups. Individual t-tests
                                                                              were also performed between the treatment groups versus
                                                                              transection.    Differences were considered significant at
                                                                              p<0.05. SPSS software was used for the statistical analysis.
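   The analysis pipeline described above can be sketched in a few lines. The following is a minimal illustration and not the SPSS procedure used in this paper: it averages left and right leg scores into one score per subject and computes the one-way ANOVA F statistic from first principles. The function names and the example scores are hypothetical and are not the simulated data set of this study. A p-value would additionally require the F distribution, which a statistics package provides.

```python
from statistics import mean


def subject_scores(left, right):
    """Average left and right leg BBB scores into one score per subject."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    return [(a + b) / 2 for a, b in zip(left, right)]


def one_way_anova(groups):
    """Return (F, df_between, df_within, mse) for a list of groups."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual subjects around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    mse = ss_within / df_within  # pooled within-group variance
    f_stat = (ss_between / df_between) / mse
    return f_stat, df_between, df_within, mse


# Hypothetical per-leg scores for three small groups (illustration only).
control = subject_scores([1, 2, 3], [1, 2, 3])
treat_a = subject_scores([2, 3, 4], [2, 3, 4])
treat_b = subject_scores([5, 6, 7], [5, 6, 7])

f_stat, df_b, df_w, mse = one_way_anova([control, treat_a, treat_b])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}, MSE = {mse:.2f}")
```

   The pooled within-group variance returned as mse is the quantity that post-hoc procedures reuse when comparing individual pairs of groups.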
                                                                                 From the literature, rats undergoing
                                                                              transection show a stabilized mean BBB score of around 1.5
                                                                              after twelve or more weeks [9-12]. The standard deviation is
                                                                              estimated to be 0.8 [9-12]. Treatments A and B were
Fig. 1. A percent frequency plot of scores taken from severely injured rats
(n=54) scored over a six week period. BBB scores of 2 and 3 are uncommon      simulated to represent a minimally statistically significant
suggesting a discontinuity in the scale. This figure is adapted from [8].     effect and a clinically important effect respectively. The
                                                                              standard deviations for these were made larger with respect to

                                                                           1.8, and 5.0 ± 1.8 respectively. After transformation, these
                                                                           became 1.1 ± 0.4, 2.2 ± 1.2, and 3.5 ± 1.3. One-way ANOVA
                                                                           confirmed that a statistical difference did exist among
                                                                           experimental groups in both scales (p < 0.001). Several post-
                                                                           hoc tests were done to determine which comparisons
                                                                           accounted for this difference. It was found that in both the
                                                                           original and modified BBB scales, treatment B was significantly
                                                                           different from controls (p < 0.001). However, treatment A went from
                                                                           reaching statistical significance to being non-significant at the
                                                                           p < 0.05 level after transformation.
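
   The transformation applied to the scores can be sketched as follows. This is a minimal sketch under stated assumptions: scores of 2-4 are binned into a single value as described in [8], while the choice to shift every score above 4 down by two (so that the scale stays contiguous) and to transform each leg before averaging is an assumption made here for illustration. The function names are hypothetical.

```python
def transform_bbb(score):
    """Map an original BBB score (0-21) onto the modified scale.

    Scores 0 and 1 are unchanged and scores 2-4 are binned to 2.
    Assumption: scores above 4 are shifted down by two to close the gap.
    """
    if not 0 <= score <= 21:
        raise ValueError("BBB scores range from 0 to 21")
    if score <= 1:
        return score
    if score <= 4:
        return 2
    return score - 2


def subject_score(left, right, modified=False):
    """Average the two leg scores, optionally after transformation."""
    if modified:
        left, right = transform_bbb(left), transform_bbb(right)
    return (left + right) / 2


# Hypothetical subject with a left score of 2 and a right score of 5.
print(subject_score(2, 5))                 # original scale
print(subject_score(2, 5, modified=True))  # modified scale
```

   Because the binning compresses the lower range, both the group means and the spread within each group shrink, which is why the standard deviations reported above change along with the means.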

                                                                                                  IV. DISCUSSION
                                                                              We have illustrated one example where modifying the scale
                                                                           to make it more interval leads to a less favorable statistical
                                                                           result. Taking a closer look at the p-values shown in Table 2, we
Fig. 2. A comparison of the mean BBB scores of each experimental group     notice that the p-value is dependent on the type of analysis
(n=10) before and after transformation of the BBB scale. The modified
scale leads to a non-significant result of treatment A compared with the   used.
transection only group. Significance testing was done using ANOVA             As discussed previously, ANOVA post-hoc testing with
followed by Tukey’s HSD post-hoc analysis. Error bars represent standard   both Bonferroni’s test and Tukey’s HSD results in non-
deviation. *p < 0.05, **p < 0.001.
                                                                           significance for treatment A with the modified BBB scale.
                                                                           Interestingly, however, direct comparison using a t-test results
                               TABLE 2                                     in significance for treatment A in both scenarios. That is to
                 P-VALUES FOR CONTROL VS TREATMENTS                        say, the presence of treatment B group has a distinct effect on
    Test          Original BBB Scale          Modified BBB Scale           significance testing. T-tests are generally not recommended to
 Control vs Treatment A                                                    compare between specific groups in a multi-group study
 Bonferroni     p = 0.020                  p = 0.070                       because there is a familywise error associated with making
 Tukey HSD p = 0.018                       p = 0.059
 t-test         p = 0.006                  p = 0.018                       multiple comparisons [13].            Post-hoc tests such as
 Control vs Treatment B                                                    Bonferroni’s test and Tukey’s HSD compensate for this by
 Bonferroni     p < 0.001                  p < 0.001                       adjusting the α level. These tests also make use of the mean
 Tukey HSD p < 0.001                       p < 0.001
 t-test         p < 0.001                  p < 0.001
                                                                           square error from ANOVA, which is a better estimate of
                                                                           the population variances.
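   The familywise error argument can be made concrete with a short calculation. The sketch below assumes three pairwise comparisons among the three groups (this count is an assumption, as the number of comparisons is not stated explicitly) and applies a simple Bonferroni adjustment to the t-test p-values for control versus treatment A from Table 2. The adjusted values need not equal the Bonferroni row of Table 2, since post-hoc procedures typically pool the error variance across all groups.

```python
def familywise_error_rate(alpha, n_comparisons):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_comparisons


def bonferroni_adjusted_p(p_value, n_comparisons):
    """Bonferroni adjustment: multiply p by the number of comparisons."""
    return min(1.0, p_value * n_comparisons)


ALPHA = 0.05
N_COMPARISONS = 3  # assumption: all pairwise comparisons among three groups

# Unadjusted t-test p-values, control vs treatment A (Table 2).
t_test_p = {"original": 0.006, "modified": 0.018}

rate = familywise_error_rate(ALPHA, N_COMPARISONS)
print(f"Familywise error rate without correction: {rate:.3f}")
for scale, p in t_test_p.items():
    adjusted = bonferroni_adjusted_p(p, N_COMPARISONS)
    verdict = "significant" if adjusted < ALPHA else "not significant"
    print(f"{scale} scale: adjusted p = {adjusted:.3f} ({verdict})")
```

   Under these assumptions the uncorrected t-test on the modified scale no longer reaches p < 0.05 once it is adjusted, in line with the post-hoc results.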
                                                                              To further investigate how the transformation into the
controls because we expect a wider variation at higher BBB                 modified BBB scale affects the data, effect sizes for the
scores where there is presumably functional tissue across the              treatments were calculated. Effect size is the ratio of the
injury site [6].                                                           mean difference to the standard deviation, or ∆/SD. For treatment
   For the purpose of this analysis, we are interested in only             A, the effect size went from 1.13 to 0.93 after transformation,
the end-of-study BBB score. While time-course analysis is                  a decrease of 17.7%. This corresponds to a loss in statistical
useful in showing a recovery trend over time, the most                     power from 0.71 to 0.55. What this means is that the
important effect, and indeed what we would like to show with               transformation results in a 45% chance of not being able to
statistical confidence, is the long-term recovery from a                   detect a difference if one truly exists (type II error).
treatment.                                                                    Although we show here that the discontinuity in the BBB
   The purpose of the statistical model will be to compare the             scale may affect borderline significances, it is important to
original BBB scale against the modified scale, where scores of             note that the treatment A group may not be considered
2-4 are combined. The basis of adopting the modified BBB                   clinically relevant. A clinically important result should raise
scale relies heavily on reducing the variance within a group.              BBB scores to the 5+ range where there is substantial
Because the transformation makes the scale more compact, any               improvement in hindlimb function and overall gait. Since we
differences obtained will now be smaller. This must be                     are interested in such large differences, the transformation
accompanied by a similar or greater reduction in the relative              should not have a considerable effect on significance testing,
variances to detect the same statistical outcome. This study               assuming sample sizes and standard deviations are kept
will attempt to investigate how the modified BBB scale affects             reasonable. This is seen in our analysis of treatment B where
detection of statistical significance.                                     all p-values were less than 0.001.
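   The relationship between effect size and statistical power for treatment A can be checked with a short calculation. The sketch below uses a normal approximation to the power of a two-sided, two-sample comparison with ten subjects per group. How the power values in the text were actually obtained is not stated, so this approximation is an assumption, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def relative_change(before, after):
    """Fractional decrease from before to after."""
    return (before - after) / before


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Normal-approximation power of a two-sided two-sample test.

    effect_size is the mean difference divided by the standard deviation.
    The negligible contribution of the opposite tail is ignored.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    noncentrality = effect_size * sqrt(n_per_group / 2)
    return NormalDist().cdf(noncentrality - z_crit)


N_PER_GROUP = 10
d_original, d_modified = 1.13, 0.93  # effect sizes for treatment A

print(f"Decrease in effect size: {relative_change(d_original, d_modified):.1%}")
for label, d in (("original", d_original), ("modified", d_modified)):
    power = approx_power(d, N_PER_GROUP)
    print(f"{label} scale: power = {power:.2f}, type II error = {1 - power:.2f}")
```

   Under this approximation the two effect sizes give powers of about 0.71 and 0.55, and the latter corresponds to a type II error of about 45%.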
 B. Results                                                                   Apart from finding significance with behavioral assays such
                                                                           as BBB, it is important to correlate functional recovery with
  The three groups had mean BBB scores of 1.5 ± 0.8, 3.5 ±                 histological improvement, especially in transection models

[6,10]. Evidence of a physical tissue bridge connecting the                      [11] E.C. Tsai, P.D. Dalton, M.S. Shoichet, and C.H. Tator, “Matrix
                                                                                      inclusion within synthetic hydrogel guidance channels improves specific
two spinal cord stumps will greatly support claims of an                              supraspinal and local axonal regeneration after spinal cord transection,”
effective treatment, and provides a less subjective sign of                           Biomaterials. vol. 27, pp. 519-533, 2006.
regeneration.                                                                    [12] K. Fouad, L. Schnell, M.B. Bunge, M.E. Schwab, T. Liebscher, and
   While the modified BBB scale may be a more                                         D.D. Pearse, “Combining Schwann cell bridges and olfactory-
                                                                                      ensheathing glia grafts with chondroitinase promotes locomotor
mathematically valid measurement, there may be cases when it                          recovery after complete transection of the spinal cord,” J Neurosci. vol.
will be disadvantageous in showing statistical differences. It                        25, no. 5, pp. 1169-1178, February 2005.
should be strongly noted that this was not a comprehensive                       [13] S.W. Scheff, D.A. Saucier, and M.E. Cain, “A statistical method for
                                                                                      analyzing rating scale data: the BBB locomotor score,” J. Neurotrauma.
study and that factors such as proportion of scores of 2 and 3,                       vol. 19, no. 10, pp. 1251-1260, 2002.
left to right side variability, and sample size were not explicitly
studied. Indeed, whereas we found a 17.7% decrease in effect
size in our data set, Ferguson et al. report a 2% average
increase in effect size after transformation [8], suggesting that
our simulated data may represent a certain scenario that
disfavors the transformation.
   In summary, because the original BBB scale is in such
widespread use, replacing it would require strong evidence that the
modified BBB scale is capable of better detecting statistical
differences in treatments. We found that the transformation to
make the scale more metric does not greatly increase statistical
power, and in fact may cause it to decrease in some cases.
This may affect cases where treatments are only moderately
effective, but should not influence significance testing of
clinically relevant treatments where the functional
improvements are much greater.

[1]  National Spinal Cord Injury Statistical Center (2005), Spinal Cord
     Injury Facts and Figures at a Glance [Online]. Available:
[2] R. Talac, J.A. Friedman, M.J. Moore, L. Lu, E. Jabbari, A.J.
     Windebank, B.L. Currier, and M.J. Yaszemski, “Animal models of
     spinal cord injury for evaluation of tissue engineering strategies,”
     Biomaterials. vol. 25, pp. 1505-1510, 2004.
[3] G.A. Metz, D. Merkler, V. Dietz, M.E. Schwab, and K. Fouad,
     “Efficient testing of motor function in spinal cord injured rats,” Brain
     Res. vol. 883, no. 2, pp. 165-177, 2000.
[4] D.M. Basso, M.S. Beattie, and J.C. Bresnahan, “A sensitive and reliable
     locomotor rating scale for open field testing in rats,” J. Neurotrauma.
     vol. 12, no. 1, pp. 1-21, 1995.
[5] D.M. Basso, “Behavioral testing after spinal cord injury: congruities,
     complexities, and controversies,” J. Neurotrauma. vol. 21, no. 4, pp.
     395-404, 2004.
[6] D.M. Basso, M.S. Beattie, and J.C. Bresnahan, “Graded histological and
     locomotor outcomes after spinal cord contusion using the NYU-weight
     drop device versus transection,” Experimental Neurology. vol. 139, pp.
     244-256, 1996.
[7] D.M. Basso et al., “MASCIS evaluation of open field motor scores:
     effects of experience and teamwork on reliability,” J. Neurotrauma. vol.
     13, no. 7, pp. 343-359, 1996.
[8] A.R. Ferguson, M.A. Hook, G. Garcia, J.C. Bresnahan, M.S. Beattie, and
     J.W. Grau, “A simple post hoc transformation that improves the metric
     properties of the BBB scale for rats with moderate to severe spinal cord
     injury,” J. Neurotrauma. vol. 21, no. 11, pp. 1601-1613, 2004.
[9] Y.S. Lee, C.Y. Lin, R.T. Robertson, I. Hsiao, and V.W. Lin, “Motor
     recovery and anatomical evidence of axonal regrowth in spinal cord-
     repaired rats,” J Neuropath. Exper. Neurol. vol. 63, no. 3, pp. 233-245,
     March 2004.
[10] E.C. Tsai, A.V. Krassioukov, and C.H. Tator, “Corticospinal
     regeneration in lumbar grey matter correlates with locomotor recovery
     after complete spinal cord transection and repair with peripheral nerve
     grafts, fibroblast growth factor-1, fibrin glue, and spinal fusion,” J
     Neuropath. Exper. Neurol. vol. 64, no. 3, pp. 230-244, March 2005.
