Validity and Reliability

Validity & Reliability
• Internal validity - are you measuring what you think you are measuring?
  – Effects on internal validity:
    • History
    • Maturation
    • Testing
    • Subject selection
    • Subject mortality/attrition
    • Instrumentation
    • Statistical regression
    • Selection bias
    • Selection/maturation interaction
• External validity - generalizability?
  – Effects on external validity:
    • Hawthorne effect
    • Replication
    • Multiple treatments
    • Researcher effect
• How can we control these?
  – Internal validity
    • Randomization
    • Calibration
    • Placebo, blind, and double-blind setups
  – External validity
    • Randomization
    • Ecological validity
    • Design

Four Types of Measurement Validity
• Face or logical
  – The measure obviously involves the performance being measured
    • Static balance - balancing on one foot
  – R would like more objective evidence than this
• Criterion - validated against a criterion
  – 2 types:
    • Concurrent - an instrument is correlated with some criterion measured at the same time
      – Stratford et al. (1987). Diagnostic values of knee extension tracings in suspected ACL tears. Phys Ther.
    • Predictive - use of a criterion to predict later behavior
      – Harris et al. (1984). Predictive validity of the “Movement Assessment of Infants”. Devel Behav Ped.

Measurement Validity (cont'd)
• Construct
  – Degree to which a test measures some hypothetical construct; established by relating the test to some behavior
    • What is strength?
      – Moving against gravity, speed-specific torque, lifting a weight a number of times in a specified time, completing a functional task, type of muscle activity...
    • All are strength!
      – Use of operational definitions
        • Does not guarantee construct validity
      – Is the operational definition appropriate? Sensitive enough?

Reliability
• “the degree to which a measurement is free from random error”
  – Effects on reliability:
    • Subject fatigue
    • Subject motivation
    • Subject learning
    • Subject ability
    • Tester skill
    • Different testers
    • Test environment

Types of Reliability
• Test-retest
  – Stability of an observation over two time intervals.
• Split-half
  – Do the parts of the instrument that measure the same thing agree?
• Inter-rater
  – Scores of a trained observer agree with scores of a second trained observer.
• Intra-rater
  – Scores of a trained observer agree with that same observer's scores on another occasion.
    • Test-retest, videotape

Measurement error
• Sources of error:
  – Subject
    • Mood, fatigue, injury type, practice, motivation, knowledge
  – Testing
    • Test directions, multiple testers
  – Evaluation
    • Tester's competency, criteria (subjective/objective)
  – Instrumentation
    • Calibration, sampling rate

Other measurement issues
• Whenever you acquire measured data, you should make every effort to maximize its accuracy and precision.
  – Accuracy is how close the measurement comes to the true value.
  – Precision is how close the measurements are to each other (units – mm, grades, Newtons).

Accuracy / Precision
Not precise or accurate.

Accuracy / Precision
Accurate but not very precise.

Accuracy / Precision
Precise and accurate. Look out!

So what… Get to the point!
• We want both accuracy and precision!!
• Think of PT tests and measures:
  – VAS, MMT, Pain Provocation Tests, etc.

Screening and Diagnosis - Sensitivity, Specificity
• Sensitivity = proportion or percentage of individuals with a particular diagnosis who are correctly identified as positive by the test (true positive).
  – Compared to people who do have the condition.
• Specificity = proportion or percentage of individuals without a particular diagnosis who are correctly identified as negative by the test (true negative).
  – Compared to people who do not have the condition.
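
The accuracy/precision distinction from the slides above can be sketched numerically. This is a minimal illustration, assuming repeated measurements of one quantity whose true value is known. The function name `accuracy_and_precision`, the goniometer scenario, and all readings are hypothetical and not from the original slides. Accuracy is summarized here as the mean error (bias) and precision as the sample standard deviation, which is one common choice among several.

```python
from statistics import mean, stdev


def accuracy_and_precision(measurements, true_value):
    """Return (bias, spread) for repeated measurements of one known quantity.

    bias   -> accuracy: how far the average reading is from the true value
    spread -> precision: how far the readings are from each other
    """
    bias = mean(measurements) - true_value
    spread = stdev(measurements)
    return bias, spread


# Hypothetical goniometer readings (degrees) of a joint angle known to be 90.
precise_and_accurate = [90.1, 89.9, 90.0, 90.2, 89.8]
accurate_not_precise = [84.0, 96.0, 90.0, 87.0, 93.0]

bias_a, spread_a = accuracy_and_precision(precise_and_accurate, 90.0)
bias_b, spread_b = accuracy_and_precision(accurate_not_precise, 90.0)

print(f"precise and accurate: bias={bias_a:.2f}, spread={spread_a:.2f}")
print(f"accurate, not precise: bias={bias_b:.2f}, spread={spread_b:.2f}")
```

Both hypothetical data sets average close to the true value, so both are accurate on average, but the second scatters far more widely and is therefore much less precise.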

Sensitivity / Specificity

                                  Condition per the gold standard
                                  Present        Absent
Condition per the      Present       A              B
test being evaluated   Absent        C              D

Sensitivity = A/(A+C)
Specificity = D/(B+D)

Predictive value
– Tells the other part of the story.
• Positive predictive value = percentage of those identified by the test as positive who actually have the diagnosis.
  – Compared to people who do and do not have the condition.
• Negative predictive value = percentage of those identified by the test as negative who actually do not have the diagnosis.
  – Compared to people who do and do not have the condition.

Predictive value

                                  Condition per the gold standard
                                  Present        Absent
Condition per the      Present       A              B
test being evaluated   Absent        C              D

Positive predictive value = A/(A+B)
Negative predictive value = D/(C+D)

Example from the literature
• Tennet, TD, Beach, WR and Meyers, JE (2003). A Review of the Special Tests Associated with Shoulder Examination: Part I - The Rotator Cuff Tests. Am J of Sports Med. 31: 154-160.
  – Hawkins Test: forward flexing the humerus to 90 degrees and forcibly internally rotating the shoulder.
  – Sensitivity = 92% for bursitis and 88% for cuff abnormalities.
    • Got it right, positively identified, people that have it!
  – Specificity = 44% for bursitis and 43% for cuff abnormalities.
    • Got it right, negatively identified, people that don’t have it!
  – Positive predictive values = 39% for bursitis and 37% for cuff abnormalities.
    • Got it right, positively identified, mixed sample.
  – Negative predictive values = 93.1% for bursitis and 90% for cuff abnormalities.
    • Got it right, negatively identified, mixed sample.

Problems…
• There are over 30 special tests associated with a shoulder examination (Rotator Cuff/Instability Tests).
  – “Unfortunately, often these tests are of little help in confirming a diagnosis and many, for example, Yergason’s sign, were originally described without recourse to evidence based medicine, but now have become part of the standard orthopedic criteria.” (Tennet et al., 2003, AJSM, p. 160)

You should ask…
– Are some tests we use better than others?
  • Reliable?
  • Specific?
  • Sensitive?
  • Positive prediction?
  • Negative prediction?
– Are the tests we use performed as originally described?
– Are the findings of the tests we use interpreted as originally intended?
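
The four quantities in the questions above (sensitivity, specificity, positive and negative prediction) all come from the same 2x2 table of cells A, B, C, and D. A minimal sketch, assuming the cell layout used in these slides (test result in rows, gold standard in columns). The function name `diagnostic_metrics` and the counts are hypothetical and are not taken from any study.

```python
def diagnostic_metrics(a, b, c, d):
    """Compute test metrics from a 2x2 table against a gold standard.

    a: test positive, condition present (true positive)
    b: test positive, condition absent  (false positive)
    c: test negative, condition present (false negative)
    d: test negative, condition absent  (true negative)
    """
    return {
        "sensitivity": a / (a + c),  # A/(A+C)
        "specificity": d / (b + d),  # D/(B+D)
        "ppv": a / (a + b),          # positive predictive value, A/(A+B)
        "npv": d / (c + d),          # negative predictive value, D/(C+D)
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(a=45, b=30, c=5, d=20)

for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

With these made-up counts the test is sensitive but not very specific, so a negative result is more trustworthy than a positive one. Unlike sensitivity and specificity, the predictive values depend on how many people in the sample actually have the condition, so they can change when the same test is used in a different population.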