
Biostats for Medical Students

Shared by: Ashok Vashistha
posted:
8/22/2008
language:
English
pages:
81
Introduction to Clinical Biostatistics

Dr Kripa Shanker Gupta



Overview of Presentation

• Introductory Concepts (Review)
• Hypothesis Testing
• Linear Regression and Correlation
• Analysis of Variance (ANOVA)
• Nonparametric Statistics
• Survival Analysis






Introductory Concepts





• Types of Data
• Presenting Data
• Descriptive Measures
• Probability and Distributions
• Estimation Techniques



Types of Data





Data are usually Discrete or Continuous





Discrete Variables take on a finite set of values that can be counted





Race, Gender, Year in School etc.







Continuous Variables take on an infinite set of values





Age, Height/Weight, Blood Pressure



Types of Data





A Special type of Discrete Variable is the Binary Variable which takes on exactly 2 possible values





• Gender (M/F)
• Pregnant? (Y/N)
• Hypertensive? (Y/N)











Types of Data





Sometimes, discrete variables have a “natural ordering” to them; these are called Ordinal Variables





For example, names of consecutive days in a week (M, Tu, Wed, Thurs, Fri, Sat, Sun)







Other types of discrete variables do not have a natural order and are called Nominal Variables





Race (African American, Caucasian, Asian, Hispanic etc.)



Types of Data





If in an experiment you measure a single variable, it is called a Univariate experiment







If you measure 2 variables, it is called a Bivariate experiment

And if you measure multiple variables, it is called a Multivariate experiment







Types of Data





A Random variable is one whose value is determined by chance or a random event.

Typically, a variable X is random if it is the outcome of an experiment where results can occur by chance or are not completely predictable.







Types of Data





Nonparametric Variables





Many times in clinical studies, we seek opinion data (e.g. patient satisfaction scores, relative value scales etc.). The data can be ranked but has no absolute scale that is comparable. This type of data is called nonparametric data.











Presenting Data





There are many ways to present data:

• Frequency Tables
• Pie Charts
• Bar Graphs (Histograms)
• Line Graphs
• Scatter Plots (Scattergrams)
• Stem and Leaf Displays
• Box Plots



Descriptive Measures





Now that we have displayed our data, we want to be able to characterize it quantitatively





• Measures of Central Tendency: Mean, Median, Mode
• Measures of Variability: Range, Variance, Standard Deviation
• Measures of Relative Standing: Z-Scores, Percentiles, Quartiles



Measures of Central Tendency





Mean





Arithmetic Average of a sample of data







Median









If you order the data from smallest to largest, the median is the middle value, assuming an odd number of data elements. If you have an even number of elements, it is the average of the 2 middle numbers.

Mode

The most common value in a set of values





Mode





The value which is the “most popular” in a continuous distribution of scores, i.e. the one with the GREATEST FREQUENCY

E.g. 2, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 7
Number of 2’s: 1, 4’s: 3, 5’s: 5, 6’s: 6, 7’s: 2
Mode is 6 (most popular)

• Simplest but least useful
• Useful when data has been divided into categories















Median

• It’s the centre point of the distribution
• Represents the value below which 50% of all scores are located
• Divides the distribution into two equal parts (50th percentile)

E.g. 2, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 7
Median will be 5

• Better estimate than the mode: it is not affected by extreme values
• For the same reason, it cannot tell exactly how much the values vary



 







Mean (Average)

• It is the weighted centre point of the distribution
• Calculated by summing all observations and dividing by the total number of observations
• Advantage: it takes into account how far the values spread and allows for extreme scores
• It acts as the balancing point for the distribution
• Most commonly used measure of central tendency
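The three measures of central tendency can be checked on the 17 scores used in the Mode and Median examples. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the original slides.

```python
import statistics

# The 17 scores from the Mode and Median examples
scores = [2, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 7]

mode = statistics.mode(scores)      # value with the greatest frequency
median = statistics.median(scores)  # middle value of the ordered data
mean = statistics.mean(scores)      # sum of observations / number of observations

print(f"mode={mode}, median={median}, mean={mean:.2f}")
# prints: mode=6, median=5, mean=5.24
```

The mean (89 / 17) falls below the median and the mode because the single low score of 2 pulls the balancing point downwards.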



Measures of Variability





Once we have located the center of a set of data points, we want to know how “dispersed” they are



Measures of Variability





• Range: the difference between the highest and lowest values
• Variance: the average of the squared deviations of the individual data points about their mean
• Standard Deviation: the square root of the variance
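The definitions above translate directly into a few lines of arithmetic. The sketch below reuses the 17 scores from the Mode example (an assumption made for illustration) and divides by n, as the slide's "average of the squared deviations" wording suggests. Many textbooks divide by n − 1 for a sample, which is what `statistics.variance` does.

```python
import math
import statistics

# Scores reused from the Mode example (illustrative choice of data)
scores = [2, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 7]

value_range = max(scores) - min(scores)  # highest minus lowest value

mean = sum(scores) / len(scores)
# Average of the squared deviations about the mean (divides by n)
variance = sum((x - mean) ** 2 for x in scores) / len(scores)
std_dev = math.sqrt(variance)            # square root of the variance

# Sample variance for comparison (divides by n - 1)
sample_variance = statistics.variance(scores)

print(value_range, round(variance, 3), round(std_dev, 3), round(sample_variance, 3))
```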











Measures of Relative Standing





Percentiles and Quartiles also indicate relative standing but in terms of the categories of scores from lowest to highest





Given a set of n measurements x1, …, xn, the pth percentile is defined to be the value of x that exceeds p% of the measurements and is less than (100−p)% of the values.

Ex: Scores of 20, 30, 50, 60, 67, 67, 70, 80, 90, 95

The score 50 is at the 30th percentile: 30% of the scores are at or below it and 70% are higher.




Measures of Relative Standing





Quartiles similarly reflect in which quarter of the set of values a particular observation lies:





Ex: Scores of 20, 30, 50, 60, 67, 67, 70, 80, 90, 95
1st Quartile = 50, 3rd Quartile = 80
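The percentile and quartile figures for the ten example scores can be reproduced as follows. This is a sketch under two assumptions that the slides do not state: the percentile of a score is taken as the percentage of scores at or below it, and the quartiles are the medians of the lower and upper halves of the ordered data. Other conventions exist and give slightly different numbers. Both helper functions are hypothetical names introduced here.

```python
import statistics

scores = [20, 30, 50, 60, 67, 67, 70, 80, 90, 95]

def percent_at_or_below(value, data):
    """Percentage of observations less than or equal to value."""
    return 100 * sum(1 for x in data if x <= value) / len(data)

def quartiles_by_halves(data):
    """Q1 and Q3 as medians of the lower and upper halves of the ordered data."""
    ordered = sorted(data)
    half = len(ordered) // 2   # for odd n, the middle value belongs to neither half
    lower = ordered[:half]
    upper = ordered[len(ordered) - half:]
    return statistics.median(lower), statistics.median(upper)

print(percent_at_or_below(50, scores))  # 30.0
print(quartiles_by_halves(scores))      # (50, 80)
```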







Probability





Suppose you do an experiment with a finite number of possible outcomes (ex: coin toss)







The Probability of an event E (H/T) is the chance (%) that the event will turn out in a given way in the next repetition of the experiment

Probability values are always between 0 and 1










Probability





The notation for probabilities is as follows:





Given our coin toss experiment,
• P(H) = Probability that a Head will be tossed in the next round
• P(T) = Probability that a Tail will be tossed in the next round







One can estimate probabilities by repeating the event many times and observing the outcomes
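Estimating a probability by repetition can be illustrated with a simulated fair coin. This is a minimal sketch: the function name `estimate_p_heads` and the fixed seed are assumptions made so the run is repeatable, and a real experiment would use observed tosses instead of a random number generator.

```python
import random

def estimate_p_heads(n_tosses, seed=0):
    """Estimate P(H) as the fraction of heads in n_tosses simulated fair tosses."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    heads = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
    return heads / n_tosses

# The estimate tends to settle near 0.5 as the number of repetitions grows
for n in (10, 1000, 100000):
    print(n, estimate_p_heads(n))
```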



Probabilities: Some Simple Rules





Arithmetically, one can combine probabilities of simple and sequential events:









Given a complex event composed of N mutually exclusive simple events, the probability of the complex event is equal to the sum of the probabilities of each of the simple events.

Ex: Coin toss 1 and Coin toss 2

Event   First Coin   Second Coin   P(Ei)
E1      Heads        Heads         ¼
E2      Heads        Tails         ¼
E3      Tails        Heads         ¼
E4      Tails        Tails         ¼

Let A = {E2, E3}. Then P(A) = P(E2) + P(E3) = ½
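The two-coin example can be verified by enumerating the four equally likely outcomes and adding the probabilities of the outcomes that make up A. The sketch below assumes fair, independent coins, as in the table; exact fractions are used so that the result is ½ with no rounding.

```python
from fractions import Fraction
from itertools import product

# All (first coin, second coin) outcomes: E1..E4 in the table
outcomes = list(product(["Heads", "Tails"], repeat=2))
p_each = Fraction(1, len(outcomes))  # each simple event has probability 1/4

# A = {E2, E3}: exactly one Head and one Tail
event_a = [o for o in outcomes if set(o) == {"Heads", "Tails"}]

# Sum rule for mutually exclusive simple events
p_a = sum(p_each for _ in event_a)

print(outcomes)
print(p_a)  # 1/2
```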



Probability Distributions





Given a random variable X (either discrete or continuous), the Probability Distribution gives a table or formula or graph of the probabilities of each potential value of X







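For a Probability Distribution P(x) the following must hold:

• 0 ≤ P(x) ≤ 1 for every value x
• The probabilities of all possible values of X sum to 1

[…]

[…] $14
Ho: μ = $14
Test statistic: Z = (X̄ − μ0) / (s / √N), where s is the sample standard deviation
Rejection region: α = 0.05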



Testing a Hypothesis





The average weekly earnings for men in managerial and professional positions is $725. Do women in the same position have average weekly earnings that are less than those for men? A random sample of N = 40 women in managerial positions showed X̄ = $670 and a sample standard deviation s = $102. Test the appropriate hypothesis using α = 0.01.

Solution:
Ho: μ = 725
Ha: μ < 725

[…] 30) in order to achieve good power. But what happens when the sample size is small (N […]

[…] Ro or Rs ≤ −Ro
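The weekly-earnings example can be worked through numerically. This is a sketch under one assumption: the figure of $102 is treated as the sample standard deviation, which is how the Z formula in these slides uses it. The one-tailed critical value comes from the standard normal distribution in Python's `statistics` module.

```python
import math
from statistics import NormalDist

mu_0 = 725.0   # hypothesised mean (men's average weekly earnings)
x_bar = 670.0  # sample mean for the women
s = 102.0      # assumed to be the sample standard deviation
n = 40         # sample size
alpha = 0.01   # significance level, lower-tailed test (Ha: mu < 725)

# Test statistic: Z = (x_bar - mu_0) / (s / sqrt(n))
z = (x_bar - mu_0) / (s / math.sqrt(n))

# Reject Ho when Z falls below the lower-tail critical value
z_crit = NormalDist().inv_cdf(alpha)
reject_h0 = z < z_crit

print(f"Z = {z:.2f}, critical value = {z_crit:.2f}, reject Ho: {reject_h0}")
# prints: Z = -3.41, critical value = -2.33, reject Ho: True
```

Since Z lies well inside the rejection region, the sample supports the claim that women's average weekly earnings are lower, under the assumptions stated above.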







A few terms and their calculations

Abbr.  Variable                               Equation                                   Value
       subjects in control group                                                         250
       subjects in experimental group                                                    150
       events in control group                                                           100
       events in experimental group                                                      15
CER    control event rate                     = events / subjects in control group       0.4 or 40%
EER    experimental event rate                = events / subjects in experimental group  0.1 or 10%
ARR    absolute risk reduction (or increase)  = CER − EER                                0.3 or 30%
RRR    relative risk reduction (or increase)  = (CER − EER) / CER                        0.75
NNT    number needed to treat / harm          = 1 / ARR                                  3.33
RR     relative risk                          = CER / EER                                4

Note: RR is computed here as control relative to experimental (CER / EER). The more common convention, EER / CER, gives 0.25. The odds ratio is a different quantity and is not equal to either ratio.
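The values in the table follow from the four counts alone. The sketch below recomputes them; variable names are illustrative, and the last ratio follows the table's own definition (CER / EER), not the more usual EER / CER.

```python
# Counts from the table
control_subjects = 250
experimental_subjects = 150
control_events = 100
experimental_events = 15

cer = control_events / control_subjects            # control event rate
eer = experimental_events / experimental_subjects  # experimental event rate
arr = cer - eer                                    # absolute risk reduction
rrr = (cer - eer) / cer                            # relative risk reduction
nnt = 1 / arr                                      # number needed to treat
ratio_cer_eer = cer / eer                          # ratio as defined in the table

print(f"CER={cer:.2f} EER={eer:.2f} ARR={arr:.2f} "
      f"RRR={rrr:.2f} NNT={nnt:.2f} CER/EER={ratio_cer_eer:.2f}")
# prints: CER=0.40 EER=0.10 ARR=0.30 RRR=0.75 NNT=3.33 CER/EER=4.00
```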



Randomisation

Randomisation is the process of assigning clinical trial participants to treatment groups.

Randomisation gives each participant a known (usually equal) chance of being assigned to any of the groups. Successful randomisation requires that group assignment cannot be predicted in advance.



Randomisation Advantages





If, at the end of a clinical trial, a difference in outcomes occurs between two treatment groups (say, intervention and control), possible explanations for this difference would include:
i) The intervention exhibits a real effect.
ii) The outcome difference is solely due to chance.
iii) There is a systematic difference (or bias) between the groups due to factors other than the intervention.

Randomisation aims to obviate the third possibility.

• Permits statistical methods to be applied to the data.
• Randomisation allows blinding.
• Current regulatory requirements require randomisation and blinding to be applied.



Randomisation disadvantages









1) If a variable is known to affect a disease outcome and is not controlled adequately, then interpretation of results is difficult.
2) Practical problems.



Randomisation Procedures

• Simple Randomisation
• Permuted Block Randomisation
• Stratified Randomisation
• Cluster Randomisation
• Dynamic (adaptive) random allocation
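The first two procedures in the list can be sketched in a few lines. This is an illustration only, with hypothetical function names, arm labels and block size; real trials use validated allocation systems with concealed sequences. Simple randomisation draws each assignment independently, while permuted block randomisation shuffles balanced blocks so that group sizes stay equal after every complete block.

```python
import random

def simple_randomisation(n_participants, arms=("A", "B"), seed=None):
    """Assign each participant independently, with equal chance for each arm."""
    rng = random.Random(seed)
    return [rng.choice(arms) for _ in range(n_participants)]

def permuted_block_randomisation(n_participants, block_size=4, arms=("A", "B"), seed=None):
    """Assign participants in shuffled blocks that contain each arm equally often."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    allocation = []
    while len(allocation) < n_participants:
        block = list(arms) * (block_size // len(arms))  # balanced block, e.g. A B A B
        rng.shuffle(block)                              # random order within the block
        allocation.extend(block)
    return allocation[:n_participants]

print(simple_randomisation(12, seed=1))
print(permuted_block_randomisation(12, seed=1))
```

With simple randomisation the two groups can end up noticeably unequal in a small trial; the blocked version guarantees balance at the end of each block, at the cost of making the last assignment in a block predictable if the block size is known.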



Bias

Bias is said to have occurred if the results observed reflect other factors in addition to (or even instead of) the effect of the treatment.

Some potential sources of bias:
• Patient bias
• Care Provider bias
• Laboratory bias
• Analysis and Interpretation bias



CONFOUNDING

A problem resulting from the fact that one feature of study subjects has not been separated from a second feature, and has thus been confounded with it, producing a spurious result. The spuriousness arises from the effect of the first feature being mistakenly attributed to the second feature.

Confounding can produce either a type 1 or a type 2 error, but we usually focus on type 1 errors.



Blinding





All of these potential problems can be avoided if everyone involved in the study is blinded to the actual treatment the patient is receiving. Blinding (also called masking or concealment of treatment) is intended to avoid bias caused by subjective judgment in reporting, evaluation, data processing, and analysis due to knowledge of treatment.











Controls: the group of patients who receive a treatment used for comparison with the trial medicine.



Hierarchy of Blinding

• Open label: no blinding
• Single blind: patient or the investigator is blinded to treatment
• Double blind: patient and investigators (who often are also the health care providers and data collectors) blinded to treatment
• Triple blind: statistician analyzing the data is also blind
• Full double blind: everyone who comes in contact with the patient is blind, including health care personnel, nursing staff etc.
• Full triple blind: everybody who comes in contact with the patient or the investigator is blind
• Total clinical trial blind: everyone who interacts directly with the patient, investigator or the data is blind. Includes all the persons as in full triple blind, as well as the radiologists who read radiographs, pathologists who read slides and so on



Open Label Studies

These may be useful for
• Dose ranging studies.
• Pharmacokinetic studies.
• Pilot studies.
• Phase 2 or 3 long term continuation trials.
• Postmarketing studies.
• Compassionate plea trials.

However, even these applications may be substantially biased by knowledge of the treatment given and may result in
• toxicity being over (or under) reported
• efficacy being overestimated.

Even a small fraction of patients assigned at random to placebo will reduce these potential problems substantially.



Single Blind Studies





Only patient blind but not the investigator:





Justification: Double-blind is "impractical" because of need to adjust medication, medication affecting laboratory values, potential side effects, etc.







Rarely used.







Only investigator blind not the patient:





Justification: Unacceptable ethically to give an appropriate placebo treatment to a patient, and in such a case, the assessor (not the patient) should be the one blinded to the treatment. Double physician method has to be used.







Double Blind Studies





When both the subjects and the investigators are kept from knowing who is assigned to which treatment, the experiment is called “double blind”.

Serves as the standard by which all studies are judged, since it minimizes both potential patient biases and potential assessor biases.





Should be used whenever possible, which is whenever it is ethically permissible to blind a patient.



Double Blinding : Techniques





Two physician method









Physician 1: the unblinded physician speaks to and examines the patient, receives lab reports, and evaluates the side effects and treatment effect.
Physician 2: the blinded physician receives reports from Physician 1 and evaluates the results.







Placebo







If only one drug has to be compared to the placebo.
If 2 active drugs have to be compared.





Encapsulation





Disadvantages







Double dummy technique





Disadvantages



Placebo

• Latin: Placebo, “I shall be pleasing or acceptable”.
• Latin: Nocebo, “I shall injure”.
• Placebo: a pharmacologically inert substance identical in appearance to the active drug to which it is compared.











• Active control: a medication whose efficacy has been proven previously.



Active control or placebo controlled

                    Placebo                       Active Control
Objective           Real pharmacological effect   At least EQUIVALENCE; if possible, improvement
Difference sought   Large                         Small
Analysis            One-tailed test possible      Two-tailed test and confidence interval
Number of cases     Small                         Large
Major problem       Ethical consideration         Choice of recognized drug and equitable conditions of administration



Meta-analysis & Systematic Review

A systematic review is an overview of primary studies that used explicit and reproducible methods

A meta-analysis is a mathematical synthesis of the results of two or more primary studies that addressed the same hypothesis in the same way





Although meta-analysis can increase the precision of a result, it is important to ensure that the methods used for the review were valid and reliable
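The gain in precision from pooling can be illustrated with inverse-variance (fixed-effect) weighting. The slides do not name a pooling method, so this technique is a choice made here for illustration. The effect sizes and standard errors are hypothetical numbers, not results from any real study.

```python
import math

def pool_fixed_effect(effects, std_errors):
    """Inverse-variance pooled estimate and its standard error."""
    weights = [1 / se ** 2 for se in std_errors]  # more precise studies weigh more
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se

# Hypothetical effect sizes and standard errors from three primary studies
effects = [0.30, 0.10, 0.25]
std_errors = [0.15, 0.20, 0.10]

pooled, pooled_se = pool_fixed_effect(effects, std_errors)
print(f"pooled effect = {pooled:.3f}, standard error = {pooled_se:.3f}")
```

The pooled standard error is smaller than that of any single study, which is the sense in which a meta-analysis increases precision. It does nothing to correct bias in the studies being pooled, hence the emphasis on valid review methods.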



Advantages of systematic reviews

• Explicit methods limit bias in identifying and rejecting studies
• Conclusions are more reliable and accurate because of methods used
• Large amounts of information can be assimilated quickly by healthcare providers, researchers, and policymakers
• Delay between research discoveries and implementation of effective diagnostic and therapeutic strategies may be reduced
• Results of different studies can be formally compared to establish generalisability of findings and consistency (lack of heterogeneity) of results
• Reasons for heterogeneity (inconsistency in results across studies) can be identified and new hypotheses generated about particular subgroups
• Quantitative systematic reviews (meta-analyses) increase the precision of the overall result



When Can You Do Meta-Analysis?

Meta-analysis is applicable to collections of research that
• are empirical, rather than theoretical
• produce quantitative results, rather than qualitative findings
• examine the same constructs and relationships
• have findings that can be configured in a comparable statistical form (e.g., as effect sizes, correlation coefficients, odds-ratios, etc.)
• are “comparable” given the question at hand



Thank You



