Sampling and Sample Size Computation

Grace H. Encelan-Brizuela, MD, MSPH
Department of Preventive and Community Medicine
University of the East Ramon Magsaysay Memorial Medical Center, Inc.

Objectives of the Lecture

At the end of this lecture, students should be able to:
- Define research populations
- Enumerate reasons for sampling
- Discuss ways of designing inclusion and exclusion criteria
- Discuss types of sampling methods
- Compute sample size

Introduction

Challenge to every research protocol: it must specify a sample of subjects that
- can be studied at an acceptable cost in time and money
- is large enough to control random error in generalizing the study findings to the population
- is representative enough to control systematic error in these inferences

Basic terms and concepts

Population – the complete set of people with a specified set of characteristics
Sample – a subset of the population, selected so as to be representative of the larger population
Target population – the large set of patients throughout the world to which the results will be generalized. Defined by clinical and demographic characteristics.
Accessible population – the subset of the target population that is available for the study. Defined by geographic and temporal characteristics.

Reasons for sampling

1. Samples can be studied more quickly than populations
2. A study of a sample is less expensive than a study of an entire population
3. A study of an entire population is impossible in most situations
4. Sample results are often more accurate than results based on a population
5. If samples are properly selected, probability methods can be used to estimate the error in the resulting statistics
6. Samples can be selected to reduce heterogeneity

RESEARCH QUESTION (Truth in the Universe)
SPECIFICATION
  Step 1. Target populations: specify clinical and demographic characteristics.
          Criteria: well suited to the research question.
  Step 2. Accessible population: specify temporal and geographic characteristics.
          Criteria: representative of target populations and easy to study.

STUDY PLAN (Truth in the Study)
SAMPLING
  Step 3. Intended sample: design an approach to selecting the sample.
          Criteria: representative of accessible population and easy to do.

Specification: Establishing Inclusion Criteria

Inclusion criteria – define the main characteristics of the target and accessible populations

Designing Inclusion Criteria

Consideration: specifying the characteristics that define populations that are relevant to the research question and efficient for study.
Example: a 5-year trial of calcium supplementation for preventing osteoporosis might specify that the subjects be:

Target population
  Demographic characteristics: white females aged 45–50
  Clinical characteristics: in good general health (no known life-threatening disease; not taking long-term corticosteroids)
Accessible population
  Geographic characteristics: patients attending the medical clinic at the investigator's hospital
  Temporal characteristics: between Jan 1 and Dec 31, 2006

Specification: Establishing Exclusion Criteria

Exclusion criteria – indicate subsets of individuals who meet the inclusion criteria but are likely to interfere with the quality of the data or the interpretation of the findings

Designing Exclusion Criteria

Consideration: specifying subsets of the population that will not be studied, for the reasons listed below.
Example: a 5-year trial of calcium supplementation for preventing osteoporosis might exclude subjects as follows (reason: example):
- A high likelihood of being lost to follow-up: subjects who plan to move out of state
- An inability to provide good data: subjects who are disoriented or have language barriers
- Ethical barriers: kidney stone formers
- The subject's refusal to participate: subjects unwilling to accept the possibility of random allocation to the placebo group

SAMPLING

Probability Sampling
- Uses a random process to guarantee that each unit of the population has a specified chance of selection

Simple random sampling
- Every subject has an equal probability of being selected for the study
- The process of enumerating every unit of the accessible population and then selecting the sample at random
- The recommended way is to use a table of random numbers or a computer-generated list of random numbers
- What is needed: an accurate listing of the population, and a mechanism to find and enroll those who are chosen

Systematic sampling
- Involves selecting by a periodic process; the starting point is chosen at random
Example: select a sample of 200 from a population of 3,400.
Procedure: Number all units from 1 to 3,400, then divide the population size by the number to be sampled to get the sampling interval (3400/200 = 17).
Select a random number between 1 and 17 as the starting point k, then select unit k and every 17th unit thereafter.
NOTE: Systematic sampling should not be used when a cyclic repetition is inherent in the sampling frame. For example, it is not appropriate for selecting months of the year in a study of the frequency of different types of accidents, because some accidents occur most often at certain times of the year.

Stratified random sampling
- Involves dividing the population into subgroups according to characteristics and taking a random sample from each of these "strata"
- Characteristics used to stratify should be related to the measurement of interest
- In medicine, commonly used strata include age, gender, and severity of disease

Cluster sampling
- The process of taking a random sample of natural groupings of individuals in the population; very useful when the population is widely dispersed and it is impractical or costly to list and sample from all of its elements
- Clusters are commonly based on geographic areas or districts, so this approach is used more often in epidemiologic research than in clinical research

Nonprobability Sampling
- A sampling method in which the probability that a subject is selected is unknown

Consecutive sampling
- Involves taking every patient who meets the selection criteria over a specified time interval or number of patients; it amounts to taking the complete accessible population over the duration of the study

Convenience sampling
- The process of taking those members of the accessible population who are easily available

Judgmental sampling
- Involves handpicking from the accessible population those individuals judged most appropriate for the study

Sample Size Computation

Introduction

Factors that affect the number of subjects required for a study:
1. Whether the alpha level chosen is the usual one (0.05) or smaller
2. Whether beta error is considered in addition to alpha error
3. Whether the difference between means or proportions to be detected is fairly small or extremely small
4. Whether the research design involves paired or unpaired data
5. Whether a large or small variance is anticipated in the data set

Review of basic concepts and terms
- Effect size
- Alpha level or significance level – the probability that a positive finding is due to chance alone
- Power – the probability that the effect will be detected if it truly exists

Recall the paired t test:

$$ t = \frac{d}{s_d / \sqrt{N}} $$

where d is the mean difference that was observed, s_d is the standard deviation of the differences (so that s_d/√N is the standard error of the mean difference), and N is the sample size.

To solve for N, the formula is rearranged (square both sides, isolate N, and substitute z_α for t). With s written for s_d, it becomes

$$ N = \frac{(z_\alpha)^2 \, s^2}{d^2} $$

Derivation of the Basic Sample Size Formula

Formulas for the calculation of sample size for studies commonly pursued in medical research:

1. Studies using the paired t test (e.g. before-and-after studies), considering alpha (Type I) error only; N is the total number of subjects:

$$ N = \frac{(z_\alpha)^2 \, s^2}{d^2} $$

2. Studies using Student's t test (e.g. one experimental group and one control group), considering alpha (Type I) error only; N is the number of subjects per group:

$$ N = \frac{(z_\alpha)^2 \cdot 2 \cdot s^2}{d^2} $$

3. Studies using Student's t test, considering alpha (Type I) and beta (Type II) errors; N is the number of subjects per group:

$$ N = \frac{(z_\alpha + z_\beta)^2 \cdot 2 \cdot s^2}{d^2} $$

4. Studies using a test of differences in proportions, considering alpha (Type I) and beta (Type II) errors; N is the number of subjects per group:

$$ N = \frac{(z_\alpha + z_\beta)^2 \cdot 2 \cdot p(1 - p)}{d^2} $$

Example 1: paired t test, alpha error only

Study characteristics and assumptions made by the investigator:
- Type of study: before-and-after study of an antihypertensive (anti-HPN) drug
- Data sets: pre-treatment and post-treatment observations in the same group of subjects
- Variable: systolic blood pressure
- Standard deviation (s): 15 mm Hg
- Variance (s²): 225 mm Hg²
- Data for alpha (z_α): α = 0.05; therefore, 95% confidence desired (two-tailed test); z_α = 1.96
- Difference to be detected (d): 10 mm Hg or larger difference between pre-treatment and post-treatment blood pressure values

$$ N = \frac{(z_\alpha)^2 \, s^2}{d^2} = \frac{(1.96)^2 (15)^2}{(10)^2} = \frac{(3.84)(225)}{100} = \frac{864}{100} = 8.64 $$

Rounded up: 9 subjects total.

Example 2: Student's t test, alpha error only

Study characteristics and assumptions made by the investigator:
- Type of study: RCT of an antihypertensive (anti-HPN) drug
- Data sets: observations in one experimental group and one control group
- Variable: systolic blood pressure
- Standard deviation (s): 15 mm Hg
- Variance (s²): 225 mm Hg²
- Data for alpha (z_α): α = 0.05; therefore, 95% confidence desired (two-tailed test); z_α = 1.96
- Difference to be detected (d): 10 mm Hg or larger difference between the mean blood pressure values of the experimental group and the control group

$$ N = \frac{(z_\alpha)^2 \cdot 2 \cdot s^2}{d^2} = \frac{(1.96)^2 \cdot 2 \cdot (15)^2}{(10)^2} = \frac{(3.84)(2)(225)}{100} = \frac{1728}{100} = 17.28 $$

Rounded up: 18 subjects per group × 2 groups = 36 subjects.

Example 3: Student's t test, alpha and beta errors

Study characteristics and assumptions made by the investigator:
- Type of study: RCT of an antihypertensive (anti-HPN) drug
- Data sets: observations in one experimental group and one control group
- Variable: systolic blood pressure
- Standard deviation (s): 15 mm Hg
- Variance (s²): 225 mm Hg²
- Data for alpha (z_α): α = 0.05; therefore, 95% confidence desired (two-tailed test); z_α = 1.96
- Data for beta (z_β): 20% beta error; therefore, 80% power desired (one-tailed test); z_β = 0.84
- Difference to be detected (d): 10 mm Hg or larger difference between the mean blood pressure values of the experimental group and the control group

$$ N = \frac{(z_\alpha + z_\beta)^2 \cdot 2 \cdot s^2}{d^2} = \frac{(1.96 + 0.84)^2 \cdot 2 \cdot (15)^2}{(10)^2} = \frac{(7.84)(2)(225)}{100} = \frac{3528}{100} = 35.28 $$

Rounded up: 36 subjects per group × 2 groups = 72 subjects.

Example 4: test of differences in proportions, alpha and beta errors

Study characteristics and assumptions made by the investigator:
- Type of study: RCT of a drug to reduce the 5-year mortality in patients with a particular form of cancer
- Data sets: observations in one experimental group and one control group
- Variable: success = 5-year survival after treatment; failure = death within 5 years of treatment
- Variance, p(1 − p): p = 0.55; therefore, (1 − p) = 0.45
- Data for alpha (z_α): α = 0.05; therefore, 95% confidence desired (two-tailed test); z_α = 1.96
- Data for beta (z_β): 20% beta error; therefore, 80% power desired (one-tailed test); z_β = 0.84
- Difference to be detected (d): 0.1 or larger difference between the success (survival) of the experimental group and that of the control group

$$ N = \frac{(z_\alpha + z_\beta)^2 \cdot 2 \cdot p(1 - p)}{d^2} = \frac{(1.96 + 0.84)^2 \cdot 2 \cdot (0.55)(0.45)}{(0.1)^2} = \frac{(7.84)(2)(0.2475)}{0.01} = \frac{3.8808}{0.01} = 388.08 $$

Rounded up: 389 subjects per group × 2 groups = 778 subjects.

Bigger does not always mean better (in terms of sample size… at least)
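The four worked examples above can be checked with a short script. This is a minimal sketch, not part of the lecture: the function names are hypothetical, the z values (1.96 and 0.84) are typed in by hand as in the slides, and every result is rounded up to a whole subject. Note that the proportions example evaluates to 388.08 before rounding, so rounding up as in the other three examples gives 389 per group.

```python
import math


def _round_up(n: float) -> int:
    """Round up to a whole subject, ignoring tiny floating-point noise."""
    return math.ceil(round(n, 9))


def n_paired(z_alpha: float, s: float, d: float) -> int:
    """Paired t test, alpha error only: total number of subjects."""
    return _round_up(z_alpha**2 * s**2 / d**2)


def n_two_groups(z_alpha: float, s: float, d: float, z_beta: float = 0.0) -> int:
    """Student's t test: subjects per group (z_beta=0 means alpha error only)."""
    return _round_up((z_alpha + z_beta) ** 2 * 2 * s**2 / d**2)


def n_two_proportions(z_alpha: float, z_beta: float, p: float, d: float) -> int:
    """Test of differences in proportions: subjects per group."""
    return _round_up((z_alpha + z_beta) ** 2 * 2 * p * (1 - p) / d**2)


# Values taken from the four worked examples (s = 15 mm Hg, d = 10 mm Hg).
print(n_paired(1.96, 15, 10))                     # 9 subjects total
print(n_two_groups(1.96, 15, 10))                 # 18 per group
print(n_two_groups(1.96, 15, 10, z_beta=0.84))    # 36 per group
print(n_two_proportions(1.96, 0.84, 0.55, 0.1))   # 389 per group
```

The two-group functions return subjects per group, so the total enrollment is twice the returned value.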
