Chapter 12

Sample Size Determination



The appropriate sample size for any experiment depends on several features of that experiment. For example, it is affected by the number of levels of the independent variable, the level of confidence you employ, the tolerance set for determination of a significant effect, and the type of data (dependent variable) in your experiment. To help standardize the material which follows, I will employ a constant level of confidence of .05. In short, the probability level needed for rejection of the null hypothesis will be held constant. The material has been arranged into three sections to correspond to the dependent variable types of nominal, ordinal, and interval data, respectively.



Sample Sizes for Nominal Data



From the equation for the Chi-square we have:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

In this equation O represents the observed presence of a characteristic out of N subjects, or kN, where k indicates the proportion of presence and N is the sample size. Assume that we want a tolerance (T) to define the difference we wish to detect; therefore, E = kN + TN. Further, suppose that we want to achieve this difference for each cell of the $\chi^2$ observation matrix. Therefore, df = 1, $\chi^2$ = 3.84, and our equation becomes:

$$3.84 = \frac{(kN - (kN + TN))^2}{kN + TN} \quad \text{or} \quad 3.84 = \frac{T^2 N}{k + T}$$

which gives through algebraic manipulation:

$$\frac{3.84 (k + T)}{T^2} = N$$

Under the assumptions of the null hypothesis k = 0 and our equation becomes:

$$\frac{3.84}{T} = N$$

Therefore, if you wanted to find a 30% difference as statistically significant then the recommended sample size would be (after rounding up to the nearest whole number):



$$\frac{3.84}{.3} = 12.8 \approx 13$$

Remember, this number represents the sample size for each cell of the observation matrix. For example, suppose that we had a study of three companies to see if the ethnic distributions (defined as whites, blacks, Hispanics, and others) differed significantly. Can you see that we have 12 groups in the study? At a 30% tolerance we would need 156 subjects (3 companies by 4 ethnic classifications = 12 groups, with a minimum of 13 subjects in each group, or (12)(13) = 156 subjects).
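
The nominal-data calculation above can be sketched in a few lines of Python. This is only an illustration under the chapter's assumptions (df = 1, a critical value of 3.84, and k = 0); the function names `nominal_cell_size` and `nominal_total_size` are made up for this sketch and do not come from the chapter.

```python
import math

CHI_SQUARE_CRITICAL = 3.84  # chi-square critical value for df = 1 at the .05 level


def nominal_cell_size(tolerance):
    """Subjects needed per cell: N = 3.84 / T, rounded up to a whole subject."""
    if not 0 < tolerance <= 1:
        raise ValueError("tolerance must be a proportion greater than 0 and at most 1")
    # round() removes floating-point noise before the ceiling is taken
    return math.ceil(round(CHI_SQUARE_CRITICAL / tolerance, 9))


def nominal_total_size(tolerance, rows, columns):
    """Total subjects for a rows-by-columns observation matrix."""
    return rows * columns * nominal_cell_size(tolerance)


# The chapter's example: 3 companies by 4 ethnic classifications at a 30% tolerance.
per_cell = nominal_cell_size(0.30)
total = nominal_total_size(0.30, rows=3, columns=4)
print(per_cell, total)
```

Changing `rows`, `columns`, or the tolerance lets you see how quickly the total grows as the number of cells increases or the tolerance shrinks.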



Sample Sizes for Ordinal Data



The formula for determining the difference between two proportions is as follows (you may want to see Chapter 10 of Volume I of this series):

$$Z = \frac{p_1 - p_2}{\sqrt{\frac{pq}{N}}}$$

In this formula, p1 and p2 indicate the proportions for the two groups, q = 1 - p, and p is given by the following formula:

$$p = \frac{f_1 + f_2}{N_1 + N_2}$$

If we assume that our groups will have equal sample sizes, square both sides of the equation above, and perform some algebraic manipulation, we can derive:

$$\frac{pq Z^2}{(p_1 - p_2)^2} = N$$

At this point we can make a conservative assumption, which has the effect of inflating our sample size, that p = .5 and q = .5. Any other value of p and q will produce a smaller estimate of N. It is best to err in the direction of increasing rather than decreasing our estimated sample size! Z is a constant of 1.96 and the difference between p1 and p2 can be seen as a tolerance (T). Therefore, we have:

$$N = \frac{0.96}{T^2}$$

As an example of the application of the formula, suppose that we had three groups in a study and we wanted to find a 10% difference as statistically significant. Then we would need an individual group size of:

$$N = \frac{0.96}{0.10^2} = \frac{0.96}{0.01} = 96$$

or a total sample size of 288 subjects (3 groups by 96 subjects = 288 subjects).
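
As a rough check on the ordinal formula, the following Python sketch reproduces the example and illustrates why p = .5 is the conservative choice. It uses the chapter's rounded constant of 0.96; the unrounded product .5 x .5 x 1.96^2 is 0.9604, which would push the result just above 96. The function names are invented for this sketch.

```python
import math

Z_CRITICAL = 1.96        # Z value used in the chapter for the .05 level
ORDINAL_CONSTANT = 0.96  # the chapter's rounded value of p * q * Z^2 with p = q = .5


def unrounded_group_size(p, tolerance, z=Z_CRITICAL):
    """N = p * q * Z^2 / T^2 before any rounding, with q = 1 - p."""
    return p * (1 - p) * z ** 2 / tolerance ** 2


def ordinal_group_size(tolerance):
    """Per-group N = 0.96 / T^2, rounded up to a whole subject."""
    if not 0 < tolerance <= 1:
        raise ValueError("tolerance must be a proportion greater than 0 and at most 1")
    # round() removes floating-point noise before the ceiling is taken
    return math.ceil(round(ORDINAL_CONSTANT / tolerance ** 2, 9))


# p = .5 maximizes p * q, so any other p gives a smaller estimate of N.
assert unrounded_group_size(0.5, 0.10) > unrounded_group_size(0.3, 0.10)

per_group = ordinal_group_size(0.10)
print(per_group, 3 * per_group)
```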



Sample Sizes for Interval Data



Remember, an optional formula for the t-test for difference with interval data was as follows:

$$t = \frac{\overline{X} - \mu}{\sqrt{S^2 / N}}$$

If we modify the equation algebraically, we can derive the following:

$$\frac{t \sqrt{S^2 / N}}{\overline{X} - \mu} = 1$$

If we let the difference between means define a tolerance "T" (or a difference needed for significant results), then after squaring both sides of the equation and a little algebraic manipulation we have:

$$\frac{S^2 t^2}{T^2} = N$$

Now, let us assume that the score for each individual was tabulated as a Z-score, or (Score - mean)/S. Then the standard deviation in our equation becomes one (1) and the formula reads as follows:

$$\frac{t^2}{T^2} = N$$

If you examine the t-test table at the .05 level of confidence with any reasonable estimated sample size, say more than 20, you will see that the t-values stabilize at about 2.00. Therefore, let us assume a t of 2.00 as our estimate; then our formula reads as follows:

$$\frac{4}{T^2} = N$$

Tolerance, considering that our observations were recorded as Z-scores, can now be seen as the proportion of a standard deviation needed to obtain a significant effect (if a significant effect is indeed present). For example, suppose that we wanted to estimate the needed sample size for an experiment in which we would like to find a 40% difference between our means as statistically significant. Then we would need a sample of:

$$\frac{4}{.4^2} = \frac{4}{.16} = 25$$

This is the recommended sample size for each of the groups in the experiment. Suppose that in our experiment there were three groups; then we would need 75 subjects, 25 for each group. You do have some flexibility in the exact group sizes: you might have 20 subjects in one group and 30 in another. At the end of this chapter is a table which summarizes the recommended sample sizes for various tolerance levels.
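
The interval-data formula can be tried out with a short Python sketch. It assumes, as the chapter does, that scores are Z-scores (so S = 1) and that t is about 2.00; the function name and the `t_value` and `std_dev` parameters are illustrative choices, not part of the chapter.

```python
import math


def interval_group_size(tolerance, t_value=2.00, std_dev=1.0):
    """Per-group N = S^2 * t^2 / T^2, rounded up to a whole subject.

    With the defaults (Z-scores, t of about 2.00) this reduces to N = 4 / T^2.
    """
    if tolerance <= 0:
        raise ValueError("tolerance must be greater than 0")
    # round() removes floating-point noise before the ceiling is taken
    return math.ceil(round(std_dev ** 2 * t_value ** 2 / tolerance ** 2, 9))


# The chapter's example: a 40% tolerance with three groups.
per_group = interval_group_size(0.40)
print(per_group, 3 * per_group)
```

Leaving `std_dev` as a parameter makes it easy to see what happens when the scores are not standardized: the required N scales with the variance.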



Sample size chart



To save you some time, many of the recommended sample sizes have been prepared in the table below. These numbers represent the recommended sample sizes to determine significance (at the .05 level of confidence) for experiments with varying tolerances. Tolerance, again, means the size of the difference you would like to find as statistically significant (I recommend at least 30%).

             Dependent Variable Data Types

Tolerance    Nominal    Ordinal    Interval

Small 50%    -          -          11







For example, suppose that you intend to have three groups in your experiment, the dependent variable is interval data, and you would like to find a 40% difference as statistically significant. You would need 75 subjects in your study (25 for each of the three groups).
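
If you would rather compute per-group sizes than look them up, the three formulas from this chapter can be tabulated with a short Python sketch. The tolerance levels listed are arbitrary choices for illustration, and the output is simply what the formulas give; it is not a reproduction of the chapter's own table.

```python
import math


def round_up(value):
    """Round up to a whole subject, ignoring floating-point noise."""
    return math.ceil(round(value, 9))


def recommended_sizes(tolerance):
    """Return (nominal, ordinal, interval) sizes at the .05 level."""
    nominal = round_up(3.84 / tolerance)       # N = 3.84 / T, per cell
    ordinal = round_up(0.96 / tolerance ** 2)  # N = 0.96 / T^2, per group
    interval = round_up(4 / tolerance ** 2)    # N = 4 / T^2, per group
    return nominal, ordinal, interval


# Hypothetical tolerance levels chosen for illustration only.
TOLERANCES = [0.10, 0.20, 0.30, 0.40, 0.50]

print(f"{'Tolerance':<10}{'Nominal':>8}{'Ordinal':>8}{'Interval':>9}")
for tolerance in TOLERANCES:
    nominal, ordinal, interval = recommended_sizes(tolerance)
    print(f"{tolerance:<10.0%}{nominal:>8}{ordinal:>8}{interval:>9}")
```

Multiply the per-cell or per-group figure by the number of cells or groups in your design to get the total number of subjects.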



Final Comments On Sample Size



This presentation avoided many issues which relate to sample size, like the chance of type II error and the power of the statistical method. Perhaps the largest issue not considered in this presentation is the politics surrounding your advisory committee (if you are writing a dissertation/thesis). Often you will find that the needed sample size, which was determined from some rational process (like the one described in this chapter), is not sufficient in your advisor's opinion. In this situation you can cast rational explanations of sample size 'to the wind.' If you encounter this situation then try doubling the sample size. If that fails to work, you may consider calling for some emotional advice; my phone number is (619) 744-1774 and my E-mail address on the INTERNET is Weedman@Elfin.con.

