
ANALYSING DATA

First common questions when collecting data:

1. How big a sample to take?
At least 10 in each sample from the ‘common’ site, to compare with fewer at the ‘rare’ site. Even with 10, a miscount of +/-1 gives a 10% error, so prefer more if possible. In short: as many as possible!

2. How many replicate samples to take?
Imagine:

Number of Gammarus

      Site A          Site B
      a      b        a      b
      14     12       8      10
      16     6        4      2
      16     17       3      1
      12     14       5      14
      18     9        9      11

Aa vs Ba = enough! No overlap, little variation.
Ab vs Bb = need more! Overlap, high variation.

How many replicates, continued….

Often we need to know whether we have a truly representative ‘mean’ value. If we plot the mean against a successively increasing number of samples, the mean value begins to stabilise around a ‘true’ mean. Whatever you want to find out, a trial run helps.

Coping with variation

All biological data vary; we just have to cope with this! Human height data: values overlap, and variation is symmetrical about the mean.

Coping with variation, continued….

“Are men and women of different height?” The data overlap, so we cannot answer by inspection! We need a system of assigning a probability that the data sets are genuinely different: STATISTICS!

“Are men and women of different height?” We need to know about the mean heights and about the variation about the mean. We cannot measure the whole population, so we use a sample. If there is no overlap in the data, you do not need statistics to show that a difference exists!

Measuring variation about the mean….

      Data (x)    x − x̄    (x − x̄)²
      6           +1        1
      7           +2        4
      3           −2        4
      9           +4        16
      0           −5        25
Sum   25          0         50

Mean x̄ = 5; n = number of samples = 5

The first measure of variation about the mean is the standard deviation of the sample, ‘s’:

$$ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} $$

Variance = square of s, or s².

If you want to find out how likely it is that two samples of data come from different populations, and the variation is normally distributed (bell shaped), use a Student's ‘t’ test.

‘t’ test, continued….

The test gives a value of ‘t’; use a table to look up the value that ‘t’ must exceed for any given probability value and sample size.
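As a rough check on the arithmetic, the sketch below recomputes s for the worked data (6, 7, 3, 9, 0) and then calculates a pooled-variance Student's t for the Gammarus counts. It assumes the flattened count table reads as four columns (Aa, Ab, Ba, Bb) and that the two samples have similar variances; the function name `pooled_t` is made up for illustration, and the critical value quoted in the comment comes from a standard two-tailed t table, not from the slides.

```python
import math
import statistics

# Worked example from the "Measuring variation about the mean" table.
data = [6, 7, 3, 9, 0]
mean = statistics.mean(data)                      # 5
sum_sq = sum((x - mean) ** 2 for x in data)       # sum of squares, 50
variance = sum_sq / (len(data) - 1)               # s squared
s = math.sqrt(variance)                           # standard deviation of sample


def pooled_t(a, b):
    """Student's t for two independent samples (hypothetical helper).

    Assumes both samples are roughly normal with similar variances.
    """
    na, nb = len(a), len(b)
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean_a - mean_b) / std_err


# Gammarus counts, ASSUMING the columns are Aa, Ab, Ba, Bb.
site_Aa = [14, 16, 16, 12, 18]
site_Ab = [12, 6, 17, 14, 9]
site_Ba = [8, 4, 3, 5, 9]
site_Bb = [10, 2, 1, 14, 11]

t_a = pooled_t(site_Aa, site_Ba)   # little overlap between the samples
t_b = pooled_t(site_Ab, site_Bb)   # heavy overlap between the samples
df = len(site_Aa) + len(site_Ba) - 2

# Standard t tables give 2.306 as the two-tailed value for P = 0.05, df = 8.
print(f"s = {s:.2f}, variance = {variance:.1f}")
print(f"Aa vs Ba: t = {t_a:.2f} (df = {df})")
print(f"Ab vs Bb: t = {t_b:.2f} (df = {df})")
```

Under these assumptions the non-overlapping pair gives a t well above the tabulated value and the overlapping pair gives one well below it, which matches the ‘enough’ and ‘need more’ verdicts on the replicate table.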
We use a convention that if the probability of obtaining the observed difference between the means by chance alone is less than 1 in 20, we can accept that a difference exists that is likely to be due to a biological reason, rather than just to chance. ‘1 in 20’ can be thought of as 5%, or P = 0.05.

If ‘t’ exceeds the value for P = 0.05 (or a lower P), then we report the difference, giving both P and the biological difference: ‘There is a real difference in the heights of men and women, men generally being taller (P<0.05)’.

Note that, if P is taken as 1 in 20, then if we tested the situation 20 times we would expect to get one false positive difference, even if there was no real difference between the samples being tested (a Type 1 error).

Differences between samples, continued….

If you have more than two samples to compare, use Analysis of Variance (ANOVA). ANOVA and ‘t’ tests are parametric tests. If the variation is not normal, or is very different in scale between the samples, use a non-parametric test, based on ranking the data. It is then not correct to use the ‘mean’; it is better to use the ‘median’ to describe the ‘central tendency’. The ‘median’ is the middle value of a ranked series of numbers. (The ‘mode’ is the most frequent value.)

Normal and non-normal data….

           A                      B                      C
           x    x − x̄  (x − x̄)²   x    x − x̄  (x − x̄)²   x    x − x̄  (x − x̄)²
           5    0      0          21   +16    256        3    −2     4
           5    0      0          1    −4     16         8    +3     9
           5    0      0          1    −4     16         4    −1     1
           5    0      0          1    −4     16         5    0      0
           5    0      0          1    −4     16         5    0      0

Mean x̄          5                      5                      5
Sum of x − x̄    0                      0                      0
Sum of (x − x̄)² 0                      320                    14

As the ‘sum of squares’ increases, so does s².

Rule of thumb:
‘Regular’ = small s²/mean (<1)
‘Random’ = mid s²/mean (approximately 1)
‘Clumped’ = high s²/mean (>1)

Or use the Anderson-Darling normality test. For non-normal data, either transform the data to restore them to normality (and check that they have become normal), or use a ranking test.

Ranking tests….

Rank    A     B      Rank
1       2     7      3
2       4     10     6
4       8     13     8
5       9     18     9
7       12    465    10

Sum of ranks: 19 (A), 36 (B)

n1 = number of counts in A, n2 = number of counts in B
R1 = sum of ranks in A, R2 = sum of ranks in B

Mann-Whitney ‘U’ test:

$$ U_1 = n_1 n_2 + \frac{n_2(n_2 + 1)}{2} - R_2 $$

$$ U_2 = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R_1 $$

Look up the smaller of U1 and U2.
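The ranking table can be reproduced with a few lines of Python. This is only a sketch: it assumes there are no tied values (true for these data, but ties would need averaged ranks), and it takes U2 to use R1, the sum of ranks in A, so that U1 + U2 = n1 × n2. The variable names are illustrative.

```python
# Data from the ranking table: sample A and sample B.
a = [2, 4, 8, 9, 12]
b = [7, 10, 13, 18, 465]

# Rank the pooled values from smallest (rank 1) to largest.
# ASSUMPTION: no ties, so each value maps to exactly one rank.
pooled = sorted(a + b)
rank = {value: position + 1 for position, value in enumerate(pooled)}

r1 = sum(rank[x] for x in a)   # sum of ranks in A
r2 = sum(rank[x] for x in b)   # sum of ranks in B
n1, n2 = len(a), len(b)

# n(n + 1) is always even, so integer division is exact here.
u1 = n1 * n2 + n2 * (n2 + 1) // 2 - r2
u2 = n1 * n2 + n1 * (n1 + 1) // 2 - r1
u = min(u1, u2)                # the value to look up in a U table

print(f"R1 = {r1}, R2 = {r2}")
print(f"U1 = {u1}, U2 = {u2}, smaller U = {u}")
```

Note that the extreme value 465 contributes only its rank (10), which is why a ranking test is not distorted by it in the way a mean would be.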
A small U indicates a difference.

Non-parametric test and its parametric equivalent:
Mann-Whitney ‘U’ test: Student's ‘t’ test
Kruskal-Wallis test: ANOVA

Relationships….

Investigating relationships, often gradients of various sorts, rather than numbers in categories. E.g. exposed to sheltered; wet to dry.

‘Is there a trend along a gradient?’

[Figure: two plots, one labelled ‘High r²’ and one labelled ‘Low r²’]

y = intercept + slope × x (y = a + bx)
Compare the slope to 0, with reference to r².

Relationships, continued….

For non-parametric data use Spearman rank correlation. Remember that the relationship may be non-linear.

Comparing numbers in categories

Top of tree    Middle    Bottom of tree
21             46        5

H0: there is no difference in numbers between categories.

Use the χ² test, where O is the observed count and E is the expected count in each category (here E = 72/3 = 24):

$$ \chi^2 = \sum \frac{(O - E)^2}{E} $$

χ² = (21 − 24)²/24 + (46 − 24)²/24 + (5 − 24)²/24
χ² = 0.38 + 20.17 + 15.04 = 35.59

Look up for degrees of freedom = n − 1 (i.e. 2), where n is the number of categories. Find the value of P that χ² exceeds: P<0.001. So: reject H0; there is a difference!

FINALLY….

Have a clear question! Have a clear null hypothesis!

1. Explore your data:
   Look at it
   Graph it
   Tabulate it
   Think about it
   Confirm that it is appropriate!

2. Decide if you need statistics:
   Are differences obvious?
   What test is correct?
   Are data sufficient for the test?
   Do the test if needed

3. Support or reject H0:
   Report P value clearly
   Report biological result clearly!

GOOD LUCK!!
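For anyone wanting to check the chi-squared arithmetic for the tree-position counts, a minimal sketch follows. It assumes the null hypothesis means equal expected counts in every category (72/3 = 24), as the worked figures imply; the critical value in the comment is taken from a standard chi-squared table rather than from the slides.

```python
# Observed counts in each category (from the tree-position example).
observed = {"top of tree": 21, "middle": 46, "bottom of tree": 5}

# ASSUMPTION: under H0 the total is spread equally across categories.
total = sum(observed.values())
expected = total / len(observed)

# Chi-squared: sum over categories of (O - E)^2 / E.
terms = {name: (o - expected) ** 2 / expected for name, o in observed.items()}
chi_sq = sum(terms.values())
df = len(observed) - 1

# Standard tables give 13.82 as the value for P = 0.001 with df = 2.
CRITICAL_P001_DF2 = 13.82
reject_h0 = chi_sq > CRITICAL_P001_DF2

for name, term in terms.items():
    print(f"{name}: {term:.2f}")
print(f"chi-squared = {chi_sq:.2f}, df = {df}, reject H0: {reject_h0}")
```

Summing the unrounded terms gives about 35.58 rather than 35.59; the small gap comes from rounding each term to two decimal places before adding, and it does not change the conclusion.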

Posted: 2/12/2010