# Data Analysis by jennyyingdi

Data Analysis
Using Statistics
• There are several reasons researchers use
statistics in their research.
–   To describe
–   To identify relationships
–   To determine if there are differences
–   To identify other variables that may be
impacting on the research
To Describe
• Descriptive statistics are used to provide an
overview of the data
• Typically, the population being studied is
described statistically. This helps the reader
of the research to see if the research can be
generalized to other groups.
Descriptive Statistics
• The Mean – This is simply the average.
• The Median – This is the halfway number.
Half of the scores are above this number and
half are below. When one or two extreme
scores may skew the mean, the median is a
better descriptor.
• The Mode – This is the score found most
often in the data.
Descriptive Statistics
• The Range – This is the “distance” between
the highest score and the lowest score.
• Maximum – The highest score in a range.
• Minimum – The lowest score in a range.
Descriptive Statistics
• Standard Deviation – A number that
tells how close together or how spread
out the scores are. The smaller the
number the more closely grouped the
scores are.
• Typically, we like to see closely
grouped scores.
Standard Deviation
• Assume we have three groups of scores
– Group 1 – 1,2,3,4,5,6,7,8,9,10
– Group 2 – 5,5,5,5,5,5,5,5,5,5
– Group 3 – 4,5,6,4,5,6,4,5,6,5
• The mean score is 5.5 for the first group and
5.0 for the other two, and the Standard Deviation
is about 3.0, 0, and 0.8 respectively. Without
looking at the raw scores, the SD tells us there is
great variation in scores for the first group, none
in the second, and a little variation in the third.
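The three groups can be checked directly. This is a minimal sketch that assumes the sample standard deviation (n - 1 in the denominator); the slides do not say which form was used. Note that the first group's mean comes out as 5.5.

```python
import statistics

groups = {
    "Group 1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Group 2": [5, 5, 5, 5, 5, 5, 5, 5, 5, 5],
    "Group 3": [4, 5, 6, 4, 5, 6, 4, 5, 6, 5],
}

# Store (mean, sample standard deviation) for each group.
summary = {}
for name, scores in groups.items():
    summary[name] = (statistics.mean(scores), statistics.stdev(scores))
    print(name, "mean:", summary[name][0], "SD:", round(summary[name][1], 1))
```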
Standard Deviation
• It is accepted practice to also report the
Standard Deviation when you are reporting
the Mean in a research report.
• The Standard Deviation and the Variance
(the Standard Deviation is the square root of
the variance) are important in calculating
statistical tests.
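The relationship between the two measures can be confirmed in a couple of lines. This sketch reuses the third group of scores and again assumes the sample (n - 1) form of both statistics.

```python
import math
import statistics

scores = [4, 5, 6, 4, 5, 6, 4, 5, 6, 5]

variance = statistics.variance(scores)  # sample variance
sd = statistics.stdev(scores)           # sample standard deviation

# The standard deviation is the square root of the variance.
print("Variance:", round(variance, 3))
print("Square root of variance:", round(math.sqrt(variance), 3))
print("Standard deviation:", round(sd, 3))
```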
Statistics Identify Relationships
• In an earlier lesson we discussed the various
types of correlations and how to interpret
them. Correlational statistics are used to
identify relationships between variables.
– You might want to review this information.
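As a brief reminder of how a correlation is computed, here is a minimal sketch of the Pearson correlation coefficient. The function name `pearson_r` and the data (hours studied and test scores) are hypothetical and chosen only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical data: hours studied and the resulting test score.
hours_studied = [1, 2, 3, 4, 5, 6]
test_scores = [52, 55, 61, 64, 70, 75]

r = pearson_r(hours_studied, test_scores)
print("r =", round(r, 3))
```

A value of r near +1 or -1 indicates a strong relationship; a value near 0 indicates little or no linear relationship.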
Statistics are Used to Identify Differences
• When we conduct research, we often
want to know if the difference between two
groups is a “real” difference or if it just
happened by chance.
• There are several statistics used to
accomplish this.
t-test
• The t-test is used to determine if there
are differences between two groups
when the dependent variable is interval
or ratio.
– Examples: test scores, job satisfaction scores, salary.
t-test
• There are two types of t-tests
– Independent (two-sample) t-test – is used for
comparing two separate groups of individuals. Is
group A different from group B?
– Paired t-test – is used for comparing the same
group of individuals on two scores (such as a
pretest score and a posttest score).
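The two kinds of t-test can be sketched as follows. This is an illustration only: the function names and all scores are hypothetical, and the independent version assumes the two groups have equal variances (the classic Student's t-test), which the slides do not specify. In practice a statistics package does this work.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """t and degrees of freedom for two separate groups (equal variances assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / std_err
    return t, df

def paired_t(first_scores, second_scores):
    """t and degrees of freedom for the same individuals measured twice."""
    diffs = [second - first for first, second in zip(first_scores, second_scores)]
    n = len(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / std_err, n - 1

# Hypothetical scores for two separate groups.
group_a = [1, 2, 3, 4, 5]
group_b = [3, 4, 5, 6, 7]
print("Independent:", independent_t(group_a, group_b))

# Hypothetical pretest and posttest scores for the same five individuals.
pretest = [10, 12, 9, 11, 13]
posttest = [12, 14, 10, 13, 16]
print("Paired:", paired_t(pretest, posttest))
```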
t-test
• The result of a t-test is a value for t,
such as t=4.61
• Unlike a correlation, the value of t cannot be
interpreted on its own.
– You must look at the P value (probability) associated
with the t-test. If the value of P is equal to or less than
.05, we can conclude the difference between the two
groups is unlikely to be due to chance. In other words,
our findings are statistically significant.
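To show where the P value comes from, the sketch below computes a two-sided P value by numerically integrating the t distribution with Simpson's rule. The function names are hypothetical, and the degrees of freedom (20) paired with the t=4.61 example are an assumption, since the slide gives no sample size. Statistical software normally reports P for you.

```python
import math

def t_density(x, df):
    """Probability density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided P value: the chance of a t at least this extreme if only chance is at work."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    # Simpson's rule over [0, |t|]; steps must be even.
    h = upper / steps
    total = t_density(0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)

t_value = 4.61   # the example value from the slide
assumed_df = 20  # assumption: the slide does not give degrees of freedom
p = two_sided_p(t_value, assumed_df)
print("P =", p)
print("Statistically significant" if p <= 0.05 else "Not statistically significant")
```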
Analysis of Variance (ANOVA)
• ANOVA is used to determine if there
are differences among three or more
groups when the dependent variable is
interval or ratio.
– Examples: test scores, job satisfaction scores, salary.
ANOVA
• The result of an ANOVA is reported as a value for
F such as F=4.61.
• Unlike a correlation, the value of F cannot be
interpreted on its own.
– You must look at the P value (probability) associated
with the F-value. If the value of P is equal to or less
than .05, we can conclude the differences among the
three (or more) groups are unlikely to be due to chance.
In other words, our findings are statistically significant.
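For readers curious about what lies behind F, here is a minimal sketch of the one-way ANOVA calculation. The function name and the three groups of scores are hypothetical. The sketch stops at F and its degrees of freedom; the associated P value is left to a statistics package.

```python
def one_way_anova_f(groups):
    """F statistic with between-group and within-group degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical scores for three groups.
scores_by_group = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_value, df_b, df_w = one_way_anova_f(scores_by_group)
print("F =", f_value, "with", df_b, "and", df_w, "degrees of freedom")
```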
ANOVA
• One problem with ANOVA arises when we
have statistically significant results.
Because we are dealing with three or
more groups, we still do not know which
group is different. ANOVA doesn’t
identify where the difference lies.
ANOVA
• In order to determine where the differences
are, we have to perform a procedure called
post hoc analysis. This technique allows us
to identify which groups are different from
the other groups.
• There are a variety of post hoc techniques
that can be used. It all depends upon the
characteristics of the data.
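The slides do not name a particular post hoc technique. As one common example, the Bonferroni correction compares every pair of groups and judges each comparison against a stricter cutoff (.05 divided by the number of comparisons). The sketch below only lists the pairs and the adjusted cutoff; the group names and means are hypothetical.

```python
from itertools import combinations

# Hypothetical group means from a study with three groups.
group_means = {"A": 72.0, "B": 75.5, "C": 84.0}
alpha = 0.05

# Every pair of groups must be compared to find where the differences are.
pairs = list(combinations(group_means, 2))

# Bonferroni correction: divide the cutoff by the number of comparisons.
adjusted_alpha = alpha / len(pairs)

for first, second in pairs:
    difference = group_means[second] - group_means[first]
    print(first, "vs", second, "mean difference:", difference)
print("Each comparison is judged against P <=", round(adjusted_alpha, 4))
```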
Chi Square

• Sometimes our dependent variable is
categorical in nature.
– Such as honor roll status; member or not;
socioeconomic status; rural, suburban or urban;
obese or not; etc.
• In that case we use the Chi Square test. This can
be used with two groups, three groups, or more.
Chi Square
• The result of a chi square test is a value for
χ², such as χ² = 4.61
• Unlike a correlation, the value of χ² cannot be
interpreted on its own.
– You must look at the P value (probability)
associated with the χ². If the value of P is equal
to or less than .05, we can conclude the differences
among the groups are unlikely to be due to chance.
In other words, our findings are statistically significant.
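A chi square calculation can be sketched from a table of counts. Everything here is illustrative: the function names and the counts (rural and urban students, on or off the honor roll) are hypothetical. The P value shortcut shown works only for one degree of freedom (a 2 x 2 table), and no continuity correction is applied; larger tables need a statistics package.

```python
import math

def chi_square(table):
    """Chi square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if group and category were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def p_value_df1(stat):
    """P value for a chi square statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

# Hypothetical counts: rows are rural and urban students,
# columns are on the honor roll and not on the honor roll.
counts = [[10, 20],
          [20, 10]]
stat, df = chi_square(counts)
print("Chi square =", round(stat, 2), "df =", df)
if df == 1:
    print("P =", round(p_value_df1(stat), 4))
```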
A Problem
• One of the deficiencies of research in
agricultural and extension education is that it
tends to be simplistic. We may think one
variable is causing the effect and focus solely
on that variable, when in fact several
different variables may be combining to
cause the effect.
An Example
• Attendance at the annual conference of the
Association for Career and Technical
Education has been steadily declining. Why?
– Could it be the membership numbers have
declined, so we should expect a decline in
attendance?
– Or has the rising registration cost resulted in
declining attendance?
– Or is it the location of the conference?
– Or is it something else?
• One major factor could be the problem or it
could be a combination of factors.
The Solution
• There is a statistical technique called
Multiple Regression that examines a number
of independent variables and then identifies
the ones most strongly related to the change
in the dependent variable.
• This procedure can even estimate the
contribution of each independent variable to
the dependent variable.
Multiple Regression
• It can also tell us how much of the
variance (difference) can be explained
by the independent variables we have
selected. There may be other variables
at work that we have not yet identified.
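To make the idea concrete, here is a minimal sketch of a multiple regression fitted by ordinary least squares. The function names are hypothetical, and the conference figures (membership, registration fee, attendance) are invented for illustration; they are not real data. Real analyses would use a statistics package, which also reports significance tests for each variable.

```python
def solve_linear_system(matrix, vector):
    """Solve matrix * x = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    m = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def fit_multiple_regression(rows, y):
    """Return ([intercept, slope1, slope2, ...], R squared)."""
    design = [[1.0] + list(row) for row in rows]  # leading 1.0 is the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    coefs = solve_linear_system(xtx, xty)
    predicted = [sum(c * v for c, v in zip(coefs, d)) for d in design]
    mean_y = sum(y) / len(y)
    ss_residual = sum((yi - p) ** 2 for yi, p in zip(y, predicted))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return coefs, 1 - ss_residual / ss_total

# Invented yearly figures: (members in thousands, registration fee in $100s).
predictors = [(5.0, 3.0), (4.8, 3.2), (4.7, 3.5), (4.5, 3.8), (4.2, 4.0), (4.0, 4.5)]
attendance = [1200, 1150, 1100, 1020, 950, 880]

coefficients, r_squared = fit_multiple_regression(predictors, attendance)
print("Intercept and slopes:", [round(c, 2) for c in coefficients])
print("Share of variance explained (R squared):", round(r_squared, 3))
```

Each slope estimates the contribution of one independent variable, and R squared is the share of the variance in attendance explained by the variables selected.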
The Tools
• How do we go about calculating all of
these statistical tests?
– In the old days these were hand calculated.
It took several hours, even days.
– Today we use a computer.
Statistical Analysis
• Excel has a statistical module. If you do a
“standard” install of Excel, this module is not
loaded. You have to do a custom install and
choose to load the statistical module. (It installs
as standard with Excel 2007.)
• Excel can perform a number of the tests we
have discussed.
Statistical Analysis
• The primary statistical tool used by researchers in
agricultural and extension education is SPSS
(Statistical Package for the Social Sciences)
• This is an extremely powerful software program
and it is easy to use.
• This software is one of the installed applications
on your Novell launcher in the computer labs.
Statistical Analysis
• There are several web sites where you can paste
data and perform statistical analyses.

– Webstats http://www.webstatsoftware.com/

– Vassarstats http://faculty.vassar.edu/lowry/VassarStats.html
