STATISTICS IN EDUCATION
Prof. Madya Dr. Ananda Kumar Palaniappan
Faculty of Education, University of Malaya
Statistics in the Social Sciences, including Education, plays a rather different role from
statistics in the Pure Sciences. However, the theories underlying the statistical methods
employed are essentially the same in both disciplines. In the Social Sciences, statistics is
used to identify issues or problems, to determine the best approach to solving these problems,
and to evaluate the approaches used or test their efficacy. It is also used in validating and
testing the reliability of questionnaires, inventories and other psychometric measures of human
capacities and capabilities in the cognitive, affective and psychomotor domains. Statistics is
also commonly used in sampling procedures, in testing for normality, and in testing assumptions
before statistical methods are employed to test the proposed hypotheses. This paper discusses
some of the issues relating to the use of statistics in these areas and how statistics can best
be used to avoid employing the wrong statistical tests and drawing the wrong conclusions, which
might invalidate the findings.
The role of statistics in research is growing by the day. It is now applied in all
fields of human endeavor including medicine, education, business, sports,
economics, to name a few. The use of statistics in educational research has
grown from just analyzing data for percentages, means and standard deviations
to more sophisticated techniques such as data mining and structural equation
modeling. These statistics have helped researchers to uncover the significant
relationships among the various variables relating to human potentials and
behaviors. The use of statistics has also been facilitated by advances in
technology and software, which have sped up the process. Whereas
calculators were once the norm, today software packages like SPSS, SAS and
AMOS are making the use of statistics in research very convenient and user-friendly.
The word “statistics” is derived from the Latin term statisticum collegium ("council
of state") and from the Italian word statista, which means statesman or politician.
Statistics was first used by states, which collected data on their people for
effective administration. Later, statistics involved not only collecting data but also
analyzing them. This process became more and more sophisticated until today
we find statistics playing a major role in testing the validity of complex models in
education, business and medicine where complex statistical packages like
AMOS, LISREL and SEPATH are used in model testing to inform policy and
Statistics is involved throughout the research process, from its beginning through
data collection, data analysis, interpretation of the output and derivation of
conclusions, to the formulation of implications based on the findings of the
analysis. Descriptive statistics involves summarizing data in terms of measures
such as means and standard deviations. At this stage the data are also commonly
tested for normality using tests like the Kolmogorov-Smirnov and Shapiro-Wilk
tests. Inferential statistics, on the other hand, refers to making inferences about
the population by applying relevant inferential statistical tests to the data from
the sample.
The purpose of this paper is to discuss the various uses of statistics in
educational research. It not only discusses how statistics is used in identifying
problems in the Social Sciences, it also discusses how statistics is used in
determining the best approach for solving problems and testing the efficacy of
programs or approaches designed based on some sound theoretical
propositions. This paper will also discuss how statistics is used in enhancing the
reliability and validity of the instruments used to gather data and ascertaining the
normality of data collected.
Design of instruments
Statistics is increasingly involved in the first stages of research in the social
sciences. In the design of instruments or questionnaires, statistics is used
to determine how inferences about the construct measured can be made. The
design of questionnaires depends on the operational definitions of the constructs
that are to be measured. Without a deep understanding of the constructs
measured by the questionnaire, it is impossible to interpret any result derived
from statistical analysis of the questionnaire. For example, the means and
standard deviations of the items used in measuring a certain construct can only
be interpreted accurately if there is an objective way of deriving meaning
from the statistics. Statistics is also used in the interpretation of the
data, for example, in deciding whether a score of 40 should be considered high,
average or low. Using percentile scores, it is possible to determine these
groupings based on the population norms.
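As a minimal sketch of how such percentile-based grouping might be carried out, the Python snippet below computes the percentile rank of a raw score within a norm group and labels it low, average or high. The norm scores, the cut-offs at the 25th and 75th percentiles, and the function names are illustrative assumptions and do not come from any published norms.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(score, norm_scores):
    """Percentage of the norm group below `score`, counting ties as half."""
    ordered = sorted(norm_scores)
    below = bisect_left(ordered, score)
    ties = bisect_right(ordered, score) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


def classify(score, norm_scores, low_cut=25.0, high_cut=75.0):
    """Label a score using assumed percentile cut-offs (25th and 75th)."""
    rank = percentile_rank(score, norm_scores)
    if rank < low_cut:
        return "low"
    if rank > high_cut:
        return "high"
    return "average"


# Hypothetical norm-group scores, invented purely for illustration.
norms = [22, 25, 28, 30, 31, 33, 35, 36, 38, 40, 41, 43, 45, 47, 50]
print(round(percentile_rank(40, norms), 1), classify(40, norms))
```

With real data, the norm group would be the standardization sample of the instrument, and the cut-offs would follow the conventions stated in its manual.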
Sampling and Sample Size
Sampling and sample size are two important areas where statistics is needed.
Both are extremely important if results are to be considered valid and
generalizable. There are many different methods of sampling: simple random
sampling, stratified sampling, cluster sampling and systematic sampling, among
others. Each involves the use of statistics in ascertaining to what extent the sample is
representative of the population it is drawn from. In simple random sampling,
the sample is drawn from the population such that each member has an
equal chance of being selected. Hence, statistically sound sampling supports the
generalizability of the findings.
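The two most common procedures can be sketched in a few lines of Python. The example below is only an illustration: the function names are hypothetical, and the stratified version assumes proportional allocation, which is one of several possible allocation rules.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement; each member is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def proportional_stratified_sample(strata, n, seed=None):
    """Sample each stratum in proportion to its share of the population.

    `strata` maps a stratum name to the list of its members. Because each
    stratum size is rounded, the total may differ slightly from n.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(n * len(members) / total)
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical sampling frame of 1000 student IDs: 400 boys and 600 girls.
frame = {"boys": list(range(400)), "girls": list(range(400, 1000))}
chosen = proportional_stratified_sample(frame, 50, seed=1)
print({name: len(ids) for name, ids in chosen.items()})
```

Fixing the seed makes the draw reproducible, which is useful when the sampling procedure has to be documented in a report.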
Determination of the sample size is another process that involves statistics.
The ‘right’ sample size for a particular study can be calculated using
Cochran’s (1977) formula, which is based on the nature of the data or
variable, the alpha level and the standard deviation. A complete discussion of this
can be found in Bartlett, Kotrlik & Higgins (2001).
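A rough sketch of the calculation is given below, using the commonly cited forms of Cochran's formula: n0 = t²s²/d² for continuous data and n0 = t²pq/d² for categorical data, with a correction for finite populations. Here t is the critical value for the chosen alpha level, s the estimated standard deviation, p the estimated proportion (q = 1 - p) and d the acceptable margin of error. The function names and the numerical inputs are illustrative assumptions.

```python
def cochran_continuous(t, s, d):
    """Initial sample size for continuous data: n0 = (t^2 * s^2) / d^2."""
    return (t ** 2) * (s ** 2) / (d ** 2)


def cochran_categorical(t, p, d):
    """Initial sample size for categorical data: n0 = (t^2 * p * q) / d^2."""
    return (t ** 2) * p * (1 - p) / (d ** 2)


def finite_population_correction(n0, population_size):
    """Adjust n0 when it is a sizeable fraction of the population."""
    return n0 / (1 + n0 / population_size)


# Assumed inputs: alpha = .05 (t = 1.96), s = 1.167, margin of error d = 0.21.
n0 = cochran_continuous(1.96, 1.167, 0.21)
print(round(n0, 1))

# Assumed inputs: p = .5 (maximum variability), d = .05, population of 1679.
n0_cat = cochran_categorical(1.96, 0.5, 0.05)
print(round(n0_cat, 1), round(finite_population_correction(n0_cat, 1679), 1))
```

In practice the result is rounded to a whole number of respondents and is often inflated further to allow for the expected non-response rate.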
Pilot tests are conducted to test the validity and the reliability indices of the
instruments used especially when these instruments measure a certain
psychological construct like motivation or perception or a work-related construct
like job-satisfaction or performance.
Among the validity checks requiring statistical tests is criterion-related validity,
which requires the Pearson product-moment correlation. To assess the internal
reliability of the instrument, Cronbach’s alpha is ascertained. To determine
the number of factors assessed by a certain instrument, factor analysis is used.
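For readers who wish to see what these two statistics compute, the following Python sketch implements the Pearson product-moment correlation and Cronbach's alpha from their textbook definitions. The pilot responses are invented for illustration, and the function names are hypothetical. Packages such as SPSS report the same quantities directly.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson product-moment correlation between two lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(responses):
    """Cronbach's alpha; `responses` holds one list of k item scores per respondent."""
    k = len(responses[0])
    item_columns = list(zip(*responses))
    sum_item_var = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum_item_var / total_var)


# Hypothetical pilot data: five respondents answering three Likert items.
pilot = [[4, 5, 4], [3, 3, 2], [5, 5, 5], [2, 2, 3], [4, 4, 4]]
print(round(cronbach_alpha(pilot), 3))
```

For criterion-related validity, pearson_r would be applied to the total scores on the new instrument and the scores on the established criterion measure.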
The hypothesis testing stage of research in education involves many processes in which
various forms of statistical testing are used to test the hypotheses formulated at
the beginning of the research process, that is, in the research conceptualization
stage. Unlike the pure sciences, research in the social sciences requires that a
thorough rationale for the study be formulated based on theory and findings in the
literature. This hypothesis testing stage involves the formulation of the null
hypothesis and the corresponding alternative hypothesis.
In the social sciences, there are essentially two major types of statistical tests:
parametric and non-parametric tests. Parametric tests are used when the
variables have normal distributions and the subjects are randomly selected from
the population. Normality is usually checked by ascertaining the skewness and kurtosis.
When these two values lie between -2 and +2, the variables are considered
normally distributed and are thus amenable to parametric statistical tests like the
Pearson product-moment correlation, r (for correlational tests), or t-tests and
analysis of variance (ANOVA) (for comparison of groups). Normality is also
statistically tested using the Kolmogorov-Smirnov or Shapiro-Wilk tests.
However, these two tests impose very stringent criteria for normality.
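The skewness and kurtosis rule of thumb can be illustrated with a short Python sketch. It uses the simple moment-based definitions of skewness and excess kurtosis. This is an assumption made for brevity, since statistical packages such as SPSS report bias-adjusted versions that differ slightly in small samples. The data and function names are hypothetical.

```python
def central_moment(data, order):
    """Population central moment of the given order."""
    m = sum(data) / len(data)
    return sum((x - m) ** order for x in data) / len(data)


def skewness(data):
    """Moment-based skewness: m3 / m2^1.5."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (0 for a normal curve)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3


def within_rule_of_thumb(data, limit=2.0):
    """True when both skewness and excess kurtosis lie between -limit and +limit."""
    return abs(skewness(data)) <= limit and abs(excess_kurtosis(data)) <= limit


# Hypothetical scores: one roughly symmetric set and one with an extreme value.
symmetric = [1, 2, 3, 4, 5]
skewed = [1, 1, 1, 1, 1, 1, 1, 1, 1, 20]
print(within_rule_of_thumb(symmetric), within_rule_of_thumb(skewed))
```

A variable failing this check would normally be examined for outliers or analysed with a non-parametric alternative.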
The probability level for the significance of the findings is also set in advance. In
the social sciences this is usually the .05 level of significance (α = .05), which
corresponds to a 95% confidence level.
The next stage of hypothesis testing in educational research involves selecting
the right distribution to test the hypotheses proposed. The
normal distribution, t-distribution, F-distribution and chi-square distribution
are commonly used in testing the significance of the calculated values.
Based on the level of significance and the degrees of freedom involved, the
distribution table is used to obtain the critical values against which
the calculated statistics are tested. Depending on whether the calculated statistic falls
in the “Reject Ho” zone or the “Non-reject” zone, a decision is made on whether
to reject the null hypothesis or not.
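The decision rule can be sketched in Python for the simplest case, a z-test based on the standard normal distribution, where the critical value is obtained from the standard library instead of a printed table. This is only an illustration: the function names are hypothetical, the one-tailed branch assumes an upper-tail test, and t, F or chi-square tests would take their critical values from the corresponding distributions and degrees of freedom.

```python
from statistics import NormalDist


def critical_z(alpha, two_tailed=True):
    """Critical value of the standard normal distribution for a given alpha."""
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)


def decide(z_stat, alpha=0.05, two_tailed=True):
    """Compare the calculated statistic with the critical value."""
    crit = critical_z(alpha, two_tailed)
    observed = abs(z_stat) if two_tailed else z_stat
    return "reject H0" if observed > crit else "do not reject H0"


print(round(critical_z(0.05), 2))   # two-tailed critical value at alpha = .05
print(decide(2.50), "|", decide(1.20))
```

Statistical packages usually report the p value directly, so the same decision is made by comparing p with the chosen alpha.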
Another important concept in hypothesis testing is that of Type I and Type II errors.
A Type I error is said to have taken place when a true null hypothesis is rejected,
while a Type II error occurs when a false null hypothesis is not rejected. This is
shown in Figure 1.
Figure 1 Type I and Type II Errors (Source: Internet)
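The probability of a Type I error is called α while the probability of a Type II error
is β. The two are not complements of each other; rather, the power of a test is 1 – β,
and for a fixed sample size there is a trade-off between them. If we try to decrease α, β
increases.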
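The trade-off between the two error probabilities can be illustrated numerically. The sketch below assumes a one-tailed z-test of a mean with known standard deviation, and the effect size, standard deviation and sample size are invented for illustration. It computes β for several values of α, with power given by 1 - β.

```python
from statistics import NormalDist


def type_ii_error(alpha, effect, sigma, n):
    """Beta for an upper-tailed z-test when the true mean exceeds the
    hypothesized mean by `effect` (known sigma, sample size n)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)          # critical value under H0
    shift = effect * n ** 0.5 / sigma      # standardized true difference
    return z.cdf(z_crit - shift)           # P(fail to reject H0 | H1 true)


# Assumed scenario: true difference of 0.5, sigma = 1, n = 25.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, 0.5, 1.0, 25)
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}  power = {1 - beta:.3f}")
```

Under these assumptions β grows as α is made smaller, so the only way to reduce both at once is to increase the sample size or to measure the effect more precisely.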
In the social sciences, the results of statistical analyses are reported in a
standard format called the APA format. APA stands for American Psychological
Association. In this format, the results of a t-test, for example, are reported as
t (294) = 3.67, p < .01, with effect size = .43 and power = .97, while the results of an
F-test are reported as F (2, 38) = .632, p > .05. The numbers in parentheses
represent the degrees of freedom, df.
An example of multiple t-test comparison results is shown in Table 1.
Table 1 Comparisons of Multitalent Perception of Boys and Girls
Multitalent Perception    Boys (n = 142)    Girls (n = 154)    t        Effect size
                          M        SD       M        SD
Versatility Index         31.39    7.44     28.19    7.56      3.67*    .43
Artistry                  6.26     3.65     5.44     3.23      2.06*    .24
Musical                   6.77     3.38     6.35     3.20      1.09     .13
Creative Imagination      8.03     1.53     7.21     1.90      4.11*    .47
Initiative                3.15     1.25     2.72     1.38      2.84*    .33
Leadership                3.27     .95      3.01     1.04      2.20*    .26
*With Bonferroni adjustment for multiple comparisons, the .05 level corresponds to p = .008 per comparison.
The effect sizes were calculated based on Hedges’s bias-corrected estimate of mean
differences. The Bonferroni adjustment is made by dividing the overall level of
significance by the number of comparisons undertaken. In the example
above it is .05/6 = .008. So each comparison is tested for significance at p = .008 instead
of .05 because of the multiple comparisons involved.
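The calculations behind one row of the table can be sketched from the summary statistics alone. The Python example below assumes a pooled-variance independent-samples t-test and the usual small-sample correction factor 1 - 3/(4df - 1) for Hedges's estimate. The source does not state which t-test variant was used, so this is an illustration and not a reconstruction of the original analysis. The function names are hypothetical.

```python
from math import sqrt


def pooled_t_and_hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Return (t, g) for two independent groups from summary statistics."""
    df = n1 + n2 - 2
    pooled_sd = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / pooled_sd              # standardized mean difference
    t = d / sqrt(1 / n1 + 1 / n2)          # pooled-variance t statistic
    g = d * (1 - 3 / (4 * df - 1))         # Hedges's bias correction
    return t, g


def bonferroni_alpha(overall_alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni adjustment."""
    return overall_alpha / n_comparisons


# Versatility Index row of Table 1: boys (n = 142) versus girls (n = 154).
t, g = pooled_t_and_hedges_g(31.39, 7.44, 142, 28.19, 7.56, 154)
print(f"t(294) = {t:.2f}, g = {g:.2f}, "
      f"per-test alpha = {bonferroni_alpha(0.05, 6):.3f}")
```

Under these assumptions the sketch gives t = 3.67 and an effect size of .43 for the Versatility Index, matching the first row of Table 1, and a per-comparison level of .008.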
Uses of Statistics in Educational Research
Statistics is increasingly used in evaluating solutions and programs in terms of
their efficacy and efficiency in solving problems. This is normally undertaken in
experimental studies with experimental and control groups where pretests and
post-tests are involved. The statistical test normally used is the analysis of
covariance, or ANCOVA. In this analysis, the pretest scores are used as covariates,
especially in quasi-experimental designs.
Similar analysis is also possible using AMOS, which can be used to compare models
involving the treatment as a mediating variable in the model (Palaniappan, 2008).
Statistics is also used in testing the reliability and validity of instruments
constructed based on sound theories. In item analyses, the Cronbach alpha
statistic is normally used to test the internal reliability of the test. Factor analysis
is used to ascertain the number of factors involved in a certain construct.
Common Issues in Statistical Analyses in Education
There are a number of common statistics-related issues that lead to the invalidation
of research. Some are listed below:
1) Wrong sampling procedures
2) Using instruments or questionnaires that are not reliable or valid
3) Not screening the data for outliers before analyzing
4) Not checking the normality of data before employing parametric statistical tests
5) Analyzing data without first testing the distribution
6) Using the wrong statistical analyses to test the hypotheses
7) Interpreting the SPSS output inaccurately
8) Overgeneralizing from studies undertaken on limited or convenience samples.
Most researchers in the social sciences are aware of these common pitfalls in
statistical analyses, and efforts are usually taken to avoid them so as to
produce publishable research fit for international journals, which are usually peer-reviewed.
It can be said that statistics has led to many discoveries in the social sciences.
Among the contributing techniques is factor analysis, which has led to the development of
instruments that assess intelligence, self-concept, creativity and other
psychological constructs accurately. Item analyses have helped in the
development of instruments with high internal reliability. Parametric and
non-parametric tests have enabled researchers to make inferences about the actual
population based on analyses of samples collected randomly from the
population. Hence, it can be said that statistics will continue to be an important tool in
the discovery of more constructs and the understanding of the various psychological
factors that we as human beings are largely dependent on.
Bartlett, J. E., Kotrlik, J. W. & Higgins, C. (2001). Organizational research:
Determining appropriate sample size in survey research. Information
Technology, Learning and Performance Journal, 19(1), 43-50.
Cochran, W. G. (1977). Sampling techniques (3rd ed.). New York: John Wiley & Sons.
Palaniappan, A. K. (2007). Sex differences in Multi-talent perceptions of
Malaysian students. Perceptual and Motor Skills, 105, 1052-1054.
Palaniappan, A. K. (2007). SPSS in Educational Research. Kuala Lumpur:
Palaniappan, A. K. (2008). AMOS in Educational Research. (In press)