# DATA HANDLING by rxk45T

```
VULA MATHEMATICAL LITERACY WORKSHOP
7 – 10 JUNE 2008

Data Handling

INTRODUCTION

Statistics is the branch of mathematics that focuses on collecting, organising, displaying,
analysing and interpreting information (called data).
In other words, Statistics involves collecting information, organising the information into a
form that is manageable and easy to work with, constructing charts to represent the
information, and then interpreting and analysing the information in order to make
predictions for the future.

The picture below shows the 5 parts that make up any statistical process.

    The first step in every statistical process is to pose a question. The question that
you pose will determine the type of data that you need to collect and the way in
which the data must be collected, organised and represented.

    Every statistical process involves collecting data. Sometimes the data is already
available, and sometimes you will have to collect it yourself using surveys or
questionnaires.

    Once the data has been collected, the data must be organised.
This is done using:
 a variety of tables – tally table, frequency table, cumulative frequency table;
 a variety of different measures – measures of central tendency and measures of spread.

All materials developed by March North
Tel: 083 627 8188, mnorth@stannes.co.za
Working for Mathematically Literate Mathematicians

    The next step in the statistical process is to represent the data graphically by
means of bar, pie, line, scatter-plot and cumulative frequency graphs. The type of
graph used depends entirely on the nature of the data being handled.

    The next step, and the most important step in the statistical process, is analysing
the data. This involves studying the tables, measures and graphs that have been
calculated and constructed and making deductions and decisions relating to the
original question that was posed. Without this step, the statistical process is
meaningless.

    It is important to note that the arrows move both ways in the statistical process.
This means that every step in the process is dependent on the success of the
step before.
That is, if the data that you collect is inaccurate, then no matter how well you organise
and represent the data, the analysis will also be inaccurate. Similarly, if you
represent the data using the wrong graph, then the analysis will be inaccurate.

WEIGHTS AND HEIGHTS
In this part of the workshop you will collect information/data on weights and heights,
organise the data, draw graphs to represent the data, measure the data, analyse the data,
and make certain deductions relating to the data.

PART 1:            POSING A QUESTION

Possible questions relating to weight and height:

PART 2:            COLLECTING INFORMATION

2.1       Ways of collecting information:

A) Questionnaires:
A questionnaire is a form containing a variety of different questions. The questionnaire is
used to gather information relating to the questions from different people. The
questionnaire is given to a relevant group of people and the people complete the
questionnaire on their own and in their own time.

Questionnaires are used when you need to gather information that requires more than
simply a yes/no answer or when the respondents need time to think about the questions
before they answer them. Questionnaires are also used when the topic being dealt with is
of a sensitive nature and people need to be able to answer the questions in private.

A questionnaire must always be easy and quick to complete and must not require large
amounts of writing. Respondents will very quickly become frustrated if they have to answer
questions that require lots of writing and thinking.

Example:

QUESTIONNAIRE ON ALCOHOL USE

Age:      ________________                                        Gender: Male              Female

Race:              Black                       White                     Indian                               Coloured

Other: ______________

Do you drink alcohol?                   Yes            No

Have you ever drunk alcohol?            Yes            No

How often do you drink alcohol?

Every night                  Once a week                          Twice a week                       3 times a week

Other:    _________________

How much alcohol do you drink at each sitting?

1 drink                      2 drinks                  3 drinks          more than 3 drinks

In the space below, please feel free to provide any other information that you think might be relevant to this study.

B) Surveys:
As with a questionnaire, a survey also contains a list of questions that extracts information
from a group of people. However, with a survey you ask the questions and you write down
the responses rather than the respondents writing down the answers themselves. You can
also conduct a survey by observing and recording the results – for example, counting the
number of people visiting particular shops at a shopping centre.

Surveys are used to gather information when the questions being asked require a simple
yes/no type answer or when information is being gathered from people in a public place
where there is no time to fill out lengthy answers.

When designing a survey it is important to choose questions that are easy to answer, that
can be answered simply by circling an available option, and that do not require lengthy
answers.

Example:

MIDLANDS MALL SURVEY

Approximate age:                       10-18            19-25            26-30         > 30

Gender:            Male                Female

Race:              Black                        White                    Indian                      Coloured

Other: ______________

How often do you shop at the Midlands Mall during the week?
Every day               Once           Twice            Three Times                    Other: _____________

Please rate the parking facilities (1 = poor; 5 = excellent):
1                2                  3                 4                  5

Please rate the toilet facilities offered at the mall (1 = poor; 5 = excellent):
1                 2                   3                 4                 5

Rate the quality of the shops at the mall (1 = poor; 5 = excellent):
1                2                 3                 4                   5

If you were to visit a restaurant, what restaurant would it be?          _____________________________

Rate the quality of the food court and restaurants (1 = poor; 5 = excellent):
1                2                 3                4                5

Are there any improvements that you would like to see at the mall?
________________________________________________________________________________________________

2.2       Populations, Samples and Bias:

Populations & Samples:
When collecting data, it is often impossible to collect data from every single person to
whom the data might apply. Rather, the data is collected from a much smaller group of
people who are representative of the larger group. This smaller group from whom the
information is collected is called the Sample Group, and the whole group to which the
information applies is called the Population.

For example, if a student at a school wants to collect information on the number of
students in her school who drink alcohol on a regular basis, it would be impractical to try to
collect information from every person in the school. Rather, she might decide to collect
information from 20 students in every grade. In this situation, the Population of the data
set would be all of the students in the school while the Sample Group would be 20
students in each grade from whom the information has been collected.

Bias:
When collecting data, it is crucial that the sample group from whom the information is
collected is representative of the whole population. If the sample group is not
representative of the population, then the data that is collected may be biased.

For example, with the girl collecting information on the number of students in her school
who drink alcohol on a regular basis, it would be important that she collected information
from both boys and girls, from students of different racial groups, and from students in
every grade. If she only collected information from girls, then the data would not be
representative of the whole school population; if she only collected information from
students in Grade 12, then the results would not be representative of the whole school;
and so on.

Questions:

1) a) What message does the graph below send?

b) Explain why the graph is biased or misleading.

(Source: www.eskom.co.za/annreport07, p. 10, 21 April 2008)

2) Explain why the graph below is biased or misleading.

(Source: Mail and Guardian, 14-19 March 2008)

3) Both graphs below show the price of bread over the period 2000 to 2008.
[Graph 1: Bread Price - 2000 to 2008. Vertical axis: price, from R 3.00 to R 6.00. Horizontal axis: Year, 2000 to 2008.]

[Graph 2: Bread Price - 2000 to 2008. Vertical axis: price, from R 0.00 to R 10.00. Horizontal axis: Year, 2000 to 2008.]

a) If you were a bread producer and you wanted to create the impression that the increase
in the price of bread has not been too big, which graph would you use? Explain your
answer.

b) If you worked for a union and you wanted to argue that the price of bread has increased
significantly since the year 2000, which graph would you use? Explain your answer.

2.3   Activity:
Design a survey that can be used to collect information on height and weight.

   Make sure to think carefully about what questions are relevant to the topic of
weight and height.

   Make sure the survey is user-friendly and easy to complete.

   Make sure that the questions on the survey will ensure that the information
collected is not biased.

   Now use the survey that you have designed to collect information from the
teachers in Sue’s class.

Part 3:            Organising Data
Frequency Table:
The most common way to organise data is by using a frequency table. This is a table that
records how often every value in the data set occurs.

Class Intervals:
When dealing with data in which there is large variation in the values in the data, it is often
useful to divide the data into a smaller number of categories – i.e. class intervals. These
class intervals reduce the number of data values that we have to deal with and make it
easier to graph the data.

For example:
The Department of Education has to deal with thousands and thousands of results. To
make it easier to make sense of those results, they have constructed “mark categories”
with a width of 10%: 0 − 9%; 10 − 19%; 20 − 29%; etc. They have also attached a
rating to each of these categories: ≥ 80% = Level 7; 70 − 79% = Level 6; etc.

It is important that the class intervals are of an appropriate size. If the intervals are too big,
then you won’t get a true picture of the spread of the data. And if the intervals are too
small, then the data becomes difficult to work with because we return to having to deal
with a large amount of data.
It is also important to realise that once data has been organised into class intervals, it
is impossible to determine exactly the average of the data or how spread out the data
is. This is because, by dividing the data into class intervals and recording only the
frequency with which data values fall into a particular interval, the actual values of the
original data become “hidden”.

Questions:

The table below shows age, gender, race, residence, height and weight information relating to the
participants in this workshop.

                  Age    Gender    Race      Residence  Height (m)   Weight (kg)
Person 1          35        M      Black     Rural        1.61          53
Person 2          32        M      Black     Rural         1.7          70
Person 3          31        F      Black     Urban        1.58         52.6
Person 4          32        M      Black     Urban        1.78          78
Person 5          35        M      Black     Urban        1.65         91.7
Person 6          31        M      Black     Rural        1.85         89.8
Person 7          55        F      White     Urban         1.8          72
Person 8          31        F      Black     Urban         1.5          78
Person 9          29        F      Black     Rural        1.55          52
Person 10         45        M      Black     Urban        1.68          75
Person 11         48        F      White     Urban        1.65          76
Person 12         30        M      Black     Urban        1.76          72
Person 13         35        F      Black     Urban        1.66          74
Person 14         42        F      Black     Rural        1.56         109
Person 15         38        F      Black     Rural        1.59          68
Person 16         38        F      Black     Rural        1.64         78.1
Person 17         55        F      White     Urban         1.8          72
Person 18         41        F      Black     Urban        1.74         69.1
Person 19         33        M      Black     Rural        1.76          76
Person 20         38        F      Black     Urban        1.62         141
Person 21         31        M      Black     Rural        1.73         74.4
Person 22         43        F      Black     Urban        1.62         77.4
Person 23         61        F      White     Urban         1.6         73.4
Person 24         36        F      Black     Urban         1.9         74.9
Person 25         24        M      Black     Rural         1.2          65
Person 26         39        F      Black     Urban        1.65         110
Person 27         42        F      Black     Urban        1.67         96.8
Person 28         27        F      Black     Urban        1.67          70
Person 29         39        F      Black     Urban        1.61          82
Person 30         46        F      Black     Rural        1.42        103.7
Person 31         49        M      White     Urban        1.86          85
Person 32         37        M      Black     Urban        1.64         74.1
Person 33         24        M      Black     Rural        1.67          63
Person 34         25        F      Black     Rural        1.82         77.5
Person 35         27        F      Black     Rural        1.73         74.9
Person 36         21        F      Black     Rural        1.67          62
Person 37         28        F      Black     Rural        1.68          64
Person 38         34        M      Black     Urban        1.63         74.6
Person 39         31        M      White     Urban        1.75          75

3.1        Use the frequency and tally table below to organise the data according to the
heights of the various people.

Height Category                Tally                Frequency

< 1.5 m

1.5 - 1.59 m

1.6 - 1.69 m

1.7 - 1.79 m

1.8 - 1.89 m

1.9 - 1.99 m

≥ 2 m

3.2       Use the table below to organise the data according to the weights of the various
people.

Weight Category                Tally                Frequency

50-59 kg

60-69 kg

70-79 kg

80-89 kg

90-99 kg

100-109 kg

110-119 kg

≥ 120 kg

3.3       Now organise the data according to height and gender.

Height Category         Frequency                  % Distribution
                        Female       Male          % Female       % Male

< 1.5 m

1.5 - 1.59 m

1.6 - 1.69 m

1.7 - 1.79 m

1.8 - 1.89 m

1.9 - 1.99 m

≥ 2 m

3.4       Organise the data according to weight and gender.

Weight Category         Frequency                  % Distribution
                        Female       Male          % Female       % Male

50-59 kg

60-69 kg

70-79 kg

80-89 kg

90-99 kg

100-109 kg

110-119 kg

≥ 120 kg

3.5       Organise the data according to Height and Residence (i.e. Urban or Rural)

Height Category         Frequency                  % Distribution
                        Rural        Urban         % Rural        % Urban

< 1.5 m

1.5 - 1.59 m

1.6 - 1.69 m

1.7 - 1.79 m

1.8 - 1.89 m

1.9 - 1.99 m

≥ 2 m

3.6       Organise the data according to Weight and Residence (i.e. Urban or Rural)

Weight Category         Frequency                  % Distribution
                        Rural        Urban         % Rural        % Urban

50-59 kg

60-69 kg

70-79 kg

80-89 kg

90-99 kg

100-109 kg

110-119 kg

≥ 120 kg

3.7       Organise the data according to Height and Race.

Height Category         Frequency                  % Distribution
                        Black        White         % Black        % White

< 1.5 m

1.5 - 1.59 m

1.6 - 1.69 m

1.7 - 1.79 m

1.8 - 1.89 m

1.9 - 1.99 m

≥ 2 m

3.8       Organise the data according to Weight and Race.

Weight Category         Frequency                  % Distribution
                        Black        White         % Black        % White

50-59 kg

60-69 kg

70-79 kg

80-89 kg

90-99 kg

100-109 kg

110-119 kg

≥ 120 kg

Part 4:            Representing Data Graphically
4.1       On the given set of axes draw a graph to show the height distribution of the various
people surveyed.

4.2       On the given set of axes draw a graph to show the weight distribution of the various
people surveyed.

4.3       On the given set of axes draw a graph to show the % distribution of height by
gender of the various people surveyed.

4.4       On the given set of axes draw a graph to show the % distribution of weight by
gender of the various people surveyed.

4.5       On the given set of axes draw a graph to show the % distribution of height by
residence (i.e. urban or rural) of the various people surveyed.

4.6       On the given set of axes draw a graph to show the % distribution of weight by
residence (i.e. urban or rural) of the various people surveyed.

4.7       On the given set of axes draw a graph to show the % distribution of height by race
of the various people surveyed.

4.8       On the given set of axes draw a graph to show the % distribution of weight by race
of the various people surveyed.

4.9       Draw a graph to show a comparison between the number of males and females
who weigh between 60 kg and 70 kg (including 60 kg but not 70 kg).

4.10      Explain when single bar graphs are the most effective graph for representing data.

4.11      Explain when double (compound) bar graphs are the most effective graph for
representing data.

4.12      Explain when pie charts are the most effective graph for representing data.

4.13
4.13.1             Could we have used a line graph to represent any of the data?
Explain.

4.13.2             When is a line graph the most effective graph for representing data?

Part 5:            Measuring Data – Measures of Central Tendency
5.1       Explaining Mean, Median and Mode:

Measures of central tendency measure the “centre” or “middle” of a data set. This “centre”
or “middle” value provides a benchmark value against which to compare the other values
in the data set. This “centre” or “middle” value also provides a value that is representative
of the majority of the values in the data set.

There are three different measures of central tendency:

A) Mean:
The mean average provides an indication of the middle of the data set by taking into
account all of the values in the data set.

Mean average = sum of the values in the data set ÷ no. of values in the data set

Example:

4         2         23      4      17    19      51     4     22   15
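
Working through this example (a sketch of the calculation, using the formula above):

Sum of the values = 4 + 2 + 23 + 4 + 17 + 19 + 51 + 4 + 22 + 15 = 161
Number of values = 10
Mean average = 161 ÷ 10 = 16.1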

B) Median:
The median average provides an indication of the middle of the data set by considering
only the middle value in the data once the data has been arranged in ascending or
descending order. If the data set contains an even number of values, the median is the
mean of the two middle values.

Example:
4         2         23      4      17     19     51     4     22   15
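
Working through this example (a sketch of the calculation):

Values in ascending order:   2   4   4   4   15   17   19   22   23   51
There are 10 values, so there is no single middle value. The two middle values are
the 5th and 6th values, 15 and 17.
Median average = (15 + 17) ÷ 2 = 16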

C) Mode:
The modal average is the value that occurs most often in the data set. This value provides
an indication of the most common value in the data set.

Example:
4         2         23      4      17     19       51   4        22      15
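
Working through this example (a sketch of the count):

The value 4 occurs three times; every other value (2, 15, 17, 19, 22, 23 and 51)
occurs only once.
Modal average = 4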

5.2       Which average must you use?

Consider the following marks.

Precious      Thuleleni     Dorcas        Khosi         Maureen       Mbongiseni
52            54            53            69            78            59
Sne           Zakithi       Thami         Sipho         Zibonele      Cynthia
100           66            50            59            52            52
Delani        Lloyd         Marc          Sibisi        Sue           Thokozile
52            53            69            57            52            53

5.2.1 Calculate the mean, median and modal average of the test scores.

5.2.2 Which average provides the most realistic indication of the average test
score?

5.3       But be careful …

5.3.1 Consider the following set of values:

1         2         3       40      85   92       100

Explain why the median average of the data will not provide an appropriate
indication of the average of the data.

5.3.2 The table below contains the ice-cream preferences of the students in a
class.

John                Chocolate
Jill                Strawberry
Mpho                Raspberry
Cindy               Blueberry
Ishmaeel            Vanilla
Rudy                Chocolate
Toni                Blueberry
Zanele              Chocolate
Thami               Strawberry
Cynthia             Chocolate
Sne                 Raspberry

a) If you were having a party, which type of ice-cream would you buy for this
class?

b) Do you think using the modal average might create some unhappiness in
this situation?

Summary:

Mean Average
You use the mean average when there are no outliers in the data set. If there are outliers
in the data then the mean average will be skewed and will be either too high or too low
depending on the position of the outliers.

Median Average
You use the median average when the mean average is not suitable − i.e. when there are
outliers in the data.
If there are no outliers in the data set then the mean and median averages will usually be similar.

Modal Average
You use the modal average when you want to know which value occurs most often in a
data set.

5.4       Questions on the Height and Weight Data:

5.4.1 Average height of the group:

a) Calculate the mean height of the people in the data set.

b) Calculate the median height of the people in the data set.

c) Calculate the modal height of the people in the data set.

d) Which average provides the most realistic impression of the average
height of the people in the data set?

5.4.2 Average weight of the group:

a) Calculate the mean weight of the people in the data set.

b) Calculate the median weight of the people in the data set.

c) Calculate the modal weight of the people in the data set.

d) Which average provides the most realistic impression of the average
weight of the people in the data set? Explain.

5.4.3 Average height of females vs. males

a) Complete the following table for the females and males in the group:

                Mean Height                    Median Height
Females
Males

b) Which average provides the most realistic impression of the average
height of the females? Explain.

c) Which average provides the most realistic impression of the average
height of the males? Explain.

d) Compare the average heights of the males and females and describe
what you notice.

5.4.4 Average weight of females vs. males

a) Complete the following table for the females and males in the group:

                Mean Weight                    Median Weight
Females
Males

b) Which average provides the most realistic impression of the average
weight of the females? Explain.

c) Which average provides the most realistic impression of the average
weight of the males? Explain.

d) Compare the average weights of the males and females and describe
what you notice.

Part 6:            Measuring Data – Measures of Spread
Introduction:

Consider the following marks for two different classes in a test. The test was out of 10
marks.

                Class 1      Class 2
Student 1          5            4
Student 2          6            6
Student 3          7            7
Student 4          6            7
Student 5          6            6
Student 6          5            5
Student 7          5            4
Student 8          4            3
Student 9          6            6
Student 10         7            8
Student 11         7            7
Student 12         7            7
Student 13         6            6
Student 14         5            5
Student 15         5            5
Student 16         4            2
Student 17         5            5
Student 18         7            7
Student 19         7            8
Student 20         7            9

There are some situations in which an average does not provide enough information to
make sense of what is happening in the data set, especially away from the middle of the
data set. In such situations we make use of “Measures of Spread”. The different measures
of spread provide us with a way to divide the data up into different-sized groups, which
gives us a picture of what is happening in the data on either side of the middle (average)
of the data set. The measures of spread also provide us with a way to determine how
spread out the values in a data set are, whether or not they are grouped closely together,
and whether or not there are outliers in the data.

There are 4 main measures of spread:
Quartiles, Percentiles, Standard Deviation and Variance

The picture below shows that quartiles and percentiles are used when the Median average
has been calculated for the data set; and standard deviation and variance are used when
the Mean average has been calculated.

Measures of Central Tendency

Median (i.e. outliers)                               Mean

Quartiles & Percentiles                   Standard Deviation & Variance

6.1       Range

Range of a data set = highest value − lowest value

Questions:
The table below shows the marks of two classes for a test (/10).

Student            Class 1      Class 2
Student 1          5            4
Student 2          6            6
Student 3          7            7
Student 4          6            7
Student 5          6            6
Student 6          5            5
Student 7          5            4
Student 8          4            3
Student 9          6            6
Student 10         7            8
Student 11         7            7
Student 12         7            7
Student 13         6            6
Student 14         5            5
Student 15         5            5
Student 16         4            2
Student 17         5            5
Student 18         7            7
Student 19         7            8
Student 20         7            9

6.1.1
a) Calculate the range of the marks in class 1.

b) Calculate the range of the marks in class 2.

c) Compare the range of the marks for class 1 and class 2. What does the
difference between the ranges tell you about the performance of class 1 and
class 2 in the test?

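Consider the following numbers:
1     10     11     12     13     25

Range of the numbers = 25 – 1 = 24

But are the numbers actually widely spread? Four of the six values lie between 10 and 13;
the range is large only because of the extreme values 1 and 25.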

Questions on the Height and Weight Data:

6.1.3
a) Determine the range of the heights of the males and females.

b) Compare the ranges of the heights and write down what the difference in
ranges tells you about the heights of the males and females.

6.1.4
a) Determine the range of the weights of the males and females.

b) Compare the ranges of the weights and write down what the difference in
ranges tells you about the weights of the males and females.

6.2       Quartiles & 5 Number Summaries
Quartiles:
Quartiles divide a data set up into 4 groups.
•   Q2 lies in the middle of the data set, i.e. Q2 is the median of the data set.

•   Q1 lies at the 25% mark in the data set, i.e. 25% of the values lie below Q1 and
75% lie above.
Q1 lies at the middle of the group of values that lie below the median.

•   Q3 lies at the 75% mark in the data set, i.e. 75% of the values lie below Q3 and
25% lie above.
Q3 lies at the middle of the group of values that lie above the median.

Example:
Below are the test results (/10) for a class. The marks have been arranged in ascending
order.

Name             Mark (/10)
John                 1
Trudi                2
Mpho                 3
Thulani              3
Jacob                3          Q1
Suzanne              4
Zipho                4
Chelsea              5
Max                  5
Josh                 5          Median (Q2)
Rebecca              6
Inky                 6
Brandon              6
Thuli                7
Nomkhosi             7          Q3
Jabu                 7
Marc                 8
Vaughn               8
Kate                 9

So, for this test:
•   At least 25% of the class scored 3/10 or less for the test.
•   The median mark for the test was 5/10 (50%).
•   At least 75% of the students scored 7/10 or less for the test.
•   At most 25% of the students scored more than 7/10 for the test.

Based on these results, the test could be considered to be a hard test, as at least a quarter
of the class scored 30% or below and half the class scored 50% or below for the test.

5 Number Summaries:
A 5 number summary gives us a way to summarise information about a data set. We can
then use 5 number summaries to make comparisons and deductions about the values in a
data set.

A 5 number summary includes the following information:
•   Minimum (smallest) value in the data set
•   1st Quartile
•   Median (2nd quartile)
•   3rd Quartile
•   Maximum (biggest) value in the data set

Example:
The table below shows the results of the students in two different classes in a
test (/30). The results have been arranged in ascending order.

Class 1                           Class 2
    3                                 3
    5                                14
    5                                15
    8                                15
   11                                16
          ← Q1 = 12                  17    ← Q1
   13                                18
   13                                18
   14                                19
   15                                20
   15                                20
   15                                22    ← Median (Q2)
          ← Median (Q2) = 16,5       23
   18                                23
   19                                23
   19                                23
   19                                25
   20                                26    ← Q3
   22                                27
          ← Q3 = 22,5                27
   23                                28
   25                                29
   26                                30
   26
   30

We can construct the following 5 number summaries for each class:

Class 1                                    Class 2
Minimum        3                            Minimum       3
Q1             12                           Q1            17
Q2             16,5                         Q2            22
Q3             22,5                         Q3            26
Maximum        30                           Maximum       30

By comparing the five-number summaries of the test scores for the two different classes,
we can make deductions about the performance of the students in each of the classes.

•   Although the minimum and maximum of the test scores for both classes are the same,
the quartiles of the test scores for Class 2 are much higher than the quartiles of the test
scores for Class 1. This means that the scores in Class 1 contain more “low” scores
than Class 2. In other words, the learners in Class 1 generally scored lower marks than
the learners in Class 2.
•   Similarly, Q3 is higher for Class 2 than for Class 1. This means that a greater
percentage of students in Class 2 scored “very high marks” than in Class 1.

Questions on the Height and Weight Data:

6.2.1
a) Construct a 5 number summary for the weight distribution of the whole group.

b) Explain what the 5 number summary tells us about the weight distribution of the
whole group.

6.2.2
a) Construct separate 5 number summaries for the heights of the females and
males.

b) Compare the 5 number summaries and comment on what they tell us about the
heights of the males and females.

6.2.3
a) Construct separate 5 number summaries for the weights of the females and
males.

b) Compare the 5 number summaries and comment on what they tell us about the
weights of the males and females.

6.3       Percentiles
Percentiles divide a data set into percentage groupings.

For example:
•   The 50th percentile is the middle (median) of a data set.

•   The 25th percentile is the value at or below which 25% of the values in the data set
lie. The other 75% of the values lie above this percentile.
The 25th percentile is the same as Q1.

•   The 75th percentile is the value at or below which 75% of the values in the data set
lie. The other 25% of the values lie above this percentile.
The 75th percentile is the same as Q3.

•   The 10th percentile is the value at or below which 10% of the values in the data set
lie.

•   And so on …

Example:

3      8      9      10     12     14     15     15     17     18     19
              ↑                    ↑                    ↑
       25th percentile      50th percentile      75th percentile

How would we calculate the 10th percentile?

There are 11 numbers in the list.
The 10th percentile will lie in a position at 10% of the dataset.
So: 10% × 11 = 1,1
≈ 2 (rounded up)
(you always round up when calculating percentile positions)

So, the 2nd number in the list = “8” is the 10th percentile.

How would we calculate the 65th percentile?

There are 11 numbers in the list.
The 65th percentile will lie in a position at 65% of the dataset.
So: 65% × 11 = 7,15
≈ 8 (rounded up)
(you always round up when calculating percentile positions)

So, the 8th number in the list = “15” is the 65th percentile.

6.4       Real-Life Applications of Percentiles

6.4.1 The “Road to Health” chart is given on the page below.
a) Explain what it means that a child has an age-for-weight ratio that lies on
the 50th percentile.

b) Explain what it means that a child has an age-for-weight ratio that lies on
the 97th percentile.

c) What percentile lies at 60% of the 50th percentile?

6.4.2
a) What would be considered to be an average weight for an 18 month old
baby?

b) What would be considered to be an average weight for a 2 year old child?

c) A 12 month old baby has an age-for-weight ratio that lies on the 97th
percentile. Approximately how much does this child weigh?

d) A 22 month old child has an age-for-weight ratio that lies on the 3rd percentile.
Approximately how much does this child weigh?

e) A 3 month old baby weighs 7,2 kg. According to the Road to Health
Chart, would this baby be considered to have an above average, average, or
below average age-for-weight ratio? Explain.

f) How much would a 10 month old baby have to weigh to be considered to
have an average age-for-weight ratio?

g) A 14 month old baby weighs 5,5 kg. Could this baby be suffering from

6.4.3 Can you think of any problems with the “Road to Health” chart?

(Source: Sanofi Pasteur, Vaccination Record (2008))

6.5            Body Mass Index
6.5.1          Calculating BMI:

A more accurate way of determining the weight status of an individual is to use their Body
Mass Index.

The body mass index of an individual is calculated using the following formula:

Body Mass Index (BMI) (kg/m²) = weight (kg) ÷ (height (m))²

Example:
A man weighs 87 kg and is 1,76 m tall.

BMI = 87 kg ÷ (1,76 m)²
    = 87 kg ÷ 3,0976 m²
    = 28,1 kg/m²       (rounded off to one decimal place)

6.5.2          Using BMI to Determine the Weight Status of an Adult:

The BMI of an adult older than 20 years is used to classify weight status according to
the following categories:

BMI                     Classification
< 18.5                  Underweight
>= 18.5 and < 25        Normal
>= 25 and < 30          Overweight
>= 30                   Obese

So, the man who weighs 87 kg and is 1,76 m tall would be classified as being
“Overweight”.

Questions on Height and Weight Data:

The table below shows the height, weight, BMI and weight status of the 39 people from
whom data was collected.

                    Height     Weight      BMI           Weight status
                    (m)        (kg)        (kg/m^2)
Person 1            1.61       53          20.4
Person 2             1.7       70          24.2
Person 3            1.58     52.6          21.1
Person 4            1.78       78
Person 5            1.65     91.7         33.7
Person 6            1.85     89.8         26.2
Person 7             1.8       72         22.2
Person 8             1.5       78         34.7
Person 9            1.55                  21.6
Person 10           1.68       75         26.6
Person 11           1.65       76         27.9
Person 12           1.76       72         23.2
Person 13           1.66       74         26.9
Person 14           1.56      109         44.8
Person 15           1.59       68         26.9
Person 16           1.64     78.1
Person 17            1.8       72         22.2
Person 18           1.74     69.1         22.8
Person 19           1.76       76         24.5
Person 20           1.62      141         53.7
Person 21           1.73     74.4         24.9
Person 22           1.62                  29.5
Person 23            1.6     73.4         28.7
Person 24            1.9     74.9
Person 25            1.2       65         45.1
Person 26           1.65      110         40.4
Person 27           1.67     96.8
Person 28           1.67       70         25.1
Person 29           1.61       82         31.6
Person 30           1.42    103.7         51.4
Person 31                      85         24.6
Person 32           1.64     74.1         27.6
Person 33           1.67       63         22.6
Person 34           1.82     77.5         23.4
Person 35           1.73     74.9         25.0
Person 36           1.67       62         22.2
Person 37                      64         22.7
Person 38           1.63     74.6         28.1
Person 39           1.75       75         24.5

a) Complete the table by filling in the missing values.

b) Use the frequency table below to summarise the BMI data for the 39 people:

BMI                     Classification          Frequency              % of Total
< 18.5                  Underweight
>= 18.5 and < 25        Normal
>= 25 and < 30          Overweight
>= 30                   Obese

c) According to the frequency table, do you think there is a problem with weight amongst
the 39 people surveyed? Explain.

d) Calculate the average BMI for men and the average BMI for women and make a
deduction about which group is the healthier group.

6.5.3          BMI for Children (2 to 20 years)

For a child, once their BMI has been calculated, the BMI is then mapped onto a BMI-for-Age
percentile graph (see below).
(Source: Centers for Disease Control and Prevention, www.cdc.gov/growthcharts)

The weight status of the child is then classified according to the following criteria:

Weight Status                      Percentile Range
Underweight                        Less than the 5th percentile
Healthy weight                     ≥ 5th percentile and < 85th percentile
At risk of overweight              ≥ 85th percentile and < 95th percentile
Overweight                         ≥ 95th percentile

Example:
A 9 year old girl weighs 32 kg and is 1,25 m tall.

BMI (kg/m²) = 32 kg ÷ (1,25 m)²
            = 32 kg ÷ 1,5625 m²
            ≈ 20,5 kg/m² (rounded off to one decimal place)

Mapping this BMI value on the BMI-for-Age growth chart places the girl on or just above
the 90th percentile.
According to the table above, this places her in the “At risk of being overweight” category.

Questions:

a) Determine the weight status of a 4 year old girl who weighs 15 kg and is
1 m tall.

b) Determine the weight status of a 15 year old girl who weighs 42 kg and is
1,4 m tall.

c) Determine the weight status of an 18 year old boy who weighs 81 kg and is 1,8 m tall.

d) Determine the weight status of a 10 year old boy who weighs 28 kg and is 1,18 m tall.

e) What is the average BMI for a 12 year old girl?

f) What is the average BMI for a 19 year old girl?

g) If the BMI values for 1 000 girls between the ages of 2 and 20 years were to be
calculated, how many of them could we expect to be:
i) Underweight;
ii) At risk of being overweight;
iii) Overweight?

h) A girl with a BMI of 18 kg/m² is “At Risk of Being Overweight”. Provide a range of
possible ages of this girl.

i) A 10 year old girl has a “Healthy Weight” status. Provide a range of possible BMI values
for this girl.

j) A 13 year old girl is 1,55 m tall and weighs 54 kg.
How much weight would this girl have to lose in order to have a “Healthy Weight” status?

```
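
The range, percentile-position and BMI calculations worked through in Part 6 can also be checked with a few lines of code. What follows is a minimal Python sketch and not part of the original workshop material. The function names (`data_range`, `percentile`, `bmi`, `adult_weight_status`) are made up for illustration. The percentile function assumes the "always round the position up" rule from section 6.3, and the classifier assumes the adult BMI bands from section 6.5.2, with a BMI of 30 or more treated as obese.

```python
import math


def data_range(values):
    """Range of a data set = highest value - lowest value."""
    return max(values) - min(values)


def percentile(values, p):
    """Return the p-th percentile using the round-up position rule.

    Assumption: the position is p% of the number of values, always
    rounded up, as in the worked examples of section 6.3.
    """
    ordered = sorted(values)
    # p * n / 100 keeps whole-number positions exact before rounding up.
    position = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[position - 1]  # positions are counted from 1


def bmi(weight_kg, height_m):
    """Body Mass Index in kg/m^2 = weight (kg) / (height (m))^2."""
    return weight_kg / height_m ** 2


def adult_weight_status(bmi_value):
    """Classify an adult BMI using the bands in section 6.5.2."""
    if bmi_value < 18.5:
        return "Underweight"
    if bmi_value < 25:
        return "Normal"
    if bmi_value < 30:
        return "Overweight"
    return "Obese"


# Worked examples taken from the text above.
print(data_range([1, 10, 11, 12, 13, 25]))        # 24

numbers = [3, 8, 9, 10, 12, 14, 15, 15, 17, 18, 19]
print(percentile(numbers, 10))                     # 8
print(percentile(numbers, 65))                     # 15

man_bmi = bmi(87, 1.76)
print(round(man_bmi, 1))                           # 28.1
print(adult_weight_status(man_bmi))                # Overweight
```

Quartiles are left out of the sketch on purpose: different textbooks and software packages place Q1 and Q3 slightly differently, so a quartile function would have to commit to one convention.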