VULA MATHEMATICAL LITERACY WORKSHOP 7 – 10 JUNE 2008
Data Handling

All materials developed by March North
Tel: 083 627 8188, mnorth@stannes.co.za
Working for Mathematically Literate Mathematicians

INTRODUCTION

Statistics is the branch of mathematics that focuses on collecting, organising, displaying, analysing and interpreting information (called data). In other words, Statistics involves collecting information, organising it into a form that is manageable and easy to work with, constructing charts to represent it, and then interpreting and analysing it in order to make predictions for the future.

The picture below shows the 5 parts that make up any statistical process.

[Picture: the 5 parts of the statistical process]

The first step in every statistical process is to pose a question. The question that you pose will determine the type of data that you need to collect and the way in which the data must be collected, organised and represented.

Every statistical process involves collecting data. Sometimes the data is already available, and sometimes you will have to collect it yourself using surveys or questionnaires.

Once the data has been collected, it must be organised. This is done using a variety of tables (tally table, frequency table, cumulative frequency table) and a variety of different measures (measures of central tendency and measures of spread).

The next step in the statistical process is to represent the data graphically by means of bar, pie, line, scatter-plot and cumulative frequency graphs. The type of graph used depends entirely on the nature of the data being handled.

The next step, and the most important step in the statistical process, is analysing the data. This involves studying the tables, measures and graphs that have been calculated and constructed, and making deductions and decisions relating to the original question that was posed.
Without this step, the statistical process is meaningless.

It is important to note that the arrows move both ways in the statistical process. This means that every step in the process depends on the success of the step before it. For example, if the data that you collect is inaccurate, then no matter how well you organise and represent the data, the analysis will also be inaccurate. Similarly, if you represent the data using the wrong graph, then the analysis will be inaccurate.

WEIGHTS AND HEIGHTS

In this part of the workshop you will collect information/data on weights and heights, organise the data, draw graphs to represent the data, measure the data, analyse the data, and make certain deductions relating to the data.

PART 1: POSING A QUESTION

Possible questions relating to weight and height:

PART 2: COLLECTING INFORMATION

2.1 Ways of collecting information:

A) Questionnaires:

A questionnaire is a form containing a variety of different questions. It is used to gather information relating to those questions from different people. The questionnaire is given to a relevant group of people, who complete it on their own and in their own time.

Questionnaires are used when you need to gather information that requires more than a simple yes/no answer, or when the respondents need time to think about the questions before they answer them. Questionnaires are also used when the topic being dealt with is of a sensitive nature and people need to be able to answer the questions in private.

A questionnaire must always be easy and quick to complete and must not require large amounts of writing. Respondents will very quickly become frustrated if they have to answer questions that require lots of writing and thinking.
Example:

QUESTIONNAIRE ON ALCOHOL USE
(Where necessary, please circle the appropriate answer)

Age: ________________
Gender:   Male   Female
Race:   Black   White   Indian   Coloured   Other: ______________
Do you drink alcohol?   Yes   No
Have you ever drunk alcohol?   Yes   No
How often do you drink alcohol?   Every night   Once a week   Twice a week   3 times a week   Other: _________________
How much alcohol do you drink at each sitting?   1 drink   2 drinks   3 drinks   more than 3 drinks
Comments: In the space below, please feel free to provide any other information that you think might be relevant to this study.

B) Surveys:

As with a questionnaire, a survey contains a list of questions that extracts information from a group of people. However, with a survey you ask the questions and you write down the responses, rather than the respondents writing down the answers themselves. You can also conduct a survey by observing and recording the results, for example by counting the number of people visiting particular shops at a shopping centre.

Surveys are used to gather information when the questions being asked require a simple yes/no type of answer, or when information is being gathered from people in a public place where there is no time to fill out lengthy answers.

When designing a survey it is important to choose questions that are easy to answer, that can be answered simply by circling an available option, and that do not require lengthy answers.
Example:

MIDLANDS MALL SURVEY

Approximate age:   10-18   19-25   26-30   > 30
Gender:   Male   Female
Race:   Black   White   Indian   Coloured   Other: ______________
How often do you shop at the Midlands Mall during the week?   Every day   Once   Twice   Three times   Other: _____________
Please rate the parking facilities (1 = poor; 5 = excellent):   1   2   3   4   5
Please rate the toilet facilities offered at the mall (1 = poor; 5 = excellent):   1   2   3   4   5
Rate the quality of the shops at the mall (1 = poor; 5 = excellent):   1   2   3   4   5
If you were to visit a restaurant, what restaurant would it be? _____________________________
Rate the quality of the food court and restaurants (1 = poor; 5 = excellent):   1   2   3   4   5
Are there any improvements that you would like to see at the mall? ________________________________________________

2.2 Populations, Samples and Bias:

Populations & Samples:

When collecting data, it is often impossible to collect data from every single person to whom the data might apply. Rather, the data is collected from a much smaller group of people who are representative of the large group. This smaller group from whom the information is collected is called the Sample Group, and the whole group to which the information applies is called the Population.

For example, if a student at a school wants to collect information on the number of students in her school who drink alcohol on a regular basis, it would be impractical to try to collect information from every person in the school. Rather, she might decide to collect information from 20 students in every grade. In this situation, the Population of the data set would be all of the students in the school, while the Sample Group would be the 20 students in each grade from whom the information has been collected.
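The idea of drawing 20 students from every grade can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the school roll (grades 8 to 12 with 150 students each), the student labels and the function name `sample_per_grade` are all hypothetical and do not come from the workshop material.

```python
import random


def sample_per_grade(population, per_grade, seed=None):
    """Pick `per_grade` students at random from every grade.

    `population` maps a grade to the list of students in that grade.
    Every student in a grade has the same chance of being chosen.
    """
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return {
        grade: rng.sample(students, per_grade)
        for grade, students in population.items()
    }


# Hypothetical school roll: grades 8 to 12, 150 students per grade.
population = {
    grade: [f"Grade {grade} student {i}" for i in range(1, 151)]
    for grade in range(8, 13)
}

sample = sample_per_grade(population, per_grade=20, seed=1)

for grade, chosen in sample.items():
    print(grade, len(chosen), chosen[:3])
```

Here the Population is every name in `population` and the Sample Group is every name in `sample`. Choosing the names at random within each grade, rather than asking only friends or only one class, is one way to keep the sample representative.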
Bias:

When collecting data, it is crucial that the sample group from whom the information is collected is representative of the whole population. If the sample group is not representative of the population, then the data that is collected may be biased.

For example, with the girl collecting information on the number of students in her school who drink alcohol on a regular basis, it would be important that she collected information from both boys and girls, from students of different racial groups, and from students in every grade. If she only collected information from girls, then the data would not be representative of the whole school population; if she only collected information from students in Grade 12, then the results would not be representative of the whole school; and so on.

Questions:

1) a) What message does the graph below send?
   b) Explain why the graph is biased or misleading.
   (Source: www.eskom.co.za/annreport07, p. 10, 21 April 2008)

2) Explain why the graph below is biased or misleading.
   (Source: Mail and Guardian, 14-19 March 2008)

3) Both graphs below show the price of bread over the period 2000 to 2008.

   [Graph 1: Bread Price - 2000 to 2008. Vertical axis: Bread Price, from R 3.00 to R 6.00. Horizontal axis: Year, from 2000 to 2008.]

   [Graph 2: Bread Price - 2000 to 2008. Vertical axis: Bread Price, from R 0.00 to R 10.00. Horizontal axis: Year, from 2000 to 2008.]

   a) If you were a bread producer and you wanted to create the impression that the increase in the price of bread has not been too big, which graph would you use?
Explain your answer.
   b) If you worked for a union and you wanted to argue that the price of bread has increased significantly since the year 2000, which graph would you use? Explain your answer.

2.3 Activity:

Design a survey that can be used to collect information on height and weight.
Make sure to think carefully about what questions are relevant to the topic of weight and height.
Make sure the survey is user-friendly and easy to complete.
Make sure that the questions on the survey will ensure that the information collected is not biased.

Now use the survey that you have designed to collect information from the teachers in Sue’s class.

Part 3: Organising Data

Frequency Table:

The most common way to organise data is by using a frequency table. This is a table that records how often every value in the data set occurs.

Class Intervals:

When dealing with data in which there is large variation in the values, it is often useful to divide the data into a smaller number of categories, i.e. class intervals. These class intervals reduce the number of data values that we have to deal with and make it easier to graph the data.

For example: The Department of Education has to deal with thousands and thousands of results. To make it easier to make sense of those results, they have constructed “mark categories” with a width of 10%: 0 − 9%; 10 − 19%; 20 − 29%; etc. They have also attached a rating to each of these categories: 80 − 100% = Level 7; 70 − 79% = Level 6; etc.

It is important that the class intervals are of an appropriate size.
If the intervals are too big, then you won’t get a true picture of the spread of the data. And if the intervals are too small, then the data becomes difficult to work with because we return to having to deal with a large amount of data.

It is also important to realise that once data has been organised into class intervals, it is no longer possible to determine exactly the average of the data or how spread out the data is. This is because, once we only record how many data values fall into a particular interval, the actual values of the original data become “hidden”.

Questions:

The table below shows age, gender, race, residence, height and weight information relating to the participants in this workshop.

Person      Age   Gender   Race    Residence   Height (m)   Weight (kg)
Person 1    35    M        Black   Rural       1.61         53
Person 2    32    M        Black   Rural       1.7          70
Person 3    31    F        Black   Urban       1.58         52.6
Person 4    32    M        Black   Urban       1.78         78
Person 5    35    M        Black   Urban       1.65         91.7
Person 6    31    M        Black   Rural       1.85         89.8
Person 7    55    F        White   Urban       1.8          72
Person 8    31    F        Black   Urban       1.5          78
Person 9    29    F        Black   Rural       1.55         52
Person 10   45    M        Black   Urban       1.68         75
Person 11   48    F        White   Urban       1.65         76
Person 12   30    M        Black   Urban       1.76         72
Person 13   35    F        Black   Urban       1.66         74
Person 14   42    F        Black   Rural       1.56         109
Person 15   38    F        Black   Rural       1.59         68
Person 16   38    F        Black   Rural       1.64         78.1
Person 17   55    F        White   Urban       1.8          72
Person 18   41    F        Black   Urban       1.74         69.1
Person 19   33    M        Black   Rural       1.76         76
Person 20   38    F        Black   Urban       1.62         141
Person 21   31    M        Black   Rural       1.73         74.4
Person 22   43    F        Black   Urban       1.62         77.4
Person 23   61    F        White   Urban       1.6          73.4
Person 24   36    F        Black   Urban       1.9          74.9
Person 25   24    M        Black   Rural       1.2          65
Person 26   39    F        Black   Urban       1.65         110
Person 27   42    F        Black   Urban       1.67         96.8
Person 28   27    F        Black   Urban       1.67         70
Person 29   39    F        Black   Urban       1.61         82
Person 30   46    F        Black   Rural       1.42         103.7
Person 31   49    M        White   Urban       1.86         85
Person 32   37    M        Black   Urban       1.64         74.1
Person 33   24    M        Black   Rural       1.67         63
Person 34   25    F        Black   Rural       1.82         77.5
Person 35   27    F        Black   Rural       1.73         74.9
Person 36   21    F        Black   Rural       1.67         62
Person 37   28    F        Black   Rural       1.68         64
Person 38   34    M        Black   Urban       1.63         74.6
Person 39   31    M        White   Urban       1.75         75

3.1 Use the frequency and tally table below to organise the data according to the heights of the various people.

Height Category   Tally   Frequency
< 1.5 m
1.5 - 1.59 m
1.6 - 1.69 m
1.7 - 1.79 m
1.8 - 1.89 m
1.9 - 1.99 m
≥ 2 m

3.2 Use the table below to organise the data according to the weights of the various people.

Weight Category   Tally   Frequency
50-59 kg
60-69 kg
70-79 kg
80-89 kg
90-99 kg
100-109 kg
110-119 kg
≥ 120 kg

3.3 Now organise the data according to height and gender.

                  Frequency          % Distribution
Height Category   Female   Male      % Female   % Male
< 1.5 m
1.5 - 1.59 m
1.6 - 1.69 m
1.7 - 1.79 m
1.8 - 1.89 m
1.9 - 1.99 m
≥ 2 m

3.4 Organise the data according to weight and gender.

                  Frequency          % Distribution
Weight Category   Female   Male      % Female   % Male
50-59 kg
60-69 kg
70-79 kg
80-89 kg
90-99 kg
100-109 kg
110-119 kg
≥ 120 kg

3.5 Organise the data according to Height and Residence (i.e.
Urban or Rural)

                  Frequency          % Distribution
Height Category   Rural    Urban     % Rural    % Urban
< 1.5 m
1.5 - 1.59 m
1.6 - 1.69 m
1.7 - 1.79 m
1.8 - 1.89 m
1.9 - 1.99 m
≥ 2 m

3.6 Organise the data according to Weight and Residence (i.e. Urban or Rural)

                  Frequency          % Distribution
Weight Category   Rural    Urban     % Rural    % Urban
50-59 kg
60-69 kg
70-79 kg
80-89 kg
90-99 kg
100-109 kg
110-119 kg
≥ 120 kg

3.7 Organise the data according to Height and Race.

                  Frequency          % Distribution
Height Category   Black    White     % Black    % White
< 1.5 m
1.5 - 1.59 m
1.6 - 1.69 m
1.7 - 1.79 m
1.8 - 1.89 m
1.9 - 1.99 m
≥ 2 m

3.8 Organise the data according to Weight and Race.

                  Frequency          % Distribution
Weight Category   Black    White     % Black    % White
50-59 kg
60-69 kg
70-79 kg
80-89 kg
90-99 kg
100-109 kg
110-119 kg
≥ 120 kg

Part 4: Representing Data Graphically

4.1 On the given set of axes draw a graph to show the height distribution of the various people surveyed.
4.2 On the given set of axes draw a graph to show the weight distribution of the various people surveyed.
4.3 On the given set of axes draw a graph to show the % distribution of height by gender of the various people surveyed.
4.4 On the given set of axes draw a graph to show the % distribution of weight by gender of the various people surveyed.
4.5 On the given set of axes draw a graph to show the % distribution of height by residence (i.e. urban or rural) of the various people surveyed.
4.6 On the given set of axes draw a graph to show the % distribution of weight by residence (i.e. urban or rural) of the various people surveyed.
4.7 On the given set of axes draw a graph to show the % distribution of height by race of the various people surveyed.
4.8 On the given set of axes draw a graph to show the % distribution of weight by race of the various people surveyed.
4.9 Draw a graph to show a comparison between the number of males and females who weigh between 60 kg and 70 kg (including 60 kg but not 70 kg).

4.10 Explain when single bar graphs are the most effective graph for representing data.
4.11 Explain when double (compound) bar graphs are the most effective graph for representing data.
4.12 Explain when pie charts are the most effective graph for representing data.
4.13
4.13.1 Could we have used a line graph to represent any of the data? Explain.
4.13.2 When is a line graph the most effective graph for representing data?

Part 5: Measuring Data – Measures of Central Tendency

5.1 Explaining Mean, Median and Mode:

Measures of central tendency measure the “centre” or “middle” of a data set. This “centre” or “middle” value provides a benchmark against which to compare the other values in the data set. It also provides a value that is representative of the majority of the values in the data set.

There are three different measures of central tendency:

A) Mean:

The mean average provides an indication of the middle of the data set by taking into account all of the values in the data set.

Mean average = sum of the values in the data set ÷ number of values in the data set

Example: 4 2 23 4 17 19 51 4 22 15

B) Median:

The median average provides an indication of the middle of the data set by considering only the middle-most value in the data once the data has been arranged in ascending or descending order. If there is an even number of values, the median lies halfway between the two middle-most values.
Example: 4 2 23 4 17 19 51 4 22 15

C) Mode:

The modal average is the value that occurs most often in the data set. This value provides an indication of the most common value in the data set.

Example: 4 2 23 4 17 19 51 4 22 15

5.2 Which average must you use?

Consider the following marks.

Name         Mark      Name        Mark      Name        Mark
Precious     52        Sne         100       Delani      52
Thuleleni    54        Zakithi     66        Lloyd       53
Dorcas       53        Thami       50        Marc        69
Khosi        69        Sipho       59        Sibisi      57
Maureen      78        Zibonele    52        Sue         52
Mbongiseni   59        Cynthia     52        Thokozile   53

5.2.1 Calculate the mean, median and modal average of the test scores.
5.2.2 Which average provides the most realistic indication of the average test score for this class? Explain your answer.

5.3 But be careful …

5.3.1 Consider the following set of values: 1 2 3 40 85 92 100
Explain why the median average of the data will not provide an appropriate indication of the average of the data.

5.3.2 The table below contains the ice-cream preferences of the students in a class.

Name       Flavour
John       Chocolate
Jill       Strawberry
Mpho       Raspberry
Cindy      Blueberry
Ishmaeel   Vanilla
Rudy       Chocolate
Toni       Blueberry
Zanele     Chocolate
Thami      Strawberry
Cynthia    Chocolate
Sne        Raspberry

a) If you were having a party, which type of ice-cream would you buy for this class?
b) Do you think using the modal average might create some unhappiness in this situation?

Summary:

Mean Average
You use the mean average when there are no outliers in the data set. If there are outliers in the data then the mean average will be skewed and will be either too high or too low, depending on the position of the outliers.
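The effect of an outlier on the mean can be checked with a short Python sketch. Python is an assumption here, since the workshop does these calculations by hand; the sketch uses the standard `statistics` module and the 18 test marks from 5.2, where the mark of 100 lies far above the rest.

```python
from statistics import mean, median, mode

# The 18 test marks from 5.2, in the order they appear in the table.
marks = [52, 54, 53, 69, 78, 59,
         100, 66, 50, 59, 52, 52,
         52, 53, 69, 57, 52, 53]

mean_mark = mean(marks)      # sum of the values divided by the number of values
median_mark = median(marks)  # middle of the sorted marks (halfway between the two middle marks)
modal_mark = mode(marks)     # the mark that occurs most often

print("mean:", mean_mark)
print("median:", median_mark)
print("mode:", modal_mark)

# Leaving out the outlier shows how strongly it pulls the mean upwards.
without_outlier = [m for m in marks if m != 100]
print("mean without the mark of 100:", round(mean(without_outlier), 1))
```

The mean (60) sits well above the median (53,5) and the mode (52), because the single mark of 100 and the few marks in the high 60s and 70s pull it upwards.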
Median Average
You use the median average when the mean average is not suitable, i.e. when there are outliers in the data. If there are no outliers in the data set then the mean and median averages will be similar.

Modal Average
You use the modal average when you want to know which value occurs most often in a data set.

5.4 Questions on the Height and Weight Data:

5.4.1 Average height of the group:
a) Calculate the mean height of the people in the data set.
b) Calculate the median height of the people in the data set.
c) Calculate the modal height of the people in the data set.
d) Which average provides the most realistic impression of the average height of the people in the data set? Explain.

5.4.2 Average weight of the group:
a) Calculate the mean weight of the people in the data set.
b) Calculate the median weight of the people in the data set.
c) Calculate the modal weight of the people in the data set.
d) Which average provides the most realistic impression of the average weight of the people in the data set? Explain.

5.4.3 Average height of females vs. males
a) Complete the following table for the females and males in the group:

          Mean Height   Median Height
Females
Males

b) Which average provides the most realistic impression of the average height of the females? Explain.
c) Which average provides the most realistic impression of the average height of the males? Explain.
d) Compare the average heights of the males and females and describe what you notice.

5.4.4 Average weight of females vs.
males
a) Complete the following table for the females and males in the group:

          Mean Weight   Median Weight
Females
Males

b) Which average provides the most realistic impression of the average weight of the females? Explain.
c) Which average provides the most realistic impression of the average weight of the males? Explain.
d) Compare the average weights of the males and females and describe what you notice.

Part 6: Measuring Data – Measures of Spread

Introduction:

Consider the following marks for two different classes in a test. The test was out of 10 marks.

             Class 1   Class 2
Student 1    5         4
Student 2    6         6
Student 3    7         7
Student 4    6         7
Student 5    6         6
Student 6    5         5
Student 7    5         4
Student 8    4         3
Student 9    6         6
Student 10   7         8
Student 11   7         7
Student 12   7         7
Student 13   6         6
Student 14   5         5
Student 15   5         5
Student 16   4         2
Student 17   5         5
Student 18   7         7
Student 19   7         8
Student 20   7         9

Which class did better in the test? Explain your answer.

Introduction continued …

There are some situations in which an average does not provide enough information to make sense of what is happening in the data set, especially away from the middle of the data set. In such situations we make use of “Measures of Spread”.

The different measures of spread provide us with a way to divide the data up into different size groups, which will give us a picture of what is happening in the data on either side of the middle (average) of the data set.
The measures of spread also provide us with a way to determine how spread out the values in a data set are, whether or not they are grouped closely together, and whether or not there are outliers in the data.

There are 4 main measures of spread: Quartiles, Percentiles, Standard Deviation and Variance.

The picture below shows that quartiles and percentiles are used when the Median average has been calculated for the data set, and standard deviation and variance are used when the Mean average has been calculated.

[Diagram: Measures of Central Tendency linked to Measures of Spread. Median (i.e. outliers) -> Quartiles & Percentiles; Mean -> Standard Deviation & Variance.]

6.1 Range

Range of a data set = highest value − lowest value

Questions:

The table below shows the marks of two classes for a test (/10).

             Class 1   Class 2
Student 1    5         4
Student 2    6         6
Student 3    7         7
Student 4    6         7
Student 5    6         6
Student 6    5         5
Student 7    5         4
Student 8    4         3
Student 9    6         6
Student 10   7         8
Student 11   7         7
Student 12   7         7
Student 13   6         6
Student 14   5         5
Student 15   5         5
Student 16   4         2
Student 17   5         5
Student 18   7         7
Student 19   7         8
Student 20   7         9

6.1.1 a) Calculate the range of the marks in class 1.
b) Calculate the range of the marks in class 2.
c) Compare the range of the marks for class 1 and class 2. What does the difference between the ranges tell you about the performance of class 1 and class 2 in the test?

Disadvantage of the Range:

Consider the following numbers: 1 10 11 12 13 25
Range of the numbers = 25 – 1 = 24
But the numbers are not actually widely spread: apart from 1 and 25, they all lie between 10 and 13. The range only looks at the two extreme values, so a single outlier can make it misleading.
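The range formula can be tried out on the marks above with a small Python sketch. Python and the function name `data_range` are assumptions made for illustration; the marks and the six numbers are taken from the tables and example in this section.

```python
def data_range(values):
    """Range of a data set = highest value - lowest value."""
    return max(values) - min(values)


# Marks (/10) for Student 1 to Student 20 in each class.
class_1 = [5, 6, 7, 6, 6, 5, 5, 4, 6, 7, 7, 7, 6, 5, 5, 4, 5, 7, 7, 7]
class_2 = [4, 6, 7, 7, 6, 5, 4, 3, 6, 8, 7, 7, 6, 5, 5, 2, 5, 7, 8, 9]

# The numbers used to show the disadvantage of the range.
numbers = [1, 10, 11, 12, 13, 25]

for name, data in [("Class 1", class_1), ("Class 2", class_2)]:
    average = sum(data) / len(data)
    print(name, "mean:", average, "range:", data_range(data))

print("Range of", numbers, "=", data_range(numbers))
```

Both classes have the same total (117 marks), so their mean marks are identical, yet the range for Class 1 is 3 and the range for Class 2 is 7. The mean alone therefore cannot separate the two classes, while the range shows that the marks in Class 2 are far more spread out.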
Questions on the Height and Weight Data:

6.1.3 a) Determine the range of the heights of the males and females.
b) Compare the ranges of the heights and write down what the difference in ranges tells you about the heights of the males and females.

6.1.4 a) Determine the range of the weights of the males and females.
b) Compare the ranges of the weights and write down what the difference in ranges tells you about the weights of the males and females.

6.2 Quartiles & 5 Number Summaries

Quartiles:

Quartiles divide a data set up into 4 groups.

Q2 lies in the middle of the data set, i.e. Q2 is the median of the dataset.

Q1 lies at the 25% mark in the data set, i.e. 25% of the values lie below Q1 and 75% lie above. Q1 lies at the middle of the group of values that lie below the median.

Q3 lies at the 75% mark in the data set, i.e. 75% of the values lie below Q3 and 25% lie above. Q3 lies at the middle of the group of values that lie above the median.

Example:

Below are the test results (/10) for a class. The marks have been arranged in ascending order.

Name       Mark (/10)
John       1
Trudi      2
Mpho       3
Thulani    3
Jacob      3            Q1
Suzanne    4
Zipho      4
Chelsea    5
Max        5
Josh       5            Median (Q2)
Rebecca    6
Inky       6
Brandon    6
Thuli      7
Nomkhosi   7
Jabu       7            Q3
Marc       8
Vaughn     8
Kate       9

So, for this test:
25% of the class scored 3 or less for the test.
The median mark for the test was 5/10 (50%).
75% of the students scored 7/10 or less for the test.
Only 25% of the students scored more than 7/10 for the test.

Based on these results, the test could be considered to be a hard test, as 25% of the class scored 30% or below and half the class scored 50% or below for the test.
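The way the quartiles were found in this example can be sketched in Python as follows. This is a minimal sketch, assuming the "median of each half" method described above (Q1 is the median of the values below the median, Q3 the median of the values above it); the function name `quartiles` is hypothetical, and other textbooks and software use slightly different quartile rules that can give slightly different answers.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-each-half method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]  # values below the median
    # With an odd number of values the median itself is left out of both halves.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


# The 19 test marks (/10) from the example, John to Kate.
marks = [1, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 9]

q1, q2, q3 = quartiles(marks)
print("Q1 =", q1, " Q2 (median) =", q2, " Q3 =", q3)
```

For the 19 marks this gives Q1 = 3, Q2 = 5 and Q3 = 7, which agrees with the values marked in the table.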
5 Number Summaries:

A 5 number summary gives us a way to summarise information about a data set. We can then use 5 number summaries to make comparisons and deductions about the values in a data set.

A 5 number summary includes the following information:
Minimum (smallest) value in the data set
1st Quartile
Median (2nd quartile)
3rd Quartile
Maximum (biggest) value in the data set

Example:

The table below shows the results of the students in two different classes in a test (/30). The results have been arranged in ascending order.

              Class 1   Class 2
              3         3
              5         14
              5         15
              8         15
Q1            11        16        Q1
              13        17
              13        18
              14        18
              15        19
              15        20
              15        20
Median (Q2)   18        22        Median (Q2)
              19        23
              19        23
              19        23
              20        23
Q3            22        25
              23        26        Q3
              25        27
              26        27
              26        28
              30        29
                        30

We can construct the following 5 number summaries for each class:

          Class 1   Class 2
Minimum   3         3
Q1        12        17
Q2        16,5      22
Q3        22,5      26
Maximum   30        30

By comparing the five-number summaries of the test scores for the two different classes, we can make deductions about the performance of the students in each of the classes.

Although the minimum and maximum of the test scores for both classes are the same, the quartiles of the test scores for Class 2 are much higher than the quartiles of the test scores for Class 1. This means that the scores in Class 1 contain more “low” scores than Class 2. In other words, the learners in Class 1 generally scored lower marks than the learners in Class 2.

Similarly, Q3 is higher for Class 2 than for Class 1. This means that a greater percentage of students in Class 2 scored “very high marks” than in Class 1.
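A 5 number summary can also be put together with a short Python sketch. To keep the numbers easy to check, the sketch uses the marks (/10) of the two classes from the introduction to Part 6 rather than the /30 results above. The function name `five_number_summary` is hypothetical, and the quartiles are found with the "median of each half" method, which is an assumption; other quartile rules can give slightly different values.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of values."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd number of values the median is left out of both halves.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])


# Marks (/10) from the introduction to Part 6, Student 1 to Student 20.
class_1 = [5, 6, 7, 6, 6, 5, 5, 4, 6, 7, 7, 7, 6, 5, 5, 4, 5, 7, 7, 7]
class_2 = [4, 6, 7, 7, 6, 5, 4, 3, 6, 8, 7, 7, 6, 5, 5, 2, 5, 7, 8, 9]

summary_1 = five_number_summary(class_1)
summary_2 = five_number_summary(class_2)

labels = ["Minimum", "Q1", "Q2", "Q3", "Maximum"]
for label, a, b in zip(labels, summary_1, summary_2):
    print(f"{label:8} Class 1: {a:<5} Class 2: {b}")
```

For these marks the two classes share the same quartiles (5, 6 and 7) but Class 1 runs from 4 to 7 while Class 2 runs from 2 to 9. Placing the summaries side by side makes this kind of comparison quick to read.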
Questions on the Height and Weight Data:

6.2.1 a) Construct a 5 number summary for the weight distribution of the whole group.
      b) Explain what the 5 number summary tells us about the weight distribution of the whole group.

6.2.2 a) Construct separate 5 number summaries for the heights of the females and males.
      b) Compare the 5 number summaries and comment on what they tell us about the heights of the males and females.

6.2.3 a) Construct separate 5 number summaries for the weights of the females and males.
      b) Compare the 5 number summaries and comment on what they tell us about the weights of the males and females.

6.3 Percentiles

Percentiles divide a data set into percentage groupings. For example:

The 50th percentile is the middle (median) of a data set.
The 25th percentile is the value at or below which 25% of the values in the data set lie. The other 75% of the values lie above this percentile. The 25th percentile is the same as Q1.
The 75th percentile is the value at or below which 75% of the values in the data set lie. The other 25% of the values lie above this percentile.
The 10th percentile is the value at or below which 10% of the values in the data set lie.
And so on …

Example:

3   8   9   10   12   14   15   15   17   18   19

25th percentile = 9 (the 3rd value)
50th percentile = 14 (the 6th value)
75th percentile = 17 (the 9th value)

How would we calculate the 10th percentile?
There are 11 numbers in the list. The 10th percentile will lie in a position at 10% of the data set.
So: 10% × 11 = 1,1 ≈ 2 (rounded up; you always round up when calculating percentile positions)
So, the 2nd number in the list, "8", is the 10th percentile.

How would we calculate the 65th percentile?
There are 11 numbers in the list. The 65th percentile will lie in a position at 65% of the data set.
So: 65% × 11 = 7,15 ≈ 8 (rounded up; you always round up when calculating percentile positions)
So, the 8th number in the list, "15", is the 65th percentile.

6.4 Real-Life Applications of Percentiles

6.4.1 The "Road to Health" chart is given on the page below.
      a) Explain what it means that a child has an age-for-weight ratio that lies on the 50th percentile.
      b) Explain what it means that a child has an age-for-weight ratio that lies on the 97th percentile.
      c) What percentile lies at 60% of the 50th percentile?

6.4.2 a) What would be considered to be an average weight for an 18 month old baby?
      b) What would be considered to be an average weight for a 2 year old child?
      c) A 12 month old baby has an age-for-weight ratio that lies on the 97th percentile. Approximately how much does this child weigh?
      d) A 22 month old child has an age-for-weight ratio that lies on the 3rd percentile. Approximately how much does this child weigh?
      e) A 3 month old baby weighs 7,2 kg. According to the Road to Health chart, would this baby be considered to have an above average, average, or below average age-for-weight ratio? Explain.
      f) How much would a 10 month old baby have to weigh to be considered to have an average age-for-weight ratio?
      g) A 14 month old baby weighs 5,5 kg. Could this baby be suffering from malnutrition? Explain your answer.

6.4.3 Can you think of any problems with the "Road to Health" chart?
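The round-up rule for percentile positions from Section 6.3 can be sketched in Python as follows. The function name `percentile_value` is our own, and the sketch assumes the percentile is given as a whole number from 1 to 100 and that the list is already in ascending order. This is the rule used in this workshop; calculators and spreadsheets may use other conventions.

```python
def percentile_value(sorted_values, p):
    """Return the value at the p-th percentile of an ascending list.

    Uses the workshop rule: position = p% x (number of values),
    always rounded up. p is assumed to be a whole number from 1 to 100.
    """
    if not 0 < p <= 100:
        raise ValueError("p must be greater than 0 and at most 100")
    n = len(sorted_values)
    position = (p * n + 99) // 100  # p% of n, rounded up, in whole numbers
    return sorted_values[position - 1]  # positions count from 1, lists from 0


data = [3, 8, 9, 10, 12, 14, 15, 15, 17, 18, 19]
print(percentile_value(data, 10))  # 8  (2nd value)
print(percentile_value(data, 65))  # 15 (8th value)
```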
Figure: "Road to Health" chart (Source: Sanofi Pasteur, Vaccination Record (2008))

6.5 Body Mass Index

6.5.1 Calculating BMI:

A more accurate way of determining the weight status of an individual is to use their Body Mass Index. The body mass index of an individual is calculated using the following formula:

Body Mass Index (BMI) (kg/m²) = weight (kg) ÷ (height (m))²

Example: A man weighs 87 kg and is 1,76 m tall.
BMI = 87 kg ÷ (1,76 m)²
    = 87 kg ÷ 3,0976 m²
    ≈ 28,1 kg/m² (rounded off to one decimal place)

6.5.2 Using BMI to Determine the Weight Status of an Adult:

The BMI of an adult older than 20 years is used to classify weight status according to the following categories:

BMI (kg/m²)           Classification
< 18.5                Underweight
>= 18.5 and < 25      Normal
>= 25 and < 30        Overweight
>= 30                 Obese

So, the man who weighs 87 kg and is 1,76 m tall would be classified as being "Overweight".

Questions on Height and Weight Data:

The table below shows the height, weight, BMI and weight status of the 39 people from whom data was collected.
Person       Height (m)   Weight (kg)   BMI (kg/m²)   Weight status
Person 1     1.61         53            20.4          ____
Person 2     1.7          70            24.2          ____
Person 3     1.58         52.6          21.1          ____
Person 4     1.78         78            ____          ____
Person 5     1.65         91.7          33.7          ____
Person 6     1.85         89.8          26.2          ____
Person 7     1.8          72            22.2          ____
Person 8     1.5          78            34.7          ____
Person 9     1.55         ____          21.6          ____
Person 10    1.68         75            26.6          ____
Person 11    1.65         76            27.9          ____
Person 12    1.76         72            23.2          ____
Person 13    1.66         74            26.9          ____
Person 14    1.56         109           44.8          ____
Person 15    1.59         68            26.9          ____
Person 16    1.64         78.1          ____          ____
Person 17    1.8          72            22.2          ____
Person 18    1.74         69.1          22.8          ____
Person 19    1.76         76            24.5          ____
Person 20    1.62         141           53.7          ____
Person 21    1.73         74.4          24.9          ____
Person 22    1.62         ____          29.5          ____
Person 23    1.6          73.4          28.7          ____
Person 24    1.9          74.9          ____          ____
Person 25    1.2          65            45.1          ____
Person 26    1.65         110           40.4          ____
Person 27    1.67         96.8          ____          ____
Person 28    1.67         70            25.1          ____
Person 29    1.61         82            31.6          ____
Person 30    1.42         103.7         51.4          ____
Person 31    ____         85            24.6          ____
Person 32    1.64         74.1          27.6          ____
Person 33    1.67         63            22.6          ____
Person 34    1.82         77.5          23.4          ____
Person 35    1.73         74.9          25.0          ____
Person 36    1.67         62            22.2          ____
Person 37    ____         64            22.7          ____
Person 38    1.63         74.6          28.1          ____
Person 39    1.75         75            24.5          ____

a) Complete the table by filling in the missing values.

b) Use the frequency table below to summarise the BMI data for the 39 people:

BMI (kg/m²)           Classification   Frequency   % of Total
< 18.5                Underweight      ____        ____
>= 18.5 and < 25      Normal           ____        ____
>= 25 and < 30        Overweight       ____        ____
>= 30                 Obese            ____        ____

c) According to the frequency table, do you think there is a problem with weight amongst the 39 people surveyed? Explain.

d) Calculate the average BMI for men and the average BMI for women and make a deduction about which group is the healthier group.

6.5.3 BMI for Children (2 to 20 years)

For a child, once their BMI has been calculated, it is then mapped on a BMI-for-Age percentile graph (see below).
Figure: BMI-for-Age percentile graph (Source: Centers for Disease Control and Prevention, www.cdc.gov/growthcharts)

The weight status of the child is then classified according to the following criteria:

Weight Status            Percentile Range
Underweight              Less than the 5th percentile
Healthy weight           >= 5th percentile and < 85th percentile
At risk of overweight    >= 85th percentile and < 95th percentile
Overweight               >= 95th percentile

Example: A 9 year old girl weighs 32 kg and is 1,25 m tall.
BMI (kg/m²) = 32 kg ÷ (1,25 m)²
            = 32 kg ÷ 1,5625 m²
            ≈ 20,5 kg/m² (rounded off to one decimal place)

Mapping this BMI value on the BMI-for-Age growth chart places the girl on or just above the 90th percentile. According to the table above, this places her in the "At risk of overweight" category.

Questions:

a) Determine the weight status of a 4 year old girl who weighs 15 kg and is 1 m tall.
b) Determine the weight status of a 15 year old girl who weighs 42 kg and is 1,4 m tall.
c) Determine the weight status of an 18 year old boy who weighs 81 kg and is 1,8 m tall.
d) Determine the weight status of a 10 year old boy who weighs 28 kg and is 1,18 m tall.
e) What is the average BMI for a 12 year old girl?
f) What is the average BMI for a 19 year old girl?
g) If the BMI values for 1 000 girls between the ages of 2 and 20 years were to be calculated, how many of them could we expect to be:
   i) Underweight;
   ii) At risk of overweight;
   iii) Overweight?
h) A girl with a BMI of 18 kg/m² is "At risk of overweight". Provide a range of possible ages for this girl.
i) A 10 year old girl has a "Healthy weight" status. Provide a range of possible BMI values for this girl.
j) A 13 year old girl is 1,55 m tall and weighs 54 kg. How much weight would this girl have to lose in order to have a "Healthy weight" status?
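The BMI formula from Section 6.5.1 and the adult categories from Section 6.5.2 can be sketched in Python as follows. The function names are our own. The sketch assumes that a BMI of exactly 30 counts as "Obese", and it applies only to adults older than 20 years; for children the calculated BMI must still be read off the BMI-for-Age percentile graph.

```python
def bmi(weight_kg, height_m):
    """Body Mass Index in kg/m²: weight divided by height squared."""
    return weight_kg / height_m ** 2


def adult_weight_status(bmi_value):
    """Classify an adult's BMI using the categories in Section 6.5.2.

    Assumption: a BMI of exactly 30 is treated as "Obese".
    """
    if bmi_value < 18.5:
        return "Underweight"
    if bmi_value < 25:
        return "Normal"
    if bmi_value < 30:
        return "Overweight"
    return "Obese"


# The man from the worked example: 87 kg and 1,76 m tall
man_bmi = bmi(87, 1.76)
print(round(man_bmi, 1))             # 28.1
print(adult_weight_status(man_bmi))  # Overweight

# The 9 year old girl: the BMI is calculated the same way, but her
# weight status comes from the percentile graph, not from the adult table
print(round(bmi(32, 1.25), 1))       # 20.5
```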