CATEGORY: Statistics POSTED ON: 1/9/2013
Lec-2 Freq. Distt. ( Cont.&Disct.Data).ppt
CLASSIFICATION AND TABULATION
Tariq Mahmood Bajwa
University of Veterinary & Animal Sciences, Lahore.

FREQUENCY DISTRIBUTION
A frequency distribution is a tabular arrangement of data in which the items are arranged into classes or groups and the number of items falling in each class is stated. The number of observations falling in a particular class is referred to as the class frequency, or simply the frequency, and is denoted by "f". In a frequency distribution, all the values falling in a class are assumed to be equal to the midpoint of that class. Data presented in the form of a frequency distribution are also called grouped data. Data which have not been arranged in a systematic order are called raw data or ungrouped data.

CLASS LIMITS
The class limits are defined as the numbers or values of the variable which are used to separate two classes. The smaller number is called the lower class limit and the larger number is called the upper class limit. For discrete variables, class boundaries are the same as the class limits. Sometimes classes are taken as 20--25, 25--30, etc. In such a case, these class limits mean "20 but less than 25", "25 but less than 30", etc.

CLASS BOUNDARIES
The class boundaries are the precise numbers which separate one class from another. The main purpose of defining class boundaries is to remove any difficulty in knowing the class to which a particular value should be assigned. The class boundary is located midway between the upper limit of a class and the lower limit of the next higher class. For example, for the classes 86--90 and 91--95 the boundary is (90 + 91)/2 = 90.5.

CLASS MARKS OR MIDPOINTS
The class mark or midpoint is the value which divides a class into two equal parts. It is obtained by dividing the sum of the lower and upper class limits (or class boundaries) of a class by 2.

CLASS INTERVAL
The class interval is the length of a class. It is obtained in any one of the following ways:
I. The difference between the upper class boundary and the lower class boundary (not the difference between the class limits).
II.
The difference between either two successive lower class limits or two successive upper class limits.
III. The difference between two successive midpoints.
A uniform class interval is usually denoted by "h".

CONSTRUCTION OF A FREQUENCY DISTRIBUTION

Decide the number of classes
There is no hard and fast rule for deciding on the number of classes. Statistical experience tells us that no fewer than 5 and no more than 20 classes are generally used. The number of classes can be determined by the formula K = 1 + 3.3 log(n), where K denotes the number of classes, n denotes the total number of observations, and log is the logarithm to base 10.

Determine the range of variation of the data
The difference between the largest and smallest values in the data is called the range of the data, i.e. R = largest observation - smallest observation, where R denotes the range of the data.

Determine the approximate size of the class interval
The size of the class interval is determined by dividing the range of the data by the number of classes, i.e. h = R/K, where h denotes the size of the class interval. In case of a fractional result, the next higher whole number is usually taken as the size of the class interval.

Decide where to locate the class limits
The lower class limit of the first class is started just below the smallest value in the data. The class interval is then added to get the lower class limit of the next class, and this process is repeated until the lower class limit of the last class is reached.

Distribute the data into appropriate classes
Take each observation in turn and mark a vertical bar "I" (tally) against the class to which it belongs.

Example
The following data are the final plant heights (cm) of thirty wheat plants.
Construct a frequency distribution.

87 91 89 88 89 91 87 92 90 98
95 97 96 100 101 96 98 99 98 100
102 99 101 105 103 107 105 106 107 112

Step-1: Calculate the range
R = Largest observation - Smallest observation = 112 - 87 = 25

Step-2: Number of classes
The number of classes is determined by the formula
K = 1 + 3.3 log(n) = 1 + 3.3 log(30) = 1 + 3.3(1.4771) = 5.87, which is rounded to 6

Step-3: Size of class interval
h = R/K = 25/6 = 4.17, which is taken as 5 (the next higher whole number)

Step-4: Choose the lowest value
Minimum value = 87, so start the first class at 86.

Step-5: Calculate the midpoints
Midpoint = average of the lower and upper class limits, e.g. (86 + 90)/2 = 88.

Step-6: Convert the class limits to class boundaries
Class boundaries = midpoint ± h/2, e.g. 88 ± 2.5 gives 85.5 and 90.5.

Step-7: Assign the observations to the classes
Starting from the first observation, assign each observation to the class to which it belongs. A tally mark is made in the tally column against that class. Tally marks are grouped in fives for easier counting.

Class Limits  Class Boundaries  Midpoint  Entries                                   Tally        f   c.f.
86--90        85.5--90.5        88        87, 89, 88, 89, 87, 90                    IIIII I      6   6
91--95        90.5--95.5        93        91, 91, 92, 95                            IIII         4   10
96--100       95.5--100.5       98        98, 97, 96, 100, 96, 98, 99, 98, 100, 99  IIIII IIIII  10  20
101--105      100.5--105.5      103       101, 102, 101, 105, 103, 105              IIIII I      6   26
106--110      105.5--110.5      108       107, 106, 107                             III          3   29
111--115      110.5--115.5      113       112                                       I            1   30
Total                                                                                            30

Frequency distribution of the height of plants.

FREQUENCY DISTRIBUTION TABLE

Class Limits  Class Boundaries  Midpoint  Tally        Frequency (f)  C.F.
86--90        85.5--90.5        88        IIIII I      6              6
91--95        90.5--95.5        93        IIII         4              10
96--100       95.5--100.5       98        IIIII IIIII  10             20
101--105      100.5--105.5      103       IIIII I      6              26
106--110      105.5--110.5      108       III          3              29
111--115      110.5--115.5      113       I            1              30
Total                                                  30

Example
Suppose we walk into the nursery class of a school and count the number of books and copies that the students have in their bags. Suppose the number
of books and copies are 3, 5, 4, 4, 5, 5, 5, 6, 5, 5, 8, 8, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 7, 9, 9, 9, 7, 7, 7, 9, 9, 9, and so on.

Representation of Data in a Discrete Frequency Distribution

X      Tally              Frequency
3      |                  1
4      |||                3
5      ||||| ||||         9
6      ||||| ||||| |||    13
7      ||||| |||||        10
8      |||                3
9      ||||| |            6
Total                     45

Relative Frequency Distribution

X      Frequency  Relative Frequency (%)
3      1          1/45 x 100 = 2.22%
4      3          3/45 x 100 = 6.67%
5      9          9/45 x 100 = 20%
6      13         13/45 x 100 = 28.89%
7      10         10/45 x 100 = 22.22%
8      3          3/45 x 100 = 6.67%
9      6          6/45 x 100 = 13.33%
Total  45

Cumulative Frequency Distribution

X      Frequency  Cumulative Frequency
3      1          1
4      3          1 + 3 = 4
5      9          4 + 9 = 13
6      13         13 + 13 = 26
7      10         26 + 10 = 36
8      3          36 + 3 = 39
9      6          39 + 6 = 45
Total  45

Frequency
The number of values falling in a particular category.

Cumulative frequency
The sum of the frequency of a class and the frequencies of all the classes above it in the table (that is, all preceding classes).

Notations
X, Y, Z, n, N, ∑ (summation)
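
As a rough sketch, the construction steps of the plant-height example and the relative and cumulative frequency tables of the books example could be reproduced in Python as follows. The function names grouped_frequency_table and relative_and_cumulative and the start argument are illustrative names, not part of the lecture. Class boundaries are taken as class limits ± 0.5, which assumes whole-number data as in the example.

```python
import math
from itertools import accumulate

# Plant heights (cm) from the wheat example in the text.
heights = [87, 91, 89, 88, 89, 91, 87, 92, 90, 98,
           95, 97, 96, 100, 101, 96, 98, 99, 98, 100,
           102, 99, 101, 105, 103, 107, 105, 106, 107, 112]


def grouped_frequency_table(data, start):
    """Build a grouped frequency table following the steps in the text.

    `start` is the lower class limit of the first class (86 in the example).
    Choosing it is a judgment call, so the caller passes it in.
    Assumes whole-number data, so boundaries are limits +/- 0.5.
    """
    n = len(data)
    data_range = max(data) - min(data)            # R = largest - smallest
    k = round(1 + 3.3 * math.log10(n))            # K = 1 + 3.3 log(n)
    h = max(1, math.ceil(data_range / k))         # next higher whole number
    rows = []
    cumulative = 0
    lower = start
    # Keep adding classes until the largest observation is covered;
    # this can give K or K + 1 classes depending on `start`.
    while lower <= max(data):
        upper = lower + h - 1                     # upper class limit
        f = sum(lower <= x <= upper for x in data)
        cumulative += f
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
            "midpoint": (lower + upper) / 2,
            "f": f,
            "cf": cumulative,
        })
        lower += h
    return k, h, rows


def relative_and_cumulative(freq):
    """Return (x, f, relative frequency in %, cumulative f) rows."""
    total = sum(freq.values())
    xs = sorted(freq)
    cumulative = accumulate(freq[x] for x in xs)
    return [(x, freq[x], round(100 * freq[x] / total, 2), c)
            for x, c in zip(xs, cumulative)]


k, h, rows = grouped_frequency_table(heights, start=86)
print("K =", k, "h =", h)
for r in rows:
    print(r["limits"], r["boundaries"], r["midpoint"], r["f"], r["cf"])

# Frequencies taken from the discrete frequency table in the text.
book_counts = {3: 1, 4: 3, 5: 9, 6: 13, 7: 10, 8: 3, 9: 6}
table = relative_and_cumulative(book_counts)
for row in table:
    print(row)
```

The start value is left to the caller because the lecture chooses it by inspection (86, just below the minimum of 87); a different start would shift every class and could change the frequencies.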