# 251x9811 2/11/98

```
251y0312 9/26/03                              ECO251 QBA1
FIRST HOUR EXAM
October 1, 2003
Name: _____KEY____________
Social Security Number: _____________________

Part I.   (32 points)

1.   The process of using sample statistics to draw conclusions about true population parameters is
called
a) *statistical inference.
b) the scientific method.
c) sampling.
d) descriptive statistics.

2.   A summary measure that is computed to describe a characteristic of an entire population is called
a) *a parameter.
b) a census.
c) a statistic.
d) the scientific method.

3.   Which of the following is a discrete quantitative variable?
a) the Dow Jones Industrial Average
b) the volume of water released from a dam
c) the distance you drove yesterday
d) *the number of employees of an insurance company

TABLE 1-1
The manager of the customer service division of a major consumer electronics company is interested in determining
whether the customers who have purchased a videocassette recorder made by the company over the past 12 months are
satisfied with their products.

4.   Referring to Table 1-1, suppose the possible responses to the question "Are you happy, indifferent, or
unhappy with the performance per dollar spent on the videocassette recorder?" are recorded as
1 for 'happy,' 2 for 'unhappy' and 3 for 'indifferent.' The result is which kind of random
variable?
a) ratio
b) *nominal
c) interval
d) ordinal

TABLE 2-2
At a meeting of information systems officers for regional offices of a national company, a survey was taken to
determine the number of employees the officers supervise in the operation of their departments, where X is the
number of employees overseen by each information systems officer.
X     f
1     7
2     5
3     11
4     8
5     9

5.   Referring to Table 2-2, how many regional offices are represented in the survey results?
a) 127
b) 5
c) 15
d) *40
Explanation: $n = \sum f = 7 + 5 + 11 + 8 + 9 = 40$.

TABLE 2-5
The following are the durations (in minutes) of a sample of long-distance phone calls made within the continental
United States, reported by one long-distance carrier:

Time (in Minutes)                  Relative Frequency
0 but less than 5                  0.37
5 but less than 10                 0.22
10 but less than 15                0.15
15 but less than 20                0.10
20 but less than 25                0.07
25 but less than 30                0.07
30 but less than 35                0.02

6.   Referring to Table 2-5, if 1,000 calls were randomly sampled, how many calls lasted under 10
minutes?
a) 220
b) 370
c) 410
d) *590
Explanation: The answer is the cumulative relative frequency (Frel) for the 2nd class multiplied by 1000.

class                    f rel   Frel
0 but less than 5        0.37    0.37
5 but less than 10       0.22    0.59
10 but less than 15      0.15    0.74
15 but less than 20      0.10    0.84
20 but less than 25      0.07    0.91
25 but less than 30      0.07    0.98
30 but less than 35      0.02    1.00

7.   If I make a graph of the data in Table 2-5 (assume the table represents a sample of 1000 calls) with
the following x and y coordinates for the first five points: {(0, 0), (5, 370), (10, 590), (15, 740),
(20, 840)}, a one-word name for this type of graph is _ogive_, and the last point on the line could
be (45, _1000_). Explanation: The x values are the upper limits of the classes, starting with the upper
limit of the last empty class (x = 0). The y values are the cumulative frequencies, found by multiplying
the Frel column by 1000. When the graph gets to x = 35, y hits 1000 and stays at 1000 for all subsequent points.

8.   Referring to Table 2-5, what is Frel, the cumulative relative frequency, for calls that lasted under 20 minutes?
a) 0.10
b) 0.76
c) *0.84                        Explanation: 0.37 + 0.22 + 0.15 + 0.10 = 0.84.
d) None of the above – write in the correct answer.

TABLE 2-7
The stem-and-leaf display below contains data on the number of months between the date a civil suit is filed and
when the case is actually adjudicated for 50 cases heard in superior court.
Stem          Leaves
1             234447899
2             22223455678889
3             0011135778
4             02345579
5             112466
6             158

9.  Referring to Table 2-7, the civil suit with the fourth shortest waiting time between when the suit
was filed and when it was adjudicated had a wait of _14__ months. Explanation: The first four
numbers are 12, 13, 14, 14.


10. Eunice computes the following statistics from a sample:
$\frac{n}{(n-1)(n-2)}\sum (x-\bar{x})^3$, $\frac{k_3}{s^3}$, $\frac{\sum (x-\bar{x})^2}{n-1}$, $\frac{3(\text{mean}-\text{mode})}{\text{std. deviation}}$, and
$k_4 = \frac{n^2}{(n-1)(n-2)(n-3)}\left[\frac{n+1}{n}\sum (x-\bar{x})^4 - \frac{3(n-1)^3 s^4}{n^2}\right]$.
She thinks the sample represents a population that is skewed to the right. Which of the statistics would
show skewness and what sign should she expect from them? (No partial credit on this one.)
Answer: Any legitimate measure of skewness would be positive if the population is skewed to the
right. From your formula table, the measures of skewness are:
(i) $k_3 = \frac{n}{(n-1)(n-2)}\sum (x-\bar{x})^3$, skewness;
(ii) $g_1 = \frac{k_3}{s^3}$, relative skewness; and
(iii) $SK = \frac{3(\text{mean}-\text{mode})}{\text{std. deviation}}$, Pearson's measure of skewness.

The other two are $s^2 = \frac{\sum (x-\bar{x})^2}{n-1}$, the sample variance, which is always positive and
measures dispersion, and $k_4 = \frac{n^2}{(n-1)(n-2)(n-3)}\left[\frac{n+1}{n}\sum (x-\bar{x})^4 - \frac{3(n-1)^3 s^4}{n^2}\right]$, the
coefficient of excess (in the outline), which measures kurtosis.

11. In a perfectly symmetrical distribution with one mode,
a) the arithmetic mean equals the median.
b) the median equals the mode.
c) the arithmetic mean equals the mode.
d) *all of the above.
e) none of the above.

12. According to the Bienayme-Chebyshev rule (I called it Chebyshef’s Inequality), at least 93.75% of
all observations in any data set are contained within a distance of how many standard deviations
around the mean?
a) 1
b) 2
c) 3
d) *4
Explanation: If at least 93.75% are ‘in,’ then at most 6.25% are out in the tails. The rule says
that $\frac{1}{k^2}$ is the maximum proportion in the tails, defined as the points below $\mu - k\sigma$ and the points
above $\mu + k\sigma$. If you try out the values here, you will find $\frac{1}{4^2} = \frac{1}{16} = .0625$, so k must be 4.
More directly, you could solve $1 - \frac{1}{k^2} = .9375$ by trying the four values of k that were
given. This is a problem that was done in class.

13. Evaluate the following statements. (i) The median of the values 3.4, 4.7, 1.9, 7.6, and 6.5 is 4.05.
(ii) In a set of numerical data, the value for Q3 can never be smaller than the value for Q1. (iii) In a
set of numerical data, the value for Q2 is always halfway between Q1 and Q3.
a) (i) and (ii) are false.
b) *(i) and (iii) are false.
c) (ii) and (iii) are false.
d) Only one of the statements is false.
e) All of the statements are false.
Explanation: The numbers in order are 1.9, 3.4, 4.7, 6.5, 7.6, so the median is 4.7 and (i) is
wrong. The order of the quartiles is Q1, median, Q3. If all the middle numbers are the same,
Q3 could equal both the median and Q1, but it could never be smaller than Q1, so (ii) is true.
Q2 is the second quartile and it could be any value between Q1 and Q3, depending on what the
numbers are. Only its position in the ordered data is halfway between theirs, not necessarily
its value, so (iii) is false.

14. Which one of the following statements is false?
a) In a sample of size 40, the sample mean is 15. In this case, the sum of all observations in
the sample is $\sum x = 600$.
b) *A population with 200 elements has an arithmetic mean of 10. From this
information, it can be shown that the population standard deviation is 15.
c) The median of a data set with 20 items would be the average of the 10th and 11th items in
the ordered array.
d) The coefficient of variation measures variability in a data set relative to the size of the
arithmetic mean.
e) If every possible group of 10 individuals in the population is equally likely to be chosen
to be in the sample, we must be taking a simple random sample of 10.
f) All of the above statements are false.

15. Which of the following is NOT a measure of central tendency?
a) the arithmetic mean
b) the geometric mean
c) the mode
d) *the interquartile range

16. Which of the following is most sensitive to extreme values?
a) the median
b) the interquartile range
c) *the arithmetic mean
d) the 1st quartile

Part II. (Ng pp 77-79) (8 points)

The data below represent the number of grams of carbohydrates in a serving of breakfast cereal. It is a
sample containing 11 numbers. Note: $\sum x = 217$, $\sum x^2 = 4541$.

{11, 15, 23, 29, 19, 22, 21, 20, 15, 25, 17}
Find:
a) The First Quartile (1.5)
b) The Standard Deviation (2)
c) The Coefficient of variation (1.5)
d) The five-number summary (3)

Solution: a) Put the numbers in order:
$x_1, x_2, \ldots, x_{11}$ = 11, 15, 15, 17, 19, 20, 21, 22, 23, 25, 29.
$n = 11$, so the first quartile is at position $p(n+1) = .25(12) = 3.0$, and $Q1 = x_3 = 15$. Or, if
$a.b = 3.0$, $x_{1-p} = x_{.75} = x_a + .b(x_{a+1} - x_a) = x_3 + 0(x_4 - x_3) = 15 + 0(17 - 15) = 15$.

b) $\bar{x} = \frac{\sum x}{n} = \frac{217}{11} = 19.7273$, so, using the computational formula,
$s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{4541 - 11(19.7273)^2}{10} = \frac{260.17}{10} = 26.017$. $s = \sqrt{26.017} = 5.101$.
c) $C = \frac{\text{std. deviation}}{\text{mean}} = \frac{s}{\bar{x}} = \frac{5.101}{19.7273} = 0.2586$.
d) For the median, position $= p(n+1) = .5(12) = 6.0$, and for the third quartile, position $= p(n+1) = .75(12)
= 9.0$. So $x_{.50} = x_6 = 20$ and $Q3 = x_{.25} = x_9 = 23$. The five-number summary would be {lower bound, Q1,
median, Q3, upper bound} or {11, 15, 20, 23, 29}.

251x0312 9/23/03                                   ECO251 QBA1
FIRST EXAM
October 1, 2003
TAKE HOME SECTION
Name: _____KEY________________
Social Security Number: _________________________

Throughout this exam show your work! Please indicate clearly what sections of the problem you are
answering and what formulas you are using.

Part III. Do all the Following (11 Points) Show your work!

1. My Social Security Number is 265398248. If I use each digit as a frequency in the intervals below, I
get:

Class               Frequency
\$0- 5999                     2
\$6000- 11999                 6
\$12000- 17999                5
\$18000- 23999                3
\$24000- 29999                9
\$30000- 35999                8
\$36000- 41999                2
\$42000- 47999                4
\$48000- 53999                8

Assume that this data represents a sample of rents paid in Chester County.
a. Calculate the Cumulative Frequency (0.5)
b. Calculate the Mean (0.5)
c. Calculate the Median (1)
d. Calculate the Mode (It is possible but unlikely that there is more than one) (0.5)
e. Calculate the Variance (1.5)
f. Calculate the Standard Deviation (1)
g. Calculate the Interquartile Range (1.5)
h. Calculate a Statistic showing Skewness and Interpret it (1.5)
i. Make a frequency polygon of the Data (Neatness Counts!) (1)
j. Extra credit: Put a (horizontal) box plot below the histogram using the same scale. (1)

Replace my Social Security number with your own in the frequency column. To make the problem easier, you may
replace all zeros in your new frequency column with 10s.

Solution: x is the midpoint of the class. Our convention is to use the midpoint of 0 to 2, not 1.99999. Note
also that the midpoints and class limits have been divided by 1000. Most numbers should be multiplied by
1000, the variance should be multiplied by 1,000,000 and $k_3$ by 1,000,000,000.
        class     f    F    x     fx     fx^2      fx^3    x-xbar  f(x-xbar)  f(x-xbar)^2  f(x-xbar)^3
A    0- 5.999     2    2    3      6       18        54  -26.1702    -52.340      1369.76     -35846.9
B    6-11.999     6    8    9     54      486      4374  -20.1702   -121.021      2441.02     -49236.0
C   12-17.999     5   13   15     75     1125     16875  -14.1702    -70.851      1003.97     -14226.5
D   18-23.999     3   16   21     63     1323     27783   -8.1702    -24.511       200.26      -1636.1
E   24-29.999     9   25   27    243     6561    177147   -2.1702    -19.532        42.39        -92.0
F   30-35.999     8   33   33    264     8712    287496    3.8298     30.638       117.34        449.4
G   36-41.999     2   35   39     78     3042    118638    9.8298     19.660       193.25       1899.6
H   42-47.999     4   39   45    180     8100    364500   15.8298     63.319      1002.33      15866.6
I   48-53.999     8   47   51    408    20808   1061208   21.8298    174.638      3812.32      83222.1
Total            47             1371    50175   2058075                0.000     10182.64        400.2

$n = \sum f = 47$, $\sum fx = 1371$, $\sum fx^2 = 50175$, $\sum fx^3 = 2058075$, $\sum f(x-\bar{x}) = 0$,
$\sum f(x-\bar{x})^2 = 10182.64$, and $\sum f(x-\bar{x})^3 = 400.2$. Note that, to be reasonable, the mean, median and
quartiles must fall between 0 and 54.
a. Calculate the Cumulative Frequency (1): (See above) The cumulative frequency is the whole F column.

b. Calculate the Mean (1): $\bar{x} = \frac{\sum fx}{n} = \frac{1371}{47} = 29.1702$

c. Calculate the Median (2): position $= p(n+1) = .5(48) = 24$. This is above $F = 16$ and below $F = 25$, so
the interval is E, 24-29.999 in thousands. $x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right]w$, so
$x_{1-.5} = x_{.5} = 24 + \left[\frac{.5(47) - 16}{9}\right]6 = 24 + 0.83333(6) = 29.0000$
d. Calculate the Mode (1) The mode is the midpoint of the largest group. Since 9 is the largest frequency,
the modal group is E, 24 to 29.999 and the mode is 27 (in thousands).

e. Calculate the Variance (3): $s^2 = \frac{\sum fx^2 - n\bar{x}^2}{n-1} = \frac{50175 - 47(29.1702)^2}{46} = \frac{10182.673}{46} = 221.3625$ or
$s^2 = \frac{\sum f(x-\bar{x})^2}{n-1} = \frac{10182.64}{46} = 221.3617$. The computer got 221.362. (in millions)
f. Calculate the Standard Deviation (2): $s = \sqrt{221.3625} = 14.8783$ or $s = \sqrt{221.3617} = 14.8782$ (in
thousands)
g. Calculate the Interquartile Range (3): First Quartile: position $= p(n+1) = .25(48) = 12$. This is above
$F = 8$ and below $F = 13$, so the interval is C, 12-17.999. $x_{1-p} = L_p + \left[\frac{pN - F}{f_p}\right]w$ gives us, in thousands,
$Q1 = x_{1-.25} = x_{.75} = 12 + \left[\frac{.25(47) - 8}{5}\right]6 = 16.500$.
Third Quartile: position $= p(n+1) = .75(48) = 36$. This is above $F = 35$ and below $F = 39$, so the interval
is H, 42-47.999. $Q3 = x_{1-.75} = x_{.25} = 42 + \left[\frac{.75(47) - 35}{4}\right]6 = 42.375$.
$IQR = Q3 - Q1 = 42.375 - 16.500 = 25.875$ (in thousands).
h. Calculate a Statistic showing Skewness and interpret it (3):
$k_3 = \frac{n}{(n-1)(n-2)}\left[\sum fx^3 - 3\bar{x}\sum fx^2 + 2n\bar{x}^3\right] = \frac{47}{(46)(45)}\left[2058075 - 3(29.1702)(50175) + 2(47)(29.1702)^3\right]$
$= 0.0227053[2058075 - 4390844.4 + 2333168.3] = 0.0227053(398.9) = 9.057$
or $k_3 = \frac{n}{(n-1)(n-2)}\sum f(x-\bar{x})^3 = \frac{47}{(46)(45)}(400.2) = 9.087$ (The computer gets 9.0849. The
three values differ slightly because $\bar{x}$ was rounded.)
or $g_1 = \frac{k_3}{s^3} = \frac{9.085}{(14.8782)^3} = .00276$
or Pearson's Measure of Skewness $SK = \frac{3(\text{mean} - \text{mode})}{\text{std. deviation}} = \frac{3(29.1702 - 27)}{14.8782} = 0.4376$.
Because of the positive sign, the measures imply skewness to the right.
i. A frequency polygon is a simple line graph with frequency on the y-axis and the numbers 0-54 (thousand)
on the x-axis. Since class A has a frequency of 2 plotted at x = 3 and the class width is 6, it should really
start at x = -3 and y = 0. You should at least show the line falling across the y-axis. Since the last
non-empty class is 48-53.999, with its frequency plotted at x = 51, there should be a zero at x = 57.
j. The box plot should show the median and the quartiles. (See text)

2. My Social Security Number is 265398248. If I write it in clumps of 2 numbers and add 100 to the end, I
get:
26, 53, 98, 24, 8, 100.
Write your social security number the same way, so that you have a list of six numbers. Note: If any of these
six numbers is a zero, change it to a one. For these six numbers, compute the a) Geometric Mean, b)
Harmonic Mean, c) Root-Mean-Square (1 point each). Label each clearly. If you wish, d) compute the
geometric mean using natural or base 10 logarithms (1 point extra credit each).

Solution: Note that $\sum x = 309$. This is not used in any of the following calculations and there is no reason
why you should have computed it!
a) The Geometric Mean.
$x_g = (x_1 \cdot x_2 \cdot x_3 \cdots x_n)^{1/n} = \left(\prod x\right)^{1/n} = [(26)(53)(98)(24)(8)(100)]^{1/6} = \sqrt[6]{2592844800} = (2592844800)^{1/6}$
$= (2592844800)^{0.16667} = 37.0648$.

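b) The Harmonic Mean.
$\frac{1}{x_h} = \frac{1}{n}\sum \frac{1}{x} = \frac{1}{6}\left(\frac{1}{26} + \frac{1}{53} + \frac{1}{98} + \frac{1}{24} + \frac{1}{8} + \frac{1}{100}\right)$
$= \frac{1}{6}(0.0384615 + 0.0188679 + 0.0102041 + 0.0416667 + 0.125 + 0.01) = \frac{1}{6}(0.2442002) = 0.0407000$.
So $x_h = \frac{1}{\frac{1}{n}\sum \frac{1}{x}} = \frac{1}{0.0407000} = 24.5700$.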
c) The Root-Mean-Square.
$x_{rms}^2 = \frac{1}{n}\sum x^2 = \frac{1}{6}\left(26^2 + 53^2 + 98^2 + 24^2 + 8^2 + 100^2\right) = \frac{1}{6}(676 + 2809 + 9604 + 576 + 64 + 10000)$
$= \frac{1}{6}(23729) = 3954.83$. So $x_{rms} = \sqrt{\frac{1}{n}\sum x^2} = \sqrt{3954.83} = 62.8875$.

d) (i) $\ln(x_g) = \frac{1}{n}\sum \ln(x) = \frac{1}{6}(\ln 26 + \ln 53 + \ln 98 + \ln 24 + \ln 8 + \ln 100)$
$= \frac{1}{6}(3.25809 + 3.97029 + 4.58497 + 3.17805 + 2.07944 + 4.60517) = \frac{1}{6}(21.67601) = 3.61267$.
So $x_g = e^{3.61267} = 37.0649$.
(ii) $\log(x_g) = \frac{1}{n}\sum \log(x) = \frac{1}{6}(\log 26 + \log 53 + \log 98 + \log 24 + \log 8 + \log 100)$
$= \frac{1}{6}(1.41497 + 1.72428 + 1.99123 + 1.38021 + 0.90309 + 2.00000) = \frac{1}{6}(9.41378) = 1.56896$.
So $x_g = 10^{1.56896} = 37.0649$.

Notice that the original numbers and all the means are between 8 and 100.

```
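The Part II calculations can be checked numerically. The following is a minimal Python sketch and is not part of the original exam key; the names `value_at_position` and `summarize` are illustrative assumptions. It follows the key's position rule $p(n+1)$ with linear interpolation between neighboring ordered values, which is only one of several quartile conventions in use, so other software may report slightly different quartiles.

```python
import math

# Carbohydrate data from Part II (grams per serving).
data = [11, 15, 23, 29, 19, 22, 21, 20, 15, 25, 17]


def value_at_position(sorted_x, pos):
    """Return the value at 1-based position a.b as x_a + .b * (x_{a+1} - x_a)."""
    a = int(pos)   # whole part of the position
    b = pos - a    # fractional part of the position
    if a < 1:
        return sorted_x[0]
    if a >= len(sorted_x):
        return sorted_x[-1]
    return sorted_x[a - 1] + b * (sorted_x[a] - sorted_x[a - 1])


def summarize(x):
    """Mean, sample standard deviation, coefficient of variation, five-number summary."""
    xs = sorted(x)
    n = len(xs)
    mean = sum(xs) / n
    # Computational formula: s^2 = (sum of x^2 - n * mean^2) / (n - 1)
    variance = (sum(v * v for v in xs) - n * mean ** 2) / (n - 1)
    s = math.sqrt(variance)
    # Quartile and median positions are p(n + 1) for p = .25, .50, .75.
    q1, median, q3 = (value_at_position(xs, p * (n + 1)) for p in (0.25, 0.50, 0.75))
    return {
        "mean": mean,
        "s": s,
        "cv": s / mean,
        "five_number": (xs[0], q1, median, q3, xs[-1]),
    }


result = summarize(data)
print(result)
```

Because the script keeps full precision for the mean, its standard deviation can differ from the hand calculation in the last printed digit.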
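The grouped-data formulas in Part III, question 1, and the three means in question 2 also lend themselves to a short numerical check. The sketch below is an illustration only and does not come from the exam key: the function names are assumptions, the class is located by the position $p(n+1)$, and the fractile is then computed as $L_p + \left[\frac{pN - F}{f_p}\right]w$ as in the solution text. All grouped values are in thousands.

```python
import math

# Class lower limits (thousands), common class width, and frequencies from Part III, question 1.
LOWER = [0, 6, 12, 18, 24, 30, 36, 42, 48]
WIDTH = 6
FREQ = [2, 6, 5, 3, 9, 8, 2, 4, 8]

# The six numbers from Part III, question 2.
X = [26, 53, 98, 24, 8, 100]


def grouped_stats(lower, width, freq):
    """Return n, mean, sample variance and k3 using class midpoints."""
    n = sum(freq)
    mid = [lo + width / 2 for lo in lower]
    mean = sum(f * x for f, x in zip(freq, mid)) / n
    variance = sum(f * (x - mean) ** 2 for f, x in zip(freq, mid)) / (n - 1)
    k3 = n / ((n - 1) * (n - 2)) * sum(f * (x - mean) ** 3 for f, x in zip(freq, mid))
    return n, mean, variance, k3


def grouped_fractile(lower, width, freq, p):
    """Find the class holding position p(n + 1), then apply L_p + ((p*n - F) / f_p) * w."""
    n = sum(freq)
    position = p * (n + 1)
    cumulative = 0  # F: cumulative frequency below the current class
    for lo, f in zip(lower, freq):
        if cumulative + f >= position:
            return lo + (p * n - cumulative) / f * width
        cumulative += f
    return lower[-1] + width


def geometric_mean(x):
    """exp of the average natural log, equivalent to the n-th root of the product."""
    return math.exp(sum(math.log(v) for v in x) / len(x))


def harmonic_mean(x):
    """Reciprocal of the average reciprocal."""
    return len(x) / sum(1 / v for v in x)


def root_mean_square(x):
    """Square root of the average squared value."""
    return math.sqrt(sum(v * v for v in x) / len(x))


n, mean, variance, k3 = grouped_stats(LOWER, WIDTH, FREQ)
print("n, mean, variance, k3:", n, mean, variance, k3)
print("Q1, median, Q3:", [grouped_fractile(LOWER, WIDTH, FREQ, p) for p in (0.25, 0.5, 0.75)])
print("geometric, harmonic, rms:", geometric_mean(X), harmonic_mean(X), root_mean_square(X))
```

For any set of positive numbers that are not all equal, the output should satisfy harmonic mean < geometric mean < arithmetic mean < root-mean-square, which gives a quick sanity check on hand calculations.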