PowerPoint Presentation by S4x3HoR


Course Topics
   Simple statistical methods for data analysis using Excel.
    • descriptive statistics,
    • an introduction to statistical inference, and
    • linear regression models.
   Excel workbooks for computing elementary statistics
    using the Data Analysis toolkit.
   Transferring digital information (graphs and tables) into
    Word documents, developing presentations in Power
   Publishing documents on the web
Optional Texts
   Statistics with Microsoft Excel by B.J. Dretzke
    (Recommended for students that are not
    familiar with Excel)
   Introduction to the Practice of Statistics, by
    David S. Moore and George P. McCabe
   Elementary Statistics (2002), by M. F. Triola.
   The Basic Practice of Statistics (2000), by D.S.
Useful links
   Surfstat: an online text in introductory
   Statistics at Square One:
   The DePaul University library offers a number
    of good books on Excel using books 24X7: IT
Getting ready for the class
•   Open Excel
•   Check that the Tools menu contains the Data Analysis
•   If not, use Tools|Add-Ins… and check the box labeled
    Analysis ToolPak
Exploratory Data Analysis
The goal of data analysis is to gain information from the data.
Long listings of data are of little value.
Statistical methods help us extract that information.

Exploratory data analysis: a set of methods to display and summarize the data.

Data on just one variable: the distribution of the observations is analyzed by:

I.    Displaying the data in a graph that shows overall patterns and unusual
      observations (stem-and-leaf plot, bar chart, histogram, box plot, density

II.   Computing descriptive statistics that summarize specific aspects of the
      data (center and spread).
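As an illustration of step II, here is a minimal Python sketch of common measures of center and spread. The sample values are hypothetical and are not course data; the course itself computes these with Excel's Data Analysis toolkit.

```python
import statistics

# Hypothetical sample of one quantitative variable (not from the course data).
sample = [2, 4, 4, 5, 7, 9]

# Measures of center.
mean = statistics.mean(sample)
median = statistics.median(sample)

# Measures of spread.
stdev = statistics.stdev(sample)        # sample standard deviation (n - 1 denominator)
data_range = max(sample) - min(sample)  # largest minus smallest observation

print(f"mean={mean:.2f}, median={median}, stdev={stdev:.2f}, range={data_range}")
```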
Observed variables
Data contain information about a group of individuals or subjects.

A variable is a characteristic of an observed individual which takes
different values for different individuals:
     Quantitative variable (continuous) takes numerical values.
     Ex.: Height, Weight, Age, Income, Measurements
     Qualitative/Categorical variable classifies an individual into
     categories or groups.
     Ex. : Sex, Religion, Occupation, Age (in classes e.g. 10-20, 20-30, 30-
The distribution of a variable tells us what values it takes and how often it
takes those values.

Different statistical methods are used to analyze quantitative or categorical
Graphs for categorical
The values of a categorical variable are labels.
The distribution of a categorical variable lists the count or
  percentage of individuals in each category.

         Wireless surfers by Age (bar chart and pie chart)

         Age group     Count     Percentage
         18-34           212        53%
         35-54           168        42%
         55>              20         5%

 A sample of 400 wireless internet users.
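The percentages in the charts follow directly from the counts. A minimal Python sketch of that calculation (the variable names are illustrative and not part of the course materials):

```python
# Counts of wireless surfers by age group, from the sample of 400 users.
counts = {"18-34": 212, "35-54": 168, "55>": 20}

total = sum(counts.values())

# Distribution of the categorical variable: percentage of individuals per category.
percentages = {group: 100 * n / total for group, n in counts.items()}

for group, pct in percentages.items():
    print(f"{group}: {counts[group]} ({pct:.0f}%)")
```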
Another Example
              Wireless internet users
           Male                 288 (72%)
          Female                112 (28%)
           Total               400 (100%)
           Bar chart: Wireless surfers by gender (Male 72%, Female 28%)
Assigning Categories
Example: On the morning of April 10, 1912, the Titanic
sailed from the port of Southampton (UK), bound for New York.
Altogether there were 2,201 passengers and crew
members on board. The table below classifies them by
class, sex, and whether they survived the disaster.

                        Survived               Dead
                     Male    Female       Male    Female
   First class         62       141        118         4
   Second class        25        93        154        13
   Third class         88        90        422       106
   Crew members       192        20        670         3
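One way to read a table like this is to collapse it into totals and survival percentages per category. The following Python sketch does that with the counts above; the choice to summarize by class is an illustration, not something prescribed by the slides.

```python
# Titanic counts from the table above:
# (survived male, survived female, dead male, dead female)
table = {
    "First class":  (62, 141, 118, 4),
    "Second class": (25, 93, 154, 13),
    "Third class":  (88, 90, 422, 106),
    "Crew members": (192, 20, 670, 3),
}

# Everyone on board falls into exactly one cell of the table.
total_on_board = sum(sum(row) for row in table.values())

# Percentage of each category that survived.
survival_rate = {}
for group, (surv_m, surv_f, dead_m, dead_f) in table.items():
    survived = surv_m + surv_f
    group_total = survived + dead_m + dead_f
    survival_rate[group] = 100 * survived / group_total

print("Total on board:", total_on_board)
for group, rate in survival_rate.items():
    print(f"{group}: {rate:.1f}% survived")
```

The cell counts add up to the 2,201 people mentioned in the example, which is a quick check that every individual has been assigned to exactly one category.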
The Histogram
Example: CEO salaries
Forbes magazine published data on the best small firms in 1993. These were firms with
    annual sales of more than $5 million and less than $350 million. Firms were ranked by
    five-year average return on investment. The data extracted are the age and annual salary
    of the chief executive officer for the first 59 ranked firms.

   Salary of chief executive officer (including
   bonuses), in $thousands

   145 621 262 208 362 424 339 736 291
   58 498 643 390 332 750 368 659 234
   396 300 343 536 543 217 298 1103 406
   254 862 204 206 250 21 298 350 800
   726 370 536 291 808 543 149 350 242
   198 213 296 317 482 155 802 200 282
   573 388 250 396 572
Drawing a histogram
1.      Construct a distribution table:
     i.    Define class intervals or bins (choose intervals of equal width!)
     ii.   Compute the percentage of observations in each interval
     iii.  End-point convention: the left endpoint of the interval is included,
           and the right endpoint is excluded, i.e. [a,b)
2.      Draw the horizontal axis.
3.      Construct the blocks:
       Height of block = percentage of observations in the interval!

     The total area under a histogram must be 100%
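
The steps above can be sketched in Python using the CEO salary data. The bin width of 100 ($thousands) is an assumption made for this illustration; the slides do not specify one. The text bars stand in for the blocks of the histogram.

```python
from collections import Counter

# CEO salaries (including bonuses), in $thousands, for the 59 firms.
salaries = [
    145, 621, 262, 208, 362, 424, 339, 736, 291,
    58, 498, 643, 390, 332, 750, 368, 659, 234,
    396, 300, 343, 536, 543, 217, 298, 1103, 406,
    254, 862, 204, 206, 250, 21, 298, 350, 800,
    726, 370, 536, 291, 808, 543, 149, 350, 242,
    198, 213, 296, 317, 482, 155, 802, 200, 282,
    573, 388, 250, 396, 572,
]

BIN_WIDTH = 100  # assumed equal class width, in $thousands

# Step 1: distribution table. Floor division implements the [a, b) convention:
# a value equal to a left endpoint lands in the interval that starts there.
bin_counts = Counter((s // BIN_WIDTH) * BIN_WIDTH for s in salaries)
n = len(salaries)

# Steps 2-3: one row per interval along the axis, block height = percentage.
last_left = (max(salaries) // BIN_WIDTH) * BIN_WIDTH
total_pct = 0.0
for left in range(0, last_left + BIN_WIDTH, BIN_WIDTH):
    pct = 100 * bin_counts[left] / n
    total_pct += pct
    print(f"[{left:4d}, {left + BIN_WIDTH:4d}): {pct:5.1f}%  " + "#" * round(pct))

print(f"Total: {total_pct:.1f}%")
```

With equal-width intervals, the percentages across all blocks sum to 100%, which matches the requirement on the total area.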
