Introduction to Statistics by 7n2Mpe


           Intro. to Statistics
   What is Statistics?
    • “…a set of procedures and rules…for
      reducing large masses of data to
      manageable proportions and for
      allowing us to draw conclusions from
      those data”
                 Intro. to Statistics
   What can Stats do?
    • Make data more manageable
          Group of numbers:
                               6, 1, 8, 3, 5, 4, 9
          Average is: 36/7 = 5 1/7
          Graphs: [chart with categories 1st Qtr, 2nd Qtr, 3rd Qtr, 4th Qtr]
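
The averaging step above can be checked with a few lines of Python. This is only a minimal sketch: the variable names are illustrative, and `fractions.Fraction` is used so that the result stays exact (36/7) rather than being rounded.

```python
from fractions import Fraction

numbers = [6, 1, 8, 3, 5, 4, 9]  # the group of numbers from the slide

total = sum(numbers)                   # 6 + 1 + 8 + 3 + 5 + 4 + 9 = 36
mean = Fraction(total, len(numbers))   # exact average, 36/7

# Split the improper fraction into a whole part and a remainder: 5 and 1/7
whole, remainder = divmod(mean.numerator, mean.denominator)

print(f"sum = {total}, average = {mean} = {whole} {remainder}/{mean.denominator}")
print(f"as a decimal: {float(mean):.3f}")
```

Seven numbers are reduced to one summary value, which is the sense in which statistics makes data "more manageable".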
                Intro. to Statistics
   What can Stats do?
    • Allow us to draw conclusions from the data
          Variable = Coolness
          Group #1: 6, 1, 8, 3, 5, 4, 9
            • People who take my stats class
            • Average is 5 1/7
          Group #2: 8, 3, 4, 2, 7, 1, 4
            • People who take other people’s stats classes
            • Average is 4 1/7
          What can we conclude from these numbers?
    • Allows us to do this objectively and
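
As a rough sketch of the comparison above, the two group averages can be computed exactly and subtracted. The helper name `mean` is illustrative. Working the sums through gives 36/7 for Group #1 and 29/7 for Group #2.

```python
from fractions import Fraction

group_1 = [6, 1, 8, 3, 5, 4, 9]  # people who take my stats class
group_2 = [8, 3, 4, 2, 7, 1, 4]  # people who take other people's stats classes

def mean(values):
    """Exact arithmetic mean of a list of integers, as a fraction."""
    return Fraction(sum(values), len(values))

mean_1 = mean(group_1)        # 36/7, i.e. 5 1/7
mean_2 = mean(group_2)        # 29/7, i.e. 4 1/7
difference = mean_1 - mean_2  # exactly 1

print(f"Group #1: {mean_1}, Group #2: {mean_2}, difference: {difference}")
```

The difference in averages is a number anyone can recompute. Whether a gap of that size means anything for groups this small is a separate question, and answering it is the job of inferential statistics.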
            Intro. to Statistics
   “Quantitative”
    • Involves measurement
    • Data in numerical form
    • Answers “How much” questions
    • Objective and results in
   “Qualitative”
    • Describes the nature of something
    • Answers “What” or “Of what kind” questions
    • Often evaluative and ambiguous
            Intro. to Statistics
   Qualitative Distinctions:
    • “Good” versus “Bad”
    • “Right” versus “Wrong”
    • “A Lot” versus “A Little”
   Quantitative Distinctions:
    • 5 1/7 versus 4 1/7
    • 25% versus 50%
    • 1 hour versus 24 hours
             Basic Terminology
   Summarizing versus Analyzing
   Descriptive Statistics
   Inferential Statistics
    • Inference from sample to population
    • Inference from statistic to parameter
    • Factors influencing the accuracy of a sample’s
      ability to represent a population:
         Size
         Randomness
         Basic Terminology
• Size –
     Sample of 5 cards from a deck of 52
       • 2 of Clubs, 10 of Diamonds, Jack of Hearts, 5 of
         Clubs, and 7 of Hearts
     What could we conclude about the full deck
      from this sample, without any prior
      knowledge of a deck of cards?
     Compare this to a sample of 51 of the 52 cards:
      what could we conclude from that sample?
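
The effect of sample size can be sketched by counting how much of the deck's structure (suits and ranks) each sample reveals. The deck layout, the helper `summarize`, and the fixed random seed are assumptions made for illustration; the 5-card sample is the one listed above.

```python
import random

SUITS = ["Clubs", "Diamonds", "Hearts", "Spades"]
RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10",
         "Jack", "Queen", "King", "Ace"]
deck = [(rank, suit) for suit in SUITS for rank in RANKS]  # 52 cards

def summarize(sample):
    """Return (number of distinct suits, number of distinct ranks) seen."""
    suits_seen = {suit for _, suit in sample}
    ranks_seen = {rank for rank, _ in sample}
    return len(suits_seen), len(ranks_seen)

# The 5-card sample from the slide
slide_sample = [("2", "Clubs"), ("10", "Diamonds"), ("Jack", "Hearts"),
                ("5", "Clubs"), ("7", "Hearts")]
print("slide sample:", summarize(slide_sample))   # (3, 5): no Spades seen

rng = random.Random(0)          # arbitrary seed so the run is repeatable
large = rng.sample(deck, 51)    # 51 of the 52 cards, drawn at random
print("51 cards:", summarize(large))              # (4, 13): all suits and ranks
```

From five cards we would not even learn that Spades exist. From 51 cards every suit and rank must appear, since removing a single card cannot eliminate a suit or a rank.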
         Basic Terminology
• Randomness –
     Let's use the same 5-card sample again,
      but this time the deck is unshuffled
       • 2 of Clubs, 10 of Clubs, Jack of Clubs, 5 of Clubs,
         and 7 of Clubs
     What would we conclude about the
      characteristics of our population (the deck)
      this time versus when the sample was more
      random (shuffled)?
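
A small sketch of why randomness matters: taking cards off the top of an unshuffled deck versus drawing them at random. It assumes the unshuffled deck is ordered suit by suit with Clubs first, and the helper `suits_in` and the seed are illustrative choices.

```python
import random

SUITS = ["Clubs", "Diamonds", "Hearts", "Spades"]
RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10",
         "Jack", "Queen", "King", "Ace"]

# Assumption: an unshuffled deck is sorted by suit, Clubs first.
deck = [(rank, suit) for suit in SUITS for rank in RANKS]

def suits_in(cards):
    """Sorted list of the distinct suits present in a set of cards."""
    return sorted({suit for _, suit in cards})

top_five = deck[:5]                             # non-random: top of the deck
random_five = random.Random(1).sample(deck, 5)  # random draw, arbitrary seed

print("top of unshuffled deck:", suits_in(top_five))   # ['Clubs']
print("random draw:", suits_in(random_five))
```

Judging only from the non-random sample, we might wrongly conclude that the whole deck consists of Clubs.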
          Basic Terminology
   Most often, the aim of our research
    is not to infer characteristics of a
    population from our sample, but to
    compare two samples
    • E.g., to determine whether a particular
      treatment works, we compare two
      groups or samples, one with the
      treatment and one without
               Basic Terminology
    • We draw conclusions based on how similar the
      two groups are
          If the treated and untreated groups are very similar,
           we cannot declare the treatment much of a success
   Another way of putting this in terms of
    samples and populations is determining whether
    our two groups/samples actually come
    from the same population or from two different
    populations
                Basic Terminology
   Group A (Treated) and Group B (Untreated)
    are sampled from different
    populations/treatment worked:

            Group A: Population of Well People
            Group B: Population of Sick People
          Basic Terminology
   Group A and Group B are sampled from the
    same population/treatment didn’t work:

             Group A and Group B: Population of Sick People
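
One way to make the "same population or two different populations" question concrete is a permutation test. This technique is not named in the slides; it is introduced here only as an illustrative sketch. The two "Coolness" groups from earlier stand in for Groups A and B. If both groups came from one population, the group labels would be arbitrary, so the code tries every way of splitting the 14 pooled scores into two groups of 7 and counts how often the gap in averages is at least as large as the observed one.

```python
from fractions import Fraction
from itertools import combinations

# Stand-in data: the two "Coolness" groups from the earlier slide
group_a = [6, 1, 8, 3, 5, 4, 9]
group_b = [8, 3, 4, 2, 7, 1, 4]

pooled = group_a + group_b
n_a, n_b = len(group_a), len(group_b)
pooled_total = sum(pooled)

# Observed absolute difference between the two group averages
observed = abs(Fraction(sum(group_a), n_a) - Fraction(sum(group_b), n_b))

n_splits = 0
at_least_as_extreme = 0
for indices in combinations(range(len(pooled)), n_a):
    sum_a = sum(pooled[i] for i in indices)   # scores relabeled as "A"
    sum_b = pooled_total - sum_a              # the rest are relabeled "B"
    gap = abs(Fraction(sum_a, n_a) - Fraction(sum_b, n_b))
    n_splits += 1
    if gap >= observed:
        at_least_as_extreme += 1

share = Fraction(at_least_as_extreme, n_splits)
print(f"observed gap: {observed}")
print(f"relabelings with a gap at least that large: "
      f"{at_least_as_extreme} of {n_splits} ({float(share):.3f})")
```

A large share suggests the observed gap is easy to get by chance from a single population. A very small share would point toward two different populations.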
             Basic Terminology
   Quantitative Data
    • Dimensional/Measurement Data versus
      Categorical/Frequency Count Data
         Dimensional
           • When quantities of something are measured on a scale
           • Answers “how much” questions
           • E.g. scores on a test, measures of weight, etc.
       Basic Terminology
   Categorical
     • When numbers of discrete entities have to be counted
          Gender is an example of a discrete entity –
           you can be either male or female, and nothing
           else – speaking of “degree of maleness”
           makes little sense
     • Answers “how many” questions
     • E.g. number of men and women, percentage of
       people with a given hair color
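
A "how many" question can be sketched as a frequency count. The hair-color observations below are made up purely for illustration and are not data from the slides.

```python
from collections import Counter

# Hypothetical observations, invented for this example only
hair_colors = ["brown", "black", "brown", "blond",
               "red", "brown", "black", "blond"]

counts = Counter(hair_colors)   # how many people fall in each category
n = len(hair_colors)

for color, count in counts.most_common():
    print(f"{color}: {count} people ({100 * count / n:.1f}%)")
```

Each person falls into exactly one category, so the data are counts and percentages rather than measurements.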
             Basic Terminology
   A dimensional variable can be
    converted into a categorical one
    • Convert scores on a test (0-100) into
      “Low”, “Medium”, and “High” groups:
      0-33 = Low, 34-66 = Medium, and
      67-100 = High
         The groups are discrete categories (hence
          “categorical”), and you would now count
          how many people fall into each category
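
The conversion described above can be sketched as a small binning function followed by a count. The cut-offs are the ones given in the slide. The function name `categorize` and the list of scores are hypothetical, chosen to include the boundary values.

```python
from collections import Counter

def categorize(score):
    """Map a test score (0-100) to the slide's Low/Medium/High groups."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score <= 33:
        return "Low"      # 0-33
    if score <= 66:
        return "Medium"   # 34-66
    return "High"         # 67-100

# Hypothetical scores, invented for illustration only
scores = [12, 33, 34, 50, 66, 67, 88, 100]

counts = Counter(categorize(s) for s in scores)
print(counts["Low"], counts["Medium"], counts["High"])  # 2 3 3
```

After binning, the original "how much" information is gone: a 34 and a 66 both count simply as "Medium".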
