					         AP Course Audit: Manlius Pebble Hill (AP Statistics)

         We will cover how to use the graphing calculator each time we encounter a feature that the
         calculator supports. These features include the following: calculating the mean, calculating the
         standard deviation, calculating the median, creating scatter plots, creating box plots, linear regression,
         non-linear regression, 1-sample t and z tests, 2-sample t and z tests, z confidence intervals, t
         confidence intervals, chi-squared tests for goodness of fit, and chi-squared tests for homogeneity and
         independence.
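
         The first few calculator features in this list can also be cross-checked in software. The following
         is a minimal sketch in Python, assuming only the standard-library statistics module; the data values
         are hypothetical and invented purely for illustration, not taken from any class activity.

```python
import statistics

# Hypothetical data set, invented for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)        # arithmetic mean
median = statistics.median(data)    # middle value of the sorted data
sample_sd = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
pop_sd = statistics.pstdev(data)    # population standard deviation (divides by n)

print("mean:", mean)
print("median:", median)
print("sample SD:", round(sample_sd, 3))
print("population SD:", round(pop_sd, 3))
```

         The sketch reports both the sample and the population standard deviation because the two differ
         in their divisor, and knowing which one a calculator or program displays avoids a common source of
         confusion.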

         Assignments are designed so that each student uses a unique topic or ends up with a unique data set
         upon which they do an independent calculation. This fosters independence of thought, builds confidence,
         and keeps students honest about what they personally understand.

         Each day we cover an AP problem in class from one of the released exams. The main goal is a complete
         understanding of the problem and how it relates to the day’s topic.

         All graphical displays that a student creates should be done with the help of Excel or a similar
         graphing tool.



         Projects:

         Each student must write an article for the school newspaper. Before they can submit an article they
         must design a survey or experiment, create a sampling plan, gather the data, analyze the data and come
         to a conclusion given the data set. Then they have to write an article summarizing what they found
         along with at least one graphical aid for any reader of that article.




Day      Topic                 Description                  Activity                  Assignment            Textbook
                                                                                                            HW
    QI           Quarter I: The Data Analysis Process, Collecting Data & Methods for Describing Data
1        Variability in      We will discuss the       We will measure the They will
         Inferential         concept of what is a      salinity of normal    measure the
         Statistics          ‘typical’ range of values drinking water at     lengths of two
                             for a measurable          our school and then different types
                             quantity. Then we will    measure the salinity of leaves from
                             discuss how knowing    of that same water       two different
                             that range of values and after a ‘toxic spill’. trees. Then
                             how ‘frequently’ those    Then we will try to   they will try to
                             measurements occur        determine whether     give a criterion
                             can help to make a        the drinking water    as to what
                             decision.                 is contaminated.      range of values
                                                                             distinguish one
                                                                          tree from
                                                                          another.
2   Types of Data             Frequency          We create a survey      They pick two      Sec 1.4 #
    and Simple                 Distribution for   that we could use to    of the United      1.9, 1.11,
    Graphical                  Categorical        collect frequency       States’ top        1.19
    Displays                   Data               information so that     commodities
                              Frequency          we can practice         (or from the
                              Relative           displaying bar charts   student’s home
                               Frequency          and dot plots           country) from
                              Bar Charts                                 the FAO and
                              Dot Plots                                  track the
                                                                          production and
                                                                          revenue for the
                                                                          past 6 years.
                                                                          Then they
                                                                          describe what
                                                                          they found with
                                                                          a bar chart.
3   Sampling                Why sample?          Designing a survey      Exploring          Sec 2.2 #
    Methods and             Sample sizes         to determine how        sampling           2.5, 2.11,
    Bias                    Selection bias       many hours students     methods in the     2.13, 2.25
                            Measurement          spend on homework       context of farming
                             or response          at our school in the    (plant wilt) and
                             bias                 upper school. Along     how to get a
                            Non-response         with a discussion       good sample
                             bias                 about how to do the     given some
                            Conceptual bias      actual sampling.        uncontrolled
                            Simple Random                                variables.
                             samples
                            Stratified
                             random
                             sampling
                            Cluster
                             sampling
                            Systematic
                             sampling
                            Why not
                             Convenience
                             sampling?
                            Why not
                             volunteer
                             sampling?
                            How important
                             sampling biases
                             are for
                             researchers
                             when designing
                             experiments.
4   Statistical Studies:    Why do               In groups they will     Given four         Sec 2.3 #
    Observation and      statistical      pretend that they          different types    2.27, 2.33,
    Experiment           studies?         have just unearthed        of statistical     2.35, 2.37
                       The difference    a new archeological        studies, each
                         between          find. Then they will       student must
                         observational    try to list the types of   determine the
                         studies and      things they may            goal of the
                         experiments.     want to learn from         study and what
                       When you can      the site along with        would be
                         draw cause and   what types of              enough
                         effect           information from the       information to
                         relationships    site would influence       draw a cause
                         between          a study.                   and effect
                         measured                                    relationship
                         quantities.                                 between any
                       Confounding                                  measured
                         variables.                                  quantities.
5   Simple             The design of a   We will discuss the        Each student       Sec 2.4 #
    Comparative         good              Stroop effect and          will find and      2.39, 2.41,
    Experiments         experiment        each group will            evaluate           2.43, 2.45
                       An example        design and perform         several famous
                        experiment        an experiment              experiments
                       Randomization     testing the Stroop         based on the
                       Blocking          effect.                    criteria that we
                       Direct Control                               discussed in
                       Blocking                                     class.
6   More On            Control groups
    Experimental       Placebo
    Design             Single blind
                        experiments
                       Double blind
                        experiments
7   Survey Design      The different     We will discuss the        Each student       Sec 2.6 #
                        tasks of a        yearly class survey        will read the      2.59, 2.61
                        respondent.       that I give to the         National
                       Comprehension     students and how to        Geographic
                       Retrieval from    improve the                article ‘Opium
                        memory.           questions in the           Wars’ in order
                       Answering the     survey.                    to refine their
                        questions.                                   understanding
                       Common                                       of the concept
                        stumbling                                    ‘survey’. They
                        blocks in                                    will explore the
                        responding.                                  depth of
                                                                     understanding
                                                                     that the author
                                                                     has of the
                                                                     subject, but
                                                                     compare how
                                                                     small the
                                                                       sample size is in
                                                                       an article like
                                                                       that with what
                                                                       we know about
                                                                       sampling.
8    Review:
     Chapters 1 & 2
9    Exam: Chapters
     1&2
10   Displaying           Comparative        Students explore the     Each student        Sec 3.1 #
     Categorical           bar charts.        comparative hunting      will compare        3.3, 3.5,
     Data:                Pie charts for     success between          the top 20          3.11, 3.15 +
     Comparative Bar       categorical        Egrets and Herons        commodities         Test
     Charts and Pie        data.              via comparative bar      produced by         Corrections
     Charts               Stacked bar        charts and frequency     two different
                           charts             data.                    countries using
                                                                       comparative
                                                                       bar charts and
                                                                       pie charts.
11   Displaying           How to             We explore different     Each student        Sec 3.2 #
     Numerical Data:       construct stem     aspects of stem and      will locate a       3.17, 3.21,
     Stem and Leaf         and leaf plots     leaf plots in order to   different real      3.23
     Plots                Outliers           clarify the              life example of a
                          Spread             construction of these    stem and leaf
                                              plots.                   plot.
12   Displaying           Histograms for     We revisit the data      We revisit the      Sec 3.3 #
     Numerical Data:       discrete           from measuring the       students'           3.25, 3.27,
     Frequency             numerical data.    salinity of the          measurements        3.33, 3.35
     Distributions        Histograms for     school’s drinking        of the several
     and Histograms        continuous         water and use            leaf lengths
                           numerical data     histograms to make       from two
                           (with the aid of   any arguments            different types
                           stem and leaf      clearer and visually     of trees and
                           plots)             appealing.               have them
                          Frequency and                               describe
                           relative                                    visually what
                           frequency                                   the differences
                           distributions.                              are.
                          Examples
13   Displaying           How to             We will create an        Each student        Sec 3.4 #
     Bivariate             construct and      example of a scatter     will have to        3.41, 3.43,
     Numerical Data        label a scatter    plot using raw data      make several        3.49, 3.53
                           plot.              from the FAO and         new scatter
                          Time series        explore the meaning      plots using raw
                           plots.             of the trend in the      data from the
                          Trends (linear     data. Then we will       FAO. They will
                           and non –          discuss the              also have to
                           linear).           implications of the      describe the
                                              data for the leaders     trends that they
                                             of a given country.      see along with
                                                                      any
                                                                      implications of
                                                                      those trends.
14   Describing the        The difference   “Stringing Students      Using raw data      Sec 4.1 #
     Center of a Data       between the      Along” is an activity    from the FAO,       4.5, 4.9,
     Set Numerically        words            that explores how to     each student        4.13, 4.15
                            ‘population’     sample objects like      will make
                            and ‘sample’.    bank queues to           several
                           Mean.            determine center         estimations of
                           Median.          and variability. We      the center of a
                            Proportion of    look at two              data set. They
                            successes.       different sampling       will also have to
                           Trimming data.   methods for strings      find a data set
                                             of varying length in a   for which an
                                             bag, and try to          average does
                                             determine whether        not make sense.
                                             either method shows
                                             any sampling bias.
15   Describing the        The importance   We will revisit the      They will have      Sec 4.2 #
     Variability in a       of variability   water salinity data      to describe the     4.21, 4.23,
     Data Set               and spread.      and describe the         variability of      4.25, 4.29
                           Standard         data’s center and        the
                            deviation.       variability              commodities
                           Interquartile    numerically using        that they chose
                            range.           the concepts from        last time.
                                             the past couple of
                                             days.
16   Summarizing a         How box plots    “Capture –               Activity 4.2        Sec 4.3 #
     Data Set: Box          can summarize    Recapture” is an         (SADA) is an        4.31,4.33,
     Plots                  data.            activity that            activity that       4.35, 4.37
                           Skeletal box     demonstrates a           explores the
                            plots.           method used by           possible shapes
                           Modified box     naturalists to           of box plots
                            plots.           estimate the size of     given different
                           Outliers.        populations that are     data sets.
                           Extreme          hard to estimate. We
                            outliers.        will simulate the
                           Cost – to –      process with
                            Charge ratio.    Pepperidge Farm
                                             Goldfish.
17   Interpreting          How to           “Sampling Pennies”       Each student        Sec 4.4 #
     Center and             measure          is an activity that      will go back to     4.39, 4.41,
     Variability:           ‘distance from   acts as an               their ERB           4.43, 4.45
     Chebyshev’s            the center’ in   introduction to the      scores and find
     Rule, the              terms of         concept of a             the mean,
     Empirical Rule,        standard         distribution. It also    standard
     and Z – Scores         deviations.      makes use of             deviation of the
                           Chebyshev’s      calculations that        population and
                                             estimate the center,     then compare
                                rule.              variability of a data   their score to
                               The empirical      set. We can then        these. Then
                                rule.              check empirical         they will
                               Z – scores.        results against         calculate what
                               percentile         predictions for how     score they
                                                   many data points are    would have
                                                   supposed to be in a     needed in order
                                                   range.                  to get in a
                                                                           certain
                                                                           percentile.
18     Extra Day
19     Review:
       Chapters 3 & 4
 QII                       Quarter II: Bivariate Data, Probability and Distributions
20     Exam: Chapters
       3&4
21     Correlation             How to             In class we       Each student must        Sec 5.1 # 5.1,
                                calculate          explore the       find a linear            5.5, 5.9, 5.11
                                correlation.       concept of        relationship in a
                               What               correlation by    scholarly scientific
                                correlation        looking at        article and
                                means.             GPA scores        summarize what the
                                When a set of      (for 9th, 10th,   linear relationship is.
                                bivariate          11th and 1st
                                numerical data     semester
                                has a good         senior year)
                                correlation.       along with
                               What the           SAT scores
                                formula for        and ERB
                                correlation        scores to see
                                 means.             which pair of
                                What               numerical
                                 correlation²       data sets yields
                                means.             the strongest
                                                   correlation.
22     Linear                  Formula for the    We generate       They will                Sec 5.2 #
       Regression:              y – intercept of   several data      measure/ask for the      5.17, 5.19,
       Fitting a Line to        the regression     sets for          height and weight of     5.21, 5.25
       Bivariate Data           line.              temperature       10 family members
                               Formula for the    and try to        and calculate the
                                slope of the       estimate          equation of the
                                regression line.   absolute zero.    regression line for
                               Formula for the                      their data set. They
                                slope of a                           will have to make a
                                regression line                      scatter plot of their
                                that goes                            data and include the
                                through the                          regression line. Then
                                origin.                              they will try to
                               Examples                             predict the height and
                                                                     weight of future
                             showing the                         members of their
                             difference                          family while avoiding
                             between lines                       the danger of
                             that are known                      extrapolation.
                             to go through
                             the origin and
                             lines that might
                             not go through
                             the origin.
                            The fact that
                             the regression
                             line goes
                             through the
                             point (average
                             x value,
                             average y
                             values)
                            Dependent
                             versus
                             independent
                             variable.
                            Danger of
                             extrapolation.
                            Absolute Zero.
23   Assessing the Fit      Residuals.         Students         Each student looks for    Sec 5.3 #
     of a Line              Predicted          match            two numerical data        5.33, 5.35,
                             values.            equations of     sets that they think      5.37, 5.39
                            Residual plots     regression       will have a linear
                            Coefficient of     lines to         relationship on
                             determination.     scatter plots    data.gov and then
                            How residual       that are         they create a (clearly
                             plots can          similar to       labeled) scatter plot
                             uncover            each other.      for the data, calculate
                             curvature in a     The scatter      and graph the
                             data set that      plots are        regression line,
                             was previously     created in       calculate and
                             thought to be      such a way       interpret the
                             straight.          that only one    correlation, calculate
                                                point changes.   and interpret the
                                                 The point        standard deviation
                                                moves either     about the regression
                                                far away but     line.
                                                on the
                                                regression
                                                line or far
                                                away and
                                                perpendicular
                                                to the
                                                regression
                                                  line. We also
                                                  compare the
                                                  correlations
                                                  in these
                                                  instances.
24   Non-Linear        We try fitting a           In this activity   Each student must go     Sec 5.4 #
     Relationships     straight line to non –     we look at         home and keep track      5.47, 5.49,
     and               linear data. Then we       data from the      of the temperature of    5.51, 5.53
     Transformations   try changing the           NOAA               a cooling liquid (hot
                       regression line to a       regarding          chocolate or tea that
                       regression ‘curve’. We     monthly            they can drink
                       revisit the challenge of   averages of        afterwards). Then
                       noticing when data         CO2 over a few     they will have to try
                       that looks straight is     decades and        to fit the data with a
                       not straight. Then we      try to fit the     line while looking for
                       explore the concept of     curve as           clues as to how the
                       ‘linearizing’ data. We     accurately as      data might not be
                       finally make a list of     possible. Then     linear. Then they
                       traditional                we plot the        have to try to find a
                       linearizations.            actual carbon      good non – linear
                                                  level versus       model. Finally they
                                                  the predicted      have to check that
                                                  carbon level       their non – linear
                                                  and calculate      model is a good fit by
                                                  the                plotting actual versus
                                                  correlation        predicted values.
                                                  between these
                                                  two, in order
                                                  to see how
                                                  good of a fit
                                                  our model is.
25   Chance                  Chance              We write out       Each student then        Sec 6.1 # 6.1,
     Experiments and          experiment.         the sample         performs a similar       6.3, 6.5, 6.7
     Events                  Sample space.       space for the      (but simpler)
                             Event.              sum of the top     experiment at home
                             Simple event.       faces after        with flipping a coin.
                             Tree.               rolling two        First, they make a
                             Sample space        dice. Then we      predicted sample
                              tree.               try to check       space. Then they
                              Complement of       this               check actual
                              A                   prediction         experimental values
                             ‘or’ versus ‘and’   against reality    against that predicted
                             disjoint            by rolling two     sample space. They
                                                  dice and adding    must create a relative
                                                  the numbers        frequency histogram
                                                  on the top         of the predicted and
                                                  faces. We          actual data sets.
                                                  check the
                                                  relative
                                                  frequency of
                                              the results
                                              from actually
                                              rolling the
                                              dice against
                                              the prediction.
26   Definition of         Classical         We explore         Each student             No textbook
     Probability            definition of     the difference     performs a similar       homework.
                            probability.      between the        experiment to the
                           Relative          classical and      bottle cap experiment
                            frequency         relative           at home, but with
                            definition of     frequency          Hershey Kisses.
                            probability.      definitions of
                           Subjective /      probability by
                            weighted          writing out
                            definition of     the sample
                            probability.      space for the
                           Then we           result of
                            discuss the       flipping a
                            main              plastic bottle
                            differences       cap. Then
                            between the       comparing
                            different         that
                            definitions of    prediction to
                            probability by    the actual
                            checking the      results of
                            predictions of    flipping a
                            each one          bottle cap.
                            against the
                            other.
27   Basic Properties      Probabilities     In this activity   Students then design     Sec 6.3 #
     of Probability.        are between …     we encounter       a test for the false     6.15, 6.17,
                           The probability   the ‘law of        version of the ‘law of   6.19, 6.21
                             of the whole      averages’ in       averages’ and look at
                            sample space is   its popular,       the results of those
                            …                 but false form.    tests to see if they
                           What property     We classify        think the ‘law of
                            do disjoint       various            averages’ is true or
                            events have in    statements         false. This also helps
                            the context of    that use the       introduce the concept
                            probability?      ‘law of            of hypothesis testing.
                           What is the       averages’ and
                            relationship      try to find
                            between an        what is
                            event and its     correct and
                            compliment in     what is wrong
                            the context of    about them.
                            probability.      We use this to
                           The law of        gain a
                            large numbers.    stronger
                                              grasp of a
                                          more correct
                                          property
                                          found in the
                                          context of
                                          probability,
                                          the ‘law of
                                          large
                                          numbers’.
28   Conditional       Definition of     In this activity   Each student will         Sec 6.4 #
     Probability        conditional       students           solve theoretically,      6.29, 6.33,
                         probability.      explore the        solve theoretically,      6.29, 6.33,
                       Why               ‘Monty Hall’       experiment: Three
                        conditional       problem. We        cards are put into a
                        probability is    introduce the      box. One card is red
                        needed.           problem,           on both sides, one
                       How to use two    solve the          card is green on both
                        way tables to     problem and        sides and one card is
                        help calculate    then try the       red on one side and
                        conditional       problem            green on the other. If
                         probabilities.    empirically        you get a prize for
                        When you can      with cards.        correctly guessing the
                         use conditional                      color on the other
                         probability.                         side of the card that
                                                              you randomly picked
                                                              from the box, should
                                                             you always guess the
                                                             same color, a
                                                             different color, or are
                                                             the two strategies the
                                                             same?
29   Independence      Formula for       In this activity   Each student will do      Sec 6.5 #
                        independence.     students           some research on          6.41, 6.47,
                       Why we need a     investigate        diffraction and           6.51, 6.57
                        concept like      the frequency      comment on whether
                        independence.     with which         each electron is
                       When we can       push pins fall     acting independently
                        use the concept   point down.        of other electrons in
                        of                But they do so     the diffraction
                        independence.     in two             experiment.
                       Examples.         different
                                          ways. The
                                          first way is by
                                          dropping
                                          push pins one
                                          at a time. The
                                          second way,
                                          however, is by
                                          dropping 10
                                          push pins at a
                                          time. The goal
                                        is to see if in
                                        the second
                                        method push
                                        pins show
                                        independence.
30   General          General          In this activity   In this assignment       Sec 6.6 #
     Probability       Addition Rule.   we look at the     students explore the     6.59, 6.61,
     Rules            General          concept of         concept (and             6.63, 6.69
                       Multiplication   conditional        formula) for
                       Rule.            probability in     conditional
                      Law of Total     the context of     probability in the
                       Probability.     defective and      context of medical
                      Bayes’           non –              tests for a disease.
                       Theorem.         defective          Since a medical test
                                        parts.             can show positive
                                        Students are       when the individual
                                        given two          does NOT have the
                                        types of bolts     disease, and since the
                                        from ‘two          test can show a
                                        different          negative when the
                                        machines’          individual DOES have
                                        that produce       the disease,
                                        bolts. Each        conditional
                                        machine has a      probability is one of
                                        different          the appropriate tools
                                        success rate of    for dealing with
                                        producing          medical tests.
                                        non –
                                        defective
                                        parts.
                                        Students then
                                        take samples
                                        from these
                                        bolt
                                        collections
                                        and compare
                                        the
                                        theoretical
                                        probabilities
                                        that they
                                        calculated
                                        first with the
                                        actual
                                        frequencies
                                        with which a
                                        specific type
                                        of bolt
                                        showed up.
31   Review:
     Chapters 5 & 6
32   Exam: Chapters
     5&6
33   Random                 Random             In this activity   Each student has to      Sec 7 # 7.1,
     Variables               variable.          we examine         find 20 statistics, 10   7.3, 7.5, 7.7
                            Discrete           the concept of     of which are from a
                             random             ‘streaky           discrete random
                             variables.         behavior’ and      variable, and 10 of
                            Continuous         what               which are from a
                             random             constitutes        continuous random
                             variables.         streaky            variable.
                            The difference     behavior.
                             between            First, as a
                             discrete and       class we look
                             continuous.        at a real
                                                sequence of
                                                coin flips
                                                versus a made
                                                up sequence
                                                of coin flips.
                                                We try to
                                                figure out
                                                which is
                                                which based
                                                on the
                                                ‘streakiness’
                                                of the
                                                sequence.
                                                Then they
                                                construct
                                                their own real
                                                sequence of
                                                coin flips and
                                                analyze it for
                                                streakiness.
34   Probability            Definition of a    In this activity   Each student must        Sec 7.4 #
     Distributions for       probability        we create a        watch a basketball       7.27, 7.29,
     Discrete Random         distribution for   probability        game and keep track      7.31, 7.37
     Variables               a discrete         distribution       of the sequence of
                             random             for the            shots and whether
                             variable.          machine bolt       the shot was made or
                            Properties of a    activity that      not. Then each
                             probability        we did before      student will have to
                             distribution.      where two          create a probability
                                                different          distribution function
                                                machines           for the random
                                                create bolts       variable ‘number of
                                                that are           successful shots in a
                                                defective or       row’.
                                                non –
                                               defective at
                                               different
                                               rates.
35   Probability            Definition of     In this activity   Each student must         Sec 7.3 #
     Distributions for       probability       we create a        make a probability        7.21, 7.23
     Continuous              density           probability        density function for
     Random                  function for      density            the temperature
     Variables               continuous        function for       readings of their
                             random            the                house with the goal of
                             variables.        continuous         being able to
                            Relationship      random             distinguish one
                             and difference    variable pH in     student’s house from
                             between           different          another simply based
                             continuous and    liquids.           on the data /
                             discrete                             calculation / graphics
                             probability                          that they make.
                             distributions.
                            Calculating
                             probabilities
                             using a table
                             for a
                             probability
                             density
                             function of a
                             continuous
                             random
                             variable.
                            Why the area
                             represents
                             probability.
36   Mean and               Mean of a         In this activity   Each student will         Sec 7.4 #
     Standard                random            we try to          have to gather a          7.27, 7.29,
     Deviation of a          variable.         measure the        range of gas station      7.31, 7.37
     Random                 Standard          ‘length of a       information,
     Variable                deviation of a    mechanical         including number of
                             random            pencil’. The       gallons, total price,
                             variable.         challenge is to    price per gallon, time
                            Why               measure it         of day. They will have
                             probability       with respect       to make a sampling
                             shows up in the   to a specific      plan, get permission
                             formula for the   individual and     to gather the data,
                             mean.             so get a sense     gather the data and
                            Some example      of how long a      then analyze the data.
                             calculations      particular         They will have to
                             using raw data.   person likes       make a probability
                                               the graphite       distribution function
                                               to be when         for their data and
                                               they write.        create a visual aid for
                                                                  their function.
37   Binomial and       When to use        In this activity   When going to a           Sec 7.5 #
     Geometric           the binomial       we explore         parking lot students      7.45, 7.51,
     Distribution        distribution.      the geometric      will have to count        7.59, 7.61
                        How to             distribution       how many cars it
                         calculate the      in the context     takes until they get to
                         probabilities      of prizes like     say a Toyota. After
                         associated with    ‘cracker jack’     repeating this count
                         the binomial       prizes where       several times each
                         distribution.      they have to       student will comment
                        How the            buy a certain      on whether they
                         geometric          number of          think that their
                         distribution       boxes before       distribution is
                         relates to the     they get the       geometric and what
                         binomial           prize that         they think the actual
                         distribution.      they want.         proportion of Toyotas
                        When to use        The students       in the parking lot
                         the geometric      make a             was.
                         distribution.      calculation for
                        How to do          predicting
                         calculations       how many
                         with the           tries it will
                         geometric          take and then
                         distribution.      test that
                        Examples.          prediction
                                            empirically.
38   Normal             The general        In this activity   Each student will         Sec 7.6 #
     Distributions       shape of the       we try to          have to go home and       7.67, 7.69,
                         normal             measure the        sample the                7.67, 7.69,
                         distribution.      length of our      electric meter
                        How to             classroom.         readings when they
                         calculate areas    From the data      get home during
                         using a table.     that we collect    some time frame.
                        How z – score      we calculate       They will have to
                         relates to         the mean and       think about how they
                         calculating        standard           will sample the
                         areas.             deviations.        meter. Then after
                        How                We also            they have gathered
                         probability        calculate a        the data they need to
                         notation works     probability        calculate the mean
                         with normal        distribution       and standard
                         distributions.     function from      deviation. They also
                        Upper tailed,      the data.          need to think about
                         lower tailed                          whether the data that
                         and two – tailed                      they have looks
                         calculations.                         normal. Assuming
                        Symmetry of                           that the data is
                         the normal                            normal they will also
                         distribution.                         have to make some
                                                               guesses as to how
                                                               many measurements
                                                                     they think will fall in
                                                                     a certain range of
                                                                     values.
39      Checking for               What does it      We check the   Each student checks        Sec 7.7 #
        Normality                   mean for a data   data from our  their data from the        7.81, 7.83,
                                    set to look       measurement    electric meter             7.85, 7.89
                                    normal?           of the length  readings for
                                What do you          of our         normality.
                                    compare to see    classroom for
                                    if a data set is  normality.
                                    normal?
                                Using
                                    correlation
                                    between
                                    theoretical
                                    cumulative
                                    probability and
                                    actual
                                    cumulative
                                    probability to
                                    determine if
                                    data appears to
                                    be normal.
                                How to use the
                                    correlation
                                    table to
                                    determine if
                                    data appears to
                                    be normal or
                                    not.
 QIII                 Quarter III: Distributions, Confidence Intervals and Hypothesis Testing
40      Approximating           Sometimes the        In this activity Each student                    Sec 7.8 #
        Discrete                    shape of a        we explore       continues the            7.93, 7.95,
        Distributions               discrete          the concept of   experiment at home by    7.97, 7.99
                                    distribution is   using a          trying to measure out
                                    similar to the    continuous       1 cup of pasta shells
                                    shape of a        distribution     by weight to see if
                                    continuous        to               that data produces
                                    distribution.     approximate      better results than
                                When can we          a discrete       either measuring cup
                                    interchange the   distribution     did.
                                    two               in the context
                                    distributions?    of ‘process
                                Binomial             control’. Each
                                    versus Normal     group will try
                                    distributions.    to measure
                                Examples of          out 1 cup of
                                    checking          ‘ice cream’
                                    binomial data     (pasta shells)
                          for normality.    in two
                         Noting that the   different
                          two still         ways.
                          produce           First, each
                          slightly          group will try
                          different         to measure
                          probabilities.    one cup with
                         What other        an opaque
                          distributions     measuring
                          are similar to    cup and
                          each other?       ‘intuition’.
                                            Then each
                                            group will use
                                            a transparent
                                            measuring
                                            cup and try to
                                            get an exact
                                            cup. Then we
                                            will average
                                            the results
                                            from each
                                            group and try
                                            to determine
                                            if either
                                            measurement
                                            shows a
                                            significant
                                            advantage
                                            over the
                                            other. The
                                            random
                                            variable ‘the
                                            number of
                                            pasta shells in
                                            the measuring
                                            cup’ is a
                                            discrete
                                            random
                                            variable. We
                                            will compare
                                            the results
                                            from this
                                            random
                                            variable with
                                            a continuous
                                            one.
41   Statistics and      Comparing one     In this activity   Each student then      Sec 8.1 # 8.1,
     Sampling             measurement       we do a            does an extension to   8.3, 8.7, 8.11
     Variability          from one          similar            what we covered that
                          sample with       activity as in     day by taking a look
                           the mean of        the              at the random
                           one sample out     introduction     variable X = ‘number
                           of several         except we use    of students in one of
                           samples.           the random       my classes’. They
                          Example using      variable X =     must create a
                           the random         ‘number of       probability
                           variable X = ‘#    children that    distribution (along
                           of cars in one     a trustee from   with a visual
                           family’            our school       representation for
                           compared with      has’.            that distribution) for
                           Y = ‘average                        the random variable
                           number of cars                      X and then compare it
                           between two                         to the probability
                           families’.                          distribution for Y =
                          Comparing the                       ‘the average number
                           distribution for                    of students in two of
                           a random                            my classes’.
                           variable with
                           the sampling
                           distribution of
                           the sample
                           mean.
                          How the shapes
                           of X and Y do
                           NOT have to be
                           the same.
42   The Sampling         Comparing the      In ‘Cents and    Each student then        Sec 8.2 # 8.17,
     Distribution of       distribution of    the Central      will look up the         8.19, 8.21,
     the Sample            single             Limit            number of coins          8.23
     Mean                  measurements       Theorem’ we      minted in different
                           of a random        explore how      years to help explain
                           variable X with    the Central      why the distribution
                           the distribution   Limit            of dates from pennies
                           of the averages    Theorem          that we saw in class
                           from samples       works. The       looked the way it did.
                           of size N.         distribution
                          How the            that we take
                           distribution of    samples from
                           the averages       is the
                           from several       distribution
                           samples of size    of dates of 100
                           N can have a       pennies.
                           different shape
                           than the
                           original
                           distribution.
                          How the
                           distribution of
                           the averages
                           from different
                           samples of size
                           N becomes more
                           ‘Gaussian’ as
                           the size of the
                           sample
                           increases.
                          How the mean
                           of the
                           distribution of
                           the averages
                           from samples
                           of size N gets
                           closer to the actual
                           population
                           mean from
                           which the
                           samples come.
                          How the
                           standard
                           deviation of the
                           averages from
                           samples of size
                           N gets smaller as
                           N increases.
                          The Central
                           Limit Theorem.
                          Using
                           simulation to
                           demonstrate
                           that the Central
                           Limit Theorem
                           works.
43   The Sampling         Revisiting the     In this activity   Students then             Sec 8.3 #
     Distribution of       definition of      we look at the     continue this study by    8.27, 8.29,
     the Sample            the sample         proportion of      looking up the            8.31, 8.33
     Proportion            proportion of      non -              proportion of
                           successes.         Caucasian          different ethnic
                          How the            students at        groups in our city and
                           distribution of    our school to      develop a sampling
                           the sample         understand         plan for determining
                           proportion of      the concept of     the ethnicity from a
                           successes is       the sampling       sample of people
                           related to the     distribution of    (without having to
                           distribution of    the sample         ask them their
                           the sampling       proportion.        ethnicity – i.e. simply
                           mean.              Each group         by watching people).
                          An example of a    creates a plan     Then they compare
                           rope making        to sample          their sample to the
                            company that       students in     known/estimated size
                            makes ropes        the hallway     of different ethnic
                            for two            for their       groups.
                            different          ethnicity.
                            groups of          Then we look
                            people. One        at the
                            group uses         distribution of
                            rope               the
                            decoratively       proportion of
                            and the other      non –
                            uses rope to       Caucasian
                            haul cargo. The    students in
                            second group       those samples
                            needs the rope     and compare
                            to withstand a     that
                            certain level of   distribution
                             force before       to the known
                            the rope           proportion of
                            breaks. How        non –
                            does the rope      Caucasian
                            company            students.
                            determine how
                            well their ropes
                            satisfy the
                            second
                            customer?
                           A calculation of
                            the probability
                            that if you buy
                             120 ropes at
                            least 110 of
                            them will be
                            able to
                            withstand the
                            required
                            amount of force
                            to haul a load.
44   Review: Chapter
     7&8
45   Exams: Chapters
     7&8
46   Point Estimation      The definition     In this activity   Each student has to     Sec 9.1 # 9.1,
                            of a point         we try to          compare statistics      9.3, 9.4, 9.7
                            estimate           estimate the       from both sides of a
                           True value of a    value of the       controversial issue
                            population         gravitational      and try to determine
                            characteristic     constant           if the statistics are
                           Unbiased           using several      consistent or
                            statistics         different          inconsistent with
                                               methods. We        each other.
                            versus biased      try to
                            statistics         determine if
                          Precision          the method is
                           versus             biased and if
                           accuracy           the method is
                                              valid based on
                                              the data.
47   Large Sample         The definition     In a large jar     Each student has to       Sec 9.2 #
     Confidence            of a confidence    with pennies       find an example of a      9.11, 9.13,
     Interval for a        interval.          and quarters       confidence interval in    9.15, 9.17
     Population           Confidence         students find      the news and explain
     Proportion            level.             an                 what the statistic is
                          95%                appropriate        measuring and if they
                           confidence         sample size in     think the interval is a
                           interval.          order to get at    good one.
                          Large sample       least 10
                           confidence         quarters with
                           interval for the   95%
                           population         confidence in
                           proportion.        a random
                          Standard error.    sample of
                          Bound on the       coins from the
                           standard error     jar. Then they
                           of the             generalize
                           estimation B       their
                           associated with    calculation to
                           a 95%              predict
                           confidence         sample sizes
                           interval.          needed in
                          Sample size        order to get N
                           requirements       quarters with
                                              95%
                                              confidence in
                                              a random
                                              sample of
                                              coins from a
                                              jar.
48   Confidence           Assumptions        In this activity   Each student extends      Sec 9.3 #
     Interval for a        before using a     we consider        this activity by          9.29, 9.31,
     Population Mean       one – sample z     the mean of        finding a statistic       9.33, 9.35
                           confidence         executives         whose mean is
                           interval for a     and                reported. They then
                           population         determine a        must calculate a
                           mean.              sample size to     sample size to
                          Sample size        estimate the       estimate the true
                           requirements       true               population mean for
                            before using a     population         that statistic. After
                           one – sample       mean salary        getting a random
                           confidence         of executives.     sample they must
                           interval for a     Then we look       then compare the
                            population        up the             mean that they
                            mean.             salaries of a      calculated with the
                           Student’s t –     random             reported mean.
                            distributions     sample of
                            versus z          executives
                            distributions.    and compare
                           One – sample t    the mean with
                            confidence        the reported
                            intervals for a   mean.
                             population mean.
49   Hypothesis and        A test of         In this activity   Each student must        Sec 10.1 #
     Test Procedures        hypotheses or     we take            then find one            10.1,10.3,
                            test procedure    several            experiment and           10.5, 10.7
                           Null hypothesis   experiments        describe the null
                           Alternative       and                hypothesis and
                            hypothesis        determine          alternative
                           The different     what the null      hypothesis.
                            possible          and
                            alternative       alternative
                            hypotheses.       hypotheses
                                              are for each
                                              experiment.
                                              We also
                                              explore why
                                              the
                                              researchers
                                              did not choose
                                              different
                                              alternative
                                              hypotheses.
50   Errors in             Test              In this activity   Each student must     Sec 10.2 #
     Hypothesis             procedures.       we revisit the     find an example of a  10.11, 10.13,
     Testing               Type I error.     ‘cards in box’     treatment with        10.15, 10.17
                           Type II error.    problem and        known type I and type
                           Level of          calculate the      II errors.
                            significance.     known type I
                           How to choose     and type II
                            an alpha level    errors. Then
                             and why one       we test those
                             should not make   predicted
                             the alpha level   values against
                            smaller than it   experimental
                            needs to be.      results that
                                              we make in
                                              class.
51   Large Sample          Test statistics   We make a          Each student then has    Sec 10.3 #
     Hypothesis Tests      P – value         hypothesis         to make a hypothesis     10.23, 10.25,
     for a Population      Observed          regarding the      regarding the number     10.27, 10.29
     Proportion             significance      proportion of      of students with black
                                              boys and girls     hair at our school.
                            level              at our school.    Then they need to
                           What the P –       Then we           decide if our student
                            value means.       decide if our     body is large enough
                           How to phrase      student body      to perform a large –
                            a response to a    is large          sample hypothesis
                            given P – value    enough to         test for the
                            (accept versus     perform a         population
                            fail to reject).   large – sample    proportion of
                           Upper tailed       hypothesis        students with black
                            tests and lower    test for the      hair. Then they need
                            tailed tests       population        to take a random
                            versus two         proportion of     sample to estimate
                            tailed tests.      boys (or girls)   the number of
                           An outline of      in the school.    students with black
                            the steps in a     Then we take      hair at our school.
                            hypothesis         a random          Then each student
                            testing            sample to         will compare results
                            analysis.          estimate the      with other students in
                                               number of         class.
                                               boys (or girls)
                                               at our school.
                                               Then we
                                               compare the
                                               estimation
                                               with the
                                               actual
                                               number.

                                               (or we can do
                                               a
                                               skittles/m&m’
                                               s related
                                               activity)
52   Hypothesis Tests      Z and T            We make a         Each student must        Sec 10.4 #
     for a Population       confidence         hypothesis for    make a hypothesis for    10.41, 10.43,
     Mean                   intervals when     the mean SAT      the mean sunrise         10.45, 10.47
                            the population     score in our      time in our city. Then
                            standard           school over       they need to
                            deviation is       the past few      determine how many
                            known and not      years. Then       days they would need
                            known.             we determine      in order to use a
                           The definition     how many          hypothesis test for a
                            of degrees of      years and         population mean.
                            freedom and        students we       Then they need to
                            how to             need in order     design a sampling
                            calculate the      to use a          plan for getting a
                            degrees of         hypothesis        random sample of
                            freedom in the     test for          days. Then after
                            basic sense.       population        gathering the data
                           Upper tailed       mean. Then        and calculating a
                            and lower          we take a          sample mean they
                            tailed tests       random             need to compare their
                            versus two         sample of SAT      results with one
                            tailed tests.      scores from        year’s worth of
                          The definition     the given          sunrise times. They
                           of statistically   years and          need to make a box
                           significant.       compare it         plot of each to show
                                              with the           their results visually.
                                              reported
                                              school mean
                                              SAT score.
53   Power and            The definition     In this activity   Each student needs to Sec 10.5 #
     Probability of        of the power of    we compare         find two test         10.59, 10.61,
     Type II Error         a test.            two test           procedures with       10.63, 10.65
                          Visually how to    procedures.        known type I and type
                           think about the    First, we          II errors. Then they
                           power of a test.   calculate the      need to compare the
                          What factors       probabilities      power of each test
                           have an effect     for type I and     and describe under
                           on the power of    type II errors     what circumstances
                           a test?            in each test       their conclusion is
                          When the null      procedure.         valid.
                           hypothesis is      Then check
                           true versus        those
                           when the null      probabilities
                           hypothesis is      empirically in
                           false.             class.
54   Review:
     Chapters 9 & 10
55   Exam: Chapters
     9 & 10
56   Inferences           When you           In this activity   Each student must         Sec 11.1 #
     Concerning the        might need to      we return to       clearly state a null      11.1, 11.5,
     Difference            use a difference   our data from      hypothesis of the         11.9, 11.13
     Between Two           of means.          the activity       difference between
     Populations or       Comparing          where we           the mean electrical
     Treatment             treatments.        measured the       usage in their house
     Means Using          Formulas for       constant of        when everyone is
     Independent           the difference     gravity using      awake and the mean
     Samples               between            different          electrical usage in
                           sample means       methods. We        their house after
                            using              use this data      everyone has gone to
                           independent        to determine       bed. They need to
                           samples.           if either          create a sampling
                          Assumptions        method shows       plan to get a random
                            for using          a significant      sample during those
                           the above          difference         times. Then after
                           formulas.          from each          gathering their data
                                              other and the      and making their
                                              accepted           calculations they
                                             value for the      need to determine if
                                             gravitational      the data shows any
                                             constant.          significant difference
                                                                from their prediction.
                                                                They should
                                                                 speculate as to any causes
                                                                given their
                                                                conclusion.
57   Inferences           The definition    In this activity   Each student must         Sec 11.2 #
     Concerning the        of ‘paired’.      students           clearly state a null      11.31, 11.35,
     Difference           Examples of       compare the        hypothesis for the        11.37, 11.39
     Between Two           situations that   difference         difference of the
     Populations or        require paired    between the        mean temperature of
     Treatment             values.           mean listed        one floor of his or her
     Means Using          Assumptions       weight of          family’s house
     Paired Samples.       before making     candies with       compared with the
                           inferences        the same size      mean temperature of
                           about the         and the mean       another floor of
                           difference        measured           his/her family’s
                           between means     weight of          house. The
                           when using        candies with       temperature readings
                           paired samples.   the same size.     should happen at the
                          Paired t                             same time. The
                           confidence                           students should then
                           intervals.                           comment on how well
                                                                paired the data sets
                                                                are. They should also
                                                                speculate as to any
                                                                causes given their
                                                                conclusion.
58   Large Sample         Assumptions       In this activity   Each student must         Sec 11.3 #
     Inferences            before making     we compare         challenge a member        11.41, 11.43,
     Concerning a          inferences        the difference     of their family to a      11.45, 11.47
     Difference            about the         between the        game of basketball
     Between Two           difference        proportion of      and try the day’s
     Populations or        between two       basketball         activity at home.
     Treatment             population (or    shots made by
     Proportions           treatment)        team A and
                           proportions.      the
                          Formulas for      proportion of
                           the difference    basketball
                           between two       shots made by
                           population (or    team B. Two
                           treatment)        teams from
                           proportions.      class make a
                                             series of
                                             basketball
                                             shots and
                                             keep track of
                                             successful
                                              shots and
                                              misses. They
                                              think about
                                              what they
                                              need to do in
                                              order to
                                              satisfy the
                                              assumptions
                                              of the test.
                                              They also
                                              make a clear
                                              statement
                                              about what
                                              the null
                                              hypothesis is
                                              in this
                                              context. After
                                              they make
                                              enough shots
                                              we compare
                                              actual
                                              proportion of
                                              successes
                                              between the
                                              two teams.
59   Chi – Squared         What the null     In this activity   Each student must         Sec 12.1 #
     Tests for              hypothesis        we look at         find data on              12.1, 12.3,
     Univariate             looks like for    data from          www.data.gov upon         12.5, 12.7
     Categorical Data       univariate        drosophila         which they can
                            categorical       fruit flies and    perform a chi-
                            data.             compare            squared test for
                           How to create     predicted          univariate categorical
                            the alternative   ratios of          data. They must make
                            hypothesis and    inherited          sure the data satisfies
                            how to notice     traits with        the assumptions
                            the alternative   actual ratios      needed in order to
                            hypothesis.       of inherited       perform the test.
                           Expected          traits. This is    After they do the
                            versus            in conjunction     calculation they need
                            observed          with AP            to explain what the
                            counts.           Biology lab.       resulting chi-squared
                           Chi – squared                        value means and if
                            value.                               the data shows any
                           How to use the                       significant difference
                            chi – squared                        between the
                            values to make                       hypothesized
                            inferences.                          proportions or not.
                           Chi – squared
                            tables.
                          Assumptions
                           needed in
                           order to make
                           inferences
                           using the chi –
                           squared value.
QIV                      Quarter IV: Chi-Squared Tests, AP Exam and Topics
60    Tests for             Two way            In this activity   Each student must        Sec 12.2 #
      Homogeneity            tables.            we compare         compare several          12.17, 12.19,
      and                   Marginal totals.   several            basketball teams         12.21, 12.23
      Independence in       How to             different          against at least four
      a Two – Way            calculate          famous             different
      Table                  expected           authors            characteristics (like
                             values for a two   against the        rebounds, shot
                             way table.         following:         success proportion,
                            What the null      how many           etc.) to see if their
                             hypothesis is      books got to       collection of
                             when using the     the NY Times       characteristics show
                             chi – squared      best-seller        significant
                             values and two     list, how many     differences between
                             – way tables.      books became       the teams. They need
                            Assumptions        movies, how        to look for which
                             needed in          many books         characteristic or team
                             order to make      had sequels        shows the most
                             inferences         and how            contribution and in
                             using a chi –      many books         which direction.
                             squared value      they have
                              for two way        published.
                             tables.
61    Review:
      Chapters 11 & 12
62    Exam: Chapters
      11 & 12
63    AP Review
64    AP Review
65    AP Review
66    AP Review
67    AP Review
68    AP Exams
69    AP Exams
70    AP Exams
71    AP Exams
72    AP Exams
73    Discrimination
74    Discrimination
75   Chapter 13
76   Chapter 13
77   ANOVA
78   ANOVA
79   ANOVA
80   ANOVA




     Exploring Data
     Activity: Food and Agriculture Organization (AP Statistics)



     Materials: Laptop



     Statistics is a tool for analyzing and understanding data. To this end we will need several
     sources of data. The first source that we will use is the Food and Agriculture Organization
     of the United Nations.



     Please go to http://faostat.fao.org/



     Click on ‘want to register?’



     Register yourself at FAO by filling in the information for the following:



           • Name
           • Manlius Pebble Hill for the organization
           • Educational institution for the type of organization
           • USA for the country
           • Check the first column of boxes (and any others that interest you)
           • Use your school email address
           • Make up a password
We will use data from this website today and throughout the year.



Go to http://www.fao.org/economic/ess/en/ and find the current agricultural yearbook.



Find the spreadsheets for the following:



      • Total and Agricultural Population (including forestry and fisheries) (A1)
      • Human Development Index and Poverty (G4)


Find the definitions for total population, agricultural population, human development index and
poverty. Find the units for each of the categories.



Copy the columns for 2009 in each of the following categories into a new Excel worksheet titled
Excel Practice 1 AP Statistics. Make sure the countries match the data in each row.



      • Name of Country
      • Total population
      • Agricultural population
      • Human development index
      • Poverty prevalence
      • Year poverty prevalence was recorded


In a new column you are going to calculate the ratio of agricultural population to total population.
Label the column accordingly, then in the first row (lining up with the first country) enter a
formula that looks like ‘=E11/D11’, which represents the ratio of agricultural population to
total population for the first country. In this formula the agricultural population for Afghanistan
is in column E row 11 and the total population for Afghanistan is in column D row 11.



Then copy the formula in that cell and paste it into the rest of the cells in that column all the
way down to the last country. The numbers should all be different and represent each country’s ratio.
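
For readers who want to check the spreadsheet result outside Excel, the ratio column can be sketched in Python. This is only an illustration: the country names and population figures below are placeholders, not FAO values, and the function name agricultural_ratio is made up for this sketch.

```python
# Placeholder rows: (country, total population, agricultural population).
# These numbers are invented for illustration and are NOT FAO data.
rows = [
    ("Country A", 1000, 600),
    ("Country B", 2000, 500),
    ("Country C", 500, 50),
]


def agricultural_ratio(total, agricultural):
    """Return agricultural population divided by total population.

    This mirrors a spreadsheet formula of the form =E11/D11, where column E
    holds the agricultural population and column D the total population.
    """
    return agricultural / total


# One ratio per country, like filling the formula down the new column.
ratios = {name: agricultural_ratio(total, agri) for name, total, agri in rows}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f}")
```

Each ratio should fall between 0 and 1, which is a quick way to catch a formula that divides the columns in the wrong order.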



Note: every formula in an Excel cell must begin with an equal sign.
Excel has a list of statistical functions that you can use; these are listed under ‘statistical
functions’ in the help search menu. You will be using several of these functions,
including the following:



      • =AVERAGE(range of cells) This produces the average of all the numbers that you
        highlighted.
      • =MEDIAN(range of cells) This finds the median, or middle number, of the cells that you
        highlighted.
      • =SUM(range of cells) This adds up the numbers in the cells that you highlighted.
      • =STDEV(range of cells) This produces the sample standard deviation for the cells that
        you highlighted.
      • =CORREL(first range of cells, second range of cells) This produces the correlation
        between the variables represented in the given two ranges of cells [usually two columns
        or two rows].
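
As a cross-check on what these functions compute, here is a rough Python sketch of the same five calculations. It assumes two short made-up lists in place of spreadsheet columns, and the helper correl is written out by hand for this example so that the formula behind the correlation is visible.

```python
import math
import statistics


def correl(xs, ys):
    """Pearson correlation of two equal-length lists (the quantity CORREL reports)."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up values standing in for two highlighted columns.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

average = statistics.mean(x)    # like =AVERAGE(range)
middle = statistics.median(x)   # like =MEDIAN(range)
total = sum(x)                  # like =SUM(range)
spread = statistics.stdev(x)    # like =STDEV(range): divides by n - 1
r = correl(x, y)                # like =CORREL(range1, range2)

print(average, middle, total, round(spread, 4), round(r, 4))
```

Note that statistics.stdev divides by n - 1, which matches the sample standard deviation described above rather than the population version.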


Do the following calculations with the data that you copied from the FAO statistical yearbook:



      • Find the average agricultural population.
      • Find the median agricultural population.
      • Find the sum of the agricultural population and compare it to the world agricultural
        population. What should be true about these numbers?
      • Find the standard deviation of the agricultural population.
      • Find the correlation between the column labeled Human Development Index and the
        column that should represent the ratio ‘agricultural population : total population’.




Make the following scatter plots and label the axes and each scatter plot:

      • Human Development Index versus Proportion of Population that Farms
      • Poverty Prevalence versus Proportion of Population that Farms




What is the definition of the term ‘human development index’?
What is the definition of the word ‘poverty’?



What do you notice about the overall trends in each scatter plot? Does it look like there is any
relationship between the different variables that you plotted?



What do the points on the x axis mean?



What would this suggest about what a nation should do to improve its human development
index?



Would your solution in the previous question automatically reduce poverty in a given country?
Why or why not?



What is considered the typical trend with respect to the percentage of people that farm?



Why does USA’s poverty prevalence not show up in the table?
Important aspects of this activity:



      You should always be able to analyze a data set using Excel even if you don’t remember
       all the formulas. The key is that you must remember what the formulas mean, when you
       can use the formulas, what the formulas can (and can’t) do.
      Taking a course in statistics allows you to become statistically literate, which will allow
       you to be intelligently informed about the information that you see around you. You will
       see statistical information pretty much any where you go or in many informative
       documents that you will see.
      Often statistical information can help guide decisions that you would have to make in
       your occupation.
      The statistical information also can show how your intuition is not always correct. To this
       end knowing what a statistic means can help you make life choices.
      This activity demonstrates the process of collecting, displaying, describing, analyzing
       and drawing conclusions from data. This process is the main process of statistics.
      The charts that we made and the descriptions of trends that we found in the charts are
       examples of descriptive statistics.
      The column marked total population is an example of the population of interest.
      The column called total agricultural population is an example of a sample population.
      The question that asked you to make a decision based on the trends that you saw in the
       data is an example of inferential statistics.




(The above activity shows that students interpret statistical results in context)

(This example also makes use of graphical exploration of data)
Assignment:

Each student must make a hypothesis for the mean sunrise time in our city. Then they need to
determine how many days they would need in order to use a hypothesis test for a population mean.
Then they need to design a sampling plan for getting a random sample of days. Then after gathering
the data and calculating a sample mean they need to compare their results with one year’s worth of
sunrise times. They need to make a box plot of each to show their results visually.

(Here the box plots incorporated median based statistics with mean based analysis)

(This assignment also shows statistical methods of exploring data)
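
One way to draw the random sample of days is sketched below in Python. The sample size of 30 is an assumption (a common rule of thumb for a test on a mean), and the function name random_days is hypothetical.

```python
import random

def random_days(n_days, days_in_year=365, seed=None):
    """Pick n_days distinct days of the year (1 = January 1) at random."""
    rng = random.Random(seed)
    # sample() draws without replacement, so no day is chosen twice.
    return sorted(rng.sample(range(1, days_in_year + 1), n_days))

# Assumed sample size of 30 days; change it to match your own plan.
days_to_record = random_days(30)
print(days_to_record)
```

Each run gives a different list unless a seed is supplied, so every student ends up with a unique set of days.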
Activity: Non – Linear Relationships and Transformation (AP Statistics)

Go to the Global Monitoring Division of the National Oceanic and Atmospheric Administration.

http://www.esrl.noaa.gov/gmd/index.html



Choose the ‘products’ tab and select search for data.

Restrict the search to ‘Carbon Dioxide’ and monthly averages. Then select the data from Ascension
Island (a United Kingdom overseas territory).

Copy the data in this file and paste it into a spreadsheet.

You will have to separate the data in the column by highlighting the data and then choosing the ‘data’
tab and selecting the ‘text to columns’ option. Then choose the space character as the delimiter.

Once you have done this create a scatter plot of carbon dioxide levels versus month/year.

Then create an appropriate sized viewing window so that you can see the detail of each month.

Does the data look linear?

What function do you think might help straighten this data set?

When you include the regression line in the scatter plot what sorts of curviness do you notice? Describe
two ways that your scatter plot is curvy.

Even with the curviness, would you still feel that you could predict the carbon dioxide levels at
Ascension Island?

What would be a good rule for predicting the carbon levels?

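Create the following column called ‘predicted’: =337.42+(8/60)*I+COS(PI()*(I-2)/6)+2*COS(PI()*I/200)
Here I is the month index for that row (replace I with the cell that holds it). Excel writes π as PI().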

Create a scatter plot of carbon level versus predicted.

Find the correlation between ‘carbon level’ and ‘predicted’.
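
The ‘predicted’ column and its correlation with the measured values can also be sketched outside the spreadsheet. The Python sketch below assumes that I in the formula is the month index of each row. The function names are hypothetical, and no real NOAA data is included.

```python
import math

def predicted_co2(i):
    """Rule from the activity: trend + 12-month cycle + slow 400-month cycle."""
    trend = 337.42 + (8 / 60) * i
    yearly = math.cos(math.pi * (i - 2) / 6)
    slow = 2 * math.cos(math.pi * i / 200)
    return trend + yearly + slow

def correlation(xs, ys):
    """Pearson correlation, the quantity Excel's CORREL reports."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Month indices 1..24 (assumed numbering); paste the measured values
# from the spreadsheet into carbon_level to compare them.
months = list(range(1, 25))
predicted = [predicted_co2(i) for i in months]
carbon_level = None  # placeholder: no real NOAA data is included here
if carbon_level is not None:
    print(correlation(carbon_level, predicted))
```

Splitting the rule into a trend term and two cosine terms makes it easier to see which part accounts for each kind of curviness in the scatter plot.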
      (This activity shows how students must interpret data in context, and it shows graphical exploration
      of data as well as numerical approximation of data.)




      Sampling and Experimentation
      The following is from the syllabus:

Sampling Methods              Why sample?                 Designing a survey to           Exploring
and Bias                      Sample sizes                determine how many              sampling methods
                              Selection bias              hours students spend on         in the context of
                              Measurement or              homework at our school          farming (plant
                               response bias               in the upper school.            wilt) and how to
                              Non-response bias           Along with a discussion         get a good sample
                              Conceptual bias             about how to do the             given some
                              Simple Random               actual sampling.                uncontrolled
                               samples                                                     variables.
                              Stratified random
                               sampling
                              Cluster sampling
                              Systematic sampling
                              Why not
                               Convenience
                               sampling?
                              Why not volunteer
                               sampling?
                              How important
                               sampling biases are
                               for researchers
                               when designing
                               experiments.
Assignment: Sampling (AP Statistics)

An experimenter requires prior knowledge of a subject before they can enact a test of any significance.
A specific experiment comes with the purpose of measuring some quantity. When sampling a
population to make the desired measurement the experimenter needs to know what variables affect
the quantity that they want to measure.

Consider the following passage from the 1957 Yearbook of Agriculture on soils (p. 44):

         “Water is the medium that disperses the protoplasm in the cell. It is a medium by which physical
force is effected on the cell wall to bring about expansion and growth.

         Only a small part of the water taken up by roots from the soil is retained in the cells of the plants.
Most of the water that is absorbed is conducted to the leaves, where it is lost by evaporation or
transpiration. Since the evaporation of 1 gram of water requires 539 calories, the high rate of water loss
that takes place from leaves on hot summer days acts as an evaporative cooler. One mature tomato
plant in a warm arid climate will transpire a gallon of water in a day. As much as 700 tons of water may
be needed to produce 1 ton of alfalfa hay. The water that is transpired by a cornfield in Iowa in a
growing season is enough to cover the field to a depth of 13 to 15 inches.

       The loss of water from plants is controlled by incident light energy, relative humidity,
temperature, wind, opening of pores (called stomata) in leaves, and supply of water in soil.

         Incident light energy is the most important factor because the evaporation of water requires a
source of energy. Relative humidity is also important because evaporation takes place much more
rapidly in a dry atmosphere than in a humid one. The other factors I mentioned are of a relatively minor
consequence.

         If water loss by transpiration exceeds water intake by the roots, a water deficit develops in the
plant, expansion of growing cells ceases, and the plant stops growing. If the water deficit continues the
plant wilts. If it becomes too severe, the plant tissues wither and die. By what means can plant cells
absorb and retain water when the atmosphere is evaporating it from the leaves and the soil is impeding
its entry into the roots? An illustration:

        When salt is applied to shredded cabbage, the tissue fluids diffuse out of the leaf slices and
dissolve the salt making a brine. The cabbage leaves become limp, or flaccid. If the limp leaves are
washed free of brine and placed in pure water, they again become stiff or turgid. This exemplifies one of
the most fundamental characteristics of the water relationships of plants. It is the diffusion of water
through a semipermeable membrane more commonly called osmosis. When two solutions differing in
concentration are separated by a membrane impermeable to the dissolved substance, water moves
from the solution of lower concentration to the one of higher concentration.”



Suppose you wanted to measure the average number of plants that showed wilt during a day under the
current farming system. Suppose that your current sampling method is to start with a random plant,
then choose 1 out of every k plants by rows starting at 4 p.m. until you reach 20% of the plants. You will
do this each day for a week in order to obtain a relatively random sample of plants from the fields. From
this sample you would count the number of plants that showed any wilting, find the average for each
day, find the percentage of plants that showed wilt, and then generalize the result to the whole field.
Then after looking for any trends you would make a suggestion to either keep the current farming
method or modify the farming method.

Situation # 1: A significant portion of the field is shaded by larger trees in such a way that the shade would
influence the incident light hitting 40% of the plants at 4 p.m.

Situation # 2: The farmer forgot to put out the watering system two of the nights during the week.

Situation # 3: You happened to choose a week where the weather was hitting record highs. (Would the
extra wilting that you probably saw be cause to change the farming system?)

Situation # 4: You happened to choose a week of record high winds so that the water transpired by the
plants did not stay under the plants causing a higher percentage of plants to wilt.

Situation # 5: The field contains two similar looking plants that have different preferred growing
temperatures. One of the plants wilts more easily under the normal temperatures for the week that you
chose to take the sample.

Situation # 6: The farmer has managed to water the plants in such a way that they are not wilting, but
they have also stopped growing.



For two of the above situations do the following:

       Describe the problem with the sampling method.
       Make sure to include why the situation would skew the results from a 1 in k systematic sampling
        method.
       Classify each type of bias that shows up.
       Also make sure to include which sampling method could correct the bias and how.

Follow up questions:
      Describe the difference between a sampling bias and a cause of wilting.
      If you found a significant portion of plants that wilted (say over 10% of the sample) what might
       be some causes for the wilting?

(These examples show how we explore sampling in an actual experiment. Students have to be able to
decide which sampling methods work best for a given farming situation.)
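
The 1-in-k systematic sampling method described in the assignment above can be sketched in Python as follows. The field size of 1000 plants, the value k = 5, and the function name systematic_sample are assumptions made only for illustration.

```python
import random

def systematic_sample(n_plants, k, max_fraction=0.20, seed=None):
    """1-in-k systematic sample of plant positions, numbered row by row."""
    rng = random.Random(seed)
    start = rng.randrange(k)              # random starting plant among the first k
    limit = int(n_plants * max_fraction)  # stop once 20% of the plants are chosen
    return list(range(start, n_plants, k))[:limit]

# Hypothetical field of 1000 plants, taking 1 out of every 5.
chosen = systematic_sample(n_plants=1000, k=5)
print(len(chosen), chosen[:5])
```

Because every chosen plant follows from the single random starting point, any pattern in the field that repeats every k plants, or any condition tied to the 4 p.m. collection time, affects the whole sample at once.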

AP Problem: Blocking (AP Statistics)
(This is an AP problem that we go over to explore blocking and random assignment.)



Assignment:
Each student must clearly state a null hypothesis for the difference between the mean electrical
usage in their house when everyone is awake and the mean electrical usage in their house after
everyone has gone to bed. They need to create a sampling plan to get a random sample during those times.
Then after gathering their data and making their calculations they need to determine if the data
shows any significant difference from their prediction. They should speculate about any causes given their
conclusion.

(This is an example of how students get involved in designing experiments on their own. Here they have
to create their own sampling plan and decide how they will measure electrical usage at the different
times.)

Assignment:

Each student must clearly state a null hypothesis for the difference of the mean temperature of one
floor of his or her family’s house compared with the mean temperature of another floor of his/her
family’s house. The temperature readings should happen at the same time. The students should then
comment on how well paired the data sets are. They should also speculate as to any causes given
their conclusion.

(This is another example of how students get involved in designing experiments on their own.)
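
Since the two temperature readings are taken at the same time, the natural analysis is on the paired differences. The Python sketch below computes a paired t statistic under that assumption. The readings are made-up numbers, and the function name paired_t_statistic is hypothetical.

```python
import math
import statistics

def paired_t_statistic(first, second, hypothesized_diff=0.0):
    """Return (t, degrees of freedom) for paired readings."""
    if len(first) != len(second):
        raise ValueError("paired data sets must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)        # sample standard deviation
    standard_error = sd_diff / math.sqrt(n)
    return (mean_diff - hypothesized_diff) / standard_error, n - 1

# Made-up readings in degrees Fahrenheit, taken at the same four times.
upstairs = [70, 72, 71, 73]
downstairs = [68, 69, 70, 70]
t, df = paired_t_statistic(upstairs, downstairs)
print(round(t, 2), df)
```

The t value would then be compared with a t table or the graphing calculator's t test, using the degrees of freedom returned.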
Anticipating Patterns

Handout: Probability – Things to do in the face of a problem in probability (AP Statistics)

The first thing to look for when doing a problem in probability is to decide which definition of probability
the problem requires. The classical definition of probability often follows theoretical predictions. The
relative frequency definition of probability often follows strings of events from which an experimenter
records the frequency of successes.

Once you know which definition to use then the next big question to always ask is:

                                              ‘What is the sample space?’

Directly following this question as often as you can you should write out all the outcomes in the sample
space (time permitting). Then you should determine the size of the sample space.

Sample spaces for the classical definition of probability look like a finite set listing all the potential
outcomes based on the situation. For rolling two dice the sample space is the following: { (1,1); (1,2); (1,3);
(1,4); (1,5); (1,6); (2,1); (2,2); (2,3); (2,4); (2,5); (2,6); (3,1); (3,2); (3,3); (3,4); (3,5); (3,6); (4,1); (4,2); (4,3);
(4,4); (4,5); (4,6); (5,1); (5,2); (5,3); (5,4); (5,5); (5,6); (6,1); (6,2); (6,3); (6,4); (6,5); (6,6)}. The size of
the sample space for the classical definition of probability DOES NOT CHANGE: for two dice it is always 36.

Sample spaces for the relative frequency definition of probability look like strings of experiment results.
In rolling two dice the sample space might look like { (1,4); (1,6); (5,2); (6,1)} which has only four rolls or
it might have a string of 200 rolls. But with the relative frequency definition of probability the size of the
sample space CAN CHANGE.



Once you have done this you can then proceed to the problem and describe as clearly as possible in
terms of the outcomes which event the problem focuses on.

The last goal is to determine the size of the event. To do this look at the outcomes in the sample space
and circle all the outcomes that belong to the event E for the problem.

Once you have done this then you can find the quotient

                                   P(E) = (size of event space E) / (size of sample space S)
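
The steps above (list the sample space, mark the event, take the quotient) can be sketched in Python for two dice. The event ‘the sum is 7’ is a hypothetical example chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Classical sample space for rolling two dice: all ordered pairs.
sample_space = list(product(range(1, 7), repeat=2))

def probability(event, space):
    """size of event space E divided by size of sample space S."""
    favorable = [outcome for outcome in space if event(outcome)]
    return Fraction(len(favorable), len(space))

# Hypothetical event E: the two dice add up to 7.
p_sum_seven = probability(lambda roll: roll[0] + roll[1] == 7, sample_space)
print(len(sample_space), p_sum_seven)
```

Using Fraction keeps the answer as an exact quotient rather than a rounded decimal.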
IMPORTANT:

Make sure to remember that the classical definition of probability and a previous relative frequency
measurement act like the prediction or theory, and that a new relative frequency measurement is like
the experiment that tests the theory. If the theory (either from the classical definition or a previous
relative frequency measurement) is a good one then the results from the new relative frequency string
should agree with the predictions.

The above process is one of the major activities of science. The above process also belongs to any
discipline that makes measurements. That is how important statistical analysis is in our society.



Helpful Hints:

Do NOT try to guess the probabilities in a given problem. Instead ALWAYS use the formulas to calculate
a probability.

For disjoint events OR means ADD the probabilities

For independent events AND means MULTIPLY the probabilities

One thing that can help is if the problem uses the words ‘find the probability of E given that….’ Here you
should use the formula for conditional probability.

                                            P(E | F) = P(E ∩ F) / P(F)

Notice that the formula for conditional probability still looks like
(size of event space E) / (size of sample space S), with the only
difference that the sample space is now F instead of S (so only the outcomes of E that lie in F are counted).
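
A short Python sketch of this idea for two dice follows: the sample space is cut down to F, and then the outcomes of E inside F are counted. The events chosen (E: the sum is 8, F: the first die is even) are hypothetical examples.

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))

def conditional_probability(event_e, event_f, space):
    """P(E | F): count outcomes of E inside the reduced sample space F."""
    f_outcomes = [o for o in space if event_f(o)]
    both = [o for o in f_outcomes if event_e(o)]
    # Raises ZeroDivisionError if F contains no outcomes at all.
    return Fraction(len(both), len(f_outcomes))

def sum_is_eight(roll):
    return roll[0] + roll[1] == 8

def first_is_even(roll):
    return roll[0] % 2 == 0

print(conditional_probability(sum_is_eight, first_is_even, space))
```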



Another observation that can help is if you can divide the sample space into a disjoint collection of sets
whose union is the whole sample space. Often a problem will have options that divide that sample space
into clear disjoint sets. This often indicates that you should use either the total probability rule or Bayes’
Theorem.
The law of large numbers states that the relative frequency of an event in a long string of independent,
repeated experiments will get close to the actual probability of that event. It does not mean that the
actual counts will ‘level out’, however. When flipping a fair coin the percentage of heads will get closer
to 50% as you increase the number of flips, but the actual number of heads minus the actual number of
tails can grow to be quite large (53,000 heads and 49,000 tails yields about 52% heads, yet
the difference between the number of heads and tails is 4,000).
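
A small simulation can make this distinction visible. The Python sketch below flips a simulated fair coin and reports both the proportion of heads and the gap between heads and tails. The function name flip_summary and the numbers of flips are assumptions for illustration.

```python
import random

def flip_summary(n_flips, seed=0):
    """Return (proportion of heads, |heads - tails|) for n_flips fair flips."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    tails = n_flips - heads
    return heads / n_flips, abs(heads - tails)

for n in (100, 10_000, 1_000_000):
    proportion, gap = flip_summary(n)
    print(n, round(proportion, 4), gap)
```

The proportion settles near 0.5 as the number of flips grows, while the gap between the counts is under no obligation to shrink.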

When estimating probabilities empirically…

It is fairly common practice to use observed long – run proportions to estimate probabilities. The
process of estimating probabilities is simple:

       Observe a very large number of chance outcomes under controlled circumstances.
       Estimate the probability of an event by using the observed proportion of occurrence and by
        appealing to the interpretation of probability as a long run relative frequency and the law of
        large numbers.
       Two way tables can help keep track of the information concisely
       Keep in mind the concept of independence and conditional probability when looking at the
        results.



You have to be careful with statements that use the ‘law of averages’, which is different from the law of
large numbers.

Law of Averages (Bad Version)

For every occurrence in favor of an event E there must be an occurrence that is not in favor of event E

Law of Averages (Okay version)

Eventually even unlikely events are bound to happen.

Independence and the Law of Averages

Notice that the law of averages still cannot say the following: ‘If you have flipped 10 tails in a row then it
is more likely that the next one will be heads’.

Independence of flips guarantees that each flip of a fair coin has a 50% probability of showing heads, no
matter what the earlier flips showed.

What is unlikely is the particular string of 10 tosses that you got (10 tails in a row is just as unlikely as
any other specific sequence, such as 9 tails followed by 1 head).

(This is an example of a handout I give my students on probability. It includes the basic rules of
probability.)
AP Problem: Variability in Inferential Statistics (AP Statistics)



Example 1.2 from Statistics and Data Analysis Second Edition (p 7).




[Histogram: Contaminant Concentration (in parts per million in well water). Horizontal axis: average
contamination (the average of five measurements), in parts per million, from 10 to 19. Vertical axis:
frequency (averages taken over 200 days), from 0 to 45.]




As part of its regular water quality monitoring efforts, an environmental control board selects five
water specimens from a particular well each day. The concentration of contaminants in parts per
million (ppm) is measured for each of the five specimens, and then the average of the five
measurements is calculated. The histogram above summarizes the average contamination
values for 200 days.



Now suppose that a chemical spill has occurred at a manufacturing plant about 1 mile from the
well. It is not known whether a spill of this nature would contaminate ground water in the area of
the spill and, if so, whether a spill this distance from the well would affect the quality of well
water. One month after the spill, five water specimens are collected from the well. Which of the
following average measurements would be convincing evidence that the well water
was affected by the spill?



                                                               (a) 10        (b) 12   (c) 16     (d) 18        (e) 20
Type of Problem – Bar Charts and Inferential Statistics



Focus 1 – What is a ‘normal’ contaminant level for the well water?




Answer E



Before the spill, the average contaminant concentration varied from day to day. An average of
16 ppm would not have been an unusual value, and so seeing an average of 16 ppm after the
spill isn’t necessarily an indication that contamination has increased. On the other hand an
average as large as 18 ppm is less common, and an average of 20 ppm is not at all typical of
the pre - spill values, which run from 10 to 19 ppm in the histogram. Therefore, 20 ppm makes sense as an answer.



(This is an AP problem that we cover on the first day of school that includes variability.)
Normal                      The general        In this activity   Each student will         Sec 7.6 #
Distributions                shape of the       we try to          have to go home and       7.67, 7.69,
                             normal             measure the        sample the                7.71, 7.73
                             distribution.      length of our      electric meter
                            How to             classroom.         readings when they
                             calculate areas    From the data      get home during
                             using a table.     that we collect    some time frame.
                            How z – score      we calculate       They will have to
                             relates to         the mean and       think about how they
                             calculating        standard           will sample the
                             areas.             deviations.        meter. Then after
                            How probability    We also            they have gathered
                             notation works     calculate a        the data they need to
                             with normal        probability        calculate the mean
                             distributions.     distribution       and standard
                            Upper tailed,      function from      deviation. They also
                             lower tailed and   the data.          need to think about
                             two – tailed                          whether the data that
                             calculations.                         they have looks
                            Symmetry of                           normal. Assuming
                             the normal                            that the data is
                             distribution.                         normal they will also
                                                                   have to make some
                                                                   guesses as to how
                                                                   many measurements
                                                                   they think will fall in
                                                                   a certain range of
                                                                   values.


       (This is on the syllabus.)
The Sampling         Comparing the      In ‘Cents and   Each student             Sec 8.2, 8.17,
Distribution of       distribution of    the Central     will then look up the    8.19, 8.21,
the Sample            single             Limit           number of coins          8.23
Mean                  measurements       Theorem’ we     minted in different
                      of a random        explore how     years to help explain
                      variable X with    the Central     why the distribution
                      the distribution   Limit           of dates from pennies
                      of the averages    Theorem         that we saw in class
                      from samples of    works. The      looked the way it did.
                      size N.            distribution
                     How the            that we take
                      distribution of    samples from
                      the averages       is the
                      from several       distribution of
                      samples of size    dates of 100
                      N can have a       pennies.
                      different shape
                      than the
                      original
                      distribution.
                     How the
                      distribution of
                      the averages
                      from different
                      samples of size
                      N becomes more
                      ‘Gaussian’ as
                      the size of the
                      sample
                      increases.
                     How the mean
                      of the
                      distribution of
                      the averages
                      from samples of
                      size N gets
                      closer to actual
                      population
                      mean from
                      which the
                      samples come.
                     How the
                      standard
                      deviation of the
                      averages from
                      samples of size
                      N gets smaller as
                      N increases.
                     The Central
                      Limit Theorem.




(This is on the syllabus.)
(This is an AP problem that we go over that explores combining independent random variables.)
Statistical Inference
The syllabus includes detailed coverage of chapters on confidence intervals for a proportion,
the difference between two proportions, the mean, the difference between two means, and
the slope of the regression line. The syllabus also covers hypothesis testing and chi – squared
tests: goodness of fit and tests for homogeneity/independence. See chapters 5 and 9 – 12.
The course draws connections between all aspects of the
statistical process including design, analysis, and
conclusion
Projects:

Each student must write an article for the school newspaper. Before they can submit an article they
must design a survey or experiment, create a sampling plan, gather the data, analyze the data and come
to a conclusion given the data set. Then they have to write an article summarizing what they found
along with at least one graphical aid for any reader of that article.
The course teaches students how to communicate
methods, results and interpretations using the
vocabulary of statistics.

Assignment: Correlation (AP Statistics)

Look up “linear relationships in science” in Google Scholar and find a pair of quantities that
have a linear relationship. Make sure that you can identify the raw data that the scholars used to
demonstrate the linear relationship between the variables.

Read the article and summarize the article while including the following information.

       Describe which quantities have a linear relationship
       Include a scatter plot demonstrating the linear relationship
       Calculate the correlation for the raw data the scholars used to demonstrate the linear
        relationship.
       Describe how your calculation corresponds to the results in the paper that you read.

(This is an example of one assignment where each student must look up an existing study that
demonstrates a particular statistical relationship between variables. Here they have to
summarize the methodology and statistical analysis, and they have to interpret and explain
the relationship using statistical vocabulary [here, correlation].)




The course teaches students how to use graphing
calculators to enhance the development of statistical
understanding through exploring data, assessing
models, and/or analyzing data.

We will cover how to use the graphing calculators each time we encounter a feature that the graphing
calculator can accommodate. These features include the following: calculating the mean, calculating the
standard deviation, calculating the median, creating scatter plots, creating box plots, linear regression,
non – linear regression, 1 sample t and z tests, 2 sample t and z tests, z confidence intervals, t
confidence intervals, chi – squared tests for goodness of fit, chi – squared tests for homogeneity and
independence.
The course teaches students how to use graphing
calculators, tables, or computer software to enhance
the development of statistical understanding through
performing simulations.
We use simulations to make the central limit theorem clearer and to show that it works
regardless of the shape of the starting distribution.
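
A simulation of this kind might look like the Python sketch below. The skewed starting population, the sample sizes, and the function name sample_means are all assumptions made for illustration, not the exact simulation used in class.

```python
import random
import statistics

def sample_means(population, sample_size, n_samples, seed=0):
    """Means of n_samples random samples (with replacement) of a given size."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]

# Hypothetical, strongly right-skewed starting distribution.
population = [1] * 50 + [2] * 25 + [5] * 15 + [20] * 10

for size in (2, 10, 40):
    means = sample_means(population, size, n_samples=2000)
    print(size, round(statistics.mean(means), 2), round(statistics.stdev(means), 2))
```

The mean of the sample means stays near the population mean while their standard deviation shrinks as the sample size grows. A histogram of each list of means would show the shape becoming more bell-like.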
The course demonstrates the use of computers
and/or computer output to enhance the
development of statistical understanding
through exploring data, analyzing data, and/or
assessing models.
Activity: Food and Agricultural Organization (AP Statistics)



Materials: Laptop



Statistics is a tool that is meant to analyze and help us understand data. To this end we will
need several sources of data. The first source that we will make use of is the Food and
Agriculture Organization of the United Nations.



Please go to http://faostat.fao.org/



Click on ‘want to register?’



Register yourself at FAO by filling in the information for the following:



      Name
      Manlius Pebble Hill for the organization
      Educational institution for the type of organization
      USA for the country
      Check the first column of boxes (and any others that interest you)
      Use your school email address
      Make up a password


We will use data from this website today and throughout the year.



Go to http://www.fao.org/economic/ess/en/ and find the current agricultural yearbook.
Find the spread sheets for the following:



      Total and Agricultural Population (including forestry and fisheries) (A1)
      Human Development Index and Poverty (G4)


Find the definitions for total population, agricultural population, human development index and
poverty. Find the units for each of the categories.



Copy the columns for 2009 in each of the following categories into a new Excel worksheet titled
Excel Practice 1 AP Statistics. Make sure the countries match the data in each row.



      Name of Country
      total population
      agricultural population
      human development index
      Poverty Prevalence
      Year Poverty Prevalence was recorded


In a new column you are going to calculate the ratio of agricultural population to total population.
Label the column as such and then in the first row (lining up with the first country) place an
equation that looks like ‘=E11/D11’ which should represent the ratio of agricultural population to
total population of the first country. In this formula the agricultural population for Afghanistan was in
column E row 11 and the total population for Afghanistan was in column D row 11.



Then copy the formula in that cell and paste it to the rest of the cells in that column all the way down
to the last country. The numbers should all be different and represent each country’s ratio.



Note: all formulas in a cell for Excel should be preceded by an equal sign.



Excel has a list of statistical functions that you can use; these are listed under ‘statistical
functions’ in the help search menu. You will be using several of these functions from Excel.
Some of these include the following:
      =AVERAGE(range of cells) This produces the average of all the numbers that you
       highlighted.
      =MEDIAN(range of cells) This finds the median, or middle number, of the cells that you
       highlighted.
      =SUM(range of cells) This adds up the numbers in the cells that you highlighted.
      =STDEV(range of cells) This produces the sample standard deviation for the cells that
       you highlighted.
      =CORREL(first range of cells, second range of cells) This produces the correlation
       between the variables represented in the given two ranges of cells [usually two columns
       or two rows].
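
As a cross-check on what these functions compute, the following is a rough Python sketch of the same five calculations. The lists x and y hold made-up placeholder values, not FAO data, and pearson_correlation is a hypothetical helper written out by hand so that the arithmetic behind the correlation is visible.

```python
import math
import statistics

# Hypothetical placeholder values, NOT data from the FAO yearbook.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

average = statistics.mean(x)      # counterpart of =AVERAGE(range)
middle = statistics.median(x)     # counterpart of =MEDIAN(range)
total = sum(x)                    # counterpart of =SUM(range)
sample_sd = statistics.stdev(x)   # counterpart of =STDEV(range); divides by n - 1


def pearson_correlation(a, b):
    """Counterpart of =CORREL(range1, range2): Pearson's r for two equal-length lists."""
    mean_a = statistics.mean(a)
    mean_b = statistics.mean(b)
    # Sum of products of deviations, and sums of squared deviations.
    s_ab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    s_aa = sum((ai - mean_a) ** 2 for ai in a)
    s_bb = sum((bi - mean_b) ** 2 for bi in b)
    return s_ab / math.sqrt(s_aa * s_bb)


r = pearson_correlation(x, y)
print(average, middle, total, round(sample_sd, 4), round(r, 4))
```

The correlation always lies between -1 and 1. If a result falls outside that range, the two ranges most likely have different lengths or include header cells.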


Do the following calculations with the data that you copied from the FAO statistical yearbook:



      Find the average agricultural population
      Find the median agricultural population
      Find the sum of the agricultural population and compare it to the world agricultural
       population. What should be true about these numbers?
      Find the standard deviation of the agricultural population.
      Find the correlation between the column labeled Human Development Index and the
       column that should represent the ratio ‘agricultural population : total population’




Make the following scatter plots and label the axes and each scatter plot:

      Human Development Index versus Proportion of Population that Farms
      Poverty Prevalence versus Proportion of Population that Farms




What is the definition of the term ‘human development index’?



What is the definition of the word ‘poverty’?
What do you notice about the overall trends in each scatter plot? Does it look like there is any
relationship between the different variables that you plotted?



What do the points on the x axis mean?



What would this suggest about what a nation should do to improve its human development
index?



Would your solution in the previous question automatically reduce poverty in a given country?
Why or why not?



What is considered the typical trend with respect to the percentage of people that farm?



Why does the USA’s poverty prevalence not show up in the table?

				