1st Quantitative Research Worksheet by ajizai


									POLB52: 1st Quantitative Research Worksheet
Univariate analysis: describing Canadian opinions. Value = 5%

    Learn to identify different levels of measurement
    Apply some common descriptive statistics.
    Become familiar with the SDA statistical analysis program.
          o Learn how to recode variables
    Learn to gauge Canadian public opinion by becoming familiar with the Canadian Election Studies
       and other measures of public opinion.
    Understand the implications of using (or not using) weights.

Instructions and Overview
This is a short assignment with lots of instructions (and screenshots) taking you through each step of the
process. There are sixteen questions to answer on this assignment. All are multiple choice or fill in the
blank. The instructions run to ten pages. Consequently, you may find that reading the instructions takes
more time than actually completing the tasks and answering the questions. The step-by-step
instructions are noted with letters. The questions are numbered. Record the answers on Blackboard,
where you will have two chances to get the questions correct. These instructions include some
background on levels of measurement and descriptive statistics, before taking you through some
analysis of the 2008 Canadian Election Study.

Background: Levels of Measurement and Descriptive Statistics.
In this assignment, we will utilize some public opinion survey questions. When analyzed, these questions
are called variables. Variables can be classified using three (or four) different levels of measurement:
nominal, ordinal, or interval/ratio.

Nominal variables organize the data into mutually exclusive categories but have no inherent order or
rank. Examples include the answers to questions like:
     What newspaper do you read most often?
     What is the most important problem in Canada today?
     Dichotomous (or binary) variables like: Do you approve or disapprove of the performance of the
        Governor General of Canada?

Ordinal variables organize the data into a set of categories that are ordered or ranked, from low to
high, or less to more (or the reverse). The distance between each category may not be consistent or
clear, but one can always say that one observation is greater (or less than) another observation.
Examples include:
     Q: How much would you say that you personally care about President Obama’s policy to deny a
         permit for the Keystone XL Pipeline: do you care very much, pretty much, not very much or not
         at all? A: VERY MUCH, or PRETTY MUCH, or NOT VERY MUCH or NOT AT ALL
     Q: What is your opinion about economic sanctions against Iran because of its nuclear
         enrichment program? Do you STRONGLY AGREE with sanctions, SOMEWHAT AGREE, NEITHER
Because there is no uniform distance between categories of an ordinal variable, we know that someone
who chooses “somewhat disagree” disagrees less than someone who says “strongly disagree,” but we
cannot confidently say that everyone who says “strongly disagree” disagrees twice as much as anyone
who answers “somewhat disagree,” nor do we know if the difference between “somewhat disagree” and
“strongly disagree” is any bigger than the difference between “somewhat agree” and “neither agree nor
disagree.”

Interval/Ratio variables organize data in ordered categories and have a uniform distance between each
category. Example:
     Differences in country wealth are measured using GDP per capita (Gross Domestic Product per
        person), which is measured in dollars.
     Predictions made using odds, like “what do you think the odds are that Dalton McGuinty will
        serve another five years in office?”

For more, see: http://wiki.opossem.org/index.php?title=Measurement#Levels_of_Measurement

Background: Descriptive Statistics.
It is important for a researcher to assess the level of measurement of the data, since it
affects which descriptive statistics best describe a “typical observation”. The higher the level of
measurement, the more precise the descriptive statistic. For nominal variables, the most that can be said about a
variable is to describe its mode, the most common observation. In public opinion survey questions, the
mode is the most frequent answer to the question.

For ordinal and interval/ratio variables, the observations are ordered, so one can use either (or both) of
two measures:
     Mean. The average value. Calculated by adding up each observation’s value and dividing that
        sum by the total number of responses.
     Median. The middle observation. Half of all observations are lower than the median, half are
        higher than the median.
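
As a minimal sketch of these three statistics, the following Python snippet uses the standard-library `statistics` module on a small, made-up set of 0-10 ratings. The numbers are hypothetical and are not taken from the Canadian Election Study.

```python
import statistics

# Hypothetical answers to a 0-10 rating question (not real CES data).
ratings = [2, 5, 5, 7, 8, 10, 5]

mode_value = statistics.mode(ratings)      # most frequent answer
median_value = statistics.median(ratings)  # middle observation once sorted
mean_value = statistics.mean(ratings)      # sum of values / number of responses

print("mode:", mode_value)
print("median:", median_value)
print("mean:", mean_value)
```

For a nominal variable only the mode line would be meaningful, since the other two depend on the answers having an order.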

You can read more about descriptive statistics on this website http://davidmlane.com/hyperstat/desc_univ.html and
elsewhere on the web!

    A. First, go to U of T’s SDA website for the 2008 Canadian Election Study
                 If you are familiar with another statistics program, please download the data at:
    B. On the bottom left of the screen, you will see a list with some icons that look like little blue
        books. The top of the list says “Canadian Election Study, 2008”. This is the list of the survey
        questions or variables. The 2008 Canadian Election Study actually consisted of three surveys: a
        campaign period survey that was administered before election day, a post-election telephone
        survey administered right after the election and a survey that was mailed to respondents.
        Respondents could have answered all three surveys, but only about 1 in 4 respondents answered all
        three.
    C. On this assignment, we will focus on the campaign period survey, so click on the second little
        book on the list next to “Campaign Period Survey” to reveal the variables on that survey. After
        clicking, you should see this:

D. Click on Section MIN: minority government (marked above with a red arrow) and click on the
   first line, which says:
   ces08_cps_min_1 – Now, your views on minority governments. Do you think minority go
   This variable asked respondents whether they think minority governments are good things, bad
   things or if they are not sure. The indicator, ces08_cps_min_1, is the variable name or number.
E. Notice that right above the list of variables, there is a section called “Variable Selection,” and,
   after you clicked on the variable, you should see “ces08_cps_min_1” in the box next to
   “Selected:” just like in the picture below.

F. Click the View button. A new window will open with a description of this survey item. It shows a
   short description of the variable (it should be the survey question wording, but part of the question
   was truncated). On it, you can see that 1,486 people said minority governments were a good
   thing, 636 people said it was a bad thing, and 1,128 said “don’t know” or “not sure”. Please find
   these numbers underneath the column marked “N” which is short for “number”. Each of these
   people is an observation, so N is the abbreviation for “number of observations.”

1. Look at the column marked N and answer:
   How many people refused to answer the question, ces08_cps_min_1? _____

G. The column next to N is the value assigned to each possible answer. This is an arbitrary number
   used by the computer to store the survey respondents’ responses (or observations). This
   number is also used when the computer calculates statistics like the average.

2. What value was assigned to those who answered “don’t know” or “not sure”? ____

   Notice that under “Properties” there is a line called “missing data codes.” These are the
   responses that are not included in the analysis. The missing codes are 8 and 9, which means
   that all of the “don’t know” respondents, as well as those who refused to answer the question,
   were excluded. You can see the effect of excluding these observations: in the column on the far
   left (“percent”), the percentages are calculated using only those who responded that a
   minority government is a good (70%) or bad thing (30%).
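
The effect of the missing data codes can be sketched in Python. The counts below are the unweighted numbers quoted in step F; the number of refusals is not stated in these instructions, so it is left out. The function name and the use of answer labels as dictionary keys are choices made for this illustration only, not part of SDA.

```python
# Unweighted counts quoted in step F for ces08_cps_min_1.
counts = {
    "good thing": 1486,
    "bad thing": 636,
    "don't know / not sure": 1128,
}

# Answers treated as missing data in this sketch.
missing = {"don't know / not sure"}

def valid_percentages(counts, missing):
    """Return each non-missing answer's share of the non-missing responses."""
    valid = {answer: n for answer, n in counts.items() if answer not in missing}
    total = sum(valid.values())
    return {answer: 100 * n / total for answer, n in valid.items()}

percentages = valid_percentages(counts, missing)
for answer, pct in percentages.items():
    print(f"{answer}: {pct:.0f}%")
```

Because the excluded answers drop out of the denominator, the two remaining percentages always add up to 100.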

H. Click the “Row” button. The variable name (ces08_cps_min_1) now appears on the right of the
   screen, in the box at the top next to “Row:” Below that you will see boxes marked “Column,”
   “Control,” “Selection Filter” and “Weight”, followed by a box with a set of options. Select
   “Summary Statistics” and select what type of chart you want the computer to display (or none
   at all) and click the button, “Run the Table” at the bottom. A new tab or window should open
   that should look like:

      The green box shows what values were excluded as missing (MD = “Missing Data”). The
   column immediately to the left of the green box shows what values (1-5) were included in the
   analysis. Below that, marked by the blue arrow, are the observations, which should be identical
   to what was displayed when you hit “view” except only those who answered that minority
   governments were a good or bad thing are displayed, and the percent is above the number of
   responses (“N of cases”).

        Below that are lots of summary statistics, marked by a green arrow. In this worksheet, we will
     focus on the values on the left column, right below the green arrow. Because this variable is
     dichotomous, it is nominal, so the only statistic with any explanatory value is the mode.

I.   Now click on the variable below the one we just looked at:
     ces08_cps_min_2 - What do you think the election result WILL BE: …?
J.   Click “view” and observe the different responses to the question of what these people thought
     would be the outcome of the 2008 election: a Liberal majority government, a Liberal minority
     government, a Conservative majority government, a Conservative minority government, or something else.

3. In the background section above, we described three different levels of measurement. At
   what level of measurement is this variable (ces08_cps_min_2)?
       a. Nominal
       b. Ordinal
       c. Interval/Ratio

4. Given the level of measurement of ces08_cps_min_2, what descriptive statistic would be best
   to report?
       a. Mode
       b. Median or Mean.

K. Please look at questions in “Section A: interest and the media”. You might recall that in the fall
   of 2008 there was a federal election in Canada and a Presidential Election in the US (between
   Obama and McCain). The variable ces08_cps_a3 asks respondents about their interest in the Canadian
   federal election, while ces08_cps_a5 asks respondents about their interest in the American
   election. Both of these variables are ordinal, measured on a scale from 0-10. Since the variables
   are ordinal, we can compare the mean and the median of each variable to see whether
   Canadians were more interested in the Canadian federal election or the American election. To
   find the mean and the median of each variable, click on ces08_cps_a3, hit “row”. Make sure the
   option, “Summary Statistics,” is selected and then click the button, “Run the Table.”

L. Write down the mean and median values of ces08_cps_a3. Then return to the main SDA
   analysis page, click on ces08_cps_a5 and run the table for that variable. Compare the mean and
   median for ces08_cps_a5 (interest in the American election) to ces08_cps_a3 (interest in the
   Canadian election).

5. What is the median value of ces08_cps_a5? ______

6. Is the median level of interest in the Canadian federal election (ces08_cps_a3) higher or lower
   or exactly the same as the median level of interest in the American election (ces08_cps_a5)?
        a. Lower
        b. Exactly the same
        c. Higher

7. Compare the means of ces08_cps_a5 and ces08_cps_a3 to answer: On average, in 2008, were
   Canadians more interested in the Canadian federal election, more interested in the US
   Presidential election, or was their level of interest exactly the same?

        a. Lower
        b. Exactly the same
        c. Higher

M. Now look at the first question in “Section J: values and party images”: ces08_cps_j0 Liberal
   Party’s Green Shift / Tax really hurt Canada? If you click “View” or “Row” and run the table, you
   will see that all answers of 8 “Not sure” were declared to be missing.

8. With “not sure” missing, at what level of measurement is ces08_cps_j0?
      a. Nominal
      b. Ordinal
      c. Interval/Ratio

9. If “not sure” were included, at what level of measurement would ces08_cps_j0 be?
        a. Nominal
        b. Ordinal
        c. Interval/Ratio

N. Sometimes, after completing a survey, the researchers ascertain that there were a few too many
   people of one group or a few too few people of another group. Say, there were too many
   40-something women, and too few 20-year-old men. In such a scenario, what the researchers often
   do is give each response (or observation) a weight. In this way, the answers for people in the
   group with too many people are worth less than 1.0 and the answers for underrepresented
   people are worth more than 1.0. Sometimes over-representing a group is done deliberately to
   ensure that there are enough respondents in that group to analyze on their own. Since a little
   less than ¼ of Canadians live in Quebec, and there are several provinces with very few people (like
   PEI), it is common on national surveys in Canada to over-sample the number of Quebecois and
   residents of smaller provinces so that analysts have enough respondents to analyze
   francophones on their own and/or provide an estimate of opinions in each province. In such a
   situation, while the provincial opinions may be more accurate, the oversampled Quebecois or
   Prince Edward Islanders must be under-weighted to give an accurate estimate of the national
   opinion (by giving each response a weight of less than one) and populous provinces like Ontario
   must be over-weighted (by giving each response a weight of greater than one) to reflect their large
   size relative to the rest of Canada.
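
To illustrate how weights change an estimate, here is a minimal Python sketch. The respondents, group names and weights are all hypothetical and chosen only to make the arithmetic easy to follow; they are not values from the CES.

```python
# Hypothetical respondents as (group, answer, weight); none of this is CES data.
# The oversampled group gets weights below 1.0, the undersampled group above 1.0.
respondents = [
    ("oversampled", "agree", 0.5),
    ("oversampled", "agree", 0.5),
    ("oversampled", "agree", 0.5),
    ("oversampled", "disagree", 0.5),
    ("undersampled", "disagree", 2.0),
    ("undersampled", "disagree", 2.0),
]

def percent_giving(answer, respondents, use_weights):
    """Percentage of respondents giving `answer`, with or without weights."""
    def weight_of(row):
        return row[2] if use_weights else 1.0
    total = sum(weight_of(row) for row in respondents)
    matching = sum(weight_of(row) for row in respondents if row[1] == answer)
    return 100 * matching / total

unweighted = percent_giving("agree", respondents, use_weights=False)
weighted = percent_giving("agree", respondents, use_weights=True)
print("unweighted % agree:", unweighted)
print("weighted % agree:", weighted)
```

In this made-up sample, the oversampled group agrees more often than the undersampled group, so the unweighted figure overstates agreement and the weighted figure pulls it back down.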

    There are three weights in this dataset. The national weight is designed to be an accurate gauge
    of national opinion. The provincial weight allows researchers to most accurately
    gauge opinions in each province. A third weight, the household weight, is an attempt to
    compensate for respondents’ increased odds of being sampled if they live alone (if you live alone and the
    survey researcher calls, you will be surveyed; if you live in a three-adult family, your odds of
    being surveyed are 33%, etc.). The household weight adjusts each response based on how many
    members there are in the respondent’s family and the number of families of that size in the
    total sample relative to the general population. The national and provincial weights incorporate
    the household weights.
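
The idea behind the household weight can be sketched in Python. This is a simplification that only uses the number of adults in the household; it ignores the second adjustment described above (the share of households of each size in the sample relative to the population), and the function names are hypothetical rather than anything used by the CES team.

```python
def selection_probability(adults_in_household):
    """Chance that one particular adult is the person interviewed."""
    return 1 / adults_in_household

def normalized_household_weights(household_sizes):
    """Weights proportional to household size, rescaled to average 1.0."""
    # The inverse of a 1/size selection probability is simply the size.
    raw = [float(size) for size in household_sizes]
    mean_raw = sum(raw) / len(raw)
    return [w / mean_raw for w in raw]

# Hypothetical respondents living in one-, two- and three-adult households.
sizes = [1, 2, 3]
weights = normalized_household_weights(sizes)

print(round(100 * selection_probability(3)))  # odds in a three-adult household
print(weights)
```

A respondent who lives alone is the easiest to reach and so receives the smallest weight, while a respondent from a larger household stands in for the other adults who were not interviewed.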

    On this assignment, we are interested in overall opinions of Canadians, so we want to look at
    the national weight. This is a large survey, so the results of each question should not be very
    different with (or without) the weights, but if you are interested in very precisely knowing the
    number of Canadians with a specific opinion, the weights are important.

Please look at the results of ces08_cps_j0 with the national weight. To analyze results using the
national weight, you need to select ces08_natwgt – National Weight in the Weight drop-down
menu (see red arrow) and select the box marked “Weighted” in the Table Options menu (green
arrow).

When running a table with weighted data (below) you will see the weight variable named in the
list of variables at the top (green arrow) and the number of observations often includes a
fraction. In this example, there are 703.6 people who responded “somewhat disagree” that the
proposed Green Shift/Carbon Tax would really hurt Canada.
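
The fractional counts arise because a weighted count is the sum of the weights in a category rather than a head count. The Python sketch below shows this, along with a weighted mean, using hypothetical answer codes and weights (not CES data).

```python
# Hypothetical (answer code, weight) pairs; not CES data.
responses = [(1, 0.5), (3, 1.5), (3, 0.75), (5, 1.25)]

def weighted_counts(responses):
    """Sum the weights within each answer code."""
    counts = {}
    for code, weight in responses:
        counts[code] = counts.get(code, 0.0) + weight
    return counts

def weighted_mean(responses):
    """Sum of (code * weight) divided by the sum of the weights."""
    total_weight = sum(weight for _, weight in responses)
    return sum(code * weight for code, weight in responses) / total_weight

counts = weighted_counts(responses)
mean_weighted = weighted_mean(responses)
mean_unweighted = sum(code for code, _ in responses) / len(responses)

print(counts)
print(mean_weighted, mean_unweighted)
```

Two people gave answer 3 in this made-up sample, yet the weighted count for that answer is 2.25, which is the same kind of fractional N that appears in the SDA output.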

10. What is the mean opinion toward the proposed “Green Shift” carbon tax (ces08_cps_j0) when
    the national weight is used?
        a. 3.47
        b. 3.59
        c. 3.65
        d. 3.85

O. In the last steps of this assignment, we will learn how to recode variables. You will recall that in
   the beginning of the analysis, we looked at whether Canadians thought that minority
   governments are a good or bad thing by reviewing responses to ces08_cps_min_1. You may
   remember that there were three possible answers to this question: minority governments are
   good, minority governments are bad, and “I don’t know or not sure.” The last option was
   excluded from the analysis. By recoding the variable, we can include that option and put
   those responses in between those who said that minority governments are a good thing and those
   who said they are a bad thing. Complete instructions on how to recode variables can be found here:

P. Start by clicking on Section MIN: minority government (marked above with a red arrow) and
   selecting the variable again by clicking on: ces08_cps_min_1 – Now, your views on minority
   governments… Then click the “row” button.

Q. On the right side of the screen, you will now see ces08_cps_min_1 in the box next to “Row”. You
   can type in that box, so after the variable name, type: (r: 1 "Good thing"; 8 "DK/NS"; 5 "Bad thing")
         The letter ‘r’ tells the computer to recode the variable. The colon is important.
         The number one tells the computer that observations with the value of one retain that value
             and should be labeled “Good thing”.
         Text inside quotation marks is understood by the computer to be the value label.
             Don’t forget both quotation marks!
         The semi-colon is important and lets the computer know that you are about to tell it
             what is in the second category.
         Because we want “Don’t Know/Not Sure” responses to fall in between positive and
             negative feelings towards minority governments, we tell the computer to place all
             responses with the value of eight in the second category and label it “DK/NS”.
         After a semi-colon, the five says that all observations with the value of five should be in
             the third category (with a value, by default, of three). “Bad thing” is the label for the last
             category.
R. The box should now say: ces08_cps_min_1(r: 1 "Good thing"; 8 "DK/NS"; 5 "Bad thing") and look
   like this:

S. Click “Run the Table” and on the output, you should see that there are now three categories
   included in the calculations. Crucially, the 1,108 “DK/NS” respondents (if you included the
   national weight; 1,128 if you did not) are now included in the analysis.
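
The logic of the recode in steps Q to S can be mirrored in Python as a simple lookup table. This is only a sketch of the idea, not how SDA works internally. The raw responses are hypothetical, and treating any code that is not listed (such as 9 for a refusal) as missing is an assumption of this sketch.

```python
# Mapping taken from the recode in step R: original code -> (new code, label).
RECODE_MIN_1 = {
    1: (1, "Good thing"),
    8: (2, "DK/NS"),
    5: (3, "Bad thing"),
}

def recode(value, mapping):
    """Return (new_code, label), or None when the original code is not listed.

    Treating unlisted codes as missing is an assumption of this sketch.
    """
    return mapping.get(value)

# Hypothetical raw responses; 9 stands for a refusal.
raw = [1, 5, 8, 1, 9]
recoded = [recode(value, RECODE_MIN_1) for value in raw]

for original, result in zip(raw, recoded):
    print(original, "->", result)
```

After the recode the new codes run 1, 2, 3 in the order the categories were listed, which is what places “DK/NS” between the two substantive answers.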

11. What percentage of the Canadian population thinks that minority governments are a good
    thing (ces08_cps_min_1)? Use the number of observations weighted by the national weight
    and include respondents who answered “don’t know” or “not sure.”
        a. 30.2%
        b. 34.7%
        c. 45.7%
        d. 70%

12. Since the recoded variable ces08_cps_min_1 is ordinal, the best measures of central tendency
    are the mean and median. What is the average opinion of minority governments (after
    recoding the variable to run from one to three)?
        a. 1.14
        b. 1.27
        c. 1.47
        d. 1.74

13. Half of all responses are above the median, and half are below the median. What was the
    median opinion of minority governments (recoded ces08_cps_min_1)?
        a. 1 – Minority governments are a good thing.
        b. 2 – Not sure if minority governments are a good thing or don’t know.
        c. 3 – Minority governments are a bad thing.

T. Once again, select the variable that asks respondents to predict the outcome of the election:
   ces08_cps_min_2 - What do you think the election result WILL BE: …? Put this variable in the
   row. This variable is coded like this:
           0:      Other
           1:      Liberal majority
           2:      Liberal minority
           3:      Conservative majority
           4:      Conservative minority
   Recode the variable so that a Liberal victory (majority or minority) is in the first category, and all
   other answers are in a second category, and run the analysis. To put more than one value in a
   category, list those values with commas. So, the recoded command is:
   ces08_cps_min_2(r: 1, 2 "Liberal win"; 0, 3, 4 "Other")
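
Grouping several original values into one category can be sketched in Python as follows. The groups come from the recode command above; the raw responses are hypothetical, and returning None for a code that appears in no group is an assumption of this sketch rather than documented SDA behaviour.

```python
# Groups from the recode command above: (original codes, label), in order.
GROUPS = [
    ({1, 2}, "Liberal win"),
    ({0, 3, 4}, "Other"),
]

def recode_grouped(value, groups):
    """Return (new_code, label); new codes are numbered 1, 2, ... in list order."""
    for new_code, (codes, label) in enumerate(groups, start=1):
        if value in codes:
            return new_code, label
    return None  # assumption: codes not listed in any group count as missing

# Hypothetical raw responses, one per original code.
raw = [0, 1, 2, 3, 4]
recoded = [recode_grouped(value, GROUPS) for value in raw]

for original, result in zip(raw, recoded):
    print(original, "->", result)
```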

14. What percent of all voters expected a Liberal Party victory (ces08_cps_min_2, using the
    national weight)?
        a. 18.2
        b. 20.1
        c. 24.5
        d. 26.1

U. Now, recode the variable so that all predictions of a majority government are together in one
   category and all predictions of a minority government are in a second category and exclude
   “other” responses.

   15. What is the correct way to recode ces08_cps_min_2 so that all those who predict a minority
       government are in one category and all those who predict a majority government are in the
       second category (exclude “other” responses)?
           a. ces08_cps_min_2(r: 1 "Majority Gov’t"; 2 "Minority Government")
           b. ces08_cps_min_2(r: 1, 3 "Majority Gov’t"; 0, 2, 4 "Minority Government")
           c. ces08_cps_min_2(r: 1, 3 "Majority Gov’t"; 2, 4 "Minority Government")
           d. ces08_cps_min_2(r: 1, 2 "Majority Gov’t"; 3, 4 "Minority Government")

   16. Excluding those who responded “other” and using the national weight, what percent of the
       population predicted a minority government (ces08_cps_min_2)?
           a. 63.4
           b. 64.6
           c. 65.1
           d. 68.3

Remember to go on Blackboard to record your answers!

