Due Date: 10/06/05
PM 536: Lab 5
Data Manipulation and Scale Creation
These exercises are designed to demonstrate techniques for inspecting,
cleaning, labeling and manipulating data. Scale building is also demonstrated; it
is a technique that enables the data analyst to provide more reliable and valid
measures of study concepts such as attitude or knowledge. Chapters 8 & 9 will
provide useful background for this lab.
1. Creating an attitude scale by adding
Begin by opening a log in which you can save your results for this laboratory.
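A minimal sketch of opening and closing a log in Stata. The file name "lab5.log" is a hypothetical choice, not one the lab specifies:

```stata
* Close any log left open from an earlier session
* (capture suppresses the error if no log is open)
capture log close

* Open a plain-text log; "lab5.log" is a hypothetical file name
log using "lab5.log", replace text

* ... lab commands go here ...

* Close the log when finished so that the file is complete
log close
```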
1. Use the “evaltxt2” data from Blackboard. (This dataset is a 2-wave cross-
sectional survey administered to students in undergraduate courses, half of whom
were exposed to a nutrition intervention.)
2. Inspect the attitude items in the data, questions 1 through 14. Use
the “proc freq” command in SAS or “d q1-q14*” in STATA to see the
variable names and labels. Verify that the data are valid by inspecting the
range and number of observations for each item.
3. Create a new variable by summing the responses to these 14 questions
with the following command:
gen attit_1=q1+q2+q3+q4+q5+q6+q7+q8+q9+q10+q11+q12+q13+q14
Notice that this command might generate some missing values if a respondent
has missing data for any one of the questions. (Later we’ll review ways to avoid
having missing cases in this type of situation.)
4. What are the mean, mode, range, and standard deviation of the new
variable?
5. Label the new variable with a descriptor.
6. Now perform the same functions on a second attitude variable called
“att_1” consisting of questions 1 through 6. Find the mean, mode, range,
and standard deviation of the new variable that you created.
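Steps 2 through 5 of this section can be sketched on a small toy dataset, which makes the missing-value behaviour easy to see. The three-item data and the name attit_demo are hypothetical stand-ins. With evaltxt2 the sum would run over q1 through q14.

```stata
* Toy data (hypothetical values): three respondents, three items
clear
input q1 q2 q3
4 5 3
2 . 4
1 1 1
end

* Step 2: variable names and labels, then range and number of observations
describe q1-q3
summarize q1-q3

* Step 3: sum by addition; respondent 2 has a missing item,
* so that respondent's sum is missing
generate attit_demo = q1 + q2 + q3

* Step 4: summarize reports the mean, standard deviation, minimum and maximum;
* the frequency table shows which value occurs most often (the mode)
summarize attit_demo
tabulate attit_demo, missing

* Step 5: attach a descriptive label
label variable attit_demo "Attitude scale, sum of q1-q3 (demo)"
```

The number of observations reported by summarize for attit_demo is smaller than for the individual items, which is the loss of cases that step 3 warns about.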
2. Creating an attitude scale by summing
1. Create a new variable by summing the responses to questions 1
through 14 with the following command:
egen attit_1a=rsum(q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 q11 q12 q13 q14)
2. What are the mean, mode, range, and standard deviation for both variables,
“attit_1” and “attit_1a”?
3. Label the “attit” variable with a descriptor.
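The two kinds of sum differ only for respondents with missing items. A minimal sketch on toy data (hypothetical values and variable names) follows. It uses rowtotal(), the row-sum function documented in recent Stata releases, which corresponds to the older rsum() and treats missing items as zero:

```stata
* Toy data (hypothetical values): respondent 2 skips one item,
* respondent 3 skips all of them
clear
input q1 q2 q3
4 5 3
2 . 4
. . .
end

* Addition: missing whenever any item is missing
generate sum_add = q1 + q2 + q3

* Row total: missing items count as 0, so no respondent is dropped
egen sum_row = rowtotal(q1 q2 q3)

* Number of items each respondent actually answered
egen n_answered = rownonmiss(q1 q2 q3)

list q1 q2 q3 sum_add sum_row n_answered
```

A respondent who skipped items receives a lower total rather than a missing one, so it is worth checking n_answered before interpreting the row sum.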
3. Creating an attitude scale by taking the averages
1. Create a new variable by averaging the responses to questions 1
through 14 with the following command:
egen attit_1b=rmean(q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 q11 q12 q13 q14)
2. What are the mean, mode, range, and standard deviation of the new
variable?
3. Label the new variable.
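Averaging can be sketched the same way on toy data (hypothetical values and names). Here rowmean() is the documented counterpart of the older rmean(). It averages over the items a respondent answered and is missing only when every item is missing:

```stata
* Toy data (hypothetical values)
clear
input q1 q2 q3
4 5 3
2 . 4
. . .
end

* Row mean over the answered items
egen mean_row = rowmean(q1 q2 q3)
label variable mean_row "Attitude scale, mean of q1-q3 (demo)"

list q1 q2 q3 mean_row
summarize mean_row
```

Because the average stays on the original response metric, a respondent with one skipped item remains comparable to one who answered every item.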
In the attitude scales created above, we added the questionnaire responses
together, a common procedure for scale building.
Indices, by contrast, are created by counting the number of conditions met (such
as the number of correct answers in a knowledge test). For example, a count of the
number of modern contraceptives that respondents are aware of would constitute
an index of awareness and would be created in the same way as that specified
for the attitude scale examples above. Often, however, we need to create
indices of correct responses, as in a knowledge test. In such cases we
need to evaluate each question separately.
4. Creating a knowledge index by taking the number correct
1. Inspect the responses to questions 18, 19, 20 and 21.
2. Create a new variable which records the number of correct responses to
questions 18, 19, 20 and 21.
gen correct_1 = (q18==2) + (q19==3) + (q20==4) + (q21==5)
Note that here we evaluate each variable separately; if the respondent’s
response agrees with the value specified in the parentheses (here 2, 3, 4
and 5), the expression in the parentheses evaluates to 1 (true); otherwise
it evaluates to 0 (false). Hence each respondent generates a ‘correct’ score
that is the sum of binary 0s (zeros) and 1s (ones) for each of the four
questions.
3. Create a new variable that divides the “correct” variable by 4 to get a
proportion correct.
gen corr_1p = correct_1/4
4. Label the two new variables you have created.
5. Record the averages and standard deviations of the two new variables.
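The knowledge index can be tried out on toy data before running it on evaltxt2. In this sketch the answer key (2, 3, 4, 5) comes from the command above, while the responses and the variable names ending in _demo are hypothetical:

```stata
* Toy data (hypothetical responses): respondent 1 answers every item correctly,
* respondent 2 gets two right and skips q21, respondent 3 gets none right
clear
input q18 q19 q20 q21
2 3 4 5
2 1 4 .
1 1 1 1
end

* Each comparison evaluates to 1 (true) or 0 (false); a missing answer
* never equals the key, so it is scored as incorrect
generate correct_demo = (q18==2) + (q19==3) + (q20==4) + (q21==5)

* Proportion of the four items answered correctly
generate corr_demo_p = correct_demo / 4

label variable correct_demo "Number of knowledge items correct (0-4)"
label variable corr_demo_p "Proportion of knowledge items correct"

list
summarize correct_demo corr_demo_p
```

Unlike the addition used for the attitude scale, this approach never produces a missing score, so skipped questions are silently treated as wrong answers.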
5. Creating attitude and knowledge variables for the follow up survey data.
1. Report the mean, standard deviation, and range for follow-up nutrition
attitude (q1 to q14) on the follow-up data, created as a sum and as an
average.
2. Report the mean, standard deviation, and range for follow-up nutrition
knowledge index (q18 to q21) created as a sum and as a proportion.
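One way to restrict the summaries to follow-up respondents is sketched below. It assumes that both waves sit in a single file and that a variable marks the wave. The name wave and its coding (1 = baseline, 2 = follow-up) are hypothetical, so check the actual variable in evaltxt2 before adapting this:

```stata
* Toy data (hypothetical values); wave is an assumed indicator variable
clear
input wave q18 q19 q20 q21
1 2 3 1 1
1 1 1 4 5
2 2 3 4 5
2 2 3 4 1
end

generate correct_demo = (q18==2) + (q19==3) + (q20==4) + (q21==5)
generate corr_demo_p = correct_demo / 4

* Mean, standard deviation, minimum and maximum for the follow-up wave only
summarize correct_demo corr_demo_p if wave == 2

* The same statistics for both waves side by side
tabstat correct_demo corr_demo_p, by(wave) statistics(mean sd min max)
```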