JMP Tutorial 1: Introduction to Data Entry in JMP and Univariate Displays for Categorical Data
Data File: Youth_Behavior_Survey.JMP
Background: The data for this example come from a survey sponsored by the U.S. Centers for Disease Control. The survey was carried out in 2005 as part of a program called the Youth Risk Behavior Surveillance System. More information on this program is available at the following link:
http://www.cdc.gov/HealthyYouth/yrbs/index.htm
Variables:
Grade - High school grade level
Value   Label
1       9th Grade
2       10th Grade
3       11th Grade
4       12th Grade
5       Ungraded or Other Grade
Seatbelt Use - How often the respondent wears a seatbelt when riding in a car driven by someone else.
Value   Label
1       Never
2       Rarely
3       Sometimes
4       Most of the time
5       Always
Information on the coding of variables can be found using the following link:
ftp://ftp.cdc.gov/pub/data/yrbs/2005/2005NationalYRBSDataUsersManual.pdf
Goal: To describe the seat belt use of high school students when riding in a car driven by someone else.
Learning Outcomes:
1. Learn how to enter data files into JMP.
2. Learn how to create data files in JMP.
3. Use JMP to create graphical displays and summary statistics for a single categorical variable.
4. Copy JMP results to another application for report writing.
ACCESSING THE DATA FILE
The data file Youth_Behavior_Survey.JMP is posted on the course website. You can double-click on this filename to start JMP; alternatively, you can save this file to your computer, open JMP, and then select Open Data Table or File > Open and choose the file’s location. A spreadsheet containing the data from the 2005 study will appear, a portion of which is shown below. Note the setup of the data: in most cases, the variables in the data set are presented as columns while the observations in the data set are presented as rows.
SUMMARIES FOR A SINGLE CATEGORICAL VARIABLE
For our first example, we will work with only the variables Grade (In what grade are you?) and Seatbelt Use (How often the respondent wore a seatbelt when riding in a car driven by someone else). Note that the icons next to the variable names identify both of these variables as categorical.
Numerical Summaries: Frequency and Relative Frequency Distributions
First, we will concentrate on the distribution of the grade levels. Select Analyze > Distribution and place the Grade variable (In what grade are you?) in the Y, Columns box.
JMP returns the frequency (in the Count column) and relative frequency (in the Prob column) distributions for these data:
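For reference, the two columns are simple to define: Count is the number of observations at each level, and Prob is Count divided by the total number of observations. The Python sketch below illustrates that arithmetic outside of JMP. The list `responses` is a made-up sample used only for illustration (it is not the survey data), and the function name `frequency_table` is hypothetical; the value labels are the Grade codes listed at the top of this tutorial.

```python
from collections import Counter

# Value labels for Grade, taken from the variable list in this tutorial.
GRADE_LABELS = {
    1: "9th Grade",
    2: "10th Grade",
    3: "11th Grade",
    4: "12th Grade",
    5: "Ungraded or Other Grade",
}

# Hypothetical coded responses, made up for illustration (NOT survey data).
responses = [1, 2, 2, 3, 1, 4, 3, 2, 1, 4]


def frequency_table(values):
    """Return a list of (level, count, relative frequency) tuples."""
    counts = Counter(values)
    n = len(values)
    return [(level, counts[level], counts[level] / n) for level in sorted(counts)]


table = frequency_table(responses)
for level, count, prob in table:
    print(f"{GRADE_LABELS[level]:<25}{count:>6}{prob:>9.5f}")
```

Levels with no observations (here, code 5) do not appear in the table, and the relative frequencies of the levels that do appear sum to 1.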
Graphical Summaries: Bar Charts, Mosaic Plots, and Pie Charts
When you select Analyze > Distribution in JMP, you automatically see a bar chart in addition to the frequency distributions. The bar chart for our example is as follows:
Note that a relative frequency axis has been added to the bar chart by selecting that option from the Histogram Options menu. To access this menu, click on the red drop-down arrow next to the variable name. Also, you can change the orientation of the graph by selecting Display Options > Horizontal Layout.
To request a mosaic plot (if it does not appear automatically), select Mosaic Plot from the same menu through the red drop-down arrow:
Alternatively, you can construct a bar chart by selecting Graph > Chart from the main menu in JMP. Select % of Total from the Statistics drop-down menu. Next, place the variable of interest in the Categories, X, Levels box and click OK.
The bar chart is shown below:
To obtain a pie chart instead of a bar chart, select the Pie Chart option from the red drop-down menu. Note that the labels were added to the chart by right-clicking on the chart area and selecting Label > Label by Percent of Total Values.
Questions:
1. What can you say about the distribution of the grade levels?
2. Does your answer to the previous question have any impact on your confidence in the overall survey results? Explain.
ANOTHER EXAMPLE
Recall that our goal in this tutorial is to describe the seat belt use of high school students when riding in a car driven by someone else. That is, we need to investigate the distribution of Seatbelt Use. To do this, choose Analyze > Distribution and place the Seatbelt Use variable (How often wore a seat belt) in the Y, Columns box. You can then use both graphical and numerical summaries to describe seat belt use:
Questions:
1. What can you say about the distribution of seatbelt use?
MANUALLY ENTERING DATA IN JMP
Suppose that the raw data were not available; that is, suppose that you have only the frequency table and have been asked to create graphical summaries.
How Often Wore a Seat Belt    # of Subjects
Never                         421
Rarely                        1029
Sometimes                     1917
Most of the Time              3977
Always                        6548
In JMP, select File > New > Data Table. We need two columns in our spreadsheet to enter this table: one for the seat belt use category and one for the number of subjects in that category. To add columns to a spreadsheet, simply double click to the right of the last column given. The first variable, Seat Belt Use, is categorical. Double click in the header for Column 1, and give the variable a name. Since this variable is categorical/nominal, we specify the Data Type as Character and the Modeling Type as Nominal.
Now you can type in the category names.
For the second variable, double click on Column 2. You can change the variable name to # of Subjects and enter the frequencies for each category. You must tell JMP that these numbers are frequencies and that you wish the computer to interpret them as such. To do so, right click on the variable name and select Preselect Role > Freq.
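To see what treating a column as frequencies means, note that each row of the table stands for that many respondents, so the relative frequency of a category is its count divided by the sum of all counts. The Python sketch below carries out that arithmetic on the frequency table given above. It is an illustration of the calculation only, not of how JMP works internally, and the names `seat_belt_counts` and `relative_frequencies` are hypothetical.

```python
# Frequency table from this tutorial: category -> number of subjects.
seat_belt_counts = {
    "Never": 421,
    "Rarely": 1029,
    "Sometimes": 1917,
    "Most of the time": 3977,
    "Always": 6548,
}


def relative_frequencies(counts):
    """Return {category: count / total} for a table of frequencies."""
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


total_subjects = sum(seat_belt_counts.values())
probs = relative_frequencies(seat_belt_counts)

print(f"Total subjects: {total_subjects}")
for category, p in probs.items():
    print(f"{category:<18}{seat_belt_counts[category]:>6}{p:>9.5f}")
```

Rounded to five decimal places, these proportions agree with the Prob column in the pasted JMP output shown later in this tutorial, and the total is 13892.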
When you are finished, your spreadsheet should look like the following:
You can save this spreadsheet by selecting File > Save and giving it a name. Note that JMP files will always have a .JMP file extension. Now, you can create frequency tables, bar charts, mosaic plots, and pie charts by following the same steps as before. For example, select Analyze > Distribution and place Seat Belt Use in the Y, Columns box.
JMP returns the following:
USING JMP OUTPUT IN REPORTS
When you are preparing homework assignments, you should copy and paste relevant JMP output into a word processor. In addition, your answers to questions should be typed next to the relevant output. To copy JMP output, simply select Edit > Copy from the output window. Then, you should be able to paste the results as follows:
Distributions
Seatbelt Use
[Bar chart of Seatbelt Use]
Frequencies
Level               Count     Prob
Always              6548      0.47135
Most of the time    3977      0.28628
Never               421       0.03031
Rarely              1029      0.07407
Sometimes           1917      0.13799
Total               13892     1.00000
N Missing 0
5 Levels
Alternatively, you can paste the contents as a picture instead of text. To do so, select Edit > Paste Special > Bitmap, or you can choose another of the picture options.
Now, you can answer questions and summarize the results of your analysis below this JMP output.
SAVING YOUR WORK IN JMP
To save your work in JMP, go to the data table (NOT THE WINDOW CONTAINING THE ANALYSIS) and select File > Save As. Give the .JMP file a name, and save it in the location of your choice. When you exit JMP, you will be prompted with the following:
Click Yes. Now, when you open the file again, all of your work should appear. Even though JMP allows you to save your work, I encourage you to copy your JMP output and paste it into a word processor document as you progress through an assignment. Otherwise, if you accidentally click ‘No’ when you exit JMP, your work will be lost!