Creating An SPSS Database & Exploring Variables
Creating an SPSS Database: Charles M. Friel Ph.D., Criminal Justice Center, Sam Houston State University

The Problem

Create an SPSS database using the criminal record variables in the attached table.

The Variables

The variables in the attached table and their associated operational definitions are as follows:

Subject identification number (ID); numbered 1 to 20
Length of sentence; sentence, measured in years
Race/ethnicity; race, 1=white, 2=black, 3=hispanic
Number of prior convictions; pr_conv, the actual number
Age; age, current age in years
Age at first arrest; age_firs, in years
Gender; gender, 0=male, 1=female

The Analysis

The purpose of this analysis is to construct an SPSS database and examine the metric and nonmetric variables for missing data, outliers, and distributional dynamics. After generating and studying the SPSS output, answer the following questions.

1. Are there missing data for any of the cases on any of the six variables in the database?

2. Relative to the metric variable age:
   What is the average?
   Is the distribution skewed and, if so, in what direction?
   How would you characterize the kurtosis of the distribution?
   Is the mean different from the median, and what does this imply?
   How would you characterize the stem and leaf plot?
   What does the box plot indicate about the distribution?
   Are there outliers in the distribution? How do you know? What are the case numbers of these cases?

3. Relative to the metric variable age at first arrest:
   What is the average?
   Is the distribution skewed and, if so, in what direction?
   How would you characterize the kurtosis of the distribution?
   Is the mean different from the median, and what does this imply?
   How would you characterize the stem and leaf plot?
   What does the box plot indicate about the distribution?
   Are there outliers in the distribution? How do you know? What are the case numbers of these cases?

4. Relative to the metric variable prior convictions:
   What is the average?
   Is the distribution skewed and, if so, in what direction?
   How would you characterize the kurtosis of the distribution?
   Is the mean different from the median, and what does this imply?
   How would you characterize the stem and leaf plot?
   What does the box plot indicate about the distribution?
   Are there outliers in the distribution? How do you know? What are the case numbers of these cases?

5. Relative to the metric variable sentence:
   What is the average?
   Is the distribution skewed and, if so, in what direction?
   How would you characterize the kurtosis of the distribution?
   Is the mean different from the median, and what does this imply?
   How would you characterize the stem and leaf plot?
   What does the box plot indicate about the distribution?
   Are there outliers in the distribution? How do you know? What are the case numbers of these cases?

6. Based upon the frequencies of and bar charts for the variables gender and race, how would you characterize the distributional dynamics of these two nonmetric variables?

Criminal Record Data

Subjects  Sentence  Race  Pr_conv  Age  Age_firs  Gender
    1         1       1      0      18     15        1
    2         1       2      1      19     17        1
    3         1       3      2      18     16        0
    4         1       3      0      20     17        0
    5         2       2      1      19     14        0
    6         2       3      0      21     17        0
    7         2       2      1      20     17        1
    8         2       3      2      21     16        0
    9         3       1      1      23     17        1
   10         3       3      0      18     15        0
   11         3       3      2      19     17        1
   12         3       1      1      18     14        0
   13         3       3      0      24     17        0
   14         4       2      2      22     16        0
   15         4       3      3      21     16        0
   16         4       1      1      19     15        1
   17         4       1      3      26     14        1
   18         4       2      0      23     15        0
   19         5       2      0      24     14        1
   20         5       3      2      18     16        1

SPSS Procedures

Problem 1: Creating an SPSS database

SPSS
  File
    New

Step 1: Naming a variable
  Click on the title (var) of the first column on the left of the spreadsheet
  Data
    Define Variable
      Variable Name: Subjects
      OK

Step 2: Entering the data
  Click on the cell in the 1st column, 1st row
  Enter the 1st subject’s ID number, i.e. 1
  Press return
  Repeat these steps to enter the ID numbers of the other subjects

Step 3: Completing the data entry
  Repeat steps 1 and 2 to enter the names and data for the remaining variables

Step 4: Naming and saving the database
  File
    Save as: Criminal Records Database
    Save as Type: SPSS Data
    Save

Step 5: Printing the database
  File
    Print
      Copies: 1
      Print
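The same data entry can be mirrored outside SPSS as a cross-check. The following is a minimal Python sketch, not part of the SPSS procedure itself: it enters each variable as a column, confirms that every column has the same number of cases, and produces comma-separated text. The variable names and values come from the Criminal Record Data table; the helper name `build_cases` and the choice of CSV text as the output are assumptions made for illustration.

```python
import csv
import io

# Columns copied from the Criminal Record Data table (20 subjects each).
columns = {
    "subjects": list(range(1, 21)),
    "sentence": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5],
    "race":     [1, 2, 3, 3, 2, 3, 2, 3, 1, 3, 3, 1, 3, 2, 3, 1, 1, 2, 2, 3],
    "pr_conv":  [0, 1, 2, 0, 1, 0, 1, 2, 1, 0, 2, 1, 0, 2, 3, 1, 3, 0, 0, 2],
    "age":      [18, 19, 18, 20, 19, 21, 20, 21, 23, 18,
                 19, 18, 24, 22, 21, 19, 26, 23, 24, 18],
    "age_firs": [15, 17, 16, 17, 14, 17, 17, 16, 17, 15,
                 17, 14, 17, 16, 16, 15, 14, 15, 14, 16],
    "gender":   [1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1],
}


def build_cases(cols):
    """Turn column lists into one row (dict) per subject (hypothetical helper)."""
    lengths = {len(values) for values in cols.values()}
    if len(lengths) != 1:
        raise ValueError("every variable must have the same number of cases")
    names = list(cols)
    return [dict(zip(names, row)) for row in zip(*cols.values())]


cases = build_cases(columns)

# Write the cases as CSV text: one header row plus one row per subject.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(columns))
writer.writeheader()
writer.writerows(cases)
csv_text = buffer.getvalue()
print(csv_text)
```

Raising an error on unequal column lengths catches the most common data-entry slip, a skipped or doubled value, before any statistics are computed.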


AGE 18.00 19.00 18.00 20.00 19.00 21.00 20.00 21.00 23.00 18.00 19.00 18.00 24.00 22.00 21.00 19.00 26.00 24.00 18.00

AGE_FIRS 15.00 17.00 16.00 17.00 14.00 17.00 17.00 16.00 17.00 15.00 17.00 14.00 17.00 16.00 16.00 15.00 14.00 14.00 16.00

GENDER 1.00 1.00 .00 .00 .00 .00 1.00 .00 1.00 .00 1.00 .00 .00 .00 .00 1.00 1.00 1.00 1.00

PR_CONV .00 1.00 2.00 .00 1.00 .00 1.00 2.00 1.00 .00 2.00 1.00 .00 2.00 3.00 1.00 .00 .00 2.00

RACE 1.00 2.00 3.00 3.00 2.00 3.00 2.00 3.00 1.00 3.00 3.00 1.00 3.00 2.00 3.00 1.00 2.00 2.00 3.00

SENTENCE 1.00 2.00 3.00 3.00 2.00 3.00 2.00 3.00 1.00 3.00 3.00 1.00 3.00 2.00 3.00 1.00 2.00 2.00 3.00

Number of cases listed: 20

Problem 2: Exploring the Variables

2.1 For the metric variables
Analyze
  Descriptive Statistics
    Explore
      Age, Age_firs, Pr_conv, Sentence
      OK

Explore

Case Processing Summary

                                 Cases
                Valid           Missing           Total
              N    Percent     N    Percent     N    Percent
AGE          20    100.0%      0      .0%      20    100.0%
AGE_FIRS     20    100.0%      0      .0%      20    100.0%
PR_CONV      20    100.0%      0      .0%      20    100.0%
SENTENCE     20    100.0%      0      .0%      20    100.0%
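The valid, missing, and total counts in the Case Processing Summary can be reproduced by hand. The sketch below assumes that a missing value would be entered as `None`; that convention, and the helper name `case_processing_summary`, are illustrative assumptions rather than anything SPSS prescribes. The values are taken from the Criminal Record Data table.

```python
# Metric variables from the Criminal Record Data table.
# Assumption: a missing value would be recorded as None.
columns = {
    "age":      [18, 19, 18, 20, 19, 21, 20, 21, 23, 18,
                 19, 18, 24, 22, 21, 19, 26, 23, 24, 18],
    "age_firs": [15, 17, 16, 17, 14, 17, 17, 16, 17, 15,
                 17, 14, 17, 16, 16, 15, 14, 15, 14, 16],
    "pr_conv":  [0, 1, 2, 0, 1, 0, 1, 2, 1, 0, 2, 1, 0, 2, 3, 1, 3, 0, 0, 2],
    "sentence": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5],
}


def case_processing_summary(cols):
    """Count valid and missing cases per variable (hypothetical helper)."""
    summary = {}
    for name, values in cols.items():
        total = len(values)
        missing = sum(1 for value in values if value is None)
        valid = total - missing
        summary[name] = {
            "valid": valid,
            "valid_pct": 100 * valid / total,
            "missing": missing,
            "missing_pct": 100 * missing / total,
            "total": total,
        }
    return summary


summary = case_processing_summary(columns)
for name, row in summary.items():
    print(f"{name:<9} valid={row['valid']} ({row['valid_pct']:.1f}%)  "
          f"missing={row['missing']} ({row['missing_pct']:.1f}%)  "
          f"total={row['total']}")
```

A variable with any missing cases would show a valid percent below 100.0, which is the quickest way to answer question 1.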


Descriptives

Statistic                              AGE        AGE_FIRS    PR_CONV     SENTENCE
Mean                                   20.5500    15.7500     1.1000      2.2000
  Std. Error of Mean                   .5403      .2603       .2283       .1864
95% Confidence Interval for Mean
  Lower Bound                          19.4190    15.2052     .6222       1.8099
  Upper Bound                          21.6810    16.2948     1.5778      2.5901
5% Trimmed Mean                        20.3889    15.7778     1.0556      2.2222
Median                                 20.0000    16.0000     1.0000      2.0000
Variance                               5.839      1.355       1.042       .695
Std. Deviation                         2.4165     1.1642      1.0208      .8335
Minimum                                18.00      14.00       .00         1.00
Maximum                                26.00      17.00       3.00        3.00
Range                                  8.00       3.00        3.00        2.00
Interquartile Range                    4.5000     2.0000      2.0000      1.7500
Skewness                               .740       -.347       .442        -.412
  Std. Error of Skewness               .512       .512        .512        .512
Kurtosis                               -.410      -1.341      -.905       -1.434
  Std. Error of Kurtosis               .992       .992        .992        .992
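To see where the Descriptives figures come from, the statistics for age can be recomputed directly. This is a minimal Python sketch under stated assumptions: the skewness and kurtosis functions use the usual bias-adjusted sample formulas, which are assumed here to be the ones behind the SPSS output, and the critical value 2.093 (two-sided 95%, 19 degrees of freedom) is taken from a standard t table. The function names are hypothetical.

```python
import math
import statistics

# Age values from the Criminal Record Data table.
ages = [18, 19, 18, 20, 19, 21, 20, 21, 23, 18,
        19, 18, 24, 22, 21, 19, 26, 23, 24, 18]

# Assumption: two-sided 95% critical value of t with n - 1 = 19 df.
T_CRIT_95_DF19 = 2.093


def skewness(values):
    """Bias-adjusted sample skewness."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    z3 = sum(((v - mean) / sd) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * z3


def excess_kurtosis(values):
    """Bias-adjusted sample kurtosis, reported relative to the normal curve."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    z4 = sum(((v - mean) / sd) ** 4 for v in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * z4 - correction


def skewness_se(n):
    """Standard error of skewness; depends only on the sample size."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def kurtosis_se(n):
    """Standard error of kurtosis; depends only on the sample size."""
    return 2 * skewness_se(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


n = len(ages)
mean = statistics.mean(ages)
median = statistics.median(ages)
variance = statistics.variance(ages)      # divides by n - 1
sd = statistics.stdev(ages)
se_mean = sd / math.sqrt(n)
ci_low = mean - T_CRIT_95_DF19 * se_mean
ci_high = mean + T_CRIT_95_DF19 * se_mean

print(f"mean={mean:.4f}  se={se_mean:.4f}  median={median:.4f}")
print(f"95% CI=({ci_low:.4f}, {ci_high:.4f})")
print(f"variance={variance:.3f}  sd={sd:.4f}")
print(f"skewness={skewness(ages):.3f}  (se {skewness_se(n):.3f})")
print(f"kurtosis={excess_kurtosis(ages):.3f}  (se {kurtosis_se(n):.3f})")
```

A mean above the median together with a positive skewness points to a longer tail on the high side; a negative kurtosis value points to a distribution flatter than the normal curve.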

AGE

AGE Stem-and-Leaf Plot

 Frequency    Stem &  Leaf
     9.00        1 .  888889999
    10.00        2 .  0011123344
     1.00        2 .  6

 Stem width:   10.00
 Each leaf:    1 case(s)

[Box plot of AGE, N = 20; vertical axis scaled from 16 to 28]
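The stem-and-leaf display for age can be rebuilt in a few lines, which helps when reading the plot. The sketch below is an approximation, not SPSS's own algorithm: it assumes whole-number values, a stem width of 10, and that each stem is split into a low half (leaves 0 to 4) and a high half (leaves 5 to 9), as the AGE output appears to do. The function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

# Age values from the Criminal Record Data table.
ages = [18, 19, 18, 20, 19, 21, 20, 21, 23, 18,
        19, 18, 24, 22, 21, 19, 26, 23, 24, 18]


def stem_and_leaf(values, stem_width=10):
    """Return display lines, splitting each stem into low and high halves."""
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, stem_width)
        half = 0 if leaf < stem_width / 2 else 1
        rows[(stem, half)].append(leaf)

    first, last = min(rows), max(rows)
    lines = []
    key = first
    while key <= last:
        # Empty half-stems between the extremes are printed with no leaves.
        leaves = "".join(str(leaf) for leaf in rows.get(key, []))
        lines.append(f"{len(leaves):5.2f}  {key[0]:>3} .  {leaves}")
        # Move from the low half to the high half, then to the next stem.
        key = (key[0], 1) if key[1] == 0 else (key[0] + 1, 0)
    return lines


plot_lines = stem_and_leaf(ages)
print("Frequency  Stem & Leaf")
for line in plot_lines:
    print(line)
```

Each leaf is the last digit of one case, so the length of a row is the frequency for that half-stem and the shape of the rows is a sideways histogram of the distribution.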

AGE_FIRS

AGE_FIRS Stem-and-Leaf Plot

 Frequency    Stem &  Leaf
     4.00       14 .  0000
      .00       14 .
     4.00       15 .  0000
      .00       15 .
     5.00       16 .  00000
      .00       16 .
     7.00       17 .  0000000

 Stem width:   1.00
 Each leaf:    1 case(s)

[Box plot of AGE_FIRS, N = 20; vertical axis scaled from 13.5 to 17.5]


PR_CONV

PR_CONV Stem-and-Leaf Plot

 Frequency    Stem &  Leaf
     7.00        0 .  0000000
      .00        0 .
     6.00        1 .  000000
      .00        1 .
     5.00        2 .  00000
      .00        2 .
     2.00        3 .  00

 Stem width:   1.00
 Each leaf:    1 case(s)

[Box plot of PR_CONV, N = 20; vertical axis scaled from -.5 to 3.5]

SENTENCE

SENTENCE Stem-and-Leaf Plot

 Frequency    Stem &  Leaf
     5.00        1 .  00000
      .00        1 .
     6.00        2 .  000000
      .00        2 .
     9.00        3 .  000000000

 Stem width:   1.00
 Each leaf:    1 case(s)

[Box plot of SENTENCE, N = 20; vertical axis scaled from .5 to 3.5]
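The outlier questions can be checked numerically as well as by reading the box plots. The following Python sketch applies the common 1.5 x interquartile range rule and reports the case numbers of any values beyond the fences. It is an approximation of what a box plot shows: SPSS draws its box from hinges, whereas this sketch uses quartiles from `statistics.quantiles`, so results could differ slightly on other data. It is shown for three of the metric variables from the Criminal Record Data table; the function name `find_outliers` is hypothetical.

```python
import statistics

# Metric variables from the Criminal Record Data table; list position + 1
# is the subject (case) number.
columns = {
    "age":      [18, 19, 18, 20, 19, 21, 20, 21, 23, 18,
                 19, 18, 24, 22, 21, 19, 26, 23, 24, 18],
    "age_firs": [15, 17, 16, 17, 14, 17, 17, 16, 17, 15,
                 17, 14, 17, 16, 16, 15, 14, 15, 14, 16],
    "pr_conv":  [0, 1, 2, 0, 1, 0, 1, 2, 1, 0, 2, 1, 0, 2, 3, 1, 3, 0, 0, 2],
}


def find_outliers(values, k=1.5):
    """Flag cases lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    cases = [case for case, value in enumerate(values, start=1)
             if value < low or value > high]
    return {"iqr": iqr, "fences": (low, high), "cases": cases}


results = {name: find_outliers(values) for name, values in columns.items()}
for name, result in results.items():
    low, high = result["fences"]
    print(f"{name:<9} IQR={result['iqr']:.4f}  fences=({low:.2f}, {high:.2f})  "
          f"outlier cases={result['cases']}")
```

An empty case list means no value falls outside the fences, which corresponds to a box plot with no points drawn beyond the whiskers.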


2.2 For the categorical variables
Analyze
  Descriptive Statistics
    Frequencies
      Variables: gender, race
      Statistics
        Central Tendency: Mode
        Continue
      Charts
        Chart Type: Bar Chart
        Values: Frequencies
        Continue
      OK

Frequencies

Statistics

                 GENDER    RACE
N     Valid          20      20
      Missing         0       0
Mode                .00    3.00

Frequency Table

GENDER

                 Frequency   Percent   Valid Percent   Cumulative Percent
Valid   .00             11      55.0            55.0                 55.0
        1.00             9      45.0            45.0                100.0
        Total           20     100.0           100.0

RACE

                 Frequency   Percent   Valid Percent   Cumulative Percent
Valid   1.00             5      25.0            25.0                 25.0
        2.00             6      30.0            30.0                 55.0
        3.00             9      45.0            45.0                100.0
        Total           20     100.0           100.0
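The frequency tables for the two nonmetric variables can be rebuilt by counting codes. The sketch below is a minimal Python illustration using the gender and race values from the Criminal Record Data table; the function name `frequency_table` is hypothetical, and the code labels follow the operational definitions (gender: 0=male, 1=female; race: 1=white, 2=black, 3=hispanic).

```python
import statistics
from collections import Counter

# Nonmetric variables from the Criminal Record Data table.
gender = [1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1]
race = [1, 2, 3, 3, 2, 3, 2, 3, 1, 3, 3, 1, 3, 2, 3, 1, 1, 2, 2, 3]


def frequency_table(values):
    """Return (code, frequency, percent, cumulative percent) rows."""
    counts = Counter(values)
    total = len(values)
    rows = []
    running = 0
    for code in sorted(counts):
        running += counts[code]
        rows.append((code, counts[code],
                     100 * counts[code] / total,
                     100 * running / total))
    return rows


gender_table = frequency_table(gender)
race_table = frequency_table(race)
gender_mode = statistics.mode(gender)
race_mode = statistics.mode(race)

for label, table, mode in (("GENDER", gender_table, gender_mode),
                           ("RACE", race_table, race_mode)):
    print(f"{label} (mode = {mode})")
    for code, freq, pct, cum_pct in table:
        print(f"  {code}: frequency={freq:2d}  percent={pct:5.1f}  "
              f"cumulative={cum_pct:5.1f}")
```

Because these variables are nominal codes, the mode and the percentages are the meaningful summaries; a mean of the codes would not be interpretable.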


Bar Chart

[Bar chart of GENDER: Frequency (vertical axis, 0 to 12) for the categories .00 and 1.00]

[Bar chart of RACE: Frequency (vertical axis, 0 to 10) for the categories 1.00, 2.00, and 3.00]

