Defining and Measuring Variables (Part 3!!!)
Defining and Measuring
Variables
(Part 3!!!)

PSYC 350

September 13, 2010
Outline: Selecting a measurement

1. Modalities of measurement
2. Scales/levels of measurement
3. Validity and reliability of measurement
4. Other aspects of measurement
Reliability and Validity
How can we be sure that the measurements
obtained from an operational definition
actually represent the intangible construct
we're interested in?

Researchers have developed two general
criteria for evaluating the quality of
measurement procedures.
Reliability and Validity
• Reliability
– The consistency or stability of a measure
– Does it measure the same thing each time?

• Validity
– The truthfulness of a measure
– Does it measure what it intends to measure?
Reliability of measurement
Reliability
– The consistency or stability of a measure
– Does it measure the same thing each time?

A measurement is reliable if repeated
measurements of the same individual
under the same conditions produce
identical (or nearly identical) values.
Reliability of measurement
Reliability

Includes the notion that each individual
measurement has an element of error.

Measured score = True score + Error

A measurement is reliable if measurement
error is small
Reliability of measurement
Measured score = True score + Error

E.g., IQ score:

Your measured IQ partly reflects your actual level
of intelligence (your true score) …

… but it is also influenced by other factors like your mood,
health, luck in guessing (error).
Reliability of measurement
Measured score = True score + Error
As long as error is small, reliability is good.

(e.g., IQ tests)

If the error component is large, the measurement
is not reliable.
(e.g., reaction time tests)
Reliability of measurement
Common Sources of Error
1. Observer error

Simple human error (such as lack of precision)
in measuring.

E.g., four people with stopwatches recording the
winner's time in a race will record slightly different
times because of differences in judgment and reaction time.
Reliability of measurement
Common Sources of Error
2. Environmental changes

It's not really possible to measure the same
individual at different times under identical
circumstances.

Small environmental changes can influence
measurement (e.g., temperature, time of day,
background noise)
Reliability of measurement
Common Sources of Error
3. Participant changes

The participant can change between
measurements

E.g., mood, body temperature, hunger, fatigue
Reliability of measurement
• Forms of reliability
– Test-retest reliability
– Split-half reliability
– Inter-rater reliability
Reliability
• Test-retest reliability
– A test of stability over time
– Test people with exactly the same test (or an
equivalent version of the test) at two time points and
compare scores
– Potential problems
• First testing contaminates second
• Participants change over time
Reliability
• Split-half reliability
– A measure of consistency within a measure
– Relatedness of items on a test or questionnaire
When we give a multi-item test, we assume that the
different questions measure a part or aspect of the
construct.
If this is true, there should be some consistency among
the items.
Researchers split the set of items in half, and then
evaluate whether they are in agreement.
Reliability
• Inter-rater reliability
– A measure of agreement between two
observers of the same behaviors
– High reliability demonstrates strong definitions
of behavioral categories
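A minimal sketch of inter-rater agreement, using made-up codes from two hypothetical raters (percent agreement is the simplest index; Cohen's kappa, which also discounts agreement expected by chance, is a standard extension not covered on the slide):

```python
# Two hypothetical raters classify the same 12 observed behaviors
rater_a = ["agg", "not", "agg", "agg", "not", "not",
           "agg", "not", "agg", "not", "not", "agg"]
rater_b = ["agg", "not", "agg", "not", "not", "not",
           "agg", "not", "agg", "not", "agg", "agg"]

# Percent agreement: fraction of behaviors coded identically
agreements = sum(a == b for a, b in zip(rater_a, rater_b))
percent_agreement = agreements / len(rater_a)  # 10 of 12 codes match

# Cohen's kappa corrects for the agreement two raters would reach by chance
p_agg_a = rater_a.count("agg") / len(rater_a)
p_agg_b = rater_b.count("agg") / len(rater_b)
p_chance = p_agg_a * p_agg_b + (1 - p_agg_a) * (1 - p_agg_b)
kappa = (percent_agreement - p_chance) / (1 - p_chance)
```

Low agreement would signal that the behavioral categories are defined too vaguely for two observers to apply them the same way.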
Validity of measurement
Validity
– The truthfulness of a measure
– Does it measure what it intends to measure?

E.g., IQ: Think of the absent-minded professor
with an IQ of 160 who functions poorly in
everyday life.
Validity of measurement
Types of Validity
– Face validity
– Concurrent validity
– Predictive validity
– Construct validity
• Convergent validity
• Divergent validity
Validity
• Face validity
– Appearance of validity
– Based on subjective judgment
– Difficult to quantify
– Little scientific value

Potential problem with high face validity: you don't always want
participants to know what you're measuring.
Validity
• Concurrent validity
– Scores obtained from a new measure covary
with (are consistent with) scores from an
established measure of the same construct
But … caution: the fact that two sets of
measurement are related doesn't mean that they
are measuring the same thing.
E.g., measuring people's height by weighing
them
Validity
• Predictive validity
– Scores obtained from a measure accurately predict
future behavior

Most theories make predictions about how different values
of a construct will affect behavior.

If people's behavior is consistent with the results of a test,
the measure has predictive validity.

E.g., "need for achievement" score predicted
children's behavior in a game
Validity
• Construct validity
– How well a test assesses the underlying construct
– Is the measurement of the variable consistent with
everything we know about the construct and how it should
relate to other variables?
E.g., measuring people's height by weighing them is ok in terms of
concurrent validity (height and weight are correlated), but not in
terms of construct validity (height is not influenced by food
deprivation but weight is).
Construct validity
Two types of Construct Validity

• Convergent validity

• Divergent validity
Construct validity
• Convergent validity
– Scores on the measure are related to scores
on measures of related constructs
– Use two methods to measure the same
construct and then show that the scores are
strongly related
Note: There is a subtle difference between concurrent validity and
convergent validity. Concurrent validity compares a new measure with an
established measure of the same construct; convergent validity compares
measures of related constructs obtained by different methods.
Construct validity
• Divergent validity
– The test does not measure something it was
not intended to measure
– Demonstrate that we are measuring one
specific construct (and not a mixture of two)
Construct validity
• Divergent validity
– Use two methods to measure two
constructs, then demonstrate:
1. convergent validity for each construct
2. no relationship between scores for the two
constructs when measured by the same
method.
Construct validity
• Convergent and divergent validity
E.g.:
–   want to measure aggression in children, but worried
that your measure might reflect general activity level
–   get observations of aggression and activity, and get
teacher ratings of aggression and activity
–   aggression measures should match, activity
measures should match, the two observational
measures should not match, and the two ratings
measures should not match
Reliability and validity
Indicate whether the researcher is interested in demonstrating reliability or
validity. Specify the type of reliability or validity being described.

1. Dr. Chang has written a test to measure extroversion. He is worried
that his test might simply measure social desirability (i.e., people will
respond to the items the way they think they should respond, instead
of being honest). To make sure his test does not measure social
desirability, he gives 100 research participants his measure of
extroversion and a measure of social desirability and calculates the
correlation between the two scales. Dr. Chang is trying to
demonstrate ______________________________.

2. Dr. Diefenbacher has written two versions of her Religiosity Scale.
Each version is 20 items in length. She gives all 40 items to 100
research participants. Then she examines the correlation between
the two sets of 20 items. Dr. Diefenbacher is interested in demonstrating
_____________________________.
Other aspects of measurement
• Sensitivity and range effects
– Ceiling effects
– Floor effects
Other aspects of measurement
• Participant reactivity
– Social desirability
– Demand characteristics

• Observer bias
– Expectancy effects
Other aspects of measurement
• Participant reactivity
– Social desirability
– Demand characteristics

• Observer bias
– Expectancy effects

• Potential solutions
– Deception
– Blinding (single-blind & double-blind studies)
