Studying behavior


                       Variable, validity, reliability

                            Dr. Yan Liu
Department of Biomedical, Industrial and Human Factors Engineering
                     Wright State University
                                         Variables
   Definition
       An event, situation, behavior, or individual characteristic that varies
            cognitive task performance, word length, intelligence, etc.
   Types of Variables by Measurement Scales
            Nominal variables
                  Values are unordered labels
                  color, name
            Ordinal variables
                  Possible levels are ordered in sequence
                  ranking of preference, satisfaction
            Interval variables
                  Differences between values are meaningful; no true zero
                  clock time of a flight’s departure, temperature (in °C or °F)
            Ratio variables
                  Values are real numbers with true zeros
                  size, weight
              Operational Definitions of Variables
   Operational Definition of Variable
       Definition of the variable in terms of the operations or techniques the researcher
        uses to measure or manipulate it
   Why Operational Definitions
       Some variables must be operationally defined so they can be studied empirically
            “cognitive task performance” can be defined as the number of errors made in
             detecting a target object on a screen
       The task of operationally defining a variable forces scientists to discuss abstract
        concepts in concrete terms
       Operational definitions help us communicate our ideas to others
   Which Operational Definition to Use
       A variety of methods to operationally define a variable may be available, each of
        which has advantages and disadvantages
       The decision on which operational definition to use in a study should be based on
        the goal of the study and other considerations (e.g. ethics and cost)
                Relationship Between Variables
   Types of Relationship in Interval and Ratio Variables
       Positive linear relationship
       Negative linear relationship
       Curvilinear relationship
       No relationship
   Relationships and Reduction of Uncertainty
       Detecting relationships between variables means reducing our uncertainty about
        the nature of the variables
       Error variance (random variability)
       Research is aimed at reducing error variance by identifying systematic
        relationships between variables




[Figure: four scatterplots showing a perfect positive linear relationship, a perfect
negative linear relationship, a curvilinear relationship (quadratic, with an
intermediate minimum), and no relationship]
Suppose you have surveyed 200 people about whether or not they like shopping, and
100 people said Yes and the remaining 100 said No. What can you conclude from this
information?
When you meet a person, you can only make a random guess whether the person likes
shopping or not, and the chance your answer is correct (either way) is 50%.
Suppose you have also asked people to indicate their gender, and found the following
relationship between gender and attitude toward shopping.

   Like shopping?            Male   Female
   Yes                        30      70
   No                         70      30
   Number of participants    100     100

70% of males do not like shopping, and 70% of females like shopping.
Therefore, when you predict a person’s attitude toward shopping based on the
person’s gender, you will be correct 70% of the time.
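
The 50% and 70% figures above can be checked with a short calculation. The sketch below is illustrative only: the function name `best_guess_accuracy` and the dictionary layout are assumptions made for this example. It always predicts the most common answer within each group and reports how often that prediction would be correct.

```python
# Counts from the survey table above: answers to "Like shopping?" by gender.
counts = {
    "male": {"yes": 30, "no": 70},
    "female": {"yes": 70, "no": 30},
}


def best_guess_accuracy(groups):
    """Proportion correct when we always predict each group's most common answer."""
    total = sum(sum(answers.values()) for answers in groups.values())
    correct = sum(max(answers.values()) for answers in groups.values())
    return correct / total


# Ignoring gender: pool everyone into a single group (100 yes, 100 no).
pooled = {
    "all": {
        "yes": sum(answers["yes"] for answers in counts.values()),
        "no": sum(answers["no"] for answers in counts.values()),
    }
}

accuracy_without_gender = best_guess_accuracy(pooled)  # 100 / 200
accuracy_with_gender = best_guess_accuracy(counts)     # (70 + 70) / 200

print(f"Without gender: {accuracy_without_gender:.0%}")
print(f"With gender:    {accuracy_with_gender:.0%}")
```

Knowing gender raises the proportion of correct predictions from 0.5 to 0.7, which is what "reducing uncertainty" means in this example.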


                       Nonexperimental Methods
   Relationships are studied by making observations or measures of the
    variables of interest as behavior occurs naturally
       Observing behavior
            Requires trained personnel to observe and code behavior according to a prearranged
             set of criteria
       Asking people to describe their behavior (survey)
       Examining various public records (e.g. census data)
   A relationship between variables is established when the variables
    vary together
   Allows observing covariation between variables (“correlational
    method”)



     Disadvantages of Nonexperimental Methods
   Correlation Does Not Equal Causality
       The correlational method identifies whether two variables are associated but not
        how they are associated
   Issues
       Direction of cause and effect
            With a nonexperimental method, it is difficult to determine which variable causes the
             other
       The third-variable problem
            Extraneous variables may be causing the observed relationship
            Third variables introduce alternative explanations for the observed relationship
            A known yet uncontrolled third variable is called a confounding variable
                When two variables are confounded, their effects on another variable cannot be
                  separated



                           Experimental Methods
   Study cause and effect
   Use controlled observations and measurements to test hypotheses
   Experimental Control
       All extraneous variables are kept constant
            A variable that is held constant cannot be a confounding variable or be responsible
             for the results of the experiment
       Participants in all groups in the experiment are treated identically; the only
        difference between groups is the manipulated variable of interest
            If you design an experiment to compare the effectiveness of two visualization
             techniques, participants in the two conditions must have the same technical background
             related to using the visualization techniques and be given equivalent tasks; the
             lighting and all other relevant conditions must also be the same, and so on.




                  Experimental Methods (Cont’d)
   Randomization
       Used when it is difficult or costly to keep an extraneous variable constant
            Makes the composition of individual characteristics in the different groups
             virtually identical
       Assign participants to different groups in a random fashion
            Using a list of random numbers (Appendix C.1 in Cozby’s book)
            Computer-based random number generators (e.g. Excel, Randomizer
             (http://www.randomizer.org))
       Other variables that cannot be held constant are also controlled by
        randomization
            A random order is used to schedule the sequence of various experimental conditions




Suppose you have recruited 16 participants and need to assign them to 2 groups (8
in each group). Give each participant a random number: the 8 smallest numbers go to
group 1 and the 8 largest numbers go to group 2. (Participants 2 and 15 tie at 57; the
tie is broken arbitrarily.)

Participant      Random          Group
  Order          Number        Assignment
     1             56               1
     2             57               1
     3             51               1
     4             10               1
     5             69               2
     6              9               1
     7             75               2
     8              5               1
     9             78               2
     10            90               2
     11            79               2
     12             4               1
     13            79               2
     14            48               1
     15            57               2
     16            82               2
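
The ranking rule used in the table can be sketched in a few lines of Python. This is a minimal illustration, not the procedure from Cozby's book: the function name `assign_by_rank`, the seed, and the 1 to 100 range are assumptions made for this example. Ties are broken by participant order because Python's sort is stable.

```python
import random


def assign_by_rank(random_numbers, n_groups=2):
    """Rank participants by random number and split the ranking into equal groups.

    Returns one 1-based group label per participant, in the original order.
    """
    n = len(random_numbers)
    group_size = n // n_groups
    # Participant indices sorted from smallest to largest random number.
    order = sorted(range(n), key=lambda i: random_numbers[i])
    groups = [0] * n
    for rank, participant in enumerate(order):
        # Leftover participants (if n is not divisible) fall into the last group.
        groups[participant] = min(rank // group_size, n_groups - 1) + 1
    return groups


# Random numbers from the table above (participants 1 to 16).
table_numbers = [56, 57, 51, 10, 69, 9, 75, 5, 78, 90, 79, 4, 79, 48, 57, 82]
table_groups = assign_by_rank(table_numbers)
print(table_groups)

# For a new study, draw fresh random numbers instead of reusing the table.
rng = random.Random(2024)  # arbitrary seed, used only to make the run repeatable
new_numbers = [rng.randint(1, 100) for _ in range(16)]
new_groups = assign_by_rank(new_numbers)
print(new_groups)
```

Applied to the numbers in the table, the function reproduces the group assignment column, including the arbitrary resolution of the tie at 57.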
          Independent and Dependent Variables
   Independent Variable
       Manipulated by the researcher
       Considered to be the “cause” of the observed relationship
   Dependent Variable
       Studied and measured by the researcher to see if its values depend on the values
        of the independent variable
       Considered to be the “effect” of the observed relationship

[Figure: diagram relating the independent variable to the dependent variable, and a
graph with axes labeled "Independent variable" and "Dependent variable"]
Researchers conducted a study to examine the effect of music on exam scores. They
hypothesized that scores would be higher when students listened to soft music compared
to no music during the exam because the soft music would reduce students’ test anxiety.
One hundred students (50 males, 50 females) were randomly assigned to either the soft
music or no music conditions. Students in the music condition listened to music using
headphones during the exam. Fifteen minutes after the exam began, the researchers asked
the students to complete a questionnaire that measured test anxiety. Later, when the
exams were completed and graded, the scores were recorded.

As hypothesized, test anxiety was significantly lower and exam scores were significantly
higher in the soft music condition compared to the no music condition.


Where are the independent, dependent, mediating, and confounding variables?




        Disadvantages of Experimental Methods
   Artificiality of Experiments
       High degree of control and the laboratory setting may sometimes create an
        artificial atmosphere that may limit either the questions that can be addressed or
        the generality of the results
       Field experiment
            The independent variable is manipulated in a natural setting
            Researcher attempts to control extraneous variables via randomization or
             experimental control
            Researcher loses the ability to directly control many aspects of the situation, and thus
             the experiment suffers from the possibility of “contamination”
   Ethical and Practical Considerations
       Sometimes the experimental method is not a feasible alternative because
        experimentation would be either unethical or impractical




            Comparison of Non-Experimental and
                 Experimental Methods
   Non-Experimental
       Description
            Relationships studied by making observations or measuring variables as they
             exist naturally
       Advantages
            Allows measure of covariation between variables
            Behavior can be observed in a natural context
            Allows us to study participant variables that cannot be manipulated
       Disadvantages
            Difficult to infer cause and effect
            Direction and third-variable problem
            Difficult to control many aspects of the situation
   Experimental
       Description
            Direct manipulation and control of variables, then response or result is observed
       Advantages
            Reduces ambiguity in interpretation of results regarding cause and effect
            Attempts to eliminate the impact of all possible confounding third variables
            Permits greater experimental control
            Reduces the possible influence of extraneous variables through randomization
       Disadvantages
            High control may create an artificial atmosphere
            Can be unethical or impractical

             Evaluating Research: Three Validities
   Validity
       “Truth” and accurate representation of information
       Three types of validity to evaluate research
            Construct validity
            Internal validity
            External validity
   Construct Validity
       The adequacy of the operational definition of variables
            The operational definition of a variable reflects the true theoretical meaning of
             the variable
            A measure of social anxiety has construct validity if it measures the social anxiety
             construct and not some other variable such as dominance
       Variables that are abstract constructs usually can be measured and manipulated
        in a variety of ways but do not have a single perfect operational definition


    Evaluating Research: Three Validities (Cont’d)
   Internal Validity
       The ability to draw conclusions about causal relationships from our data
            A study has high internal validity when strong inferences can be made that one
             variable caused changes in the other variable
       Strong causal inferences can be made more easily when the experimental
        method is used
   External Validity
       The extent to which the results can be generalized to other populations and
        settings
            Whether the results can be replicated with other operational definitions of the
             variables, with different participants, or in other settings
       Artificiality of laboratory experiments is an issue of external validity
       Field experiments represent one way that researchers try to increase the external
        validity of their experiments
       The goal of high internal validity may sometimes conflict with the goal of
        external validity
                           Measurement Error
   Measurement error
       Any deviation from the “true value”
   Systematic Error
       Caused by factors that systematically affect measurement of the variable across
        samples
       Tends to be consistently either positive or negative
       Referred to as “bias”
       Can be controlled using strategies such as frequent calibration and
        randomization
   Random Error
       Caused by factors that randomly affect measurement of the variable across
        samples
       It does not have any consistent effects across the entire sample population
       Referred to as “noise”
       Difficult to control
       Leads to unreliability of measures
                             Additive Error Model
   Additive Error Model
       Appropriate to most human factors studies
            Attempts are made to determine the values of the variables of interest yet one is not
             able to do so because of various errors in the measurement
       Easy to understand and familiar to most human factors practitioners
       Every measurement is a sum of two components: the true score of the measure
        and random error

        $X^* = X + \delta$   (Eq. 1)
             $X^*$: the observed score
             $X$: the true score
             $\delta$: random error, with mean 0 and variance $\sigma_\delta^2$

        $\sigma_{X^*}^2 = \sigma_X^2 + \sigma_\delta^2$   (Eq. 2)
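
Eq. 2 follows from Eq. 1 under the usual assumption, not stated explicitly on the slide, that the random error is uncorrelated with the true score. A short derivation sketch under that assumption:

```latex
\begin{aligned}
\sigma_{X^*}^2 &= \operatorname{Var}(X + \delta) \\
               &= \operatorname{Var}(X) + \operatorname{Var}(\delta)
                  + 2\operatorname{Cov}(X, \delta) \\
               &= \sigma_X^2 + \sigma_\delta^2
                  \qquad \text{since } \operatorname{Cov}(X, \delta) = 0 .
\end{aligned}
```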




                           Reliability of Measures
   Reliability
       The consistency of measures obtained by individuals when reexamined with the
        same criterion measure on different occasions or with different sets of equivalent
        tasks (Salvendy & Carayon, 1997)
            A reliable measure of intelligence should yield the same result each time you
             administer the intelligence test to the same person. The test would be unreliable if it
             measured the same person as average one week, low the next, and bright the next
       Mathematically, the reliability of a measure is defined as the proportion of the
        variability in the measure attributable to the true score
        $r_{XX} = \sigma_X^2 / \sigma_{X^*}^2 = \sigma_X^2 / (\sigma_X^2 + \sigma_\delta^2)$   (Eq. 3)
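
To make Eq. 3 concrete, the sketch below computes reliability from assumed variances and then checks the value with a small simulation of the additive error model. The standard deviations (15 for true scores, 5 for random error), the mean of 100, and the seed are assumptions chosen only for illustration.

```python
import random
import statistics


def reliability(true_variance, error_variance):
    """Eq. 3: share of observed-score variance that comes from the true score."""
    return true_variance / (true_variance + error_variance)


# Closed-form example with assumed standard deviations of 15 and 5.
r_theory = reliability(15 ** 2, 5 ** 2)  # 225 / (225 + 25)

# Simulation of Eq. 1: observed score = true score + random error.
rng = random.Random(0)  # fixed seed so the run is repeatable
true_scores = [rng.gauss(100, 15) for _ in range(20_000)]
observed = [x + rng.gauss(0, 5) for x in true_scores]
r_simulated = statistics.variance(true_scores) / statistics.variance(observed)

print(f"Theoretical reliability: {r_theory:.2f}")
print(f"Simulated reliability:   {r_simulated:.2f}")
```

In practice the true scores are not observable, so reliability has to be estimated indirectly; the simulation only shows that the variance ratio behaves as Eq. 3 predicts.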




[Figure: score distributions with axis ticks at 97 and 103; annotation: "Less random error"]
                  Reliability of Measures (Cont’d)
   The Importance of Reliability
       Researchers cannot use unreliable measures to systematically study variables or
        the relationships among variables
       Trying to study human behavior using unreliable measures is a waste of time
        because the results will be unstable and unable to be replicated
   How to Improve Reliability
       Use careful measurement procedures
            Carefully training observers to collect data or record behavior
            Paying close attention to the way questions are phrased




                                   Assess Reliability
   Use Correlation Coefficient
       Correlation coefficient indicates the strength and direction of a linear
        relationship between two random variables
       Pearson product-moment correlation coefficient
             If we have a series of n measurements of X and Y , xi and yi (i = 1, 2, ..., n), then the
             sample correlation of X and Y, rX,Y , can be estimated using the Pearson product-
             moment correlation coefficient. It is the best estimate of rX,Y if X and Y are both
             normally distributed.

             $r_{X,Y} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{(n - 1) S_X S_Y}$   (Eq. 4)


     To assess the reliability of a measure, we need to obtain at least two scores on the
     measure from multiple individuals
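
Eq. 4 can be computed directly. The sketch below is a plain-Python illustration, assuming that S_X and S_Y are the sample standard deviations (with n - 1 in the denominator); the function name `pearson_r` and the scores are hypothetical.

```python
import math


def pearson_r(x, y):
    """Sample correlation per Eq. 4: cross-products over (n - 1) * S_X * S_Y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample standard deviations (n - 1 in the denominator).
    s_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
    cross = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return cross / ((n - 1) * s_x * s_y)


# Hypothetical scores for five people measured on two occasions.
first_test = [98, 105, 110, 92, 120]
second_test = [100, 103, 112, 95, 118]
r = pearson_r(first_test, second_test)
print(f"r = {r:.3f}")
```

A value close to 1 for two sets of scores from the same people would indicate a highly consistent measure.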




               Types of Reliability Estimates
   Test-Retest Reliability
   Alternative-Form Reliability
   Inter-Rater Reliability
   Internal Consistency Reliability




                             Test-Retest Reliability
   Procedure
       Measure the same individuals at two points in time and then calculate the
        correlation coefficient between the first and second test scores
            To test the reliability of an intelligence test, the test is given to a group of people on
             one day and again a week later
   Advantage
       The procedure is simple and straightforward
   Disadvantage
       Carry-over effects due to memory and/or practice may
        inflate the reliability estimate




                     Alternative-Form Reliability
   Procedure
       Administer two parallel forms of a test to the same group of individuals
       Two forms of a measuring instrument are considered parallel if an object's true
        score is the same for both forms and if both forms produce equal means and
        equal variances
   Advantage
       Helps to alleviate carry-over effects
   Disadvantage
       Difficult to come up with two parallel forms, especially with personality
        measures




                                Inter-Rater Reliability
   Assess Reliability of Rating System
       The extent to which two or more individuals (coders or raters) agree in their
        observations
            Two usability experts are asked to give a rating on the usability of a website
             according to a sliding rating scale (1 being the worst, 5 being the best). If one expert
             gives “1” to the usability of the website, whereas the other gives “5”, then the
             inter-rater reliability of the rating scale would be quite low
       Depends on the ability of the raters to be consistent
            Training and education can help enhance inter-rater reliability
   Cohen’s Kappa ($\kappa$)
        $\kappa = \frac{P_O - P_C}{1 - P_C}$   (Eq. 5)
             $P_O$: observed proportion of agreement
             $P_C$: proportion of agreement predicted by chance
        $P_C = \frac{1}{n^2} \sum_i pm_i$   (Eq. 6)
             $pm_i$: product of the $i$th row and column marginals
             $n$: total number of rated items

Raters A and B are asked to evaluate the usability of 100 websites using a
three-point Likert scale (1 being the worst, 3 being the best)

                         Ratings of B
                         1     2     3     Row margin
   Ratings of A    1    20     5    10        35
                   2    10    30    10        50
                   3     2     3    10        15
   Column margin        32    38    30       100

   $P_O = \frac{20 + 30 + 10}{100} = 0.60$
   $P_C = \frac{1}{100^2} (35 \times 32 + 50 \times 38 + 15 \times 30) = 0.347$
   $\kappa = \frac{0.60 - 0.347}{1 - 0.347} \approx 0.39$
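
The worked example can be reproduced in code. The sketch below applies Eq. 5 and Eq. 6 to the agreement table above; the function name `cohens_kappa` and the list-of-rows layout are assumptions made for this illustration.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table (rows: rater A, columns: rater B)."""
    n = sum(sum(row) for row in table)
    row_margins = [sum(row) for row in table]
    col_margins = [sum(col) for col in zip(*table)]
    # Observed agreement: share of cases on the diagonal.
    p_observed = sum(table[i][i] for i in range(len(table))) / n
    # Eq. 6: chance agreement from products of matching row and column marginals.
    p_chance = sum(r * c for r, c in zip(row_margins, col_margins)) / n ** 2
    # Eq. 5: agreement beyond chance, scaled by the maximum possible.
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return p_observed, p_chance, kappa


# Ratings of 100 websites from the example above.
ratings = [
    [20, 5, 10],
    [10, 30, 10],
    [2, 3, 10],
]
p_o, p_c, kappa = cohens_kappa(ratings)
print(f"P_O = {p_o:.2f}, P_C = {p_c:.3f}, kappa = {kappa:.2f}")
# prints: P_O = 0.60, P_C = 0.347, kappa = 0.39
```

Although the raters agree on 60% of the websites, about 35% agreement would be expected by chance alone, which is why kappa is much lower than the raw agreement.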
          Internal Consistency of Multi-Item Tests
   Multi-Item Tests
       Many psychological measures are made up of a number of different questions
        (items)
            An intelligence test may have 100 items, a satisfaction questionnaire may have 10
             items
   Internal Consistency Reliability
       The extent to which the items of a measuring instrument correlate with one
        another
            All items measure the same variable, so they should yield consistent results
       Split-half reliability
       Cronbach’s alpha




                             Split-Half Reliability
   Procedure
       The questions in the measuring instrument are divided in half, creating two
        pseudo-parallel half-tests a and b
       Each of the two half-tests is scored on a number of individuals
       Calculate the correlation between the total scores of a and b, rab
       The half-test reliability is adjusted to estimate the overall reliability of the whole
        instrument, rAB , using the Spearman-Brown formula
        $r_{AB} = \frac{2 r_{ab}}{1 + r_{ab}}$   (Eq. 7)

        If $r_{ab} = 0.6$, then $r_{AB} = 2 \times 0.6 / (1 + 0.6) = 0.75$
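
The Spearman-Brown adjustment in Eq. 7 is a one-line function. The sketch below, with the hypothetical name `spearman_brown`, reproduces the example value and shows how the adjustment behaves for a few other half-test correlations.

```python
def spearman_brown(r_half):
    """Eq. 7: step up a half-test correlation to a full-test reliability estimate."""
    return 2 * r_half / (1 + r_half)


r_full = spearman_brown(0.6)
print(f"r_AB = {r_full:.2f}")  # prints: r_AB = 0.75

# The adjusted value is never lower than the half-test correlation (for 0 <= r <= 1).
for r_half in (0.2, 0.4, 0.6, 0.8):
    print(f"r_ab = {r_half:.1f} -> r_AB = {spearman_brown(r_half):.2f}")
```

The adjustment is needed because each half-test has only half as many items as the full instrument, and shorter tests tend to be less reliable.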




                                Cronbach’s Alpha(α)
   Most popular measure of internal consistency reliability
   Interpreted as the mean of all possible split-half coefficients
    $r_{XX} = \left( \frac{k}{k - 1} \right) \left( 1 - \frac{\sum_{i=1}^{k} S_i^2}{S_X^2} \right)$   (Eq. 8)

     $k$: the number of items
     $S_i^2$: sample variance for item $i$
     $S_X^2$: sample variance of the total test scores

    A Cronbach's α of 0.7 or higher is a common rule-of-thumb threshold for acceptable reliability

    Calculate Cronbach’s alpha from JMP

    Choose Analyze --> Multivariate and specify your continuous columns. From the
    Multivariate pull-down menu select Item Reliability --> Cronbach's Alpha
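
For readers without JMP, Eq. 8 can also be computed by hand. The sketch below is a minimal Python illustration, not a replacement for the JMP procedure: the function name `cronbach_alpha`, the column-per-item layout, and the response data are all hypothetical.

```python
import statistics


def cronbach_alpha(items):
    """Eq. 8 for a list of item-score columns (one inner list per item)."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents.
    item_variances = [statistics.variance(scores) for scores in items]
    # Total test score per respondent: sum across items.
    totals = [sum(scores) for scores in zip(*items)]
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 3 items answered by 5 respondents on a 1-5 scale.
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 5],
]
alpha = cronbach_alpha(items)
print(f"Cronbach's alpha = {alpha:.2f}")
```

When all items rank respondents in the same way, the variance of the total score is large relative to the sum of the item variances and alpha approaches 1; when items are unrelated, alpha approaches 0.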

              Sources of Psychological Tests
   It is usually wise to use existing measures of psychological
    characteristics rather than develop your own
   Existing measures have reliability and validity data to help you
    decide which measure to use
   You can compare your findings with prior research that uses the
    measures
   You should always report the reliability of any psychological
    measure used in your study even if it is an existing one!




         Sources of Psychological Tests (Cont’d)
   Mental Measurements Yearbook
       Published by the Buros Institute of Mental Measurements (BIMM)
       A database containing the most recent descriptive information and critical
        reviews of new and revised tests from the Buros Institute's 9th, 10th, 11th, 12th,
        13th, 14th, and 15th Yearbooks
       Covers more than 2,200 commercially-available tests in categories such as
        personality, developmental, behavioral assessment, neuropsychological,
        achievement, intelligence and aptitude, educational, speech & hearing, and
        sensory motor
    The good news is that you can access it online for free through the WSU library!
    >> WSU library online >> Databases >> Mental Measurements Yearbook



