Evaluating User Interfaces

					Controlled Experiments
    Several evaluation methods
           covered in LOG 350 …
•   Surveys
•   Heuristic evaluation
•   Usability tests
•   Experiments
•   Etc.
             Experiments
• A fundamental part of the scientific
  method
• Make it possible to find causal relationships
  between conditions and their effects
• In HCI, make it possible to determine whether
  an interface A is faster, causes fewer
  errors, etc. than an interface B
              Experiments
• We vary (manipulate) at least one variable
  (example: the interface to use). This is the
  independent variable. Each of its
  values corresponds to a condition.
• We measure at least one variable (examples:
  time, number of errors, subjective
  satisfaction). This is the dependent variable.
• We analyze the results to see whether there are
  significant differences.
           Example of an Experiment
• “Expanding targets”

  Reference: M. McGuffin, R. Balakrishnan (2002). Acquisition
  of Expanding Targets. Proceedings of ACM Conference on
  Human Factors in Computing Systems (CHI) 2002, pages 57-64,
  http://doi.acm.org/10.1145/503376.503388
      Example: Mac OS X

• Does this really make acquisition
  easier?
Additional motivation
• Furnas, Generalized fisheye views, CHI 1986
• Mackinlay, Robertson, Card, The Perspective Wall, CHI 1991
• Bederson, Fisheye Menus, UIST 2000
Fitts’ Law
[Figure: a cursor at distance A from a target of width W]
[Figure: Target 1 and Target 2. Same ID → Same Difficulty]
[Figure: Target 1 and Target 2. Smaller ID → Easier]
[Figure: Target 1 and Target 2. Larger ID → Harder]
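The ID comparisons above can be reproduced with a short Python sketch. It assumes the index of difficulty log2(A/W + 1), which is the form that appears in the curve labels later in this deck; the function name and the sample distances and widths are illustrative only.

```python
import math

def index_of_difficulty(distance, width):
    """Fitts' index of difficulty in bits: log2(A/W + 1)."""
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return math.log2(distance / width + 1)

# Doubling both A and W leaves the ratio A/W, and hence the ID, unchanged.
same_a = index_of_difficulty(256, 32)
same_b = index_of_difficulty(512, 64)

# A nearer target of the same width has a smaller ID (easier).
easier = index_of_difficulty(128, 32)

# A farther target of the same width has a larger ID (harder).
harder = index_of_difficulty(512, 32)

print(f"same: {same_a:.2f} vs {same_b:.2f} bits")
print(f"easier: {easier:.2f} bits, harder: {harder:.2f} bits")
```

Because only the ratio A/W matters, a small nearby target and a large distant target can be equally difficult to acquire.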
[Figure: speed versus distance for a movement toward a target of width W, showing an open-loop phase and a closed-loop phase, with undershoot and overshoot]
Expanding Targets
Basic Idea:
• Big targets can be acquired faster, but take
  up more screen space
• So: keep targets small until user heads
  toward them
[Figure: buttons labeled “Click Me !”, “Okay”, and “Cancel”]
Experimental Setup
[Figure: a target of width W at distance A from the start position]
 Expansion:
 • How ?      Animated expansion or fade-in expansion
 • When ?     P = 0.25, P = 0.5, or P = 0.75
Pilot Study
7 conditions:
• No expansion (to establish a, b values)
• Expanding targets
  – Either animated growth or fade-in
  – P is one of 0.25, 0.5, 0.75


(Expansion was always by a factor of 2)
Pilot Study
        7 conditions
        x 16 (A,W) values
        x 5 repetitions
        x 2 blocks
        x 3 participants
        = 3360 trials
 Pilot Study: Results
[Plots: time (seconds) versus ID (index of difficulty)]
• Reference curves labeled $a + b \log_2\left(\frac{A}{W} + 1\right)$
  and $a + b \log_2\left(\frac{1}{2}\frac{A}{W} + 1\right)$
• Separate plots for P = 0.25, P = 0.5, and P = 0.75
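The two curve labels on these plots are Fitts' law evaluated at the original width W and at the doubled width 2W. As a rough sketch of how the two predictions compare, the Python below evaluates both; the constants a and b are hypothetical placeholders, not values from the study, and the function names are illustrative.

```python
import math

def movement_time(a, b, distance, width):
    """Fitts' law prediction in seconds: a + b * log2(A/W + 1)."""
    return a + b * math.log2(distance / width + 1)

# Hypothetical regression constants, chosen only for illustration.
A_INTERCEPT = 0.2      # seconds
B_SLOPE = 0.15         # seconds per bit
EXPANSION_FACTOR = 2   # the slides state expansion was always by a factor of 2

def predicted_times(distance, width):
    """Return (time for width W, time for the expanded width 2W)."""
    unexpanded = movement_time(A_INTERCEPT, B_SLOPE, distance, width)
    expanded = movement_time(A_INTERCEPT, B_SLOPE, distance,
                             EXPANSION_FACTOR * width)
    return unexpanded, expanded

unexpanded, expanded = predicted_times(distance=512, width=16)
print(f"width W: {unexpanded:.3f} s, width 2W: {expanded:.3f} s")
```

Measured times for expanding targets can then be compared against these two reference curves.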
Implications
• Pilot Study suggests the advantage of
  expansion doesn’t depend on P
• So, set P = 0.9 and perform a more
  rigorous study
Full Study
2 conditions:
• No expansion (to establish a, b values)
• Expanding targets, with
  – Animated growth
  – P = 0.9
  – Expansion factor of 2
Full Study
         2 conditions
         x 13 (A,W) values
         x 5 repetitions
         x 5 blocks
         x 12 participants
         = 7800 trials
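The trial totals for the pilot and full studies are simple products of the design factors. A small Python check, using only the numbers given on the slides (the dictionary layout and names are illustrative):

```python
from math import prod

# Design factors as listed on the slides.
pilot_design = {
    "conditions": 7,
    "(A,W) values": 16,
    "repetitions": 5,
    "blocks": 2,
    "participants": 3,
}
full_design = {
    "conditions": 2,
    "(A,W) values": 13,
    "repetitions": 5,
    "blocks": 5,
    "participants": 12,
}

def total_trials(design):
    """Multiply all design factors to get the total number of trials."""
    return prod(design.values())

def trials_per_participant(design):
    """Total trials divided evenly among the participants."""
    return total_trials(design) // design["participants"]

pilot_total = total_trials(pilot_design)
full_total = total_trials(full_design)
print(pilot_total, trials_per_participant(pilot_design))
print(full_total, trials_per_participant(full_design))
```

The per-participant figure is a useful sanity check on session length when choosing the number of blocks and repetitions.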
  Results
[Plot: time (seconds) versus A, W values]
[Plots: time (seconds) versus ID (index of difficulty); the last plot is labeled P = 0.9]
Implications
• For a single-target selection task,
  – Expansion yields a significant advantage,
    even when P = 0.9


• What about multiple targets?
    (End of the slides on
“expanding targets”)
 Variables in an Experiment
• Independent variables: the ones we manipulate (also called
  factors); they correspond to the conditions (or treatments or levels)
• Dependent variables: the ones we measure
• Control variables: the ones we control, i.e. the ones we try to keep
  constant across conditions
• Random variables: the ones we allow to vary, as randomly as
  possible.
    – Examples: age, sex, socio-economic profile, etc.
    – How do we ensure random variation across conditions?
         • Random assignment of participants to conditions
    – Disadvantage: these variables introduce more variability into our
      results
    – Advantage: our results are more general; our conclusions apply
      to more situations
• Confounding variables: the ones that vary systematically across
  conditions. We want to eliminate these variables!
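Random assignment of participants to conditions can be sketched in a few lines of Python. This is one simple approach (shuffle, then deal round-robin so group sizes stay balanced); the participant labels, condition names, and seed are hypothetical.

```python
import random

def assign_randomly(participants, conditions, seed=None):
    """Shuffle participants, then deal them round-robin into conditions
    so that group sizes differ by at most one."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, participant in enumerate(shuffled):
        condition = conditions[index % len(conditions)]
        groups[condition].append(participant)
    return groups

# Hypothetical participant labels and conditions.
participants = [f"P{n:02d}" for n in range(1, 13)]
groups = assign_randomly(participants, ["interface A", "interface B"], seed=42)
for condition, members in groups.items():
    print(condition, members)
```

Because assignment does not depend on age, experience, or any other participant trait, those traits vary randomly across conditions instead of systematically.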
           Linear Regression
[Figure: plot of Y versus X]
• Output: slope, intercept,
  and the Pearson correlation coefficient r,
  which lies in the interval [-1, 1]
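As a minimal sketch of what a linear regression returns, the Python below computes the least-squares slope, the intercept, and Pearson's r from paired data. The (ID, time) data points are made up for illustration and are not results from any study.

```python
import math

def linear_regression(xs, ys):
    """Least-squares fit of y = intercept + slope * x.
    Returns (slope, intercept, r), where r is Pearson's correlation."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with >= 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each take more than one value")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Made-up (ID in bits, time in seconds) points, for illustration only.
ids = [1.0, 2.0, 3.0, 4.0, 5.0]
times = [0.35, 0.50, 0.66, 0.79, 0.95]
slope, intercept, r = linear_regression(ids, times)
print(f"slope={slope:.3f} s/bit, intercept={intercept:.3f} s, r={r:.4f}")
```

In a Fitts' law analysis, the intercept and slope of time against ID play the roles of the constants a and b.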
             A causal link …
• In a well-controlled experiment with no
  confounding variables, if we find that the
  dependent variables change when we change
  the independent variables, we can conclude
  that there is a causal link: the change in the
  independent variables causes the change in
  the dependent variables. In this case, a
  correlation does imply a causal link.
  … versus a simple correlation
• On the other hand, if we merely observe a
  correlation between two variables X and Y, without
  controlling the conditions, this does not imply
  a causal link between them. It could be that
  – X has an effect on Y
  – Y has an effect on X
  – A third variable, Z, has an effect on both X and Y
• This is why we try to eliminate
  confounding variables in experiments
                   Example
• Researchers wanted to know which variable
  could predict the chances that a motorcycle
  rider will have a motorcycle accident. They
  looked for correlations between the number
  of accidents and age, socio-economic
  level, etc.
• They found that the strongest correlation was
  with the rider’s number of tattoos.
• Obviously, tattoos do not cause
  accidents, nor the reverse.
     Examples of Questions to Answer
            in an Experiment
• Of 3 interfaces, A, B, C, which enables fastest
  performance at a given task?
• Does Prozac have an effect on performance at tying
  shoelaces?
• How does the frequency of advertisements on television
  affect voting behaviour?
• Can casting a spell on a pair of dice affect what
  numbers appear on them?
           Elements of an Experiment
• Population
   – Set of all possible subjects / observations
• Sample
   – Subset of the population chosen for study; a set of subjects /
     observations
• Subjects
   – People/users under study. The more politically correct term
     within HCI is “participants”.
• Observations / Dependent variable(s)
   – Individual data points that are measured/collected/recorded
       • E.g. time to complete a task, errors, etc.
• Condition / Treatment / Independent variable(s)
   – Something done to the samples that distinguishes them
     (e.g. giving a drug vs placebo, or using interface A vs B)
   – Goal of experiment is often to determine whether the conditions
     have an effect on observations, and what the effect is
   Tasks to Design and Run an Experiment
• Design
   –   Choose independent variables
   –   Choose dependent variables
   –   Develop hypothesis
   –   Choose design paradigm
   –   Choose control procedures
   –   Choose a sample size
• Pilot experiment
   – Often more exploratory, varying a greater number of variables to get
     a “feel” for where the effect(s) might be
• Run experiment
   – Focuses in on the suspected effect; tries to gather lots of data under
     key or optimal conditions to result in a strong conclusion
• Analyze data
   – Using statistical tests such as ANOVA
• Interpret results
                    Hypothesis

• Statement, to be tested, of relationship between
  independent and dependent variables
• The null hypothesis is that the independent variables
  have no effect on the dependent variables
      Experimental Design Paradigms

• Between subjects or within subjects manipulation
  (between participants vs. across all participants)
• Example: designs with one independent variable
   – Between subjects design
       • One independent variable with 2 or more levels
       • Subjects randomly assigned to groups
       • Each subject tested under only 1 condition
   – Within subject design
       • One independent variable with 2 or more levels
       • Each subject tested under all conditions
       • Order of conditions randomized or counterbalanced (why?)
                    What To Control
• Subject characteristics
   – Gender, handedness, etc.
   – Ability
   – Experience
• Task variables
   – Instructions
   – Materials used
• Environmental variables
   – Setting
   – Noise, light, etc.
• Order effects
   – Practice
   – Fatigue
      How to Control for Order Effects

• Counterbalancing
   – Factorial Design
   – Latin Square
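Both counterbalancing schemes can be sketched in Python. This sketch assumes that “Factorial Design” here means using every possible ordering of the conditions (n! orders), and it builds a simple cyclic Latin square; the function names and condition labels are illustrative.

```python
from itertools import permutations

def latin_square(conditions):
    """Cyclic Latin square: row i is the condition list rotated left by i,
    so each condition appears once in every row and every column."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)]
            for row in range(n)]

def all_orders(conditions):
    """Every possible ordering (n! of them), for complete counterbalancing."""
    return [list(order) for order in permutations(conditions)]

def order_for(participant_index, orders):
    """Cycle participants through the available orders."""
    return orders[participant_index % len(orders)]

conditions = ["A", "B", "C"]
square = latin_square(conditions)
complete = all_orders(conditions)

for index in range(6):
    print(f"participant {index}: {order_for(index, square)}")
```

A Latin square needs only n orders instead of n!, which matters once there are more than three or four conditions. The cyclic square shown here balances position (each condition appears once in each serial position) but not which condition immediately precedes which.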
  Data Analysis and Hypothesis Testing

• Describe data
   – Descriptive statistics (means, medians, standard deviations)
   – Graphs and tables
• Perform statistical analysis of results
   – Are results due to chance? (That is, with what probability)
                         ANOVA

• “Analysis of Variance”
• A statistical test that compares the means of
  multiple samples, and determines the probability that
  differences at least as large as those observed would
  arise by chance alone
• In other words, it gives the probability of such results
  if the null hypothesis were true (the p-value), not the
  probability that the null hypothesis is correct
• If this probability is below 0.05 (i.e. 5 %), then we reject
  the null hypothesis, and we say that we have a
  (statistically) significant result
   – Why 0.05 ? Dangers of using this value ?
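To make the test concrete, here is a hedged Python sketch of a one-way ANOVA. It computes the F statistic (between-group mean square divided by within-group mean square) directly. Since the Python standard library has no F distribution, the p-value is estimated with a permutation test instead of the classical F-table lookup, so it is an approximation. The task times for the three interfaces are made up.

```python
import random

def f_statistic(groups):
    """One-way ANOVA F: between-group mean square / within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def permutation_p_value(groups, n_permutations=2000, seed=0):
    """Estimate how often shuffled group labels give an F at least as
    large as the observed one (an approximation to the ANOVA p-value)."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            at_least_as_large += 1
    return (at_least_as_large + 1) / (n_permutations + 1)

# Made-up task times (seconds) for three interfaces, illustration only.
times = [
    [1.0, 1.2, 1.1, 0.9, 1.0],   # interface A
    [1.4, 1.5, 1.3, 1.6, 1.5],   # interface B
    [1.0, 1.1, 1.2, 1.0, 0.9],   # interface C
]
f_value = f_statistic(times)
p_value = permutation_p_value(times)
print(f"F = {f_value:.2f}, estimated p = {p_value:.4f}")
```

A significant result only says that at least one group mean differs; it does not say which one, and it says nothing about how large or how useful the difference is.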
 Techniques for Making an Experiment More
  “Powerful” (i.e. able to detect effects)
• Reduce noise (i.e. reduce variance)
   – Increase sample size
   – Control for confounding variables
       • E.g. psychologists often use in-bred rats for experiments !
• Increase the magnitude of the effect
   – E.g. give a larger dosage of the drug
Uses of Controlled Experiments within HCI

• Evaluate or compare existing systems/features/interfaces
• Discover and test useful scientific principles
   – Examples ?
• Establish benchmarks/standards/guidelines
   – Examples ?
    Example of an Experiment Plan …

• For each participant …
   – For each major condition ... *
      • We run warm-up trials
      • We have a certain number of blocks, separated by
        breaks
      • For each block …
          • We repeat each minor condition a certain
            number of times *


• * How should these be ordered?
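One possible answer to the ordering question, sketched in Python: rotate the order of the major conditions from one participant to the next (a cyclic Latin square row) and shuffle the minor conditions within each block. This is an illustrative choice, not the plan prescribed by the slides; the condition labels, warm-up count, and seed are hypothetical, while the block and repetition counts mirror the full study described earlier.

```python
import random

def build_schedule(major_conditions, minor_conditions, n_blocks,
                   n_repetitions, n_warmup, participant_index, seed=None):
    """Return a list of (major, phase, minor) trials for one participant."""
    rng = random.Random(seed)
    majors = list(major_conditions)
    # Counterbalance major conditions: rotate the order per participant.
    shift = participant_index % len(majors)
    major_order = majors[shift:] + majors[:shift]

    schedule = []
    for major in major_order:
        # Warm-up trials are not analyzed, so pick minor conditions at random.
        for _ in range(n_warmup):
            schedule.append((major, "warm-up", rng.choice(minor_conditions)))
        for block in range(1, n_blocks + 1):
            # Each minor condition appears n_repetitions times, in random order.
            trials = list(minor_conditions) * n_repetitions
            rng.shuffle(trials)
            for minor in trials:
                schedule.append((major, f"block {block}", minor))
    return schedule

# Hypothetical labels; 13 (A,W) pairs, 5 blocks, 5 repetitions as in the full study.
majors = ["no expansion", "expanding"]
minors = [f"AW{n}" for n in range(1, 14)]
schedule = build_schedule(majors, minors, n_blocks=5, n_repetitions=5,
                          n_warmup=3, participant_index=1, seed=7)
measured = [trial for trial in schedule if trial[1] != "warm-up"]
print(len(schedule), len(measured), schedule[0][0])
```

Rotating the major conditions spreads practice and fatigue effects evenly across them, and shuffling within blocks keeps participants from anticipating the next target.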

				