Comparative Usability Evaluation

					               CHI99 Panel
Comparative Evaluation of Usability Tests


                Presentation by
                  Rolf Molich
                 DialogDesign
                   Denmark


             molich@dialogdesign.dk


                   Take a web-site.
        Take nine professional usability teams.
       Let each team usability test the web-site.


                Are the results similar?
What Have We Done?

   Nine teams have usability tested the
    same web-site
    – Seven professional teams
    – Two student teams


   Test web-site: www.hotmail.com
    Free e-mail service
              Panel Format

   Introduction (Rolf Molich)
   Five-minute statements from five participating teams
   The Customer’s point of view (Meeta Arcuri, Hotmail)
   Conclusions (Rolf Molich)
   Discussion - 30 minutes
Purposes of Comparison

   Survey the state of the art within
    professional usability testing of
    web-sites.
   Investigate the reproducibility of
    usability test results
Non-Purposes of Comparison

          To pick a winner
          To make a profit
Basis for Usability Test

   Web-site address: www.hotmail.com
   Client scenario
   Access to client through intermediary
   Three weeks to carry out test
What Each Team Did

   Run standard usability test
   Anonymize the usability test report
   Send the report to Rolf Molich
    Problems Found

   Total number of different
    usability problems found    300

   Found by seven teams          1
            six teams            1
            five teams           4
            four teams           4
            three teams         15
            two teams           49
            one team           226 (75%)
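
The tally above can be checked with a few lines of arithmetic. The sketch below is an illustration only and not part of the original study; the counts are copied from the slide, and the variable names are hypothetical.

```python
# Number of distinct usability problems, keyed by how many teams reported each one.
# Counts are copied from the "Problems Found" slide; the names are illustrative.
problems_by_team_count = {7: 1, 6: 1, 5: 4, 4: 4, 3: 15, 2: 49, 1: 226}

total = sum(problems_by_team_count.values())
single_team_share = problems_by_team_count[1] / total
multi_team = total - problems_by_team_count[1]

print(f"Total distinct problems: {total}")
print(f"Found by exactly one team: {single_team_share:.0%}")
print(f"Found by two or more teams: {multi_team}")
```

Only 74 of the 300 problems were reported by more than one team, which is the overlap the panel goes on to discuss.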
Comparative Usability Evaluation 2

   Barbara Karyukina, SGI (USA)
   Klaus Kaasgaard & Ann D. Thomsen, KMD (Denmark)
   Lars Schmidt and others, Networkers (Denmark)
   Meghan Ede and others, Sun Microsystems, Inc. (USA)
   Wilma van Oel, P5 (The Netherlands)
   Meeta Arcuri, Hotmail, Microsoft Corp. (USA) (Customer)
   Rolf Molich, DialogDesign (Denmark) (Coordinator)
Comparative Usability Evaluation 2

   Joseph Seeley, NovaNET Learning Inc. (USA)
   Kent Norman, University of Maryland (USA)
   Torben Norgaard Rasmussen and others,
    Technical University of Denmark
   Marji Schumann and others,
    Southern Polytechnic State University (USA)


               Presentation by
              Barbara Karyukina
               SGI, Wisconsin
                    USA


              barbarak@sgi.com
    Challenges:

Twenty functional areas + user preference questions
Possible Solutions:

   Two usability tests
   Surveys
   User notes
   Focus groups
           Results:

26 tasks + 10 interview questions

100 findings


               Presentation by
               Klaus Kaasgaard
                Kommunedata
                   Denmark


                kka@kmd.dk
Slides currently not available


                 Presentation by
                  Lars Schmidt
           Framtidsfabriken Networkers
                    Denmark


                ls@networkers.dk
         Team E

Framtidsfabriken Networkers
      Testlab, Denmark
Key learnings CUE-2


   Setting up the test
    – Insist on dialog with customer
    – Secure complete understanding of user groups and user
      tasks
    – Narrow down test goals


   Writing the report
    – Use screendumps
    – State conclusions - skip the premises
    – Test the usability of the usability report
Improving Test Methodology


   Searching for usability and usefulness
    – Hook up with different methodologies (e.g. interviews)


   Focus on website context
    – Test against e.g. YahooMail
    – Test against software-based email clients


                Presentation by
                 Meghan Ede
               Sun Microsystems
                California, USA


             meghan.ede@sun.com
Hotmail Study Requests

   18 Specific Features
     e.g. Registration, Login, Compose...


   6 Questions
     e.g. "How do users currently do email?"


   24 Potential Study Areas
Usability Methods

   Expert Review
     6 Reviewers
     6 Questions



   Usability Study
     6 Participants (3 + 3)
     5 Tasks (with sub-tasks)
  Report Description

1. Executive Summary
   - 4 Main High-Level Themes
   - Brief Study Description

2. Debriefing Meeting Summary
   - 7 Areas (e.g. overall, navigation, power features, ...)

3. Findings
   - 31 Sections
   - Study Requests, Extra Areas, Bugs, Task Times, Study Q & A

4. Study Description

Total: 36 Pages - 150 Findings
Lessons Learned

   Importance of close contact
    with product team
   Consider including:
     severity ratings
     more specific recommendations
     screen shots
Discussion Issues

   How can we measure the
    usability of our reports?


   How to deal with the
    difference between number
    of problems found and
    number included in report?


                Presentation by
                 Wilma van Oel
                      P5
                The Netherlands


            w.vanoel@p5-adviseurs.nl
Wilma van Oel

P5
adviseurs voor
produkt-& kwaliteitsbeleid
quality & product
management consultants

Amsterdam, the Netherlands
Structure of Presentation

    1. Introduction

    2. Deviations in approach
     – Test design
     – Results and recommendations


    3. Lessons for the future
     – Change in approach?
     – Was it worth the effort?
    Introduction

• Company:
   P5 Consultants

• Personal background:
  psychologist
                Test design
   Subjects: n=11, pilot, ‘critical users’, 1-hour session
   Data collection: log software, video recording

                       Methods:
            lab evaluation + informal approach
                    Techniques:
                exploration, task execution,
           think aloud, interview, questionnaire

                       Tool: SUS
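
SUS here is presumably the System Usability Scale, a ten-item questionnaire answered on a 1 to 5 scale; the slide does not spell the name out, so that reading is an assumption. For readers unfamiliar with it, the standard scoring rule can be sketched as follows. The function name `sus_score` and the sample answers are illustrative and not taken from the team's report.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for one participant's ten answers (each 1-5)."""
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd-numbered items are positively worded: higher agreement is better.
            total += response - 1
        else:
            # Even-numbered items are negatively worded: lower agreement is better.
            total += 5 - response
    # The summed contributions range from 0 to 40; scale them to 0-100.
    return total * 2.5


# Hypothetical answers, not data from the study.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))
```

A single number per participant is easy to compare across sessions, but it says nothing about which specific problems a participant hit.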
A Test Session
Results and recommendations

                Results:
                'general'
                severity?

    Negative                 Positive
   n = median               n > mean


         Recommendations:
                 general
                not 'how'
Lessons for the future

   Change in approach?
    – Methods: add a usability inspection method
    – Procedure: extensive analysis, add session time
    – Results: less general, severity?


   Was it worth the effort?
    – Company: to get experience & benchmarking
    – Personally: to improve skills, knowledge


                Presentation by
                 Meeta Arcuri
             Microsoft Corporation
                California, USA


              meeta@hotmail.com
             CUE - 2
    The Customer’s Perspective



Meeta Arcuri
User Experience Manager
Microsoft Corp., San Jose, CA
Customer Summary of Findings


     New findings ~ 4%
     Validation of known issues ~ 67%
      – Previous finding from our lab tests
      – Finding from on-going inspections
     Remainder - beyond Hotmail Usability
      – Business reasons for not changing
      – Out of Hotmail’s control (partner sites)
      – Problems generic to the web
      Report Content:
    Positive Observations

   Quick and Dirty results
   Recommendations for problem fixes
   Participant quotes – get tone/intensity of
    feedback
   Exact number of participants who encountered each issue
   Background of Participants
   Environment (browser, speed of connection,
    etc.)
Additional Strengths of Reports


     Fresh perspectives
     Lots of data on non-US users
     Recommendations from participants
     Trend reporting
     Report of outdated material on site
      (some help files)
     Appreciate positive findings, comments
Report Content: Weaknesses


     Some recommendations not sensitive to
      web issues (performance, security)
     At least one finding irreproducible
      (not preserving fields in registration form)
     Frequency of issue reported was
      sometimes vague.
     Some descriptions terse, vague - had to
      decipher
How Hotmail Will Use Results


   Cross-validate new findings with Hotmail
    Customer Service reports
   Lots of good data to cite in planning meetings
   Some good recommendations given by labs
    and participants
            Conclusion


   Focused, iterative testing would give better
    results
   Wide array of user data very valuable
   Overall - good qualitative and quantitative data
    to help prioritize, schedule, and improve
    usability of Hotmail.


                Presentation by
                  Rolf Molich
                 DialogDesign
                   Denmark


             molich@dialogdesign.dk
Comparison of Tests

   Based only on test reports
   Liberal scoring
   Focus on major differences
   Two generally recognized textbooks:
    –   Dumas and Redish, "A Practical Guide to
        Usability Testing"
    –   Jeff Rubin, "Handbook of Usability Testing"
                        Resources

Team                          A    B    C     D    E    F    G    H    J

Person hours used for test  136  123   84  (16)  130   50  107   45  218
# Usability professionals     2    1    1     1    3    1    1    3    6
Number of tests               7    6    6    50    9    5   11    4    6
                 Usability Results

Team                    A    B    C    D    E    F    G    H    J

# Positive findings     0    8    4    7   24   25   14    4    6
# Problems             26  150   17   10   58   75   30   18   20
% Exclusive            42   71   24   10   57   51   33   56   60
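
One way to read the "% Exclusive" row is as the share of a team's reported problems that no other team reported. The slide does not define the term, so that reading is an assumption. Under it, the per-team exclusive counts should add up to roughly the 226 problems that the overall tally lists as found by one team only. A minimal Python sketch of that cross-check, with illustrative variable names:

```python
# Figures copied from the "Usability Results" slide.
teams = ["A", "B", "C", "D", "E", "F", "G", "H", "J"]
problems = [26, 150, 17, 10, 58, 75, 30, 18, 20]
pct_exclusive = [42, 71, 24, 10, 57, 51, 33, 56, 60]

# Assumption: "% Exclusive" = percentage of a team's problems found by that team only.
# The published percentages are rounded, so each count is an estimate.
exclusive_estimates = {
    team: n * pct / 100 for team, n, pct in zip(teams, problems, pct_exclusive)
}
estimated_total = sum(exclusive_estimates.values())

for team, estimate in exclusive_estimates.items():
    print(f"Team {team}: about {estimate:.1f} exclusive problems")
print(f"Estimated total: {estimated_total:.1f}")
```

The estimate comes to about 226, which agrees with the number of problems found by one team only, so the assumed reading is at least consistent with the slides.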
                  Usability Results

Team                            A    B    C    D    E    F    G    H    J

# Problems                     26  150   17   10   58   75   30   18   20
% Core problems (100% = 26)    38   73   35    8   58   54   50   27   31
Person hours used for test    136  123   84   NA  130   50  107   45  218
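
The "% Core problems" percentages can be converted back to approximate counts, since the slide states that 100% corresponds to 26 core problems. The sketch below does that conversion and also derives problems reported per person hour. That ratio is not on the slides and is offered only as one possible way to relate output to effort. Team D is left out of the ratio because its hours are listed as NA. Variable names are illustrative.

```python
CORE_TOTAL = 26  # from the slide: 100% = 26

teams = ["A", "B", "C", "D", "E", "F", "G", "H", "J"]
problems = [26, 150, 17, 10, 58, 75, 30, 18, 20]
pct_core = [38, 73, 35, 8, 58, 54, 50, 27, 31]
person_hours = [136, 123, 84, None, 130, 50, 107, 45, 218]  # None = NA on the slide

# Percentages are rounded on the slide, so round the back-calculated counts too.
core_found = {t: round(p * CORE_TOTAL / 100) for t, p in zip(teams, pct_core)}

# Illustrative ratio, not a figure from the study.
problems_per_hour = {
    t: n / h for t, n, h in zip(teams, problems, person_hours) if h is not None
}

for t in teams:
    rate = problems_per_hour.get(t)
    rate_text = f"{rate:.2f}" if rate is not None else "NA"
    print(f"Team {t}: ~{core_found[t]} core problems, {rate_text} problems/hour")
```

Every percentage maps back to a near-whole number of core problems, for example 19 of 26 for team B and 2 of 26 for team D.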
         Conclusion

   If Hotmail is typical, then the total
    number of usability problems for a
    typical web-site is huge,
    much larger than you can hope to find
    in one series of usability tests
   Usability testing techniques can be
    improved
   We need more awareness of the
    Usability of Usability work
Download Test Reports and Slides


 http://www.dialogdesign.dk/cue2.htm

				