Southern Methodist University
         CSE 8316
        Spring 2003
Temporal Relations and
Usability Specifications
• Previous chapter discussed low level
• Now focus on abstraction and
  relative timing of events
• Such issues as interruptibility and
  interleavability should be part of
  interaction design and not driven by
  constructional design.
• UAN can be used to specify:
  –   Sequence
  –   Iteration
  –   Optionality
  –   Repeating choice
  –   Order independence
  –   Interruptibility
  –   Interleavability
  –   Concurrency
  –   Waiting
 Sequencing and Grouping
• Sequence
  – Sequence: One task is performed in its entirety
    before the next task is begun
  – Represent in the UAN by grouping
    (horizontally or vertically) without any
    intervening operators
• Grouping
  – Tasks can be grouped together using various
    operators to form new tasks
  – Definition is similar to that for regular
    expressions
• Have only seen UAN describing
  articulatory actions -- primitive tasks
  performed by the user.
• In this form, describing an entire
  interaction design would be overly
  complex and difficult
• Introduce abstraction by allowing groups
  of tasks to be named.
• As with procedures, a reference to the
  name is equivalent to performing all the
  tasks described by that name
• To aid in reusability, allow task
  references to be parameterized
• Reusing tasks promotes logical
  decomposition, providing for a consistent
  system model
• Abstraction hides details, but also hides
  user feedback. This information can be
  listed at one or both levels.
• With task naming, can now perform top-
  down design.
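
As a rough illustration of task naming and parameterization, the Python sketch below treats each named task as a function that expands into a list of primitive actions. The task names, the `expand` helper, and the action strings are hypothetical and are not exact UAN.

```python
# Hypothetical illustration: named, parameterized tasks that expand into
# primitive user actions. The action strings are informal, not exact UAN.

def select_icon(icon):
    """Primitive actions for selecting an icon (illustrative only)."""
    return [f"move cursor to {icon}", "press mouse button", "release mouse button"]

def delete_file(icon):
    """A higher-level task defined by reference to another named task."""
    return select_icon(icon) + ["press Delete key"]

def expand(task, *args):
    """Referencing a task name is equivalent to performing all of its actions."""
    return task(*args)

actions = expand(delete_file, "report.txt")
```

Because `delete_file` reuses `select_icon`, a change to how selection works is made in one place, which is the consistency benefit of decomposition noted above.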
           Task Operators
• Choice
  – Simple choice is represented in UAN
    with the vertical bar, `|'.
  – Repeating choice is formed by adding
    the iterators `*' and `+'.
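
Choice and repeating choice behave much like the matching regular-expression operators, so one way to experiment with them is Python's `re` module. This is only a sketch: it assumes each task is abbreviated to a single letter, and the pattern names are made up for the example.

```python
import re

# Assumption: each task is abbreviated to one letter, so a sequence of
# completed tasks can be written as a string such as "ABBA".
SIMPLE_CHOICE = re.compile(r"A|B")       # A | B    : exactly one of the tasks
REPEAT_STAR = re.compile(r"(?:A|B)*")    # (A | B)* : zero or more choices
REPEAT_PLUS = re.compile(r"(?:A|B)+")    # (A | B)+ : one or more choices

def allowed(pattern, trace):
    """True if the whole trace is permitted by the pattern."""
    return pattern.fullmatch(trace) is not None
```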
         Task Operators
• Order Independence
  – Set of tasks that must be completed
    before continuing, but order of
    completion of the subtasks is not
    important
  – Represented by the `&'.
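
A minimal sketch of how an observed sequence could be checked against an order-independent group such as A & B & C. It assumes each task is performed exactly once; the function name is hypothetical.

```python
def order_independent_ok(required, trace):
    """True if `trace` performs every required task exactly once, in any order."""
    return sorted(trace) == sorted(required)

# A & B & C: any permutation of the three tasks is acceptable.
required = ["A", "B", "C"]
```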
         Task Operators
• Interruption
  – Interruption occurs when one task is
    suspended while another task is started
  – Since UAN describes what can happen,
    you cannot specify an interruption, but
    rather what can be interrupted
  – To specify that A can interrupt B use
    A --> B.
         Task Operators
• Uninterruptible Tasks
  – Assume all primitive actions are
    uninterruptible (e.g. pressing a mouse
    button)
  – Specify the uninterruptibility of higher-
    level tasks (e.g. modality) by enclosing
    in brackets, `<A>'.
         Task Operators
• Interleavability
  – If two tasks can interrupt each other,
    they are considered interleavable.
  – Assume that operator is transitive.
  – Represented with double arrow, A <--> B
          Task Operators
• Concurrency
  – If two tasks can be performed in parallel
    (e.g. two different users), then tasks are
    concurrent. Represented with `||'.
          Task Operators
• Intervals and Waiting
  – Can add explicit time intervals between
    two events.
  – Two forms:
    • If task B must be completed within n
      seconds of task A: `A (t<n) B'
    • If task B is to occur only n seconds after
      task A: `A (t>n) B'
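
The two interval forms can be read as simple checks on timestamps. The sketch below assumes times are recorded in seconds and takes the inequalities literally as strict; the function names are hypothetical.

```python
def within(t_a, t_b, n):
    """A (t<n) B: B happens less than n seconds after A."""
    return 0 <= t_b - t_a < n

def after_wait(t_a, t_b, n):
    """A (t>n) B: B happens more than n seconds after A."""
    return t_b - t_a > n
```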
    Other Representations
• Screen Pictures and Scenarios
  – UAN describes user actions, but does not
    describe the format/display of screens
  – Should supplement UAN with screen layouts
    and scenarios.
• State Transition Diagrams
  – Typical interface contains various states
  – To provide a global view of how states are
    related, add a state transition diagram to UAN
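
A state transition diagram can also be written down as a lookup table. The sketch below is only an illustration: the states and events describe a hypothetical calendar interface and do not come from these slides.

```python
# Hypothetical states and events for a small calendar interface.
TRANSITIONS = {
    ("viewing", "open_add_dialog"): "adding",
    ("adding", "confirm"): "viewing",
    ("adding", "cancel"): "viewing",
}

def next_state(state, event):
    """Return the state reached on `event`; stay in place if no transition exists."""
    return TRANSITIONS.get((state, event), state)

def run(start, events):
    """Follow a sequence of events from a starting state."""
    state = start
    for event in events:
        state = next_state(state, event)
    return state
```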
       Design Rationale
• Basic role of UAN is communication
• Important to provide reasons behind
  various decisions
• Gives motivation and goals and
  helps prevent later duplication of
  effort
Usability Specifications
• Quantitative, measurable goals for
  knowing when the interface is good
• Often overlooked, but provide
  insurance that multiple iterations are
• For this reason, should be
  established early
Usability Specification Table
• Convenient method for indicating
• Contains following information
  –   Usability Attribute
  –   Measuring Instrument
  –   Value to be Measured
  –   Current Level
  –   Worst Acceptable Level
  –   Planned Target Level
  –   Best Possible Level
  –   Observed Results
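
One possible way to hold a row of the usability specification table in code is a small record type. The class and field names below are assumptions; the sample values are taken from the learnability example table later in these notes.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UsabilitySpec:
    """One row of a usability specification table (field names are assumed)."""
    attribute: str
    measuring_instrument: str
    value_measured: str
    current_level: Optional[float]
    worst_acceptable: float
    planned_target: float
    best_possible: float
    observed: Optional[float] = None  # filled in once testing has been done

# Values in seconds, from the learnability example table in these notes.
learnability = UsabilitySpec(
    attribute="Learnability",
    measuring_instrument='"Add appointment" task per benchmark 5',
    value_measured="Seconds to add an appointment after one hour of use",
    current_level=15,
    worst_acceptable=15,
    planned_target=12,
    best_possible=8,
)
```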
       Usability Attribute
• Represents the usability
  characteristic being measured
• Must determine classes of intended
  users
• For each class determine realistic set
  of tasks
• Goal is to determine what user
  performance will be acceptable
       Usability Attributes
• Typical attributes include:
  – Initial Performance: User's performance during
    the first few uses.
  – Long-term Performance: User's performance
    after extended use of the product
  – Learnability: How quickly the user learns the
    system
  – Retainability: How quickly the knowledge
    of how to use the system dissipates
  – Advanced Feature Usage: Usability of
    sophisticated features
  – First Impression: Subjective user
    feelings at first glance
  – Long-term User Satisfaction: User's
    opinion after extended use
     Measuring Instrument
• Method to find a value for a usability
  attribute
• Quantitative, but may be objective or
  subjective
• Objective: based on user task
  performance
• Subjective: deal with user opinion
• Both types are needed to effectively
       Benchmark Tasks
• User is asked to perform a task using
  the interface
• Most common objective measure
• Task should be a specific, single
  interface feature
• Description should be clearly worded
  without describing how to do it
• Quantitative measure for subjective
• Creating a survey that provides useful
  data is not trivial
• Recommend use of a scientifically
  created questionnaire (e.g. QUIS)
         Values To Be Measured
• The data value metric
• Typically metrics are:
  –   Time for task completion
  –   Number of errors
  –   Average scores/ratings on questionnaire
  –   Percentage of task completed in a given time
  –   Ratio of successes to failures
  –   Time spent in errors and recovery
  – Number of commands/actions used to
    perform task
  – Frequency of help/documentation use
  – Number of repetitions of failed
  – Number of available commands not
  – Number of times user expresses
    frustration or satisfaction
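
Several of these metrics can be computed directly from benchmark-task records. The sketch below assumes a made-up record format of (seconds to complete, number of errors, succeeded); the session data are hypothetical and only illustrate the arithmetic.

```python
from statistics import mean

# Hypothetical benchmark-task records: (seconds to complete, errors, succeeded)
sessions = [
    (42.0, 1, True),
    (55.5, 3, True),
    (90.0, 5, False),
    (38.5, 0, True),
]

def summarize(records):
    """Compute a few of the metrics listed above from raw session records."""
    successes = sum(1 for _, _, ok in records if ok)
    failures = len(records) - successes
    return {
        "mean_time_s": mean(t for t, _, _ in records),
        "total_errors": sum(e for _, e, _ in records),
        # Ratio of successes to failures; infinite when nothing failed.
        "success_failure_ratio": successes / failures if failures else float("inf"),
    }

summary = summarize(sessions)
```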
          Setting Levels
• Having determined what and how to
  measure, need to set acceptable levels
• These levels will be used to determine
  when the interface has reached the
  appropriate level of usability
• Important to be specific enough so that
  levels can be reasonably set
           Current Level
• Present level of the value to be measured
• Values can be determined from manual
  system, current automated system or
• Proof that usability attribute can be
• Baseline against which new system will be
   Worst Acceptable Level

• Lowest acceptable level of user
  performance
• This level must be attained for the
  product to be considered complete
• Not a prediction of how the user will
  perform, but rather the worst
  performance that is considered
  acceptable
   Worst Acceptable Level
• Tendency/pressure is to set the
  values too low
• Good rule of thumb is to set them at
  or near the current levels
     Planned Target Level
• The level of unquestioned usability,
  the ideal situation
• Serve to focus attention on those
  aspects needing the most work (now
  or later)
• May be based on competitive
  systems
     Best Possible Level
• State-of-the-art upper limit
• Provides goals for next versions
• Gives indication of improvement that
  is possible
• Frequently determined by measuring
  an expert user
       Observed Results
• Actual values obtained from user
• Provides quick comparison with
  projected levels
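
The comparison of an observed result with the projected levels can be sketched as a small function. The function name, the wording of the verdicts, and the handling of values that fall exactly on a boundary are assumptions made for this example.

```python
def assess(observed, worst, target, best, lower_is_better=True):
    """Place an observed value relative to the worst, target, and best levels."""
    if not lower_is_better:
        # Flip signs so the comparisons below always treat smaller as better.
        observed, worst, target, best = -observed, -worst, -target, -best
    if observed > worst:
        return "below worst acceptable level"
    if observed > target:
        return "acceptable, short of planned target"
    if observed > best:
        return "meets planned target"
    return "at or beyond best possible level"
```

For a timed task with levels of 15, 12, and 8 seconds, a hypothetical observed time of 13 seconds would be acceptable but short of the planned target.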
            Setting Levels

• There are various methods for estimating
  the levels:
  – Existing systems or previous versions of new
  – Competitive systems
  – Performing task manually
  – Developer performing with prototype
  – Marketing input based on observations of user
    performance on existing systems
          Setting Levels
• The context of the task is important
  in determining these levels
             Example usability table
Usability   Measuring      Value to be     Current      Worst        Planned    Best        Observed
attribute   instrument     measured        level        acceptable   target     possible    results
                                                        level        level      level

Advanced    “Add           Length of       13 minutes   2 minutes    1 minute   30 seconds
feature     repeating      time to add a   (manually)
usage       appointment”   weekly
            task per       appointment
            benchmark 3    every week
                           for one year
                           after one
                           hour of use
              Example usability table
Usability    Measuring    Value to be         Current   Worst         Planned       Best          Observed
attribute    instrument   measured            level     acceptable    target        possible      results
                                                        level         level         level

First        User         Number of           ??        10 negative/  5 negative/   2 negative/
impression   reaction     negative/positive             2 positive    5 positive    10 positive
                          remarks during the
               Example usability table
Usability      Measuring     Value to be    Current      Worst        Planned    Best       Observed
attribute      instrument    measured       level        acceptable   target     possible   results
                                                         level        level      level

Learnability   “Add          Length of      15 seconds   15 seconds   12 seconds 8 seconds
               appointment”  time to        (manually)
               task per      successfully
               benchmark 5   add
                             appointment
                             after one
                             hour of use
• Each usability attribute should be
  (realistically) measurable
• User classes need to be clearly specified
• The number of attributes to be measured
  should be reasonable. Start small and add
  as experience grows
• All project members should agree on the
• The values should be reasonable
  – If found to be too low, then increase
    them on next iteration
  – If they appear too high, it may be they
    were not realistically set or that the
    interface needs a lot of work!
    Judgement call
Expert Reviews, Usability
  Testing, Surveys, and
 Continuing Assessment
• Designers can become so entranced
  with their creations that they may fail
  to evaluate them adequately
• Experienced designers have attained
  the wisdom and humility to know that
  extensive testing is a necessity
• The determinants of the evaluation
  plan include:
  – stage of design (early, middle, late)
  – novelty of project (well defined vs.
    exploratory)
  – number of expected users
  – criticality of the interface (life-critical
    medical system vs. museum exhibit)
  – costs of product and finances allocated
    for testing
  – time available
  – experience of the design and evaluation
    team
• The range of evaluation plans might
  be from an ambitious two-year test to
  a few days test.
• The range of costs might be from
  10% of a project down to 1%.
          Expert Reviews
• While informal demos to colleagues or
  customers can provide some useful
  feedback, more formal expert reviews
  have proven to be effective.
• Expert reviews entail one-half day to one
  week effort, although a lengthy training
  period may sometimes be required to
  explain the task domain or operational
         Expert Reviews
• There are a variety of expert review
  methods to choose from:
  – Heuristic evaluation
  – Guidelines review
  – Consistency inspection
  – Cognitive walkthrough
  – Formal usability inspection
          Expert Reviews
• Expert reviews can be scheduled at
  several points in the development process
  when experts are available and when the
  design team is ready for feedback.
• Different experts tend to find different
  problems in an interface, so 3-5 expert
  reviewers can be highly productive, as can
  complementary usability testing.
        Expert Reviews
• The dangers with expert reviews are
  that the experts may not have an
  adequate understanding of the task
  domain or user communities.
          Expert Reviews
• To strengthen the possibility of successful
  expert reviews it helps to choose
  knowledgeable experts who are familiar
  with the project situation and who have a
  longer term relationship with the
• Moreover, even experienced expert
  reviewers have great difficulty knowing
  how typical users, especially first-time
  users will really behave.
      Usability Testing and Laboratories
• The emergence of usability testing
  and laboratories since the early
  1980s is an indicator of the profound
  shift in attention to user needs.
• The remarkable surprise was that
  usability testing not only sped up
  many projects but that it produced
  dramatic cost savings.
      Usability Testing and Laboratories
• The movement towards usability
  testing stimulated the construction
  of usability laboratories.
      Usability Testing and Laboratories
• A typical modest usability lab would
  have two 10 by 10 foot areas, one for
  the participants to do their work and
  another, separated by a half-silvered
  mirror, for the testers and observers
  (designers, managers, and
Usability Lab (Interface
 Analysis Associates)
      Usability Testing and Laboratories
• Participants should be chosen to
  represent the intended user
  communities, with attention to
  background in computing,
  experience with the task, motivation,
  education, and ability with the
  natural language used in the
       Usability Testing and Laboratories
• Participation should always be voluntary,
  and informed consent should be obtained.
  Professional practice is to ask all subjects
  to read and sign a statement like this one:
  – I have freely volunteered to participate in this
  – I have been informed in advance what my
    task(s) will be and what procedures will be
– I have been given the opportunity to ask
  questions, and have had my questions
  answered to my satisfaction.
– I am aware that I have the right to withdraw
  consent and to discontinue participation at any
  time, without prejudice to my future treatment.
– My signature below may be taken as
  affirmation of all the above statements; it was
  given prior to my participation in this study.
      Usability Testing and Laboratories
• Videotaping participants performing
  tasks is often valuable for later
  review and for showing designers or
  managers the problems that users
• Field tests attempt to put new
  interfaces to work in realistic
  environments for a fixed trial period
         Nomos Lab

An observer's view of a test being carried out
in the purposely designed Nomos lab.
     Nomos Lab

Two sides of the one-way glass -
actions and problems are logged
while the user carries out real tasks
with the product.
      Usability Testing and Laboratories
• Field tests can be made more fruitful
  if logging software is used to capture
  error, command, and help
  frequencies plus productivity
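
As a rough idea of what such logging software records, the sketch below counts events by category in memory. The class name, the method names, and the three categories are assumptions; a real logger would also record timestamps and write to persistent storage.

```python
from collections import Counter

class UsageLog:
    """Minimal in-memory event log; the categories are illustrative assumptions."""

    def __init__(self):
        self.counts = Counter()

    def record(self, kind, name):
        """Count one event; `kind` is expected to be "command", "error", or "help"."""
        self.counts[(kind, name)] += 1

    def frequency(self, kind):
        """Total number of recorded events of the given kind."""
        return sum(n for (k, _), n in self.counts.items() if k == kind)
```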
        Usability Testing and Laboratories
• Game designers pioneered the can-you-
  break-this approach to usability testing
  – providing energetic teenagers with the
    challenge of trying to beat new games
• This is a destructive testing approach
  – users try to find fatal flaws in the system, or
    otherwise to destroy it
  – has been used in other projects and should be
    considered seriously
       Usability Testing and Laboratories
• Usability testing does have at least two
  serious limitations
  – it emphasizes first-time usage
  – has limited coverage of the interface features.
• These and other concerns have led design
  teams to supplement usability testing with
  the varied forms of expert reviews.
         Siemens Usability Lab

A control deck (shown above) allows the team to witness
users reacting to software as they navigate the interface and
attempt to perform normal tasks. Separate cameras record
facial expressions and comments, use of manuals, and
activity on the screen itself. As a rule, every session is
recorded and held for later review and analysis.
     Siemens Usability Lab

This section of the Siemens Center has been arranged
so the software design team can view every move the
user makes, interact with him or her when necessary,
and generally see and feel their own design through the
user's experience.
    Inventory of Facilities - U of
•   2 Sony DXC-107A CCD Color Video Cameras, equipped with Canon R-II electrically controlled zoom
    lenses and wall-mounted on Pelco remote-control pan/tilt bases. All camera functions are remotely
    controlled from the observation room by Pelco MPTAZ Pan/Tilt and Scanner controls.
•   2 Microphones: 1 Audio-Technica superhypercardioid (super shotgun) type for discrete data
    collection and a cardioid microphone for narration and overdubbing.
•   1 Teac TASCAM M-06 six channel professional audio mixer, monitored via 5W self-amplified
    speakers or headphones.
•   2 Scan converters: 1 Extron Super Emotia high resolution scan converter for capturing live video
    from the subject's computer screen, and 1 Mediator medium-resolution scan converter for titling
    and effects generation.
•   2 Macintosh PowerMac 7500/100 workstations with 1710AV 17" monitors: 1 located in the testing
    room for use in evaluating Macintosh software, and 1 located in the observation room for data
    analysis, effects generation, and web-server functions. Both machines feature video capture and
    output (via scan converter) capabilities, and are networked onto both the local LAN (Novell ipx/spx)
    and Internet (TCP/IP).
•   1 Dell XPS-90 Workstation with Dell 17" multiscanning monitor, located in the testing room for use
    in evaluating PC-compatible software. This machine is also networked onto both the local LAN
    (Novell ipx/spx) and Internet (TCP/IP).
•   1 Sony PVM-411 video monitor rack for monitoring all online video sources.
•   3 JVC BRS-800U industrial video cassette recorders, equipped with SA-R50U time code
    generator/reader boards and SA-K26U RS-422 interface boards: 2 for capturing camera output and 1
    for capturing scan converter (computer screen) output. Each can function independently or can be
    slaved to a single universal RMG-30U serial remote control.
•   3 JVC TM-131SU Color Video Monitors located in the observation room for monitoring online
    sources during the evaluation session and providing high-quality output for post-session analysis
    and mixdown.
•   1 JVC RMG-800U Editing Control Unit for post-production assemble/insert mixdown of recorded
    video source into condensed "highlights" tapes.
•   1 Panasonic WAV7 Digital Effects Generator/Mixer for creating a variety of online and post-
    production video effects including wipes, fades, cuts, strobes, keys, mosaics, split-screen and
    picture-in-picture effects.
•   1 Optimus SCT-53 "Pro Series" dual audio cassette deck with auto-reverse, dual digital time
    counters, and high speed dubbing capabilities.
•   Speakerphone equipped with a flashing silent ringer and a digital voicemail box.
•   Requisite cabling, stands, tables and other paraphernalia to allow above equipment to function and
    be used properly.
• Written user surveys are a familiar,
  inexpensive and generally
  acceptable companion for usability
  tests and expert reviews.
• The keys to successful surveys are
  clear goals in advance and then
  development of focused items that
  help attain the goals.
• Survey goals can be tied to the
  components of the Objects and
  Action Interface model of interface
  design. Users could be asked for
  their subjective impressions about
  specific aspects of the interface such
  as the representation of:
  – task domain objects and actions
  – syntax of inputs and design of displays.
• Other goals would be to ascertain
  – users' background (age, gender, origins,
    education, income)
  – experience with computers (specific
    applications or software packages, length of
    time, depth of knowledge)
  – job responsibilities (decision-making
    influence, managerial roles, motivation)
  – personality style (introvert vs. extrovert, risk
    taking vs. risk aversive, early vs. late adopter,
    systematic vs. opportunistic)
– reasons for not using an interface
  (inadequate services, too complex, too
– familiarity with features (printing,
  macros, shortcuts, tutorials)
– their feeling state after using an
  interface (confused vs. clear, frustrated
  vs. in-control, bored vs. excited).

• Online surveys avoid the cost of printing
  and the extra effort needed for distribution
  and collection of paper forms.
• Many people prefer to answer a brief
  survey displayed on a screen, instead of
  filling in and returning a printed form,
  although there is a potential bias in the
• Extensive testing is a necessity
• Formal expert reviews have proven to be
  effective
• Must have an adequate understanding of
  the task domain and user communities
• Usability testing speeds up project TTM
  and produces dramatic cost savings
Product Evaluations
  Evaluation During Active Use
• A carefully designed and thoroughly
  tested system is a wonderful asset, but
  successful active use requires constant
  attention from dedicated managers, user-
  services personnel, and maintenance
• Perfection is not attainable, but
  percentage improvements are possible
  and are worth pursuing.
  Evaluation During Active Use
• Interviews and focus group
  – Interviews with individual users can be
    productive because the interviewer can
    pursue specific issues of concern.
  – After a series of individual discussions,
    group discussions are valuable to
    ascertain the universality of comments.
  Evaluation During Active Use
• Continuous user-performance data
  – The software architecture should make it easy
    for system managers to collect data about the
    patterns of system usage, speed of user
    performance, rate of errors, or frequency of
    request for online assistance.
  – A major benefit of usage-frequency data is the
    guidance they provide to system maintainers
    in optimizing performance and reducing costs
    for all participants.
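
Usage-frequency data of this kind can be summarized with a few lines of code. The sketch below ranks commands by how often they appear in a log; the log contents and the function name are hypothetical.

```python
from collections import Counter

def most_used(commands, n=3):
    """Return the n most frequently used commands with their counts."""
    return Counter(commands).most_common(n)

# Hypothetical command log collected during active use.
log = ["open", "save", "open", "print", "open", "save", "help"]
top = most_used(log, n=2)
```

A ranking like this suggests where maintainers might look first when optimizing performance.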
  Evaluation During Active Use
• Online or telephone consultants
  – Online or telephone consultants are an
    extremely effective and personal way to
    provide assistance to users who are
    experiencing difficulties.
  – Many users feel reassured if they know
    there is a human being to whom they
    can turn when problems arise.
– On some network systems, the
  consultants can monitor the user's
  computer and see the same displays
  that the user sees while maintaining
  telephone voice contact.
– This service can be extremely
  reassuring; the users know that
  someone can walk them through the
  correct sequence of screens to
  complete their tasks.
  Evaluation During Active Use
• Online suggestion box or trouble
  – Electronic mail can be employed to allow users
    to send messages to the maintainers or
  – Such an online suggestion box encourages
    some users to make productive comments,
    since writing a letter may be seen as requiring
    too much effort.
  Evaluation During Active Use
• Online bulletin board or newsgroup
  – Many interface designers offer users an
    electronic bulletin board or newsgroups to
    permit posting of open messages and
  – Bulletin-board software systems usually offer
    a list of item headlines, allowing users the
    opportunity to select items for display.
  – New items can be added by anyone, but
    usually someone monitors the bulletin board
    to ensure that offensive, useless, or repetitious
    items are removed.
  Evaluation During Active Use
• User newsletters and conferences
  – Newsletters that provide information
    about novel interface facilities,
    suggestions for improved productivity,
    requests for assistance, case studies of
    successful applications, or stories
    about individual users can promote
    user satisfaction and greater
– Printed newsletters are more traditional and
  have the advantage that they can be carried
  away from the workstation.
– Online newsletters are less expensive and
  more rapidly disseminated
– Conferences allow workers to exchange
  experiences with colleagues, promote novel
  approaches, stimulate greater dedication,
  encourage higher productivity, and develop a
  deeper relationship of trust.
  Controlled Psychologically-
    oriented Experiments
• Scientific and engineering progress
  is often stimulated by improved
  techniques for precise measurement.
• Rapid progress in the designs of
  interfaces will be stimulated as
  researchers and practitioners evolve
  suitable human-performance
  measures and techniques.
  Controlled Psychologically-
    oriented Experiments
• The outline of the scientific method as
  applied to human-computer interaction
  might comprise these tasks:
  – Deal with a practical problem and consider the
    theoretical framework
  – State a lucid and testable hypothesis
  – Identify a small number of independent
    variables that are to be manipulated
  – Carefully choose the dependent variables that
    will be measured
– Judiciously select subjects and carefully or
  randomly assign subjects to groups
– Control for biasing factors (non-representative
  sample of subjects or selection of tasks,
  inconsistent testing procedures)
– Apply statistical methods to data analysis
– Resolve the practical problem, refine the
  theory, and give advice to future researchers
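
The random-assignment step in the outline above could look like the following sketch. The function name and the round-robin split after shuffling are assumptions; the optional seed only makes an assignment reproducible.

```python
import random

def assign_groups(subjects, n_groups=2, seed=None):
    """Shuffle the subjects, then deal them into groups of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    # Every n-th subject goes to the same group, so sizes differ by at most one.
    return [shuffled[i::n_groups] for i in range(n_groups)]
```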
  Controlled Psychologically-
    oriented Experiments
• Managers of actively used systems are
  coming to recognize the power of
  controlled experiments in fine tuning the
  human-computer interface.
• A fraction of the users could be given
  proposed improvements for a limited time,
  and then performance could be compared
  with the control group.
  Dependent measures could include
  performance times, user-subjective
  satisfaction, error rates, and user
  retention over time.
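
One common way to compare a dependent measure between a treatment group and a control group is Welch's t statistic. The slides do not name a particular test, so this choice is an assumption, and the sketch computes only the statistic, not a p-value.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(treatment, control):
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = sqrt(
        variance(treatment) / len(treatment) + variance(control) / len(control)
    )
    return (mean(treatment) - mean(control)) / standard_error
```

With task times as the measure, a clearly negative value would suggest that the treatment group was faster than the control group.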