User
Interface
Design
Southern Methodist University
CSE 8316
Spring 2003
Temporal Relations and
Usability Specifications
Introduction
• Previous chapter discussed low-level
primitives
• Now focus on abstraction and
relative timing of events
• Such issues as interruptibility and
interleavability should be part of
interaction design and not driven by
constructional design.
Introduction
• UAN can be used to specify:
– Sequence
– Iteration
– Optionality
– Repeating choice
– Order independence
– Interruptibility
– Interleavability
– Concurrency
– Waiting
Sequencing and Grouping
• Sequence
– Sequence: One task is performed in its entirety
before the next task is begun
– Represent in the UAN by grouping
(horizontally or vertically) without any
intervening operators
• Grouping
– Tasks can be grouped together using various
operators to form new tasks
– Definition is similar to that for regular
expressions
Abstraction
• Have only seen UAN describing
articulatory actions -- primitive tasks
performed by the user.
• In this form, describing an entire
interaction design would be overly
complex and difficult
• Introduce abstraction by allowing groups
of tasks to be named.
• As with procedures, a reference to the
name is equivalent to performing all the
tasks described by that name
Abstraction
• To aid in reusability, allow task
references to be parameterized
• Reusing tasks promotes logical
decomposition, providing for a consistent
system model
• Abstraction hides details, but also hides
user feedback. This information can be
listed at one or both levels.
• With task naming, can now perform top-
down design.
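The procedure analogy can be sketched in Python. This is only an illustration, not UAN itself: the task names (`select`, `delete_file`) and the action strings are hypothetical, chosen to show how a parameterized, named task expands into primitive actions.

```python
from typing import List


def select(target: str) -> List[str]:
    """Named task: move to a target and click it (hypothetical primitives)."""
    return [f"move-to({target})", "press-button", "release-button"]


def delete_file(name: str) -> List[str]:
    """Higher-level task defined only by reference to other named tasks."""
    return select(f"icon:{name}") + select("menu:Delete")


# Referring to the name is equivalent to performing all of its subtasks.
steps = delete_file("report.txt")
for step in steps:
    print(step)
```

Designing top-down means writing `delete_file` first and filling in `select` later; the parameter lets the same task be reused for any target.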
Task Operators
• Choice
– Simple choice is represented in UAN
with the vertical bar, `|'.
– Repeating choice is formed by adding
the iterators `*' and `+'.
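Because these operators resemble regular expressions, a repeating choice can be approximated with Python's `re` module. This is a sketch under one assumption: each task is abbreviated to a single letter, so a sequence of performed tasks becomes a string.

```python
import re

# Assumption: tasks A and B are each written as one letter.
# The repeating choice (A | B)+ maps onto the regular expression (?:A|B)+
REPEATING_CHOICE = re.compile(r"(?:A|B)+")


def is_valid_trace(trace: str) -> bool:
    """True if the whole trace is one or more repetitions of A or B."""
    return REPEATING_CHOICE.fullmatch(trace) is not None


print(is_valid_trace("ABBA"))
print(is_valid_trace("ABC"))
```

Swapping `+` for `*` in the pattern would also accept the empty trace, which mirrors the difference between the two iterators.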
Task Operators
• Order Independence
– Set of tasks that must be completed
before continuing, but order of
completion of the subtasks is not
important.
– Represented by the `&'.
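One way to read order independence is that every ordering of the subtasks is acceptable. A minimal sketch, with hypothetical task names A, B, and C, that enumerates those orderings:

```python
from itertools import permutations
from typing import List, Sequence


def allowed_orders(tasks: Sequence[str]) -> List[List[str]]:
    """All orders in which an order-independent set of tasks may be completed."""
    return [list(order) for order in permutations(tasks)]


orders = allowed_orders(["A", "B", "C"])  # stands for A & B & C
for order in orders:
    print(" ".join(order))
```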
Task Operators
• Interruption
– Interruption occurs when one task is
suspended while another task is started
– Since UAN describes what can happen,
you cannot specify an interruption, but
rather what can be interrupted
(interruptibility)
– To specify that A can interrupt B, use
A --> B.
Task Operators
• Uninterruptible Tasks
– Assume all primitive actions are
uninterruptible (e.g. pressing a mouse
button).
– Specify the uninterruptibility of higher-level
tasks (e.g. modality) by enclosing them
in angle brackets, `<A>'.
Task Operators
• Interleavability
– If two tasks can interrupt each other,
they are considered interleavable.
– Assume that operator is transitive.
– Represented with double arrow,
A <--> B.
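To make interleavability concrete, the sketch below lists every way the steps of two tasks can be mixed while each task keeps its own internal order. The step names are hypothetical, and treating each step as an atomic unit is an assumption of the example.

```python
from typing import List, Sequence


def interleavings(a: Sequence[str], b: Sequence[str]) -> List[List[str]]:
    """All merges of a and b that preserve the order within each task."""
    if not a:
        return [list(b)]
    if not b:
        return [list(a)]
    # Either the next step comes from task a, or it comes from task b.
    start_with_a = [[a[0]] + rest for rest in interleavings(a[1:], b)]
    start_with_b = [[b[0]] + rest for rest in interleavings(a, b[1:])]
    return start_with_a + start_with_b


mixes = interleavings(["a1", "a2"], ["b1", "b2"])
for mix in mixes:
    print(" ".join(mix))
```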
Task Operators
• Concurrency
– If two tasks can be performed in parallel
(e.g. by two different users), then the tasks
are concurrent.
– Represented with `||'.
Task Operators
• Intervals and Waiting
– Can add explicit time intervals between
two events.
– Two forms:
• If task B must be completed within n
seconds of task A: `A (t<n) B'
• If task B must wait at least n seconds
after task A: `A (t>n) B'
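A timing constraint of this kind can be checked against timestamped events. The function below is a sketch: its name, the use of seconds as floats, and the strict upper bound are all assumptions made for the example.

```python
def completed_within(t_a: float, t_b: float, n: float) -> bool:
    """True if task B completed after task A and less than n seconds later.

    t_a and t_b are completion times in seconds (hypothetical log values).
    """
    elapsed = t_b - t_a
    return 0 <= elapsed < n


print(completed_within(10.0, 12.5, 3.0))
print(completed_within(10.0, 14.0, 3.0))
```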
Other Representations
• Screen Pictures and Scenarios
– UAN describes user actions, but does not
describe the format/display of screens
– Should supplement UAN with screen layouts
and scenarios.
• State Transition Diagrams
– Typical interface contains various states
– To provide a global view of how states are
related, add a state transition diagram to UAN
Design Rationale
• Basic role of UAN is communication
• Important to provide reasons behind
various decisions
• Gives motivation and goals and
helps prevent later duplication of
mistakes
Usability Specifications
Usability Specifications
• Quantitative, measurable goals for
knowing when the interface is good
enough
• Often overlooked, but provide
insurance that multiple iterations are
converging
• For this reason, should be
established early
Usability Specification Table
• Convenient method for indicating
parameters
• Contains following information
– Usability Attribute
– Measuring Instrument
– Value to be Measured
– Current Level
– Worst Acceptable Level
– Planned Target Level
– Best Possible Level
– Observed Results
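The columns listed above map naturally onto a record type. The following is a minimal sketch; the class name, field names, and sample values are hypothetical and only illustrate one row of such a table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UsabilitySpec:
    """One row of a usability specification table."""
    attribute: str                   # usability attribute
    instrument: str                  # measuring instrument
    value_measured: str              # value to be measured
    current_level: Optional[float]   # None if no baseline exists yet
    worst_acceptable: float
    planned_target: float
    best_possible: float
    observed: Optional[float] = None  # filled in after user testing


# Hypothetical row: times in seconds, not taken from any real study.
spec = UsabilitySpec(
    attribute="Initial performance",
    instrument="Hypothetical benchmark task",
    value_measured="Time on task (seconds)",
    current_level=None,
    worst_acceptable=60.0,
    planned_target=45.0,
    best_possible=20.0,
)
print(spec)
```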
Usability Attribute
• Represents the usability
characteristic being measured
• Must determine classes of intended
users
• For each class determine realistic set
of tasks
• Goal is to determine what user
performance will be acceptable
Usability Attributes
• Typical attributes include:
– Initial Performance: User's performance during
the first few uses.
– Long-term Performance: User's performance
after extended use of the product
– Learnability: How quickly the user learns the
system
– Retainability: How quickly knowledge of how
to use the system dissipates
Usability Attributes
– Advanced Feature Usage: Usability of
sophisticated features
– First Impression: Subjective user
feelings at first glance
– Long-term User Satisfaction: User's
opinion after extended use
Measuring Instrument
• Method to find a value for a usability
attribute
• Quantitative, but may be objective or
subjective
• Objective: based on user task
performance
• Subjective: deal with user opinion
(questionnaires)
• Both types are needed to effectively
evaluate
Benchmark Tasks
• User is asked to perform a task using
the interface
• Most common objective measure
• Task should exercise a single, specific
interface feature
• Description should be clearly worded
without describing how to do it
Questionnaire
• Quantitative measure for subjective
feelings
• Creating survey that provides useful
data is not trivial
• Recommend use of a scientifically
developed questionnaire (e.g. QUIS)
Values To Be Measured
• The data value metric
• Typically metrics are:
– Time for task completion
– Number of errors
– Average scores/ratings on questionnaire
– Percentage of task completed in a given time
– Ratio of successes to failures
– Time spent in errors and recovery
Values To Be Measured
– Number of commands/actions used to
perform task
– Frequency of help/documentation use
– Number of repetitions of failed
commands
– Number of available commands not
invoked
– Number of times user expresses
frustration or satisfaction
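Several of these metrics can be computed directly from per-session records. The sketch below assumes a hypothetical record format of (seconds on task, error count, task completed) and uses invented sample data.

```python
from statistics import mean

# Hypothetical session records: (seconds on task, error count, completed?)
sessions = [
    (95.0, 2, True),
    (140.0, 5, False),
    (80.0, 0, True),
    (110.0, 1, True),
]

mean_time = mean(seconds for seconds, _, _ in sessions)
total_errors = sum(errors for _, errors, _ in sessions)
successes = sum(1 for _, _, completed in sessions if completed)
failures = len(sessions) - successes
# Ratio of successes to failures; infinite when nobody failed.
success_ratio = successes / failures if failures else float("inf")

print(mean_time, total_errors, success_ratio)
```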
Setting Levels
• Having determined what and how to
measure, need to set acceptable levels
• These levels will be used to determine
when the interface has reached the
appropriate level of usability
• Important to be specific enough so that
levels can be reasonably set
Current Level
• Present level of the value to be measured
• Values can be determined from manual
system, current automated system or
prototypes
• Proof that usability attribute can be
measured
• Baseline against which new system will be
judged
Worst Acceptable Level
• Lowest acceptable level of user
performance
• This level must be attained for the
product to be considered complete
• Not a prediction of how the user will
perform, but rather the worst
performance that is considered
acceptable
Worst Acceptable Level
• Tendency/pressure is to set the
values too low
• Good rule of thumb is to set them at
or near the current levels
Planned Target Level
• The level of unquestioned usability,
the ideal situation
• Serve to focus attention on those
aspects needing the most work (now
or later)
• May be based on competitive
systems
Best Possible Level
• State-of-the-art upper limit
• Provides goals for next versions
• Gives indication of improvement that
is possible
• Frequently determined by
measuring an expert user
Observed Results
• Actual values obtained from user
testing
• Provides quick comparison with
projected levels
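The comparison with projected levels can be automated. This sketch assumes a hypothetical `assess` function and three status labels of our own choosing; the `lower_is_better` flag covers metrics such as time or errors versus ratings.

```python
def assess(observed: float, worst_acceptable: float, planned_target: float,
           lower_is_better: bool = True) -> str:
    """Classify an observed result against the specified levels."""
    if not lower_is_better:
        # Negate so that the same comparisons work for higher-is-better metrics.
        observed = -observed
        worst_acceptable = -worst_acceptable
        planned_target = -planned_target
    if observed > worst_acceptable:
        return "below worst acceptable level"
    if observed > planned_target:
        return "acceptable, target not yet met"
    return "planned target met"


# Hypothetical task times in seconds.
print(assess(150, worst_acceptable=120, planned_target=60))
print(assess(90, worst_acceptable=120, planned_target=60))
```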
Setting Levels
• There are various methods for estimating
the levels:
– Existing systems or previous versions of new
system
– Competitive systems
– Performing task manually
– Developer performing with prototype
– Marketing input based on observations of user
performance on existing systems
Setting Levels
• The context of the task is important
in determining these levels
Example usability table
Usability attribute | Measuring instrument | Value to be measured | Current level | Worst acceptable level | Planned target level | Best possible level | Observed results
Advanced feature usage | “Add repeating appointment” task per benchmark 3 | Length of time to add a weekly appointment every week for one year after one hour of use | 13 minutes (manually) | 2 minutes | 1 minute | 30 seconds |
Example usability table
Usability attribute | Measuring instrument | Value to be measured | Current level | Worst acceptable level | Planned target level | Best possible level | Observed results
First impression | User reaction | Number of negative/positive remarks during the session | ?? | 10 negative/2 positive | 5 negative/5 positive | 2 negative/10 positive |
Example usability table
Usability attribute | Measuring instrument | Value to be measured | Current level | Worst acceptable level | Planned target level | Best possible level | Observed results
Learnability | “Add appointment” task per benchmark 5 | Length of time to successfully add appointment after one hour of use | 15 seconds (manually) | 15 seconds | 12 seconds | 8 seconds |
Cautions
• Each usability attribute should be
(realistically) measurable
• User classes need to be clearly specified
• The number of attributes to be measured
should be reasonable. Start small and add
as experience grows
• All project members should agree on the
values
Cautions
• The values should be reasonable
– If found to be too low, then increase
them on next iteration
– If they appear too high, it may be they
were not realistically set or that the
interface needs a lot of work!
Judgement call
Expert Reviews, Usability
Testing, Surveys, and
Continuing Assessment
Introduction
• Designers can become so entranced
with their creations that they may fail
to evaluate them adequately
• Experienced designers have attained
the wisdom and humility to know that
extensive testing is a necessity
Introduction
• The determinants of the evaluation
plan include:
– stage of design (early, middle, late)
– novelty of project (well defined vs.
exploratory)
– number of expected users
– criticality of the interface (life-critical
medical system vs. museum exhibit
support)
Introduction
– costs of product and finances allocated
for testing
– time available
– experience of the design and evaluation
team
Introduction
• The range of evaluation plans might
be from an ambitious two-year test to
a test lasting a few days.
• The range of costs might be from
10% of a project down to 1%.
Expert Reviews
• While informal demos to colleagues or
customers can provide some useful
feedback, more formal expert reviews
have proven to be effective.
• Expert reviews entail one-half day to one
week effort, although a lengthy training
period may sometimes be required to
explain the task domain or operational
procedures.
Expert Reviews
• There are a variety of expert review
methods to choose from:
– Heuristic evaluation
– Guidelines review
– Consistency inspection
– Cognitive walkthrough
– Formal usability inspection
Expert Reviews
• Expert reviews can be scheduled at
several points in the development process
when experts are available and when the
design team is ready for feedback.
• Different experts tend to find different
problems in an interface, so 3-5 expert
reviewers can be highly productive, as can
complementary usability testing.
Expert Reviews
• The danger with expert reviews is
that the experts may not have an
adequate understanding of the task
domain or user communities.
Expert Reviews
• To strengthen the possibility of successful
expert reviews, it helps to choose
knowledgeable experts who are familiar
with the project situation and who have a
longer term relationship with the
organization.
• Moreover, even experienced expert
reviewers have great difficulty knowing
how typical users, especially first-time
users, will really behave.
Usability Testing and
Laboratories
• The emergence of usability testing
and laboratories since the early
1980s is an indicator of the profound
shift in attention to user needs.
• The remarkable surprise was that
usability testing not only sped up
many projects but that it produced
dramatic cost savings.
Usability Testing and
Laboratories
• The movement towards usability
testing stimulated the construction
of usability laboratories.
Usability Testing and
Laboratories
• A typical modest usability lab would
have two 10 by 10 foot areas, one for
the participants to do their work and
another, separated by a half-silvered
mirror, for the testers and observers
(designers, managers, and
customers).
Usability Lab (Interface
Analysis Associates)
Usability Testing and
Laboratories
• Participants should be chosen to
represent the intended user
communities, with attention to
background in computing,
experience with the task, motivation,
education, and ability with the
natural language used in the
interface.
Usability Testing and
Laboratories
• Participation should always be voluntary,
and informed consent should be obtained.
Professional practice is to ask all subjects
to read and sign a statement like this one:
– I have freely volunteered to participate in this
experiment.
– I have been informed in advance what my
task(s) will be and what procedures will be
followed.
Usability Testing and
Laboratories
– I have been given the opportunity to ask
questions, and have had my questions
answered to my satisfaction.
– I am aware that I have the right to withdraw
consent and to discontinue participation at any
time, without prejudice to my future treatment.
– My signature below may be taken as
affirmation of all the above statements; it was
given prior to my participation in this study.
Usability Testing and
Laboratories
• Videotaping participants performing
tasks is often valuable for later
review and for showing designers or
managers the problems that users
encounter.
• Field tests attempt to put new
interfaces to work in realistic
environments for a fixed trial period
Nomos Lab
An observer's view of a test being carried out
in the purposely designed Nomos lab.
Nomos Lab
Two sides of the one-way glass -
actions and problems are logged
while the user carries out real tasks
with the product.
Usability Testing and
Laboratories
• Field tests can be made more fruitful
if logging software is used to capture
error, command, and help
frequencies plus productivity
measures
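Counting logged events is straightforward once the log has a regular shape. The sketch below assumes a hypothetical log in which each entry is a `kind:detail` string; real logging software will have its own format.

```python
from collections import Counter

# Hypothetical log: one "kind:detail" entry per captured event.
log = [
    "command:open", "command:save", "error:bad-date", "help:search",
    "command:open", "error:bad-date", "command:print",
]

# Frequency of each kind of event (command, error, help).
by_kind = Counter(entry.split(":", 1)[0] for entry in log)
# Frequency of each individual event.
by_event = Counter(log)

for kind, count in by_kind.most_common():
    print(kind, count)
```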
Usability Testing and
Laboratories
• Game designers pioneered the can-you-
break-this approach to usability testing
– providing energetic teenagers with the
challenge of trying to beat new games
• This is a destructive testing approach
– users try to find fatal flaws in the system, or
otherwise to destroy it
– has been used in other projects and should be
considered seriously
Usability Testing and
Laboratories
• Usability testing does have at least two
serious limitations
– it emphasizes first-time usage
– it has limited coverage of the interface features.
• These and other concerns have led design
teams to supplement usability testing with
the varied forms of expert reviews.
Siemens Usability Lab
A control deck (shown above) allows the team to witness
users reacting to software as they navigate the interface and
attempt to perform normal tasks. Separate cameras record
facial expressions and comments, use of manuals, and
activity on the screen itself. As a rule, every session is
recorded and held for later review and analysis.
Siemens Usability Lab
This section of the Siemens Center has been arranged
so the software design team can view every move the
user makes, interact with him or her when necessary,
and generally see and feel their own design through the
user's experience.
Inventory of Facilities - U of
Indiana
• 2 Sony DXC-107A CCD Color Video Cameras, equipped with Canon R-II electrically controlled zoom
lenses and wall-mounted on Pelco remote-control pan/tilt bases. All camera functions are remotely
controlled from the observation room by Pelco MPTAZ Pan/Tilt and Scanner controls.
• 2 Microphones: 1 Audio-Technica superhypercardioid (super shotgun) type for discrete data
collection and a cardioid microphone for narration and overdubbing.
• 1 Teac TASCAM M-06 six channel professional audio mixer, monitored via 5W self-amplified
speakers or headphones.
• 2 Scan converters: 1 Extron Super Emotia high resolution scan converter for capturing live video
from the subject's computer screen, and 1 Mediator medium-resolution scan converter for titling
and effects generation.
• 2 Macintosh PowerMac 7500/100 workstations with 1710AV 17" monitors: 1 located in the testing
room for use in evaluating Macintosh software, and 1 located in the observation room for data
analysis, effects generation, and web-server functions. Both machines feature video capture and
output (via scan converter) capabilities, and are networked onto both the local LAN (Novell ipx/spx)
and Internet (TCP/IP).
• 1 Dell XPS-90 Workstation with Dell 17" multiscanning monitor, located in the testing room for use
in evaluating PC-compatible software. This machine is also networked onto both the local LAN
(Novell ipx/spx) and Internet (TCP/IP).
• 1 Sony PVM-411 video monitor rack for monitoring all online video sources.
• 3 JVC BRS-800U industrial video cassette recorders, equipped with SA-R50U time code
generator/reader boards and SA-K26U RS-422 interface boards: 2 for capturing camera output and 1
for capturing scan converter (computer screen) output. Each can function independently or can be
slaved to a single universal RMG-30U serial remote control.
Inventory of Facilities - U of
Indiana
• 3 JVC TM-131SU Color Video Monitors located in the observation room for monitoring online
sources during the evaluation session and providing high-quality output for post-session analysis
and mixdown.
• 1 JVC RMG-800U Editing Control Unit for post-production assemble/insert mixdown of recorded
video source into condensed "highlights" tapes.
• 1 Panasonic WAV7 Digital Effects Generator/Mixer for creating a variety of online and post-
production video effects including wipes, fades, cuts, strobes, keys, mosaics, split-screen and
picture-in-picture effects.
• 1 Optimus SCT-53 "Pro Series" dual audio cassette deck with auto-reverse, dual digital time
counters, and high speed dubbing capabilities.
• Speakerphone equipped with a flashing silent ringer and a digital voicemail box.
• Requisite cabling, stands, tables and other paraphernalia to allow above equipment to function and
be used properly.
Surveys
• Written user surveys are a familiar,
inexpensive and generally
acceptable companion for usability
tests and expert reviews.
• The keys to successful surveys are
clear goals in advance and then
development of focused items that
help attain the goals.
Surveys
• Survey goals can be tied to the
components of the Objects and
Action Interface model of interface
design. Users could be asked for
their subjective impressions about
specific aspects of the interface such
as the representation of:
– task domain objects and actions
– syntax of inputs and design of displays.
Surveys
• Other goals would be to ascertain
– users' background (age, gender, origins,
education, income)
– experience with computers (specific
applications or software packages, length of
time, depth of knowledge)
– job responsibilities (decision-making
influence, managerial roles, motivation)
– personality style (introvert vs. extrovert, risk
taking vs. risk averse, early vs. late adopter,
systematic vs. opportunistic)
Surveys
– reasons for not using an interface
(inadequate services, too complex, too
slow)
– familiarity with features (printing,
macros, shortcuts, tutorials)
– their feeling state after using an
interface (confused vs. clear, frustrated
vs. in-control, bored vs. excited).
Surveys
• Online surveys avoid the cost of printing
and the extra effort needed for distribution
and collection of paper forms.
• Many people prefer to answer a brief
survey displayed on a screen, instead of
filling in and returning a printed form,
although there is a potential bias in the
sample.
Summary
• Extensive testing is a necessity
• Formal expert reviews have proven to be
effective
• Must have an adequate understanding of
the task domain and user communities
• Usability testing shortens project time to
market (TTM) and produces dramatic cost savings
Product Evaluations
Evaluation During Active
Use
• A carefully designed and thoroughly
tested system is a wonderful asset, but
successful active use requires constant
attention from dedicated managers, user-
services personnel, and maintenance
staff.
• Perfection is not attainable, but
percentage improvements are possible
and are worth pursuing.
Evaluation During Active
Use
• Interviews and focus group
discussions
– Interviews with individual users can be
productive because the interviewer can
pursue specific issues of concern.
– After a series of individual discussions,
group discussions are valuable to
ascertain the universality of comments.
Evaluation During Active
Use
• Continuous user-performance data
logging
– The software architecture should make it easy
for system managers to collect data about the
patterns of system usage, speed of user
performance, rate of errors, or frequency of
requests for online assistance.
– A major benefit of usage-frequency data is the
guidance they provide to system maintainers
in optimizing performance and reducing costs
for all participants.
Evaluation During Active
Use
• Online or telephone consultants
– Online or telephone consultants are an
extremely effective and personal way to
provide assistance to users who are
experiencing difficulties.
– Many users feel reassured if they know
there is a human being to whom they
can turn when problems arise.
Evaluation During Active
Use
– On some network systems, the
consultants can monitor the user's
computer and see the same displays
that the user sees while maintaining
telephone voice contact.
– This service can be extremely
reassuring; the users know that
someone can walk them through the
correct sequence of screens to
complete their tasks.
Evaluation During Active
Use
• Online suggestion box or trouble
reporting
– Electronic mail can be employed to allow users
to send messages to the maintainers or
designers.
– Such an online suggestion box encourages
some users to make productive comments,
since writing a letter may be seen as requiring
too much effort.
Evaluation During Active Use
• Online bulletin board or newsgroup
– Many interface designers offer users an
electronic bulletin board or newsgroups to
permit posting of open messages and
questions.
– Bulletin-board software systems usually offer
a list of item headlines, allowing users the
opportunity to select items for display.
– New items can be added by anyone, but
usually someone monitors the bulletin board
to ensure that offensive, useless, or repetitious
items are removed.
Evaluation During Active
Use
• User newsletters and conferences
– Newsletters that provide information
about novel interface facilities,
suggestions for improved productivity,
requests for assistance, case studies of
successful applications, or stories
about individual users can promote
user satisfaction and greater
knowledge.
Evaluation During Active
Use
– Printed newsletters are more traditional and
have the advantage that they can be carried
away from the workstation.
– Online newsletters are less expensive and
more rapidly disseminated
– Conferences allow workers to exchange
experiences with colleagues, promote novel
approaches, stimulate greater dedication,
encourage higher productivity, and develop a
deeper relationship of trust.
Controlled Psychologically-
oriented Experiments
• Scientific and engineering progress
is often stimulated by improved
techniques for precise measurement.
• Rapid progress in the designs of
interfaces will be stimulated as
researchers and practitioners evolve
suitable human-performance
measures and techniques.
Controlled Psychologically-
oriented Experiments
• The outline of the scientific method as
applied to human-computer interaction
might comprise these tasks:
– Deal with a practical problem and consider the
theoretical framework
– State a lucid and testable hypothesis
– Identify a small number of independent
variables that are to be manipulated
– Carefully choose the dependent variables that
will be measured
Controlled Psychologically-
oriented Experiments
– Judiciously select subjects and carefully or
randomly assign subjects to groups
– Control for biasing factors (non-representative
sample of subjects or selection of tasks,
inconsistent testing procedures)
– Apply statistical methods to data analysis
– Resolve the practical problem, refine the
theory, and give advice to future researchers
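Two of the steps above, random assignment and statistical analysis, can be sketched with the Python standard library. Everything here is illustrative: the function names, the fixed seed, and the task times are hypothetical, and Welch's t statistic is just one common choice for comparing two group means.

```python
import random
from math import sqrt
from statistics import mean, variance
from typing import List, Sequence, Tuple


def assign_groups(subjects: Sequence[int], seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly split subjects into two groups of (nearly) equal size."""
    rng = random.Random(seed)  # fixed seed so the assignment is repeatable
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def welch_t(sample_a: Sequence[float], sample_b: Sequence[float]) -> float:
    """Welch's t statistic for the difference between two sample means."""
    standard_error = sqrt(variance(sample_a) / len(sample_a)
                          + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error


control, treatment = assign_groups(range(1, 21))

# Hypothetical dependent variable: task completion time in seconds.
control_times = [52.0, 61.0, 58.0, 49.0, 65.0]
treatment_times = [45.0, 50.0, 47.0, 41.0, 52.0]
t_stat = welch_t(control_times, treatment_times)
print(round(t_stat, 2))
```

A real analysis would also need the degrees of freedom and a p-value, which a statistics package can supply.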
Controlled Psychologically-
oriented Experiments
• Managers of actively used systems are
coming to recognize the power of
controlled experiments in fine tuning the
human-computer interface.
• Fractions of the user population could be
given proposed improvements for a
limited time, and then performance could
be compared with the control group.
Dependent measures could include
performance times, user-subjective
satisfaction, error rates, and user
retention over time.