A Survey: Human Movement Tracking and Stroke
TECHNICAL REPORT: CSM-420
ISSN 1744-8050
Huiyu Zhou and Huosheng Hu
8 December 2004
Department of Computer Science
University of Essex
Email: email@example.com, firstname.lastname@example.org
Contents

1 Introduction
2 Sensor technologies
2.1 Non-vision based tracking
2.2 Vision based tracking with markers
2.3 Vision based tracking without markers
2.4 Robot assisted tracking
3 Human movement tracking: non-vision based systems
3.1 MT9 based
3.2 G-link
3.3 MotionStar
3.4 InterSense
3.5 Polhemus
3.5.1 LIBERTY
3.5.2 FASTRAK
3.5.3 PATRIOT
3.6 HASDMS-I
3.7 Glove-based analysis
3.8 Non-commercial systems
3.9 Other techniques
4 Vision based tracking systems with markers
4.1 Qualisys
4.2 VICON
4.3 CODA
4.4 ReActor2
4.5 ELITE Biomech
4.6 APAS
4.7 Polaris
4.8 Others
5 Vision based tracking systems without markers
5.1 2-D approaches
5.1.1 2-D approaches with explicit shape models
5.1.2 2-D approaches without explicit shape models
5.2 3-D approaches
5.2.1 3-D modelling
5.2.2 Stick figure
5.2.3 Volumetric modelling
5.3 Camera configuration
5.3.1 Single camera tracking
5.3.2 Multiple camera tracking
5.4 Segmentation of human motion
6 Robot-guided tracking systems
6.1 Discriminating static and dynamic activities
6.2 Typical working systems
6.2.1 Cozens
6.2.2 MANUS
6.2.3 Taylor and improved systems
6.2.4 MIME
6.2.5 ARM-Guide
6.3 Other relevant techniques
7 Discussion
7.1 Remaining challenges
7.2 Design specification for a proposed system
8 Conclusions
Abstract

This technical report reviews recent progress in human movement tracking systems in general, and patient rehabilitation in particular. Major achievements in previous working systems are summarized. Meanwhile, problems in motion tracking that remain open are highlighted along with possible solutions. Finally, remaining challenges are discussed and a design specification is proposed for a potential tracking system.

Figure 1. A rehabilitation system at the Massachusetts Institute of Technology (MIT).

1 Introduction

Evidence shows that, in 2001-02, 130,000 people in the UK experienced a stroke and required admission to hospital. More than 75% of these people were elderly, who required locally based multi-disciplinary assessments and appropriate rehabilitative treatments after they were discharged from hospital. As a consequence, this greatly increased the demand on healthcare services and the expense of the national health service. To enhance the health service, people intend to use intelligently devised equipment to conduct patient rehabilitation in the patient's home rather than in a hospital that may be geographically remote. Such systems are expected to reduce the requirement for face-to-face therapy between therapy experts, who provide visual and audio support, and patients.

The goal of rehabilitation is to enable a person who has experienced a stroke to regain the highest possible level of independence so that they can be as productive as possible. Since stroke patients often have complex rehabilitation needs, progress and recovery characteristics are unique for each person. Although a majority of functional abilities may be restored soon after a stroke, recovery is an ongoing process. Therefore, home-based rehabilitation systems are expected to have adaptive settings designed to meet the requirements of individuals, automatic operation, an open human-machine interface, a rich database for later evaluation, and compactness and portability. In fact, rehabilitation is a dynamic process which uses available facilities to correct any undesired motion behavior in order to reach an expectation (e.g. an ideal position).

During the rehabilitation process, the movement of stroke patients needs to be localized and learned so that incorrect movements can be instantly modified or tuned. Therefore, tracking these movements becomes vital and necessary during the course of rehabilitation. This report details a survey of technologies deployed by human movement tracking systems that consistently update the spatiotemporal information of patients. Previous systems (one of them shown in Figure 1) have proved that, to some extent, properly conducted designs are capable of improving the quality of human movement, but many challenges still remain due to complexity and uncertainty in movement. In the following sections, a comprehensive review of this type of system is provided.

The rest of this report is organized as follows. Section 2 outlines the four main types of technologies used in human movement tracking. Section 3 presents non-vision based human movement tracking systems, which have been commercialized. Marker-based visual tracking systems are introduced in Section 4, and markerless visual systems are described in Section 5. Section 6 provides robot-guided tracking system concepts and a description of their application in the rehabilitation procedure. A research proposal based on previous work at the University of Essex and the literature is provided in Section 7. Finally, conclusions are provided in Section 8.

Figure 2. Illustration of a real human movement tracking system (courtesy of Axel Mulder, Simon Fraser University).
2 Sensor technologies

Human movement tracking systems generate real-time data that represents measured human movement, based on different sensor technologies. For example, Figure 2 illustrates a hybrid human movement tracking system. Retrieving such sensing information allows a system to efficiently describe human movement, e.g. arm motion. However, it is recognized that sensor data may be encoded with noise or error due to relative movement between the sensor and the object to which it is attached. It is therefore essential to understand the structure and characteristics of sensors before they are applied to a tracking system. According to sensor location on the human body, tracking systems can be classified as non-vision based, vision based with markers, vision based without markers, and robot assisted systems. These systems are described one at a time in the following sections.

2.1 Non-vision based tracking

In non-vision based systems, sensors are attached to the human body to collect movement information. These sensors are commonly classified as mechanical, inertial, acoustic, radio or microwave, and magnetic. Some of them have a sensing footprint small enough to monitor small-amplitude movements such as finger or toe movement. Each kind of sensor has advantages and limitations. Limitations include modality-specific, measurement-specific and circumstance-specific limitations that accordingly affect the use of the sensor in different environments.

For example, among inertial sensors, accelerometers (Figure 3) convert linear acceleration, angular acceleration or a combination of both into an output signal. There are three common types of accelerometers: piezoelectric, which exploit the piezoelectric effect whereby a naturally occurring quartz crystal is used to produce an electric charge between two terminals; piezoresistive, which operate by measuring the resistance of a fine wire when it is mechanically deformed by a proof mass; and variable capacitive, where the change in capacitance is proportional to acceleration or deceleration. An example of accelerometers is given in Figure 4. Unfortunately, these sensors demand some computing power, which possibly increases response latency. Furthermore, resolution and signal bandwidth are normally limited by the interface circuitry.

Figure 3. Illustration of a piezoresistive sensor.

Figure 4. Entran's family of miniature accelerometers.

2.2 Vision based tracking with markers

This is a technique that uses optical sensors, e.g. cameras, to track human movements, which are captured by placing identifiers upon the human body. As the human skeleton is a highly articulated structure, twists and rotations make the movement fully three-dimensional. As a consequence, each body part continuously moves in and out of occlusion from the view of the cameras, leading to inconsistent and unreliable tracking of the human body. As a good solution to this situation, marker-based vision systems have attracted the attention of researchers in medical science, sports science and engineering.

One major drawback of using optical sensors and markers, however, is that they are difficult to use to accurately sense joint rotation, leading to the infeasibility of representing a real 3-D model of the sensed objects.

2.3 Vision based tracking without markers

This technique exploits external sensors like cameras to track the movement of the human body. It is motivated by facts addressed in marker-based vision systems: (1) identification of standard bony landmarks can be unreliable; (2) the soft tissue overlying bony landmarks can move, giving rise to noisy data; (3) the marker itself can wobble due to its own inertia; (4) markers can even come adrift completely.

A camera can have a resolution of a million pixels. This is one of the main reasons that such an optical sensor has attracted people's attention. However, such vision based techniques require intensive computational power to achieve efficiency and to reduce data latency. Moreover, high speed cameras are also required, as conventionally fewer than sixty frames a second provides insufficient bandwidth for accurate data representation.

2.4 Robot assisted tracking

Recently, voluntary repetitive exercises administered with the mechanical assistance of robotic rehabilitators have proven effective in improving arm movement ability in post-stroke populations. During the course of rehabilitation, human movement is reflected by using sensors attached to the body, which consist of electromechanical and electromagnetic sensors. Electromechanical systems prohibit free movements and involve disconnecting sensors from the human body. The electromagnetic approach provides more freedom for human movement, but is seriously affected by distortion.

3 Human movement tracking: non-vision based systems
Understanding and interpreting human behavior has attracted the attention of therapists and biometric researchers due to its impact on the recovery of patients after disease. So, people need to learn the dynamic characteristics of the actions of certain parts of the body, e.g. hand gestures and gait. Tracking actions is an effective means that consistently and reliably represents human dynamics against time. This purpose can be reached through the use of electromechanical or electromagnetic sensors. This is so-called "non-vision based tracking". Among the sensors and systems introduced below, the MT9 based, G-link based and MotionStar systems have wireless properties, indicating that they are not limited in space.

3.1 MT9 based

The MT9 of Xsens Motion Tech is a digital measurement unit that measures 3-D rate-of-turn, acceleration and the earth-magnetic field, as shown in Figure 5. Combined with the MT9 Software, it provides real-time 3-D orientation data in the form of Euler angles and quaternions, at frequencies up to 512 Hz and with an accuracy better than 1 degree root-mean-square (RMS).

Figure 5. Illustration of MT9.

The algorithm of the MT9 system is equivalent to a sensor fusion process where the measures of gravity through accelerometers and of magnetic north via magnetometers compensate for increasing errors from the integration of the rate-of-turn data. Hence, this drift compensation is attitude and heading referenced. In a homogeneous earth-magnetic field, the MT9 system has 0.05 degrees RMS angular resolution, < 1.0 degrees static accuracy, and 3 degrees RMS dynamic accuracy.

Due to its compact size and reliable performance, the MT9 has easily been integrated into the fields of biomechanics, robotics, animation and virtual reality. However, an MT9-based tracker with six MT9 units costs about 16,000 euros.

3.2 G-link

G-Link of MicroStrain is a high speed, triaxial accelerometer node, designed to operate as part of an integrated wireless sensor network system, as shown in Figure 6. The Base Station transceiver may trigger data logging (from 30 meters), or request previously logged data to be transmitted to the host PC for data acquisition, display and analysis. Featuring 2 kHz sweep rates, combined with 2 Mbytes of flash memory, these little nodes pack a lot of power in a small package. Every node in the wireless network is assigned a unique 16-bit address, so a single host transceiver can address thousands of multichannel sensor nodes. The Base Station can trigger all the nodes simultaneously, and timing data is sent by the Base Station along with the trigger. This timing data is logged by the sensor nodes along with the sensor data.

Figure 6. A G-link unit.

Figure 7. MotionStar Wireless 2.

G-Link may also be wirelessly commanded to transmit data continually, at 1 kHz sweep rates, for a pre-programmed time period. The continuous, fast wireless transmission mode allows for real-time data acquisition and display from a single multichannel sensor node at a time. G-Link has two acceleration ranges, +/- 2 G and +/- 10 G, whilst its battery lifespan can be 273 hours. Furthermore, this product has a small transceiver size: 25×25×5 mm. A G-Link starter kit, including two G-Link data-logging transceivers (+/- 10 G full scale range), one basestation, and all necessary software and cables, costs about 2,000 US dollars.
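The flash and sweep-rate figures above imply a rough upper bound on unattended logging time. A back-of-the-envelope sketch, assuming 16-bit samples on three channels (the sample width is not stated in the text):

```python
FLASH_BYTES = 2 * 1024 * 1024   # 2 Mbytes of on-board flash
SWEEP_HZ = 2000                 # 2 kHz sweep rate
CHANNELS = 3                    # triaxial accelerometer
BYTES_PER_SAMPLE = 2            # assumed 16-bit samples (not given in the report)

bytes_per_second = SWEEP_HZ * CHANNELS * BYTES_PER_SAMPLE
seconds = FLASH_BYTES / bytes_per_second
print(f"approx. logging time at full rate: {seconds:.0f} s (~{seconds / 60:.1f} min)")
```

Under these assumptions the flash fills after roughly three minutes of full-rate logging, which suggests why the node also supports wireless offload of previously logged data.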
As a wireless sensor, the 3DM-G combines angular rate gyros with three orthogonal DC accelerometers and three orthogonal magnetometers to output its orientation. This product can be operated over the full 360 degrees of angular motion on all three axes, with a +/- 300 degrees/sec angular velocity range, 0.1 degrees repeatability and +/- 5 degrees accuracy. A gyro-enhanced 3-axis orientation system starter kit, the 3DM-G-485-SK, consisting of one 3DM-G-485-M orientation module, one 3DM-G-485-CBL-PWR communication cable and power supply, the 3DM-G Software Suite for Win 95/98/2000/XP and a user manual, costs approximately 1,500 US dollars.

3.3 MotionStar

MotionStar is a magnetic motion capture system produced by the Ascension Technology Corporation in the USA. This system applies DC magnetic tracking technologies, which are significantly less susceptible to metallic distortion than AC electromagnetic tracking technologies. It provides real-time data output, capturing significant amounts of motion data in short order. Regardless of the number of sensors tracked, one can get up to 120 measurements per sensor per second. This system achieves six degree-of-freedom measurements, where each sensor calculates both position (x, y, z) and orientation (azimuth, elevation, roll) for full 360 degrees coverage without the "line of sight" blocking problems of optical systems. There are 6 data points sampled by each sensor, so fewer sensors are demanded. The communication between the console and the sensors is wireless.

MotionStar Wireless 2 (Figure 7) is a magnetic tracker for capturing the motion of one or more performers. Data is sent via a wireless communications link to a base-station. It offers good performance: (1) translation range: +/- 3.05 m; (2) angular range: all attitude, +/- 180 deg for azimuth and roll, +/- 90 deg for elevation; (3) static resolution (position): 0.08 cm at 1.52 m range; (4) static resolution (orientation): 0.1 deg RMS at 1.52 m range. Unfortunately, the communication range is only 12 feet (radius). A vital drawback is that this system with six sensors costs around 56,000 US dollars.

3.4 InterSense

InterSense has an updated product, the IS-300 Pro Precision Motion Tracker, shown in Figure 8.

Figure 8. InterSense IS-300 Pro.

This system virtually eliminated the jitter common to other systems. It features update rates of up to 500 Hz and a steady response in metal-cluttered environments. The signal processor was small enough to wear on a belt for tetherless application. Furthermore, this system was the only one which predicted motion up to 50 ms ahead and compensated for graphics rendering delays, further contributing to eliminating simulator lag. Therefore, it has been used successfully to implement feed-forward motion prediction strategies.
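The feed-forward prediction idea can be sketched as a minimal constant-rate extrapolation over the rendering delay; this is an illustrative assumption, not the IS-300's published algorithm:

```python
def predict_orientation(angle_deg, rate_deg_per_s, lookahead_s=0.050):
    """Extrapolate an orientation angle `lookahead_s` seconds ahead,
    assuming the angular velocity stays constant over the horizon."""
    return angle_deg + rate_deg_per_s * lookahead_s

# A head turning at 90 deg/s, currently at 10 deg: where will it be
# when a frame rendered now is displayed 50 ms later?
predicted = predict_orientation(10.0, 90.0)
print(predicted)  # 14.5
```

Feeding the predicted rather than the measured pose to the renderer is what lets such a tracker mask graphics-pipeline latency.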
3.5 Polhemus

Polhemus is the number one global provider of 3-D position/orientation tracking systems, digitizing technology solutions, eye-tracking systems and handheld three-dimensional scanners. The company was founded in 1969 by Bill Polhemus in Grand Rapids, MI. In early 1971 Polhemus moved to the Burlington area. Polhemus has provided several novel, fast and easy digital tracking systems: LIBERTY, FASTRAK and PATRIOT.

3.5.1 LIBERTY

This was the forerunner in electromagnetic tracking technology (Figure 9). LIBERTY computed at an extraordinary rate of 240 updates per second per sensor, with the ability to be upgraded from four sensor channels to eight by the addition of a single circuit board. Also, it had a latency of 3.5 milliseconds, a resolution of 0.00015 in (0.038 mm) at 12 in (30 cm) range, and a 0.0012 degree orientation resolution. The system provided an easy, intuitive user interface. Application uses were boundless, from biomechanical and sports analysis to virtual reality.

Figure 9. Illustration of LIBERTY by Polhemus.

3.5.2 FASTRAK

FASTRAK was a solution for accurately computing position and orientation through space (Figure 10). With real-time, six-degree-of-freedom tracking and virtually no latency, this award-winning system was ideal for head, hand and instrument tracking, as well as biomedical motion and limb rotation, graphic and cursor control, stereotaxic localization, telerobotics, digitizing, and pointing.

Figure 10. Illustration of FASTRAK by Polhemus.

3.5.3 PATRIOT

PATRIOT was a cost-effective solution for six-degree-of-freedom tracking and 3-D digitizing. A good answer for the position/orientation sensing requirements of 3-D applications and environments where cost is a primary concern, it was ideal for head tracking, biomechanical analysis, computer graphics, cursor control, and stereotaxic localization. See Figure 11.

Figure 11. Illustration of PATRIOT by Polhemus.

3.6 HASDMS-I

Figure 12. HASDMS-I from Human Performance Measurement, Inc.

Human Performance Measurement, Inc. provided the HASDMS-I Human Activity State Detection and Monitoring System. The Model HASDMS-I is a system designed to detect and log
selected human activity states over prolonged periods (up to 7 days). It consists of a Sensing and Logging Unit (SLU) and Windows-based Host Software that runs on a user-supplied PC. The system is based on the observation that while humans engage in activities which are often quite complex dynamically and kinematically, there are distinct patterns that lead us to identify these activities with specific words such as standing, walking, etc. Such words are referred to as "activity states".

The HASDMS-I was designed to provide the greatest activity discrimination with the smallest possible sensor array (i.e., one sensing site on the body). The SLU is a compact, battery-powered instrument with special sensors and a microprocessor that is mounted to the lateral aspect of the monitored subject's thigh. It detects four unique activity states: (1) lying-sitting (grouped), (2) standing, (3) walking, and (4) running. A fifth state ("unknown") is also provided to discriminate unusual patterns from those which the system is designed to detect.

The SLU is first connected to a Host PC (via a simple serial port connection) for initialization and start-up. It is then attached to a subject for an unsupervised monitoring session. When the session is complete, the SLU is again connected to the Host PC and the logged data is uploaded to the host software for databasing, display, and analysis. Several standard activity summaries are provided, including (1) the percent time spent in different states and (2) the total amount of time (hours, minutes, seconds) spent in different states. In addition, an Activity State History Graph depicts the type and duration of each state in a time-sequenced, scrollable window. Activity state data can be printed or exported in the form of an ASCII text file for any other user-specified analyses. The HASDMS-I system is shown in Figure 12.

3.7 Glove-based analysis

Since the late 1970s people have studied glove-based devices for the analysis of hand gestures. Glove-based devices adopt sensors attached to a glove that transduce finger flexion and abduction into electrical signals to determine the hand pose (Figure 13).

Figure 13. Illustration of a glove-based prototype (image courtesy of KITTY TECH).

The Dataglove (originally developed by VPL Research) was a neoprene fabric glove with two fiber optic loops on each finger. Each loop was dedicated to one knuckle, and this can be a problem: if a user has extra large or small hands, the loops will not correspond very well to the actual knuckle positions and the user will not be able to produce very accurate gestures. At one end of each loop is an LED and at the other end is a photosensor. The fiber optic cable has small cuts along its length. When the user bends a finger, light escapes from the fiber optic cable through these cuts. The amount of light reaching the photosensor is measured and converted into a measure of how much the finger is bent. The Dataglove requires recalibration for each user.

The CyberGlove system included one CyberGlove, an instrumentation unit, a serial cable to connect to the host computer, and an executable version of the VirtualHand graphic hand model display and calibration software. Many applications require measurement of the position and orientation of the forearm in space. To accomplish this, mounting provisions for Polhemus and Ascension six degree-of-freedom tracking sensors are available for the glove wristband. Tracking sensors are not included in the basic CyberGlove system. The CyberGlove had a software-programmable switch and LED on the wristband to permit the system software developer to provide the CyberGlove wearer with additional input/output capability. The instrumentation unit provided a variety of convenient functions and features, including time-stamp, CyberGlove status, external sampling synchronization and analog sensor outputs.

Based on the design of the DataGlove, the PowerGlove was developed by Abrams-Gentile Entertainment (AGE Inc.) for Mattel through a licensing agreement with VPL Research. The PowerGlove consists of a sturdy Lycra glove with flat plastic strain gauge fibers, coated with conductive ink, running up each finger; these measure the change in resistance during bending to measure the degree of flex for the finger as a whole. It employs an ultrasonic system (on the back of the glove) to track the roll of the hand (reported in one of twelve possible roll positions); the ultrasonic transmitters must be oriented toward the microphones to get an accurate reading, and pitching or yawing the hand changes the orientation of the transmitters so that the signal is lost by the microphones; a poor tracking mechanism (4D: x, y, z, roll).

Similar technologies can be found in the 5DT DataGlove, PINCH Gloves, and Hand Master.

3.8 Non-commercial systems

The commercial systems described earlier accommodate stable and consistent technologies. Nevertheless, they are sold at high prices. This extremely limits the application of these systems in the community. As a result, people intend to propose affordable, compact and friendly systems instead. In this context, an example is given as follows.

Dukes developed a compact system which comprised two parts: an embedded hand unit that encapsulated the necessary hardware for capturing human arm movement, and a software interface implemented in a computer terminal for displaying the collected data.

Within the embedded hand unit, a microcontroller gathered data from two accelerometers. The collected data was then transmitted to the computer terminal for the purpose of analysis. The software interface was implemented to collect data from the embedded hand unit. The data was presented to the user both statically and dynamically in the form of a three-dimensional animation. The whole system successfully captured human movement. Moreover, the transference of data from the hand unit to the terminal was consistently achieved in an asynchronous mode. In the computer terminal, the collected data was clearly illustrated, representing the continuous sampling of the FM transmitter and receiver modules, as demonstrated in Figure 14.

However, the animation shown in the terminal failed to correctly display human movement in terms of distance travelled and speed of movement. This is due to the direct output of the data from the accelerometers without any calculation with respect to distance. To perform a correct demonstration, this data needs to be resampled and further processed in the terminal based on the travelled distance and its corresponding time.

Figure 14. Demo of Dukes's approach: (a) collected data on x-axis, and (b) collected data on y-axis.
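The resampling-and-processing step suggested above amounts to integrating acceleration twice over time. A minimal sketch using the trapezoidal rule, assuming bias-free, gravity-compensated samples starting from rest:

```python
def displacement_from_acceleration(accel, dt):
    """Integrate acceleration samples (m/s^2) twice with the
    trapezoidal rule to obtain displacement (m). Assumes the
    sensor starts at rest and samples are gravity-compensated."""
    velocity, position = 0.0, 0.0
    for a_prev, a_next in zip(accel, accel[1:]):
        v_prev = velocity
        velocity += 0.5 * (a_prev + a_next) * dt    # acceleration -> velocity
        position += 0.5 * (v_prev + velocity) * dt  # velocity -> displacement
    return position

# 1 m/s^2 constant acceleration for 1 s, sampled at 100 Hz;
# the exact answer is x = 0.5 * a * t^2 = 0.5 m.
print(displacement_from_acceleration([1.0] * 101, 0.01))
```

In practice, accelerometer bias makes naive double integration drift quadratically with time, which is one reason the raw-output animation above could not represent distance directly.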
3.9 Other techniques

Acoustic systems collect information by transmitting and sensing sound waves, where the flight duration of a brief ultrasonic pulse is timed and calculated. These systems are being used in medical applications, but have not been used in motion tracking. This is due to the inherent drawbacks of ultrasonic systems: (1) the efficiency of an acoustic transducer is proportional to its active surface area, so large devices are demanded; (2) to improve the detected range, the frequency of the ultrasonic waves must be low (e.g. 10 Hz), but this affects system latency in continuous measurement; (3) acoustic systems require a line of sight between the emitters and the receivers.

Radio and microwaves are normally used in navigation systems and airport landing aids, although they have no application in human motion tracking. Electromagnetic wave-based tracking approaches can provide range information by calculating the radiated energy, which dissipates with radius r as 1/r^2. For example, using a delay-locked loop (DLL), the Global Positioning System (GPS) can achieve a resolution of 1 meter. Obviously, this is not enough for human motion, which usually involves displacements of 40-50 cm per second. The only radio frequency-based precision motion tracker with a surprisingly good resolution of a few millimeters used large racks of microwave equipment and was demonstrated in an empty room. That is to say, a hybrid system has the potential to obtain higher resolution, but it incurs integration difficulties.

4 Vision based tracking systems with markers

In 1973 Johansson performed his famous Moving Light Display (MLD) psychological experiment to perceive biological motion. He attached small reflective markers to the joints of human subjects, which allowed these markers to be monitored during trajectories. This experiment became a milestone of human movement tracking. Although Johansson's work established a solid theory for human movement tracking, it still faces challenges such as errors, non-robustness and expensive computation due to environmental constraints, mutual occlusion and complicated processing. However, tracking systems with markers minimize the uncertainty of subject movements due to the unique appearance of the markers. Consequently, plenty of marker-based tracking systems are nowadays available on the market. Study of these systems allows their advantages to be exploited in a further developed platform.

4.1 Qualisys

A Qualisys motion capture system, depicted in Figure 15, consists of 1 to 16 cameras, each emitting a beam of infrared light. Small reflective markers are placed on the object or person to be measured. The cameras flash infrared light and the markers reflect it back to the cameras. Each camera then measures a 2-dimensional position of the reflective target; by combining the 2-D data from several cameras, a 3-D position is calculated. The data can be analyzed in Qualisys Motion Manager
(QMM) or exported in several external formats.

This system can be combined with Visual3D, an advanced analysis package for managing and reporting optical 3-D data, to track each segment of the model. The pose (position and orientation) of each segment is determined by 3 or more non-collinear points attached to the segment. For better accuracy, a cluster of targets can be rigidly attached to a shell, which prevents the targets from moving relative to each other; this shell is then affixed to the segment. The kinematics model is calculated by determining the transformation from the recorded tracking targets to a calibration pose.

Figure 15. An operating Qualisys system.

4.2 VICON

VICON, a 3-D optical tracking system, was specifically designed for use in virtual and immersive environments. By combining Vicon's high-speed, high-resolution cameras with new automated Tracker software, the system delivers immediate and precise manipulation of graphics for first-person immersive environments for military, automotive, and aerospace visualizations. Precise, low-latency and jitter-free motion tracking, though key to creating a realistic sense of immersion in visualizations and simulations, has not been possible previously due to lag, inaccuracy, unpredictability and unresponsiveness in electromagnetic, inertial and ultrasonic tracking options. The VICON Tracker, which offers wireless, extremely low-latency performance with six degrees of freedom (DOF) and zero environmental interference, outclasses these obsolescent systems, yet is the simplest to set up and calibrate. Targets are tracked by proprietary CMOS VICON cameras ranging in resolution from 640x480 to 1280x1024 and operating between 200-1000 Hz. The entire range of cameras is designed, developed and built specifically for motion tracking.

At the heart of the system, the VICON Tracker software automatically calculates the center of every marker, reconstructs its 3-D position, identifies each marker and object, and outputs 6 DOF information, typically in less than 7 milliseconds. The strength of the Vicon Tracker software lies in its automation. The very first Tracker installation requires about an hour of system set-up; each following session requires only that the PC running the software be switched on. Objects with three or more markers will automatically output motion data that can be applied to 3-D objects in real time and integrated into a variety of immersive 3-D visualization applications, including EDS Jack, Dassault Delmia, VRCOM, Fakespace, VRCO Track D and others. Figure 16 shows reflective markers within a real-time VICON system applied to two subjects.

Figure 16. Reflective markers used in a real-time VICON system.
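The reconstruction step shared by systems such as Qualisys and VICON, computing a 3-D marker position from 2-D detections in several calibrated cameras, can be sketched generically as linear (DLT) triangulation. The vendors' algorithms are proprietary; this is an illustrative sketch assuming the 3x4 projection matrices are known from calibration:

```python
import numpy as np

def triangulate(points_2d, proj_mats):
    """Linear (DLT) triangulation of one 3-D point from n >= 2 views.

    points_2d : list of (u, v) image coordinates, one per camera
    proj_mats : list of 3x4 camera projection matrices P = K [R | t]
    """
    rows = []
    for (u, v), P in zip(points_2d, proj_mats):
        # each view contributes two linear constraints on the homogeneous point
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.vstack(rows)
    # least-squares solution: right singular vector of the smallest singular value
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]          # de-homogenize

# two toy cameras observing the point (0.1, 0.2, 3.0)
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])               # camera at the origin
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0], [0]])])   # offset 1 m along x
X_true = np.array([0.1, 0.2, 3.0, 1.0])
u1 = P1 @ X_true; u1 = u1[:2] / u1[2]
u2 = P2 @ X_true; u2 = u2[:2] / u2[2]
print(triangulate([u1, u2], [P1, P2]))   # ~ [0.1, 0.2, 3.0]
```

With real data the detections are noisy and more than two views are combined, which is why the least-squares SVD solution is used rather than an exact ray intersection.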
Figure 17. CODA system.

4.3 CODA

CODA is an acronym of Cartesian Opto-electronic Dynamic Anthropometer, a name first coined in 1974 to give a working title to an early research instrument developed at Loughborough University, United Kingdom by David Mitchelson and funded by the UK Science Research Council; the system is illustrated in Figure 17.

The system was pre-calibrated for 3-D measurement, which means that the lightweight sensor can be set up at a new location in a matter of minutes, without the need to recalibrate using a space-frame. Up to six sensor units can be used together and placed around a capture volume to give extra sets of eyes and maximum redundancy of viewpoint. This enables the Codamotion system to track 360-degree movements, which often occur in animation and sports applications. The active markers were always intrinsically identified by virtue of their position in a time-multiplexed sequence. Confused or swapped trajectories can never happen with the Codamotion system, no matter how many markers are used or how close they are to each other.

The calculation of the 3-D coordinates of markers was done in real time with an extremely low delay of 5 milliseconds. Special versions of the system were available with latency shorter than 1 millisecond. This opens up many applications that require real-time feedback, such as research in neuro-physiology and high-quality virtual reality systems, as well as tightly coupled real-time animation. It was also possible to trigger external equipment using the real-time Codamotion data. At a three metre distance, the system's accuracy is as follows: +/-1.5 mm in the X and Z axes and +/-2.5 mm in the Y axis for peak-to-peak deviations from the actual position.

Figure 18. ReActor2 system.

4.4 ReActor2

As products of Ascension Tech. Corporation, the ReActor2 digital active-optical tracking systems shown in Figure 18 capture the movements of an untethered performer, free to move in a capture area bordered by modular bars that fasten together. The digital detectors embedded in the frame provide full coverage of performers while minimizing blocked markers. Instant Marker Recognition instantly reacquires blocked markers for clean data. This means less post-processing and a more efficient motion capture pipeline.

Up to 544 new and improved digital detectors embedded in a 12-bar frame and over 800 active LEDs flashing per measurement cycle provide complete tracking coverage. A sturdy, ruggedized frame eliminates repetitive camera calibration and tripod alignment headaches: set up the system once and start capturing data immediately.

4.5 ELITE Biomech

ELITE Biomech from BTS of Italy is based on the latest generation of ELITE systems:
ELITE2002. ELITE2002 performs a highly accurate reconstruction of any type of movement on the basis of the principle of shape recognition of passive markers.

3-D reconstruction and tracking of markers starting from pre-defined models of protocols are widely validated by the international scientific community. Tracking of markers based on the principle of shape recognition allows the use of the system in extreme lighting conditions. This system is capable of managing up to 4 force platforms of various brands, and up to 32 electromyographic channels. It also runs real-time recognition of markers with on-monitor display during the acquisition, and real-time processing of kinematic and analog data, demonstrated in Figure 19.

Figure 19. Demo of ELITE Biomech's outcomes.

4.6 APAS

The Ariel Performance Analysis System (APAS)  is the premier product designed, manufactured, and marketed by Ariel Dynamics, Inc. It is an advanced video-based system operating in the Windows 95/98/NT/2000 environments. Specific points of interest are digitized with user intervention or automatically using contrasting markers. Additionally, analog data (i.e. force platforms, EMG, goniometers, etc.) can be collected and synchronized with the kinematic data. Although the system has primarily been used for quantification of human activities, it has also been utilized in many industrial, non-human applications. Optional software modules include real-time 3D (6 degrees of freedom) rendering capabilities and full gait pattern analysis utilizing all industry-standard marker sets.

Figure 20. The Polaris system.

4.7 Polaris

The Polaris system (Northern Digital Inc.)  offers real-time tracking flexibility for comprehensive purposes, including academic and industrial environments. This system optimally combines simultaneous tracking of both wired and wireless tools (Figure 20).

The whole system can be divided into two parts: the position sensors and the passive or active markers. The former consist of a couple of cameras that are only sensitive to infrared light. This design is particularly useful when the background lighting is varying and unpredictable. Passive markers are covered by reflective materials, which are activated by the arrays of infrared light-emitting diodes surrounding the position sensor lenses. In the meantime, active markers can emit infrared light themselves. The Polaris system is able to provide 6-degrees-of-freedom motion information. With proper calibration, this system may achieve 0.35 mm RMS accuracy in position measures. A basic Polaris with a software development kit (SDK) costs about $2000.

However, similar to other marker-based techniques, the Polaris system cannot sort out the occlusion problem due to the line-of-sight requirement. Adding extra position sensors possibly mitigates the trouble but also increases computa-
tional cost and operational complexity.

4.8 Others

Other commercial marker-based systems are given in , , .

By combining the commercial marker-based systems, people have developed some hybrid techniques for human motion tracking. These systems, although still in the experimental stage, already demonstrate encouraging performance. For example, Tao and Hu  built a visual tracking system which exploited both marker-based and marker-free tracking methods. The proposed system consisted of three parts: a patient, video cameras and a PC. The patient's motion was filmed by the video cameras and the captured image sequences were input to the PC. The software on the PC comprised three modules: a motion tracking module, a database module and a decision module. The motion tracking module was formulated in an analysis-by-synthesis framework, similar to the strategy introduced by O'Rourke and Badler . In order to enhance the prediction component, a marker-based motion learning method was adopted: small retro-reflective ball markers were attached to the performer's joints, which reflected infrared light so that the cameras picked up bright points indicating the markers' positions. The 3-D position of each marker was calculated by corresponding 2-D marker points across image planes via the epipolar constraint. The skeleton motion of the performer was then deduced . By using the perspective camera models, the 3-D model recovered previously was projected into 2-D image planes, which behaved as the prediction of the matching framework. The performance of this approach is demonstrated in Figure 21.

Figure 21. Demo of Tao and Hu's approach: (a) markers attached to the joints; (b), (c) and (d) marker points captured from three cameras.

5 Vision based tracking systems without markers

In the previous section, we described the features of marker-based tracking systems, which are restrictive to some degree due to the mounted markers. As a less restrictive motion capture technique, markerless systems are capable of overcoming the mutual occlusion problem, as they are only concerned with boundaries or features on human bodies. This has been an active and promising but also challenging research area over the last decade. Research in this area is still ongoing due to unsolved technical problems.

From a reviewer's point of view, Aggarwal and Cai  classified human motion analysis by: body structure analysis (model and non-model based), camera configuration (single and multiple), and correlation platform (state-space and template matching). Gavrila  claimed that the dimensionality of the tracking space, e.g. 2-D or 3-D, should be the main focus. To be consistent with these existing definitions, we cover all of these issues in this context.

5.1 2-D approaches

As a commonly used framework, 2-D motion tracking only concerns the human movement in an image plane, although sometimes people intend to project a 3-D structure into its image plane for processing purposes. This approach can be catalogued with and without explicit shape models.
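The projection of a 3-D structure into an image plane mentioned above follows the standard pinhole model x = K [R | t] X. A minimal sketch, with an assumed focal length and principal point rather than values from any cited system, is:

```python
import numpy as np

def project(points_3d, K, R, t):
    """Pinhole projection of n x 3 world points into image coordinates."""
    cam = points_3d @ R.T + t          # world -> camera coordinates
    uvw = cam @ K.T                    # apply intrinsics
    return uvw[:, :2] / uvw[:, 2:]     # perspective divide

# assumed intrinsics: 800-pixel focal length, principal point at (320, 240)
K = np.array([[800.0, 0, 320], [0, 800, 240], [0, 0, 1]])
R, t = np.eye(3), np.array([0.0, 0.0, 0.0])   # camera at the world origin

limb = np.array([[0.0, 0.0, 4.0],    # e.g. two joints of a 3-D body model
                 [0.5, -0.3, 4.0]])
print(project(limb, K, R, t))   # first point maps to the principal point (320, 240)
```

This is exactly the operation that lets a 3-D model act as a prediction in an image plane: project the model, then compare the projected points with the extracted image features.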
Figure 22. Demonstration of Pfinder by Wren, et al.

5.1.1 2-D approaches with explicit shape models

Due to the arbitrary movements of humans, self-occlusion exists along human trajectories. To sort out this problem, one normally uses a priori knowledge about human movements in 2-D by segmenting the human body. For example, Wren et al.  presented a region-based approach, where they regarded the human body as a set of "blobs" which can be described by spatial and color Gaussian distributions. To initialize the process, a foreground region can be extracted given the background model. The blobs, representing human hands, head, etc., are then placed over the foreground region. A 2-D contour shape analysis was undertaken to identify various body parts. The working flowchart is shown in Figure 22.

Akita  explored an approach to segment and track human body parts in common circumstances. To prevent the body tracking from collapsing, he presumed the human movements to be known a priori in some kind of "key frames". He followed a fixed tracking order, legs, head, arms, and trunk, to detect the body parts. However, due to this simplification his model only works in some special situations.

Long and Yang  advocated that the limbs of a human silhouette could be tracked based on the shapes of antiparallel lines. They also conducted experimental work to cope with occlusion, i.e. disappearance, merging and splitting. Kurakake and Nevatia  attempted to obtain joint locations in images of walking humans by establishing correspondence between extracted ribbons. Their work assumed small motion between two consecutive frames, and feature correspondence was conducted using various geometric constraints.

Figure 24. Computer game on-chip by Freeman, W. et al.

Shimada et al.  suggested achieving rapid and precise estimation of human hand postures by combining 2-D appearance and 3-D model-based fitting. First, a rough posture estimate was obtained by image indexing. Each possible hand appearance generated from a given 3-D shape model was labeled by an index obtained by PCA compression and registered with its 3-D model parameters in advance. By retrieving the index of the input image, the method obtained the matched appearance image and its 3-D parameters rapidly. Then, starting from the obtained rough estimate, it estimated the posture and moreover refined the given initial 3-D model by model-fitting.
Figure 23. Human tracking in the approach of Baumberg and Hogg.
5.1.2 2-D approaches without explicit shape models

This is a more often addressed topic. Since human movements are non-rigid and arbitrary, the boundaries or silhouettes of the human body are variable and deformable, making them difficult to describe. Tracking human body parts, e.g. hands, is normally achieved by means of background subtraction or color detection. Furthermore, due to the unavailability of models, one has to resort to low-level image processing such as feature extraction.

Baumberg and Hogg  considered using an Active Shape Model (ASM) for tracking pedestrians (Figure 23). B-splines were used to represent different shapes. The foreground region was first extracted by subtracting the background. A Kalman filter was then applied to accomplish the spatio-temporal operation, which is similar to the work of Blake et al . Their work was then extended by automatically generating an improved physically based model using a training set of examples of the object deforming, tuning the elastic properties of the object to reflect how the object actually deforms. The resulting model provides a low-dimensional shape description that allows accurate temporal extrapolation at low computational cost based on the training motions .

Freeman et al.  developed a special detector for computer games on-chip (Figure 24), which infers useful information about the position, size, orientation, or configuration of human body parts. Two algorithms were used: one used image moments to calculate an equivalent rectangle for the current image, and the other used orientation histograms to select the body pose from a menu of templates.

Cordea et al.  discussed a 2.5-dimensional tracking method allowing real-time recovery of the 3-D position and orientation of a head moving in its image plane. This method used a 2-D elliptical head model, region- and edge-based matching algorithms, and a linear Kalman filter estimator. The tracking system worked in a realistic situation without makeup on the face, with an uncalibrated camera, and with unknown lighting conditions and background.

Fablet and Black  proposed a solution for the automatic detection and tracking of human motion using 2-D optical flow information, which provided rich descriptive cues while being independent of object and background appearance. To represent the optical flow patterns of people from arbitrary viewpoints, they developed a novel representation of human motion using low-dimensional spatio-temporal models that were learned using motion capture data of human subjects. In addition to human motion (the foreground) they modelled the motion of generic scenes (the background); these statistical models were defined as Gibbsian fields specified from the first-order derivatives of motion observations. Detection and tracking were posed in a principled Bayesian framework which involved the computation of a posterior probability distribution over the model parameters. A particle filter was then used to represent and predict this non-Gaussian posterior distribution over time. The model parameters of samples from this distribution were related to the pose parameters of a 3-D articulated
5.2 3-D approaches

These approaches attempted to recover 3-D articulated poses over time . People usually project a 3-D model into a 2-D image for substantial processing. This is due to the application of image appearance and dimensional reduction.

5.2.1 3-D modelling

Modelling human movements a priori allows the tracking problem to be simplified: the future movements of the human body can be predicted regardless of self-occlusion or self-collision. O'Rourke and Badler  discovered that the prediction in state space seemed more stable than that in image space, due to the semantic knowledge incorporated in the former. In their tracking framework, four components were included: prediction, synthesis, image analysis, and state estimation. This strategy has been applied to most of the existing tracking systems.

Model-based approaches contain stick figures, volumetric models and mixtures of the two.

5.2.2 Stick figure

The stick figure is a representation of the skeletal structure, which is normally regarded as a collection of segments and joint angles (Figure 25). Bharatkumar et al.  used stick figures to model the lower limbs, e.g. hip, knees, and ankles. They applied a medial-axis transformation to extract 2-D stick figures of the lower limbs.

Figure 25. Stick figure of human body (image courtesy of Freeman, W.T.).

Chen and Lee  first applied geometric projection theory to obtain a set of feasible postures from a single image, then made use of the given dimensions of the human stick figure and physiological and motion-specific knowledge to constrain the feasible postures in both the single-frame analysis and the multi-frame analysis. Finally a unique gait interpretation was selected by an optimization algorithm.

Huber's human model  was a refined version of the stick figure representation. Joints were connected by line segments with a certain degree of constraint that could be relaxed using "virtual springs". This model behaved as a mass-spring-damper system. Proximity space (PS) was used to confine the motion and stereo measurements of joints, which started from the human head and extended to the arms and torso through the expansion of PS.

By modelling a human body with 14 joints and 15 body parts, Ronfard et al.  attempted to find people in static video frames using learned models of both the appearance of body parts (head, limbs, hands) and of the geometry of their assemblies. They built on Forsyth and Fleck's general 'body plan' methodology and Felzenszwalb and Huttenlocher's dynamic programming approach for efficiently assembling candidate parts into 'pictorial structures'. However, they replaced the rather simple part detectors used in these works with dedicated detectors learned for each body part using Support Vector Machines (SVMs) or Relevance Vector Machines (RVMs). RVMs are SVM-like classifiers that offer a well-founded probabilistic interpretation and improved sparsity for reduced computation. Their benefits were demonstrated experimentally in a series of results showing great promise for learning detectors in more general situations.

Further technical reports are given in , , , , .
5.2.3 Volumetric modeling

Elliptical cylinders are one of the volumetric models used to represent the human body. Hogg  and Rohr  extended the work of Marr and Nishihara , which used elliptical cylinders for representing the human body. Each cylinder consisted of three parameters: the length of the axis, and the major and minor axes of the ellipse cross-section. The coordinate system originated from the center of the torso. The difference between the two approaches is that Rohr used eigenvector line fitting to project the 2-D image onto the 3-D human model.

Rehg et al.  represented two occluded fingers using several cylinders, and the center axes of the cylinders were projected into the center line segments of 2-D finger images. Goncalves et al.  modelled both the upper and lower arm as truncated circular cones, and the shoulder and elbow joints were presumed to be spherical joints. A 3-D arm model was projected to an image plane and then fitted to the blurred image of a real arm. The matching was achieved by minimizing the error between the model projection and the real image, adapting the size and the orientation of the model.

Chung and Ohnishi  proposed a 3-D model-based motion analysis which used cue circles (CC) and cue spheres (CS). Stereo matching for reconstructing the body model was performed by finding pairs of CC between the pair of contour images investigated. A CS needed to be projected back onto two image planes with its corresponding CC.

Theobalt et al.  suggested combining efficient real-time optical feature tracking with the reconstruction of the volume of a moving subject to fit a sophisticated humanoid skeleton to the video footage. The scene was observed with 4 video cameras, two connected to one PC (Athlon 1 GHz). The system consisted of two parts: a distributed tracking and visual hull reconstruction system (the online component), and a skeleton fitting application that took recorded sequences as input. For each view, a moving person was separated from the background by a statistical background subtraction. In the initial frame, the silhouette of the person seen from the 2 front-view cameras was separated into distinct regions using a Generalized Voronoi Diagram Decomposition. The locations of its hands, head and feet could then be identified. In the front camera view, for all video frames after initialization, the locations of these body parts could be tracked and their 3-D locations reconstructed. In addition, a voxel-based approximation to the visual hull was computed for each time step. Experimental volumetric data is shown in Figure 26.

Figure 26. Volumetric modelling by Theobalt, C.

5.3 Camera configuration

The tracking problem can be tackled by proper camera setup. The literature covers both single-camera and distributed-camera configurations. Using multiple cameras requires a common spatial reference to be employed, whereas a single camera does not have such a requirement. However, a single camera from time to time suffers from occlusion of the human body due to its fixed viewing angle. Thus, a distributed-camera strategy is a better option for minimizing such a risk.
5.3.1 Single camera tracking

Polana and Nelson  observed that the movements of the arms and legs converge to that of the torso. Each walking-person image was bounded by a rectangular box, and the centroid of the bounding box was treated as the feature to track. Positions of the center point in the previous frames were used to estimate the current position. As such, correct tracking was maintained even when two subjects occluded each other in the middle of the image sequences.

Sminchisescu and Triggs  present a method for recovering 3-D human body motion from monocular video sequences using robust image matching, joint limits and non-self-intersection constraints, and a new sample-and-refine search strategy guided by rescaled cost-function covariances. Monocular 3-D body tracking is challenging: for reliable tracking at least 30 joint parameters need to be estimated, subject to highly nonlinear physical constraints; the problem is chronically ill-conditioned, as about 1/3 of the d.o.f. (the depth-related ones) are almost unobservable in any given monocular image; and matching an imperfect, highly flexible, self-occluding model to cluttered image features is intrinsically hard. To reduce correspondence ambiguities they used a carefully designed robust matching-cost metric that combined robust optical flow, edge energy, and motion boundaries. Even so, the ambiguity, nonlinearity and non-observability made the parameter-space cost surface multi-modal, unpredictable and ill-conditioned, so minimizing it is difficult. They discussed the limitations of CONDENSATION-like samplers, and introduced a novel hybrid search algorithm that combined inflated-covariance-scaled sampling and continuous optimization subject to physical constraints. Experiments on some challenging monocular sequences showed that robust cost modelling, joint and self-intersection constraints, and informed sampling were all essential for reliable monocular 3-D body tracking.

Bowden et al.  advocated a model-based approach to human body tracking in which the 2-D silhouette of a moving human and the corresponding 3-D skeletal structure were encapsulated within a non-linear Point Distribution Model. This statistical model allowed a direct mapping to be achieved between the external boundary of a human and the anatomical position. It showed that this information, along with the position of landmark features, e.g. hands and head, could be used to reconstruct information about the pose and structure of the human body from a monoscopic view of a scene.

Barron and Kakadiaris  present a simple, efficient, and robust method for recovering 3-D human motion capture from an image sequence obtained using an uncalibrated camera. The proposed algorithm included an anthropometry initialization step, assuming that the similarity of appearance of the subject over the time of acquisition led to the minimum of a convex function on the degrees of freedom of a Virtual Human Model (VHM). The method searched for the best pose in each image by minimizing discrepancies between the image under consideration and a synthetic image of an appropriate VHM. By including in the objective function penalty factors from the image segmentation step, the search focused on regions that belong to the subject. These penalty factors converted the objective function to a convex function, which guaranteed that the minimization converged to a global minimum.

To reduce the side-effects of hard kinematic constraints, Dockstader et al.  proposed a new model-based approach toward three-dimensional (3-D) tracking and extraction of gait and human motion. They suggested the use of a hierarchical, structural model of the human body that introduced the concept of soft kinematic constraints. These constraints took the form of a priori, stochastic distributions learned from previous configurations of the body exhibited during specific activities; they were used to supplement an existing motion model limited by hard kinematic constraints. Time-varying parameters of the structural model were also used to measure gait velocity, stance width, stride length, stance times, and other gait variables with multiple degrees of accuracy and robustness. To characterize tracking performance, a novel geometric model of ex-
pected tracking failures was then introduced.

Yeasin and Chaudhuri  proposed a simple, inexpensive, portable and real-time image processing system for kinematic analysis of human gait. They viewed this as a feature-based multi-target tracking problem. They tracked the artificially induced features appearing in the image sequence due to the non-impeding contrast markers attached at different anatomical landmarks of the subject under analysis. The paper described a real-time algorithm for detecting and tracking feature points simultaneously. By applying a Kalman filter, they recursively predicted tentative feature locations and retained the predicted point in case of occlusion. A path coherence score was used for disambiguation along with tracking for establishing feature correspondences. Experiments on normal and pathological subjects with different gaits were performed and the results illustrated the efficacy of the algorithm. Similar algorithms to this one can be found in  and .

Further to the application of optical flow in motion learning, Black et al.  proposed a framework for learning parameterized models of optical flow from image sequences. A class of motion is represented by a set of orthogonal basis flow fields that were computed from a training set using principal component analysis. Many complex image motion sequences can be represented by a linear combination of a small number of these basis flows. The learned motion models may be used for optical flow estimation and for model-based recognition. They described a robust, multi-resolution scheme for directly computing the parameters of the learned flow models from image derivatives. As examples they included learning motion discontinuities, non-rigid motion of humans, and articulated human motion. Later, Sidenbladh et al., also in , extended the work of  to a generative probabilistic method for tracking 3-D articulated human figures in monocular image sequences (see the example shown in Figure 27). These ideas, similar to , obtained further extension in , .

Figure 27. Human motion tracking by Black, M.J. et al.

5.3.2 Multiple camera tracking

To enlarge the monitored area and to avoid the disappearance of subjects, a distributed-camera strategy is set up to resolve the ambiguity of matching when subjects occlude each other. Cai and Aggarwal  used multiple points belonging to the medial axis of the human upper body as the features to track. These points were sparsely sampled and assumed to be independent of each other. The location and average intensity of feature points were integrated to find the most likely match between two neighboring frames. Multivariate Gaussian distributions were assumed for the class-conditional probability density function of the features of candidate subject images. It was shown that using such a system with three cameras indoors led to real-time operation.

Sato et al.  represented a moving person as a combination of blobs of its body parts. All the cameras were calibrated in advance with regard to the CAD model of an indoor environment. Blobs were corresponded using their area, brightness, and 3-D position in the world coordinates. The 3-D position of a 2-D blob was estimated on the basis of its height, retrieved from the distance between the weight center of the blob and the floor.

Ringer and Lasenby  proposed to use markers placed at the joints of the arm(s) or leg(s) being analyzed (see Figure 28). The location
Figure 28. Applications of multiple cameras in human motion tracking by Ringer and Lasenby.
of these markers on a camera’s image plane pro- ing people in multiple uncalibrated cameras. The
vided the input to the tracking systems with the system was able to discover spatial relationships
result that the required parameters of the body between the camera ﬁeld of views and uses this
could be estimated to far greater accuracy that one information to correspond between different per-
could obtain in the markerless case. This scheme spective views of the same person. They explored
used a number of video cameras to obtain com- the novel approach of ﬁnding limits of ﬁeld of
plete and accurate information on the 3-D loca- view of a camera as visible in other cameras. This
tion and motion of bodies over time. Based on helped disambiguate between possible candidates
the extracted kinematics and measurement mod- of correspondence.
els, the extended Kalman ﬁlter (EKF) and particle
ﬁlter tracking strategies were compared in their 5.4 Segmentation of human motion
applications to update state estimates. The results
justiﬁed that the EKF was preferred due to its less Spatio-temporal segmentation, illustrated in
computational demands. Figure 29, is vital in vision related analysis due
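The reported preference for the EKF can be illustrated with a toy example (this is our own sketch, not the model of Ringer and Lasenby): for a linear constant-velocity joint model the EKF reduces to a plain Kalman filter whose update is a few small matrix operations per frame, whereas a bootstrap particle filter must predict, weight, and resample every one of its particles at each frame.

```python
import numpy as np

rng = np.random.default_rng(0)
dt, q, r = 0.1, 0.05, 0.2                    # step, process noise, measurement noise
F = np.array([[1.0, dt], [0.0, 1.0]])        # constant-velocity joint-angle model
H = np.array([[1.0, 0.0]])                   # only the angle is observed
Q, R = q * np.eye(2), np.array([[r]])

# Simulate a noisy joint-angle trajectory.
truth = np.zeros((100, 2)); truth[0] = [0.0, 1.0]
for k in range(1, 100):
    truth[k] = F @ truth[k - 1] + rng.normal(0, np.sqrt(q), 2)
z = truth[:, 0] + rng.normal(0, np.sqrt(r), 100)

def kalman(z):
    """Constant work per frame: a handful of 2x2 matrix operations."""
    x, P, est = np.zeros(2), np.eye(2), []
    for zk in z:
        x, P = F @ x, F @ P @ F.T + Q                    # predict
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)     # gain
        x = x + K @ (np.atleast_1d(zk) - H @ x)          # update
        P = (np.eye(2) - K @ H) @ P
        est.append(x[0])
    return np.array(est)

def particle_filter(z, n=1000):
    """Work grows with n: predict, weight and resample every particle."""
    parts, est = rng.normal([0.0, 1.0], 1.0, (n, 2)), []
    for zk in z:
        parts = parts @ F.T + rng.normal(0, np.sqrt(q), (n, 2))
        w = np.exp(-0.5 * (zk - parts[:, 0]) ** 2 / r)
        w /= w.sum()
        est.append(w @ parts[:, 0])
        parts = parts[rng.choice(n, n, p=w)]
    return np.array(est)

kf_err = np.sqrt(np.mean((kalman(z) - truth[:, 0]) ** 2))
pf_err = np.sqrt(np.mean((particle_filter(z) - truth[:, 0]) ** 2))
print(f"KF RMSE {kf_err:.3f}, PF RMSE {pf_err:.3f}")
```

Both filters track the angle to comparable accuracy here, but the particle filter does roughly a thousand times more arithmetic per frame — the trade-off noted above.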
Bodor et al. introduced a method for employing image-based rendering to extend the range of use of human motion recognition systems. Input views orthogonal to the direction of motion were created automatically to construct the proper view from a combination of non-orthogonal views taken from several cameras. Image-based rendering was utilized in two ways: (1) to generate additional training sets for these systems containing a large number of non-orthogonal views, and (2) to generate orthogonal views from a combination of non-orthogonal views from several cameras.

Multiple cameras are needed to completely cover an environment for monitoring activity. To track people successfully in multiple perspective imagery, one has to establish correspondence between objects captured in multiple cameras. Javed et al. presented a system for tracking people in multiple uncalibrated cameras. The system was able to discover spatial relationships between the camera fields of view and used this information to correspond between different perspective views of the same person. They explored the novel approach of finding the limits of the field of view of a camera as visible in other cameras. This helped disambiguate between possible candidates of correspondence.

5.4 Segmentation of human motion

Spatio-temporal segmentation, illustrated in Figure 29, is vital in vision-related analysis due to the required reconstruction of dynamic scenes. Spatial segmentation attempts to extract moving objects from their background and to divide a complicated motion stream into a set of simple and stable motions. In order to fully depict human motion in constraint-free environments, people have explored a variety of motion segmentation strategies, consisting of both model-based and appearance-based approaches. Nevertheless, this does not mean that segmentation is achieved independently. Instead, motion segmentation is normally encoded within the tracking procedure and performs like an assistive tool and descriptor.

Gonzalez et al. estimated the motion flows of features on the human body using a standard tracker. Given a pair of subsequent images, an affine fundamental matrix was estimated from four pairs of corresponding feature points such that the number of other feature points undergoing the affine motion modelled by the matrix was maximized. In the remaining subsequent frames, the feature points corresponding to those used for the first fundamental matrix continued to estimate an affine motion model. At the last pair of frames, this led to a set of feature points identified as those belonging to the same limb and hence undergoing the same motion over the whole sequence. By repeating this estimate-and-sort-out process over the remaining feature points, different limb motions were finally segmented.
To obtain a high-level interpretation of human motion in a video stream, one has to first detect body parts. Hilti et al. proposed to combine both pixel-based skin color segmentation and motion-based segmentation for human motion tracking. The motivation for using skin color is its orientation invariance and fast detection. The human face normally presents a large skin surface in a flesh-tone, which is quite similar from person to person and even across various races. Using hue and saturation (HS) as inputs, a color map was converted into a filtered image, where each pixel is associated with a likelihood of being skin. To compensate for the impact of lighting changes, the motion-based segmentation was implemented and made adaptive to exogenous changes.

Bradski and Davies presented a fast and simple method using a timed motion history image (tMHI) for representing motion from gradients in successively layered silhouettes. The segmented regions were not "motion blobs" but motion regions that were naturally connected to parts of the moving objects. This was motivated by the fact that segmentation by collecting "blobs" of similar-direction motion from frame to frame in the optical flow did not guarantee the correspondence of the motion over time. By labeling motion regions connected to the current silhouette using a downward-stepping floodfill, areas of motion were directly attached to the parts of the object of interest.

Figure 29. Segmentation of the human body.

Moeslund and Granum suggested using colour information to segment the hand and head. Owing to the sensitivity of the original RGB-based colours to the intensity of the lighting, they used chromatic colours, which were normalised according to the intensity. In order to determine dance motion, human observers were shown video and 3-D motion capture sequences on a video display. Observers were asked to define gesture boundaries within each microdance, which was analyzed to compute the local minima in the force of the body. At the moment of each of these local minima, the force, momentum, and kinetic energy parameters were examined for each of the lower body segments. For each segment a binary triple was computed. This provided a complete characterization of all the body segments at each instant when body acceleration was at a local minimum.

6 Robot-guided tracking systems

In this section, one can find a rich variety of rehabilitation systems that are driven by electromechanical or electromagnetic tracking strategies. These systems, called robot-guided systems hereafter, incorporate sensor technologies to conduct "move-measure-feedback" training strategies.

6.1 Discriminating static and dynamic activities

To distinguish static and dynamic activities (standing, sitting, lying, walking, ascending stairs, descending stairs, cycling), Veltink et al. presented a new approach to monitoring
ambulatory activities for use in the domestic environment, which uses two or three uniaxial accelerometers mounted on the body. They carried out a set of experiments with respect to the static or dynamic characteristics. First, the discrimination between static and dynamic activities was studied; it was shown that static activities could be distinguished from dynamic activities. Second, the distinction between static activities was investigated: standing, sitting and lying could be distinguished by the outputs of two accelerometers, one mounted tangentially on the thigh and the other sited on the sternum. Third, the distinction between a few cyclical dynamic activities was examined.

As a result, it was concluded that the "discrimination of dynamic activities on the basis of the combined evaluation of the mean signal value and signal morphology is therefore proposed". This ruled out the standard deviation of the signal and the cycle time as the indexes for discriminating activities. The performance of the dynamic activity classification on the basis of signal morphology needs to be improved in future work.
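To make the idea concrete, here is a hypothetical sketch of such accelerometer-based discrimination. The thresholds, mounting conventions and synthetic signals are invented for illustration and are not Veltink et al.'s actual procedure: a window is called dynamic when the signal varies strongly, and the static postures are separated by the mean (inclination) values of the sternum and thigh sensors.

```python
import numpy as np

rng = np.random.default_rng(2)
WIN = 100                                     # samples per analysis window

def classify(sternum, thigh, move_thresh=0.15):
    """Label each window: dynamic if the signal varies strongly,
    otherwise a posture read off the mean (inclination) values."""
    s = sternum.reshape(-1, WIN)
    t = thigh.reshape(-1, WIN)
    labels = []
    for sm, tm, sd in zip(s.mean(1), t.mean(1),
                          np.maximum(s.std(1), t.std(1))):
        if sd > move_thresh:
            labels.append("dynamic")          # walking, stairs, cycling, ...
        elif sm > 0.5 and tm > 0.5:
            labels.append("standing")         # trunk and thigh both vertical
        elif sm > 0.5:
            labels.append("sitting")          # trunk vertical, thigh horizontal
        else:
            labels.append("lying")            # trunk horizontal
    return labels

def make(level_sternum, level_thigh, moving=False, windows=4):
    """Hypothetical 1 g-normalised accelerometer outputs."""
    n = windows * WIN
    noise = 0.3 if moving else 0.02
    return (level_sternum + rng.normal(0, noise, n),
            level_thigh + rng.normal(0, noise, n))

stand, sit = make(1.0, 1.0), make(1.0, 0.1)
lie, walk = make(0.1, 0.1), make(1.0, 1.0, moving=True)
for name, sig in [("stand", stand), ("sit", sit), ("lie", lie), ("walk", walk)]:
    print(name, classify(*sig)[0])
```

The two-sensor layout mirrors the configuration reported above: the tangential thigh sensor separates standing from sitting, while the sternum sensor separates upright from lying postures.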
The authors revealed that averaging adjacent motion cycles might reduce the standard deviations of the signal correlation so as to improve the measurement performance. As a further study, Uiterwaal et al. developed a measurement system using accelerometry to assess a patient's functional physical mobility in non-laboratory situations. Every second, the video recording was compared to the measurement from the proposed system, and the comparison demonstrated the validity of the system.

6.2 Typical working systems

6.2.1 Cozens

To justify whether motion tracking techniques can assist simple active upper limb exercises for patients recovering from neurological diseases, i.e. stroke, Cozens reported a pilot study of using torque applied to an individual joint, combined with EMG measurements that indicated the pattern of arm movement in exercises. The evidence showed that greater assistance tended to be given to patients with more limited exercise capacity. However, this work was only able to demonstrate the principle of assisting a single-limb exercise using a 2-D based technique. Therefore, a real system was expected to be developed for realistic therapeutic exercises, which may involve "three degrees of freedom at the shoulder and two degrees of freedom at the elbow".

To find out whether exercise therapy influences plasticity and recovery of the brain following a stroke, a tool is demanded to control the amount of therapy delivered to a patient, where appropriate, objectively measuring the patient's performance. In other words, a system is required to "move smoothly and rapidly to comply with the patients' actions". Furthermore, abnormally low or high muscle tone may misguide a therapy expert to apply wrong forces to achieve the desired motion of limb segments. To address these problems, a novel automatic system, named MIT-MANUS (Figure 30), was designed to move, guide, or perturb the movement of a patient's upper limb, whilst recording motion-related quantities, e.g. position, velocity, or the forces applied (Figure 31). The experimental results were so promising that commercialization of the established system was under way. However, it was noted that the biological basis of recovery and individual patients' needs should be studied further in order to improve the performance of the system in different circumstances. These findings were also justified elsewhere.

Figure 30. The MANUS in MIT.

Figure 31. Image courtesy of Krebs, H.I.

6.2.3 Taylor and improved systems

Taylor described an initial investigation where a simple two-DOF arm support was built to allow movements of the shoulder and elbow in a horizontal plane. Based on this simple device, he then suggested a five-DOF exoskeletal system to allow activities of daily living (ADL) to be performed in a natural way. The design was validated by tests which showed that the "configuration interfaces properly with the human arm", resulting in the trivial addition of goniometric measurement sensors for the identification of arm position and pose.

Another good example was a device designed to assist elbow movements. This elbow exerciser was strapped to a lever, which rotated in a horizontal plane. A servomotor driven through a current amplifier was applied to drive the lever, and a potentiometer indicated the position of the motor. The position of the lever was obtained using a semi-circular array of light-emitting diodes (LEDs) around the lever. However, this system required a physiotherapist to activate the arm movement and to use a force handle to measure the forces applied. The system was of little use to patients, as realistic physiotherapy exercises normally occur in three dimensions; a three-DOF prototype was advised instead.

To cope with the problems arising for individuals with spinal cord injuries, Harwin and Rahman explored the design of head-controlled force-reflecting master-slave telemanipulators for rehabilitation applications. This approach was further expanded for a similar class of assistive devices that may support and move the person's arm in a programmed way. Enclosed within the system, a test-bed power-assisted orthosis consisted of a six-DOF master with the end effector replaced by a six-axis force/torque sensor. A splint assembly was mounted on the force/torque sensor and supported the person's arm. The base-level control system first subtracted the weight of the person's arm from the whole measurement. Control algorithms were then established to relate the estimate of the patient's residual force to the system position, velocity and acceleration. These characteristic parameters are desired in regular movement analysis. Similar to this technique, Chen et al. provided a comprehensive justification for their proposal and testing protocols.

6.2.4 MIME

Burgar et al. summarized systems for post-stroke therapy conducted at the Department of Veterans Affairs Palo Alto in collaboration with Stanford University. The original principle had been established with two- or three-DOF elbow/forearm manipulators. Amongst these systems, the MIME shown in Figure 32 was the more attractive due to its ability to fully support the limb during 3-D movements and its self-guided modes of therapy. Subjects were seated in a wheelchair close to an adjustable-height table. A PUMA-560 manipulator was mounted beside the table and was attached to a wrist-forearm orthosis (splint) via a six-axis force transducer; this position digitizer quantified the movement kinematics. Clinical trials indicated that the improvements in the elbow were better revealed by the biomechanical measures than by the clinical ones. The disadvantage of this system is that it could not allow the subject to move his/her body freely.

A rehabilitator named the "ARM guide" was presented to diagnose and treat arm movement impairment following stroke and other brain injuries. Some vital motor impairments, such as abnormal tone, incoordination, and weakness, could
be evaluated. Pre-clinical results showed that this therapy produced quantifiable benefits in the chronic hemiparetic arm. In the design, the subject's forearm was strapped to a specially designed splint that "slides along the linear constraint". A motor drove a chain drive attached to the splint, and an optical encoder mounted on the motor indicated the arm position. The forces produced by the arm were measured by a six-axis load cell placed between the splint and the linear constraint. Although the system achieved considerable success, its efficacy and practicality need to be developed further.

Figure 32. The MIME in MIT.

6.3 Other relevant techniques

Although the following example might not be directly relevant to arm training systems, it still provides some hints for constructing a motion tracking system. Hesse and Uhlenbrock introduced a newly developed gait trainer allowing wheelchair-bound subjects to take repetitive practice of a gait-like movement without overstressing therapists. It consisted of two footplates positioned on two bars, two rockers, and two cranks that provided the propulsion. The system generated a different movement of the tip and of the rear of the footplate during the swing. In addition, the crank propulsion was controlled by a planetary system to provide a ratio of 60 percent to 40 percent between the stance and swing phases. Two cases of non-ambulatory patients who regained their walking ability after 4 weeks of daily training on the gait trainer were positively reported.

A number of projects have been undertaken on human arm trajectories. However, to make the proposed systems feasible for non-trained users, further studies need to be performed on the development of a patient interface and therapist workspace. For example, to improve the performance of haptic interfaces, many researchers have exhibited successful prototype systems. Hawkins et al. set up an experimental apparatus consisting of a frame with one chair, a wrist connection mechanism, two embedded computers, a large computer screen, an exercise table, a keypad and a three-DOF haptic interface arm. The user "was seated on the chair with their wrist connected the haptic interface through the wrist connection mechanism. The device end-effector consisted of a gimbal which provides an extra three DOF to facilitate wrist movement." These tests encourage a novel system to be explored so that a patient can move his/her arm consistently, smoothly, and correctly. A friendly, human-like interface between the system and the user can then be obtained afterwards.

Comprehensive reviews of rehabilitation systems are given in the literature.

7.1 Remaining challenges

The characteristics of the previous tracking systems have been summarized earlier. It is important to understand the key problems addressed in these systems: identifying the remaining challenges allows one to specify the aims of further development in future work.

All the previous systems required therapists to attend during training sessions. Without the help of a therapist, these systems either were unable to run successfully or simply lost their controlling commands. The developed systems performed as supervised machines that followed orders from the on-site therapists. Therefore, they did not achieve patient-guided manipulation therapy and so cannot yet be used directly in homes.

The second challenge is cost. People tended to build complicated systems in order to achieve multiple purposes, which leads to expensive components being applied in the designed systems. Some of these systems also included specially designed movement sensors, which limit the further development and broad application of the designed systems.

Inconvenience is another obvious challenge in the previous systems. Most systems demanded that people sit in front of a table or in a chair. This configuration constrains people's mobility, so such systems are not helpful for enhancing the overall training of human limbs.

Due to the large working space required by these systems, patients had to prepare spacious recovery rooms for setting them up. As a consequence, this prevents people who have less accommodation space from using such systems to regain their mobility. A telemetric and compact system coping with the space problem should be proposed instead.

Poor performance of the human-computer interface (HCI) designed for these systems has also been recognized. Unfortunately, people seldom touch on this issue, as the other main technical problems have not been solved yet. However, a bad HCI might stop post-stroke patients from actively using any training system.

Generally speaking, when one considers a recovery system, six issues need to be taken into account: cost, size, weight, functional performance, easy operation, and automation.

7.2 Design specification for a proposed system

Consider a system aimed at limb rehabilitation training for stroke patients during their recovery. The designer has to be mainly concerned with the following specified issues:

Real-time operation of the tracking system is required so that arm movement can be recorded simultaneously;

Human movement must not be limited to a particular workspace, so telemetry is considered for transmitting data from the mounted sensors to the workstation;

The proposed system shall not bring any cumbersome tasks to the user;

Human movement parameters shall be properly and accurately represented in the computer terminal;

A friendly graphical interface between the system and the user is vital due to its application in home-based situations;

The whole system needs to be flexibly attached or installed in a domestic site.

8 Conclusions

A number of applications have already been developed to support various health and social care delivery. It has been justified that rehabilitation systems are able to assist or replace face-to-face therapy. Unfortunately, evidence also shows that human movement has a very complicated physiological nature, which prevents further development of the existing systems. People hence need an insight into the formulation of human movement. Our proposed project will cope with this technical issue by attempting to grasp human motion at each moment. Achieving such an accurate localization of the arm may lead to efficient, convenient and cheap kinetic and kinematic modelling for movement analysis.

Acknowledgements

We are grateful for the provision of partial literature sources from Miss Nargis Islam in the University of Bath, and Dr Huiru Zheng in the University of Ulster.

References

http://www.polyu.edu.hk/cga/faq/.

http://www.microstrain.com/.

http://www.polhemus.com/.

http://home.flash.net/~hpm/.

http://www.vrealities.com/cyber.html.

http://www.5dt.com/.

http://www.fakespace.com.

http://www.exos.com.

http://www.qualisys.se/.

http://www.charndyn.com/.

http://www.ascension-tech.com/.

http://www.arielnet.com/.
http://www.ndigital.com/polaris.php.

http://www.ptiphoenix.com/.

http://www.simtechniques.com/.

http://www.peakperform.com/.

J. Aggarwal and Q. Cai. Human motion analysis: A review. Computer Vision and Image Understanding: CVIU, 73(3):428–440, 1999.

K. Akita. Image sequence analysis of real world human motion. Pattern Recognition, 17:73–83, 1984.

F. Amirabdollahian, R. Loureiro, and W. Harwin. A case study on the effects of a haptic interface on human arm movements with implications for rehabilitation robotics. In Proc. of the 1st Cambridge Workshop on Universal Access and Assistive Technology (CWUAAT), 25-27th March, University of Cambridge 2002.

C. Barron and I. Kakadiaris. A convex penalty method for optical human motion tracking. In IWVS'03, Berkeley, Nov 2003, pages 1–10.

A. Baumberg and D. Hogg. An efficient method for contour tracking using active shape models. In Proc. IEEE Workshop on Motion of Non-Rigid and Articulated Objects, pages 194–199, 1994.

A. Baumberg and D. Hogg. Generating spatiotemporal models from examples. IVC, 14:525–532, 1996.

A. Bharatkumar, K. Daigle, M. Pandy, Q. Cai, and J. Aggarwal. Lower limb kinematics of human walking with the medial axis transformation. In Proc. of IEEE Workshop on Non-Rigid Motion, pages 70–76, 1994.

D. Bhatnagar. Position trackers for head mounted display systems. Technical Report TR93-010, Department of Computer Sciences, University of North Carolina, 1993.

M. Black, Y. Yacoob, A. Jepson, and D. Fleet. Learning parameterized models of image motion. In Proc. of CVPR, pages 561–567, 1997.

A. Blake, R. Curwen, and A. Zisserman. A framework for spatio-temporal control in the tracking of visual contours. Int. J. Computer Vision, pages 127–145, 1993.

R. Bodor, B. Jackson, O. Masoud, and N. Papanikolopoulos. Image-based reconstruction for view-independent human motion recognition. In Int. Conf. on Intel. Robots and Sys., 27-31 Oct 2003.

C. Bouten, K. Koekkoek, M. Verduin, R. Kodde, and J. Janssen. A triaxial accelerometer and portable processing unit for the assessment of daily physical activity. IEEE Trans. on Biomedical Eng., 44(3):136–147, 1997.

R. Bowden, T. Mitchell, and M. Sarhadi. Reconstructing 3D pose and motion from a single camera view. In BMVC, pages 904–913, Southampton 1998.

G. Bradski and J. Davies. Motion segmentation and pose recognition with motion history gradients. Machine Vision and Applications, 13:174–184, 2002.

T. Brosnihan, A. Pisano, and R. Howe. Surface micromachined angular accelerometer with force feedback. In Digest ASME Inter. Conf. and Exp., Nov 1995.

S. Bryson. Virtual reality hardware. In Implementing Virtual Reality, ACM SIGGRAPH 93, pages 1.3.16–1.3.24, New York 1993.

C. Burgar, P. Lum, P. Shor, and H. Machiel Van der Loos. Development of robots for rehabilitation therapy: The Palo Alto VA/Stanford experience. Journal of Rehab. Res. and Devlop.

Q. Cai and J. K. Aggarwal. Tracking human motion using multiple cameras. In ICPR96, pages 68–72, 1996.

J. Cauraugh and S. Kim. Two coupled motor recovery protocols are better than one: electromyogram-triggered neuromuscular stimulation and bilateral movements. Stroke, 33:1589–1594, 2002.

S. Chen, T. Rahman, and W. Harwin. Performance statistics of a head-operated force-reflecting rehabilitation robot system. IEEE Trans. on Rehab. Eng., 6:406–414, Dec 1998.

Z. Chen and H. J. Lee. Knowledge-guided visual perception of 3D human gait from a single image sequence. IEEE Trans. on Systems, Man, and Cybernetics, pages 336–342, 1992.

J. Chung and N. Ohnishi. Cue circles: Image feature for measuring 3-D motion of articulated objects using sequential image pairs. In AFGR98, pages 474–479, 1998.

M. Cordea, E. Petriu, N. Georganas, D. Petriu, and T. Whalen. Real-time 2½D head pose recovery for model-based video-coding. In IEEE Instrum. and Measure. Tech. Conf., Baltimore, MD, May 2000.

J. Cozens. Robotic assistance of an active upper limb exercise in neurologically impaired patients. IEEE Transactions on Rehabilitation Engineering, 7(2):254–256, 1999.

R. Cutler and M. Turk. View-based interpretation of real-time optical flow for gesture recognition. In Int. Conf. on Auto. Face and Gest. Recog., pages 416–421, 1998.

J. Dallaway, R. Jackson, and P. Timmers. Rehabilitation robotics in Europe. IEEE Trans. on Rehab. Eng., 3:35–45, 1995.

K. Dautenhahn and I. Werry. Issues of robot-human interaction dynamics in the rehabilitation of children with autism. In Proc. of FROM ANIMALS TO ANIMATS, The Sixth International Conference on the Simulation of Adaptive Behavior (SAB2000), 11-15 Sep, Paris 2000.

S. Dockstader, M. Berg, and A. Tekalp. Stochastic kinematic modeling and feature extraction for gait analysis. IEEE Transactions on Image Processing, 12(8):962–976, Aug 2003.

I. Dukes. Compact motion tracking system for human movement. MSc dissertation, University of Essex, Sep 2003.

F. Escolano, M. Cazorla, D. Gallardo, and R. Rizo. Deformable templates for tracking and analysis of intravascular ultrasound sequences. In 1st International Workshop on Energy Minimization Methods in CVPR, Venice, May 1997.

R. Fablet and M. J. Black. Automatic detection and tracking of human motion with a view-based representation. In ECCV02, pages 476–491, 2002.

H. Feys, W. De Weerdt, B. Selz, C. Steck, R. Spichiger, L. Vereeck, K. Putman, and G. Van Hoydonck. Effect of a therapeutic intervention for the hemiplegic upper limb in the acute phase after stroke: a single-blind, randomized, controlled multicenter trial. Stroke.

W. Freeman, K. Tanaka, J. Ohta, and K. Kyuma. Computer vision for computer games. In Proc. of IEEE International Conference on Automatic Face and Gesture Recognition, pages 100–105, 1996.

H. Fujiyoshi and A. Lipton. Real-time human motion analysis by image skeletonisation. In Proc. of the Workshop on Application of Computer Vision, Oct. 1998.

http://bmj.bmjjournals.com/cgi/reprint/325/.

http://www.vicon.com/.

http://www.xsens.com/.

D. Gavrila. The visual analysis of human movement: A survey. Computer Vision and Image Understanding: CVIU, 73(1):82–98, 1999.

L. Goncalves, E. Bernardo, E. Ursella, and P. Perona. Monocular tracking of the human arm in 3D. In ICCV95, pages 764–770, 1995.

J. Gonzalez, I. Lim, P. Fua, and D. Thalmann. Robust tracking and segmentation of human motion in an image sequence. In ICASSP03, April, Hong Kong 2003.

R. Green. Spatial and temporal segmentation of continuous human motion from monocular video images. In Proc. of Image and Vision Computing New Zealand, pages 163–169.

G. Haisong, Y. Shirai, and M. Asada. MDL-based segmentation and motion modeling in a long image sequence of scene with multiple independently moving objects. IEEE Trans. on Pattern Analysis and Machine Intelligence, 18:58–64, 1996.

W. Harwin and T. Rahman. Analysis of force-reflecting telerobotic systems for rehabilitation applications. In Proc. of the 1st European Conf. on Dis., Virt. Real. and Assoc. Tech., pages 171–178, Maidenhead 1996.

P. Hawkins, J. Smith, S. Alcock, M. Topping, W. Harwin, R. Loureiro, F. Amirabdollahian, J. Brooker, S. Coote, E. Stokes, G. Johnson, P. Mark, C. Collin, and B. Driessen. Gentle/s project: design and ergonomics of a stroke rehabilitation system. In Proc. of the 1st Cambridge Workshop on Universal Access and Assistive Technology (CWUAAT), 25-27th March, University of Cambridge 2002.

S. Hesse and D. Uhlenbrock. A mechanized gait trainer for restoration of gait. Journal of Rehab. Res. and Devlop., 37(6):701–708, 2000.

A. Hilti, I. Nourbakhsh, B. Jensen, and R. Siegwart. Narrative-level visual interpretation of human motion for human-robot interaction. In Proceedings of IROS 2001, Maui, Hawaii.

D. Hogg. Model-based vision: A program to see a walking person. Image and Vision Computing, 1:5–20, 1983.

T. Horprasert, I. Haritaoglu, D. Harwood, L. Davies, C. Wren, and A. Pentland. Real-time 3D motion capture. In PUI Workshop98, 1998.

E. Huber. 3D real-time gesture recognition using proximity space. In Proc. of Intl. Conf. on Pattern Recognition, pages 136–141, August 1996.

M. Isard and A. Blake. Contour tracking by stochastic propagation of conditional density. In ECCV, pages 343–356, 1996.

M. Ivana, M. Trivedi, E. Hunter, and P. Cosman. Human body model acquisition and tracking using voxel data. Int. J. of Comp. Vis., 53(3):199–223, 2003.

O. Javed, S. Khan, Z. Rasheed, and M. Shah. Camera handoff: tracking in multiple uncalibrated stationary cameras. In IEEE Workshop on Human Motion, HUMO-2000, Austin, TX, 2000.

G. Johansson. Visual motion perception. Scientific American, 232:76–88, 1975.

K. Kahol, P. Tripathi, and S. Panchanathan. Gesture segmentation in complex motion sequences. In ICIP, Barcelona, Spain, 2003.

A. Kourepenis, A. Petrovich, and M. Meinberg. Development of a monolithic quartz resonator accelerometer. In Proc. of 14th Biennial Guidance Test Symp., Holloman AFB, NM, 2-4 Oct 1989.

H. Krebs, N. Hogan, M. Aisen, and B. Volpe. Robot-aided neurorehabilitation. IEEE Transactions on Rehabilitation Engineering, 6(1):75–87, Mar 1998.

H. Krebs, B. Volpe, M. Aisen, and N. Hogan. Increasing productivity and quality of care: robot-aided neuro-rehabilitation. Journal of Rehabilitation Research and Development, 37(6):639–652, November/December 2000.

S. Kurakake and R. Nevatia. Description and tracking of moving articulated objects. In ICPR92, pages 491–495, 1992.

W. Long and Y. Yang. Log-tracker: An attribute-based approach to tracking human body motion. PRAI, 5:439–458, 1991.

P. Lum, D. Reinkensmeyer, R. Mahoney, W. Rymer, and C. Burgar. Robotic devices for movement therapy after stroke: current status and challenges to clinical acceptance. Top Stroke Rehabil., 8(4):40–53, 2002.

M. Machline, L. Zelnik-Manor, and M. Irani. Multi-body segmentation: Revisiting motion consistency. In International Workshop on Vision and Modeling of Dynamic Scenes (with ECCV02), 2002.

D. Marr and K. Nishihara. Representation and recognition of the spatial organization of three dimensional structure. Proceedings of the Royal Society of London, 200:269–294, 1978.

T. Moeslund and E. Granum. Multiple cues used in model-based human motion capture. In FG'00, Grenoble, France, 2000.

A. Mulder. Human movement tracking technology. Technical Report 94-1, Simon Fraser University, 1994.

S. Niyogi and E. Adelson. Analyzing and recognizing walking figures in XYT. In CVPR, pages 469–474, 1994.

J. O'Rourke and N. Badler. Model based image analysis of human motion using constraint propagation. PAMI, 2:522–536, 1980.

X. Pennec, P. Cachier, and N. Ayache. Tracking brain deformations in time sequences of 3D US images. In Proc. of IPMI'01, pages 169–175, 2001.

R. Polana and R. Nelson. Low level recognition of human motion. In Proc. of Workshop on Non-rigid Motion, pages 77–82, 1994.

C. Poynton. A technical introduction to digital video. New York: Wiley, 1996.

R. Qian and T. Huang. Estimating articulated motion by decomposition. In Time-Varying Image Processing and Moving Object Recognition, 3 - V. Cappellini (Ed.), pages 275–286, 1994.

J. Rehg and T. Kanade. Model-based tracking of self-occluding articulated objects. In ICCV, pages 612–617, 1995.

D. Reinkensmeyer, L. Kahn, M. Averbuch, A. McKenna-Cole, B. Schmit, and W. Rymer. Understanding and treating arm movement impairment after chronic brain injury: Progress with the ARM guide. Journal of Rehab. Res. and Devlop., 37(6):653–662, 2000.

R. Richardson, M. Austin, and A. Plummer. Development of a physiotherapy robot. In Proc. of the Intern. Biomech. Workshop, Enschede, pages 116–120, April 1999.

M. Ringer and J. Lasenby. Modelling and tracking articulated motion from multiple camera views. In BMVC, Bristol, UK, Sep 2000.

A. Roche, X. Pennec, G. Malandain, and N. Ayache. Rigid registration of 3D ultrasound with MR images: a new approach combining intensity and gradient information. IEEE Transactions on Medical Imaging, 20(10):1038–1049, Oct 2001.

K. Rohr. Toward model-based recognition of human movements in image sequences. Computer Vis., Graphics Image Process., 59:94–115, 1994.

R. Ronfard, C. Schmid, and B. Triggs. Learning to parse pictures of people. In European Conference on Computer Vision, LNCS 2553, volume 4, pages 700–714, June 2002.

K. Sato, T. Maeda, H. Kato, and S. Inokuchi. CAD-based object tracking with distributed monocular camera for security monitoring. In Proc. 2nd CAD-Based Vision Workshop, pages 291–297, 1994.

N. Shimada, K. Kimura, Y. Shirai, and Y. Kuno. Hand posture estimation by combining 2-D appearance-based and 3-D model-based approaches. In ICPR00, 2000.

H. Sidenbladh, M. J. Black, and D. Fleet. Stochastic tracking of 3D human figures using 2D image motion. In ECCV, pages 702–718, Dublin, June 2000.

H. Sidenbladh, M. J. Black, and L. Sigal. Implicit probabilistic models of human motion for synthesis and tracking. In ECCV, 2002.

C. Sminchisescu and B. Triggs. Covariance scaled sampling for monocular 3D body tracking.

… of the 9th Chinese Automation and Computing Society Conf. in the UK, England, Sep 2003.

A. Taylor. Design of an exoskeletal arm for use in long term stroke rehabilitation. In ICORR'97, University of Bath, April 1997.

C. Theobalt, M. Magnor, P. Schueler, and H. Seidel. Combining 2D feature tracking and volume reconstruction for online video-based human motion capture. In Proceedings of Pacific Graphics 2002, pages 96–103, Beijing 2002.

M. Uiterwaal, E. Glerum, H. Busser, and R. Van Lummel. Ambulatory monitoring of physical activity in working situations, a validation study. Journal of Medical Engineering & Technology, 22:168–172, July/August 1998.

P. Veltink, H. Bussmann, W. de Vries, W. Martens, and R. Van Lummel. Detection of static and dynamic activities using uniaxial accelerometers. IEEE Transactions on Rehabilitation Engineering, 4(4):375–385, Dec 1996.

G. Welch and E. Foxlin. Motion tracking survey. IEEE Computer Graphics and Applications, pages 24–38, Nov/Dec 2002.

C. Wren, A. Azarbayejani, T. Darrell, and A. Pentland. Pfinder: Real-time tracking of the human body. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):780–785, 1997.

H. Xie and G. Fedder. A CMOS z-axis capacitive accelerometer with comb-finger sensing. Technical Report, The Robotics Institute, Carnegie Mellon University, 2000.
ing. In Proceedings of the Conference on Com-  Y. Yaccob and M. Black. Parameterized mod-
puter Vision and Pattern Recognition, pages eling and recognition of activities. In ICCV,
447–454, Kauai, Hawaii 2001. pages 120–127, 1998.
 Y. Song, X. Feng, and P. Perona. Towards de-  Y. Yacoob and L. Davies. Learned models for
tection of human motion. In CVPR, pages 810– estimation of rigid and articulated human mo-
817, 2000. tion from stationary or moving camera. Int.
 M. Storring, H. Andersen, and E. Granum. Journal of Computer Vision, 36(1):5–30, 2000.
Skin colour detection under chaning lighting  M. Yeasin and S. Chaudhuri. Development
conditions. In Proc. of 7th Symposium on In- of an automated image processing system for
telligence Robotics System, 1999. kinematic analysis of human gait. Real-Time
 S. Stroud. A force controlled external pow- Imaging, 6:55–67, 2000.
ered arm orthosis. Masters Thesis, University  M. Yeasin and S. Chaudhuri. Towards auto-
of Delaware 1995. matic robot program: Learning human skill
 D. Sturman and D. Zeltzer. A survey of glove- from perceptual data. IEEE Trans. on Systems
based input. IEEE Computer Graphics and Man and Cybernetics-B, 30(1):180–185, 2000.
Aplications, pages 30–39, 1994.  M. Yeasin and S. Chaudhuri. Visual under-
 Y. Tao and H. Hu. Building a visual tracking standing of dynamic hand gestures. Pattern
system for home-based rehabilitation. In Proc. Recognition, 33(11):1805–1817, 2000.
 L. Zelnik-Manor, M. Machline, and M. Irani.
Multi-body segmentation: Revisiting motion
consistency. In International Workshop on Vi-
sion and Modeling of Dynamic Scenes (with
 T. Zimmerman and J. Lanier. Computer data
entry and manipulation apparatus method. In
Patent Application 5,026,930, VPL Research
 A. Zisserman, L. Shapiro, and M. Brady. 3d
motion recovery via afﬁne epipolar geometry.
Int. J. of Comput. Vis., 16:147–182, 1995.