

Photorealistic Visualization for Virtual and Augmented Reality in Minimally Invasive Surgery

Danail Stoyanov1, Mohamed ElHelw1, Benny P Lo1, Adrian Chung1, Fernando Bello2 and Guang-Zhong Yang1,2

1 Visual Information Processing Group, Department of Computing, Imperial College, London, UK
{dvs, me, benlo, ajchung, rmd99, gzy}@
2 Department of Surgical Oncology and Technology, Imperial College, St. Mary's Hospital, London, UK

Abstract

     In surgery, virtual and augmented reality are increasingly being used as new ways of training, preoperative planning, diagnosis and surgical navigation. Further development of virtual and augmented reality in medicine is moving towards photorealistic rendering and patient specific modeling, permitting high fidelity visual examination and user interaction. This coincides with current developments in Computer Vision and Graphics, where image information is used directly to render novel views of a scene. These techniques require extensive use of geometric information about the scene, and the purpose of this paper is to provide a comprehensive review of the underlying techniques required for building patient specific models with photorealistic rendering. It also highlights some of the opportunities that image based modeling and rendering techniques can offer in the context of minimally invasive surgery.

1. Introduction

     One of the most promising applications for Virtual Reality (VR) and Augmented Reality (AR) technologies is medicine. Computer generated environments can be used in all stages of healthcare delivery, from the education of medical staff up to the provision of effective therapy and rehabilitation methods.
     In surgery, the computerized visualization of patient data models has both pre- and intra-operative purposes. Pre-operatively, VR simulators may be used to train practitioners in basic surgical tasks as well as complete interventions. More excitingly, patient specific models built from tomographic data promise to allow the practice of complex procedures prior to working with the patient directly. Similarly, VR may also be used for effective diagnosis and procedural planning. Meanwhile, intra-operatively, AR presents opportunities in navigation. In Image Guided Surgery (IGS), the surgeon's view of the operating field is augmented by the addition of computer-generated data. This may serve in conjunction with the surgeon's perception of the patient's anatomy and allow the flexible intra-operative use of preoperative tomographic data.
     The use of computer generated three-dimensional models of human anatomy makes full utilization of all the available patient data during and before a procedure. Combined with modern robotics, such use of technology in the operating theatre is known as Computer Assisted Surgery (CAS). This is reforming surgical practice by setting new standards in accuracy and patient treatment.
     The first work in medical VR during the early 1990s was restricted by the limitations of the available technology, and CAS was then little more than an interesting research prospect. VR based simulators struggled to achieve realism due to compromises between the realistic representation of appearance and the accurate modeling of tissue and instrument behaviors. In contrast, VR is now applied in a spectrum of trainers, while the prospects of AR in the operating theater are becoming reality [1]. In particular, a field of medicine that is well suited to VR and AR environments is Minimally Invasive Surgery (MIS). This is largely due to the operating theater set-up: a surgeon reaches target areas through small ports in which the instruments are inserted and observes the operating field on a screen. This helps to minimize the need for elaborate devices replicating the theater for VR simulation purposes. Meanwhile, AR technologies impose minimal intrusion on the surgical environment, as any computerized models may be displayed directly on the existing screen.
     In parallel with the investigation of CAS technologies, the surgical community has developed a range of minimally invasive surgical procedures. Reducing the trauma caused by an operation through a minimally invasive approach directly improves the treatment of the patient and the chances of a prompt and successful recovery. However, the inherent difficulties of MIS techniques have traditionally imposed limitations on their applicability. Reduced instrument control and freedom, combined with unusual hand-to-eye coordination and a limited view of the operating field, enforce restrictions on the surgeon not common to conventional surgery. Advances in video imaging, instrumentation and robotics have played a key role in resolving such fundamental limitations and bringing forth the widespread use of MIS in a variety of procedures in cardiothoracic surgery, gynecologic surgery, general surgery, neurosurgery, orthopaedic surgery, urologic surgery and oncologic surgery.
     Due to technological limitations, early VR simulation systems for MIS only addressed hand-to-eye coordination and basic skill training. More recently, commercial VR simulation systems for complete MIS procedural training have emerged (Figures 1 and 2). Such training tools will be a major part of surgical education and skills assessment. However, the availability of patient specific trainers will be of significantly greater value. Patient specific systems will not only be useful for training but will instigate new approaches to computer assisted diagnosis and procedure planning.
     In the intra-operative scenario, IGS systems have proved more difficult to develop. This is largely due to the added complication of registering data from different modalities. Although IGS is widely used in areas such as neurosurgery [2] and spinal surgery [3], in MIS the complications for accurate registration are greater. This is due to the actively deforming operating field and the large changes in anatomical structure between acquiring tomographic data and performing an intervention. Therefore, methods of acquiring the structure of the operating field are desirable, since tomographic data is not enough to build reliable patient specific models. Methods for structure recovery using radiography are undesirable due to the high radiation exposure of both surgical staff and patient, and the high cost of suitable instruments. Other methods using laser range finders are applicable but require additional equipment in the operating theater. In the case of laparoscopic surgery, they are impractical and go against the drive for minimal intrusion. Hence, the best placed methods lie in the utilization of the camera equipment that is already available.
     The problem of recovering three-dimensional structure from two-dimensional images has been a key subject of research in Computer Vision for many years. Although the problem is fundamentally ill-posed, solutions based on the use of multiple images of a scene have recently had relative success. Meanwhile, the graphics community has focused on achieving photo-realistic rendering of three-dimensional computer models. However, recent approaches have investigated the use of image information directly to generate novel views of a scene. Such techniques require geometric information about the scene's structure. In a sense, although Computer Graphics and Computer Vision both work with images, they have worked in opposite directions. Due to the recent merger of goals in Computer Vision and Computer Graphics, a young field is emerging which promises to advance the current state of photo-realism in VR and AR. This is called Image Based Modeling and Rendering (IBMR).
     An important feature of IBMR methods is that they use only the most basic information available: images. This makes them a powerful tool in many applications and, in the context of this paper, in MIS. For the purposes of in vivo patient specific model acquisition, and for photo-realistic VR simulation and AR based IGS, IBMR holds unparalleled potential.
     In the remainder of this paper we discuss the current state of the art in Minimally Invasive Computer Assisted Surgery (MICAS) technologies and the major challenges in the field. We then provide a brief review of the IBMR techniques most likely to make a significant impact on improving visualization in MICAS.

2. State of the Art in CAS Visualization

     The use of image-based computer technology in MIS may be broadly divided into four categories:

     •    Training, Simulation and Skills Assessment
     •    Computer Assisted Diagnosis
     •    Pre-operative Planning
     •    Intra-operative Guidance

     In the following sections we briefly review the current status of each area.

2.1. Training, Simulation and Skills Assessment

     The development of Computer Graphics techniques and hardware to allow real-time, photo-realistic rendering of computer environments has opened a new door in the education and training of medical staff. Virtual reality simulators offer potentially cost effective methods of training surgeons by reducing the required practice on cadavers and actual patients, whilst allowing unlimited preparation on a virtual subject.
     The computer simulation environment also allows easier and prospectively automatic skills assessment of the trainee through the analysis of movements and events throughout the procedure. Preliminary results demonstrate that the time transfer rate (the time spent on a simulation compared to that spent operating on a cadaver or patient) is 25%-28%, and may be expected to rise up to 50% when compared with other areas where virtual simulation is employed, such as flight pilot training [5]. A rise in transfer rate driven by
technological advance will be of added value to surgical training.
     Technological limitations forced early virtual reality simulations into a trade-off between visual realism and accurate mechanical modeling of soft tissue, instruments and their interaction. This meant that systems either provided basic skills training, such as instrument control and co-ordination, or near photo-realistic quality anatomical models, which were useful in developing a good understanding of body regions. However, modern simulation systems bridge the gap between the two and provide full procedural training, not only offering visually adequate rendering but allowing the practitioner to interact with the training environment.

Figure 1. Simulation device for colonoscopy and …

Figure 2. The operating field during virtual stitching practice.

     MIS approaches are better suited to virtual simulation than most, due to the nominal intrusion upon the surgical environment. In these schemes, the practitioner looks at a screen whilst performing the operation on the virtual subject, as they would in a real procedure. The key areas of development required in current simulators for MIS procedures are photo-realistic representation of the operating field, haptic interface devices to simulate force feedback and accurate behavioral models of soft tissue and instruments. Various methods have been proposed and implemented with the aim of achieving real-time simulation of soft tissue deformation, the most popular being Mass Spring Models (MSM) [6,7] and the Finite Element Method (FEM) [8,9,10]. MSM have the potential of reaching real-time performance but lack accuracy when precise modeling of physiological behaviour is needed. On the other hand, FEM is more accurate for rendering the mechanical characteristics of human organs, but it is more computationally expensive. While the ability to interactively simulate the accurate deformation of soft tissue is important, a system that does not include the capability to modify the simulated tissue has limited utility. Considerable efforts have been made by the research community to improve the modeling and interactivity of soft tissue. Most of these efforts have concentrated on pre-computing deformations or simplifying the calculations. There have been several attempts at locally adapting a model in and around the regions of interest [11,12,13]. However, these models rely on a pre-processing phase strongly dependent on the geometry of the object, with the major drawback that performing dynamic structural modification becomes a challenging issue, if not impossible. More recently, a model which adapts the resolution of the mesh by re-tesselating it on-the-fly in and around the region of interaction was presented in [14]. This adaptive model does not rely on any pre-processing, thus allowing structural modifications (Figure 3).

Figure 3. Adaptive soft tissue model with cutting.

     Although advances in soft tissue deformation, haptics integration and photo-realistic representation will increase the impact of VR systems in surgical education, a more exciting prospect lies in the use of patient specific data models. Moreover, the use of IBMR techniques can significantly improve the realism of traditional deformation models.

2.2. Computer Assisted Diagnosis

     Computer technologies have reformed the various forms of modern patient diagnosis, improving efficiency and accuracy as well as helping to control healthcare costs. Image processing and computer vision techniques are already widely used in radiological diagnosis to enhance images and provide some automation in data analysis. In MIS procedures, similar possibilities are available to improve the practitioner's view by removing undesirable effects such as lens distortion, specular reflections and shadows. More comprehensive analysis in minimally invasive diagnosis may be performed but
requires in vivo structural information. For example, in [15] shape recovery from images is used to recover the structure of the stomach wall and automatically detect abnormalities.
     Recent developments in medical VR have instigated the pursuit of a non-invasive form of diagnosis, Virtual Endoscopy (VE). The idea behind VE is to use tomographic data to build a patient specific model which may be used for examination. A significant advantage of such an approach is not only its non-intrusive nature but also the opportunity to evaluate anatomical structures previously too small or delicate to investigate, such as the eyes and inner ears.
     In [16], image based rendering is applied to virtual colonoscopy in order to minimize inspection time without loss of diagnostic sensitivity. Arguably, images are a more manageable medium for communication between physicians compared to 3D geometry. With further developments in IBMR techniques, the realism of VE may be improved and an environment realistically replicating a real minimally invasive procedure may be created.

2.3. Pre-operative Planning

     Planning is an important pre-operative step in an MIS intervention. Due to the restricted movement of the instruments, care must be taken to position instrument ports for maximum access to target areas.
     In the past, surgeons have had to visualize the three-dimensional structure of the operating area by examining two-dimensional slices of X-Ray, CT or MRI data. However, the avoidance of delicate areas, optimal route planning and the forecasting of potential problems are now aided by volume visualization of the data. As such, the practitioner is in a better position to prepare the surgical procedure before its execution [17]. Computer algorithms for finding the optimal port placements in an MIS procedure may also be utilized.
     The dominant factor in path planning is the accurate reconstruction of patient anatomical structures. However, the realism of virtual models is also important and may be improved by IBMR techniques. In [2,18], images are captured from a tracked endoscope inserted into the cranial cavity. Using image based modeling techniques, several endoscopic images are painted onto the surface of a three-dimensional model derived from MRI or CT scans of the brain. The resulting combined dataset can be used in surgical planning as well as in training and intra-operative guidance.

2.4. Intra-operative Guidance

     Image Guided Surgery improves the intra-operative use of available patient data from different modalities by augmenting the surgeon's view of the operating field. Initially, after a pre-operative examination of patient data, the surgeon was required to mentally maintain an idea of the specific anatomical structures during the operation. Recently, systems have emerged which combine the various sets of data for intra-operative visualization in an AR framework. However, tracking the surgical instruments and registering pre-operative data sets to the intra-operative state of the patient anatomy are challenging tasks. Nevertheless, IGS methods are becoming common in many areas requiring high precision, such as neurosurgery, spinal surgery, orthopaedic surgery and ENT (Ear, Nose and Throat) procedures [3,19,20].
     Current guidance systems for non-MIS procedures have focused on using markers (fiducials) for the tracking of instrument and tissue deformations. Markers aid the registration between pre-operative tomographic data and the intra-operative structure of the operating field. However, the introduction of markers in an MIS procedure is more difficult and preferably avoided. Hence, methods of intra-operative structure recovery are required to allow registration.
     Another important part of IGS systems is the appropriate visualization of the augmented operating field. Image-based rendering (IBR) techniques are suitable for this purpose due to the availability of image data and of possible structural information.

2.5. Major Challenges and Clinical Needs

     The basic clinical needs driving the use of computerized technology arise at the very foundations of the surgical training of practitioners and follow the surgical process all of the way to the actual intervention. Underlying the use of image based computer assistance is the aspiration to achieve comprehensive data utilization in both the pre- and intra-operative scenarios. The challenges involved in realizing this goal are related to the realism of data representation (visualization), the robust attainment of patient specific models and the registration between different data modalities.
     A key feature of virtual patient models used for training, planning and diagnosis is the realism of visualization. IBMR may be used from two directions to improve existing systems. Firstly, images captured from actual MIS operations may be used to enhance the realism of surface appearances and of effects such as bleeding. On the other hand, image based modeling during procedures may give additional information that improves aspects of realism related to haptic devices providing force feedback, instrument to tissue interactions and tissue deformations.
     The composition of patient specific models also requires intra-operative image based modeling. During a surgical procedure it is unwise to rely entirely on pre-operatively acquired data. There are often scanner-based geometric distortions due to irregular magnetic fields, imprecise table speeds and gantry tilts [19]. Additionally, patient anatomy tends to shift and deform significantly when the patient is moved to the operating table. These have to be corrected prior to any intervention by using active registration with intra-operative data.
     In image-guided surgery, a fundamental problem is the registration of the intra-operative operating field to
pre-operatively obtained data. The registration of 2D and 3D data is a common problem in augmented reality applications [20]. In the medical field, the volumetric data (MRI, CT, PET) must be scaled, rotated and translated to present a view consistent with the intra-operatively captured images (fluoroscope, angiogram). Multi-modal images can be combined to provide complementary information in a single view, thus reducing cognitive effort on the part of the surgeon. Registration can be complicated by the tissue deformations that take place during the procedure, which is often the case in neurosurgery [21] and even more so in upper GI and colorectal surgery. This necessitates non-rigid transformations to make the pre-operative data consistent with the current anatomy [22,23]. Image-based methods aim to achieve a less invasive, non-contact and automated method for 2D-3D registration. Segmentation based techniques extract curves, surfaces [24] and higher-level features [25,26] from the images and volumes to be registered. These anatomically related features are used as the sole input for aligning the datasets. A drawback of these methods is that the accuracy of the segmentation step is a limiting factor on the accuracy of the registration [27].
     IBMR is a young field which has found popularity when dealing with scenes containing regular shapes and surfaces modeled by simple reflectance functions, for example architectural scenes. In the MIS imaging environment, scenes contain less well defined surfaces with complex reflectance, and the actual image acquisition is different, since the geometric relationships between the light source, the camera and the scene are on a much smaller scale. Therefore, thorough investigation of these issues is required before IBMR methods may be utilized in the clinical environment.

3. Image Based Rendering

     Image-based rendering techniques have gained considerable attention over the last few years in the computer graphics community. Their potential to create high quality output images and the possibility of integrating them into existing computer graphics systems make them attractive solutions for photorealism. Conventional Computer Graphics techniques generate images by simulating the interaction of light with a scene described using geometric primitives and surface properties. Computer vision, on the other hand, works in the opposite direction: starting with a collection of images, the target is to calculate the depth of image points and generate 3D geometric models. In general, image-based rendering uses images to generate novel views of a synthetic or real environment. The geometric models of the scene are replaced solely by images, or by images enhanced with geometry or depth information.
     In its broader definition, image-based rendering combines both computer graphics and computer vision in order to achieve the following advantages:

     •    Efficient modelling, representation and rendering of complex objects;
     •    Sustained or enhanced overall system performance and increased frame rate;
     •    Increased rendering realism by accounting for global illumination and inter-object reflectance.

     Image-based rendering approaches can be classified into three main categories: rendering with no geometry, rendering with implicit geometry and rendering with explicit geometry [28]. While rendering-with-no-geometry systems take a large number of images as input, explicit geometry systems use few images but rely on having a full scene description in terms of geometry or depth information in order to render new views of the scene. Implicit geometry systems employ point correspondences among images to generate in-between views.
     Even though many image-based rendering techniques exist, very few are being used in medical simulation. In the following subsections we describe significant image-based rendering techniques within each category and discuss their suitability for surgical simulation.

3.1 Lightfield/Lumigraph Rendering

     A light field is defined as "the radiance at a point in a given direction" [29]. Light field rendering is a no-geometry image-based rendering technique based on capturing light reflected off illuminated static surfaces. A set of images of an object from many different viewing points, recording the flow of light around the object, is used to generate new views. Light field/Lumigraph techniques can be described generally as generating and storing all possible views of an object or a scene. No depth or other image related information is needed. Because both methods depend on having a huge database representing almost all possible images of objects, a compression method was introduced to handle the large database size. However, the storage requirements of scenes made from many objects are still prohibitive for surgical applications. Other limiting factors are the constrained lighting conditions, along with the static, non-penetrating geometry required when acquiring scene images.

3.2 View Interpolation and View Morphing

     View interpolation techniques [30] combine two reference images acquired at different viewing points in order to generate in-between views. The approach is based on the observation that small viewpoint movements cause little variation in the generated images. A pre-calculated correspondence or morph map between the reference images is used to guide image generation. New images are generated when moving along the line connecting the two viewing points by linearly interpolating between the two reference viewing points and the new intermediate viewing point.
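The forward-mapping idea behind view interpolation can be sketched in a few lines. The following is a minimal illustration only, not the algorithm of [30]: the dense per-pixel morph map, the rounding to integer pixel positions and the simple linear intensity blend are all simplifying assumptions made here for clarity.

```python
import numpy as np

def interpolate_view(img0, img1, morph_map, t):
    """Forward-map img0 a fraction t along its correspondences to img1.

    img0, img1 : (H, W) grayscale reference images
    morph_map  : (H, W, 2) displacement (dy, dx) from each pixel of img0
                 to its corresponding pixel in img1
    t          : interpolation parameter in [0, 1] (0 -> img0, 1 -> img1)
    Returns the in-between view and a mask of pixels that received a value.
    """
    h, w = img0.shape
    view = np.zeros((h, w), dtype=float)
    filled = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            dy, dx = morph_map[y, x]
            # Destination of this pixel in the in-between view.
            ty, tx = int(round(y + t * dy)), int(round(x + t * dx))
            # Its correspondence in the second reference image.
            sy, sx = int(round(y + dy)), int(round(x + dx))
            if 0 <= ty < h and 0 <= tx < w and 0 <= sy < h and 0 <= sx < w:
                # Blend the two reference intensities linearly.
                view[ty, tx] = (1 - t) * img0[y, x] + t * img1[sy, sx]
                filled[ty, tx] = True
    return view, filled
```

Pixels left unset in the mask are positions the forward mapping never reached; a practical system would fill them from the second reference image or by local interpolation.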
     Using two input images, view morphing generates intermediate views as a linear interpolation between the two images, by means of projective transformations that preserve major image features such as lines and conics.
     Both view morphing and view interpolation are implicit geometry techniques that are of limited use in surgery simulation.

3.3 Texture Mapping

     The most familiar explicit geometry IBR method is texture mapping, which greatly improves the quality of computer-generated imagery [31]. By mapping 2D images onto planar polygons, texture mapping was the first technique to represent complex materials that are otherwise hard to model and render. In addition, it is used to enhance scene realism by simulating global illumination effects with less computation. Texture mapping can change the way a computer graphics model looks in many different ways [32]:

     •   Colour: using a texture image to colour the object surface;
     •   Specular Colour: using an environment map to represent the specular object reflecting the environment;
     •   Normal Vector Perturbation: perturbing surface normals according to corresponding values stored in the map so that 3D-looking bumps appear on the surface;
     •   Displacement Mapping: instead of modulating the shading equation, the geometry itself is modified along the normals of each surface element;
     •   Transparency: the opacity of transparent objects is controlled using a map.

     Texture mapping is the de facto technique used in …
     Using the previous method for inspecting wrinkled surfaces, such as the colon, introduces the problem of inter-surface occlusions, which decreases the effectiveness of the whole procedure. To alleviate this problem, some methods unfold the colon so that its inner wall is spread on a flat plane and displayed as a panoramic view, while other techniques solve the problem by calculating a number of extra viewing points along the navigation path so that most of the surface becomes visible [35].

3.4 3D Image Warping

     3D image warping methods are explicit geometry approaches that rely on having depth information for all image pixels, thus turning them into 3D samples. Using this information, source image points can be re-projected from a reference image plane onto a target image plane. The 3D warping process can cause occlusion and disocclusion problems. Disocclusions appear as empty holes in the target image when previously hidden parts of the warped scene become visible; hole filling algorithms are used to alleviate this problem. Occlusion problems are caused by more than one point projecting to the same target image point. A visibility resolving solution based on the painter's algorithm was introduced in [36].
     The usefulness of 3D image warping in real life applications has not yet been proven. However, 3D image warping can be used within conventional geometry-based graphics systems to enhance realism or performance. One example is relief texture mapping [37], an extension to texture mapping where each texture image pixel is associated with a depth value measured perpendicular to the image plane at the observed pixel. At runtime, pixels are projected to the plane of the polygon to dynamically generate the desired texture. The resulting images therefore have enhanced 3-D details along with correct view motion …
almost every surgical simulation to achieve realism.       parallax. Using a cylindrical projection manifold,
Environment mapping [31,33] is a texture mapping           cylindrical relief texture mapping [38] can be used to
technique that captures most of the light incident at a    improve the realism of textured closed surfaces used in
scene point and uses the generated image to approximate    endoscopic simulations.
environment reflections on scene objects. It can also be
used to represent the environment as seen by a
     Cubic environment mapping has been used in
medical simulation to allow for VE on low-end
machines. Data sets are acquired from computer
tomography (CT) or magnetic resonance imaging (MRI).
A polygonal or volume-based rendering technique is
used to generate the images corresponding to the views
of a virtual camera at regular intervals along a central
path from point of entry to point of interest [34]. Each
view is made up from the six images corresponding to
the projections of the scene on the faces of a cube
enclosing the current sampling point. These images are
used during runtime to texture map the faces of the cube
enclosing the viewing point providing the user with four     Figure 4,   A close-up view of a tissue rendered
degrees of freedom. This technique is widely used in               using conventional polygonal modelling.
virtual bronchoscopy and virtual colonoscopy.
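The core of the 3D warping operation discussed above is a per-pixel back-projection, rigid transform and re-projection. The following minimal Python sketch illustrates this for a single pixel; the function name, the pinhole model with intrinsics (f, cx, cy) shared by both views, and the row-major rotation matrix are illustrative assumptions of ours, and the sketch deliberately omits the visibility resolution and hole filling mentioned in the text.

```python
def warp_point(u, v, z, f, cx, cy, R, t):
    """Re-project one reference-image pixel (u, v) with depth z into a
    target view. f, cx, cy: focal length and principal point, assumed
    shared by both cameras; R (3x3, list of rows) and t (length 3) map
    reference-camera coordinates into target-camera coordinates."""
    # Back-project the pixel to a 3D point in the reference camera frame.
    X = [(u - cx) * z / f, (v - cy) * z / f, z]
    # Rigid transform into the target camera frame.
    Y = [sum(R[i][k] * X[k] for k in range(3)) + t[i] for i in range(3)]
    # Perspective projection back onto the target image plane.
    return (f * Y[0] / Y[2] + cx, f * Y[1] / Y[2] + cy)
```

Applying this to every pixel of the reference image yields the target view; several source pixels may land on the same target pixel (occlusion, resolved for example with the painter's-algorithm approach of [36]) and some target pixels may receive none (disocclusion holes).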
Figure 5. A close-up view of the same tissue as in Figure 3 rendered using the image-based method.

     Image-based rendering can be used to generate in-between frames for slow or network-based surgery simulation systems. Post-rendering 3D warping [39] uses IBR to increase the output frame rate or to compensate for system latency. The scene is rendered at two instances in time, producing two frames along with their depth information. Employing the 3D image warping technique, intermediate frames are generated by warping the two reference images using in-between viewing positions. Path prediction techniques are used to determine the in-between positions for the generated frames.

4. Structure Recovery from Images

     As we have seen, some IBR techniques rely heavily on structural information about the scene. Extracting this information from images alone is a cost-effective solution, important in MIS as it imposes little intrusion on the surgical environment.
     Researchers in the human visual system believe that we use a number of cues to perceive the structure of the world around us: stereopsis, shading, shadows, motion, texture, focus and object recognition are used cooperatively to process the information received from the eyes. Computer Vision researchers have borrowed these ideas about biological vision to form methods of reconstruction. In contrast to human vision, however, Computer Vision has predominantly focused on exploiting each cue individually, which results in easier problem formulations and avoids conflicting assumptions, but has also left existing reconstruction methods lacking robustness, because the problem is difficult and fundamentally ill posed. Images are time-frozen, framed, two-dimensional representations of a frameless, continuous, three-dimensional world. Additionally, limitations in imaging devices fail to capture the full dynamic range and colour of our environment.
     Collectively, methods of inferring the geometric properties of a scene from images are known as Shape-from-X (X standing for different visual cues); the cues that have been used in Computer Vision are shading, texture, focus and motion. The majority of successful research effort in the area has been in the use of multiple views of a scene [40,41] (Structure-from-Motion, stereopsis), where the task of structure recovery is well posed through triangulation. Monocular cues such as shading and texture are less well defined and recover not the position of scene points but the shape of surfaces. Techniques using focus as a cue are in a sense analogous to Structure-from-Motion (SFM) in that they use multiple images, but with varied camera focus rather than position and orientation; however, a measure of the focus at scene regions is ambiguous and difficult to quantify.
     In the context of MIS, the inherent characteristics of the images in question immediately limit the application of some reconstruction techniques. The inner body organs have complex reflectance properties, irregular shapes and variably textured surfaces, and they may deform in time. Both probabilistic and texture-gradient-based Shape-from-Texture approaches produce erroneous results in regions of little or non-uniform texture. Similarly, Shape-from-Focus methods, which often use image variance measurements to estimate changes in camera geometry, also fail in such regions. Meanwhile, the surgical environment itself restricts the reconstruction approaches that may be employed. For example, methods that project a pattern of light onto the scene to improve the correspondence process, or to recover structure by monitoring the distortion of the projected patterns (structured lighting), are undesirable: structured light in the visible spectrum alters the surgeon's view of the operating field and also requires the introduction of additional equipment.
     The reconstruction methods most naturally suited to the operating theatre and to the characteristics of MIS images are SFM and Shape-from-Shading (SFS). For SFM, multiple images may be acquired by a single moving camera or by a stereo endoscope, and the structure of the scene reconstructed, up to an unknown scale factor, with reference to the camera position. SFS, on the other hand, uses single images to infer the shape (surface gradient or normal) of surfaces in the scene.
     SFM reconstruction is straightforward given accurate correspondence across different views and camera calibration. Determining corresponding image primitives (projections of the same real-world feature) is a widely studied problem in Computer Vision known as stereo matching or stereo correspondence [42,43]. Difficulties arise when the assumptions commonly used to constrain the problem fail. For example, similarity metrics such as Normalized Cross-Correlation perform poorly in homogeneous image regions, while smoothness restrictions (assumptions that the scene is formed of piecewise smooth surfaces) may fail to preserve object boundaries. Camera calibration, on the other hand, deals with inferring the intrinsic and extrinsic camera parameters. Offline methods using calibration objects of known geometry are accurate and well documented [44,45,46]. However, for many applications the parameters may vary during image acquisition (for example, the surgeon may wish to change focus during an MIS procedure).
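To make the homogeneous-region weakness of similarity metrics concrete, the sketch below scores candidate disparities along a rectified scanline with Normalized Cross-Correlation. The function names and window parameters are illustrative, not taken from any system described here.

```python
def ncc(a, b, eps=1e-12):
    """Normalized cross-correlation of two equal-length intensity windows.
    Scores lie in [-1, 1]; in homogeneous regions both variances vanish
    and the score degenerates to noise - the failure mode noted above."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    da = [x - ma for x in a]
    db = [y - mb for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = (sum(x * x for x in da) * sum(y * y for y in db)) ** 0.5
    return num / (den + eps)

def best_disparity(left_row, right_row, col, half, max_disp):
    """For a pixel in the left scanline of a rectified pair, return the
    disparity d whose right-scanline window maximizes NCC."""
    ref = left_row[col - half:col + half + 1]
    best, best_score = 0, float("-inf")
    for d in range(max_disp + 1):
        c = col - d
        if c - half < 0:          # window would leave the image
            break
        score = ncc(ref, right_row[c - half:c + half + 1])
        if score > best_score:
            best, best_score = d, score
    return best
```

When the window falls on a textureless region, every candidate window has near-zero variance and the ranking of scores is meaningless, which is why smoothness constraints or complementary cues such as shading become necessary.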
     Techniques performing calibration without calibration objects (called self-calibration or autocalibration) rely heavily on scene rigidity, constant intrinsic parameters and a wide range of available camera motion [47]. Once correspondence and calibration are given, back-projecting rays to find their intersection in space yields the scene's structure; this process is known as triangulation [48].
     Unlike SFM, which recovers the position of scene features, SFS techniques recover the shape of the surfaces in the scene. The foundations of the SFS problem formulation lie in the image irradiance equation, which describes image brightness as a function of the projected surface, the light source and the camera properties [49]. However, solving the image irradiance equation for shape directly is not feasible, given the number of unknown variables and the fact that the only source of information at each image point is its intensity. Therefore, SFS techniques proposed in the literature seek a solution to a simplified problem. Almost universally used restrictions are those of Lambertian surface reflectance, an infinitely distant light source and orthographic projection. Even so, the only source of information is the image brightness, while the description of a surface requires more than one parameter; hence, further constraints are typically incorporated into the problem formulation to resolve the ambiguity [50].

Figure 6. Structure computed using SFS source code described in [42].

     The difficulties associated with MIS images lie in the violation of the formulating assumptions used by structure recovery techniques. Multiple-view approaches experience difficulties in establishing correspondence due to possible deformation of the operating field; for the same reason, self-calibration methods are difficult to use when dealing with varying camera parameters. SFS techniques, on the other hand, suffer from the complex reflectance properties of surfaces, which do not allow easy simplification of the image irradiance equation. The restriction often imposed on the light source position was changed for endoscopic images by [15], who assumed coinciding light source and camera centre positions. This work was further developed to incorporate camera distortion and a term for specular reflection in [51].
     It has long been recognized that, in order to apply structure recovery techniques to complex practical situations, ways of integrating or assimilating different methods must be considered [52,53]. The complementary nature of SFM and SFS was first noted by [53], and since then approaches based on Bayesian networks [54,55], Markov Random Fields [56], game theory [57], lattice theory [58] and variational frameworks [59] have been documented. The main ideas explored are the interaction between individual modules through explicit and implicit data exchanges, the combination of results from different cues at different frequencies, and the segmentation of images by some criterion to determine the suitability of each reconstruction method for specific regions. However, since the task of integration is difficult in a general formulation, no single approach works well across a range of test images; it may well be that different problem formulations are better suited to specific applications than others.
     For endoscopic images, integrating the stereo and shading cues appears to be almost essential. SFM is difficult for images where correspondence is unreliable, as is the case when large homogeneous regions are present. Meanwhile, SFS techniques typically require starting solutions, smooth surfaces and simple reflectance, all of which may be aided by SFM, while SFS contributes to recovering shape in texture-less regions. This is the complementary nature of SFS and SFM, and an integration system effectively utilizing the strengths of each cue would prove an indispensable tool for accurate structure computation.

5. Conclusions

     Medical simulation is an actively pursued subject that requires extensive collaboration between the technical and medical communities, and it offers unique opportunities and challenges to emerging IBMR techniques. It is likely that, as they develop, certain branches of IBMR will evolve to be particularly suited to soft tissue deformation and modelling - a fundamental challenge to current VR-based simulation systems, where the lack of high fidelity in visual and tactile feedback prohibits their wider use in advanced skills training and assessment. In this review we have highlighted some of the important techniques in IBMR with regard to improving visual realism in surgical simulation systems. Such techniques are also of benefit to surgical planning and diagnosis, and especially to IGS. In particular, structure recovery methods from images offer unparalleled opportunities for improving the registration of different modalities and the visualization of AR in IGS.
References

[1]. Virtual Reality in Medicine: A Survey of the State of the Art.
[2]. D. Dey, P. J. Slomka, D. G. Gobbi and T. M. Peters. Mixed reality merging of endoscopic images and 3-D surfaces. In Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2000.
[3]. M. de Waal. Image-guided surgery of the spine. Medica Mundi, vol. 42, 1998.
[4]. M. J. Mack. Minimally Invasive and Robotic Surgery. The Journal of the American Medical Association, vol. 285, pp. 568-572, 2001.
[5]. R. M. Satava and S. B. Jones. Current and future applications of virtual reality for medicine. Proceedings of the IEEE, vol. 86, pp. 484-489, 1998.
[6]. X. Provot. Deformation Constraints in a Mass-Spring Model to Describe Rigid Cloth Behavior. In Proceedings of Graphics Interface 1995, pp. 147-154, 1995.
[7]. U. Kühnapfel, H. K. Çakmak and H. Maaß. 3D Modeling for Endoscopic Surgery. Oct. 1999.
[8]. S. Roth, M. Gross, S. Turello and F. Carls. A Bernstein-Bézier based approach to soft tissue simulation. In Proceedings of Eurographics 1998, Sept. 1998.
[9]. M. Bro-Nielsen and S. Cotin. Real-time volumetric deformable models for surgery simulation using finite elements and condensation. Computer Graphics Forum, vol. 15, pp. 57-66, Sept. 1996.
[10]. D. Bielser and M. Gross. Open surgery simulation. In Proceedings of Medicine Meets Virtual Reality, 2002.
[11]. P. Cignoni, F. Ganovelli and R. Scopigno. Introducing Multiresolution Representation in Deformable Modeling. In SCCG, pp. 149-158, Apr. 1999.
[12]. G. Debunne, M. Desbrun, M.-P. Cani and A. H. Barr. Adaptive Simulation of Soft Bodies in Real-Time. In CA, pp. 15-20, 2000.
[13]. X. M. Wu. Adaptive Nonlinear Finite Elements for Deformable Body Simulation Using Dynamic Progressive Meshes. In Proceedings of Eurographics 2001, pp. 439-448, 2001.
[14]. C. Paloc, F. Bello, R. I. Kitney and A. Darzi. Online multiresolution volumetric mass spring model for real time soft tissue deformation. In MICCAI 2002, pp. 219-226, 2002.
[15]. T. Okatani and K. Deguchi. Shape Reconstruction from an Endoscope Image by Shape from Shading Technique for a Point Light Source at the Projection Centre. Computer Vision and Image Understanding, vol. 66, pp. 119-131, 1997.
[16]. I. Serlie, F. M. Vos, R. E. van Gelder, J. Stoker, R. Truyen, Y. Nio and F. H. Post. Improved visualization in virtual colonoscopy using image based rendering. In Proceedings of the Joint Eurographics - IEEE TCVG Symposium on Visualization (VisSym-01), 2001.
[17]. E. Coste-Manière, L. Adhami, R. Severac-Bastide, A. Lobontiu, J. K. Salisbury, J.-D. Boissonnat, N. Swarup, G. Guthart, E. Mousseaux and A. Carpentier. Optimized Port Placement for the Totally Endoscopic Coronary Artery Bypass Grafting using the da Vinci Robotic System. In Proceedings of ISER, 2000.
[18]. D. Dey, D. Gobbi, P. Slomka, K. Surry and T. Peters. Automatic fusion of freehand endoscopic brain images to three-dimensional surfaces: Creating stereoscopic panoramas. IEEE Transactions on Medical Imaging, vol. 21, pp. 23-30, 2000.
[19]. F. Gerritsen, M. Breeuwer, H. de Bliek, J. Buurman and P. Desmedt. Image-guided surgery - the EASI project. Presented at The Surgery Room of the 21st Century, Moat House Hotel, Glasgow, Scotland, 1999.
[20]. A. P. King, P. J. E. M. R. Pike, G. L. G. Hill and D. J. Hawkes. An analysis of calibration and registration errors in an augmented reality system for microscope-assisted guided interventions. In Medical Image Understanding and Analysis, 1999.
[21]. A. Tei. Multi-modality image fusion by real-time tracking of volumetric brain deformation during image guided neurosurgery. MIT, 2002.
[22]. D. Rueckert, C. Hayes, C. Studholme, P. Summers, M. Leach and D. J. Hawkes. Non-rigid registration of breast MR images using mutual information. Lecture Notes in Computer Science, vol. 2208, 1998.
[23]. T. Gaens, F. Maes, D. Vandermeulen and P. Suetens. Non-rigid multimodal image registration using mutual information. Lecture Notes in Computer Science, vol. 1496, pp. 1099-, 1998.
[24]. J. Feldmar, N. Ayache and F. Betting. 3D-2D projective registration of free-form curves and surfaces. INRIA Technical Report RR-2434.
[25]. A. Liu, E. Bullitt and S. M. Pizer. Registration via skeletal near projective invariance in tubular objects. Lecture Notes in Computer Science, vol. 1496, pp. 952-.
[26]. C. M. Cyr, T. Sebastian and B. B. Kimia. 2D-3D registration based on shape matching. In IEEE Workshop on Mathematical Methods in Biomedical Image Analysis.
[27]. J. B. A. Maintz. An overview of medical image registration methods. In Symposium of the Belgian Hospital Physicists Association (SBPH-BVZF), vol. 12, pp. 1-22, 1997.
[28]. H.-Y. Shum and S. B. Kang. A Review of Image-based Rendering Techniques. In Visual Communications and Image Processing, IEEE/SPIE, pp. 2-13, 2000.
[29]. M. Levoy and P. Hanrahan. Light Field Rendering. In Proceedings of SIGGRAPH 1996, pp. 31-42, 1996.
[30]. S. Chen and L. Williams. View interpolation for image synthesis. Computer Graphics (SIGGRAPH '93), pp. 279-288, 1993.
[31]. J. F. Blinn and M. E. Newell. Texture and Reflection in Computer Generated Images. Communications of the ACM, vol. 19, pp. 542-547, 1976.
[32]. A. Watt. 3D Computer Graphics. 2000.
[33]. N. Greene. Environment mapping and other applications of world projections. IEEE Computer Graphics and Applications, vol. 6, pp. 21-29, 1986.
[34]. D. Wagner, R. Wegenkittl and E. Gröller. Endoview: A phantom study of a tracked virtual bronchoscopy. Journal of WSCG, vol. 10, 2002.
[35]. I. W. O. Serlie, F. M. Vos, R. van Gelder et al. Improved visualization in virtual colonoscopy using image-based rendering. In Proceedings of ACM/Eurographics VisSym, 2001.
[36]. L. McMillan. An Image-Based Approach to Three-Dimensional Computer Graphics. UNC, 1997.
[37]. M. Oliveira, G. Bishop and D. McAllister. Relief Texture Mapping. In Proceedings of SIGGRAPH 2000, pp. 231-242, 2000.
[38]. M. ElHelw and G.-Z. Yang. Cylindrical Relief Texture Mapping. Journal of WSCG, vol. 11, 2003.
[39]. W. Mark. Post-Rendering 3D Image Warping: Visibility, Reconstruction, and Performance for Depth-Image Warping. University of North Carolina at Chapel Hill, 1999.
[40]. O. Faugeras. Three-Dimensional Computer Vision: A Geometric Viewpoint. 1993.
[41]. R. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. 2000.
[42]. U. Dhond and J. Aggarwal. Structure from stereo - a review. IEEE Transactions on Systems, Man, and Cybernetics, vol. 19, pp. 1489-1510, 1989.
[43]. D. Scharstein and R. Szeliski. A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms. International Journal of Computer Vision, vol. 47, pp. 7-42, 2002.
[44]. R. Y. Tsai. A Versatile Camera Calibration Technique for High-Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses. IEEE Journal of Robotics and Automation, vol. RA-3, pp. 323-344, 1987.
[45]. R. Y. Tsai. An Efficient and Accurate Camera Calibration Technique for 3D Machine Vision. In Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 1986.
[46]. Z. Zhang. A flexible new technique for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, pp. 1330-1334, 2000.
[47]. A. Fusiello. Uncalibrated Euclidean reconstruction: a review. Image and Vision Computing, vol. 18, pp. 555-563, 2000.
[48]. R. Hartley and P. Sturm. Triangulation. In Proceedings of the ARPA Image Understanding Workshop, 1994.
[49]. B. K. P. Horn and M. J. Brooks. Shape from Shading.
[50]. R. Zhang, P.-S. Tsai, J. E. Cryer and M. Shah. Shape from Shading: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, 1999.
[51]. C. H. Q. Forster and C. Tozzi. Towards 3D Reconstruction of Endoscope Images Using Shape from Shading. In Proceedings of the XIII Brazilian Symposium on Computer Graphics and Image Processing, 2000.
[52]. D. Marr. Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. 1982.
[53]. A. Blake, A. Zisserman and G. Knowles. Surface descriptions from stereo and shading. Image and Vision Computing, vol. 3, pp. 183-191, 1985.
[54]. S. Sarkar and K. L. Boyer. Perceptual organisation using Bayesian networks. In Conference on Computer Vision and Pattern Recognition, IEEE, 1992.
[55]. S. Pankanti and A. K. Jain. A uniform Bayesian framework for integration. In International Symposium on Computer Vision, 1995.
[56]. J. J. Little. Integrating Vision Modules at Discontinuities. In 12th Canadian Symposium on Remote Sensing, IGARSS '89, vol. 3, pp. 1260-1263, 1989.
[57]. H. I. Bozma and J. S. Duncan. Integration of Vision Modules: A Game-Theoretic Framework. In Conference on Computer Vision and Pattern Recognition, IEEE, 1991.
[58]. A. Jepson and W. Richards. A Lattice Framework for Integrating Vision Modules. IEEE Transactions on Systems, Man and Cybernetics, vol. 22, 1992.
[59]. J. Shah, H. H. Pien and J. M. Gauch. Recovery of Surfaces with Discontinuities by Fusing Shading and Range Data Within a Variational Framework. IEEE Transactions on Image Processing, vol. 5, pp. 1243-1251, 1996.
