                               Introduction to easyVision
                        (in construction, February 10, 2009)


                                             Alberto Ruiz
                                       University of Murcia, Spain


1    Introduction
This document is a tutorial for easyVision, a collection of Haskell libraries for rapid prototyping of
simple computer vision applications. Performance is not compromised because low level expensive
computations are internally implemented by optimized libraries (IPP, HOpenGL, BLAS/LAPACK,
etc.). Once appropriate geometric primitives have been extracted by the image processing wrappers
we take advantage of Haskell’s expressive power to define interesting computations using elegant
functional constructions.
    This project is extremely preliminary, incomplete, and under active development, so I do not rec-
ommend it for serious applications. More advanced programming techniques should be used in some
places, and many of the example programs in the distribution are quick experiments and proofs of
concept that should be rewritten in a better coding style. In any case, I successfully use this system for
my everyday work; I find it particularly useful for experimenting with new ideas, preparing teaching
demos, and even for developing more complex research prototypes.
    The library contains four principal sections: ImagProc for image processing and extraction of
medium level geometric primitives, mainly supported by wrappers to IPP; Vision, for visual ge-
ometry algorithms, based on hmatrix; Classifier, for basic pattern recognition and learning ma-
chines, also based on hmatrix; and EasyVision, for visualization and user interface, supported by
HOpenGL. Full information about the available functions can be found in the online haddock docu-
mentation.
    We are interested in a purely functional approach to image processing and computer vision. There
is nothing special about images: they are like any other Haskell data type, and we work with them
using appropriate functions supplied by the library. We can process potentially infinite lists of images
taken from video sources, and save the results for further analysis or visualization.
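    For instance, since a video is just a lazy (possibly infinite) list of frames, a frame-to-frame
difference is a one-line list function (a sketch using the |-| image subtraction operator that appears
later in this tutorial):

      -- difference between consecutive frames, on a possibly infinite list
      motion :: [ImageFloat] -> [ImageFloat]
      motion xs = zipWith (|-|) (tail xs) xs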
    However, in most vision applications we need a convenient infrastructure to observe desired in-
termediate results and modify certain parameters of the computations ‘on the fly’. This is crucial for
experimentation and development of nontrivial programs. Specifically, we must ‘embed’ the required
pure functions in the IO loop required by a responsive graphic user interface. This library provides
suitable combinators to easily define processing pipelines with the following general scheme:
      camera >>= process1 >>= observe >>= process2 >>= observe ... >>= save
    Notice that in general these processes will work on the whole sequence of images, while visual-
ization and user interaction will work online, on a frame-by-frame basis.
    In practice, some processing stages depend on some kind of explicit state, which may be modified
by user interaction. This behavior can also be defined by appropriate functions in the library. More
complex applications are not so easily described as a linear processing sequence; a more general graph
of asynchronous processes is required. They can also be defined using Haskell’s support for concurrent
programming, as described in Section X.
    An important issue is image grabbing. Since we are interested in real-time computer vision, we
will mainly focus on image sequences taken from live cameras or test videos. We currently delegate the

difficult task of video capture and decoding to mplayer [ref], since this useful program understands
almost any kind of video source and is able to convert it into a convenient format for further processing
through an OS pipe. Single images in typical formats like png or jpeg can also be read with the help
of mplayer.
    Before starting you should check that all required libraries have been correctly installed. See
the web page of the project for details. Different versions of a prototype, or related programs for a
certain application, are conveniently organized in a separate folder with an appropriate Makefile and
README. The Makefile is just a simple template which includes general building instructions. For
instance, for this tutorial we have created the folder easyVision/compvis/tutorial with the
following Makefile:
          EASYVISION = ../..
          include ../Makefile.include
    Now we can just run make prog to build prog.hs. We may also set the symbol ALL to the desired
targets to be built by a generic make, as in the example below. See any subproject under compvis for details.
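For instance, a version of the above Makefile that builds the first programs of this tutorial with a
plain make could look like this (the target list is just an example):

          EASYVISION = ../..
          ALL = play play2 canny
          include ../Makefile.include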


2        A simple video player
The analog of the typical Hello world! program in easyVision is a simple video player¹:

                                            – play.hs –
import EasyVision
import Tutorial (run, camera, observe)

main = run $ camera >>= observe "Video" rgb

        The image source specified in the command line will be shown in a ‘live’ window:
          $ make play
          $ ./play road.avi




    This is just a local video file, but the program admits a more general URL, which may also be a
remote video, any alias in cameras.def, or even explicit command line arguments (quoted) under-
stood by mplayer. For instance, we can also work with the following image sources:
$ ./play http://perception.inf.um.es/public_data/videos/planar/rot3.avi
$ ./play 'tv:// -tv driver=v4l2:device=/dev/video0:fps=25'
$ ./play


    When no argument is supplied the program uses the default image source defined in cameras.def,
typically a webcam connected to the computer.
    If we run the program with a wrong URL we get the following message:
      $ ./play doesnt_exist.avi
      Press Ctrl-C and check the URL
   ¹ For clarity, some auxiliary functions defined in the Tutorial module will be described later.


    This message may also appear for a few seconds while we try to open a correct URL. It will
eventually be overwritten by some information about the auxiliary stream generated by mplayer:
      $ ./play nicevideo.avi
      YUV4MPEG2 W640 H480 F25:1 Ip A1:1
    The size of the images can also be given in the command line in terms of rows and columns, or as
an integer multiplier of 24 rows × 32 columns. For instance:
      $ ./play [url] --rows=100 --cols=300
      $ ./play [url] --size=12
   The default size is 20 (i.e., 20 × 24 = 480 rows by 20 × 32 = 640 columns, or 640×480).

   The above player can be easily modified to show the effect of any image processing function. For
instance, this is the effect of a Gaussian filter with σ = 5.7 pixels:

                                            – play2.hs –
import EasyVision
import Tutorial (run, camera, observe)

f = gaussS 5.7 . float . gray

main = run (camera >>= observe "Gauss" f)




    And this shows the edges computed by Canny’s operator (the final notI inverts the image to save
ink if it is ever printed):

                                            – canny.hs –
import EasyVision
import Tutorial (run, camera, observe)

edges = canny (0.1,0.3) . gradients . gaussS 2 . float . gray

main = run $ camera
           >>= observe "Canny's operator" (notI . edges)




Raw YUV format and channels The raw images generated by the mplayer-based camera are in the
YUV format, which is transformed into a Channels record containing a number of (lazily computed)
useful representations (RGB, gray levels, HSV, etc.). In the previous examples we select the gray
(monochrome) channel from this record and convert it into the single precision float representation
required by the gaussS filter. You can take a look at the on-line documentation and try out other
image processing functions.
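
The idea behind Channels can be sketched as follows (a sketch: only the rgb and gray selectors
appear in this tutorial; the remaining field names and types are illustrative):

      -- every representation is an ordinary lazy field, computed from
      -- the raw YUV data only if some pipeline stage demands it
      data Channels = Channels
          { rgb  :: ImageRGB    -- color version
          , gray :: ImageGray   -- monochrome version
          , hsv  :: ImageRGB    -- HSV color space
          }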

Regions of Interest Each image has an associated rectangular region of interest (ROI). Outside its
ROI the image is undefined. Note that the result of the Gaussian filter shown above has a black ‘frame’.
The reason is that the drawImage function used by observe only shows the ROI of the image, on
an initially black window. Most of the convolution filters currently available obtain a reduced ROI
because the border pixels do not admit a full mask around them. (Alternative implementations that
also process the border pixels could be added in the future.) An important feature of the library is that
all image processing functions automatically take into account the ROIs of their arguments to produce the
biggest ROI that makes sense in the result. Furthermore, we can freely modify the ROI of an image,
because it is an immutable object and pixel data will be shared. (Bad things may happen if you use
low level, unsafe functions to read pixels outside the ROI.)
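
For instance, restricting an image to the top half of its ROI is just a pure update (a sketch assuming
that ROI is a record of row and column bounds; modifyROI is used in later sections of this tutorial):

      -- pixel data is shared, so this is a constant-time operation
      topHalf = modifyROI half
          where half (ROI r1 r2 c1 c2) = ROI r1 ((r1 + r2) `div` 2) c1 c2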

   We can also display both the original video and the result of a given function in separate windows
for comparison purposes. The next example uses the (~>) operator, which applies a given function to
all the objects produced by a camera:

                                            – play3.hs –
import EasyVision
import Tutorial (run, camera, observe)

main = run $ camera
           >>= observe "original" rgb
           ~> highPass8u Mask5x5 . median Mask5x5 . gray
           >>= observe "high-pass filter" id




3    Camera combinators
Operations on single images are useful, but we are actually interested in a more powerful processing
approach using standard Haskell list functions on the whole, potentially infinite, lazy sequence of input
images. A ‘virtual camera’ is the result of a ‘camera combinator’, which applies some function to the
whole input list and lazily produces a transformed stream of objects. It is defined with the help of
the ~~> operator, where the double ‘~’ indicates that the function is applied to the whole list. The
previous (~>) operator is just a shortcut for a simple map, implemented with minimum overhead.
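    The following sketch shows one way these combinators can be implemented (the library versions
are more careful about pausing and window state, but the idea is the same): a camera is an IO action
that creates a grabbing action, (~>) maps a function over each grabbed frame, and (~~>) turns the
grabber into a lazy list of all future frames, applies the given list function, and serves the results
one by one:

      import Data.IORef
      import System.IO.Unsafe (unsafeInterleaveIO)

      infixl 1 ~>, ~~>

      (~>) :: IO (IO a) -> (a -> b) -> IO (IO b)
      gencam ~> f = fmap (fmap f) gencam

      (~~>) :: IO (IO a) -> ([a] -> [b]) -> IO (IO b)
      gencam ~~> f = do
          cam <- gencam
          xs  <- lazyGrab cam
          r   <- newIORef (f xs)
          return $ do
              y:ys <- readIORef r
              writeIORef r ys
              return y
        where
          -- lazily build the (infinite) list of future frames
          lazyGrab c = unsafeInterleaveIO $ do
              x  <- c
              xs <- lazyGrab c
              return (x:xs)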
    Let us see a few examples. First we will transform the input sequence into a sequence of 5x5 arrays
of small images. If the original video is captured at 25fps the result is a ‘grid’ view at 1fps.




                                            – combi0.hs –
import EasyVision
import Tutorial (camera, observe, run)

grid n = map (blockImage . splitEvery n) . splitEvery (n*n) . map (resize (mpSize 4))
    where splitEvery _ [] = []
          splitEvery k l  = take k l : splitEvery k (drop k l)

main = run $ camera ~> rgb
           ~~> grid 5
           >>= observe "grid" id




   We can even create a grid of grids:

                                            – combi1.hs –
import EasyVision
import Tutorial (camera, observe, run, grid)

main = run $ camera ~> rgb
           >>= observe "original" id
           ~~> grid 2
           >>= observe "first grid" id
           ~~> grid 3
           >>= observe "second grid" id



  The next program computes a discounted sum of images, blurring moving objects. Again, the key
process is a pure function on a list of images:

                                            – combi2.hs –
import EasyVision
import Tutorial (camera, observe, run)

drift r (a:b:xs) = a : drift r ((r .* a |+| (1-r) .* b) : xs)

main = run $ camera ~> float . gray ~~> drift 0.9 >>= observe "drift" id




   We can also insert artificial images in the sequence. Pixel values can be inspected using a zoom
window available in the library:

                                            – combi3.hs –
import EasyVision
import Tutorial (camera, observe, run)

drift r (a:b:xs) = a : drift r ((r .* a |+| (1-r) .* b) : xs)

interpolate (a:b:xs) = a : (0.5 .* a |+| 0.5 .* b) : interpolate (b:xs)

main = run $ camera ~> float . gray
         ~~> drift 0.9
         >>= observe "drift" id
         ~~> interpolate
         >>= zoomWindow "zoom" 600 toGray




4    Offline processing
In many applications we are not only interested in visualization, but also in saving the results for further
processing or study. The following program computes the edges of a sequence of images and saves
them in an output video file. It works ‘silently’, without any visualization window. The process
function is just the analog, for image sequences, of the standard interact function in the Haskell
Prelude:

                                              – offline0.hs –
import EasyVision
import Tutorial (process)

edges = notI . canny (0.05,0.2) . gradients . gaussS 2 . float . gray

main = process (map edges)



   The video source and the desired output file name are given in the command line. We run it with the
mplayer arguments -benchmark and -loop 1 to override the real-time, continuous loop default
behavior:
      $ ./offline0 'road.avi -benchmark -loop 1' --save=canny.yuv
   The output video is created in the uncompressed yuv4mpeg video format used by mplayer. We
may wish to convert it to a more standard format:
      $ mencoder canny.yuv -o canny.avi -ovc lavc -fps 25
   Both processes can be done at the same time using a pipe:
      $ ./offline0 'video.avi -benchmark -loop 1' --save=/dev/stdout \
        | mencoder - -o canny.avi -ovc lavc -fps 25
   Obviously, this kind of ‘blind’ processing is not practical for experimentation or development of
more complex applications. The same result can be achieved inside the GUI using the saveFrame
combinator. Again, the output filename is taken from the command line and the video format is
mplayer’s yuv4mpeg. In the next example we create a video which shows the local history of the
edges in the input source:

                                            – offline.hs –
import EasyVision
import Data.List (foldl1', tails)
import Tutorial (run, camera, observe, saveFrame)

edges = canny (0.05,0.2) . gradients . gaussS 2 . float . gray

history k = map (notI . foldl1' orI . reverse . take k) . tail . tails

main = run $ camera
           >>= observe "original" rgb
           ~~> history 5 . map edges
           >>= observe "edge history" id
           >>= saveFrame toYUV




    We can also define a combinator to save higher level properties extracted from a video. The next
example writes the locations of the detected corners in each image on successive lines of an output text file:




                                            – points.hs –
import EasyVision
import Graphics.UI.GLUT
import Control.Arrow
import Tutorial (run, camera)

save filename f cam = do
    writeFile filename ""
    return $ do
        x <- cam
        appendFile filename (show (f x) ++ "\n")
        return x

sh (im, pts) = do
    drawImage im
    pointSize $= 5; setColor 1 0 0
    renderPrimitive Points $ mapM_ vertex pts

salience s1 s2 = gaussS' 2.5 s2 . sqrt32f . abs32f . hessian . gradients . gaussS' 2.5 s1

thres r im = thresholdVal32f (mx*r) 0 IppCmpLess im
    where (_,mx) = EasyVision.minmax im

main = run $ camera ~> gray
           ~> id &&& (getPoints32f 300 . localMax 1 . thres 0.5 . salience 2 4 . float)
           >>= monitor "Corners" (mpSize 20) sh
           >>= save "points.txt" snd




         $ cat points.txt
         [Pixel {row = 75, col = 148},Pixel {row = 77, col = 188}, etc. ]
         [Pixel {row = 80, col = 235},Pixel {row = 82, col = 276}, etc. ]
         etc.


5   Low level details
In order to describe the essential ingredients of a typical easyVision application, we will rewrite the
simple player in Section X using low level constructions. The following program opens an image
source, then creates a graphical window, and finally sets up a callback which repeatedly grabs a new
image and shows it.




                                            – playll.hs –
import EasyVision

main = do
    prepare
    sz <- findSize
    c  <- getCam 0 sz
    w  <- evWindow () "simple player" sz Nothing (const kbdQuit)
    launch (worker c w)

worker cam win = do
    img <- cam
    inWin win $ do
        drawImage img

   Let us analyze the code step by step:

    - We first prepare the OpenGL system and initialize the user interface.
    - Then we open a “camera” (a general image sequence). The source selection method described
      above is automatically provided by the function getCam. In this example we only need one
      camera, so we open the first one (index zero) from the URL given in the command line (or
      the default one). An image Size is required, which is typically obtained from the optional
      command line argument captured by findSize.
    - The evWindow function returns a reference to an OpenGL window with some additional infor-
      mation. The first argument is the initial value of a “state” maintained by the window. In many
      cases we don’t need any state, so we use (). The second argument is the window title, and
      the third one is the window size. We typically use the same size for the image source and the
      window. Then we can supply an optional “callback” to automatically perform some tasks on
      “redisplay” events. In many cases this is not required, so we pass Nothing. The last argu-
      ment is the mouse/keyboard handler for convenient user interaction through the window state
      mentioned above. In this example we provide a simple predefined function which just exits the
      program when the user presses ESC.
    - Finally, we launch the “worker”, which will be repeatedly called as fast as possible. In this
      case it simply grabs the next frame from the camera, sets the graphic “context” in the desired
      window (here we have only one), and shows the image in it.

Program speed The number of frames per second achieved by the program depends on the frame
rate of the image source and the amount of computation carried out by the worker. A simple player like
this has a very low CPU load. Decoding a .avi video or grabbing from a webcam is not expensive,
and hence most of the time the application is blocked waiting for a new frame. (Decoding DV format
is much more expensive.)

   We can ask mplayer to change the fps of the image source:
      $ ./play 'video.avi -fps 3'
and also ask for the maximum possible speed:
      $ ./play 'video.avi -benchmark'
    This is useful for off-line applications which must process a possibly long video, where it makes
no sense to work in real time. These options may not work with real cameras; for instance, the fps of a
webcam must be selected in the driver options, and only a limited set of values may be allowed.
    We can also modify the program speed by setting an explicit frequency for the worker using
launchFreq fps worker.
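For instance, a minimal variant of the player of this section, with the worker limited to 10 calls per
second, could be written as follows (a sketch reusing worker from playll.hs above):

      main = do
          prepare
          sz <- findSize
          c  <- getCam 0 sz
          w  <- evWindow () "slow player" sz Nothing (const kbdQuit)
          launchFreq 10 (worker c w)   -- the worker runs at 10 fps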

    In live video applications the worker must be fast enough to finish before the next frame is avail-
able. Otherwise, we will get unacceptable delays and random video interruptions. At the normal video
rate of 25 fps we have less than 40ms to process each frame. If the worker is expensive it may be nec-
essary to reduce the image size or the sampling rate (a slower frequency of 10 fps may be acceptable
in some cases).
    In section X we will explain how to discard ‘obsolete’ frames from a camera, so that it always
returns the most recent one, independently of the time required by the worker.

Implementation of the auxiliary functions A number of frequently used tools are available in the
library, including 3D windows with mouse control of viewpoint, zooming windows, etc. However,
simpler functions like camera, run, or observe are better defined on the fly to suit the specific
requirements of each application.

    - A camera is just the first image source, with the size taken from the command line, and ready
      to be used in the desired image format:
            camera = findSize >>= getCam 0 ~> channels

    - The run function prepares the OpenGL system and sets up a callback which repeatedly reads
      from the given camera, discarding the result:
            run createCam = do
                prepare
                cam <- createCam
                launch $ do
                    cam
                    return ()

      or, more concisely,
            run c = prepare >> (c >>= launch . (>> return ()))

      The only task for the worker is to grab another ‘image’. Of course, if the camera has been
      defined in terms of complex combinators, then a simple grab may produce a large amount of
      computation, possibly involving many frames in the input sequence.

    - A monitor is a camera combinator which simply passes along the argument unmodified, ren-
      dering a given OpenGL action in a window. It may be thought of as a graphical trace.
            monitor name sz fun cam = do
                w <- evWindow () name sz Nothing (const kbdQuit)
                return $ do
                    x <- cam
                    inWin w (fun x)
                    return x

      (The actual definition in the library includes a keyboard control to pause the image sequence.)
      The observe function is just a shortcut for monitor:
            observe winname f = monitor winname (mpSize 20) (drawImage.f)

    - The process function (equivalent to interact for video sources) can be defined as follows:
            process f = do
                outfile <- optionString "--save" "saved.yuv"
                xs <- readFrames 0
                let ys = map toYUV . f . map channels $ xs
                writeFrames outfile ys
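
Putting the pieces together, the Tutorial module imported throughout this document is essentially
the following (a sketch; the actual module also exports saveFrame and other helpers whose definitions
are not shown here):

            module Tutorial (run, camera, observe, process, grid) where

            import EasyVision

            -- first image source, command line size, converted to Channels
            camera = findSize >>= getCam 0 ~> channels

            -- prepare the GUI and repeatedly grab, discarding the result
            run c = prepare >> (c >>= launch . (>> return ()))

            -- show the result of f on each frame in a titled window
            observe winname f = monitor winname (mpSize 20) (drawImage . f)

            -- offline analog of interact for video sources
            process f = do
                outfile <- optionString "--save" "saved.yuv"
                xs <- readFrames 0
                let ys = map toYUV . f . map channels $ xs
                writeFrames outfile ys

            -- n x n mosaics of reduced frames (Section 3)
            grid n = map (blockImage . splitEvery n) . splitEvery (n*n) . map (resize (mpSize 4))
                where splitEvery _ [] = []
                      splitEvery k l  = take k l : splitEvery k (drop k l)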




6   State
The previous examples show that application state is easily handled in a pure way by using ‘camera
combinators’ defined by appropriate list functions. However, in some cases it is also convenient to
keep some explicit mutable state between successive calls to the worker (for instance, this is useful
when computation depends on user interaction). There are several approaches to solve this (e.g.,
monad transformers to stack state and IO, etc.), but currently we have chosen a very simple approach
based on IORefs. Each evWindow keeps a state of the desired type which can be read and modified
by the worker and the keyboard/mouse callback.
    For instance, we can augment monitor to show the number of displayed frames in each window:

                                            – combi4.hs –
import EasyVision
import Tutorial (run, camera)

observe winname = monitor' winname (mpSize 20) drawImage

drift r (a:b:xs) = a : drift r ((r .* a |+| (1-r) .* b) : xs)
interpolate (a:b:xs) = a : (0.5 .* a |+| 0.5 .* b) : interpolate (b:xs)

main = do
    prepare
    alpha <- getOption "--alpha" 0.9
    run $ camera ~> float . gray
      ~~> drift alpha
      >>= observe "drift"
      ~~> interpolate
      >>= observe "interpolate"

monitor' name sz fun cam = do
    w <- evWindow 0 name sz Nothing (const kbdQuit)
    return $ do
        thing <- cam
        n <- getW w
        inWin w $ do
            fun thing
            text2D 20 20 (show n)
        putW w (n+1)
        return thing

    If we run this program we see that the frame rate of the "interpolate" window is twice
that of the "drift" one. Note also that the parameter alpha for the discounted sum is cap-
tured by getOption as an optional command line argument. For instance, using --alpha=0.99
we get a much stronger averaging.

  The next example shows the absolute difference between the current (gray) image grabbed from the
camera and a fixed one (e.g., some kind of background), selected by the user by pressing the key ‘s’. The
window state keeps the current background image and any pending user request for a new one.




                                            – state.hs –
import EasyVision
import Graphics.UI.GLUT

main = do
    sz <- findSize
    prepare
    cam <- getCam 0 sz ~> channels
    w <- evWindow (True,undefined) "bg diff" sz Nothing (mouse kbdQuit)
    launch $ do
        img <- fmap gray cam
        (rec,bg) <- getW w
        if rec
            then putW w (False, img)
            else inWin w $ drawImage $ absDiff8u img bg

mouse _ st (Char 's') Down _ _ = st $= (True,undefined)
mouse def _ a b c d = def a b c d




    The same idea can be expressed as a reusable camera combinator:

                                            – state2.hs –
import EasyVision
import Graphics.UI.GLUT
import Control.Monad(when)
import Tutorial (run, camera)

main = run (camera ~> gray >>= bgDiff)

bgDiff cam = do
    w <- evWindow (True,undefined) "bg diff" (mpSize 20) Nothing (mouse kbdQuit)
    return $ do
        img <- cam
        (rec,_) <- getW w
        when rec (putW w (False, img))
        (_,bg) <- getW w
        let r = absDiff8u img bg
        inWin w $ drawImage r
        return r
  where
    mouse _ st (Char 's') Down _ _ = st $= (True,undefined)
    mouse def _ a b c d = def a b c d

    A more advanced version which takes into account the three RGB channels, followed by thresholding,
binarization, and polyline extraction, can be the basis of more interesting applications like movement
detection or shape recognition from silhouettes; a first step is sketched below.
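
This is a hedged sketch of that first step, using binarize8u and otsuThreshold as they appear in later
sections of this tutorial (polyline extraction could then proceed as in Section 11):

    -- binary mask of the pixels that differ significantly from the background
    movingMask bg img = binarize8u (otsuThreshold d) d
        where d = absDiff8u img bg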


7   Interactive Parameters
User interaction using certain keystrokes is impractical when there are many application parameters.
More convenient ‘parameter windows’ are available in the easyVision graphical interface. They pro-
vide a more intuitive method to change parameters using the mouse. (However, this is currently based
on reading internal string representations; a statically safer implementation must be developed.)
    As a minimal example we can create a camera combinator to compute a Gaussian filter with σ
taken from an interactive parameter window:

                                            – param1.hs –
import EasyVision
import Tutorial (run, camera, observe)

smooth cam = do
    o <- createParameters [("sigma", realParam 3 0 20)]
    return $ do
        x <- cam
        sigma <- getParam o "sigma"
        return (gaussS sigma x)

main = run $ camera ~> float . gray >>= smooth >>= observe "gauss" id




    A more interesting program is the following corner detector based on the local extrema of the Hes-
sian determinant. Strong saddle points (negative determinant) at a given scale and above a threshold
are detected. There are three interactive parameters: the amount of smoothing, the extent of the local
maximum detector, and the strength of the corner response. In a first version we just test the desired
computations directly in the worker, without paying much attention to code structure:




                                            – param3.hs –
import EasyVision
import Graphics.UI.GLUT

main = do
    prepare
    sz <- findSize
    cam <- getCam 0 sz ~> float . gray . channels
    o <- createParameters [("sigma", realParam 3 0 20),
                           ("rad",   intParam 4 1 25),
                           ("thres", realParam 0.6 0 1)]
    w <- evWindow () "corners" sz Nothing (const kbdQuit)
    launch $ do
        img <- cam
        sigma <- getParam o "sigma"
        rad   <- getParam o "rad"
        thres <- getParam o "thres"
        let corners = getPoints32f 100
                    . localMax rad
                    . thresholdVal32f thres 0 IppCmpLess
                    . fixscale
                    . hessian
                    . gradients
                    . gaussS sigma
                    $ img
        inWin w $ do
            drawImage img
            pointSize $= 5; setColor 1 0 0
            renderPrimitive Points $ mapM_ vertex corners

fixscale im = (1/mn) .* im
    where (mn,_) = EasyVision.minmax im




   The essential corners function is based on available image processing primitives and utilities.
Some of them are actually very simple: for example, hessian is based on arithmetic operations
defined on whole images:
      hessian :: Grads -> ImageFloat
      hessian g = gxx g |*| gyy g |-| gxy g |*| gxy g
    We can write a more structured version of the above program by including all the interactive
parameters in a record which is ‘injected’ in the processing chain by a reusable combinator:




                                            – param2.hs –
import EasyVision
import Graphics.UI.GLUT
import Control.Arrow
import Control.Monad
import Tutorial (run, camera)

(.&.) = liftM2 (liftM2 (,))

camera' = camera ~> (gray >>> float)

data Param = Param { sigma :: Float, rad :: Int, thres :: Float }

main = run $ camera' .&. userParam
         ~> fst &&& corners
         >>= monitor "corners" (mpSize 20) sh

corners (x,p) = gaussS (sigma p)
            >>> gradients
            >>> hessian
            >>> fixscale
            >>> thresholdVal32f (thres p) 0 IppCmpLess
            >>> localMax (rad p)
            >>> getPoints32f 100
              $ x

fixscale im = (1/mn) .* im
    where (mn,_) = EasyVision.minmax im

sh (im, pts) = do
    drawImage im
    pointSize $= 5; setColor 1 0 0
    renderPrimitive Points $ mapM_ vertex pts

userParam = do
    o <- createParameters [("sigma", realParam 3 0 20),
                           ("rad",   intParam 4 1 25),
                           ("thres", realParam 0.6 0 1)]
    return $ return Param `ap` getParam o "sigma"
                          `ap` getParam o "rad"
                          `ap` getParam o "thres"

   In this version the original image is paired with the detected corner points since both data elements
may be required in subsequent processing stages (or just for showing them in the monitor). This
program also shows a more elaborate usage of the monitor window, including some OpenGL commands.
   Note that some automatic way to build the userParam combinator (possibly using Template
Haskell) would be extremely useful.

ROI selection TO DO


8    Stand-alone mini-applications
We often need windows to display ‘static’ information, which does not necessarily change each time
a new image is grabbed from the cameras. They can be used to show the contents of a collection
of prototypes, the distribution of classes in feature spaces, and so on. This kind of window is easily
created by supplying a callback for the OpenGL ‘redisplay’ event.


    The following program defines a window constructor watch which admits a certain function and
an image, and shows the result of the function for the value of a parameter controlled with the mouse
wheel.

                                            – simple.hs –
import EasyVision
import Graphics.UI.GLUT
import System(getArgs)

main = do
    sz <- findSize
    file:_ <- getArgs
    prepare
    cam <- mplayer ("mf://" ++ file) sz
    img <- cam

    let x = float . gray . channels $ img

    watch "Image" (const id) img
    watch "3 * 4" (const $ gaussS 3 . gaussS 4) x
    watch "Gaussian" (gaussS . fromIntegral) x

    mainLoop

watch title f img = evWindow 0 title (size img) (Just disp) (mouse kbdQuit)
    where
    disp st = do
        k <- get st
        drawImage (f k img)
        text2D 15 15 (show k)
    mouse _ st (MouseButton WheelUp) Down _ _ = do
        st $~ (+1)
        postRedisplay Nothing
    mouse _ st (MouseButton WheelDown) Down _ _ = do
        st $~ (max 0 . subtract 1)
        postRedisplay Nothing
    mouse def _ a b c d = def a b c d

    The program creates three windows to illustrate the cascading property of Gaussian convolution.
The amount of smoothing in the "Gaussian" window is changed with the mouse wheel. We can
visually check that two successive convolutions with σ = 3 and σ = 4 are equivalent to a single
convolution with σ = 5, since the variances add under convolution: √(3² + 4²) = 5.
    In the next example we define a modified version of the above watch to show the desired element
in a possibly infinite list of images. The frame to display is changed with the mouse wheel. The lists
of images are read and processed lazily.




                                            – offline3.hs –
import EasyVision
import Graphics.UI.GLUT

f = resize (mpSize 15) . gray . channels

g = notI . canny (0.1,0.3) . gradients . gaussS 2 . float

main = do
    prepare
    xs <- map f `fmap` readFrames 0
    let ys = map g xs
        zs = zipWith (\a b -> blockImage [[a,b]]) ys xs
    watchList "orig"  xs
    watchList "canny" ys
    watchList "both"  zs
    mainLoop

watchList title zs = watch title (size (head zs)) (\k ims -> drawImage (ims!!k)) (inf zs)
    where inf xs = xs ++ repeat (last xs)

watch title sz f x = evWindow 0 title sz (Just disp) (mouse kbdQuit)
    where
    disp st = do
        k <- get st
        f k x
        windowTitle $= (title ++ ": frame #" ++ show k)
    mouse _ st (MouseButton WheelUp) Down _ _ = do
        st $~ (+1)
        postRedisplay Nothing
    mouse _ st (MouseButton WheelDown) Down _ _ = do
        st $~ (max 0 . subtract 1)
        postRedisplay Nothing
    mouse def _ a b c d = def a b c d



9   Parallel Computation
(TO DO)
   Computation of the interest points of two cameras in parallel:




                                            – parallel.hs –
import EasyVision
import Control.Arrow
import Control.Monad
import Debug.Trace
import Numeric.LinearAlgebra hiding ((.*))
import Tutorial (run)

camera k = findSize >>= getCam k ~> channels
observe winname f = monitor winname (mpSize 20) f
(.&.) = liftM2 (liftM2 (,))

sigmas = take 15 $ getSigmas 1 3

fun img = (img, fullHessian (surf 2 2) sigmas 50 0.2 img)

g = fun . float . gray

sh (img, feats) = do
    drawImage img
    setColor 1 1 0
    text2D 20 20 (show $ length feats)
    mapM_ showFeat feats

showFeat p = do
    drawROI $ roiFromPixel (ipRawScale p) (ipRawPosition p)
    let Pixel y x = ipRawPosition p
    drawVector x y (10 * ipDescriptor (ip p))

main' op = run $ (camera 0 .&. camera 1)
          >>= observe "img 0" (drawImage.rgb.fst)
          >>= observe "img 1" (drawImage.rgb.snd)
          ~> g `op` g
          >>= observe "feats 0" (sh.fst)
          >>= observe "feats 1" (sh.snd)

main = do
    two <- getFlag "-2"
    if two then main' (|***|) else main' (***)

    The next program is a pipeline of expensive computations, useful to check that we actually get a
speedup on SMP machines. It turns out that using many pipeline stages works well, regardless of the
number of cores:




                                            – pipeline.hs –
-- time ./pipeline 'video.avi -benchmark -loop 1 -frames 100' +RTS -N2
-- time ./pipeline 'video.avi -benchmark -loop 1 -frames 100' '--levels=(1,20)' +RTS -N2
import EasyVision
import Tutorial (run, camera, observe)

compose = foldr (.) id

expensive k = compose (replicate k f)
    where f im = resize (size im) . block . gaussS 10 $ im
          block im = blockImage [[im,im],[im,im]]

balance f = compose . map (pipeline . f)

main = do
    s <- getOption "--stages" =<< uncurry replicate `fmap` getOption "--levels" (20,1)
    putStrLn $ "stages = " ++ show s

    run $ camera ~> float . gray
        >>= observe "original" id
        ~~> balance expensive s
        >>= observe "result" id



10     Concurrent Processes
(TO DO)




                                            – conc0.hs –
import EasyVision
import Graphics.UI.GLUT(postRedisplay)
import Control.Monad(liftM2)

monitor' name sz fun = do
    w <- evWindow () name sz (Just (const fun)) (const kbdQuit)
    return $ postRedisplay (Just (evW w))

observe winname f a = monitor' winname (mpSize 20) (a >>= drawImage.f)

run n ws = sequence ws >>= launchFreq n . sequence

async f = asyncFun 0 id f

infixl 1 -<
(f,d) -< n = asyncFun d f n

(.&.) = liftM2 (,)

hz d = 10^6 `div` d

main = do
    prepare

    cam <- findSize >>= getCam 0 >>= async ~> channels
    x   <- (float . gray, hz 2) -< cam
    s   <- (float . highPass8u Mask5x5 . gray, hz 30) -< cam
    dif <- (\(u,v) -> 0.8 .* u |+| 0.2 .* v, hz 25) -< x .&. s

    run 20 [ observe "cam" rgb cam
           , observe "dif" id dif
           , observe "s"   id s
           , observe "x"   id x
           ]




                                            – conc1.hs –
import EasyVision
import Graphics.UI.GLUT(postRedisplay)

monitor' name sz fun = do
    w <- evWindow () name sz (Just (const fun)) (const kbdQuit)
    return $ postRedisplay (Just (evW w))

observe winname f a = monitor' winname (mpSize 20) (a >>= f)

run n ws = sequence ws >>= launchFreq n . sequence

camera k = findSize >>= getCam k >>= async ~> channels

async f = asyncFun 0 id f
infixl 1 -<
f -< n = asyncFun 0 f n

main = do
    prepare
    cam1 <- camera 0
    cam2 <- camera 1

    feat1 <- fun . float . gray -< cam1
    feat2 <- fun . float . gray -< cam2

    run 20 [ observe "cam1" (drawImage.rgb) cam1
           , observe "cam2" (drawImage.rgb) cam2
           , observe "f1" sh feat1
           , observe "f2" sh feat2
           ]

sigmas = take 15 $ getSigmas 1 3

fun img = (img, fullHessian (surf 2 2) sigmas 50 0.2 img)

sh (img, feats) = do
    drawImage img
    setColor 1 1 0
    text2D 20 20 (show $ length feats)
    mapM_ showFeat feats

showFeat p = do
    drawROI $ roiFromPixel (ipRawScale p) (ipRawPosition p)
--  let Pixel y x = ipRawPosition p
--  drawVector x y (10 * ipDescriptor (ip p))




                                            – conc-par.hs –
import EasyVision
import Graphics.UI.GLUT(postRedisplay)
import Control.Monad(liftM2)
import Control.Arrow((***))

monitor' name sz fun = do
    w <- evWindow () name sz (Just (const fun)) (const kbdQuit)
    return $ postRedisplay (Just (evW w))

observe winname f a = monitor' winname (mpSize 12) (a >>= f)

run n ws = sequence ws >>= launchFreq n . sequence

camera k = findSize >>= getCam k ~> float . gray . channels

(.&.) = liftM2 (liftM2 (,))

async f = asyncFun 0 id f
infixl 1 -<
f -< n = asyncFun 0 f n

main = do
    prepare
    cams <- camera 0 .&. camera 1 >>= async

    feats <- g |***| g -< cams

    run 20 [ observe "cam1" (drawImage.fst) cams
           , observe "cam2" (drawImage.snd) cams
           , observe "f1" (sh.fst) feats
           , observe "f2" (sh.snd) feats
           ]

sigmas = take 15 $ getSigmas 1 3
fun img = (img, fullHessian (surf 2 2) sigmas 50 0.2 img)
g = fun

sh (img, feats) = do
    drawImage img
    setColor 1 1 0
    text2D 20 20 (show $ length feats)
    mapM_ showFeat feats

showFeat p = do
    drawROI $ roiFromPixel (ipRawScale p) (ipRawPosition p)
    let Pixel y x = ipRawPosition p
    drawVector x y (10 * ipDescriptor (ip p))



11       Planar metric rectification
The next example uses a 3D window and some visual geometry facilities in the library. We would
like to rectify a planar scene containing a known reference. For simplicity we consider only polygonal
shapes, to easily establish point correspondences. The following subtasks must be solved:
    - Finding candidate polygonal shapes. This is currently done using the interface to an efficient
      straight line segment extractor, written in C, previously developed by our group [ref] (simpler
      methods based just on Haskell and IPP could also be used). Individual segments are grouped as
      closed polygons using functions from Data.Graph.

    - Reference detection. Polygons with the wrong number of sides are discarded. The remaining
      ones are checked for projective compatibility in all rotations. The detected reference is shown
      in the camera window.

    - Image rectification. We use warp with the estimated plane-image homography.

    - Estimation of camera pose. The library contains utilities to compute a whole camera matrix from
      a planar homography under the reasonable assumption of a diag(f, f, 1) calibration model.

    The rectified image is rendered in the 3D window in the z = 0 plane. We also include an idealized
representation of the estimated camera showing the captured images in the projection plane, which is a
nice demonstration of perspective laws. We can change the initial vertical viewpoint in the 3D window
using the mouse as a simulated trackball [Section X].




                                            – pose.hs –
import EasyVision
import Vision
import Numeric.LinearAlgebra
import System.Environment(getArgs)
import Graphics.UI.GLUT hiding (Point,Size,scale)
import Control.Monad(when)

main = do
    prepare
    sz <- findSize
    mbf <- maybeOption "--focal"
    (cam,ctrl) <- getCam 0 sz ~> gray.channels >>= withPause
    wIm <- evWindow () "image" sz Nothing (const $ kbdcam ctrl)
    w3D <- evWindow3D () "3D view" 400 (const (kbdcam ctrl))
    launch $ do
        orig <- cam
        let segs = filter ((>0.1).segmentLength) $ segments 4 1.5 5 40 20 True (autoBin orig)
            polis = segmentsToPolylines 0.06 segs
            candis = concat [ alter p | Closed p <- polis, length p == length ref ]
            refs = filter (isRef 0.01) candis
            pts = map pl (head refs)
            h = estimateHomography ref pts
            imf = float orig
            ground = warp 0 (Size 256 256) (scaling 0.2 <> h) imf
            grpos = ht (scaling (1/0.2)) [[1,1],[-1,1],[-1,-1],[1,-1]]
            imt = extractSquare 128 imf
            Just (cam,_) = cameraFromPlane 1E-3 500 mbf pts ref
        inWin wIm $ do
            drawImage orig
            pointCoordinates (size orig)
            setColor 0 0 1; lineWidth $= 1; renderSegments segs
            setColor 1 0 0; lineWidth $= 3; mapM_ (renderAs LineLoop) candis
            setColor 0 1 0; pointSize $= 5; mapM_ (renderAs Points) candis
        when (not.null $ refs) $ inWin w3D $ do
            setColor 0 0 1; lineWidth $= 2; renderAs LineLoop ref
            drawTexture ground $ map (++[-0.01]) grpos
            drawCamera 1 cam (Just imt)
            pointCoordinates (Size 400 400); setColor 1 1 1
            text2D 0.95 (-0.95) (show $ focalFromHomogZ0 $ inv h)

ref = map (map (*2)) cornerRef
pl (Point x y) = [x,y]
alter pts = map (rotateList pts) [0 .. length ref - 1]
rotateList list n = take (length list) $ drop n $ cycle list
drawSeg s = (vertex $ extreme1 s) >> (vertex $ extreme2 s)
autoBin img = binarize8u (otsuThreshold img) img
renderAs prim = renderPrimitive prim . mapM_ vertex
renderSegments segs = renderPrimitive Lines $ mapM_ drawSeg segs

isRef tol pts = dif < tol where
    lps = map pl pts
    h = estimateHomography lps ref
    lps' = ht h ref
    dif = pnorm Infinity $ flatten (fromLists lps - fromLists lps')

    In this example we use parameters which work reasonably well with the video rot3.avi avail-
able from our server:


       ./pose http://perception.inf.um.es/public_data/videos/planar/rot3.avi




     More stable pose estimates are obtained if we supply the focal parameter of the camera:
       ./pose     --focal=1.7       rot3.avi
     If possible, you should try this program with live images from a handheld webcam.
     There is a more complete version in the compvis/pose folder, allowing interactive modification
of all relevant parameters. (It also admits reference polygons with only four points (e.g., paper sheets);
in this case wrong views are discarded by checking for consistency with a diag(f, f, 1) camera.)
     You can also find in this folder more interesting metric rectification experiments based only on
the view of right angles, circles, etc., or even just from frame-to-frame homographies from a com-
pletely unknown planar scene. Actually, the above bottom-up pose estimation method is unacceptably
unstable. A much better approach is based on some variant of the Kalman filter, which is able to
maintain a stable pose estimate based on corrections induced by local prediction errors. This is beau-
tifully demonstrated in the program poseTracker, using as measurements low frequency Fourier
descriptors of shape. This prototype can also be tested with the rot3.avi video.


12     Stereo geometry
TO DO


13     Car plate detection
Let us now write a very basic computer vision application. We would like to detect the location of car
plates in images. A simple (and naive) approach may be to collect a number of typical images, label
them with the location of the plate, and train a pattern classifier to recognize image regions which look
like those of a true car plate in some appropriate feature space.
    We have recorded a few video sequences containing car plates like this:




    The first step is to write a simple tool to select frames in a sequence, manually mark a ROI in it
with the mouse and save it for subsequent analysis.




                                            – roisel.hs –
-- $ ./roisel source --save=selected.yuv
-- SPACE to stop video, mark region with mouse right button,
-- S to save desired frame/region, ESC to end.
import EasyVision
import Graphics.UI.GLUT
import Control.Monad(when)

main = do
    sz <- findSize
    (cam,ctrl) <- getCam 0 sz >>= withPause
    prepare
    mbname <- getRawOption "--save"
    let name = case mbname of
                Nothing -> error "--save=filename.yuv is required"
                Just nm -> nm
    w <- evWindow False "Press S to save frame"
                  sz Nothing (mouse (kbdcam ctrl))
    save <- optionalSaver sz

    launch $ do
        orig <- cam
        rec <- getW w
        roi <- getROI w
        inWin w $ do
            drawImage orig
            drawROI roi
            when rec $ do
                save orig
                appendFile (name ++ ".roi") (show roi ++ "\n")
                putW w False

mouse _ st (Char 's') Down _ _ = st $= True
mouse def _ a b c d = def a b c d

    The worker reads the built-in ROI selector of any evWindow and the recording status. If it is
True we save the frame and the ROI. Since the camera has a withPause modifier the user can stop
the video at any time, mark the ROI (see section X) and press S to save the information.
    The ROI-labeled frames selected by the above tool can be reviewed with the following program:




                                            – roibrowse.hs –
-- $ ./roibrowse selected.yuv
-- Browses selected.yuv / selected.yuv.roi with mouse wheel
import EasyVision
import Graphics.UI.GLUT
import System(getArgs)
import qualified Data.Colour.Names as Col

main = do
    prepare
    sz <- findSize
    file:_ <- getArgs
    roisraw <- readFile (file ++ ".roi")
    let rois = map read (lines roisraw) :: [ROI]
        nframes = length rois
    cam <- mplayer (file ++ " -benchmark") sz
    imgs <- sequence (replicate nframes cam)
    putStrLn $ show nframes ++ " cases"
    seeRois imgs rois
    mainLoop

seeRois imgs rois = evWindow 0 "Selected ROI" (mpSize 20) (Just disp) (mouse kbdQuit)
    where
    disp st = do
        k <- get st
        drawImage (imgs!!k)
        lineWidth $= 3; setColor' Col.yellow
        drawROI (rois!!k)
        text2D 50 50 (show $ k+1)
    mouse _ st (MouseButton WheelUp) Down _ _ = do
        st $~ (min (length imgs - 1) . (+1))
        postRedisplay Nothing
    mouse _ st (MouseButton WheelDown) Down _ _ = do
        st $~ (max 0 . subtract 1)
        postRedisplay Nothing
    mouse def _ a b c d = def a b c d

   Using the mouse wheel we can check that the examples of car plates are correct:




    Now we would like to train a pattern classifier to detect the most probable region of each image
containing a car plate. To do this we first build a list of examples from the labeled images and train
a classifier based on more or less promising image features. Then we test it on examples not used for
learning, and finally run the classifier on a previously unseen image sequence. For clarity we prepare
some auxiliary functions in a separate module:




                                            – Util.hs –
module Util where

import EasyVision
import Numeric.LinearAlgebra
import Classifier
import Text.Printf

readSelectedRois sz file = do
    roisraw <- readFile (file ++ ".yuv.roi")
    let rois = map read (lines roisraw) :: [ROI]
        nframes = length rois
    cam <- mplayer (file ++ ".yuv -benchmark") sz
    imgs <- sequence (replicate nframes cam)
    putStrLn $ show nframes ++ " cases in " ++ file
    return (zip imgs rois)

createExamples commonproc candirois feat (img,roi) = ps ++ ns
    where candis = candirois (theROI img)
          imgproc = commonproc img
          ps = ejs "+" . sel (>0.5) $ candis
          ns = ejs "-" . sel (<0.2) $ candis
          ejs t = map (\r -> (overROI feat (r, imgproc), t))
          sel c = filter (c . overlap roi)

overROI feat (r, obj) = feat (modifyROI (const r) obj)

shErr  d c = putStrLn $ (printf "error %.3f" $ 100 * errorRate d c) ++ " %"
shConf d c = putStrLn $ format " " (show.round) (confusion d c)

    The function readSelectedRois reads labeled images in the same way as it was done in the
previous program. More interesting is createExamples. From a list of images and rois it creates
labeled examples (attribute vectors) in the format required by the pattern classifiers available in the
library. Given a list of candidate rectangles covering the image, we compute the overlap of each of
them with the true roi. If the overlap is large enough we have a positive example of a plate; if it is
small enough we have a negative one. The argument feat computes the desired feature vector. Note
that commonproc is performed once on the whole image, as a step shared by all the candidate rois.
There are also a few utilities to display classification error rates, confusion matrices, etc.
     We can now proceed to the main program. It will read one or more roi-labeled video files, train a
pattern classifier with them, estimate the misclassification probability on a different set of examples,
and finally run the obtained classifier on a target video, detecting the most probable location of the
plate in each frame. We must only decide the size and spacing of the candidate grid of rectangles
to consider (genrois), the preprocessing step for each frame (commonproc), the features to com-
pute in each candidate region (feat), and a learning machine for pattern classification (machine).
We can experiment with any desired choice of these functions trying to get acceptable classification
performance.
    Let us show here very simple options. Since car plates typically have high contrast, some statis-
tical characterization (e.g., a histogram) of the result of a high pass filter will probably have some
discrimination ability. For classification we first try one of the simplest techniques, a least squared
error linear machine. We use both the ‘crisp’ classifier and the distribution of class outputs
generated by the learning machine. In a given frame we draw all the rois with positive response (in
gray), although we are interested in the ‘most positive’ one (in red). For faster testing we have added a
built-in detectStatic camera combinator which filters out moving images. We can check that the
program works as expected using testing videos available at
http://perception.inf.um.es/public_data/videos/car_plate_recognition/


                                            – roiclass.hs –
-- detection of rois based on examples labeled by roisel
-- $ ./roiclass 'newvideo -benchmark' test train1 train2 etc.
import EasyVision
import Graphics.UI.GLUT hiding (histogram)
import Control.Monad(when)
import System(getArgs)
import qualified Data.Colour.Names as Col
import Numeric.LinearAlgebra
import Classifier
import Data.List(maximumBy)
import Data.Function(on)
import Util

genrois = roiGridStep 50 200 25 25
commonproc = highPass8u Mask5x5 . median Mask5x5 . gray . channels
feat = vector . map fromIntegral . histogram [0,8 .. 256]
machine = multiclass mse

main = do
    sz <- findSize
    video:test:files <- getArgs
    prepare
    rawexamples <- mapM (readSelectedRois sz) files
    let examples = concatMap (createExamples commonproc genrois feat) (concat rawexamples)
        (classifier, probs) = machine examples
        detector = head . probs
    putStrLn $ show (length examples) ++ " total examples"

    rawtest <- readSelectedRois sz test
    let test' = concatMap (createExamples commonproc genrois feat) rawtest
    putStrLn $ show (length test') ++ " total test examples"
    shConf test' classifier
    shErr test' classifier

    (camtest, ctrl) <- mplayer video sz >>= detectStatic 0.01 5 >>= withPause
    w <- evWindow () "Plates detection" sz Nothing (const (kbdcam ctrl))
    launch $ inWin w $ do
        img <- camtest
        drawImage img
        let imgproc = commonproc img
            candis = map (\r -> (r,imgproc)) (genrois (theROI img))
        lineWidth $= 1
        setColor' Col.gray
        let pos = filter ((== "+") . classifier . overROI feat) candis
        mapM_ (drawROI.fst) pos
        when (not.null $ pos) $ do
            lineWidth $= 3
            setColor' Col.red
            let best = maximumBy (compare `on` detector . overROI feat) pos
            drawROI (fst best)

    In some combinations of training/test data we get about 6% error rate, which is a fairly good result
for such a trivial feature and classifier:

 $ ./roiclass plates-still-left-big.dv plates1 plates2 plates3
YUV4MPEG2 W640 H480 F25:1 Ip A1:1
15 cases in plates2
YUV4MPEG2 W640 H480 F25:1 Ip A1:1
13 cases in plates3
8075 total examples
YUV4MPEG2 W640 H480 F25:1 Ip A1:1
11 cases in plates1
3220 total test examples
356   65
126 2673
error 5.932 %

     The maximum detection response is typically obtained over the plate:




     Of course, these results only give some confidence that there are no disastrous errors in the pro-
gram. A much larger database of plates, imaged under different lighting conditions, distances, view-
points, etc., is required to build a reliable plate detector using this kind of empirical approach.
     An improved version of this program can be found in the compvis/classify folder. It may
be the starting point of several interesting applications: automatic plate reading, plate blurring, etc.


14     Image Types
TO DO

     - Image Types
     - Internal representation
     - ST Interface


15     Adding new low-level functions
15.1    IPP wrappers
The easyVision library includes an automatic wrapper generator for IPP functions based on the pow-
erful Parsec library. This is a relatively easy task, since IPP headers are defined using certain macros
in a very regular way. A minor difficulty is that the arguments to many IPP functions are whole structs
instead of pointers, and therefore we must generate some auxiliary C wrappers suitable for Haskell’s
FFI. In any case, adding new function wrappers is not difficult. As an example, we describe the steps
required to add a wrapper for the function ippiComputeThreshold_Otsu_8u_C1R, which obtains
a reasonable threshold for image binarization based on the distribution of pixel values.
    The first step is adding the function name and header to lib/ImagProc/Ipp/functions.txt.
This file contains the desired functions to be processed by the parser. In this case we add just a single
line:

           ippi.h      ippiComputeThreshold_Otsu_8u_C1R

   If we now make any program in the source tree the parser will rebuild the automatically generated
modules. The end result is that a new low level wrapper ippiComputeThreshold_Otsu_8u_C1R,
working in the IO monad, is available to Haskell. Then a more convenient high level pure function is
defined in the ImagProc/Ipp/AdHoc.hs module:

-- | Computes the Otsu threshold, useful for binarization
otsuThreshold :: ImageGray -> CUChar
otsuThreshold (G im) = unsafePerformIO $ do
    pf <- malloc
    (ippiComputeThreshold_Otsu_8u_C1R // dst im (vroi im)) pf // checkIPP "otsuTh" [im]
    r <- peek pf
    free pf
    return r


    The adapters src (not used here) and dst transform the raw IPP image representation (pointer,
step, roisize) into the image datatypes used in this library. The mandatory final checkIPP checks for
IPP errors and touches the ForeignPtrs of the function arguments.
    This last step cannot be easily automated. It requires some knowledge about the particular com-
putations performed by the function, as well as a reasonable API design, especially in the more exotic
cases.
    Fortunately, for more regular IPP functions (e.g., unary or binary operations on images) we have
tools for automatic generation of pure functions. For example, the |+| operator for ImageFloat
(in module ImagProc.Ipp.Pure) is defined as

(|+|) :: ImageFloat -> ImageFloat -> ImageFloat
(|+|) = mkIntersection ioAdd_32f_C1R

where mkIntersection specifies the ROI policy for the result and ioAdd_32f_C1R has been
automatically generated (in Auto.hs) just from its name in functions.txt.
    See module ImagProc.Ipp.Auto for additional examples.
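
For instance, once the pure operator exists, whole-image arithmetic composes like ordinary Haskell
code (a hedged usage sketch; the ROI of the result is the intersection of the argument ROIs):

      average :: ImageFloat -> ImageFloat -> ImageFloat
      average a b = 0.5 .* (a |+| b)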

Wrappers chain The result of the automatic interface generator ioAdd_32f_C1R is an IO func-
tion using ordinary Haskell datatypes, with ‘auxiliary’ arguments (here there are none) moved to the
first places for easy currying:

-- | Adds, subtracts, or multiplies pixel values of two
--   source images and places the scaled result in the destination image.
ioAdd_32f_C1R = {-# SCC "ippiAdd_32f_C1R" #-} auto_2_32f_C1R f "ippiAdd_32f_C1R"
    where f pSrc1 src1Step pSrc2 src2Step pDst dstStep roiSize =
            ippiAdd_32f_C1R pSrc1 src1Step pSrc2 src2Step pDst dstStep roiSize


This function is a more Haskellish version of the raw wrapper ippiAdd_32f_C1R in Adapt.hs:

ippiAdd_32f_C1R pSrc1 src1Step pSrc2 src2Step pDst dstStep roiSize = do
    proiSize <- new roiSize
    r <- ippiAdd_32f_C1Rx pSrc1 src1Step pSrc2 src2Step pDst dstStep proiSize
    free proiSize
    return r


which uses the auxiliary C wrapper ippiAdd_32f_C1Rx

foreign import ccall "adapt.h ippiAdd_32f_C1Rx"
    ippiAdd_32f_C1Rx :: Ptr Float -> Int -> Ptr Float -> Int
                     -> Ptr Float -> Int -> Ptr IppiSize -> IO Int


defined in adapt.c and adapt.h to use pointers instead of direct struct arguments:

int ippiAdd_32f_C1Rx(Ipp32f* pSrc1, int src1Step, Ipp32f* pSrc2, int src2Step,
                     Ipp32f* pDst, int dstStep, IppiSize* roiSize) {
    return ippiAdd_32f_C1R(pSrc1, src1Step, pSrc2, src2Step, pDst, dstStep, *roiSize);
}




15.2    in Haskell
TO DO

15.3    in C
TO DO


16     Keyboard/Mouse control
In any window (using kbdQuit):

     • Q, ESC: exit application
     • i: save window as png screenshot

In monitor windows (using kbdcam):

     • SPACE: pause
     • S: single frame capture
     • Shift SPACE: don’t call original camera in pause mode.

In 3D windows (trackball):

     • O: reset view
     • Mouse wheel: zoom
     • Mouse left button drag: change viewpoint

In zoom windows:

     • Mouse wheel: zoom
     • Mouse left button click: center pixel


References
TO DO

       Haskell.org
       Hutton
       RWH
       IPP
       Hartley & Zisserman
       Faugeras & Luong
       Ma et al.
       Gonzalez & Woods
       mplayer
       etc.

