Interactive Time-Dependent Tone Mapping Using Programmable Graphics Hardware

Nolan Goodnight, Rui Wang, Cliff Woolley, Greg Humphreys
University of Virginia

Eurographics Symposium on Rendering 2003
25-27 June, Leuven, Belgium
HDR and Tone Mapping




   [Figure: an image clamped to [0,1] compared with a compressed image]
Advances in graphics hardware

   Physically-based rendering on the GPU
    (Purcell et al., 2003)

   High dynamic range texture mapping
    (Debevec et al., 2001)
System Overview

   Interactive tone mapping system for an OpenGL
    application

   [Diagram: the application's display callback passes an HDR image to the
    tone mapping system, which returns an LDR image to the frame buffer]
Interface to the application



       application          tone mapping system




      tmInitialize();     // Initialize the system
      tmEnable();         // Retarget GL calls
          // ... draw geometry ...
          tmCompress();   // Compress output
      tmDisable();        // Restore app context
Choosing a tone mapping operator

   Photographic Tone Reproduction for High Contrast
    Images (Reinhard et al, 2002)
        Global operator is a simple transfer function


    $L_d = \frac{L_s}{1 + L_s}$

    $L_s$: scaled luminance

    [Plot: $L_d$ versus $L_s$; the $L_d$ axis runs from 0 to 1]
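
The transfer function above is small enough to check by hand. The following is a minimal Python sketch, not the system's shader code; the function name `global_operator` is made up for illustration. It shows that scaled luminance 0 maps to 0, scaled luminance 1 maps to 0.5, and very bright values approach but never reach 1.

```python
def global_operator(ls: float) -> float:
    """Map scaled luminance ls >= 0 to display luminance in [0, 1).

    Implements L_d = L_s / (1 + L_s) from the slide.
    """
    if ls < 0.0:
        raise ValueError("scaled luminance must be non-negative")
    return ls / (1.0 + ls)
```

For example, `global_operator(1.0)` is 0.5, and `global_operator(1000.0)` is still below 1, which is why no clamping step is needed after compression.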
Choosing a tone mapping operator

   Local operator
        Digital analog to ‘burning’ and ‘dodging’




    $L_d = \frac{L_s}{1 + V_s}$

    $V_s$: local area luminance

    [Figure: center-surround]
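
To see why the local operator behaves like dodging and burning, it helps to evaluate it on two tiny images. The sketch below is illustrative only: it assumes a box average as a stand-in for the local area luminance (the slides use Gaussian-filtered levels), and the names `local_average` and `local_operator` are hypothetical.

```python
def local_average(image, x, y, radius):
    """Mean of the pixels within `radius` of (x, y), clipped at the borders.

    Assumption: a box filter replaces the Gaussian surround of the slides.
    """
    h, w = len(image), len(image[0])
    total, count = 0.0, 0
    for j in range(max(0, y - radius), min(h, y + radius + 1)):
        for i in range(max(0, x - radius), min(w, x + radius + 1)):
            total += image[j][i]
            count += 1
    return total / count


def local_operator(image, x, y, radius=1):
    """L_d = L_s / (1 + V_s), with V_s approximated by local_average."""
    vs = local_average(image, x, y, radius)
    return image[y][x] / (1.0 + vs)
```

A pixel with scaled luminance 1 in a dark neighborhood comes out brighter than the same pixel in a bright neighborhood, because the divisor depends on the surround rather than on the pixel itself.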
Why use this tone mapping operator?



   Global operator is
    simple and fast to
    compute


   Only one global
    computation


   We can dynamically
    choose the number
    of zones
Variable number of zones: 3 to 8

   [Figures: one slide each for 3, 4, 5, 6, 7, and 8 zones]
System block diagram

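   [Figure: block diagram of the OpenGL application and the tone mapping system]

   OpenGL application
       Display callback, followed by a "compress" decision
           no:  output goes to the frame buffer
           yes: output goes through the tone mapping system, and the
                resulting LDR image goes to the frame buffer

   Tone mapping system
       Buffer 0: original image, luminance / log luminance, reduction,
                 scaled luminance, operator (global / local)
       Buffer 1: scaled luminance, convolution, Gaussian pyramid
                 (level s_i), zone map {0, ..., i+1}
       Buffer 2: scaled luminance, convolution, Gaussian pyramid
                 (level s_(i+1)), zone map {0, ..., i}
       Zone calculation between Buffers 1 and 2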
Implementation

   Target architecture
       ATI Radeon 9800 (R350)
   Data storage
       Floating-point off-screen buffers (pbuffers)
       Multiple rendering surfaces (GL_AUXi)
   Algorithms
       ARB fragment and vertex assembly
       Generate fragments with image-sized quads
   Data representation
       Vector vs. scalar organization
Global operator block diagram

   [Figure: the diagram from the "System block diagram" slide, shown again
    for the global operator]
Implementation: global operator

   Pipeline: HDR image, luminance / log luminance, mipmap reduction,
    LDR image (single buffer)

   Simple luminance transform
   Store luminance and log luminance in separate channels

   [Figure: luminance channel and log luminance channel]
Implementation: global operator

   Pipeline: HDR image, luminance / log luminance, mipmap reduction,
    LDR image (single buffer)

   Single rendering surface
   Mipmap reduction of the log luminance channel
   Log average luminance:

    $\bar{L}_w = \exp\left( \frac{1}{N} \sum_{x,y} \log L_w(x, y) \right)$
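
The connection between mipmap reduction and the log average can be checked numerically: averaging 2 x 2 blocks of the log luminance down to a single texel gives the mean log luminance, and exponentiating that gives the log average. The sketch below is a CPU illustration under stated assumptions (square image, power-of-two size, strictly positive luminance); the function names are hypothetical and it is not the GPU implementation.

```python
import math


def log_average_direct(lum):
    """exp of the mean log luminance over all N pixels."""
    n = sum(len(row) for row in lum)
    return math.exp(sum(math.log(v) for row in lum for v in row) / n)


def reduce_2x2(img):
    """One mipmap-style reduction step: average each 2 x 2 block."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * j][2 * i] + img[2 * j][2 * i + 1]
              + img[2 * j + 1][2 * i] + img[2 * j + 1][2 * i + 1]) / 4.0
             for i in range(w)] for j in range(h)]


def log_average_mipmap(lum):
    """Reduce the log luminance to 1 x 1, then exponentiate.

    Assumes a square image whose side is a power of two.
    """
    level = [[math.log(v) for v in row] for row in lum]
    while len(level) > 1:
        level = reduce_2x2(level)
    return math.exp(level[0][0])
```

Both functions agree up to floating-point rounding, which is the property that lets a hardware mipmap chain stand in for an explicit sum over all pixels.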
Implementation: global operator

   Pipeline: HDR image, luminance / log luminance, mipmap reduction,
    LDR image (single buffer)

   [Figure: texture 1 is the input to the operator shader]
Local operator block diagram

   [Figure: the diagram from the "System block diagram" slide, shown again
    for the local operator]
Implementation: GPU-based convolutions

   Transform n-vector product into multiple 4-vector
    products

   [Figure: the filter and luminance vectors are split into 4-vectors;
    the 4-vector products are summed]
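
The decomposition of an n-vector product into 4-vector products can be written out directly. This is a minimal Python sketch with hypothetical names (`dot4`, `dot_n_by_fours`); zero-padding to a multiple of four is an assumption made here so that any n works.

```python
def dot4(a, b):
    """Dot product of two 4-vectors, the unit a 4-wide GPU computes at once."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]


def dot_n_by_fours(filt, lum):
    """n-vector product as a sum of 4-vector products.

    Both inputs are zero-padded to a multiple of four (an assumption of
    this sketch), which leaves the result unchanged.
    """
    if len(filt) != len(lum):
        raise ValueError("filter and luminance must have the same length")
    pad = (-len(filt)) % 4
    f = list(filt) + [0.0] * pad
    v = list(lum) + [0.0] * pad
    return sum(dot4(f[k:k + 4], v[k:k + 4]) for k in range(0, len(f), 4))
```

An 11-element filter, for instance, becomes three 4-vector products, the last of which carries one padding zero.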
Vectorizing the luminance


     Stacked domain
                          Output 4 pixels at
                           the same time


                          Useful for expensive
                           algorithms


                          Requires a
                           conversion back to
                           scalar form.
Vectorizing the luminance

   A simple method for luminance vectorization:

   [Animation: the luminance image is packed step by step into the
    R, G, B, and A channels]

   Preserves spatial locality
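
One way to picture a stacked layout is sketched below. It assumes the scalar image is cut into four quadrants and that each quadrant becomes one channel of a quarter-size RGBA image; the slides do not spell out the exact packing, so this layout and the names `stack_quadrants` and `unstack_quadrants` are assumptions for illustration. Under this layout, neighbors within a channel are also neighbors in the original image, which is one reading of "preserves spatial locality".

```python
def stack_quadrants(lum):
    """Pack an H x W scalar image into an (H/2) x (W/2) image of RGBA tuples.

    Assumed layout: R = top-left, G = top-right, B = bottom-left,
    A = bottom-right quadrant. H and W must be even.
    """
    h, w = len(lum) // 2, len(lum[0]) // 2
    return [[(lum[y][x], lum[y][x + w], lum[y + h][x], lum[y + h][x + w])
             for x in range(w)] for y in range(h)]


def unstack_quadrants(stacked):
    """Inverse of stack_quadrants: convert back to scalar form."""
    h, w = len(stacked), len(stacked[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for y in range(h):
        for x in range(w):
            r, g, b, a = stacked[y][x]
            out[y][x], out[y][x + w] = r, g
            out[y + h][x], out[y + h][x + w] = b, a
    return out
```

Each texel of the stacked image holds four luminance values, so a shader that writes one RGBA texel outputs four pixels at the same time; `unstack_quadrants` is the conversion back to scalar form.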
GPU-based convolutions

   Example: 1 x n inner product

   [Animation: a filter image is applied to the stacked image in successive
    passes, and the partial results are summed: Pass 1 + Pass 2 + Pass 3]
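
The multi-pass accumulation shown in the animation can be imitated on the CPU. In this sketch each pass applies at most four filter taps and adds its partial result to an accumulator; after the last pass the accumulator equals the direct 1 x n result. The function names, the four-taps-per-pass grouping, and the "valid" border handling are assumptions of the sketch, and the kernel is applied without flipping.

```python
def convolve_row_direct(row, kernel):
    """Direct 1 x n filtering of a row (no kernel flip, 'valid' region only)."""
    n = len(kernel)
    return [sum(kernel[k] * row[i + k] for k in range(n))
            for i in range(len(row) - n + 1)]


def convolve_row_multipass(row, kernel, taps_per_pass=4):
    """Same result, accumulated over passes of at most `taps_per_pass` taps.

    Returns the filtered row and the number of passes used.
    """
    n = len(kernel)
    out = [0.0] * (len(row) - n + 1)
    passes = 0
    for start in range(0, n, taps_per_pass):
        taps = kernel[start:start + taps_per_pass]
        for i in range(len(out)):
            # Partial sum for this pass, added to the running total.
            out[i] += sum(t * row[i + start + k] for k, t in enumerate(taps))
        passes += 1
    return out, passes
```

With an 11-tap kernel the sketch needs three passes. Folding several of these partial sums into one shader invocation is what reduces the number of passes on the GPU.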
GPU-based convolutions

   Compute multiple 4-vector products per pass
       Less shader and texture switching

   [Animation: several 4-vector products on the stacked image are summed
    in a single render pass]
GPU-based convolutions

   Advantages:
       Handles large kernels
       Efficient memory access
       No transform back to scalar values

   Timings for a 512 x 512 image:

     Kernel size     Time
     11 x 11         ~6 ms
     21 x 21         ~10 ms
     41 x 41         ~16 ms
Calculating adaptation zones on the GPU

   [Animation: Buffer 0 and Buffer 1 each have FRONT and BACK surfaces that
    hold the luminance and one filtered level; new filtered levels are
    written to the two buffers in alternation]

     Step    Buffer 0      Buffer 1
     1       filtered 0    filtered 1
     2       filtered 2    filtered 1
     3       filtered 2    filtered 3
     4       filtered 4    filtered 3
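
The alternation between the two buffers can be sketched as a ping-pong loop that keeps only two filtered levels alive at any time. The code below is a simplified 1D illustration, not the system's algorithm: it assumes a box blur in place of each Gaussian level, a normalized difference `abs(v1 - v2) / (1 + v1)` as the stopping test, and a threshold `eps`; all three, along with the function names, are hypothetical choices made for this example.

```python
def blur(signal, radius):
    """Box blur with clamped borders (stand-in for one Gaussian level)."""
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def zone_map(signal, num_zones, eps=0.05):
    """Per-pixel index of the largest level whose surround is still flat."""
    zones = [num_zones - 1] * len(signal)
    decided = [False] * len(signal)
    buffers = [blur(signal, 0), None]      # buffer 0 starts with level 0
    for level in range(1, num_zones):
        dst = level % 2                    # alternate the target buffer
        src = 1 - dst                      # the other buffer holds level - 1
        buffers[dst] = blur(signal, level)
        for i in range(len(signal)):
            v1, v2 = buffers[src][i], buffers[dst][i]
            if not decided[i] and abs(v1 - v2) / (1.0 + v1) > eps:
                zones[i] = level - 1       # contrast appears at this scale
                decided[i] = True
    return zones
```

On a step edge, pixels far from the edge reach the largest zone while pixels next to the edge stop at zone 0. The buffer contents follow the same sequence as the steps listed above: levels (0, 1), then (2, 1), then (2, 3).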
Performance: global operator

   [Chart: frames per second (axis 0 to 700) versus image size
    (256 x 256, 512 x 512, 1024 x 512, 1024 x 1024), with one series for
    16 bit floats and one for 32 bit floats]
Performance: local operator

   [Chart: frames per second (axis 0 to 25) versus number of zones
    (4 to 8), with one series for 16 bit floats and one for 32 bit floats]
Performance comparison: CPU vs. GPU
Results: Accuracy

   Comparison with CPU: 512 x 512 image

    Image                   RMS % error

    Scaled luminance        0.022 %
    Convolution (5 x 5)     0.026 %
    Convolution (49 x 49)   0.032 %
    Final image             1.051 %
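
The slides do not define "RMS % error". One plausible reading, used in the sketch below purely as an assumption, is the root-mean-square difference between the GPU and CPU images divided by the root-mean-square value of the CPU image, times 100. The function name `rms_percent_error` is hypothetical.

```python
import math


def rms_percent_error(reference, test):
    """RMS difference as a percentage of the reference RMS value.

    Assumed definition; images are given as flat lists of pixel values.
    """
    if len(reference) != len(test) or not reference:
        raise ValueError("images must be non-empty and the same size")
    n = len(reference)
    err = math.sqrt(sum((t - r) ** 2 for r, t in zip(reference, test)) / n)
    ref = math.sqrt(sum(r * r for r in reference) / n)
    return 100.0 * err / ref
```

Under this definition, a GPU image that is uniformly 1% brighter than the CPU reference scores 1%, which is about the size of the final-image error in the table.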
False-color zone images




CPU                 GPU
Images generated at ~30Hz

    Clamped [0,1]   Compressed: 2 zones
Conclusion and Future Work

   Summary
       System for interactively compressing HDR output
        from an OpenGL application
       Complex tone mapping operator on the GPU


   Future Work
       Other tone mapping operators
       Further optimizations
       Non-invasive implementation

				