This approach still allows ray marching to be completely eliminated outside the light’s frustum. Inside the illuminated frustum, the three segments described in step 7 can be sampled with different granularity. Results using a variety of geometry and lighting are shown in Figure 4.2.5.
FIGURE 4.2.5 Using our hybrid approach under multi-colored textured spotlights allows sampling at different granularity along different segments of each viewing ray, allowing images like these to be rendered at up to 80 frames per second.
IMPLEMENTATION DETAILS
While the overall algorithm is straightforward, we encountered various additional optimizations and implementation issues that needed addressing. We found computing scattering at full-screen resolution excessive; computing scattering at 256² was sufficient for high-quality 1,024² renderings. This coarser resolution is acceptable because participating media has a diffusing effect and generally contributes only a small portion of the scene’s illumination. Sharp lighting variations simply become blurred in these environments. Typically, a 3² Gaussian filter eliminates most artifacts when upsampling this lower-resolution scattering image, but light leakage occurs when blurring across depth discontinuities, so an edge-preserving filter is needed. We used a 3² bilateral filter [Tomasi98] to eliminate light leakage, though the accompanying demo on this book’s DVD-ROM uses a simplified edge-preserving filter based on a 3² tent filter that provides similar quality. The idea behind both filters is that the weight of a texel in the coarse image is zeroed if its scattering was computed at a drastically different depth from the corresponding pixel in the final image, which can be implemented as per the following GLSL pseudocode.
float texelDepth = DistanceToObjectOccludedByFog( T_ij );
if ( abs( texelDepth - depthOfPixel ) < distanceThreshold )
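The depth test above can be illustrated on the CPU. The following C++ sketch applies it inside a 3 × 3 tent filter; the `CoarseTexel` struct, the `upsampleScattering` function, the (1 2 1) × (1 2 1) weights, and the fallback to the center texel are all assumptions made for illustration, not the demo's actual code.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// One texel of the coarse scattering image: the scattered radiance and the
// depth at which it was computed. Both field names are hypothetical.
struct CoarseTexel {
    float scattering;
    float depth;
};

// Blend a 3x3 neighborhood of coarse texels with tent weights, zeroing the
// weight of any texel whose depth differs too much from the depth of the
// full-resolution pixel. Falls back to the center texel if all are rejected.
float upsampleScattering(const std::array<CoarseTexel, 9>& texels,
                         float depthOfPixel, float distanceThreshold) {
    assert(distanceThreshold > 0.0f);
    // Separable tent (1 2 1) x (1 2 1), row-major, center at index 4.
    static const float tent[9] = {1, 2, 1, 2, 4, 2, 1, 2, 1};
    float sum = 0.0f, weightSum = 0.0f;
    for (int i = 0; i < 9; ++i) {
        bool sameSurface =
            std::fabs(texels[i].depth - depthOfPixel) < distanceThreshold;
        float w = sameSurface ? tent[i] : 0.0f;
        sum += w * texels[i].scattering;
        weightSum += w;
    }
    return weightSum > 0.0f ? sum / weightSum : texels[4].scattering;
}
```

Renormalizing by the sum of the surviving weights keeps the result from darkening when texels are rejected near a depth discontinuity.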
Sun et al.’s model [Sun05] precomputes the integrals described above, requiring two texture lookups to evaluate each interval. However, their model was designed to analytically compute one integral per pixel, so costs for texture coordinate computation played only a small role in overall performance. In their model, the scattering integral is approximated as

$$L \approx A_0 \left[ F(A_1, A_2) - F(A_1, \gamma/2) \right],$$

where $A_0$, $A_1$, and $A_2$ are functions of $d_l$, $d_s$, $K_s$, and $\gamma$ given by

$$A_0 = \frac{K_s\, e^{-K_s d_l \cos\gamma}}{2\pi\, d_l \sin\gamma}, \qquad A_1 = K_s\, d_l \sin\gamma, \qquad A_2 = \frac{\pi}{4} + \frac{1}{2}\arctan\frac{d_s - d_l\cos\gamma}{d_l\sin\gamma},$$

and $F(u,v) = \int_0^v e^{-u\tan\xi}\,d\xi$ is precomputed and stored in a texture. While $A_0$ and $A_1$ are independent of distance along the viewing ray, $A_2$ with its expensive arctangent must be recomputed at every step along the ray. We found that reparameterizing F as a function of cos γ and x (the distance along the ray, which takes the place of $d_s$ in $A_2$), both values already cheaply computed inside the shader, and dynamically recomputing the reparameterization F′ as necessary saved a significant amount of time. In this case, we used

$$L \approx F'(\cos\gamma, d_s) - F'(\cos\gamma, 0),$$

where $F'(\cos\gamma, x) = A_0(d_l,\gamma,K_s)\,F(A_1(d_l,\gamma,K_s), A_2(d_l,\gamma,x))$ must be recomputed whenever the distance to the light $d_l$ or the scattering coefficient $K_s$ changes. Note that $A_2$ evaluates to γ/2 at x = 0, so the two forms agree. This reparameterization can be performed via the following GLSL pseudocode shader, which generates a 2D table storing F′(cos γ, x) in a texture, where the x-coordinate represents cos γ, in the range [–1..1], and the y-coordinate represents the distance x, in the range [0..far].
uniform sampler2D fTex;        // Tabulated precomputed F integral
uniform vec2 fTexRange;        // Maximal u,v values sampled in fTex
uniform vec2 imageSize;        // Output buffer size, in pixels
uniform float k_s;             // Scattering coefficient (assumed to be a uniform)
uniform float distToFarPlane;  // Distance mapped to the last table row (assumed uniform)

void main( void ) {
    // Map the fragment position to x in [0..far] and cos(gamma) in [-1..1].
    float X = ( gl_FragCoord.y / imageSize.y ) * distToFarPlane;
    float cosGamma = 2.0 * ( gl_FragCoord.x / imageSize.x ) - 1.0;
    float sinGamma = sqrt( 1.0 - cosGamma * cosGamma );
    float gamma_2 = 0.5 * atan( sinGamma, cosGamma );  // gamma / 2 (not used below)
    // Placeholder from the original pseudocode: distance from the eye to the light.
    float distToLight = DistanceFromCurrentEyePointToLight();
    float A_0 = k_s * exp( -k_s * cosGamma * distToLight ) / ( 6.28319 * distToLight * sinGamma );
    float A_1 = k_s * distToLight * sinGamma;
    float A_2 = 0.78540 +
                0.5 * atan( ( X - distToLight * cosGamma ) / ( distToLight * sinGamma ) );
    vec2 fCoord = vec2( A_1, A_2 ) / fTexRange;  // Normalize to fTex's [0..1] coordinates
    gl_FragColor = A_0 * texture2D( fTex, fCoord );
}
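To check the table shader offline, the same terms can be evaluated on the CPU. The C++ sketch below is a rough reference under stated assumptions: it takes F(u, v) to be the integral of exp(-u tan ξ) over ξ from 0 to v, as in Sun et al.'s published model, and evaluates it with a midpoint rule instead of reading the `fTex` texture. The function names `integralF`, `termA2`, and `tableEntry` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Numerically integrate F(u, v) = int_0^v exp(-u * tan(xi)) d(xi) with the
// midpoint rule. This is an assumed form of the tabulated integral; the
// shader reads it from the precomputed texture fTex instead.
double integralF(double u, double v, int steps = 2048) {
    assert(steps > 0);
    double h = v / steps, sum = 0.0;
    for (int i = 0; i < steps; ++i)
        sum += std::exp(-u * std::tan((i + 0.5) * h));
    return sum * h;
}

// A_2 term from the listing: pi/4 + 0.5 * atan((x - d cos g) / (d sin g)).
double termA2(double x, double distToLight, double gamma) {
    const double kQuarterPi = 0.78539816339744831;
    return kQuarterPi + 0.5 * std::atan((x - distToLight * std::cos(gamma)) /
                                        (distToLight * std::sin(gamma)));
}

// One table entry, F'(cos gamma, x) = A_0 * F(A_1, A_2), mirroring the shader.
double tableEntry(double cosGamma, double x, double distToLight, double k_s) {
    const double kTwoPi = 6.28318530717958648;
    double gamma = std::acos(cosGamma);
    double sinGamma = std::sin(gamma);
    double a0 = k_s * std::exp(-k_s * cosGamma * distToLight) /
                (kTwoPi * distToLight * sinGamma);
    double a1 = k_s * distToLight * sinGamma;
    return a0 * integralF(a1, termA2(x, distToLight, gamma));
}
```

A useful sanity check is that `termA2` returns γ/2 at x = 0, and that table entries grow with x for a fixed cos γ, since the integrand of F is positive.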
One key to reducing the number of samples used during ray marching is to intelligently pick the sample locations. Imagire et al. [Imagire07] chose to sample along planes perpendicular to the viewing direction, using a slicing technique. Unfortunately, this adds correlation between sampling locations that causes aliasing along shadow boundaries. For our simple point light hybrid, we found that uniformly subdividing the region between front and back shadow volumes, as shown in Figure 4.2.6, worked satisfactorily. While this still correlates samples, the correlation planes lie roughly parallel to shadow volumes, reducing aliasing significantly. Unfortunately, under textured spotlights the resulting crease in the sampling planes causes an abrupt and noticeable change in the scattered illumination. To avoid this, we sample the interval Δ along each eye ray, where this interval depends on the angle between eye and light viewing directions:
where d_front is the distance to the light of the point along the front-facing shadow volume (i.e., the closest black points in Figure 4.2.6), φ is the light's field of view, and β is the angle between the viewing and light orientations. Unfortunately, the denominator blows up for some values of β, so we instead approximate the interval as Δ′ ≈ 6 d_front sin φ to avoid this error. This leads to sampling planes similar to those in the right of Figure 4.2.6.
Sampling this whole interval Δ' is unnecessary, since some samples may be occluded by geometry. Having samples correlated along planes parallel to neither the light nor the eye avoids many aliasing issues and avoids introducing artifacts where sampling planes intersect.
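As a rough illustration of the approximate interval, the C++ sketch below spaces samples uniformly across Δ′ and drops those lying beyond a cutoff, such as the offset at which geometry blocks the eye ray. Placing samples at sub-interval midpoints, measuring offsets from the start of the interval, and the names `approxInterval` and `sampleOffsets` are assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Approximate sampling interval: delta' = 6 * d_front * sin(phi).
double approxInterval(double dFront, double phi) {
    return 6.0 * dFront * std::sin(phi);
}

// Place numSamples uniformly across delta' (at sub-interval midpoints) and
// keep only the offsets that do not exceed maxOffset. Offsets are measured
// from the start of the interval along the eye ray.
std::vector<double> sampleOffsets(double dFront, double phi, int numSamples,
                                  double maxOffset) {
    assert(numSamples > 0);
    double step = approxInterval(dFront, phi) / numSamples;
    std::vector<double> offsets;
    for (int i = 0; i < numSamples; ++i) {
        double offset = (i + 0.5) * step;
        if (offset <= maxOffset) offsets.push_back(offset);  // discard occluded samples
    }
    return offsets;
}
```

Discarding rather than redistributing the occluded samples keeps the surviving samples on the same planes for neighboring pixels.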
FIGURE 4.2.6 When marching between shadow polygons under a point light, sampling the front-to-back distance uniformly (left) significantly reduces aliasing, especially compared to naïve ray marching the whole ray. However, this leads to a sudden change in sampling planes (center) clearly visible under textured illumination, so a slightly wasteful sampling (right) that aligns sample planes partway between the eye and light views yields much higher quality, even though some samples are discarded.
Another technique that reduces aliasing uses a variance shadow map (VSM) [Donnelly06], rather than a standard shadow map, when sampling visibility during ray marching. This allows shadow map queries to return non-binary values and helps smooth shadow boundaries. While this reduces the number of ray steps required, variance shadow maps cost slightly more to generate, and we found this cost roughly equal to the performance gained by using fewer samples. However, in applications with already computed VSMs, they provide an additional method for reducing sampling costs.
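For readers unfamiliar with variance shadow maps, the non-binary visibility comes from a one-sided Chebyshev bound on the two filtered depth moments. The C++ sketch below shows that standard bound; the function name `vsmVisibility` and the `minVariance` clamp value are assumptions, and the code is not taken from our implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Upper bound on the fraction of light reaching a ray sample at depth t
// (measured from the light), given the filtered mean depth and mean squared
// depth stored in a variance shadow map.
double vsmVisibility(double meanDepth, double meanDepthSq, double t,
                     double minVariance = 1e-6) {
    assert(minVariance > 0.0);
    if (t <= meanDepth) return 1.0;  // sample lies in front of the occluders
    double variance =
        std::max(meanDepthSq - meanDepth * meanDepth, minVariance);
    double d = t - meanDepth;
    return variance / (variance + d * d);  // one-sided Chebyshev bound
}
```

During ray marching, each step's scattering contribution would be scaled by this value instead of by a binary shadow test.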
RESULTS
We implemented our technique using OpenGL and GLSL; the timings provided were measured on an NVIDIA GeForce 8800 GTX with a 2.66 GHz multi-core Xeon processor. Our prototype implementation is provided on the accompanying DVD-ROM. Timings in Table 4.2.1 are all computed when rendering 1,024² final images and, for reference, are compared with costs for naïve, brute-force ray marching and a rendering without volumetric shadows.
The ray samples required for high quality vary dramatically, depending on the complexity of the scene and the light source. For consistency, the timings provided in Table 4.2.1 all use the same sampling rates, determined based on a sampling rate that provided alias-free results for most of our scenes. For brute force marching, our point light hybrid, and our textured light hybrid, respectively, we used 150, 50, and 150 ray samples per pixel. For simple scenes, such as the sphere, our hybrid required many fewer samples for good quality. In more complex scenes, for example, using the “YeahRight” model, the sampling rate needed to be doubled (300 samples per pixel) to eliminate aliasing at shadow boundaries.
We found that limiting ray stepping to inside the shadow volume typically gave a 3–8 times speedup over brute-force ray marching, and computing scattering at lower resolution increased performance by another 25–100%. When using our textured light hybrid, the performance improvement is somewhat less pronounced, especially in scenes where the light frustum covers most of the view.
CONCLUSIONS
This article presented a hybrid for rendering volumetric shadows that combines ray marching and shadow volumes. The basic idea works only for point light sources, but can be extended to intelligently sample illumination and visibility under more complex textured spotlights. Under both lighting conditions, our technique runs in real time for simple scenes, remains interactive for more complex geometry, and outperforms brute-force ray marching while giving comparable results.
Further examples, videos, an executable demo, code, and discussion are available on the accompanying DVD-ROM, in the technical paper [Wyman08], or on the project Web page at www.cs.uiowa.edu/~cwyman/publications/.
TABLE 4.2.1 A comparison of our hybrid's costs to those for a brute force, ray marching approach and a rendering without volumetric shadows
REFERENCES
[Biri06] Biri, V., Arques, D., and Michelin, S. “Real time rendering of atmospheric scattering and volumetric shadows.” Journal of WSCG, Vol 14, pages 65–72, 2006.
[Donnelly06] Donnelly, W. and Lauritzen, A. “Variance shadow maps.” In Proceedings of ACM Symposium on Interactive 3D Graphics and Games, pages 161–165, 2006.
[Hoffman03] Hoffman, N. and Preetham, A. “Real-time light-atmosphere interactions for outdoor scenes.” In Graphics Programming Methods, pages 337–352. Charles River Media, 2003.
[Imagire07] Imagire, T., Johan, H., Tamura, N., and Nishita, T. “Anti-aliased and real-time rendering of scenes with light scattering effects.” The Visual Computer, 23(9), pages 935–944, 2007.
[James03] James, R. “True volumetric shadows.” In Graphics Programming Methods, pages 353–366. Charles River Media, 2003.
[Nishita87] Nishita, T., Miyawaki, Y., and Nakamae, E. “A shading model for atmospheric scattering considering luminous distribution of light sources.” In Proceedings of ACM SIGGRAPH, pages 303–310, 1987.
[Sun05] Sun, B., Ramamoorthi, R., Narasimhan, S., and Nayar, S. “A practical analytic single scattering model for real time rendering.” ACM Transactions on Graphics, 24(3), pages 1040–1049, 2005.
[Tomasi98] Tomasi, C. and Manduchi, R. “Bilateral filtering for gray and color images.” In Proceedings of IEEE International Conference on Computer Vision, pages 839–846, 1998.
[Wyman08] Wyman, C. and Ramsey, S. “Interactive Volumetric Shadows in Participating Media with Single Scattering.” In Proceedings of the IEEE Symposium on Interactive Ray Tracing, 2008.
4.3 Real-Time Dynamic Shadows for Image-Based Lighting
MARK COLBERT, JAROSLAV KRIVÁNEK
INTRODUCTION
We describe the implementation of a simple real-time GPU-based algorithm to compute a spherical harmonic-based visibility function for environment lighting. Visibility can be computed at the vertices or texels of an object and be used for fully dynamic shadow computation on diffuse as well as glossy surfaces in scenes illuminated by an environment map where geometry, illumination, and materials are allowed to change in real-time, at about 70 FPS (see Figure 4.3.1). The algorithm first appeared in our paper [Krivánek08]; this article describes it in more detail with a special focus on practical implementation issues. In addition, we provide full source code of a demo application on the DVD-ROM. The demo is written in C++ with DirectX and HLSL for the GPU shaders. The code snippets in the article use the same languages.
FIGURE 4.3.1 Using our technique, shadows for dynamic scenes can be computed in real-time without any precomputation for both diffuse surfaces (left) as well as complex, spatially varying, glossy surface reflection (right). For these two images, the boxes render at 70 FPS, and the rusty robot renders at 75.8 FPS for a 1k × 1k image on an NVIDIA 8800 GTX.
RELATED WORK
Rendering of objects illuminated by high-dynamic-range (HDR) environment maps provides images of remarkable visual quality [Debevec02]. Existing real-time rendering algorithms based on precomputed radiance transfer [Kautz05] suffer from long precomputation times precluding their use in dynamic scenes. Other real-time algorithms support dynamic scenes at the expense of visibility. However, the absence of shadows can compromise image quality since shadows play an important role in understanding a scene. Annen et al. [Annen08] recently proposed an alternative technique for real-time soft shadows due to environment maps that is more accurate than ours. However, it is slower and consumes significantly more memory.
ALGORITHM OVERVIEW
For realistic shadows under complex environment lighting, we first determine the directions in the environment that cast the strongest shadows. Using importance sampling, we find high-intensity areas in an environment map (“Environment Map Importance Sampling” section). To find the occluding geometry, we then render orthographic shadow maps for the sampled directions. However, evaluating all these shadow maps for every pixel of every frame can be computationally burdensome. As an efficient alternative, we convert the occlusion defined per shadow map into a filtered visibility map defined per object (“Visibility Map Generation”).
Once the visibility map is computed, we can use it to render images with shadows. For diffuse reflections, we use a simple dot product operation per pixel (“Rendering Shadows on Diffuse Surfaces”). For glossy reflections, we adapt the filtered importance sampling algorithm [Colbert07, Krivánek08], where we add shadows by attenuating the contribution of each filtered sample using the visibility map (“Rendering Shadows on Glossy Surfaces”).
ENVIRONMENT MAP IMPORTANCE SAMPLING
Importance sampling is used to generate directions from which we will create the shadow maps. Our goal is to find more samples in directions where the environment map intensity is large, that is, produce the sample directions proportionally to the environment map luminance. The rationale is that the brightest parts of the environment cast the most pronounced shadows, so we want to compute the shadow maps for these directions.
We use a latitude-longitude mapping to associate a direction on the sphere with 2D texture coordinates, since the sampling procedure works in the rectangular texture domain. A pixel of the environment map with texture coordinates of (u, v) corresponds to the direction whose spherical coordinates are
$$\phi = 2\pi u, \qquad \theta = \pi (1 - v) \tag{4.3.1}$$
The Cartesian coordinates of the direction can be computed as:
$$x = \sin\theta \cos\phi, \qquad y = \sin\theta \sin\phi, \qquad z = \cos\theta \tag{4.3.2}$$
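Equations 4.3.1 and 4.3.2 translate directly into code. The following C++ sketch is one way to write the mapping; the `Vec3` struct and the function name `latLongToDirection` are hypothetical and are not taken from the demo source.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

// Map latitude-longitude texture coordinates (u, v) in [0, 1] to a unit
// direction: phi = 2*pi*u, theta = pi*(1 - v), then spherical to Cartesian.
Vec3 latLongToDirection(double u, double v) {
    assert(u >= 0.0 && u <= 1.0 && v >= 0.0 && v <= 1.0);
    const double kPi = 3.14159265358979323846;
    double phi = 2.0 * kPi * u;
    double theta = kPi * (1.0 - v);
    return { std::sin(theta) * std::cos(phi),
             std::sin(theta) * std::sin(phi),
             std::cos(theta) };
}
```

With this convention, v = 1 maps to the +z pole, v = 0 to the -z pole, and v = 0.5 to the equator.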
Before importance sampling starts, the RGB color of each pixel of the environment map is converted to luminance using the formula Y = 0.2126 R + 0.7152 G + 0.0722 B. The luminance is multiplied by the factor sin θ, where θ is the polar angle of the direction corresponding to each pixel (Equation 4.3.1), in order to compensate for the stretching near the poles due to the latitude-longitude mapping. This step gives us a luminance map of our lighting environment that we can use for sampling.
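A minimal C++ sketch of this preprocessing step follows. The `Rgb` struct, the function name `buildLuminanceMap`, the row-major layout, and the choice to evaluate the angle at texel centers are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Rgb { float r, g, b; };

// Convert a row-major latitude-longitude image into a luminance map weighted
// by sin(theta), with theta = pi * (1 - v) evaluated at each row's center.
std::vector<double> buildLuminanceMap(const std::vector<Rgb>& pixels,
                                      int width, int height) {
    assert(width > 0 && height > 0);
    assert(pixels.size() == static_cast<std::size_t>(width) * height);
    const double kPi = 3.14159265358979323846;
    std::vector<double> lum(pixels.size());
    for (int row = 0; row < height; ++row) {
        double v = (row + 0.5) / height;
        double sinTheta = std::sin(kPi * (1.0 - v));
        for (int col = 0; col < width; ++col) {
            std::size_t i = static_cast<std::size_t>(row) * width + col;
            const Rgb& p = pixels[i];
            double y = 0.2126 * p.r + 0.7152 * p.g + 0.0722 * p.b;
            lum[i] = y * sinTheta;  // compensate for stretching near the poles
        }
    }
    return lum;
}
```

Rows near the top and bottom of the image receive weights close to zero, so the many stretched pixels there are not oversampled.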
Formally, the luminance map is an unnormalized probability mass function (PMF). In general, a PMF is just a set of probabilities that a number of events will take place. In our case an event means choosing a particular pixel of the 2D environment map in the sampling procedure. In order to call our luminance map a PMF, we should first normalize it by scaling every pixel’s value such that all pixels sum to one. However, we can skip this costly explicit normalization step. Instead, we perform an implicit normalization by rescaling the random numbers used for sampling by the sum of all environment map pixels.
Our sampling problem is defined in 2D, but we can easily simplify it to sampling two 1D functions: First, we pick a row of the environment map and, second, we pick a pixel within the selected row. To select the row, we use the marginal PMF, which is simply the sum of luminances for each row of the environment map. This way, rows with brighter pixels are more likely to be sampled than rows with dim pixels. After selecting a row, we pick a pixel according to the 1D PMF given by the pixel luminances in the selected row.
This concept is illustrated in Figure 4.3.2.
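The decomposition into two 1D problems can be sketched in C++ as follows, assuming the luminance map is stored row-major in a flat array. The function names `marginalRowPmf` and `rowPmf` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum each row of a row-major luminance map, giving the unnormalized
// marginal PMF used to select a row.
std::vector<double> marginalRowPmf(const std::vector<double>& lum,
                                   int width, int height) {
    assert(width > 0 && height > 0);
    assert(lum.size() == static_cast<std::size_t>(width) * height);
    std::vector<double> rows(height, 0.0);
    for (int row = 0; row < height; ++row)
        for (int col = 0; col < width; ++col)
            rows[row] += lum[static_cast<std::size_t>(row) * width + col];
    return rows;
}

// Copy out one row: the unnormalized 1D PMF used to select a pixel within
// the row that was picked from the marginal PMF.
std::vector<double> rowPmf(const std::vector<double>& lum, int width, int row) {
    assert(width > 0 && row >= 0);
    std::size_t begin = static_cast<std::size_t>(row) * width;
    assert(begin + width <= lum.size());
    return std::vector<double>(lum.begin() + begin, lum.begin() + begin + width);
}
```

Neither PMF is normalized here, in keeping with the implicit normalization described earlier.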
Now all we need to perform sampling is a procedure to randomly pick an element according to a 1D PMF. Imagine that you stack the probabilities in the PMF next to each other as shown in Figure 4.3.3. If we choose a random number, where there exists an equal likelihood that any value between 0 and 1 will be produced, then more numbers will map to the higher-probability item, the third probability in our example, and thus we will sample that direction more often. The stacking of the PMF is formally known as the discrete cumulative distribution function (CDF).
FIGURE 4.3.2 Sampling from a 2D probability mass function (PMF). First, a row is selected using the 1D marginal PMF (given by the sum of the probabilities in each row). Second, a pixel in the selected row is picked using the 1D PMF for the selected row.
FIGURE 4.3.3 The discrete cumulative distribution function (CDF). By stacking the PMF of each column (left), we obtain a CDF (right). Using a random number u between zero and one, we can find the column whose CDF range contains u. Continuing this process, we get a distribution of column samples proportional to the PMF. In this case, since the third column is larger than the others, we will have more samples from that column.
Sampling from the PMF thus reduces to generating a random number u uniformly distributed between zero and the sum of all probabilities, and finding the first entry in the CDF larger than u. Binary search can be used for this purpose.
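A short C++ sketch of this 1D sampling step is given below. It builds the running sum with `std::partial_sum` and uses `std::upper_bound` as the binary search, since that call returns the first entry strictly larger than its argument. The function names `buildCdf` and `sampleCdf` and the clamp for an input of exactly one are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Stack an unnormalized PMF into a discrete CDF (running sum).
std::vector<double> buildCdf(const std::vector<double>& pmf) {
    std::vector<double> cdf(pmf.size());
    std::partial_sum(pmf.begin(), pmf.end(), cdf.begin());
    return cdf;
}

// Pick an index proportionally to the PMF. u01 is a number in [0, 1]; it is
// rescaled by the total sum instead of normalizing the PMF explicitly.
std::size_t sampleCdf(const std::vector<double>& cdf, double u01) {
    assert(!cdf.empty() && cdf.back() > 0.0);
    double u = u01 * cdf.back();
    // Binary search for the first CDF entry larger than u.
    auto it = std::upper_bound(cdf.begin(), cdf.end(), u);
    std::size_t index = static_cast<std::size_t>(it - cdf.begin());
    return std::min(index, cdf.size() - 1);  // guards the case u01 == 1
}
```

Because the search looks for an entry strictly larger than u, entries with zero probability are never selected.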
It would be undesirable if two or more samples were generated very close to each other, which could easily happen with purely random sampling. To prevent this sample clumping, we generate quasi-random numbers using the folded Hammersley sequence [Pharr04], which provides good sample distributions.
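One possible construction of such a sequence is sketched below in C++. It follows the folded radical inverse described in [Pharr04], in which a growing offset is added to each digit before the modulo. Pairing i/N with the base-2 folded radical inverse to form 2D points, and the names `foldedRadicalInverse` and `foldedHammersley`, are assumptions about how the points are assembled.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Folded radical inverse of n in the given base: like the usual radical
// inverse, but the i-th digit is offset by i before taking the modulo.
double foldedRadicalInverse(unsigned n, unsigned base) {
    assert(base >= 2);
    double value = 0.0;
    double invBase = 1.0 / base, invBi = invBase;
    unsigned offset = 0;
    // Stop once further digits fall below double precision.
    while (value + base * invBi != value) {
        unsigned digit = (n + offset) % base;
        value += digit * invBi;
        n /= base;
        invBi *= invBase;
        ++offset;
    }
    return value;
}

// 2D point set: (i / count, folded radical inverse of i in base 2).
std::vector<std::pair<double, double>> foldedHammersley(unsigned count) {
    assert(count > 0);
    std::vector<std::pair<double, double>> points;
    points.reserve(count);
    for (unsigned i = 0; i < count; ++i)
        points.emplace_back(static_cast<double>(i) / count,
                            foldedRadicalInverse(i, 2));
    return points;
}
```

Each pair can then supply the two numbers needed to pick a row and a pixel within that row.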
Let us now summarize the sampling procedure.