Paper's Information
Authors
Sylvain Paris and Frédo Durand
Periodical
A. Leonardis, H. Bischof, and A. Pinz (Eds.): ECCV 2006, Part IV, LNCS 3954, pp. 568–580, 2006. © Springer-Verlag Berlin Heidelberg 2006
Title
A Fast Approximation of the Bilateral Filter Using a Signal Processing Approach
Background
The bilateral filter is a nonlinear filter that smoothes a signal while preserving strong edges. It has demonstrated great effectiveness for a variety of problems in computer vision and computer graphics, and a fast version has been proposed. Unfortunately, little is known about the accuracy of such acceleration.
Abstract
In this paper, we propose a new signal processing analysis of the bilateral filter, which complements the recent studies that analyzed it as a PDE or as a robust statistics estimator. Importantly, this signal-processing perspective allows us to develop a novel bilateral filtering acceleration using a downsampling in space and intensity. This affords a principled expression of the accuracy in terms of bandwidth and sampling. The key to our analysis is to express the filter in a higher-dimensional space where the signal intensity is added to the original domain dimensions. The bilateral filter can then be expressed as simple linear convolutions in this augmented space followed by two simple nonlinearities. This allows us to derive simple criteria for downsampling the key operations and to achieve important acceleration of the bilateral filter. We show that, for the same running time, our method is significantly more accurate than previous acceleration techniques.
Introduction
Elad [11] proposes an acceleration method using Gauss-Seidel iterations, but it only applies when multiple iterations of the filter are required.
Durand and Dorsey [3] describe a linearized version of the filter that achieves dramatic speed-ups by downsampling the data, achieving running times under one second. Unfortunately, this technique is not grounded on firm theoretical foundations, and it is difficult to evaluate the accuracy that is sacrificed.
In this paper, we build on this work but we interpret the bilateral filter in terms of signal processing in a higher-dimensional space. This allows us to derive an improved acceleration scheme that yields equivalent running times but dramatically improves numerical accuracy.
Related Work
Our work follows a similar idea and also uses S×R to describe bilateral filtering. Our formulation is nonetheless significantly different because we not only use the higher-dimensional space for the definition of a distance, but we also use convolution in this space.
The main difference between our study and existing work is that the previous approaches link bilateral filtering to another nonlinear filter based on PDEs or statistics, whereas we cast our study into a signal processing framework. We demonstrate that the bilateral filter can be mainly computed with linear operations, the nonlinearities being grouped in a final step.
Elad [11] uses Gauss-Seidel iterations to accelerate the convergence of iterative filtering. Unfortunately, no results are shown, and this technique is only useful when the filter is iterated to reach the stable point, which is not the standard use of the bilateral filter (one iteration or only a few).
Durand and Dorsey [3] linearize the bilateral filter and propose a downsampling scheme to accelerate the computation down to a few seconds or less. However, no theoretical study is proposed, and the accuracy of the approximation is unclear.
In comparison, we base our technique on signal processing grounds, which helps us define a new and meaningful numerical scheme. Our algorithm performs low-pass filtering in a higher-dimensional space than Durand and Dorsey's [3]. The cost of a higher-dimensional convolution is offset by the accuracy gain, which yields better performance for the same accuracy.
Signal Processing Approach
We decompose the bilateral filter into a convolution followed by two nonlinearities. We study each component separately and isolate it in the computation flow.
Homogeneous Intensity
By assigning a couple (W_q I_q, W_q) to each pixel q, we express the filtered pixels as linear combinations of their adjacent pixels. Of course, we have not "removed" the division since, to access the actual value of the intensity, the first coordinate (WI) still has to be divided by the second one (W). This can be compared with homogeneous coordinates used in projective geometry. Adding an extra coordinate to our data makes most of the computation pipeline computable with linear operations; a division is made only at the final stage. Inspired by this parallel, we call the couple (WI, W) the homogeneous intensity. Although Equation (3) is a linear combination, this does not define a linear filter yet since the weights depend on the actual values of the pixels. The next section addresses this issue.
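For reference, here is a reconstruction of the homogeneous formulation that Equation (3) denotes, assembled from the definitions above (the input image has W_q = 1 everywhere):

$$ \left( W_p^{bf} I_p^{bf},\; W_p^{bf} \right) \;=\; \sum_{q \in \mathcal{S}} G_{\sigma_s}\!\left( \lVert p - q \rVert \right)\, G_{\sigma_r}\!\left( \lvert I_p - I_q \rvert \right) \left( W_q I_q,\; W_q \right) $$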
The Bilateral Filter as a Convolution
But the range weight depends on I_p − I_q, and there is no summation over I. To overcome this point, we introduce an additional dimension ζ and sum over it.
Thus, we have reached our goal. The bilateral filter is expressed as a convolution followed by nonlinear operations:
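A reconstruction of the expression announced here, assembled from the surrounding definitions: the additional dimension enters through the identity

$$ G_{\sigma_r}\!\left( \lvert I_p - I_q \rvert \right) \;=\; \int_{\mathcal{R}} \delta(\zeta - I_q)\, G_{\sigma_r}\!\left( \lvert I_p - \zeta \rvert \right) \mathrm{d}\zeta, $$

and, with $(wi, w)(x, \zeta) = \delta(\zeta - I(x))\,(I(x),\, 1)$, the bilateral filter becomes a convolution with the separable Gaussian kernel $g_{\sigma_s, \sigma_r}(x, \zeta) = G_{\sigma_s}(\lVert x \rVert)\, G_{\sigma_r}(\lvert \zeta \rvert)$ defined on S×R, followed by a division and a slicing at the intensity of the input image:

$$ \left( w^{bf} i^{bf},\; w^{bf} \right) = g_{\sigma_s, \sigma_r} \otimes (wi,\; w), \qquad I^{bf}(p) = \frac{ \left( w^{bf} i^{bf} \right)\!\left( p,\, I(p) \right) }{ w^{bf}\!\left( p,\, I(p) \right) }. $$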
Intuition
We propose an informal description of the process before discussing its consequences further.
The spatial domain S is a classical xy image plane, and the range domain R is a simple axis labelled ζ.
Then, using these two functions wi and w, the bilateral filter is computed as follows. First, we "blur" wi and w, i.e. we convolve wi and w with a Gaussian defined on xyζ. This results in the functions w^bf i^bf and w^bf. For each point of the xyζ space, we compute i^bf(x, y, ζ) by dividing (w^bf i^bf)(x, y, ζ) by w^bf(x, y, ζ). The final step is to get the value of the pixel (x, y) of the filtered image I^bf: it directly corresponds to the value of i^bf at (x, y, I(x, y)), which is the point where the input image I was "plotted". Figure 1 illustrates this process on a simple 1D image.
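As a concrete illustration, here is a minimal NumPy sketch of this full-resolution pipeline, assuming a grayscale image with intensities in [0, 1]; the function name and the n_bins discretization of the ζ axis are ours, not the paper's. It is memory-hungry at full resolution, which is precisely what the fast approximation below addresses.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def bilateral_via_3d_convolution(image, sigma_s, sigma_r, n_bins=256):
    """Bilateral filter as a 3D blur: plot the image in the x-y-zeta
    volume, blur with a Gaussian, divide, then slice at (x, y, I(x, y))."""
    h, w = image.shape
    # Quantize intensities in [0, 1] onto n_bins samples of the zeta axis.
    zeta = np.clip(np.round(image * (n_bins - 1)).astype(int), 0, n_bins - 1)
    xs, ys = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Homogeneous volumes (wi, w): each pixel deposits (I, 1) at its level.
    wi = np.zeros((h, w, n_bins))
    wv = np.zeros((h, w, n_bins))
    wi[xs, ys, zeta] = image
    wv[xs, ys, zeta] = 1.0

    # Linear part: a single Gaussian convolution in the augmented space.
    sigmas = (sigma_s, sigma_s, sigma_r * (n_bins - 1))
    wi = gaussian_filter(wi, sigmas)
    wv = gaussian_filter(wv, sigmas)

    # Nonlinear part: pointwise division, then slicing.
    ibf = wi / np.maximum(wv, 1e-10)
    return ibf[xs, ys, zeta]
```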
Fast Approximation
In practice, we downsample (wi, w), perform the convolution at this coarser resolution, and upsample the result.
We use box-filtering for the prefilter of the downsampling (a.k.a. average downsampling), and linear upsampling.
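Here is a hypothetical NumPy sketch of this scheme, again assuming a grayscale image in [0, 1]; the names fast_bilateral, s_rate, and r_rate are ours. Accumulating pixels into coarse cells plays the role of the box prefilter (the per-cell normalization cancels in the final division), and map_coordinates with order=1 performs the linear upsampling and the slicing in one step.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def fast_bilateral(image, sigma_s, sigma_r, s_rate=None, r_rate=None):
    """Downsample (wi, w) onto a coarse grid, convolve there, then
    linearly upsample while slicing at each pixel's intensity."""
    s_rate = s_rate or sigma_s  # the paper's evaluation suggests sampling
    r_rate = r_rate or sigma_r  # steps on the order of (sigma_s, sigma_r)
    h, w = image.shape
    gh, gw = int(h / s_rate) + 2, int(w / s_rate) + 2
    gz = int(1.0 / r_rate) + 2  # intensities assumed in [0, 1]

    # Coarse homogeneous grids; summing into cells = box prefilter.
    wi = np.zeros((gh, gw, gz))
    wv = np.zeros((gh, gw, gz))
    xs, ys = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    gx = np.round(xs / s_rate).astype(int)
    gy = np.round(ys / s_rate).astype(int)
    gzeta = np.round(image / r_rate).astype(int)
    np.add.at(wi, (gx, gy, gzeta), image)
    np.add.at(wv, (gx, gy, gzeta), 1.0)

    # In grid units the Gaussian shrinks to a sigma of ~1 cell per axis.
    sig = (sigma_s / s_rate, sigma_s / s_rate, sigma_r / r_rate)
    wi = gaussian_filter(wi, sig)
    wv = gaussian_filter(wv, sig)

    # Trilinear interpolation at the fractional coordinates of each pixel
    # combines the linear upsampling and the slicing.
    coords = [xs / s_rate, ys / s_rate, image / r_rate]
    num = map_coordinates(wi, coords, order=1)
    den = map_coordinates(wv, coords, order=1)
    return num / np.maximum(den, 1e-10)
```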
Evaluation
To evaluate the error induced by our approximation, we compare the result I^bf↓↑ from the fast algorithm to I^bf obtained from Equation (1). We compute the peak signal-to-noise ratio (PSNR), considering R = [0; 1], i.e. PSNR = 10 log10(1 / MSE), where MSE is the mean squared error between the two results.
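A direct transcription of this measure, assuming both images are arrays of intensities in [0, 1]:

```python
import numpy as np

def psnr(approx, exact):
    """PSNR in dB for intensities in [0, 1] (peak value = 1)."""
    mse = np.mean((approx - exact) ** 2)
    return 10.0 * np.log10(1.0 / mse)
```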
Figure 2 shows that this remark is globally valid in practice. A closer look at the plots reveals that S can be downsampled slightly more aggressively than R. This is probably due to the nonlinearities and the anisotropy of the signal.
Fig. 2. Accuracy evaluation. All the images are filtered with (σ_s = 16, σ_r = 0.1). The PSNR in dB is evaluated at various samplings of S and R (greater is better). Our approximation scheme is more robust to space downsampling than range downsampling. It is also slightly more accurate on structured scenes (a, b) than stochastic ones (c).
Figure 3-left shows the running times for the architectural picture with the same settings. In theory, the gain from space downsampling should be twice the one from range downsampling since S is two-dimensional and R one-dimensional.
Fig. 3. Left: Running times on the architectural picture with (σ_s = 16, σ_r = 0.1). The PSNR isolines are plotted in gray. Exact computation takes about 1 h. Right: Accuracy-versus-time comparison. Both methods are tested on the architectural picture (1600 × 1200) with the same sampling rates of S×R (from left to right): (4; 0.025), (8; 0.05), (16; 0.1), (32; 0.2), (64; 0.4).
Our scheme achieves a speed-up of two orders of magnitude: direct computation of Equation (1) takes about one hour, whereas our approximation requires about one second. This dramatic improvement opens avenues for interactive applications.
Comparison with the Durand-Dorsey Speed-Up
The difference comes from the downsampling approach. Durand and Dorsey interleave linear and nonlinear operations: the division is done after the convolution but before the upsampling. There is no simple theoretical basis to estimate the error. More importantly, the Durand-Dorsey strategy is such that the intensity ι and the weight ω are functions defined on S only: a given pixel has only one intensity and one weight. After downsampling, both sides of a discontinuity may be represented by the same values of ι and ω. This is a poor representation of discontinuities, since they inherently involve several values.
In comparison, we define functions on S×R. For a given image point in S, we can handle several values on the R domain. The advantage of working in S×R is that this characteristic is not altered by downsampling. This is the major reason why our scheme is more accurate than the Durand-Dorsey technique, especially at discontinuities.
Implementation
Fig. 4. We have tested our approximated scheme on three images (first row): an artificial image (512 × 512) with different types of edges and a white-noise region, an architectural picture (1600 × 1200) with strong and oriented features, and a natural photograph (800 × 600) with more stochastic textures. For clarity, we present representative close-ups (second row). Full-resolution images are available on our website. Our approximation produces results (fourth row) visually similar to the exact computation (third row). A color-coded subtraction (fifth row) reveals subtle differences at the edges (red: negative, black: 0, blue: positive). In comparison, the Durand-Dorsey approximation introduces large visual discrepancies: the details are washed out (bottom row). All the filters are computed with σ_s = 16 and σ_r = 0.1. Our filter uses a sampling rate of (16; 0.1). The sampling rate of the Durand-Dorsey filter is chosen in order to achieve the same (or slightly superior) running time. Thus, the comparison is done fairly, using the same time budget.
Discussion
Dimensionality.
Our separation into linear and nonlinear parts comes at the cost of the additional ζ dimension.
Comparison with Generalized Intensity.
We not only manipulate points in the S×R space but also define functions on it and perform convolutions and slicing.
Complexity.
One of the advantages of our separation is that the most complex part of the algorithm is the convolution. Using |·| for the cardinality of a set, the convolution can be done in O(|S||R| log(|S||R|)) using the fast Fourier transform and multiplication in the frequency domain.
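As an illustration of this bound (not necessarily the paper's implementation; after downsampling the kernel is small, and a direct separable convolution can be cheaper in practice), here is an FFT-based blur of a homogeneous volume, with a kernel construction that is ours:

```python
import numpy as np
from scipy.signal import fftconvolve

def gaussian_kernel_3d(sigmas, radius=3.0):
    """Separable 3D Gaussian sampled out to `radius` standard deviations."""
    axes = [np.arange(-int(radius * s), int(radius * s) + 1) for s in sigmas]
    gs = [np.exp(-a ** 2 / (2.0 * s ** 2)) for a, s in zip(axes, sigmas)]
    kernel = gs[0][:, None, None] * gs[1][None, :, None] * gs[2][None, None, :]
    return kernel / kernel.sum()

# O(|S||R| log(|S||R|)) blur of a homogeneous volume such as wi or w:
# blurred = fftconvolve(volume, gaussian_kernel_3d((s, s, r)), mode="same")
```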
Summary
This paper proposes a new acceleration method for the bilateral filter:
1. The bilateral filter is viewed as an operation in a higher-dimensional space. This space includes not only the spatial dimensions of the image (e.g., the x, y coordinates of a 2D image) but also an intensity dimension (e.g., the gray level or color value). In this higher-dimensional space, the bilateral filter can be expressed as a simple linear convolution followed by two simple nonlinear transformations.
2. The bilateral filter is downsampled in this higher-dimensional space. Since the filter can be expressed as a linear convolution in this space, the computation can be accelerated by downsampling, both in the spatial dimensions and in the intensity dimension. With a sensible choice of sampling steps, the computation is significantly accelerated while accuracy is preserved.
3. The downsampled result is compensated. Since downsampling loses some high-frequency information, the result must be upsampled to recover the lost detail; this is done by interpolation in both the spatial and intensity dimensions.
4. The accuracy of the acceleration is measured in terms of bandwidth and sampling rate. Since the bilateral filter can be expressed as a linear convolution in a higher-dimensional space, frequency-domain concepts apply to its analysis: the larger the sampling step, the greater the speed-up but also the distortion. Analyzing the spectral properties of the bilateral filter in this space yields a quantitative relation between bandwidth, sampling rate, and accuracy.
The novelty of this paper is that it analyzes the bilateral filter from a signal processing point of view and derives a new downsampling-based acceleration. Compared with previous acceleration methods, it achieves higher accuracy for the same running time. It also provides a quantitative way to evaluate the accuracy of the acceleration, namely via bandwidth and sampling rate. This offers a new line of thinking for accelerating the bilateral filter.
Reference
Peak signal-to-noise ratio - Wikipedia, the free encyclopedia (wikipedia.org)