The Steerable Pyramid
What is a steerable pyramid?
The Steerable Pyramid is a linear multi-scale, multi-orientation image decomposition that provides a useful front-end for image-processing and computer vision applications. We developed this representation in 1990 to overcome the limitations of the orthogonal separable wavelet decompositions that were then becoming popular for image processing: those representations are heavily aliased and do not represent oblique orientations well. Once the orthogonality constraint is dropped, it makes sense to completely reconsider the filter design problem, as opposed to just re-using orthogonal wavelet filters in a redundant representation, as is done in cycle-spinning or undecimated wavelet transforms. Detailed information may be found in the references listed below.
The basis functions of the steerable pyramid are Kth-order directional derivative operators (for any choice of K) that come in different sizes and K+1 orientations. As directional derivatives, they span a rotation-invariant subspace (i.e., they are equivariant with respect to rotation), and they are designed and sampled such that the whole transform forms a tight frame. An example decomposition of an image of a white disk on a black background is shown to the right. This particular steerable pyramid contains 4 orientation subbands at 2 scales. The smallest subband is the residual lowpass information. The residual highpass subband is not shown.
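The steering property can be illustrated in its simplest case, K = 1. The following is a minimal sketch, not the pyramid's actual filters: it assumes a first derivative of an isotropic Gaussian as the basis function (an illustrative choice), with the K+1 = 2 basis orientations at 0 and 90 degrees. The function names and the sampling grid are hypothetical. It checks that a filter rotated to an arbitrary angle phi is reproduced exactly by a linear combination of the two basis filters.

```python
import math

def gaussian(x, y, sigma=1.0):
    """Isotropic 2-D Gaussian (unnormalized); an assumed stand-in kernel."""
    return math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))

def d_dx(x, y, sigma=1.0):
    """Basis filter at 0 radians: derivative of the Gaussian along x."""
    return -x / sigma ** 2 * gaussian(x, y, sigma)

def d_dy(x, y, sigma=1.0):
    """Basis filter at pi/2 radians: derivative of the Gaussian along y."""
    return -y / sigma ** 2 * gaussian(x, y, sigma)

def steered(x, y, phi, sigma=1.0):
    """Filter at angle phi, synthesized from the two basis filters."""
    return math.cos(phi) * d_dx(x, y, sigma) + math.sin(phi) * d_dy(x, y, sigma)

def rotated(x, y, phi, sigma=1.0):
    """Reference: the x-derivative filter explicitly rotated by phi."""
    # Express the sample point in the rotated filter's own coordinates.
    xr = math.cos(phi) * x + math.sin(phi) * y
    yr = -math.sin(phi) * x + math.cos(phi) * y
    return d_dx(xr, yr, sigma)

def max_steering_error(phi, half_width=3.0, steps=25):
    """Largest difference between steered and rotated filters on a grid."""
    worst = 0.0
    for i in range(steps):
        for j in range(steps):
            x = -half_width + 2.0 * half_width * i / (steps - 1)
            y = -half_width + 2.0 * half_width * j / (steps - 1)
            worst = max(worst, abs(steered(x, y, phi) - rotated(x, y, phi)))
    return worst

if __name__ == "__main__":
    for degrees in (0, 30, 45, 117):
        print(degrees, max_steering_error(math.radians(degrees)))
```

For higher orders the same idea holds with K+1 basis orientations, although the interpolation weights are no longer just a cosine and a sine.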
The block diagram for the decomposition (both analysis and synthesis) is shown to the right. Initially, the image is separated into lowpass and highpass subbands, using filters L0 and H0. The lowpass subband is then divided into a set of oriented bandpass subbands and a low(er)-pass subband. This low(er)-pass subband is subsampled by a factor of 2 in the X and Y directions. The recursive construction of the pyramid is achieved by inserting a copy of the shaded portion of the diagram at the location of the solid circle (i.e., the lowpass branch). More detailed descriptions may be found in the references (below).
What advantages does it have over separable orthonormal wavelets?
The steerable pyramid performs a polar-separable decomposition in the frequency domain, thus allowing independent representation of scale and orientation. Since it is a tight frame, it obeys the generalized form of Parseval's Equality: The vector-length (L2-norm) of the coefficients equals that of the original signal.
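The orientation half of this polar-separable structure can be checked numerically. The sketch below assumes that the angular part of each oriented filter is proportional to cos^K(theta - theta_j), which is the frequency-domain angular dependence of a Kth-order directional derivative, with K+1 equally spaced orientations theta_j. The function names, the random test spectrum, and the normalization helper are illustrative and are not taken from the released code. It shows that the squared angular responses sum to one at every angle, so splitting a spectrum into orientation bands preserves its energy.

```python
import math
import random

def angular_masks(order, theta):
    """Normalized angular responses of the order+1 oriented filters at theta."""
    n_bands = order + 1
    # Mean of cos^(2*order) over a full period is comb(2*order, order) / 4**order;
    # the n_bands squared masks therefore sum to n_bands times that value.
    total = n_bands * math.comb(2 * order, order) / 4 ** order
    scale = 1.0 / math.sqrt(total)
    centers = [math.pi * j / n_bands for j in range(n_bands)]
    return [scale * math.cos(theta - c) ** order for c in centers]

def split_energy(order, n_angles=360, seed=0):
    """Energy of a random angular spectrum before and after the orientation split."""
    rng = random.Random(seed)
    energy_in = 0.0
    energy_out = 0.0
    for i in range(n_angles):
        theta = 2.0 * math.pi * i / n_angles
        value = rng.gauss(0.0, 1.0)  # hypothetical spectrum sample at this angle
        energy_in += value ** 2
        energy_out += sum((m * value) ** 2 for m in angular_masks(order, theta))
    return energy_in, energy_out

if __name__ == "__main__":
    for order in range(4):
        print(order, split_energy(order))
```

A complete tight frame additionally requires the radial (scale) filters to have squared responses that sum to one across scales; that part is not modeled here.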
More importantly, the representation is translation-invariant (i.e., the subbands are aliasing-free, or equivariant with respect to translation) and rotation-invariant (i.e., the subbands are steerable, or equivariant with respect to rotation). This can make a big difference in applications that involve representation of position or orientation of image structure.
The primary drawback is that the representation is overcomplete by a factor of 4k/3, where k is the number of orientation bands. Also, the filter design problem is messy, and so space-domain implementation is not perfect-reconstruction (although errors are small enough for most applications). We typically use a frequency-domain implementation, which provides perfect reconstruction, but the resulting filters exhibit more spatial "ringing".
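The 4k/3 figure follows from a geometric series: there are k oriented subbands at full resolution, k more at one quarter of the samples, and so on, giving k(1 + 1/4 + 1/16 + ...) = 4k/3. The sketch below simply counts coefficients. It assumes the layout described above (a full-resolution highpass residual, k bandpass subbands per scale, subsampling by 2 in each direction between scales, and a final lowpass residual); the function names and the dictionary keys are hypothetical. The residual bands are tallied separately from the oriented subbands.

```python
def subband_sizes(height, width, n_orientations, n_scales):
    """Coefficient count for every subband of the assumed pyramid layout."""
    sizes = {"highpass": height * width}
    h, w = height, width
    for scale in range(n_scales):
        for band in range(n_orientations):
            sizes[f"band_s{scale}_o{band}"] = h * w
        # The lowpass branch is subsampled by 2 in X and Y before recursing.
        h, w = h // 2, w // 2
    sizes["lowpass"] = h * w
    return sizes

def oriented_overcompleteness(height, width, n_orientations, n_scales):
    """Ratio of oriented-subband coefficients to image pixels."""
    sizes = subband_sizes(height, width, n_orientations, n_scales)
    oriented = sum(n for name, n in sizes.items() if name.startswith("band"))
    return oriented / (height * width)

if __name__ == "__main__":
    k = 4
    for n_scales in (1, 2, 3, 5):
        ratio = oriented_overcompleteness(256, 256, k, n_scales)
        print(n_scales, ratio, 4 * k / 3)
```

With more scales the ratio approaches 4k/3 from below, since each added scale contributes a quarter of the coefficients of the one before it.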
Here is a table comparing properties to other well-known transforms (more information about these transforms may be found in this book chapter):
Property | Steerable Pyramid | Separable Orthog. Wavelet | Laplacian Pyramid | Gabor (octave) | Block DCT
jointly-localized (space/frequency) | yes | yes (can be) | yes | not inverse | not in frequency |
translation-equivariant (no aliasing) | yes (approx) | no | yes (approx) | no | no |
oriented kernels | yes | no (not diagonals) | N/A | yes | no |
rotation-equivariant (steerable) | yes (approx) | no | N/A | no | no |
tight frame (self-inverting) | yes (approx) | yes | no | no | yes |
overcompleteness | 4k/3 | 1 | 4/3 | 1 | 1 |
For what applications is the steerable pyramid useful?
Applications include: orientation analysis, noise removal and enhancement, transient detection, texture representation and synthesis.
Some more examples: Bill Freeman's, Filip Rooms'
How can I try it out?
Matlab source code is available in our GitHub repository. More information may be found in the README file. A listing of the contents may be found in the Contents file. The latest modifications to the program are described in the ChangeLog file. Some older C source code is also available, although the filters accompanying this code are not very accurate. More information on the C code may be found in its README file.
Partial List of References
Steerable Pyramid Transforms
M Unser, N Chenouard, and D Van De Ville. Steerable Pyramids and Tight Wavelet Frames in L2(Rd). IEEE Trans. Image Processing, 20(10):2705-2721, Oct 2011.
J Portilla and E P Simoncelli. A Parametric Texture Model based on Joint Statistics of Complex Wavelet Coefficients. Int'l Journal of Computer Vision. October, 2000. [This paper describes and uses a complex steerable pyramid]
R Manduchi, P Perona and D Shy. Efficient Deformable Filter Banks. IEEE Trans Signal Processing, 46(4):1168-1173, 1998.
A Karasaridis and E Simoncelli. A Filter Design Technique for Steerable Pyramid Image Transforms. Int'l Conf. Acoustics Speech and Signal Processing. Atlanta GA, May 1996.
E P Simoncelli and W T Freeman. The Steerable Pyramid: A Flexible Architecture for Multi-Scale Derivative Computation. IEEE Second Int'l Conf on Image Processing. Washington DC, October 1995. [This paper describes the decomposition implemented in the source code above]
H Greenspan, S Belongie, R Goodman, P Perona, S Rakshit, and C H Anderson. Overcomplete steerable pyramid filters and rotation invariance. Proceedings CVPR 1994, pp. 222-228.
D Shy and P Perona. X-Y Separable Pyramid Steerable Scalable Kernels. Proceedings CVPR 1994, pp. 237-244.
E P Simoncelli, W T Freeman, E H Adelson and D J Heeger. Shiftable Multi-Scale Transforms [or, "What's Wrong with Orthonormal Wavelets"]. IEEE Trans. Information Theory, Special Issue on Wavelets. Vol. 38, No. 2, pp. 587-607, March 1992.
Applications
S Lyu and E P Simoncelli. Modeling multiscale subbands of photographic images with fields of Gaussian scale mixtures. IEEE Trans. Patt. Analysis and Machine Intelligence, Apr 2009. [state-of-the-art denoising, as of 2010 - equal to BM3D]
J A Guerrero-Colón, E P Simoncelli and J Portilla. Image denoising using mixtures of Gaussian scale mixtures. Proc 15th IEEE Int'l Conf on Image Proc, Oct 2008.
M Raphan and E P Simoncelli. Optimal denoising in redundant representations. IEEE Trans Image Processing, Aug 2008.
J Portilla, M Wainwright, V Strela, E P Simoncelli. Image denoising using a scale mixture of Gaussians in the wavelet domain. IEEE Transactions on Image Processing, Nov 2003.
J Portilla and E P Simoncelli. A Parametric Texture Model based on Joint Statistics of Complex Wavelet Coefficients. Int'l Journal of Computer Vision. October, 2000.
E P Simoncelli. Bayesian Denoising of Visual Images in the Wavelet Domain. In Bayesian Inference in Wavelet Based Models, eds. P Müller and B Vidakovic. Springer-Verlag, Lecture Notes in Statistics 141, August 1999.
E P Simoncelli and J Portilla. Texture Characterization via Joint Statistics of Wavelet Coefficient Magnitudes. 5th IEEE Int'l Conf on Image Processing. Chicago, IL. Oct 4-7, 1998.
D Heeger and J Bergen. Pyramid-based Texture Analysis/Synthesis. Proceedings, ACM Siggraph, August, 1995.
E P Simoncelli and E H Adelson. Noise Removal via Bayesian Wavelet Coring. IEEE Third Int'l Conf on Image Processing. Lausanne Switzerland, September 1996.
J S Nimeroff, E Simoncelli, J Dorsey. Efficient re-rendering of naturally illuminated environments. 5th Annual Eurographics Symposium on Rendering, 1994.
Steerability, Steerable filters, Derivative filters
H Farid and E P Simoncelli. Differentiation of discrete multi-dimensional signals. IEEE Trans Image Processing, 13(4):496-508, Apr 2004.
J W Zweck and L R Williams. Euclidean group invariant computation of stochastic completion fields using shiftable-twistable functions, December, 1999.
H Farid and E P Simoncelli. Optimally rotation-equivariant directional derivative kernels. Int'l Conf Computer Analysis of Images and Patterns, 207-214, Kiel, Germany, 1997.
P Teo and Y Hel-Or. A Computational Group-Theoretic Approach to Steerable Functions. STAN-CS-TN-96-33, Dept. of Computer Science, Stanford University, April 1996.
E Simoncelli and H Farid. Steerable Wedge Filters for Local Orientation Analysis. IEEE Trans. Image Processing, Sept 1996.
E P Simoncelli. A Rotation-Invariant Pattern Signature. 3rd IEEE Int'l Conf on Image Processing. Lausanne Switzerland, Sept 1996.
M Michaelis and G Sommer. A Lie Group-Approach to Steerable Filters. Pattern Recognition Letters, v16, n11. November, 1995. pp. 1165-1174.
W Beil. Steerable Filters and Invariance Theory. Pattern Recognition Letters, v16, n11, 1994. pp. 453-460.
Klas Nordberg. Signal Representation and Processing Using Operator Groups. Linkoping University Dissertation, No. 366. 1994.
Eero P Simoncelli. Design of Multi-dimensional Derivative Filters. IEEE First Int'l Conf on Image Processing. Austin TX, November 1994.
J Segman and Y Y Zeevi. Image Analysis by Wavelet-Type Transforms: Group Theoretic Approach. J. Mathematical Imaging and Vision 3, pp 51-77, 1993.
Pietro Perona. Steerable-scalable kernels for edge detection and junction analysis. 2nd European Conf. Computer Vision (1992), pp. 3-18.
W T Freeman and E H Adelson. The Design and Use of Steerable Filters. IEEE Trans. Patt. Anal. Mach. Intell., Vol 13 Num 9, pp 891-906, September 1991.
J G Daugman. Six Formal Properties of Anisotropic Visual Filters: Structural Principles and Frequency/Orientation Selectivity. IEEE Trans. Systems, Man, and Cybernetics. vol13, pp882-887. 1983.
H Knutsson and G H Granlund. Texture analysis using two-dimensional quadrature filters. IEEE Computer Society Workshop on Computer Architecture for Pattern Analysis and Image Database Management, 1983, pp. 206-213.
Per-Erik Danielsson. Rotation-Invariant Linear Operators with Directional Response. 5th Int'l Conf. Patt. Rec., Miami, December, 1980.
Related (Multi-Scale, Oriented) Image Transforms
J L Starck, E J Candès and D L Donoho. The Curvelet Transform for Image Denoising. IEEE Trans Image Processing, 11, 670-684, 2002.
R Navarro, A Tabernero, and G Cristobal. Image Representations with Gabor Wavelets and its Applications. Advances in Imaging and Electron Physics, vol 97, 1996.
E P Simoncelli and E H Adelson. Non-separable Extensions of Quadrature Mirror Filters to Multiple Dimensions. Proceedings of the IEEE, 78(4): 652-664, April, 1990.
E P Simoncelli and E H Adelson. Subband Image Coding with Hexagonal Quadrature Mirror Filters. Proc. Picture Coding Symposium, Cambridge, MA, March 1990.
M Porat and Y Zeevi. The Generalized Gabor Scheme of Image Representation in Biological and Machine Vision. IEEE Trans. PAMI. 10:452-468, 1988.
J G Daugman. Complete discrete 2-D Gabor transforms by neural networks for image analysis and compression. IEEE Trans. ASSP, 36(7): 1169-1179, 1988.
A B Watson. The cortex transform: rapid computation of simulated neural images. Comp. Vis. Graph. Image Proc. 39:311-327, 1987.
General Books on Wavelets
Barbara Burke Hubbard. The World According to Wavelets. A.K. Peters, Wellesley MA, 1996.
Stephane Mallat. A Wavelet Tour of Signal Processing. Academic Press, 1998.
Gilbert Strang and Truong Nguyen. Wavelets and Filter Banks. Wellesley-Cambridge Press, Wellesley MA, 1996.
Martin Vetterli and Jelena Kovacevic. Wavelets and Subband Coding. Prentice Hall, 1995.