Why Gabor Frames? Two Fundamental Measures of Coherence and Their Role in Model Selection

  • Bajwa, Waheed U. (Program in Applied and Computational Mathematics, Princeton University)
  • Calderbank, Robert (Department of Electrical Engineering and the Program in Applied and Computational Mathematics, Princeton University)
  • Jafarpour, Sina (Department of Computer Science, Princeton University)
  • Received: 2010.06.03
  • Accepted: 2010.07.02
  • Published: 2010.08.31

Abstract

The problem of model selection arises in a number of contexts, such as subset selection in linear regression, estimation of structures in graphical models, and signal denoising. This paper studies non-asymptotic model selection for the general case of arbitrary (random or deterministic) design matrices and arbitrary nonzero entries of the signal. In this regard, it generalizes the notion of incoherence in the existing literature on model selection and introduces two fundamental measures of coherence among the columns of a design matrix: the worst-case coherence and the average coherence. It utilizes these two measures of coherence to provide an in-depth analysis of a simple, model-order agnostic one-step thresholding (OST) algorithm for model selection, and it proves that OST is feasible for exact as well as partial model selection as long as the design matrix obeys an easily verifiable property, termed the coherence property. One of the key insights offered by the ensuing analysis is that OST can successfully carry out model selection even when methods based on convex optimization, such as the lasso, fail because of rank deficiency of the submatrices of the design matrix. In addition, the paper establishes that if the design matrix has reasonably small worst-case and average coherence, then OST performs near-optimally when either (i) the energy of any nonzero entry of the signal is close to the average signal energy per nonzero entry or (ii) the signal-to-noise ratio in the measurement system is not too high. Finally, two other key contributions of the paper are that (i) it provides bounds on the average coherence of Gaussian matrices and Gabor frames, and (ii) it extends the results on model selection using OST to low-complexity, model-order agnostic recovery of sparse signals with arbitrary nonzero entries. In particular, this part of the analysis implies that an Alltop Gabor frame together with OST can successfully carry out model selection and recovery of sparse signals, irrespective of the phases of the nonzero entries, even if the number of nonzero entries scales almost linearly with the number of rows of the Alltop Gabor frame.
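
The abstract refers to four objects without formulas: the worst-case coherence μ(X) = max_{i≠j} |⟨x_i, x_j⟩| and the average coherence ν(X) = (1/(p−1)) max_i |Σ_{j≠i} ⟨x_i, x_j⟩| of an n × p design matrix X with unit-norm columns, the OST algorithm (threshold the correlation vector X^H y), and the Gabor frame generated by the Alltop sequence. The following minimal Python sketch, written for this page rather than taken from the paper, illustrates all four; the dimension n = 31, sparsity k = 5, noise level, and threshold value are illustrative assumptions, not the paper's parameters.

```python
# A minimal numerical sketch (not the authors' code) of worst-case coherence,
# average coherence, one-step thresholding (OST), and an Alltop Gabor frame.
import numpy as np


def alltop_gabor_frame(n):
    """All n^2 time-frequency shifts of the Alltop sequence
    g[l] = exp(2j*pi*l^3/n)/sqrt(n) (n prime, n >= 5), returned as an
    n x n^2 matrix with unit-norm columns."""
    l = np.arange(n)
    g = np.exp(2j * np.pi * l**3 / n) / np.sqrt(n)
    cols = []
    for tau in range(n):                     # time shifts
        g_tau = np.roll(g, tau)
        for w in range(n):                   # frequency (modulation) shifts
            cols.append(np.exp(2j * np.pi * w * l / n) * g_tau)
    return np.stack(cols, axis=1)


def coherence_measures(X):
    """Worst-case coherence mu(X) = max_{i != j} |<x_i, x_j>| and average
    coherence nu(X) = max_i |sum_{j != i} <x_i, x_j>| / (p - 1) for a
    matrix X with p unit-norm columns."""
    p = X.shape[1]
    off = X.conj().T @ X - np.eye(p)         # Gram matrix minus its diagonal
    mu = np.abs(off).max()
    nu = np.abs(off.sum(axis=1)).max() / (p - 1)
    return mu, nu


def ost(X, y, threshold):
    """One-step thresholding: keep the columns whose correlation with the
    measurement vector exceeds the threshold."""
    return np.flatnonzero(np.abs(X.conj().T @ y) > threshold)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 31                                   # prime, so the Alltop sequence exists
    X = alltop_gabor_frame(n)                # n x n^2 design matrix

    mu, nu = coherence_measures(X)
    print(f"mu = {mu:.4f} (1/sqrt(n) = {1 / np.sqrt(n):.4f}), nu = {nu:.2e}")

    # Sparse signal with arbitrary (random) phases on its nonzero entries.
    k = 5
    support = rng.choice(X.shape[1], size=k, replace=False)
    beta = np.zeros(X.shape[1], dtype=complex)
    beta[support] = np.exp(2j * np.pi * rng.random(k))
    noise = 0.05 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    y = X @ beta + noise

    # Heuristic threshold for this sketch; the paper derives an analytical
    # choice from the noise level and the two coherence measures.
    estimated = ost(X, y, threshold=0.5)
    print("true support:     ", sorted(support.tolist()))
    print("estimated support:", sorted(estimated.tolist()))
```

For n = 31 the measured worst-case coherence matches the known value 1/√n ≈ 0.18 for Alltop Gabor frames, while the average coherence comes out more than an order of magnitude smaller; that separation between ν and μ is the regime targeted by the paper's coherence property, under which OST recovers the support.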

Acknowledgement

Supported by: NSF, ONR, and AFOSR.
