University of Minnesota


DTC Seminar Series

Complete Dictionary Recovery over the Sphere and Beyond


Ju Sun
Electrical Engineering
Columbia University

Thursday, April 16, 2015
3:45 p.m. reception
4:15 p.m. seminar

401/402 Walter Library

Ju Sun

How can we concisely represent a class of signals? This is the central problem to address in signal compression, and has become increasingly important to signal acquisition, processing, and analysis. Dictionary learning is an attractive conceptual framework that learns sparse representation for a collection of input signals, and has found numerous successful applications in modern signal processing and machine learning. By contrast, theoretical understanding of dictionary learning is still limited to date.

In this talk, I will focus on the problem of recovering a complete (i.e., square and invertible) dictionary A and coefficients X from Y = AX, provided X is sufficiently sparse. This recovery problem is central to the theoretical understanding of dictionary learning. I will describe an efficient algorithm that provably recovers A when X has O(n) nonzeros per column, under suitable probability model for X. This is the first result of its kind, as prior results based on efficient algorithms provide recovery guarantees when X has only O(n^{1/2}) nonzeros per column, severely limiting the model capacity of dictionary learning.

The algorithmic pipeline centers around solving a certain nonconvex optimization problem with a spherical constraint, and hence is naturally phrased in the language of manifold optimization. To show this apparently hard problem is tractable, we provide a geometric characterization of the high-dimensional objective landscape, which shows that with high probability there are no spurious local minima. This particular geometric structure allows us to design a Riemannian trust region algorithm over the sphere that provably converges to one global minimizer with an arbitrary initialization, despite the presence of saddle points.

Besides the new theoretical insight into the dictionary learning problem, the geometric approach we develop here may shed light on other problems arising from nonconvex recovery of structured signals.

This is joint work with Prof. John Wright and Mr. Qing Qu. A more formal summary of the result can be found at:

Ju Sun, Qing Qu, John Wright, "Complete Dictionary Recovery over the Sphere: a Summary."


Ju Sun received his B. Eng. degree in computer engineering (with a minor in Mathematics) from the National University of Singapore in 2008. He has been working towards a PhD degree in the Department of Electrical Engineering, Columbia University, New York, since 2011. His research interests lie at the intersection of computer vision, machine learning, numerical optimization, signal/image processing, information theory, and compressive sensing, focused on modeling, harnessing, and computing with low-dimensional structures in massive data, with provable guarantees and practical algorithms. Recently, he is particularly interested in explaining the surprisingly effectiveness of nonconvex optimization heuristics arising in interesting practical problems, such as representation learning. He maintains a webpage dedicated to this topic:

Contact Information

Digital Technology Center

499 Walter Library, 117 Pleasant Street SE, Minneapolis, MN 55455

P: (612) 624-9510

Connect on Social Media