Instructor: Yi Yang, Burnside Hall 1241
Lectures:
Monday, 10:05 AM – 11:25 AM, Burnside Hall 1205
Wednesday, 3:05 PM – 4:25 PM, Burnside Hall 1234
Office hours:
Wednesday, 4:00 PM – 6:00 PM, Burnside Hall 1241
SB: Convex Optimization: Algorithms and Complexity, by Sébastien Bubeck.
PP: Non-convex Optimization for Machine Learning, by Prateek Jain and Purushottam Kar.
SLS: Statistical Learning with Sparsity, by Trevor Hastie, Robert Tibshirani, and Martin Wainwright. Chapters 7 and 10.
Topics:
Applications: sparse recovery, low-rank matrix recovery (see the setup sketch after the reading list)
Applications: robust linear regression, phase retrieval
Reading:
SB Chapters 1 and 2
PP Chapter 1
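To make the sparse recovery application concrete, below is a minimal problem-setup sketch in Python/NumPy. The dimensions, sparsity level, and noise scale are hypothetical, chosen only for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    n, p, s = 100, 500, 10                      # samples, ambient dimension, sparsity (hypothetical)
    A = rng.normal(size=(n, p)) / np.sqrt(n)    # random Gaussian design
    x_star = np.zeros(p)
    support = rng.choice(p, size=s, replace=False)
    x_star[support] = rng.normal(size=s)        # s-sparse ground truth
    y = A @ x_star + 0.01 * rng.normal(size=n)  # noisy linear measurements
    # Sparse recovery: estimate x_star from (A, y) knowing it has at most s nonzeros.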
Topics:
Convex analysis
Convex projections
Projected gradient descent (see the sketch after the reading list)
Convergence guarantees for PGD
Reading:
PP Chapter 2
SB Chapter 3
L1 Projection: Duchi, John, et al. Efficient projections onto the L1-ball for learning in high dimensions, ICML 2008.
Multitask L1 Penalization: Fontaine, S., et al. A unified approach to sparse Tweedie modeling of multi-source insurance claim data, Technometrics (in revision). Supplemental Materials.
A more detailed proof of the convergence of projected gradient descent.
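Below is a minimal sketch of projected gradient descent with the sorting-based L1-ball projection from the Duchi et al. reading. The step size and iteration count are illustrative assumptions, not tuned values.

    import numpy as np

    def project_l1(v, radius=1.0):
        # Euclidean projection onto {x : ||x||_1 <= radius} (Duchi et al., ICML 2008).
        if np.abs(v).sum() <= radius:
            return v.copy()
        u = np.sort(np.abs(v))[::-1]            # magnitudes, sorted in decreasing order
        css = np.cumsum(u)
        rho = np.nonzero(u * np.arange(1, len(u) + 1) > css - radius)[0][-1]
        theta = (css[rho] - radius) / (rho + 1.0)
        return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

    def pgd(grad, project, x0, step=0.1, iters=200):
        # Projected gradient descent: a gradient step followed by projection
        # back onto the constraint set.
        x = np.asarray(x0, dtype=float)
        for _ in range(iters):
            x = project(x - step * grad(x))
        return x

For the least-squares objective f(x) = ||Ax - y||^2 / 2, pass grad = lambda x: A.T @ (A @ x - y) and project = project_l1.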
Topics:
Non-convex projections (see the sketch after the reading list)
Restricted strong convexity and smoothness
Generalized projected gradient descent
Reading:
PP Chapter 3
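Below is a minimal sketch of the canonical non-convex projection: the Euclidean projection onto the set of s-sparse vectors, which has a closed form (keep the s largest-magnitude entries) even though the set is non-convex. Plugging it into the PGD loop above gives a generalized projected gradient method.

    import numpy as np

    def project_sparse(v, s):
        # Euclidean projection onto {x : ||x||_0 <= s}: zero out all but the
        # s largest-magnitude entries (hard thresholding).
        v = np.asarray(v, dtype=float)
        out = np.zeros_like(v)
        idx = np.argpartition(np.abs(v), -s)[-s:]
        out[idx] = v[idx]
        return out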
Topics:
Problem formulation
Sparse regression: two perspectives
Sparse recovery via projected gradient descent
Restricted isometry and other design properties
Ensuring RIP and other properties
A sparse recovery guarantee for IHT (see the sketch after the reading list)
Reading:
PP Chapter 7
Nullspace Property: Cohen, Albert, et al. Compressed sensing and best k-term approximation, JAMS 2009.
Restricted Eigenvalue Property: Raskutti, Garvesh, et al. Restricted eigenvalue properties for correlated Gaussian designs, JMLR 2010.
Restricted Isometry Property: Emmanuel Candes and Terence Tao. Decoding by linear programming, IEEE Transactions on Information Theory 2005.
Restricted Strong Convexity/Smoothness Property: Jain et al. On iterative hard thresholding methods for high-dimensional M-estimation, NIPS 2014.
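Below is a minimal sketch of iterative hard thresholding (IHT) for the sparse linear model y = Ax + noise. The step size (1 over the squared spectral norm of A) and iteration count are simple assumptions rather than the choices analyzed in PP Chapter 7.

    import numpy as np

    def hard_threshold(v, s):
        # Keep the s largest-magnitude entries, zero out the rest.
        out = np.zeros_like(v)
        idx = np.argpartition(np.abs(v), -s)[-s:]
        out[idx] = v[idx]
        return out

    def iht(A, y, s, iters=300):
        # Gradient step on ||Ax - y||^2 / 2, then projection onto the s-sparse set.
        step = 1.0 / np.linalg.norm(A, 2) ** 2
        x = np.zeros(A.shape[1])
        for _ in range(iters):
            x = hard_threshold(x + step * A.T @ (y - A @ x), s)
        return x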
Topics:
Signals and Sparse Representations
Orthogonal Bases
Approximation in Orthogonal Bases
Reconstruction in Overcomplete Bases
Random Projection and Approximation
Johnson-Lindenstrauss Approximation (see the sketch after the reading list)
Compressed Sensing
Reading:
SLS Chapter 10
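Below is a minimal sketch of random projection in the Johnson-Lindenstrauss style: a scaled Gaussian matrix approximately preserves pairwise Euclidean distances. The dimensions are hypothetical.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, k = 50, 10000, 500                    # points, ambient dim, projected dim (hypothetical)
    X = rng.normal(size=(n, d))
    P = rng.normal(size=(d, k)) / np.sqrt(k)    # scaled so that E ||x P||^2 = ||x||^2
    Y = X @ P
    # Pairwise distances are approximately preserved after projection:
    print(np.linalg.norm(X[0] - X[1]), np.linalg.norm(Y[0] - Y[1]))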
Topics:
Marginal convexity and other properties
Generalized alternating minimization (see the sketch after the reading list)
A convergence guarantee for gAM for convex problems
A convergence guarantee for gAM under MSC/MSS
Reading:
Coordinate descent (coxnet): Simon, N., Friedman, J., Hastie, T., Tibshirani, R. Regularization paths for Cox's proportional hazards model via coordinate descent. JSS 2011.
Coordinate descent (glmnet): Friedman, J., Hastie, T., & Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. JSS 2010.
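Below is a minimal sketch of alternating minimization on a canonical instance, the low-rank factorization objective ||M - U V^T||_F^2 with small ridge penalties; the objective is marginally convex in each block, so each update is a closed-form least-squares solve. The rank, regularization, and iteration count are illustrative assumptions.

    import numpy as np

    def alt_min(M, r, iters=50, lam=1e-3):
        # Alternate exact minimization over the two blocks U and V.
        m, n = M.shape
        rng = np.random.default_rng(0)
        U = rng.normal(size=(m, r))
        V = rng.normal(size=(n, r))
        I = lam * np.eye(r)
        for _ in range(iters):
            V = np.linalg.solve(U.T @ U + I, U.T @ M).T    # best V for fixed U
            U = np.linalg.solve(V.T @ V + I, V.T @ M.T).T  # best U for fixed V
        return U, V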
Topics:
The EM algorithm (see the sketch after the reading list)
Implementing the E/M steps
A monotonicity guarantee for EM
Local strong concavity and local strong smoothness
A local convergence guarantee for EM
Reading:
EM Convergence: Balakrishnan, S., Wainwright, M. J., Yu, B. Statistical guarantees for the EM algorithm: From population to sample-based analysis. The Annals of Statistics, 2017.
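Below is a minimal sketch of the E and M steps for a two-component, unit-variance Gaussian mixture in one dimension; the crude initialization is an assumption for illustration only.

    import numpy as np

    def em_gmm_1d(x, iters=100):
        mu = np.array([x.min(), x.max()], dtype=float)  # crude initialization
        pi = np.array([0.5, 0.5])
        for _ in range(iters):
            # E-step: posterior responsibilities under the current parameters.
            dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2)
            resp = dens / dens.sum(axis=1, keepdims=True)
            # M-step: re-estimate mixing weights and means in closed form.
            nk = resp.sum(axis=0)
            pi = nk / len(x)
            mu = (resp * x[:, None]).sum(axis=0) / nk
        return pi, mu

Each iteration cannot decrease the observed-data log-likelihood, which is the monotonicity guarantee listed above.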
Topics:
Motivating applications
Saddles and why they proliferate
The strict saddle property
The noisy gradient descent algorithm (see the sketch after the reading list)
A local convergence guarantee for NGD
Reading:
Tensor Decompositions: Anandkumar, A., Ge, R., Hsu, D., Kakade, S. M., and Telgarsky, M. Tensor decompositions for learning latent variable models. JMLR 2014.
Stochastic Gradient: Ge, R., Huang, F., Jin, C., and Yuan, Y. Escaping from saddle points: online stochastic gradient for tensor decomposition. COLT 2015.
Online Alternating Minimization: Choromanska, A., Cowen, B., Kumaravel, S., Luss, R., Rigotti, M., Rish, I. and Bouneffouf, D. Beyond backprop: online alternating minimization with auxiliary variables.
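Below is a minimal sketch of noisy gradient descent: isotropic noise injected into every gradient step lets the iterates escape strict saddle points where plain gradient descent can stall (the analyzed variants, e.g. in the Ge et al. reading, add noise more selectively). The step size and noise scale are illustrative assumptions.

    import numpy as np

    def ngd(grad, x0, step=0.01, noise=0.1, iters=2000, seed=0):
        # Gradient descent with isotropic Gaussian noise added at every step.
        rng = np.random.default_rng(seed)
        x = np.asarray(x0, dtype=float)
        for _ in range(iters):
            x = x - step * (grad(x) + noise * rng.normal(size=x.shape))
        return x

    # f(x) = x1^2 - x2^2 + x2^4 has a strict saddle at the origin and minima
    # at (0, +/- 1/sqrt(2)); started exactly at the saddle, the noise pushes
    # the iterates off it.
    grad = lambda w: np.array([2 * w[0], -2 * w[1] + 4 * w[1] ** 3])
    print(ngd(grad, [0.0, 0.0]))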