Instructor: Yi Yang
Email: yi.yang6@mcgill.ca
Lecture Time: On Zoom, Tuesday, Thursday 3:35 pm – 4:55 pm
Lecture Zoom Link: on Mycourses
Office Hour: Thursday 5:00 pm – 6:00 pm
Prerequisites: MATH 556, MATH 557 or permission of instructor
www.math.mcgill.ca/yyang/comp.html
Discussion board: join here to start discussions with other students.
References (optional; there is no required textbook):
ESL: The Elements of Statistical Learning (2nd Ed) by T. Hastie, R. Tibshirani and J. Friedman
PR: Pattern Recognition and Machine Learning by Christopher M. Bishop
CVX: Convex Optimization by Boyd and Vandenberghe
Numerical Analysis for Statisticians (2nd Ed) by K. Lange
SLS: Statistical Learning with Sparsity: The Lasso and Generalizations by Trevor Hastie, Robert Tibshirani, Martin Wainwright
PA: Proximal Algorithms by Neal Parikh and Stephen Boyd
Grading: Homeworks 50% + Paper review 20% + Course project 25% + Scribe notes 5% = 100%
Homeworks (50%): Four homeworks (plus a warmup that does not count toward your grade). Each must be submitted through Mycourses. You must submit a typed assignment (e.g., LaTeX or Microsoft Word; here's a template); scanned handwritten assignments will not be accepted. You may complete assignments in groups of at most two, but you must change teammates for each assignment so that you work with the same person only once.
Paper review (20%): you will write a 2-4 page review of a paper. The goal is to learn to read technically demanding papers critically and, in the process, hopefully generate novel research ideas. Your review should not only summarize the paper's main result but also critique it, instantiate it on examples, discuss its overall significance, and suggest possible future directions. See this Google Doc for detailed guidelines and a list of papers. Papers are selected by the students and require approval from the instructor. Paper reviews may be done in pairs; such reviews will be evaluated with a slightly higher bar, should ideally cover two closely related papers, and are allowed two additional pages. An appendix or references beyond the page limit are allowed, but you will not be graded on them.
Course project (25%): there will be a course project with two milestones, a final report, and a class conference. Teams of 2-3 students can be formed; the report must specify each member's individual contribution. Projects are selected by the students and require approval from the instructor. Each team will present its project in class and must submit a written technical report. The page limit for the project report is 10 pages, not including references or appendix (here's a template). More information will be given during the first lecture.
Scribe notes (5%): you will be asked to scribe notes in LaTeX twice during the semester, each worth 2.5% of your grade. See this Google Doc for detailed guidelines. Scribe notes are due 2 days after the lecture. Please sign up here before Sept 29th and plan your time accordingly.
Exam: none
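As a quick sanity check, the grading scheme above is just a weighted sum whose weights total 100%. The sketch below illustrates the computation; the component names and the `final_grade` helper are illustrative, not part of the course materials:

```python
# Grading weights from the syllabus (percent of final grade).
weights = {
    "Homeworks": 50,
    "Paper review": 20,
    "Course project": 25,
    "Scribe notes": 5,
}

def final_grade(scores):
    """Weighted final grade; `scores` maps each component to a 0-100 score."""
    return sum(weights[c] * scores[c] for c in weights) / 100

# The weights sum to 100, so full marks in every component give 100.0.
assert sum(weights.values()) == 100
print(final_grade({c: 100 for c in weights}))  # 100.0
```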
Collaboration on homework assignments with fellow students is encouraged. However, such collaboration should be clearly acknowledged by listing the names of the students with whom you discussed your solution.
No restriction.
Topics | Slides | Notes | R codes | Readings | Papers |
Introduction | Slides | matrix basics | - | SLS 1-2 | Enet Adlasso Group lasso |
Convex Optimization Basics | Slides | lipschitz strong cvx | - | CVX 1-3 | - |
Graphical Models | - | Notes | - | - | Glasso CLIME |
Matrix Decomposition | - | Notes | code | - | - |
Gradient Descent | Slides | - | - | CVX 4.1-4.2 CVX 9.1-9.4 | - |
Subgradient | Slides | - | - | CVX 9.1-9.4 | - |
Proximal Gradient Descent | Slides | - | - | PA 1-3, 4.2 | FISTA Proximal |
Structural Group Penalization | Chapter 4 of SLS | - | - | ESL 3, 18 SLS 4 | ISTA GGM UTMOST SGL Multitask |
Stochastic Gradient Descent | Slides | - | - | - | SAG SAGA SVRG Proximal SVRG |
Model Selection | Chapter 7 of ESL | note1 note2 df | - | ESL 7 | - |
Variational Inference | Chapter 10 of PR | KL | - | PR 10 | - |