DS4420 // Machine Learning 2

Course description

Machine learning 2

Grading

30%	Homeworks
5%	In class exercises
30%	Mid-term
35%	Final project
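
As a rough illustration of how these weights combine, here is a minimal Python sketch. The function name `course_grade`, the category keys, and the 0-100 score scale are assumptions made for this example; they are not an official grading script.

```python
# Category weights copied from the Grading list above; they sum to 1.0.
WEIGHTS = {
    "homeworks": 0.30,
    "in_class_exercises": 0.05,
    "midterm": 0.30,
    "final_project": 0.35,
}


def course_grade(scores):
    """Weighted average of per-category scores (assumed to be on a 0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, purely for illustration.
example = {
    "homeworks": 90,
    "in_class_exercises": 100,
    "midterm": 80,
    "final_project": 85,
}
print(round(course_grade(example), 2))
```

With these made-up scores the weighted total is 27 + 5 + 24 + 29.75 = 85.75.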

Prerequisites

I assume you have taken ML1 (DS4400) or an equivalent course. Working knowledge of Python is required (or you must be willing to pick it up rapidly as we go).

Homeworks

Homeworks will consist of both written and programming components. The latter will be completed in Python, using a mix of standard libraries (numpy, pytorch, etc.).

Late Policy. Homeworks that are one day late will be subject to a 20% penalty; two days incurs 50%. Homeworks more than two days late will not be accepted.
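
The late policy can be sketched in a few lines of Python. This sketch makes three assumptions that the policy text does not spell out: the penalty is taken as a fraction of the earned score, lateness is counted in whole days, and a submission that is not accepted scores zero. The name `late_adjusted_score` is hypothetical.

```python
# Fraction of the earned score that is kept, keyed by whole days late.
# Assumption: the penalty is multiplicative on the earned score.
KEPT_FRACTION = {0: 1.0, 1: 0.8, 2: 0.5}


def late_adjusted_score(raw_score, days_late):
    """Return the score after the late penalty (assumed interpretation)."""
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    # More than two days late: not accepted, treated here as a zero.
    return raw_score * KEPT_FRACTION.get(days_late, 0.0)


for days in range(4):
    print(days, late_adjusted_score(90, days))
```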

Mid-term

The mid-term will be given in class and will test your understanding of the fundamentals covered in the first half of the course.

Projects

A big component of this course will be your project, which will involve picking a particular dataset on which to implement, train, and evaluate machine learning models. Collaboration is allowed, but teams are limited to at most two people. The project will be broken down into several graded deliverables and will culminate in a report and a final in-class presentation to your peers.

Here is an outline of the project expectations, (tentative) dates, etc.

Academic integrity policy

A commitment to the principles of academic integrity is essential to the mission of Northeastern University. The promotion of independent and original scholarship ensures that students derive the most from their educational experience and their pursuit of knowledge. Academic dishonesty violates the most fundamental values of an intellectual community and undermines the achievements of the entire University. For more information, please refer to the Academic Integrity Web page.

More specific to this class: It is fine to consult online resources for programming assignments (of course), but lifting a solution/implementation in its entirety is completely inappropriate. Moreover, you must list all sources (websites/URLs) consulted for every homework; failing to do so will constitute a violation of academic integrity. In general, you must also be able to explain whatever code you use.

Schedule outline

Meeting	Topic(s)	Readings	Things due	Lecture notes/etc.
1/7 (t)	Logistics, overview			Slides; Notebook
1/9 (r)	Math Review	Math for ML, Part 1: 5-5.5, 6-6.5		Slides; probability review NB; autodiff in Torch NB; A follow-up note to our exercise on gradients
1/14 (t)	MLE, MAP, and graphical models	Math for ML, Part 2: 8.3, 8.4, 8.5		Slides; Notebook on conjugate priors
1/16 (r)	Neural networks / backprop	A Course in Machine Learning, Ch. 10		Slides; Notes on NNs/backprop; Notebook on backprop
1/21 (t)	Clustering I	Elements of Statistical Learning, 14--14.6; (optional) CIML 11.3		Slides; Notes on clustering; Notebook on k-means; In class exercise/NB clustering BERT vectors of Trump tweets
1/23 (r)	Clustering II → Mixture models and EM	Elements of Statistical Learning, 14.6--14.9; MML, Part 2: 11	HW1 DUE	Slides; Notes on Mixture Models; In-class exercise on EM+NB; Worked
1/28 (t)	Topic modeling I	Applications of Topic Models (Boyd-Graber, Hu, Mimno)		Slides; Notes on PLSA; Notebook/in-class exercise on PLSA (worked)
1/30 (r)	Topic modeling II	Applications of Topic Models (Boyd-Graber, Hu, Mimno)		Slides; Notes; Notebook (sampling + LDA)
2/4 (t)	Dimensionality reduction I	Math for ML, Part 2: 10		Slides; Notes; Notebook (PCA/PPCA)
2/6 (r)	Dimensionality reduction II	t-SNE paper	HW2 DUE	Slides; Notes; Notebook
2/11 (t)	Auto-encoders/"Self-supervision"; Learning to embed			Slides; Notes; Notebook
2/13 (r)	Structured prediction I	A Course in Machine Learning, Ch. 17		Slides; Notes
2/18 (t)	Structured prediction II	A Course in Machine Learning, Ch. 17		Slides; Notes; Notebook
2/20 (r)	No class		HW3 DUE tomorrow 2/21
2/25 (t)	Review			Review slides
2/27 (r)	Midterm exam
	Spring break!
3/10 (t)	Transformers	The Illustrated Transformer		Notebook on (stripped down) transformers; Slides; Refresher: RNNs
3/12 (r)	Fairness and bias	A Course in Machine Learning, Ch. 8	Project proposal due FRIDAY (3/13)	Slides; Notebook on accidentally making a racist model (by Robyn Speer)
3/17 (t)	Project pitches and feedback		In class project pitches!
3/19 (r)	"Green" AI	Green AI		Slides; Notebook/exercise on model distillation
3/24 (t)	Active learning		~~HW4 DUE~~	Slides; Notebook/exercise on active learning
3/26 (r)	Interpretability (Guest Lecturer: PhD student Sarthak Jain)		HW DUE	Slides; Notebook
3/31 (t)	By popular demand: Sequence-2-sequence models			Slides; seq2seq notebook
4/2 (r)	Project Q's / help
4/7 (t)	Final project presentations I		Presentations!
4/9 (r)	Final project presentations II		Presentations!
4/14 (t)	No class (final write-ups due)		FINAL PROJECT WRITE-UPS DUE!

HTML/CSS/JS used (and modified), with permission, courtesy of Prof. Alan Mislove