DS4420 // machine learning 2 // spring 2020
Course details
Byron Wallace
Office: 2208, 177 Huntington
Office hours: by appointment (just email!)
Sarthak Jain
jain.sar@husky.neu.edu
Office hours: Mondays 11am -- 12:30pm and Fridays 4pm -- 5:30pm, room 2204, 177 Huntington Avenue.

Lecture Time/Location
Tuesdays 11:45a -- 1:25p and Thursdays 2:50p -- 4:30p / Forsyth Building 128 KA 202

Here is a link to the course Piazza site.
Course description

Machine learning 2

5% In-class exercises
35% Final project

I assume you have taken ML1 (DS4400) or equivalent. Working knowledge of Python is required (or you must be willing to pick it up rapidly as we go).


Homeworks will consist of both written and programming components. The latter will be completed in Python, using a mix of standard libraries (numpy, pytorch, etc.).

Late Policy. Homeworks that are one day late will be subject to a 20% penalty; homeworks that are two days late will be subject to a 50% penalty. Homeworks more than two days late will not be accepted.


The midterm will be given in class and will test your understanding of the core material covered in the first half of the course.


A big component of this course will be your project, which will involve picking a particular dataset on which to implement, train, and evaluate machine learning models. Collaboration is allowed, but teams are limited to at most two people. The project will be broken down into several graded deliverables and will culminate in a report and a final in-class presentation to your peers.

Here is an outline of the project expectations, (tentative) dates, etc.

Academic integrity policy

A commitment to the principles of academic integrity is essential to the mission of Northeastern University. The promotion of independent and original scholarship ensures that students derive the most from their educational experience and their pursuit of knowledge. Academic dishonesty violates the most fundamental values of an intellectual community and undermines the achievements of the entire University. For more information, please refer to the Academic Integrity Web page.

More specific to this class: It is fine to consult online resources for programming assignments (of course), but lifting a solution/implementation in its entirety is completely inappropriate. Moreover, you must list all sources (websites/URLs) consulted for every homework; failing to do so will constitute a violation of academic integrity. In general, you must also be able to explain whatever code you use.

Schedule outline

Meeting | Topic(s) | Readings | Things due | Lecture notes/etc
1/7 (t) | Logistics, overview | | | Slides; Notebook
1/9 (r) | Math Review | Math for ML, Part 1: 5-5.5, 6-6.5 | | Slides; probability review NB; autodiff in Torch NB; A follow-up note to our exercise on gradients
1/14 (t) | MLE, MAP, and graphical models | Math for ML, Part 2: 8.3, 8.4, 8.5 | | Slides; Notebook on conjugate priors
1/16 (r) | Neural networks / backprop | A Course in Machine Learning, Ch. 10 | | Slides; Notes on NNs/backprop; Notebook on backprop
1/21 (t) | Clustering I | Elements of Statistical Learning, 14--14.6; (optional) CIML 11.3 | | Slides; Notes on clustering; Notebook on k-means; In class exercise/NB clustering BERT vectors of Trump tweets
1/23 (r) | Clustering II → Mixture models and EM | Elements of Statistical Learning, 14.6--14.9; MML, Part 2: 11 | HW1 DUE | Slides; Notes on Mixture Models; In-class exercise on EM+NB; Worked
1/28 (t) | Topic modeling I | Applications of Topic Models (Boyd-Graber, Hu, Mimno) | | Slides; Notes on PLSA; Notebook/in-class exercise on PLSA (worked)
1/30 (r) | Topic modeling II | Applications of Topic Models (Boyd-Graber, Hu, Mimno) | | Slides; Notes; Notebook (sampling + LDA)
2/4 (t) | Dimensionality reduction I | Math for ML, Part 2: 10 | | Slides; Notes; Notebook (PCA/PPCA)
2/6 (r) | Dimensionality reduction II | t-SNE paper | HW2 DUE | Slides; Notes; Notebook
2/11 (t) | Auto-encoders/"Self-supervision"; Learning to embed | | | Slides; Notes; Notebook
2/13 (r) | Structured prediction I | A Course in Machine Learning, Ch. 17 | | Slides; Notes
2/18 (t) | Structured prediction II | A Course in Machine Learning, Ch. 17 | | Slides; Notes; Notebook
2/20 (r) | No class | | HW3 DUE tomorrow (2/21) |
2/25 (t) | Review | | | Review slides
2/27 (r) | Midterm exam | | |
Spring break!
3/10 (t) | Transformers | The Illustrated Transformer | | Notebook on (stripped down) transformers; Slides; Refresher: RNNs
3/12 (r) | Fairness and bias | A Course in Machine Learning, Ch. 8 | Project proposal due FRIDAY (3/13) | Slides; Notebook on accidentally making a racist model (by Robyn Speer)
3/17 (t) | Project pitches and feedback | | In class project pitches! |
3/19 (r) | "Green" AI | Green AI | | Slides; Notebook/exercise on model distillation
3/24 (t) | Active learning | | HW4 DUE | Slides; Notebook/exercise on active learning
3/26 (r) | Interpretability (Guest Lecturer: PhD student Sarthak Jain) | | HW DUE | Slides; Notebook
3/31 (t) | By popular demand: Sequence-2-sequence models | | | Slides; seq2seq notebook
4/2 (r) | Project Q's / help | | |
4/7 (t) | Final project presentations I | | Presentations! |
4/9 (r) | Final project presentations II | | Presentations! |
4/14 (t) | No class (final write-ups due) | | FINAL PROJECT WRITE-UPS DUE! |

HTML/CSS/JS used (and modified), with permission, courtesy of Prof. Alan Mislove