DS4440 // practical neural networks // fall 2020
Course details
Byron Wallace
Office: 177 Huntington (but really online, because... 2020)
Office hours: On Zoom, Wednesdays at 2:00-3:00 PM
David Lowell (TA)
Office hours: On Zoom, Tuesdays at 2:00-3:00 PM


Meetings: Ryder Hall 154 & the internet (2020 ...) / Mondays and Thursdays (MR), 11:45 am - 1:25 pm
Here is a link to the course Piazza site.
Books & Resources

Dive into Deep Learning
This is the main book for the class. It is online and free.

Course description

This course is a hands-on introduction to modern neural network ("deep learning") tools and methods. The course will cover the fundamentals of neural networks and introduce both standard and new architectures, from simple feed-forward networks to recurrent neural networks. We will cover stochastic gradient descent and backpropagation, along with related fitting techniques.

The course will have a particular emphasis on using these technologies in practice, via modern toolkits. We will specifically be working with PyTorch, which provides a flexible framework for working with computation graphs. While PyTorch will be our toolkit of choice, the concepts of automatic differentiation and neural networks are not tied to this particular package, and a key objective in this class is to provide sufficient familiarity with the methods and programming paradigms that switching to new frameworks is no great obstacle. (This is particularly important given the rapid pace of development in the deep learning toolkit space.)
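To make the point about framework independence concrete, here is a minimal sketch of reverse-mode automatic differentiation on a scalar computation graph, written in plain Python with no PyTorch at all. It is an illustration only and not how PyTorch is implemented; the names `Value`, `backward`, and `fit_line` are hypothetical and chosen just for this example. The sketch supports only addition, subtraction, and multiplication, and uses the resulting gradients to fit a line with per-example stochastic gradient descent.

```python
# Illustrative sketch only: a tiny scalar autodiff "engine" in plain Python.
# Value, backward, and fit_line are hypothetical names, not a real library API.

class Value:
    """A scalar node in a computation graph that remembers how it was made."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents      # nodes this value was computed from
        self._grad_fns = grad_fns    # maps upstream grad to each parent's grad

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def __sub__(self, other):
        return self + (other * -1.0)

    def backward(self):
        """Accumulate d(self)/d(node) into .grad for every node in the graph."""
        order, seen = [], set()

        def visit(node):
            # Post-order walk: parents are appended before the nodes using them.
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Reverse order: each node is processed after everything that uses it.
        for node in reversed(order):
            for parent, fn in zip(node._parents, node._grad_fns):
                parent.grad += fn(node.grad)


def fit_line(xs, ys, lr=0.05, epochs=200):
    """Fit y = w * x + b with per-example SGD on the squared error."""
    w, b = Value(0.0), Value(0.0)
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            w.grad = b.grad = 0.0        # clear gradients from the last step
            err = w * x + b - y          # builds a fresh graph each step
            loss = err * err
            loss.backward()              # backpropagation through the graph
            w.data -= lr * w.grad        # gradient descent update
            b.data -= lr * b.grad
    return w.data, b.data


if __name__ == "__main__":
    # Noise-free toy data drawn from y = 2x + 1.
    print(fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]))
```

On the toy data in the usage lines, the fitted slope and intercept should land close to 2 and 1. PyTorch applies the same pattern (build a graph in the forward pass, call `backward()`, then update the parameters) to tensors rather than scalars.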

We will introduce now-standard neural network architectures for data of various types, including images and text. This iteration will have a bit of a bias toward the latter, reflecting instructor biases.

Grading

5% In-class exercises
15% Midterm topic "survey"
40% Final project

Prior exposure to machine learning is recommended. Working knowledge of Python is required (or you must be willing to pick it up rapidly as we go). Familiarity with linear algebra, (basic) calculus, and probability will be largely assumed throughout, although we will also review some of these prerequisites.


Homeworks will consist of both written and programming components. The latter will be completed in Python, often using PyTorch.

Late Policy. Homeworks that are one day late will be subject to a 20% penalty; homeworks that are two days late will be subject to a 50% penalty. Homeworks more than two days late will not be accepted.

Mid-term survey on a topic of your interest

Typically this class includes an in-class midterm. Given the pandemic and the remote nature of this offering, we're going to try something different this year. You are asked to survey the literature/methods on a particular "topic" (or "task") of interest. This will culminate in a brief write-up and an in-class presentation explaining the task, dataset, and methods. Ideally this will motivate your final project, although this is not required. Details on this assignment are available here: Midterm survey details. Please do not hesitate to post to Piazza or reach out directly with questions (the former is encouraged so others may benefit).


A big component of this course will be your project. This will be completed individually. The project might entail building on top of what you learned in your topic survey -- e.g., perhaps you surveyed the area of automatic translation using neural models; your project might then be to re-evaluate a state-of-the-art approach, or to reproduce the results reported in recent papers, etc. This project will be broken down into several graded deliverables, and culminate in a project report and final presentation in class to your peers. Here are additional project details.

Academic integrity policy

A commitment to the principles of academic integrity is essential to the mission of Northeastern University. The promotion of independent and original scholarship ensures that students derive the most from their educational experience and their pursuit of knowledge. Academic dishonesty violates the most fundamental values of an intellectual community and undermines the achievements of the entire University. For more information, please refer to the Academic Integrity Web page.

More specific to this class: It is fine to consult online resources for programming assignments (of course), but lifting a solution/implementation in its entirety is completely inappropriate. Moreover, you must list all sources (websites/URLs) consulted for every homework; failing to do so will constitute a violation of academic integrity.

Schedule outline

Meeting | Topic(s) | readings | things due | lecture notes/etc
9/10 | Course aims, expectations, logistics; Review of supervised learning / Perceptron / intro to colab | d2l: Introduction | join the Piazza site! | Intro/logistics slides; Notes; Perceptron notebook
9/14 | Preliminaries, Logistic Regression and Optimization via SGD | d2l: Preliminaries | | Notes; In-class gradient exercise starter; Notebook on Linear Regression via Gradient Descent
9/17 | Beyond Linear Models: The Multi-Layer Perceptron | d2l: MLPs (4.1) | HW 1 Due! | Notes on MLPs; Notebook on (non-linear) MLPs; Notes on metrics; Notebook on metrics
9/21 | Abstractions: Layers and Computation Graphs | d2l: Layers and blocks | | Notes; Notebook on computation graphs; In class exercise starter (see notes)
9/24 | Backpropagation I | d2l: Autodiff; Backprop (Colah's blog) | | Notes; Notebook; In class exercise on backprop (see notes)
9/28 | Backpropagation II | d2l: Backprop | | Notes; Notebook: Wacky custom layer exercise
10/1 | Optimizer matters: Training NNs in Practice | d2l: Optimization | HW 2 Due! | Notes; Notebook; (Overly audacious) in-class exercise on (custom) optimizers in torch
10/5 | Learning continuous representations of discrete things: Embeddings | d2l: Word embeddings (14.1) | | Notes; Notebook: CBoW w2v
10/8 | Convolutional Neural Networks (CNNs) I | d2l: CNNs (6.1) | | Notes; Notebook: A simple example in torch
10/12 | No class (holiday) | | |
10/15 | Convolutional Neural Networks (CNNs) II | d2l: CNNs (6.2 -- 6.5); Modern CNNs (7.1, 7.5 -- 7.7) | | Notes; Notebook: ConvNets in action!
10/19 | Recurrent Neural Networks (RNNs) I | RNNs (8.1 and 8.4) / The Unreasonable Effectiveness of RNNs | HW 3 Due! | Notes; Notebook: RNNs (intro)
10/22 | Recurrent Neural Networks (RNNs) II | d2l: RNNs (8.7) / The Unreasonable Effectiveness of RNNs | | Notes; Notebook: More fun with RNNs; Notebook: Character RNN to generate Shakespeare
10/26 | Transformer Networks (+ Self-Supervision and Contextualized Word Embeddings) | d2l: Transformers | | Notes; Notebook: Self-Attention/Transformersish (1)
10/29 | More Transformers --> BERT; + Neural Sequence Tagging | | | Notes; Notebook: Training a BERTish model; Official PyTorch tutorial on BiLSTM-CRFs
11/2 | Midterm topic survey presentations | | Topic survey write-ups due! |
11/5 | Sequence-to-Sequence Models 1 | | | Notes; Notebook: Seq2Seq for learning to "add"
11/9 | Sequence-to-Sequence Models 2 | d2l: Encoder-Decoder (seq2seq) | HW 4 Due! | Notes; Attention in Seq2Seq models (start)
11/12 | Summarization Models (guest: PhD student Jered McInerney) | | | Slides (from Jered); Code
11/16 | Auto-Encoders | Auto-encoders; Intuitive VAEs | Project proposals due! | Notes; Notebook: t-SNE in sklearn; Notebook: Autoencoders
11/19 | Ethics and Bias 1 | Fairness in ML | | Notes; Slides
11/23 | Ethics and Bias 2 | Fairness in machine learning: against false positive rate equality as a measure of fairness | |
11/26 | No class (Thanksgiving) | | |
11/30 | Interpretability (Guest: PhD student Sarthak Jain) | | HW 5 Due! | Slides (from Sarthak)
12/3 | Active Learning and Augmentation (Guest: PhD student and TA David Lowell) | | | Slides
12/7 | Final project presentations/discussion | | |
Notes: BONUS HW due 12/16; Final project deliverables due 12/11. (Both on Canvas.)

HTML/CSS/JS used (and modified), with permission, courtesy of Prof. Alan Mislove