CS 7670; Fall, 2024; Instructor: Gene Cooperman (gene@ccs.neu.edu)
-
Course grade:
In keeping with a PhD-level seminar course, provided the student makes
a good-faith effort at full participation, the default grade should
be an A. The workload is intentionally modest,
so that students can continue to work on the research for their thesis,
or else do investigations that could lead to a thesis topic. The course
is intended to be of interest to a broad variety of areas in
computer science, and not solely the area of computer systems.
While the emphasis in 2024 will be on Code LLMs, the course will
accommodate a broad variety of interests in computer systems, including
areas beyond LLMs.
Structure of the Course
The course will be divided approximately into three parts. However,
there will not be a sharp dividing line among those three parts.
Rather, the right way to look at this is as three "tracks": at
any given time, we may be emphasizing one track, while providing
a summary of the previous track or an introduction to the next track.
- First three weeks of the semester, or so:
Introductory lectures for background in "Code LLMs", given by the
instructor, with class discussion.
This corresponds to an exploratory literature survey that one
encounters in any effort to study a new research area.
In parallel, students will begin
to either state their own research topic preference or
choose the more guided topic preference of the
instructor. The "topic preference of the instructor" for this
semester is based on
the idea of using Model Checking (for example, McMini) to generate
an execution trace showing a bug in a multithreaded program.
The goal is to ask a Code LLM to explain a bug exhibited
by an execution trace. For an example in the literature (using
stack traces instead of execution traces), see the "ChatGDB"
paper described on the course web page. The domain of that paper is
low-hanging fruit: debugging simple, short Python and Jupyter
notebook programs based on homework for an introductory Computer
Programming course. Nevertheless, there are important innovations
to be learned from that paper.
Instead of a pure paper-reading exercise,
students may choose a more "hands-on" research
study, based on creating novel software. They will still be
required to make formal presentations, but based on the software
structure, its goals, and its novel aspects. They will also be
required to present some of the background readings on competing
software approaches, so as to place their own efforts in context.
This is intended to accommodate students who may be in the phase
of developing software as part of their thesis preparation or
search for a thesis topic.
- Next two months, or so:
Student Paper Readings and Presentations.
A suggested set of readings will be provided at approximately
the fourth class (a Tuesday). The suggested readings will be
much broader than Code LLMs, or LLMs in general, so that students
may consider other interests in Computer Systems.
- Remainder of the course:
Student Writing of a Technical Paper.
There is a "grammar" for technical writing, just as there is
a grammar for a programming language. A programming effort
divides into programming "in the large" (e.g.,
algorithm and flow of code) and "in the small" (e.g.,
concise, computationally efficient functions). Similarly, the
"grammar of technical writing" divides into two parts: writing
"in the large"; and writing "in the small".
In order to learn good technical writing for research,
the goal must involve the struggle to creatively explore new
ideas. Otherwise, the technical report becomes a dry summary
closely reflecting the source material.
If the student has a current research project with some
component related to computer systems, they can use that as the
creative goal. Otherwise, the instructor will provide a range of
mini-topics, with a choice of three papers for each mini-topic.
The writing goal will be to survey the mini-topic based on the
three papers, while creatively doing "compare and contrast" to
state which innovations are common to all three papers,
and where one or two of the papers provide a unique approach.
If a survey paper is chosen, then the
student must assess the goals and likely impact of
a subject, and its likely future successes (or whether it is likely
to be a dead end). This will be the student's own opinion.
The goal is to express your opinion in good technical writing.
But this is an "opinion" only. You will be assessed on expressing
your own opinion well --- not on whether your opinion is better
or worse than my own opinion. :-)
Office Hours:
Office hours are after class on Tuesday and Friday and by appointment.
My office is in 336 WVH. If you can't make the office hours,
then ask me after class about a good time, or else drop into my
office (336 WVH) or lab (370 WVH) at any time.
I will often
be in my office or lab in the afternoon, but if we have trouble
connecting, then let's Zoom. In this modern age, it's easy to send
me a calendar invite, and I can then accept, or negotiate a different
time if needed. In a seminar course, research proceeds by leaps,
and it's important for me to be accessible when you've digested
the current ideas, and you wish to brainstorm on next steps.