Natural Language Processing CS 6120

Khoury College of Computer Sciences, San Jose, CA - Karl's E-Mail

This course in Natural Language Processing (NLP) combines first principles with heavy doses of software engineering. The key objectives are twofold: (1) to teach fundamental concepts of NLP and (2) to provide extensive, practical, hands-on modeling experience. Our language modeling curriculum will cover a variety of use cases, including sentiment analysis, question answering, summarization, translation, and more. The core topics covered in this course include topic models, word and sentence embedding models, deep attention models, and large language models (LLMs).

This course is heavily project-oriented: students will build and serve applications that leverage modern NLP capabilities. All materials can be obtained from open sources (even LLMs, like Meta’s Llama 3). Because the field is progressing rapidly, a core part of the curriculum will be learning to read and understand papers on arXiv. Through a generous grant from Google, students will have access to cloud computing credits for Google Cloud Platform (GCP), which they can use for both training and inference.

By the end of this class, students will have first-hand knowledge of, and be able to replicate (if not contribute to), the modern NLP approaches that companies like OpenAI, Google, and Meta might invent. They will be able to read most of the technical papers that advance this technology. But perhaps most importantly, I hope that they will have a fun and rewarding experience processing natural language and understanding the machine learning that goes into it.

Course Reference: CRN 21066 (San Jose)

Announcements

No announcements so far...