CS6120: Natural Language Processing
Spring 2015 Syllabus
This schedule is subject to change. Check back as the class progresses.
Why NLP?
Language Models
- n-gram models, naive Bayes classifiers, probability, estimation (a small bigram sketch follows this section's readings)
- We also played the Shannon game, guessing the next letter from the previous n letters.
- Readings for Jan. 22: Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. In EMNLP, 2002.
- Victor Chahuneau, Kevin Gimpel, Bryan R. Routledge, Lily Scherlis, and Noah A. Smith. Word Salad: Relating Food Prices and Descriptions. In EMNLP, 2012.
- Reading for Jan. 29: C. E. Shannon. Prediction and Entropy of Printed English. The Bell System Technical Journal, January 1951.
- Background: Jurafsky & Martin, chapter 4
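- To make the n-gram and estimation topics above concrete, here is a minimal bigram language model in Python (an illustrative sketch, not course code), using unsmoothed maximum-likelihood estimates over a toy corpus:

      from collections import defaultdict, Counter

      def train_bigram(corpus):
          """Estimate P(w_i | w_{i-1}) by maximum likelihood (no smoothing)."""
          counts = defaultdict(Counter)
          for sentence in corpus:
              tokens = ["<s>"] + sentence.split() + ["</s>"]
              for prev, curr in zip(tokens, tokens[1:]):
                  counts[prev][curr] += 1
          return {prev: {w: n / sum(ctr.values()) for w, n in ctr.items()}
                  for prev, ctr in counts.items()}

      model = train_bigram(["the dog barks", "the cat meows"])
      print(model["the"])   # {'dog': 0.5, 'cat': 0.5}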
Regular Languages
- history of NLP research, the Chomsky hierarchy, regular expressions, (weighted) finite-state automata and transducers (a toy automaton sketch follows this section's readings)
- Background on NLP with unweighted finite state machines: Karttunen, Chanod, Grefenstette, and Schiller. Regular Expressions for Language Engineering. Journal of Natural Language Engineering, 1997. We discussed the main points and interesting examples from this paper on Jan. 23, so although it's long, you should be able to usefully skim it.
- Reading for Feb. 5: Kevin Knight and Jonathan Graehl. Machine Transliteration. Computational Linguistics, 24(4), 1998.
- Reading for Feb. 12: Okan Kolak, William Byrne, and Philip Resnik. A Generative Probabilistic OCR Model for NLP Applications. In HLT-NAACL, 2003.
- More background: Jurafsky & Martin, chapter 2
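- As a toy illustration of the finite-state machinery above, the following sketch (an assumed example, not from the course materials) simulates a deterministic automaton accepting strings over {a, b} that end in "ab", i.e. the machine a regular expression like (a|b)*ab compiles to:

      # Transition table of a DFA for the regular language (a|b)*ab.
      DELTA = {
          (0, "a"): 1, (0, "b"): 0,
          (1, "a"): 1, (1, "b"): 2,
          (2, "a"): 1, (2, "b"): 0,
      }
      ACCEPT = {2}

      def accepts(string):
          """Run the DFA from state 0 and report whether it halts in an accept state."""
          state = 0
          for ch in string:
              state = DELTA[(state, ch)]
          return state in ACCEPT

      print(accepts("aab"))   # True
      print(accepts("abba"))  # False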
Noisy Channel and Hidden Markov Models
- noisy channel models with finite state transducers; part-of-speech tagging; hidden Markov models as noisy channel models; Viterbi and Forward-Backward algorithms; parameter estimation with supervised maximum likelihood and expectation maximization (a Viterbi sketch follows this section's reading)
- Reading for Mar. 5: Ritter, Cherry & Dolan. Unsupervised Modeling of Twitter Conversations. In HLT-NAACL, 2010.
- Background: Jurafsky & Martin, chapter 5 and sections 6.1-6.5
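- A minimal Viterbi decoder, as a rough sketch of the algorithm named above (the HMM parameters are assumed to be supplied as log-probability dictionaries; this is an illustration, not course code):

      def viterbi(obs, states, log_start, log_trans, log_emit):
          """Most likely hidden state sequence under an HMM (log-space Viterbi)."""
          # best[t][s]: log-probability of the best path ending in state s at time t
          best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
          back = [{}]
          for t in range(1, len(obs)):
              best.append({})
              back.append({})
              for s in states:
                  prev = max(states, key=lambda p: best[t - 1][p] + log_trans[p][s])
                  best[t][s] = best[t - 1][prev] + log_trans[prev][s] + log_emit[s][obs[t]]
                  back[t][s] = prev
          # Recover the best path by following back-pointers from the best final state.
          last = max(states, key=lambda s: best[-1][s])
          path = [last]
          for t in range(len(obs) - 1, 0, -1):
              path.append(back[t][path[-1]])
          return list(reversed(path))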
Context-Free Grammars and Parsers
Log-Linear Models
- also known as logistic regression or maximum entropy (maxent) models; directly modeling the conditional probability of output given input, rather than the joint probability of input and output (and then using Bayes' rule); see the sketch after the background reading below
- Background: Jurafsky & Martin, sections 6.6-6.7; N. Smith, Appendix C
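- A small sketch of what "directly modeling the conditional probability" means for a maxent model: p(y | x) = exp(w . f(x, y)) / sum over y' of exp(w . f(x, y')). The feature function and weight dictionary below are hypothetical, chosen only to show the normalization over labels:

      import math

      def maxent_prob(x, y, labels, weights, features):
          """p(y | x) under a log-linear model: exp(score) normalized over all labels."""
          def score(label):
              # features(x, label) is assumed to yield feature names, e.g.
              # "word=great_AND_label=POS"; missing weights default to 0.
              return sum(weights.get(feat, 0.0) for feat in features(x, label))
          z = sum(math.exp(score(label)) for label in labels)
          return math.exp(score(y)) / z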
Models with Structured Outputs
- models that decide among combinatorially many outputs, e.g. sequences of tags or dependency links; locally normalized (action-based) models such as Maximum Entropy Markov Models (MEMMs); globally normalized models such as linear-chain Conditional Random Fields (CRFs); see the sketch after the background reading below
- Background: Jurafsky & Martin, section 6.8; N. Smith, sections 3.1-3.5
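- The "globally normalized" point can be illustrated with the partition function of a linear-chain CRF, computed by the forward algorithm. The log_score interface below is an assumption made for this sketch, and numerical-stability tricks (log-sum-exp) are omitted:

      import math

      def crf_log_partition(obs, states, log_score):
          """log Z(x) for a linear-chain CRF via the forward algorithm.
          log_score(prev, curr, t, obs) is the local log-potential for the
          transition prev -> curr at position t (prev is None at t = 0)."""
          alpha = {s: log_score(None, s, 0, obs) for s in states}
          for t in range(1, len(obs)):
              alpha = {s: math.log(sum(math.exp(alpha[p] + log_score(p, s, t, obs))
                                       for p in states))
                       for s in states}
          return math.log(sum(math.exp(a) for a in alpha.values()))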
Machine Translation
- word-based alignment models; phrase-based models; syntactic and tree-based models; learning from comparable corpora (an alignment sketch follows the background reading below)
- Background: Jurafsky & Martin,
chapter 25
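- A rough sketch of the word-based alignment idea: IBM Model 1 translation probabilities t(f | e) estimated with EM over a small parallel corpus (illustrative only, not the course's implementation):

      from collections import defaultdict

      def ibm_model1(bitext, iterations=10):
          """Estimate t(f | e) with EM; bitext is a list of (foreign, english) token lists."""
          f_vocab = {f for fs, _ in bitext for f in fs}
          t = defaultdict(lambda: 1.0 / len(f_vocab))   # uniform initialization
          for _ in range(iterations):
              count = defaultdict(float)                # expected counts c(f, e)
              total = defaultdict(float)                # expected counts c(e)
              for fs, es in bitext:
                  es = ["<NULL>"] + es                  # allow alignment to NULL
                  for f in fs:
                      z = sum(t[(f, e)] for e in es)    # E-step: alignment posteriors
                      for e in es:
                          count[(f, e)] += t[(f, e)] / z
                          total[e] += t[(f, e)] / z
              for (f, e), c in count.items():           # M-step: re-estimate t
                  t[(f, e)] = c / total[e]
          return t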
Semantics
- logical form: lambda expressions, event semantics, quantifiers, intensional semantics; computational semantics: semantic role labeling, combinatory categorial grammar (CCG), tree adjoining grammar (TAG); lexical semantics: vector space representations, greedy agglomerative clustering, k-means and EM clustering; learning hypernym/hyponym relations for nouns and verbs; compositional vector-space semantics (a vector-space sketch follows the background readings below)
- Background: Jurafsky & Martin, chapters 18-20; see also NLTK book, chapter 10
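- For the vector-space side of lexical semantics, a toy sketch (one simple count-based approach, assumed here for illustration rather than taken from the course tooling): build co-occurrence context vectors and compare words with cosine similarity:

      import math
      from collections import Counter

      def context_vector(word, sentences, window=2):
          """Co-occurrence counts of `word` with words within +/- `window` positions."""
          vec = Counter()
          for sentence in sentences:
              toks = sentence.split()
              for i, tok in enumerate(toks):
                  if tok != word:
                      continue
                  lo, hi = max(0, i - window), min(len(toks), i + window + 1)
                  for j in range(lo, hi):
                      if j != i:
                          vec[toks[j]] += 1
          return vec

      def cosine(u, v):
          """Cosine similarity between two sparse count vectors."""
          dot = sum(u[k] * v[k] for k in u if k in v)
          norm = math.sqrt(sum(x * x for x in u.values())) * \
                 math.sqrt(sum(x * x for x in v.values()))
          return dot / norm if norm else 0.0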
NLP and Linguistics
Short presentations on literature reviews