CS6120: Natural Language Processing
Spring 2013 Syllabus
This schedule is subject to change. Check back as the class progresses.
Why NLP?
Language Models
Regular Languages
- history of NLP research, the Chomsky hierarchy, regular expressions, (weighted) finite-state automata and transducers
- Reading for Jan. 24: Karttunen, Chanod, Grefenstette, and Schiller. Regular expressions for language engineering. Journal of Natural Language Engineering, 1997.
- Background: Jurafsky & Martin, chapter 2
- Reading for Jan. 31: Okan Kolak, William Byrne, and Philip Resnik. A Generative Probabilistic OCR Model for NLP Applications. In HLT-NAACL, 2003.
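To make the "weighted finite-state automata" topic concrete, here is a minimal sketch of a weighted acceptor in Python. The states, symbols, and weights below are invented for illustration; real toolkits such as OpenFst support general semirings and transducers, not just this probability-weighted acceptor.

```python
# A minimal weighted finite-state acceptor, sketched as plain dicts.
# transitions maps (state, symbol) -> list of (next_state, weight).

def accept_weight(transitions, start, finals, string):
    """Total probability-semiring weight of all paths accepting `string`."""
    # current: state -> accumulated weight of all paths reaching it
    current = {start: 1.0}
    for symbol in string:
        nxt = {}
        for state, w in current.items():
            for (to, tw) in transitions.get((state, symbol), []):
                nxt[to] = nxt.get(to, 0.0) + w * tw
        current = nxt
    return sum(w for s, w in current.items() if s in finals)

# Toy acceptor over {a, b} that accepts strings ending in 'b',
# with a preference (0.9 vs 0.1) for staying in the final state.
transitions = {
    (0, 'a'): [(0, 0.5)],
    (0, 'b'): [(1, 0.5)],
    (1, 'a'): [(0, 0.1)],
    (1, 'b'): [(1, 0.9)],
}
print(accept_weight(transitions, start=0, finals={1}, string="ab"))  # 0.25
```

The dynamic program here is the same sum-over-paths computation that reappears later in the course as the forward algorithm for HMMs.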
Noisy Channel and Hidden Markov Models
- noisy channel models with finite-state transducers;
part-of-speech tagging; hidden Markov models as noisy channel
models; Viterbi and Forward-Backward algorithms; parameter
estimation with supervised maximum likelihood and expectation
maximization
- Background: Jurafsky & Martin, chapter 5
- Reading for Feb. 14: Bikel, Schwartz, and Weischedel. An Algorithm that Learns What's in a Name. Machine Learning, 34(1–3), 1999.
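The Viterbi algorithm named in this unit can be sketched in a few lines of Python. The two-tag model below (D for determiner, N for noun, and all probabilities) is invented for illustration; a real tagger would estimate these parameters from data as discussed above.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely state sequence for `obs` under an HMM (log-space)."""
    # V[t][s]: best log-probability of any path ending in state s at time t
    V = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            best_prev, best_lp = max(
                ((p, V[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda x: x[1])
            V[t][s] = best_lp + math.log(emit_p[s][obs[t]])
            back[t][s] = best_prev
    # trace the best path backwards from the best final state
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))

# Toy two-tag POS model; all numbers invented for illustration.
states = ('D', 'N')
start_p = {'D': 0.9, 'N': 0.1}
trans_p = {'D': {'D': 0.1, 'N': 0.9}, 'N': {'D': 0.5, 'N': 0.5}}
emit_p = {'D': {'the': 0.9, 'dog': 0.1}, 'N': {'the': 0.1, 'dog': 0.9}}
print(viterbi(('the', 'dog'), states, start_p, trans_p, emit_p))  # ['D', 'N']
```

Replacing the `max` with a log-sum over predecessors turns this same recurrence into the forward pass of the Forward-Backward algorithm.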
Context-Free Grammars and Parsers
Log-Linear Models
- also known as logistic regression or maximum entropy (maxent)
models; directly modeling the conditional probability
of the output given the input, rather than the joint probability of
input and output (and then applying Bayes' rule)
- Background: Jurafsky & Martin, chapter 6
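As a concrete instance of direct conditional modeling, here is a sketch of binary logistic regression trained by gradient ascent on the conditional log-likelihood. The feature vectors and labels are invented toy data; the hyperparameters (`epochs`, `lr`) are arbitrary choices for illustration.

```python
import math

def train_logreg(data, dims, epochs=200, lr=0.5):
    """Binary logistic regression: p(y=1 | x) = sigmoid(w . x),
    fit by gradient ascent on the conditional log-likelihood."""
    w = [0.0] * dims
    for _ in range(epochs):
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))
            # gradient of log p(y | x) with respect to w is (y - p) * x
            for i in range(dims):
                w[i] += lr * (y - p) * x[i]
    return w

def predict(w, x):
    return 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))

# Toy data: [bias, feature] vectors with made-up binary labels.
data = [([1.0, 0.0], 0), ([1.0, 1.0], 1), ([1.0, 2.0], 1), ([1.0, -1.0], 0)]
w = train_logreg(data, dims=2)
```

Note that nothing here models p(x): the weights are trained only to discriminate outputs given inputs, which is exactly the contrast with the generative noisy-channel models earlier in the course.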
Log-Linear Models with Structured Outputs
- models that decide among combinatorially many outputs,
e.g. sequences of tags or dependency links; locally normalized
(action-based) models such as Maximum Entropy Markov Models
(MEMMs); globally normalized models such as linear-chain
Conditional Random Fields (CRFs)
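The key cost of global normalization in a linear-chain CRF is computing the partition function over all tag sequences, which the forward algorithm does in polynomial time. Below is a minimal sketch; the score-table format (`scores[t][(prev, cur)]` as an unnormalized log score) is an assumption made for this example, not a standard API.

```python
import math

def crf_log_partition(scores):
    """log Z for a linear-chain CRF via the forward algorithm.

    scores[t][(prev, cur)] is the unnormalized log score of tag `cur`
    at position t following tag `prev`; position 0 uses (None, cur).
    """
    tags = {cur for (_, cur) in scores[0]}
    # alpha[cur]: log-sum of scores of all prefixes ending in tag `cur`
    alpha = {cur: scores[0][(None, cur)] for cur in tags}
    for t in range(1, len(scores)):
        alpha = {
            cur: math.log(sum(math.exp(alpha[prev] + scores[t][(prev, cur)])
                              for prev in tags))
            for cur in tags}
    return math.log(sum(math.exp(a) for a in alpha.values()))

# Sanity check: with uniform zero scores, two tags, and two positions,
# all four tag sequences score equally, so log Z = log(4).
scores = [
    {(None, 'A'): 0.0, (None, 'B'): 0.0},
    {('A', 'A'): 0.0, ('A', 'B'): 0.0, ('B', 'A'): 0.0, ('B', 'B'): 0.0},
]
print(crf_log_partition(scores))  # log(4)
```

An MEMM avoids this global sum by normalizing each transition locally, which is cheaper but introduces the label-bias issues that motivate CRFs.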
Semantics
- logical form: lambda expressions, event semantics,
quantifiers, intensional semantics; computational semantics:
semantic role labeling, combinatory categorial grammar (CCG),
tree adjoining grammar (TAG); lexical semantics: vector space
representations, greedy agglomerative clustering, k-means and
EM clustering; learning hypernym and hyponym relations for nouns and verbs
- Background: Jurafsky & Martin, chapters 18–20; see also NLTK book, chapter 10
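For the lexical-semantics topic, here is a minimal k-means sketch of the kind used to cluster vector-space word representations. The vectors below are invented 2-D toy points standing in for word vectors; real word vectors are high-dimensional counts or embeddings.

```python
import random

def kmeans(vectors, k, iters=20, seed=0):
    """Plain k-means on dense vectors (e.g. word co-occurrence vectors)."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)
    for _ in range(iters):
        # assignment step: each vector joins its nearest centroid
        # (squared Euclidean distance)
        clusters = [[] for _ in range(k)]
        for v in vectors:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(v, centroids[c])))
            clusters[j].append(v)
        # update step: each centroid becomes its cluster's mean
        for j, members in enumerate(clusters):
            if members:
                centroids[j] = [sum(dim) / len(members)
                                for dim in zip(*members)]
    return centroids, clusters

# Two well-separated toy "word" clusters in 2-D.
vectors = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centroids, clusters = kmeans(vectors, k=2)
```

Replacing the hard assignment step with per-cluster posterior probabilities gives the EM clustering variant mentioned above.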
Machine Translation
NLP and Linguistics