Return to basic course information.

This schedule is subject to change. Check back as the class progresses.

- n-gram models, naive Bayes classifiers, probability, estimation
- We also played the Shannon game, guessing the next letter from the previous *n* letters. For background, see Shannon's paper from 1950. **Reading for Jan. 16:** Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. EMNLP, 2002. **Background:** Jurafsky & Martin, chapter 4
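The Shannon game can be sketched with a character n-gram model: collect counts of which letter follows each (n-1)-letter history, then guess the most frequent continuation. This is a minimal illustration; the training text and function names below are invented, not from the course materials.

```python
from collections import Counter, defaultdict

def train_char_ngram(text, n):
    """Map each (n-1)-letter history to a Counter of the letters that follow it."""
    counts = defaultdict(Counter)
    for i in range(len(text) - n + 1):
        history, nxt = text[i:i + n - 1], text[i + n - 1]
        counts[history][nxt] += 1
    return counts

def guess_next(counts, history):
    """Shannon-game guess: the letter most often seen after this history."""
    if history not in counts:
        return None
    return counts[history].most_common(1)[0][0]

model = train_char_ngram("the theater thesis there", n=3)
print(guess_next(model, "th"))  # 'e' -- every "th" in the toy text is followed by "e"
```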

- history of NLP research, the Chomsky hierarchy, regular expressions, (weighted) finite-state automata and transducers
**Background** on NLP with unweighted finite state machines: Karttunen, Chanod, Grefenstette, and Schiller. Regular expressions for language engineering. Journal of Natural Language Engineering, 1997. We discussed the main points and interesting examples from this paper on Jan. 23, so although it's long, you should be able to usefully skim it. **Reading for Jan. 30:** Kevin Knight and Jonathan Graehl. Machine Transliteration. Computational Linguistics, 24(4), 1998. **Reading for Feb. 6:** Okan Kolak, William Byrne, and Philip Resnik. A Generative Probabilistic OCR Model for NLP Applications. In HLT-NAACL, 2003. **More background:** Jurafsky & Martin, chapter 2
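As a toy illustration of a weighted finite-state acceptor, we can score a string by summing -log probabilities along the unique path of a deterministic machine. The states, alphabet, and weights below are all made up for this sketch:

```python
import math

# A toy deterministic weighted FSA over {a, b}: states 0 (start) and 1 (final).
# transitions[(state, symbol)] = (next_state, -log probability of that arc)
transitions = {
    (0, "a"): (0, -math.log(0.5)),
    (0, "b"): (1, -math.log(0.5)),
    (1, "b"): (1, -math.log(0.9)),
}
final = {1: -math.log(0.1)}  # stopping cost at the final state

def string_cost(s, start=0):
    """Total -log probability of s under the FSA, or None if it is rejected."""
    state, cost = start, 0.0
    for sym in s:
        if (state, sym) not in transitions:
            return None
        state, w = transitions[(state, sym)]
        cost += w
    return cost + final[state] if state in final else None

# "aab" follows a->a->b and stops: 0.5 * 0.5 * 0.5 * 0.1 = 0.0125
print(math.exp(-string_cost("aab")))
```

Composing such machines (and transducers, which add an output tape) is the basic operation behind the transliteration and OCR models in the readings.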

Noisy Channel and Hidden Markov Models

- noisy channel models with finite-state transducers; part-of-speech tagging; hidden Markov models as noisy channel models; the Viterbi and forward-backward algorithms; parameter estimation with supervised maximum likelihood and expectation maximization
**Reading for Feb. 20:** Barzilay & Lee. Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization. HLT-NAACL, 2004. **Reading for Feb. 20:** Ritter, Cherry & Dolan. Unsupervised Modeling of Twitter Conversations. HLT-NAACL, 2010. **Background:** Jurafsky & Martin, chapter 5 and 6.1-6.5
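A minimal log-space Viterbi decoder makes the HMM tagging story concrete: each cell stores the best log-probability of any path ending in a given state, plus a backpointer. The two-tag model and all its numbers below are invented for illustration.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden state sequence for obs under an HMM (log-space)."""
    # V[t][s] = (best log-prob of a path ending in state s at time t, backpointer)
    V = [{s: (math.log(start_p[s] * emit_p[s][obs[0]]), None) for s in states}]
    for t in range(1, len(obs)):
        col = {}
        for s in states:
            prev = max(states,
                       key=lambda p: V[t - 1][p][0] + math.log(trans_p[p][s]))
            col[s] = (V[t - 1][prev][0]
                      + math.log(trans_p[prev][s] * emit_p[s][obs[t]]), prev)
        V.append(col)
    # follow backpointers from the best final state
    best = max(states, key=lambda s: V[-1][s][0])
    path = [best]
    for t in range(len(obs) - 1, 0, -1):
        path.append(V[t][path[-1]][1])
    return path[::-1]

# Toy two-tag model (all probabilities invented)
states = ["N", "V"]
start_p = {"N": 0.7, "V": 0.3}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
emit_p = {"N": {"fish": 0.6, "swim": 0.4}, "V": {"fish": 0.3, "swim": 0.7}}
print(viterbi(["fish", "swim"], states, start_p, trans_p, emit_p))  # ['N', 'V']
```

The forward-backward algorithm replaces the max with a sum to get per-position posteriors, which is what EM needs for unsupervised estimation.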

Context-Free Grammars and Parsers

**Reading for Feb. 27:**Dan Klein and Christopher D. Manning. Accurate Unlexicalized Parsing. ACL, 2003.
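To make the parsing machinery concrete, here is a minimal CKY recognizer for a grammar in Chomsky normal form. The tiny grammar and lexicon are invented for this sketch (they are not the Klein & Manning grammar); a real parser would also store probabilities and backpointers to recover the best tree.

```python
from itertools import product

def cky_recognize(words, lexicon, rules, start="S"):
    """True iff words can be derived from `start` under a CNF grammar.
    lexicon: word -> set of preterminals; rules: (B, C) -> set of parents A
    for binary rules A -> B C."""
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(lexicon.get(w, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split the span at every midpoint
                for B, C in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= rules.get((B, C), set())
    return start in chart[0][n]

# Invented toy grammar: S -> N V, with "fish" ambiguous between noun and verb
lexicon = {"people": {"N"}, "fish": {"N", "V"}}
rules = {("N", "V"): {"S"}}
print(cky_recognize(["people", "fish"], lexicon, rules))  # True
```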

Log-Linear Models

- also known as: logistic regression and maximum entropy (maxent) models; directly modeling the conditional probability of output given input, rather than the joint probability of input and output (and then using Bayes' rule)
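Modeling the conditional probability directly comes down to a softmax over feature scores: exponentiate each label's score and normalize over labels only, never over inputs. The feature names and weights below are invented toy parameters.

```python
import math

def p_label_given_input(features, label, labels, weights):
    """Log-linear model: p(label | x) = exp(score(label)) / sum_y exp(score(y)),
    where score(y) is a weighted sum of the active (feature, y) weights."""
    def score(y):
        return sum(weights.get((f, y), 0.0) for f in features)
    z = sum(math.exp(score(y)) for y in labels)
    return math.exp(score(label)) / z

# Invented sentiment weights: the word "great" pushes toward POS
weights = {("great", "POS"): 2.0, ("great", "NEG"): -1.0}
labels = ["POS", "NEG"]
print(p_label_given_input(["great"], "POS", labels, weights))  # about 0.953
```

Because the normalization runs only over labels, features of the input can be arbitrary and overlapping, with no independence assumptions of the naive Bayes kind.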

Log-Linear Models with Structured Outputs

- models that decide among combinatorially many outputs, e.g. sequences of tags or dependency links; locally normalized (action-based) models such as Maximum Entropy Markov Models (MEMMs); globally normalized models such as linear-chain Conditional Random Fields (CRFs)
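The local-vs-global normalization contrast can be seen by computing a CRF's conditional probability the slow way: one partition function summed over every complete tag sequence (feasible only at toy sizes; forward-style dynamic programming does this efficiently in practice). The scoring features and weights here are invented.

```python
import math
from itertools import product

def crf_prob(words, tags, tagset, log_score):
    """p(tags | words) under global normalization: Z(words) sums the
    exponentiated score of every complete tag sequence."""
    seqs = product(tagset, repeat=len(words))
    z = sum(math.exp(log_score(words, t)) for t in seqs)
    return math.exp(log_score(words, tuple(tags))) / z

def toy_log_score(words, tags):
    """Invented emission and transition features with hand-set weights."""
    s = sum(1.0 for w, t in zip(words, tags)
            if (w, t) in {("fish", "N"), ("swim", "V")})
    s += sum(0.5 for a, b in zip(tags, tags[1:]) if (a, b) == ("N", "V"))
    return s

tagset = ("N", "V")
total = sum(crf_prob(["fish", "swim"], t, tagset, toy_log_score)
            for t in product(tagset, repeat=2))
print(total)  # probabilities over all tag sequences sum to 1
```

A MEMM would instead normalize each tagging decision locally, conditioned on the previous tag, which is cheaper but introduces the label bias problem.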

Semantics

- logical form: lambda expressions, event semantics, quantifiers, intensional semantics; computational semantics: semantic role labeling, combinatory categorial grammar (CCG), tree adjoining grammar (TAG); lexical semantics: vector space representations, greedy agglomerative clustering, k-means and EM clustering; learning hyper(o)nym relations for nouns and verbs; compositional vector-space semantics
**Background:** Jurafsky & Martin, chapters 18-20; see also the NLTK book, chapter 10
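Vector-space lexical semantics in miniature: represent words as co-occurrence count vectors and compare them with cosine similarity, the same measure the clustering methods above rely on. The three-dimensional toy counts are made up for this sketch.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Invented co-occurrence counts over three context words
vectors = {"cat": [4, 1, 0], "dog": [3, 2, 0], "car": [0, 1, 5]}
print(cosine(vectors["cat"], vectors["dog"])
      > cosine(vectors["cat"], vectors["car"]))  # True: cat is closer to dog
```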

NLP and Linguistics