COMP 150TP: Text Processing: Inductive Techniques and Applications
Course Web page (this page)
http://www.eecs.tufts.edu/g/150TP/
Prerequisites:
COMP 160 (Algorithms) or permission of the instructor.
Class Times:
Monday, Wednesday 5:10-6:30pm
Instructor:
Roni Khardon
Office: Halligan 230
Office Hours: Tue, Wed 3:30-4:30pm or by appointment
Phone: 1-617-627-5290
Email: roni@eecs.tufts.edu
Teaching Assistant:
Marta Arias
Email: marias@eecs.tufts.edu
Text and Notes
- Foundations of Statistical Natural Language Processing, Christopher Manning
and Hinrich Schütze, 1999, MIT Press.
The book's web page
http://www-nlp.stanford.edu/fsnlp/
includes an errata list as well as useful resources.
- Additional articles will be distributed to supplement the text.
Other Recommended Texts
- James Allen, Natural Language Understanding, Addison-Wesley, 1995.
- Eugene Charniak, Statistical Language Learning, MIT Press, 1993.
- Daniel Jurafsky and James Martin, Speech and Language Processing,
Prentice Hall, 2000.
- Tom Mitchell, Machine Learning, McGraw-Hill, 1997.
Additional Articles and Pointers
- The SENSEVAL project home page
- A report/summary:
English SENSEVAL: Report and Results, A. Kilgarriff and J. Rosenzweig.
- Comparative Experiments on Disambiguating Word Senses: An Illustration
of the Role of Bias in Machine Learning,
Raymond J. Mooney,
Proceedings of the 1996 Conference on Empirical Methods in Natural Language
Processing, pp. 82-91, Philadelphia, PA, May 1996.
- A Winnow-Based Approach to Spelling Correction,
A. R. Golding and D. Roth,
Machine Learning, Volume 34, pp. 107-130, 1999.
- The Weighted Majority Algorithm,
N. Littlestone and M. Warmuth,
Information and Computation, Vol. 108, No. 2, pp. 212-261, 1994.
- Learning Quickly When Irrelevant Attributes Abound,
N. Littlestone, Machine Learning, 2:285-318, 1988.
- Part of Speech Tagging Using a Network of Linear Separators,
D. Roth and D. Zelenko.
- User Guides for SNoW and its feature extractor FeX
- Transformation-Based Error-Driven Learning and Natural Language
Processing: A Case Study in Part of Speech Tagging,
E. Brill,
Computational Linguistics, December 1995.
- An Introduction to the Application of the Theory of Probabilistic
Functions of a Markov Process to Automatic Speech Recognition,
Rabiner, Levinson, and Sondhi, The Bell System Technical Journal, Vol.
62, No. 4, pages 1035-1074, 1983.
- Three Generative, Lexicalised Models for Statistical Parsing,
Michael Collins,
Proceedings of the 35th Annual Meeting of the ACL, 1997.
Class Handouts
Notes
Written Homework Assignments
Computer-Based Homework Assignments
- Practical Exercise 1
(identify whether a document is written in English, Spanish, or French)
postscript, pdf
- Practical Exercise 2
(word-sense disambiguation using the Naive Bayes algorithm)
postscript, pdf
- Practical Exercise 3
(part-of-speech tagging with SNoW)
postscript, pdf
- Practical Exercise 4
(working with PCFGs)
postscript, pdf