APLNG 597E: Introduction to Corpus Linguistics
Spring 2007
General Information
Instructor: Xiaofei Lu
Office: 301
Mailbox: 305
Phone: (814) 865-4692
Email: xxl13 at psu dot edu
Webpage: http://www.personal.psu.edu/xxl13/teaching/sp07/apling597e
Lectures: T R 1:00-2:15pm, 069 Willard
Office hours: T 2:30-4:30pm and by appointment
Required books
1. Douglas Biber, Susan Conrad, and Randi Reppen (1998). Corpus Linguistics: Investigating Language Structure and Use.
2. Allen Downey, Jeff Elkner, and Chris Meyers (2002). How to Think Like a Computer Scientist: Learning with Python. Green Tea Press.
3. Graeme Kennedy (1998). An Introduction to Corpus Linguistics. Longman.
4. Martin Wynne (Ed.) (2005). Developing Linguistic Corpora: a Guide to Good Practice. Oxbow Books.
Course Objectives
This course provides a hands-on introduction to the use of
large text corpora in the study of language. The specific objectives of the
course are to help students:
Course Outline
This course will be organized around the following 5 topics.
Course Requirements
Make-up Policy
Academic Misconduct
All suspected academic dishonesty (e.g., plagiarism or faking data or analysis)
will be reported to the Academic Integrity Committee and, if verified, will be
subject to academic and/or disciplinary sanctions.
Tentative Schedule
| Week | Day | Date | Topic | Readings | Presenters |
|------|-----|------|-------|----------|------------|
| 1 | T | 1/16 | | Kennedy (1998): Ch1; 2.1-2.4 | |
| 1 | R | 1/18 | | Kennedy (1998): 2.5-2.7; Wynne (2005): Ch1 | |
| 2 | T | 1/23 | | | |
| 2 | R | 1/25 | UNIX tools | | |
| 3 | T | 1/30 | | | |
| 3 | R | 2/1 | | Downey et al. (2002): Ch1-4 | |
| 4 | T | 2/6 | | | |
| 4 | R | 2/8 | | Downey et al. (2002): Ch5-8 | |
| 5 | T | 2/13 | | | |
| 5 | R | 2/15 | | Downey et al. (2002): Ch9-11 | |
| 6 | T | 2/20 | | | |
| 6 | R | 2/22 | | Wynne (2005): Ch2; Kennedy (1998): 4.1 | |
| 7 | T | 2/27 | | | |
| 7 | R | 3/1 | | | |
| 8 | T | 3/6 | | | |
| 8 | R | 3/8 | | | |
| 9 | | | Spring Break | | |
| 10 | T | 3/20 | | Biber et al. (1998): Ch1; IV.6; Kennedy (1998): | |
| 10 | R | 3/22 | | Biber et al. (1998): Ch2 | |
| 11 | T | 3/27 | | | |
| 11 | R | 3/29 | | Kennedy (1998): | |
| 12 | T | 4/3 | | | |
| 12 | R | 4/5 | Grammar & lexico-grammar | Kennedy (1998): 3.2-3.3 | |
| 13 | T | 4/10 | Grammar & lexico-grammar | Biber et al. (1998): Ch3-4 | |
| 13 | R | 4/12 | | | |
| 14 | T | 4/17 | Variation | Biber et al. (1998): Ch6; Kennedy (1998): 3.5 | |
| 14 | R | 4/19 | Discourse & stylistic analysis | Biber et al. (1998): Ch5 & Ch8 | |
| 15 | T | 4/24 | | | |
| 15 | R | 4/26 | Language acquisition; applications | Biber et al. (1998): Ch7; Kennedy (1998): Ch5 | |
| 16 | T | 5/1 | Research project presentations | | Tom, Tracy, Park, Davi, Hyewon |
| 16 | R | 5/3 | Research project presentations; proposals due | | Jie, Nathan, So-Eun, Wei |
Additional References
1. Brew, C. and M. Moens (2002). Data-Intensive Linguistics. Manuscript.
2. Church, K. UNIX for Poets. AT&T Research.
3. Grefenstette, G. and P. Tapanainen (1994). What is a word, what is a sentence? Problems of tokenization. In Proceedings of the Third Conference on Computational Lexicography and Text Research (COMPLEX-94).
4. Lu, X. (2005). Candidacy exam. The
5. Manning, C. D. and H. Schütze (1999). Foundations of Statistical Natural Language Processing.
6. Rayson, P., D. Berridge and B. Francis (2004). Extending the Cochran rule for the comparison of word frequencies between corpora. In Proceedings of the 7th International Conference on Statistical Analysis of Textual Data, pp. 926-936.
7. Schmid, H. (1994). Probabilistic part-of-speech tagging using decision trees. In Proceedings of the International Conference on New Methods in Language Processing, pp. 44-49.
8. Teubert, W. (2005). My version of corpus linguistics. International Journal of Corpus Linguistics 10(1), 1-13.