Probability in linguistic typology
Elena Maslova
University of Bielefeld (17-19 July, 2006)
C01-273, 9.00-18.00
Two major objectives of this course are
If you want to take this course for credit (3LP), a term paper (ca. 15 pages) is required. It can be either a critical analysis of linguistic inferences from statistical evidence in two or more thematically related typological studies, or a statistical typological investigation of your own (e.g. testing a specific hypothesis), which can be based on published typological databases. The paper can be written in English or in German. A preliminary version for comments must be submitted by the end of November (2006). The final version must be submitted by the end of January (2007).
The two introductory lectures give an overview of the history of quantitative typology and its major methodological issues. We will see how statistical tests and (mostly implicitly) probabilistic concepts have been invoked in typological research, and why we need a better understanding of the basics of probability theory in order to build a more solid methodological foundation for this branch of linguistics.
Judging from the titles, the next three lectures may look as though they come from an introduction to probability theory, rather than from a course in linguistic typology. The point is, however, to introduce the absolutely necessary probability-theoretic concepts with direct reference to their typological applications. So, for instance, we will discuss not just the abstract concept of independence, but also its relation to the problem of “independence of languages”, probably the most widely known methodological problem of linguistic typology; not just correlations, but also the probabilistic sense of implicational universals, etc. More generally, all examples and exercises will come directly from typological studies.
As a way to consolidate the probability-theoretic basics discussed so far, we will critically examine some influential typological studies which rely heavily on statistical inferences (and, sometimes implicitly, on probabilistic concepts). Among other things, this analysis will demonstrate that another probability-theoretic domain has to be explored, namely, random processes.
Having introduced the mathematical concept of a random process in the first lecture, we will move to a discussion of random processes going on in the language population and various approaches to modelling these processes. In the final lecture of this part of the course, we will see how the effects of different random processes might be reflected in the modern cross-linguistic distribution.
The last two lectures establish the missing links between the theoretical issues discussed in the course and the tests and recommendations one finds in statistical textbooks. The major goal is to give the students the basic tools to approach a typological study of their own in an informed manner, as well as to understand and analyze other typological studies relying on statistical evidence. In particular, we will return to the methodological issues outlined in the introductory part of the course and see whether (and if so, how) they can be resolved.
Bibliography
Bell, Alan. "Language Sampling." In Universals of Human Language, vol. 1, edited by Joseph H. Greenberg, Charles A. Ferguson, and Edith A. Moravcsik, 125-156. Stanford University Press, 1978.
Cysouw, Michael. "Quantitative Method in Typology." In Quantitative Linguistics: An International Handbook, edited by Gabriel Altmann, Reinhard Köhler, and R. Piotrowski. Berlin: Mouton de Gruyter, 2005.
Dryer, Matthew S. "Why Statistical Universals Are Better Than Absolute Universals." In Papers from the 33rd Annual Meeting of the Chicago Linguistic Society, 123-45. 1998.
Dryer, Matthew S. "Large Linguistic Areas and Language Sampling." Studies in Language 13 (1989): 257-92.
Greenberg, Joseph H. "Diachrony, Synchrony and Language Universals." In Universals of Human Language, edited by Joseph Harold Greenberg, Charles Albert Ferguson, and Edith A Moravcsik, 61-91. Stanford, Calif: Stanford University Press, 1978.
Greenberg, Joseph H. "The Diachronic Typological Approach to Language." Edited by Masayoshi Shibatani and Theodora Bynon, 143-66. Oxford; New York: Clarendon Press; Oxford University Press, 1995.
Hawkins, John A. A Performance Theory of Order and Constituency. Cambridge Studies in Linguistics 73. Cambridge; New York: Cambridge University Press, 1994.
Maddieson, Ian. "Investigating Linguistic Universals." In Proceedings of the XIIth International Congress of Phonetic Sciences, 346-54. 1991.
Maslova, Elena. "Meta-Typological Distributions." Sprachtypologie und Universalienforschung
Maslova, Elena. "A Dynamic Approach to the Verification of Distributional Universals." Linguistic Typology 4-3 (2000):
Maslova, Elena. "Динамика типологических распределений и стабильность языковых типов" [Dynamics of typological distributions and stability of language types]. Вопросы языкознания 5 (2004).
Nichols, Johanna. Linguistic Diversity in Space and Time. Chicago: University of Chicago Press, 1992.
Perkins, Revere D. "Statistical Techniques for Determining Language Sample Size." Studies in Language 13 (1989): 293-315.
Perkins, Revere D. "Sampling Procedures and Statistical Methods." In Language Typology and Language Universals: An International Handbook, edited by Martin Haspelmath, Ekkehard König, Wulf Oesterreicher, and Wolfgang Raible, 419-34. 2001.
Rijkhoff, Jan, Bakker, Dik, Hengeveld, Kees, and Kahrel, Peter. "A Method of Language Sampling." Studies in Language 17-1 (1993): 169-203.
Rijkhoff, Jan, and Bakker, Dik. "Language Sampling." Linguistic Typology 2-3 (1998): 263-314.
Tomlin, Russell S. Basic Word Order: Functional Principles. Croom Helm Linguistics Series. London; Wolfeboro, N.H.: Croom Helm, 1986.
Course plan
Date | Time | Topic
17.07.2006 | 9.15-10.45 | 1. Introductory
17.07.2006 | 11.00-12.30 | 2. Statistical inferences
17.07.2006 | 14.30-16.00 | 3. Probabilities
17.07.2006 | 16.15-17.45 | 4. Random variables
18.07.2006 | 9.15-10.45 | 5. Correlations and dependencies
18.07.2006 | 11.00-12.30 | 6. Analysis of cross-linguistic distributions
18.07.2006 | 14.30-16.00 | 7. Random processes
18.07.2006 | 16.15-17.45 | 8. Non-linguistic random processes
19.07.2006 | 9.15-10.45 | 9. Language change as a random process
19.07.2006 | 11.00-12.30 | 10. Random processes and cross-linguistic distributions
19.07.2006 | 14.30-16.00 | 11. Statistical convergence
19.07.2006 | 16.15-17.45 | 12. Sampling, tests of hypotheses
17-19.07.2006 | 13.45-14.15, 18.00-18.30 | Questions (office hours), as needed
Exact times are given (no academic delays)!