Probability in linguistic typology

Elena Maslova

University of Bielefeld (17-19 July, 2006)

C01-273, 9.00-18.00

Two major objectives of this course are

- to bring the students up to date with the theoretical research in quantitative typology, that is, with the existing methods of drawing linguistic inferences from statistical cross-linguistic data, and
- to provide them with the basic probability-theoretic knowledge needed to understand how evidence from cross-linguistic distributions can and cannot be used in linguistic theory.

If you want to take this course for credit (3LP), a term paper (ca. 15 pages) is required. It can be either a critical analysis of linguistic inferences from statistical evidence in two or more thematically related typological studies, or a statistical typological investigation of your own (e.g. testing a specific hypothesis), which can be based on published typological databases. The paper can be written in English or in German. A preliminary version for comments must be submitted by the end of November (2006). The final version must be submitted by the end of January (2007).

**Introductory**

- Overview of the course. Practical matters. A preliminary overview of research in quantitative typology
- Statistical data and statistical inference. Inferences and explanations in typology

*The two introductory lectures give an overview of the history of
quantitative typology and its major methodological issues. We will
see how statistical tests and (mostly implicitly) probabilistic
concepts have been invoked in typological research, and why we need a
better understanding of the basics of probability theory in order to
build a more solid methodological foundation for this branch of
linguistics.*


**Probabilities and random variables**

- Probability. Independence. Conditional probabilities
- Random variables and their distributions. Properties of expected values, means, variance etc.
- Correlations and dependencies.
- An interim summary: statistical analysis of cross-linguistic distributions

*Judging from the titles, the next three lectures may look as though they
come from an introduction to probability theory, rather than from a
course in linguistic typology. The point is, however, to introduce
the absolutely necessary probability-theoretic concepts with direct
reference to their typological applications. So, for instance, we
will discuss not just the abstract concept of independence, but also
its relation to the problem of “independence of languages”,
probably the most widely known methodological problem of linguistic
typology; not just correlations, but also the probabilistic sense of
implicational universals, etc. More generally, all examples and
exercises will come directly from typological studies.*

*As a way to consolidate the probability-theoretic basics discussed so
far, we will critically examine some influential typological studies
which heavily rely on statistical inferences (and, sometimes
implicitly, on probabilistic concepts). Among other things, this
analysis will demonstrate that another probability-theoretic domain
has to be explored, namely, random processes.*


**Random processes**

- Introduction to random processes
- Non-linguistic random processes in the language population
- Language change as a random process. Ergodic hypothesis.
- Random processes and cross-linguistic distributions

*Having introduced the mathematical concept of random process in the first lecture, we will move to a discussion of random processes going on in the language population and various approaches to modelling these processes. In the final lecture of this part of the course, we'll see how the effects of different random processes might be reflected in the modern cross-linguistic distribution.*


**From theory to applications**

- Limiting distributions, statistical convergence, large numbers
- Sampling, statistics, tests of hypotheses

*The last two lectures establish the missing links between the theoretical
issues discussed in the course and the tests and recommendations one
finds in statistical textbooks. The major goal is to give the students the
basic tools to approach a typological study of their own in an
informed manner, as well as to understand and analyze other
typological studies relying on statistical evidence. In particular,
we will return to the methodological issues outlined in the introductory
part of the course and see whether (and if so, how) they can be
resolved.*

**Bibliography**

Bell, Alan. "Language sampling." In *Universals of human language, vol. 1*, edited by Joseph H. Greenberg, Charles A. Ferguson & Edith A. Moravcsik, 125-156. Stanford University Press, 1978.

Cysouw, Michael. "Quantitative Method in Typology." In *Quantitative Linguistics: An International Handbook*, edited by Gabriel Altmann, Reinhard Köhler, and R. Piotrowski. Berlin: Mouton de Gruyter, 2005.

Dryer, Matthew S. "Why Statistical Universals Are Better Than Absolute Universals." In *Papers from the 33rd Annual Meeting of the Chicago Linguistic Society*, 123-45. 1998.

Dryer, Matthew S. "Large Linguistic
Areas and Language Sampling." *Studies in Language* 13
(1989): 257-92.

Greenberg, Joseph H. "Diachrony,
Synchrony and Language Universals." In *Universals of Human
Language*, edited by Joseph Harold Greenberg, Charles Albert
Ferguson, and Edith A Moravcsik, 61-91. Stanford, Calif: Stanford
University Press, 1978.

Greenberg, Joseph H. "The Diachronic Typological Approach to Language." Edited by Masayoshi Shibatani and Theodora Bynon, 143-66. Oxford; New York: Clarendon Press, Oxford University Press, 1995.

Hawkins, John A. *A Performance Theory of Order and Constituency*. Cambridge Studies in Linguistics 73. Cambridge; New York: Cambridge University Press, 1994.

Maddieson, Ian. "Investigating Linguistic Universals." In *Proceedings of the XIIth International Congress of Phonetic Sciences*, 346-54. 1991.

Maslova, Elena. "Meta-Typological Distributions." *Sprachtypologie und Universalienforschung*.

Maslova, Elena. "A Dynamic Approach to
the Verification of Distributional Universals." *Linguistic
Typology* 4-3 (2000):

Maslova, Elena. "Динамика типологических распределений и стабильность языковых типов" [The dynamics of typological distributions and the stability of language types]. *Вопросы языкознания* 5 (2004).

Nichols, Johanna. *Linguistic Diversity
in Space and Time*. Chicago: University of Chicago Press, 1992.

Perkins, Revere D. "Statistical
Techniques for Determining Language Sample Size." *Studies in
Language* 13 (1989): 293-315.

Perkins, Revere D. "Sampling Procedures and Statistical Methods." In *Language Typology and Language Universals: An International Handbook*, edited by Martin Haspelmath, Ekkehard König, Wulf Oesterreicher, and Wolfgang Raible, 419-34. 2001.

Rijkhoff, Jan, Bakker, Dik, Hengeveld, Kees, and Kahrel, Peter. "A Method of Language Sampling." *Studies in Language* 17-1 (1993): 169-203.

Rijkhoff, Jan, and Bakker, Dik. "Language Sampling." *Linguistic Typology* 2-3 (1998): 263-314.

Tomlin, Russell S. *Basic Word Order: Functional Principles*. Croom Helm Linguistics Series. London; Wolfeboro, N.H.: Croom Helm, 1986.

**Course plan**

| Date | Time | Session |
|---|---|---|
| 17.07.2006 | 9.15-10.45 | 1. Introductory |
| 17.07.2006 | 11.00-12.30 | 2. Statistical inferences |
| 17.07.2006 | 14.30-16.00 | 3. Probabilities |
| 17.07.2006 | 16.15-17.45 | 4. Random variables |
| 18.07.2006 | 9.15-10.45 | 5. Correlations and dependencies |
| 18.07.2006 | 11.00-12.30 | 6. Analysis of cross-linguistic distributions |
| 18.07.2006 | 14.30-16.00 | 7. Random processes |
| 18.07.2006 | 16.15-17.45 | 8. Non-linguistic random processes |
| 19.07.2006 | 9.15-10.45 | 9. Language change as a random process |
| 19.07.2006 | 11.00-12.30 | 10. Random processes and cross-linguistic distributions |
| 19.07.2006 | 14.30-16.00 | 11. Statistical convergence |
| 19.07.2006 | 16.15-17.45 | 12. Sampling, tests of hypotheses |
| 17-19.07.2006 | 13.45-14.15, 18.00-18.30 | Questions (Sprechstunde), as needed |

Exact times are given (no academic delays)!