Probability in linguistic typology

Elena Maslova

University of Bielefeld (17-19 July, 2006)

C01-273, 9.00-18.00

Two major objectives of this course are

- to bring the students up to date with the theoretical research in quantitative typology, that is, with the existing methods of drawing linguistic inferences from statistical cross-linguistic data, and
- to provide them with the basic probability-theoretic knowledge needed to understand how evidence from cross-linguistic distributions can and cannot be used in linguistic theory.

If you want to take this course for credit (3LP), a term paper (ca. 15 pages) is required. It can be either a critical analysis of linguistic inferences from statistical evidence in two or more thematically related typological studies, or a statistical typological investigation of your own (e.g. testing a specific hypothesis), which can be based on published typological databases. The paper can be written in English or in German. A preliminary version for comments must be submitted by the end of November (2006). The final version must be submitted by the end of January (2007).

**Introductory**

- Overview of the course. Practical matters. A preliminary overview of research in quantitative typology
- Statistical data and statistical inference. Inferences and explanations in typology

*The two introductory lectures give an overview of the history of quantitative typology and its major methodological issues. We will see how statistical tests and (mostly implicitly) probabilistic concepts have been invoked in typological research, and why we need a better understanding of the basics of probability theory in order to build a more solid methodological foundation for this branch of linguistics.*

**Probabilities and random variables**

- Probability. Independence. Conditional probabilities
- Random variables and their distributions. Properties of expected values, means, variance etc.
- Correlations and dependencies.
- An interim summary: statistical analysis of cross-linguistic distributions

*Judging from the titles, the next three lectures may look as though they
come from an introduction to probability theory, rather than from a
course in linguistic typology. The point is, however, to introduce
the absolutely necessary probability-theoretic concepts with direct
reference to their typological applications. So, for instance, we
will discuss not just the abstract concept of independence, but also
its relation to the problem of “independence of languages”,
probably the most widely known methodological problem of linguistic
typology; not just correlations, but also the probabilistic sense of
implicational universals, etc. More generally, all examples and
exercises will come directly from typological studies.*

*As a way to consolidate the probability-theoretic basics discussed so far, we will critically examine some influential typological studies that rely heavily on statistical inferences (and, sometimes implicitly, on probabilistic concepts). Among other things, this analysis will demonstrate that another probability-theoretic domain has to be explored, namely, random processes.*

**Random processes**

- Introduction to random processes
- Non-linguistic random processes in the language population
- Language change as a random process. Ergodic hypothesis.
- Random processes and cross-linguistic distributions

*Having introduced the mathematical concept of a random process in the first lecture, we will move on to the random processes at work in the language population and to various approaches to modelling them. In the final lecture of this part of the course, we will see how the effects of different random processes might be reflected in the modern cross-linguistic distribution.*

**From theory to applications**

- Limiting distributions, statistical convergence, large numbers
- Sampling, statistics, tests of hypotheses

*The last two lectures establish the missing links between the theoretical issues discussed in the course and the tests and recommendations one finds in statistical textbooks. The major goal is to give the students the basic tools to approach a typological study of their own in an informed manner, as well as to understand and analyze other typological studies relying on statistical evidence. In particular, we will return to the methodological issues outlined in the introductory part of the course and see whether (and, if so, how) they can be resolved.*

**Bibliography**

**Course plan**

| Date | Time | Session |
|---|---|---|
| 17.07.2006 | 9.15-10.45 | 1. Introductory |
| | 11.00-12.30 | 2. Statistical inferences |
| | 14.30-16.00 | 3. Probabilities |
| | 16.15-17.45 | 4. Random variables |
| 18.07.2006 | 9.15-10.45 | 5. Correlations and dependencies |
| | 11.00-12.30 | 6. Analysis of cross-linguistic distributions |
| | 14.30-16.00 | 7. Random processes |
| | 16.15-17.45 | 8. Non-linguistic random processes |
| 19.07.2006 | 9.15-10.45 | 9. Language change as a random process |
| | 11.00-12.30 | 10. Random processes and cross-linguistic distributions |
| | 14.30-16.00 | 11. Statistical convergence |
| | 16.15-17.45 | 12. Sampling, tests of hypotheses |
| 17-19.07.2006 | 13.45-14.15, 18.00-18.30 | Questions (office hours), as needed |

Exact times are given (no academic delays)!