Bayesian Modelling of Spatial Typology
Emmy Noether Research Group funded by the German Research Foundation (DFG)
Host institution: Department of Linguistics, Albert-Ludwigs-Universität Freiburg
Start date: December 1, 2022
Duration: 6 years
Lead: Matías Guzmán Naranjo
Project team: I-Ying Lin and Marvin Martiny
Research assistants: Sarah Neitzel and Miriam Schiele
A considerable body of research on spatial typology (dialectology, areal typology, and language diffusion) addresses topics such as the emergence of linguistic areas, the need to account for contact effects, the sociolinguistic factors that shape language contact, and the impact of geographical features on contact between languages. However, attempts to build computational models of these phenomena have so far remained largely isolated, and most rest on suboptimal or incomplete assumptions and on heavily simplified data.
To better understand spatial phenomena, these factors need to be taken seriously: more realistic generative models for spatial typology, built on more comprehensive and detailed datasets, are needed. The present project aims to improve the current state of research by building Bayesian generative models for spatial phenomena that rest on more realistic assumptions and higher-quality spatial data. We will develop these models for two purposes: to control for spatially induced confounds in typological studies, and to identify and investigate spatial effects themselves.
Better data:
- More realistic distances between languages: Most work in spatial typology measures the distance between language communities either as Euclidean distance or as geodesic (great-circle) distance. Both approaches ignore social and geographical factors such as mountain ranges, rivers, or trade routes. A central goal of the project is therefore to develop and compute distance measures that more accurately reflect how separated language communities actually are.
- More realistic representations of language areas: While many studies in spatial typology represent languages as points in space, this is a strong simplification. The project aims to create more realistic polygonal representations of language areas.
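The contrast between a straight-line distance and a distance that respects terrain can be illustrated with a minimal sketch. It is not the project's method: the cost grid, the cell costs, and the function names `great_circle_km` and `least_cost` are assumptions made purely for illustration. The first function computes the great-circle distance with the haversine formula; the second finds the cheapest route across a small grid in which crossing a "mountain" cell is more expensive than crossing flat ground.

```python
import heapq
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def least_cost(grid, start, goal):
    """Dijkstra on a 4-connected grid; entering a cell costs grid[r][c]."""
    rows, cols = len(grid), len(grid[0])
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return cost
        if cost > best.get((r, c), math.inf):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + grid[nr][nc]
                if new_cost < best.get((nr, nc), math.inf):
                    best[(nr, nc)] = new_cost
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    return math.inf


# Hypothetical terrain: 1 = easy ground, 9 = mountain ridge.
terrain = [
    [1, 9, 1],
    [1, 9, 1],
    [1, 1, 1],
]
flat = [[1] * 3 for _ in range(3)]

flat_cost = least_cost(flat, (0, 0), (0, 2))      # straight across
ridge_cost = least_cost(terrain, (0, 0), (0, 2))  # detour around the ridge
```

In this toy setting the two corner cells are equally far apart as the crow flies, yet the cheapest route on `terrain` costs three times as much as on `flat` because it has to go around the ridge. Real cost surfaces would be derived from elevation, hydrology, or road data rather than hand-written grids.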
Better models:
Within the project, we will develop Bayesian models that consider the following aspects:
- Barriers and pathways: Contact between language communities is impeded by both natural barriers (e.g., mountains, fast-flowing rivers, oceans) and socio-political barriers (e.g., borders, religions). At the same time, contact is facilitated by pathways (e.g., roads, trade routes, slow-flowing rivers). The project lead will develop Bayesian models of contact-induced diffusion that take both barriers and pathways into account.
- Asymmetric language contact: Language contact is often asymmetric. Larger languages tend to exert a stronger influence on smaller languages than vice versa, and lingua francas and imperial languages may have a much larger reach than the languages of marginalized communities. Marvin Martiny will develop Bayesian models for asymmetric language contact.
- Models with polygon data: Since different languages are spoken over areas of varying size, they can influence more or fewer neighbors. I-Ying Lin will develop Bayesian models of language contact that incorporate polygonal representations of language areas.
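To make the idea of asymmetric contact concrete, the following sketch builds a small matrix of contact weights in which the influence of one language on another depends on the relative size of the two communities and decays with distance. This is an illustrative assumption, not the model developed in the project: the functional form, the use of speaker counts as a proxy for language "size", the `scale` parameter, and the name `influence_matrix` are all hypothetical. In a Bayesian model, quantities like `scale` would be estimated from data rather than fixed by hand.

```python
import math


def influence_matrix(sizes, distances, scale):
    """Return w, where w[i][j] is the assumed influence of language j on
    language i: the size share of j relative to i, damped by distance."""
    n = len(sizes)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a language does not influence itself here
            size_share = sizes[j] / (sizes[i] + sizes[j])
            w[i][j] = size_share * math.exp(-distances[i][j] / scale)
    return w


# Hypothetical data: language 0 has ten times as many speakers as language 1.
sizes = [1000, 100]
distances = [
    [0.0, 50.0],
    [50.0, 0.0],
]
w = influence_matrix(sizes, distances, scale=100.0)
```

Although the distance matrix is symmetric, the resulting weights are not: `w[1][0]`, the influence of the larger language on the smaller one, is ten times `w[0][1]`. A symmetric distance-only kernel could not express this difference.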
Publications
M. Guzmán Naranjo and G. Jäger, “Euclide, the crow, the wolf and the pedestrian: distance metrics for linguistic typology [version 2; peer review: 1 approved, 2 approved with reservations],” Open Research Europe, vol. 3, p. 104, 2024, doi: 10.12688/openreseurope.16141.2.
M. Guzmán Naranjo and M. Mertner, “Estimating areal effects in typology: A case study of African phoneme inventories,” Linguistic Typology, vol. 27, no. 2, pp. 455–480, 2023, doi: 10.1515/lingty-2022-0037.
Conference Contributions
M. Guzmán Naranjo and G. Jäger, “Euclide, the crow, the wolf and the pedestrian,” 14th Conference of the Association for Linguistic Typology, University of Texas, Austin, 15–17 Dec. 2022.
M. Martiny, “Grammaticalization in Western Amazonia: A study on areal effects,” Amazonicas IX, Universidad de los Andes, Bogotá, 5–9 June 2023.