Phoneme cooccurrence in Phoible inventories

A forum for discussing linguistics or just languages in general.
Creyeditor
MVP
Posts: 5123
Joined: 14 Aug 2012 19:32

Phoneme cooccurrence in Phoible inventories

Post by Creyeditor »

I just wanted to share this graphic that I found on the conlangs subreddit. It shows how often certain phonemes cooccur in a phoneme inventory in the Phoible database.

https://www.reddit.com/r/conlangs/comme ... occurence/

There are some cool surprises there (though of course the patterns don't need to be empirically real in any way).
  • /ɮ/ and /ŋgʷ/ frequently cooccur. Maybe because both are frequent in the Caucasus and North America?
  • Mid tones frequently cooccur with labiovelar plosives. Maybe because both are frequent in West Africa?
  • Lax mid vowels frequently cooccur with voiced fricatives. Maybe an effect of Bantu?
  • Voiced stops frequently cooccur with tense mid vowels. No real idea for an explanation here.
  • /ɲ/ and /h/ also frequently cooccur with them. Spanish influence?
What do you think? Did you notice something unusual? Any speculation on possible explanations is welcome.
Creyeditor
"Thoughts are free."
Produce, Analyze, Manipulate
1 :deu: 2 :eng: 3 :idn: 4 :fra: 4 :esp:
:con: Ook & Omlűt & Nautli languages & Sperenjas
[<3] Papuan languages, Morphophonology, Lexical Semantics [<3]
Nel Fie
cuneiform
Posts: 156
Joined: 23 May 2022 15:18

Re: Phoneme cooccurrence in Phoible inventories

Post by Nel Fie »

Here's some more info about the first two you noticed.
  • The cooccurrence of /ɮ/ and /ŋɡʷ/ is all down to 16 Afro-Asiatic languages in PHOIBLE's data. The exact languages are Bana, Besleri, Buwal, Dghwede, Gavar, Hdi, Mbuko, Merey, Mofu-Gudur, Moloko, Ngizim, Daba, Tera, Vame, Wandala and Wuzlam. It's most likely an areal effect since, based on the map, they all seem to be immediate neighbours of each other in the Far North region of Cameroon.
  • About the cooccurrence of mid tones (/˧/) and labio-velar plosives (well, /kp/ and /ɡb/ anyway), you're right. That's also down to languages in West Africa, although the number of languages is higher, at around 137, and they are more spread out and distributed across several families. Still probably something of an areal effect, although much wider geographically speaking.
:deu: Native (Swabian) | :fra: Native (Belgian) | :eng: Fluent | :rus: Beginner
DeviantArt | YouTube | Tumblr

Re: Phoneme cooccurrence in Phoible inventories

Post by Creyeditor »

Neat, I wouldn't have guessed the Afro-Asiatic stuff but it certainly makes sense.
Edit: Ah, all of them are also Chadic and most of them are Central Chadic aka Biu-Mandara. Might be a genetic effect as the two segments are also reconstructed for Proto-Central Chadic per this list: https://en.m.wiktionary.org/wiki/Append ... structions

Re: Phoneme cooccurrence in Phoible inventories

Post by Nel Fie »

Creyeditor wrote: 13 Apr 2024 22:04 [...]
  • Lax mid vowels frequently cooccur with voiced fricatives. Maybe an effect of Bantu?
  • Voiced stops frequently cooccur with tense mid vowels. No real idea for an explanation here.
[...]
Could you clarify which mid vowels you mean, exactly? I'm afraid I'm not familiar with the lax-tense classification practice here.

Re: Phoneme cooccurrence in Phoible inventories

Post by Creyeditor »

/ɛ ɔ/ are lax mid vowels and /e o/ are tense mid vowels.

Re: Phoneme cooccurrence in Phoible inventories

Post by Nel Fie »

Creyeditor wrote: 16 Apr 2024 18:10 /ɛ ɔ/ are lax mid vowels and /e o/ are tense mid vowels.
Thank you. Sorry, I hadn't noticed yet that this is based on PHOIBLE's own system of features. Their approach as a whole is not quite what I'm used to.
About your last point, when you said "/ɲ/ and /h/ also frequently cooccur with them", does "them" refer to voiced stops, or to tense mid vowels?

In regard to point 3, "Lax mid vowels frequently cooccur with voiced fricatives. Maybe an effect of Bantu?":

PHOIBLE counts about 969 languages with any of /ɛ,ɔ/, of which 662 also have at least one voiced fricative (assuming /β,v,ð,z,ʒ,ʐ,ʝ,ɣ,ʁ,ʕ,ɦ/). So that's indeed a majority cooccurrence, at about two thirds (662/969 ≈ 68%).
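For anyone who wants to redo this kind of count, here is a minimal sketch in Python. It is not the query actually used for the numbers above: the `toy` inventories and the `cooccurrence` helper are made up for illustration, and real PHOIBLE data would first have to be loaded into the same "language -> set of phonemes" shape.

```python
# Segment groups as listed in the post.
LAX_MID = {"ɛ", "ɔ"}
VOICED_FRICATIVES = {"β", "v", "ð", "z", "ʒ", "ʐ", "ʝ", "ɣ", "ʁ", "ʕ", "ɦ"}


def cooccurrence(inventories, group_a, group_b):
    """Count inventories with any segment of group_a, and how many of
    those also contain any segment of group_b."""
    with_a = [inv for inv in inventories.values() if inv & group_a]
    with_both = [inv for inv in with_a if inv & group_b]
    return len(with_a), len(with_both)


# Hypothetical toy inventories, NOT real PHOIBLE entries.
toy = {
    "lang1": {"ɛ", "ɔ", "v", "z", "p", "t"},
    "lang2": {"ɛ", "p", "t", "k"},
    "lang3": {"e", "o", "z", "b"},
    "lang4": {"ɔ", "ɣ", "m"},
}

n_a, n_both = cooccurrence(toy, LAX_MID, VOICED_FRICATIVES)
print(f"toy data: {n_both} of {n_a} lax-mid-vowel inventories have a voiced fricative")

# The same ratio for the figures quoted in the post.
quoted_share = 662 / 969
print(f"quoted figures: {quoted_share:.1%}")
```

Using set intersection (`inv & group_a`) keeps the "any of these segments" condition to one expression per group.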

Not sure if and how much it has to do with Bantu. While it's true that Atlantic-Congo languages make up the largest share of the result (with 254 languages), this family is probably the largest in the dataset by far (with 423 entries out of 2186). There are also languages from families across the board that show a similar pattern at roughly the same proportion (with Indo-European at 80 out of 149, and Sino-Tibetan at 43 out of 100).
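To put the per-family figures side by side, the ratios can be worked out in a few lines of Python. This is only a sketch: it takes the numbers exactly as quoted, and it assumes that each denominator is the family's total number of entries in the dataset, which the post only states explicitly for Atlantic-Congo.

```python
# Figures as quoted in the post:
# family -> (languages with /ɛ,ɔ/ and a voiced fricative, quoted denominator).
# Assumption: the denominator is the family's total entry count.
quoted = {
    "Atlantic-Congo": (254, 423),
    "Indo-European": (80, 149),
    "Sino-Tibetan": (43, 100),
}

shares = {family: hits / total for family, (hits, total) in quoted.items()}

for family, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{family:15s} {share:.1%}")

# Atlantic-Congo's share of the 662 matching languages overall.
ac_share_of_result = 254 / 662
print(f"Atlantic-Congo share of the result: {ac_share_of_result:.1%}")
```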

I can't claim any expertise in all this, but my guess is that it's more of a cross-linguistic pattern, with Bantu only being a potential culprit in the Atlantic-Congo family itself, due to how large it is.

As for point 4, "Voiced stops frequently cooccur with tense mid vowels. No real idea for an explanation here.":

The result seems to show a similar pattern. There are 1198 languages with at least one of /e,o/ and at least one of /b,d,ɖ,ɟ,ɡ,ɢ/, versus 379 that have any of /e,o/ but no voiced stops, and 265 that have any voiced stop but neither of /e,o/. The languages come from a wide range of families, with Atlantic-Congo having a pretty large representation at 369, but Indo-European with 103 and Sino-Tibetan with 77 come in second and third again.
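These three counts can be arranged as a 2x2 table to see how much more likely a voiced stop is when /e,o/ are present. The sketch below rests on two assumptions that the post does not confirm: that the 2186 entries mentioned earlier are the whole dataset being counted, and that the three quoted counts are disjoint, so the "neither" cell is whatever is left over.

```python
# Counts quoted in the post.
both = 1198          # any of /e,o/ AND any voiced stop
only_eo = 379        # any of /e,o/, no voiced stop
only_stop = 265      # any voiced stop, neither /e/ nor /o/

# ASSUMPTION: 2186 is the total number of inventories counted.
total = 2186
neither = total - both - only_eo - only_stop

# Share of inventories with a voiced stop, split by presence of /e,o/.
p_stop_given_eo = both / (both + only_eo)
p_stop_given_no_eo = only_stop / (only_stop + neither)

print(f"neither cell (derived): {neither}")
print(f"voiced stop given /e,o/:    {p_stop_given_eo:.1%}")
print(f"voiced stop given no /e,o/: {p_stop_given_no_eo:.1%}")
```

If the total or the disjointness assumption is off, only the last line changes; the first ratio depends solely on the quoted counts.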

All in all, it's probably also just a cross-linguistic pattern. Maybe there's something more interesting to be dug up by calculating how strong the pattern is proportionally within each family, but I can't afford the time to do that math right away.
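The within-family calculation mentioned here could look roughly like the following Python sketch. The records are invented toy data and `share_by_family` is a hypothetical helper; with real data, each record would be a family name paired with one inventory.

```python
from collections import defaultdict

TENSE_MID = {"e", "o"}
VOICED_STOPS = {"b", "d", "ɖ", "ɟ", "ɡ", "ɢ"}


def share_by_family(records, group_a, group_b):
    """For each family, return the share of inventories with any segment
    of group_a that also contain any segment of group_b."""
    counts = defaultdict(lambda: [0, 0])  # family -> [with_a, with_both]
    for family, inventory in records:
        if inventory & group_a:
            counts[family][0] += 1
            if inventory & group_b:
                counts[family][1] += 1
    return {fam: n_both / n_a for fam, (n_a, n_both) in counts.items()}


# Hypothetical toy records, NOT real PHOIBLE entries.
records = [
    ("FamX", {"e", "b"}),
    ("FamX", {"o", "p"}),
    ("FamY", {"e", "o", "d", "ɡ"}),
    ("FamY", {"a", "b"}),
    ("FamZ", {"a", "p"}),
]

result = share_by_family(records, TENSE_MID, VOICED_STOPS)
for family, share in result.items():
    print(f"{family}: {share:.0%}")
```

Families with no /e,o/ inventory at all (FamZ here) are left out of the result rather than reported as zero, since the proportion is undefined for them.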

Why the pattern would occur at all, though, is entirely outside my wheelhouse.