Semũr [sεmɯɾ] is a language spoken in a collab-world me and a few friends are working on, made for the sole purpose of becoming a substratum for my branch of our proto-language (and probably surviving Basque-style). The version of Semũr I’m working on right now represents the state of the language right before first contact. Some elements of the language are pretty experimental and may not be extremely naturalistic, although mostly I’m going for naturalism.
Phonology
I’ll try to keep this short. I kept generating inventories with gleb until I got one that looked good, then played around with the allophony rules and frequencies until I liked them.
Edit: Been thinking about this for a bit after I was told that aspirates don’t occur in langs without /h/. Would replacing the aspirated stuff with ejectives make it more realistic or am I then violating some other universal?
First row represents phoneme, second row romanisation (the language is not written at this stage but will adopt a writing system in the future). Sounds in brackets only occur allophonically. Not all sounds are equally common. Tenuis stops, voiced nasals and /s/ are the most common sounds. Voiced stops and other voiced sonorants are the next most common, followed by /t͡s/ and /f/. Aspirated stops and /k͡x/ are relatively rare. The rarest consonant phonemes are voiceless nasals and aspirated affricates. (In the wordgen I’m using, the frequencies are weighted 5:4:3:2:1.)
The vowels are written <a e i o u õ ũ>, where the tilde indicates unroundedness.
/a/ is the most common vowel, followed by /i u/. /e o/ are the rarest ones and [ɯ ʌ] only appear as allophones of their rounded versions.
Syllable Structure
A syllable has the structure (C₁)V(C₂) where
-C₁ denotes any consonant
-V denotes any vowel
-C₂ denotes any of /n n̥ ɾ l s/ (allophonically also includes [m m̥ ŋ ŋ̊ z])
Words can theoretically have infinite syllables, but in reality most have two, three or sometimes four. Roots are almost never longer than two syllables. Two vowels may form a sequence, but there is always a hiatus. Diphthongs do not exist in Semũr.
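A minimal sketch of the (C₁)V(C₂) template as a Python check. The function name and the onset/nucleus/coda representation are assumptions made for illustration, and the consonant inventory has to be passed in by the caller because the full chart is not reproduced in the text. Only the phonemic codas are checked, not the allophonic ones.

```python
from typing import Optional, Set

VOWELS = {"a", "e", "i", "o", "u"}
# Phonemic codas /n n̥ ɾ l s/; "n\u0325" is n plus the combining ring below.
CODAS = {"n", "n\u0325", "ɾ", "l", "s"}


def is_valid_syllable(onset: Optional[str], nucleus: str,
                      coda: Optional[str], consonants: Set[str]) -> bool:
    """Check one phonemic syllable against the (C1)V(C2) template."""
    if nucleus not in VOWELS:
        return False
    # C1 is optional, but if present it must be a consonant of the language.
    if onset is not None and onset not in consonants:
        return False
    # C2 is optional, but if present it must be one of the permitted codas.
    if coda is not None and coda not in CODAS:
        return False
    return True
```

Passing the inventory as a parameter keeps the sketch honest about what is actually specified here: the coda set is given, the full onset set is not.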
Allophony Rules
Most of the rules that gleb gave me can’t actually occur due to the restricted syllable structure.
- Non-labial consonants become rounded before [u]. This is not indicated in romanisation.
<mensu> [mεnsʷu]
- /s/ assimilates in voicing to a following obstruent or nasal (even across word boundaries). This is indicated in romanisation within a word but not at a word boundary, where it is always written <s>.
/vantʰes/ (sleep) + /de/ (dual) → [ʋantʰεzdε] <vantʰezde> (the two sleep)
- Nasals assimilate in place of articulation to any following consonant. This is indicated in romanisation.
/inm̥os/ → [imm̥ʌs] <imhmõs> (some kind of spirit)
- /u/ and /o/ become unrounded before non-rounded consonants. This is indicated in the romanisation by use of the tilde.
/semur/ → [sεmɯɾ] <semũr> (the name of this language)
Stress
Stress is not lexical. It is iambic (alternating between unstressed and stressed) and resets at phrase boundaries. There is a tendency to choose words so that phrases end on a stressed syllable, but it is not considered ungrammatical if this rule is violated (and often there is no alternative). This is probably the least natural part of the phonology, but also my favourite (iambs are fun).
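The stress rule can be sketched in a few lines of Python. This assumes that each phrase is already split into syllables and that the alternation simply runs across word boundaries inside a phrase; the function name and the "ˈ" marking are illustrative choices, not part of the language description.

```python
from typing import List


def assign_stress(phrases: List[List[str]]) -> List[List[str]]:
    """Mark every second syllable of each phrase as stressed (iambic).

    Each phrase is a flat list of syllables; word boundaries are ignored
    because stress is not lexical. Stressed syllables get a leading "ˈ".
    """
    result = []
    for phrase in phrases:
        marked = []
        for index, syllable in enumerate(phrase):
            # Counting restarts at 0 for every phrase, so the 2nd, 4th, ...
            # syllables (odd indices) are the stressed ones.
            marked.append("ˈ" + syllable if index % 2 == 1 else syllable)
        result.append(marked)
    return result
```

For example, `assign_stress([["se", "mũr"]])` gives `[["se", "ˈmũr"]]`. A phrase with an odd number of syllables ends unstressed, which is the case speakers tend to avoid.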
Noun classes
These are very important so I’m giving them their own section.
There are a total of 14 noun classes in Semũr. The idea of having them, as well as a few of the classes themselves, is inspired by Bantu languages, especially Swahili. The classes are ordered by their grade of animacy. This is important, as you will see later on. These are the classes:
1. Humans — Includes all humans, kinship terms, actor and patient forms.
2. Animals — Includes all animals.
3. Spiritual — Includes anything to do with deities, spirits, magic.
4. Plants — Includes anything that is not an animal or human, not man-made and shows signs of life like growth.
5. Deadly Things — Includes anything that can kill or seriously harm you, such as fire or poisonous mushrooms. Taboos also go into this group.
6. Foodstuff — Includes anything consumable.
7. Tools — Includes anything which is used to achieve something else, such as tools, weapons or (in later times) machinery. Also includes means of transport which are not animals.
8. Containers — Includes anything, regardless of material, that is used to store other things in.
9. Textiles — Includes clothing and the materials they are made from.
10. Rock — Includes anything natural that is neither animate nor falls into the plants class.
11. Abstract — Includes anything you can’t directly point at that doesn’t fit in any other class.
12. Diminutives — Includes notably small things, such as insects.
13. Augmentatives — Includes notably large things, such as the sea or mountains.
14. Duals — Includes things that usually come in pairs, such as eyes.
These classes can be divided into two rough subgroups: Immutable (1-11) and Mutable (12-14). Every noun has an inherent class. However, any noun may, if the situation calls for it, be treated like a noun from one of the mutable classes. Note that some nouns may be inherently placed in one of the mutable classes. Noun classes are entirely based on semantics and are in no way marked on the noun itself.
Syntax
Semũr is VSO with a fairly strict word order. It is fairly analytic, with some agglutinating affixes. Most grammatical meanings are conveyed via a large number of particles.
A sentence can be subdivided into three parts. The first part is filled with modifiers. These indicate tense, aspect and modality of the verb. Tense and modality modifiers are open classes; aspect markers form a fairly closed one. It is mandatory to have a tense marker. Tense markers are comparable to adverbs of time in western languages and may be as long as need be. Modality markers are usually single words, and there may be several affecting each other. There can only be one aspect marker.
The default tense marker is bo, meaning roughly “at the time which has been referred to”.
The second part is filled by the verb. I need to spend more time thinking about what to do if a situation calls for verbs being stacked, as in “I want to be able to swim”, or if that should be allowed at all. Probably I’ll just stack them in some order.
The third part is filled by all the arguments, aka noun phrases or subordinated stuff. These appear in a pretty fixed order:
The first argument is always the one in the nominative/ergative (I’ll get to this later). Following that, any other arguments are said in order of animacy, with the most animate coming first. If two arguments of the same class occur then they are interchangeable in order.
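The ordering rule can be sketched as a stable sort in Python. The class names follow the list in the noun class section; the function name, the (gloss, class) tuples and the English glosses are placeholders invented for illustration.

```python
from typing import List, Tuple

# Noun classes in the order listed earlier, most animate first.
NOUN_CLASSES = [
    "Humans", "Animals", "Spiritual", "Plants", "Deadly Things", "Foodstuff",
    "Tools", "Containers", "Textiles", "Rock", "Abstract",
    "Diminutives", "Augmentatives", "Duals",
]
ANIMACY_RANK = {name: rank for rank, name in enumerate(NOUN_CLASSES)}

Argument = Tuple[str, str]  # (gloss, noun class); hypothetical representation


def order_arguments(first: Argument, others: List[Argument]) -> List[Argument]:
    """Put the nominative/ergative argument first, then sort the rest by animacy."""
    # sorted() is stable, so arguments of the same class keep the order given,
    # which matches the rule that their relative order is free.
    ranked = sorted(others, key=lambda arg: ANIMACY_RANK[arg[1]])
    return [first] + ranked
```

Called with `("woman", "Humans")` as the first argument and `[("basket", "Containers"), ("dog", "Animals")]` as the rest, this puts the dog before the basket.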
A minimal noun phrase consists of a preposition and a noun or pronoun. Modifiers like adjectives follow the noun. Pronouns cannot take modifiers.
Morphology
So far, there are only two elements in the language which can be considered morphology: verb conjugation and prepositions.
Verb conjugation
Finite verbs take a simple suffix which agrees with the class of its first argument.
(not as pretty but I really didn’t feel like redoing it in LaTeX. Some of the sentences don’t make a lot of sense, it’s merely for showing the system. Words in {} are ones I haven’t defined well enough yet since I first need to flesh out the culture and region for this.)
Although separate pronouns exist for first and second person, these don’t have separate affixes but merely take the Human one.
Prepositions
You may have noticed in the picture above that the first two sentences feature a different preposition, mu, than the rest, which have o. These two prepositions show different cases: mu is the nominative, o the absolutive. There is a split between nom-acc and erg-abs based on animacy. Used with nouns, only the first two classes (humans and animals) trigger nom-acc alignment. If the subject/experiencer is a pronoun, however, the next three classes (Spiritual, Plants and Deadly Things) also trigger nom-acc. In any other case, erg-abs is used.
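The split can be written down as a small decision function. This is only a sketch: the function name and the string labels are assumptions, and it encodes nothing beyond the two cut-offs described above (the first two classes for nouns, the first five for pronouns).

```python
# Noun classes in the order listed earlier, most animate first.
NOUN_CLASSES = [
    "Humans", "Animals", "Spiritual", "Plants", "Deadly Things", "Foodstuff",
    "Tools", "Containers", "Textiles", "Rock", "Abstract",
    "Diminutives", "Augmentatives", "Duals",
]


def alignment(subject_class: str, is_pronoun: bool = False) -> str:
    """Return "nom-acc" or "erg-abs" for a subject of the given class."""
    rank = NOUN_CLASSES.index(subject_class)  # 0-based position in the hierarchy
    # Nouns: only Humans and Animals. Pronouns: also Spiritual, Plants
    # and Deadly Things, i.e. the first five classes.
    cutoff = 5 if is_pronoun else 2
    return "nom-acc" if rank < cutoff else "erg-abs"
```

So `alignment("Plants")` is `"erg-abs"`, while `alignment("Plants", is_pronoun=True)` is `"nom-acc"`.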
Each preposition can be broken down into three parts:
- The case — this first element can be a (accusative), o (absolutive), mu (nominative) or ni (ergative).
- The function — this is the part that corresponds to actual prepositions in western languages. Some examples are -rin- (dative), -võr- (outside of) or -pũl- (behind). With locative functions such as võr or pũl, -lu may be appended to indicate movement away from: -võrlu- (out), -pulu- (from behind)
- The article — the final element is a suffix which acts similarly to western articles or determiners. It can be -u (definite), -as (indefinite), -qõr (partitive) or -a (negative). -u is dropped after a rounded vowel (hence mu, o instead of muu, ou in the table above). -as, -a become -s, -∅ after a. Any velar is dropped before -qõr.
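The three-part breakdown can be sketched as a small Python function that glues case, function and article together and applies the article rules. Everything about the interface is an assumption made for illustration: the function name, the parameters, and especially `velars`, which the caller has to supply because the romanisation of the velar consonants is not listed here. The sketch only concatenates written forms; it does not re-apply the allophony rules (such as vowel unrounding) and it ignores the movement suffix -lu.

```python
from typing import Sequence

ROUNDED_VOWELS = ("u", "o")  # <õ> and <ũ> are unrounded, so they do not count


def build_preposition(case: str, function: str = "", article: str = "u",
                      velars: Sequence[str] = ()) -> str:
    """Concatenate case + function + article and apply the article rules."""
    stem = case + function
    if article == "u" and stem.endswith(ROUNDED_VOWELS):
        return stem                 # -u is dropped after a rounded vowel
    if article in ("as", "a") and stem.endswith("a"):
        return stem + article[1:]   # -as becomes -s, -a becomes nothing after a
    if article == "qõr":
        # Drop one stem-final velar, trying longer spellings first.
        for velar in sorted(velars, key=len, reverse=True):
            if velar and stem.endswith(velar):
                stem = stem[: -len(velar)]
                break
    return stem + article
```

With the case elements given above, `build_preposition("mu")` and `build_preposition("o")` return `"mu"` and `"o"`, `build_preposition("ni")` returns `"niu"`, and `build_preposition("a", article="as")` returns `"as"`. A purely mechanical combination such as `build_preposition("a", "rin", "as")` returns `"arinas"`.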
That’s it so far. Now I’ve gone and done exactly what I dislike about conlang topics in this forum, namely that they’re too information dense to follow. I’ll probably go over it and add examples in the next few days to lighten it up a bit.
Btw, if I’ve missed something I’ve probably done it deliberately because I haven’t actually worked it out. However, feel free to ask questions :)