Notation of linguistic rules

Linguistic rules have the following format:

     A > B / C

where A, B and C mean:

     A: condition
     B: consequence
     C: context

Thus, the rule can be read as "if A in context C, then B" or "A becomes B in the context C".


About the sounds

Before you can describe the evolution of a language, you must define the sounds that you want to use. A sound definition consists of a character that represents the sound and a list of distinctive attributes that characterize it:

     a := [ voc, cent, bas ]

This line defines the character 'a' as a vowel (voc) that is characterized by the attributes "cent" (central, i.e. in the middle of the vowel triangle) and "bas" (low, in contrast with i or u, which are high vowels).

Characters that have been defined as sounds are called literals. Capital letters cannot be defined as literals.


About the attributes

In order to define the sounds, you must first define the attributes that you want to use to describe them. Attributes are organized into groups. The line

     type := { voc, cons, svoc }

defines the group "type", which can have three values: a sound can be either a vowel (voc), a consonant (cons) or a semi-vowel (svoc).

You can also define grammatical attributes, for example

     type := { verb, subst, adj, adv }

Grammatical attributes must have priority 1 (see "Priority") and can only be combined with the predefined literal @.


Function of the groups

When you define a literal, only one attribute can be taken from each group. In our example, a sound can be EITHER a vowel OR a consonant OR a semi-vowel (but it cannot be a vowel and a consonant at the same time, for example).
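
The one-attribute-per-group rule can be sketched in Python as follows. This is only an illustration: the names 'GROUPS' and 'check_literal' are assumptions made for the example and are not part of the program. The group contents are taken from the examples in this document.

```python
# Hypothetical model of attribute groups; the names are illustrative only.
GROUPS = {
    "type": {"voc", "cons", "svoc"},
    "articulation": {"ant", "cent", "post"},
    "aperture1": {"haut", "moyen", "bas"},
}

def check_literal(attributes):
    """Return True if at most one attribute is taken from each group."""
    chosen = set(attributes)
    for members in GROUPS.values():
        if len(members & chosen) > 1:
            return False
    return True

print(check_literal(["voc", "cent", "bas"]))  # a := [ voc, cent, bas ]
print(check_literal(["voc", "cons"]))         # vowel and consonant at once
```

The first call is accepted because each attribute comes from a different group; the second is rejected because "voc" and "cons" both belong to the group "type".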


Priority

For every group of attributes you must indicate its priority. When you define several groups of attributes, there are groups whose attributes cannot be combined with those from other groups. For example, if you have defined the following groups:

     2 : type := { voc, cons, svoc }
     3 : articulation := { ant, cent, post } ; attributes valid for vowels
     3 : articulation := { bil, ldent, dent, .. } ; attributes valid for consonants

The attributes "ant", "cent", "post" only make sense if they are combined with vowels (i.e. sounds that are also characterized by the attribute "voc"). They don't make sense, however, if combined with a consonant, because the articulation of consonants is different and must therefore be described by other attributes ("bil", "ldent", "dent" etc.).

The priority is important during the calculation: whenever you modify an attribute, all attributes that have a lower priority (i.e. a higher value) are invalidated. In our example: if you transform a vowel into a consonant, the attributes "ant", "cent", "post" will no longer be valid for the new sound.

This mechanism, however, does not apply to groups that have priority 1, which is reserved for grammatical attributes (see "About the attributes").
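
The invalidation step described above might look like this in Python. It is a sketch under stated assumptions: the names 'PRIORITY', 'GROUP_OF' and 'set_attribute' are hypothetical, only the vowel variant of the "articulation" group is modelled, and the reading that a change in a priority-1 group invalidates nothing is an interpretation of the text, not confirmed behaviour.

```python
# Hypothetical sketch; group names and priorities follow the example above.
PRIORITY = {"type": 2, "articulation": 3}
GROUP_OF = {
    "voc": "type", "cons": "type", "svoc": "type",
    "ant": "articulation", "cent": "articulation", "post": "articulation",
}

def set_attribute(sound, attribute):
    """Set `attribute` and drop the attributes of lower-priority groups."""
    group = GROUP_OF[attribute]
    level = PRIORITY[group]
    kept = set()
    for other in sound:
        other_group = GROUP_OF[other]
        if other_group == group:
            continue  # replaced by the new value of the same group
        if level != 1 and PRIORITY[other_group] > level:
            continue  # lower priority (higher number): invalidated
        kept.add(other)
    kept.add(attribute)
    return kept

print(sorted(set_attribute({"voc", "cent"}, "cons")))  # ['cons']
print(sorted(set_attribute({"voc", "cent"}, "post")))  # ['post', 'voc']
```

Changing the type from "voc" to "cons" removes "cent", while changing the articulation leaves the higher-priority attribute "voc" untouched.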


About the condition (A)

A condition describes a sequence of sounds. If the word contains this sequence of sounds, the consequence is applied to the word. The sounds can be described by concatenating two different types of characters: literals (for the definition of literals see "About the sounds") and symbols.


What are symbols?

Symbols are characters that represent a category of sounds. The category is described by a list of attributes that every sound belonging to it must or must not have. Like literals, symbols must therefore be defined by a list of attributes. The only difference is that the attributes in a symbol definition can be preceded by '+' or '-'. If an attribute is preceded by '+', the sound must have this attribute. If it is preceded by '-', the sound must not have it. For example, the line

     C := [+cons]

defines the character 'C' as a symbol that represents any type of consonant (i.e. any type of sound that contains the attribute "cons"). You could also define a symbol

     O := [+cons, +ocl]

which would represent any occlusive consonant (e.g. p, k, t, b, g, d). In the following example

     P := [+voc, -post, -cent]

P represents all the vowels (voc) that are neither "post" nor "cent". The character '?' is predefined as a symbol in the program and means "any type of sound". Only capital letters can be defined as symbols.
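
How a symbol definition selects sounds can be sketched in Python. The sound inventory below is an assumption made for illustration (only 'a' is defined in this document), and the names 'SOUNDS', 'parse_symbol' and 'matches' are hypothetical.

```python
# Hypothetical sound inventory; only 'a' is taken from the text above.
SOUNDS = {
    "a": {"voc", "cent", "bas"},
    "i": {"voc", "ant", "haut"},
    "u": {"voc", "post", "haut"},
    "t": {"cons", "ocl"},
}

def parse_symbol(definition):
    """Split '[+voc, -post]' into (required, forbidden) attribute sets."""
    required, forbidden = set(), set()
    for item in definition.strip("[] ").split(","):
        item = item.strip()
        if item.startswith("+"):
            required.add(item[1:])
        elif item.startswith("-"):
            forbidden.add(item[1:])
        else:
            raise ValueError("attribute needs '+' or '-': " + item)
    return required, forbidden

def matches(sound, definition):
    """True if the sound has all '+' attributes and none of the '-' ones."""
    required, forbidden = parse_symbol(definition)
    attrs = SOUNDS[sound]
    return required <= attrs and not (forbidden & attrs)

P = "[+voc, -post, -cent]"
print([s for s in SOUNDS if matches(s, P)])  # ['i']
```

With this inventory, only 'i' is a vowel that is neither "post" nor "cent".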


Use of '[]' (in the condition)

When you use a literal or a symbol in the condition, it means that the sound must contain all the attributes that you specified in the definition:

     a    the sound must be "voc", "cent", "bas"
     C    the sound must be "cons"

As we've seen, a symbol can also specify in its definition the attributes that the sound must not contain:

     P    the sound must be "voc" and must be neither "post" nor "cent"

You can add more attributes that the sound must or must not contain by using '[]' after a literal or a symbol. For example,

     C[+ocl, -son]

means that the sound must be a consonant (definition of 'C') that in addition must be "ocl" (occlusive) and must not be "son" (voiced).

     a[+ton, -long]

means that the sound must be the vowel a (definition of 'a') that must be "ton" (stressed) and must not be "long".


Frontiers

To express frontiers of morphemes, syllables and words in the condition, you can use the following characters, which must appear in this order:

     |    frontier of syllable
     +    frontier of morpheme
     #    frontier of word

You can also have the following combinations: |+#, +#, |# (combinations like "#|" or "+|", however, are incorrect because the order of the characters is not respected).

Every frontier can be negated by "!": "V!|" means that there must not be a syllable frontier after the vowel. "V!|!+" means that there must be NEITHER a syllable frontier NOR a morpheme frontier.

Important: These frontiers are stored with the preceding sound. In a word like "ca|sa", the frontier "|" is stored with the sound "a". In order to have a frontier before "c", we introduce a "dummy sound" that we express by "?". Internally, "ca|sa" is stored as "?#ca|sa#".
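
The storage scheme just described can be sketched in Python. The function name 'store' and the representation as (sound, frontiers) pairs are assumptions made for illustration; the sketch also assumes that every sound is written as a single character.

```python
FRONTIERS = "|+#"  # syllable, morpheme and word frontier characters

def store(word):
    """Return (sound, frontiers) pairs for a word such as 'ca|sa'."""
    text = "?#" + word + "#"  # the dummy sound '?' carries the initial frontier
    pairs = []
    for ch in text:
        if ch in FRONTIERS:
            sound, marks = pairs[-1]
            pairs[-1] = (sound, marks + ch)  # attach to the preceding sound
        else:
            pairs.append((ch, ""))
    return pairs

print(store("ca|sa"))
# [('?', '#'), ('c', ''), ('a', '|'), ('s', ''), ('a', '#')]
```

Each frontier ends up attached to the sound before it, so the word-initial frontier belongs to the dummy sound.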


Jumpers

Sometimes, the sounds you want to modify are not adjacent. You can use '..' or '*' to "jump" over the sounds in between:

     ..    none, one or several sounds; syllable frontiers cannot be crossed
     *     none, one or several sounds; syllable frontiers can be crossed

In the condition, '..' and '*' can only stand between two sounds (each represented by a literal or a symbol). In the consequence, jumpers can stand anywhere, including at the beginning or at the end, and a jumper can even stand directly before or after another jumper (which is not possible in the condition). In the context, the rules for jumper positions are the same as for the condition.
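
The difference between the two jumpers can be illustrated by translating a pattern into a regular expression. This is only a sketch: the function name 'to_regex' and the idea of matching against the word written with its frontier characters are assumptions made for illustration, not a description of how the program matches rules. Only literals are handled here, not symbols.

```python
import re

def to_regex(tokens):
    """Translate literals and jumpers into a compiled regular expression."""
    parts = []
    for tok in tokens:
        if tok == "..":
            parts.append(r"[^|]*")  # any characters except a syllable frontier
        elif tok == "*":
            parts.append(r".*")     # any characters, frontiers included
        else:
            parts.append(re.escape(tok))
    return re.compile("".join(parts))

print(bool(to_regex(["m", "..", "a"]).search("men|sa")))  # False
print(bool(to_regex(["m", "*", "a"]).search("men|sa")))   # True
print(bool(to_regex(["m", "..", "n"]).search("men|sa")))  # True
```

In "men|sa", 'a' lies behind a syllable frontier, so it can be reached from 'm' with '*' but not with '..'.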


About the consequence (B)

In the consequence (B), you can have jumpers, literals and symbols in any order. The consequence describes the sequence of sounds that is applied to the word if the condition is true. Since sounds can disappear, the consequence can also be empty; it must then contain the empty sound "&". A rule can also have several alternative consequences, which must then be enclosed in "{" ... "}". There are therefore two basic types of consequence:

     B                      the condition has one consequence; B can be a sequence of sounds or '0'
     { B1, B2, ..., Bn }    the condition has n consequences; one of the Bi can also be '0'

In the second case the calculation (evolution) bifurcates and an evolution tree is calculated.
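
The bifurcation can be sketched in Python with plain string replacement standing in for real rules. Everything here is an assumption made for illustration: the function name 'evolve', the rule format, and the second rule with its two outcomes are hypothetical. The sketch returns only the final forms (the leaves of the tree), not the tree itself.

```python
def evolve(word, rules):
    """Apply rules in order; each rule maps a substring to one or more outcomes."""
    forms = [word]
    for old, outcomes in rules:
        next_forms = []
        for form in forms:
            if old in form:
                # one branch per alternative consequence
                next_forms.extend(form.replace(old, new) for new in outcomes)
            else:
                next_forms.append(form)  # the rule does not apply
        forms = next_forms
    return forms

# Hypothetical rules: "ns" > "s", and "t" > { "d", empty }
rules = [("ns", ["s"]), ("t", ["d", ""])]
print(evolve("mensa", rules))  # ['mesa']
print(evolve("muta", rules))   # ['muda', 'mua']
```

A rule with a single consequence keeps the number of forms unchanged, while a rule with n consequences multiplies every form it applies to by n.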


About '+' / '-' in the consequence

When an attribute in the consequence is preceded by '+', the attribute is set. If it is preceded by '-', it is deleted. If you have defined the following attributes and sounds

     articulation := { ant, cent, post }
     aperture1 := { haut, moyen, bas } ; high, medium, low
     aperture2 := { ouv, ferm } ; open, closed
     a := [ voc, cent, bas ]
     à := [ voc, cent, bas, ouv ]
     á := [ voc, cent, bas, ferm ]

and have a rule that says

     V[+cent, +bas, +ouv] > V[+ferm]

the sound 'à' is transformed into 'á'. However, if you had the rule

     V[+cent, +bas, +ouv] > V[-ouv]

the attribute "ouv" would be deleted (which also means that the group aperture2 has no value set). Therefore, the second rule will transform the sound 'à' into 'a'.
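
The effect of '+' and '-' on the three literals defined above can be sketched in Python. The names 'LITERALS', 'GROUP_OF' and 'apply_change' are hypothetical, and looking up the result by exact comparison of attribute sets is an assumption about how the resulting sound is identified.

```python
LITERALS = {
    "a": frozenset({"voc", "cent", "bas"}),
    "à": frozenset({"voc", "cent", "bas", "ouv"}),
    "á": frozenset({"voc", "cent", "bas", "ferm"}),
}
GROUP_OF = {"ouv": "aperture2", "ferm": "aperture2"}  # only the group needed here

def apply_change(sound, changes):
    """Apply '+attr' / '-attr' changes and return the matching literal, if any."""
    attrs = set(LITERALS[sound])
    for change in changes:
        sign, name = change[0], change[1:]
        if sign == "+":
            group = GROUP_OF.get(name)
            if group is not None:
                # setting a value replaces the previous value of the same group
                attrs = {a for a in attrs if GROUP_OF.get(a) != group}
            attrs.add(name)
        elif sign == "-":
            attrs.discard(name)
        else:
            raise ValueError("change needs '+' or '-': " + change)
    for literal, definition in LITERALS.items():
        if definition == attrs:
            return literal
    return None

print(apply_change("à", ["+ferm"]))  # á
print(apply_change("à", ["-ouv"]))   # a
```

Setting "ferm" replaces "ouv" because both belong to the group aperture2, whereas deleting "ouv" leaves that group without a value.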


Symbols as a temporary memory for sounds

When you use a symbol in the consequence, it refers to the sound matched by the same symbol in the condition. If you want the program to add to the consequence the same sound that you referred to in the condition, you can simply use the same symbol in the consequence, even without '[]'. For example, you can write a rule like

     n C[+alv] > C

which transforms "mensa" into "mesa". However, you can use '[]' if you want to modify the sound. For example, the rule

transforms "muta" into "muda". The characters '+' / '-' have the meaning discussed in the preceding section.


Markers

Markers are numbers that can be used to "mark" sounds that are represented by literals, symbols or jumpers. Markers stand at the following positions:

     Symbol:     between the symbol and '[..]', e.g. V1[+ton]
     Literal:    the same, e.g. a1[+ton]
     Jumper:     after the jumper, e.g. ..1, *1

The same marker can only be used once in a condition. In a consequence it can be used several times, in order to add the same sound to the word several times:

     V1[+ant, +ouv, +ton] > V1[+ferm, +atone] V1

(V1 is added twice in the consequence)

Markers also allow you to change the order of sounds. The rule

     C1[+dent, +son] C2[+liq] > C2 C1

for example, transforms "espadla" into "espalda".
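
The reordering rule above can be sketched in Python. The attribute sets assigned to the sounds of "espadla" are assumptions made for illustration, and the function name 'metathesis' is hypothetical. The dictionary 'bound' plays the role of the markers 1 and 2.

```python
SOUNDS = {  # hypothetical attribute sets for the sounds of "espadla"
    "e": {"voc"}, "a": {"voc"},
    "s": {"cons", "alv"}, "p": {"cons", "bil", "ocl"},
    "d": {"cons", "dent", "son"}, "l": {"cons", "liq"},
}

def metathesis(word):
    """C1[+dent, +son] C2[+liq] > C2 C1, applied at the first match."""
    for i in range(len(word) - 1):
        first, second = SOUNDS[word[i]], SOUNDS[word[i + 1]]
        if {"cons", "dent", "son"} <= first and {"cons", "liq"} <= second:
            bound = {1: word[i], 2: word[i + 1]}  # markers remember the sounds
            return word[:i] + bound[2] + bound[1] + word[i + 2:]
    return word  # the condition is not met: the word is unchanged

print(metathesis("espadla"))  # espalda
```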


Comparison between two sounds

(WILL PROBABLY NOT BE IMPLEMENTED)

If you want to express that two sounds must be identical or different from each other, you can use apostrophes ( ' or " ).

     V' V' > V' or: V" V" > V"

means that there must be two identical vowels, which are transformed into one (e.g. "preendere" > "prendere").

     V' V" > ... or: V" V' > ...

means that the two vowels must be different.

Like the numbers, the apostrophes mark the sound represented by the symbol so that you can access it in the consequence.


Partial comparison

(WILL PROBABLY NOT BE IMPLEMENTED)

It is rare for two sounds to be exactly identical. In the word "preéndere", for example, there are indeed two vowels "e", but they are not exactly the same because the first one is unstressed and the second one is stressed. To check whether two sounds are partially identical, you can use the apostrophes in combination with '[]'. The execution of the rule

     V' V'[+atone] > V'

passes through the following steps:

1) The first V' is identified with 'e', i.e. a sound which can be defined as [ voc, ant, moyen, atone ]

2) The second V' is identified with 'é', i.e. a sound which can be defined as [ voc, ant, moyen, ton ]. The program first verifies the condition(s) expressed by the symbol: since V stands for "voc", 'é' must be a vowel. Then, the set of attributes that defines 'é' - [ voc, ant, moyen, ton ] - is modified: "ton" is replaced by "atone". The new set - [ voc, ant, moyen, atone ] - is compared to the set stored for the first V'. Since they are the same, the condition of the rule is true.
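
The two steps above can be sketched in Python. Since the feature is described as probably not going to be implemented, this is purely illustrative: the names 'REPLACES' and 'partially_equal' are hypothetical, and treating "ton" and "atone" as values of one group is an assumption.

```python
REPLACES = {"atone": "ton", "ton": "atone"}  # assumed: both belong to one group

def partially_equal(first, second, extra):
    """Compare `first` with `second` after applying the '+attr' items in `extra`."""
    adjusted = set(second)
    for item in extra:
        name = item.lstrip("+")
        adjusted.discard(REPLACES.get(name))  # drop the value being replaced
        adjusted.add(name)
    return adjusted == set(first)

e_unstressed = {"voc", "ant", "moyen", "atone"}
e_stressed = {"voc", "ant", "moyen", "ton"}
print(partially_equal(e_unstressed, e_stressed, ["+atone"]))  # True
print(partially_equal(e_unstressed, e_stressed, []))          # False
```

Without the '[+atone]' adjustment the two vowels differ in stress, so a plain comparison fails.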


About the context (C)

The context has the format

     CL -- CR

where CL, CR mean:

     CL = left context      CR = right context

The context forms part of the condition: A > B / CL -- CR means "A becomes B if it is preceded by CL and followed by CR". The only difference is that the sounds in the condition can be modified, while the sounds in the context are invariable.

Like the condition, CL and CR describe sequences of sounds, and the same rules must be observed for jumpers (i.e. they can only stand between sounds and cannot follow each other). CL and CR can be empty: in this case, they must contain the empty sound "&".

A further difference between the condition and the context is that in the context you can use curly braces ("curls"), i.e. '{}'. When a sound is followed by a curl, it means that it can be followed by any one of the sounds that stand inside '{}':

     x { y1, y2, ..., yn }

means that the sound x can be followed either by y1, y2, ... or yn. Curls CANNOT stand inside of curls: therefore a context like

     x { y1 { z1, z2 }, y2 }

is incorrect!
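
Expanding a context with curls into its alternatives, and rejecting nested curls, can be sketched in Python. The function name 'expand' and the representation of the result as space-separated strings are assumptions made for illustration.

```python
import itertools
import re

def expand(context):
    """Expand 'x { y1, y2 }' into ['x y1', 'x y2']; reject nested curls."""
    depth = 0
    for ch in context:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if depth > 1:
            raise ValueError("curls cannot stand inside curls")
    # re.split with a capturing group: odd indices hold the content of a curl
    parts = re.split(r"\{([^{}]*)\}", context)
    choices = []
    for index, part in enumerate(parts):
        if index % 2:
            choices.append([alt.strip() for alt in part.split(",")])
        elif part.strip():
            choices.append([part.strip()])
    return [" ".join(combo) for combo in itertools.product(*choices)]

print(expand("x { y1, y2 }"))  # ['x y1', 'x y2']
```

Calling 'expand' on the incorrect context shown above raises a ValueError.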


Last updated: October 2000