in reply to unglue words joined together by juncture rules
salutations,
we shall give an actual example of the problem we are trying to solve, based on Sanskrit (the best language we can think of for this particular problem, because of its many euphony rules). consider the lexicon of wordforms (which could be in the form of a hash, with associated meanings), where the letter "A" (long vowel) is different from the letter "a" (short vowel):

    ziva  => Shiva (a name for god)
    azvas => horse
    zivA  => auspicious (f.)
    Azvas => equestrian

also note that the words are in isolated forms, i.e. without any juncture rules applied.
consider the following word: zivAzvaH
and the phonetic rules, which apply between words and/or at the end of the sentence:

    a|a => A
    A|a => A
    a|A => A
    A|  => A
    |A  => A
    A|A => A
    s|  => H

so, for example, ziva + azvas would give zivAzvaH. zivA + azvas would give zivAzvaH. zivA + Azvas would give zivAzvaH. thus, the possible segmentations of zivAzvaH would be:

    ziva-azvas #meaning: Shiva's horse
    zivA-azvas #meaning: auspicious' horse
    ziva-Azvas #meaning: Shiva's equestrian
    zivA-Azvas #meaning: auspicious' equestrian

of course, we only want to separate the possible words; whether it makes sense or not in the language is another story.
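to make the first example concrete, here is a minimal sketch in Python (not the code we are asking for, just an illustration of the expected behaviour). it assumes that the four vowel rules act between two words and that "s| => H" acts only at the end of the sentence; the rules "A|" and "|A" are left out because they do not change the letters. the names LEXICON, JUNCTION_RULES, FINAL_RULES, join and segmentations are made up for this sketch. words that simply touch without any rule applying are not searched for.

```python
# Lexicon of isolated word forms ("A" = long vowel, "a" = short vowel).
LEXICON = {
    "ziva": "Shiva (a name for god)",
    "azvas": "horse",
    "zivA": "auspicious (f.)",
    "Azvas": "equestrian",
}

# Assumed reading of the rules: (end of left word, start of right word) -> surface.
JUNCTION_RULES = {
    ("a", "a"): "A",
    ("A", "a"): "A",
    ("a", "A"): "A",
    ("A", "A"): "A",
}

# Assumed reading of "s| => H": it applies at the end of the sentence only.
FINAL_RULES = {"s": "H"}


def join(words):
    """Apply the juncture rules forwards: isolated words -> surface form."""
    out = words[0]
    for nxt in words[1:]:
        for (left, right), res in JUNCTION_RULES.items():
            if out.endswith(left) and nxt.startswith(right):
                out = out[: -len(left)] + res + nxt[len(right):]
                break
        else:
            out += nxt  # no rule matched: plain concatenation
    for tail, res in FINAL_RULES.items():
        if out.endswith(tail):
            return out[: -len(tail)] + res
    return out


def segmentations(surface, lexicon=LEXICON):
    """Undo the juncture rules: surface form -> every list of lexicon words."""
    results = []

    def walk(head, rest, words):
        # head: underlying letters already recovered for the current word
        # rest: surface letters that are still unanalysed
        whole = head + rest
        if whole in lexicon:  # last word, unchanged
            results.append(words + [whole])
        for tail, res in FINAL_RULES.items():  # last word, final rule undone
            if rest.endswith(res):
                cand = head + rest[: -len(res)] + tail
                if cand in lexicon:
                    results.append(words + [cand])
        for i in range(len(rest)):  # try a word boundary at every position
            for (left, right), res in JUNCTION_RULES.items():
                if rest.startswith(res, i):
                    word = head + rest[:i] + left
                    if word in lexicon:
                        walk(right, rest[i + len(res):], words + [word])

    walk("", surface, [])
    return sorted(results)


for words in segmentations("zivAzvaH"):
    print("-".join(words), "->", join(words))
```

the search tries a boundary wherever the right-hand side of a rule appears in the surface form, and keeps a candidate only if the word it closes is in the lexicon, so the lexicon lookup prunes most of the branches early. join is included only so that every segmentation can be checked by gluing it back together.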
there is yet another example of something we want it to be able to do (NOTE: this second example may be left as something to work on later): consider a language (which is actually the one we want to experiment with) with a word "abaca", and which has the following rules for joining words:

    a + a = A.
    last consonant of first word + first consonant of last word swap.

exemplifying:

    abaca + abaca =
    abaCa + aBaca =
    abaBa + aCaba (consonants swap)
    ababAcaba (final form)

we would like to analyse "ababAcaba" and get:

    abaca-abaca

this second situation seems much more complicated, but it is not a priority; maybe we should first concentrate on the first one.