in reply to Re: unglue words joined together by juncture rules
in thread unglue words joined together by juncture rules
These are the inputs to, and outputs from, my code (posted earlier in this thread) for the 3 examples you've supplied so far:
    {
        my %morphs = ( t => { d => 'd' }, );
        my @lex    = qw[ cowboy cow boy cat do dog ];
        my $input  = 'cowboycaddog';
        print "\n$input\n------------";
        deGlue{ print join '-', @_ } @lex, %morphs, $input;
    }

    {
        my %morphs = (
            aH => { o  => 'dh', as => 't' },
            as => { aH => '' },
        );
        my @lex   = qw[ krishnaH dhaavati naH dhaa namaH te ];
        my $input = 'Krishnodhaavatinamaste';
        print "\n$input\n----------";
        deGlue{ print join '-', @_ } @lex, %morphs, $input;
    }

    {
        my %morphs = (
            A => { a => 'a', A => 'a', a => 'A', A => 'A', '' => 'A', 'A' => '' },
            s => { H => '' },
        );
        my @lex   = qw[ ziva Shiva azvas zivA Azvas ];
        my $input = 'zivAzvaH';
        print "\n$input\n-------------";
        deGlue{ print join '-', @_ } @lex, %morphs, $input;
    }

    __END__
    c:\test>675520

    cowboycaddog
    ------------
    cowboy-cad-dog
    cow-boy-cad-dog

    Krishnodhaavatinamaste
    ----------
    Krishno-dhaavati-namas-te

    zivAzvaH
    -------------
    ziv-AzvaH
    ziv-AzvaH-AzvaH    ## I'm investigating this anomaly.
The main point of that code is that it automatically constructs the regexes used to parse the data from the supplied lexicon and morpheme rules.
It is incomplete and still leaves work to be done, but perhaps it is a starting point? The more examples it is tried with, the better the code generation can be tailored.
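For anyone following along without the earlier post, here is a minimal sketch of the general idea of generating regexes from a lexicon plus juncture rules. It is not the deGlue code itself. The names build_patterns and segmentations are hypothetical, and it assumes the rule layout is $morphs{ending}{replacement} = 'start of the next word', with an empty string meaning "no constraint". That reading fits the t => { d => 'd' } example but is only a guess for the others.

```perl
use strict;
use warnings;

# Hypothetical helper: build [ regex, lexicon word ] pairs.
# Assumed rule layout: $morphs->{ending}{replacement} = start of next word.
sub build_patterns {
    my ( $lex, $morphs ) = @_;
    my @patterns;
    for my $word (@$lex) {
        # The unmodified word always matches as itself.
        push @patterns, [ qr/\Q$word\E/, $word ];
        for my $ending ( sort keys %$morphs ) {
            next unless $word =~ /\A(.*)\Q$ending\E\z/s;
            my $stem = $1;
            for my $repl ( sort keys %{ $morphs->{$ending} } ) {
                my $next = $morphs->{$ending}{$repl};
                # Altered form, allowed only when followed by $next.
                push @patterns,
                    [ qr/\Q$stem$repl\E(?=\Q$next\E)/, $word ];
            }
        }
    }
    return @patterns;
}

# Hypothetical helper: return every way of splitting $input into
# surface forms, trying each pattern at the current position.
sub segmentations {
    my ( $input, $patterns, $pos ) = @_;
    $pos //= 0;
    return ( [] ) if $pos == length $input;
    my @results;
    for my $p (@$patterns) {
        my ($re) = @$p;
        pos($input) = $pos;
        next unless $input =~ /\G($re)/gc;
        my $piece = $1;
        next unless length $piece;    # never loop on an empty match
        push @results, map { [ $piece, @$_ ] }
            segmentations( $input, $patterns, $pos + length $piece );
    }
    return @results;
}

my %morphs = ( t => { d => 'd' } );
my @lex    = qw[ cowboy cow boy cat do dog ];
my @pats   = build_patterns( \@lex, \%morphs );

print join( '-', @$_ ), "\n" for segmentations( 'cowboycaddog', \@pats );
# prints:
# cowboy-cad-dog
# cow-boy-cad-dog
```

Like the output above, this reports the surface pieces as they appear in the input rather than the restored lexicon words. The second element of each pair is kept so that a later version could map 'cad' back to 'cat'. It is also case-sensitive, so 'Krishnodhaavatinamaste' would need lowercasing before it could match 'krishnaH'.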
Re^3: unglue words joined together by juncture rules
by pc2 (Beadle) on Apr 01, 2008 at 00:37 UTC
by BrowserUk (Patriarch) on Apr 01, 2008 at 00:51 UTC