Pattern Matching, left-to-right

TrekNoid has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Pattern Matching, left-to-right by antirice (Priest) on Aug 20, 2004 at 20:11 UTC
Perhaps something like... `my %map = ( F => "\|", S => "^", T => "&", R => "~", E => "\\" ); $msgu =~ s/\\([FSTRE])\\/$map{$1}/g;` [download] antirice The first rule of Perl club is - use Perl The ith rule of Perl club is - follow rule i - 1 for i > 1	[reply] [d/l]
•Re^2: Pattern Matching, left-to-right by merlyn (Sage) on Aug 20, 2004 at 20:28 UTC
This code blows up on the example. I thought of that too. {grin} -- Randal L. Schwartz, Perl hacker Be sure to read my standard disclaimer if this is a reply. update: Ooops. My bad. You got it right.	[reply]
Re^3: Pattern Matching, left-to-right by Aristotle (Chancellor) on Aug 20, 2004 at 20:44 UTC
I get `AB&F^CD` when I run it, which the OP says is what should happen. By his description of the task the code is also exactly what he's after. Which case do you have in mind that it fails for? Makeshifts last the longest.	[reply]
Re: Pattern Matching, left-to-right by ysth (Canon) on Aug 20, 2004 at 20:13 UTC
`%trans = qw/ F \| S ^ T & R ~ E \\ /; $string =~ s/\\([FSTRE])\\/$trans{$1}/g;` [download]	[reply] [d/l]
•Re^2: Pattern Matching, left-to-right by merlyn (Sage) on Aug 20, 2004 at 20:29 UTC
This code fails on the example. See my answer. -- Randal L. Schwartz, Perl hacker Be sure to read my standard disclaimer if this is a reply. update: Ooops. My bad. You got it right.	[reply]
•Re: Pattern Matching, left-to-right by merlyn (Sage) on Aug 20, 2004 at 20:05 UTC
Untested: `$string =~ s/(\$?:[FSTRE]\$+)/ my $x = $1; $x =~ tr#FSTRE\\#\|^&-\\#d; $x; /ge;` [download] -- Randal L. Schwartz, Perl hacker Be sure to read my standard disclaimer if this is a reply. update: Ooops... I saw the `\T\F\S\` in the source string, and figured that this was supposed to translate to `&^\|`, when in fact the example says it translates to `&F\|`. Ignore me. Other answers are better.	[reply] [d/l] [select]
Re: Pattern Matching, left-to-right by ambrus (Abbot) on Aug 20, 2004 at 22:10 UTC
The easy way is of course to use a DFA matcher. For example, with flex: `$ cat hl.l %option noyywrap %% \\F\\ putchar('\|'); \\S\\ putchar('^'); \\T\\ putchar('&'); \\R\\ putchar('~'); \\E\\ putchar('\\'); $ flex hl.l && gcc -lfl -O lex.yy.c -o hl $ ./hl AB\T\F\S\CD AB&F^CD $` [download] There must be lots of DFA regexp packages available for perl on CPAN.	[reply] [d/l]
Re: Pattern Matching, left-to-right by Aristotle (Chancellor) on Aug 21, 2004 at 15:01 UTC
Here's another way to do it, with distinct patterns instead of a hash map: `while( m/ \G .? (?= \\[FSTRE]\\ ) /gx ) { my $pos = pos; s{ \G \\F\\ }{ "\|" }egx or s{ \G \\S\\ }{ "^" }egx or s{ \G \\T\\ }{ "&" }egx or s{ \G \\R\\ }{ "~" }egx or s{ \G \\E\\ }{ "\\" }egx; pos() = $pos + 1; }` [download] In this code, `m//g` does the actual work of finding the control sequences in the string. The trick is to anchor all patterns with `\G`, which makes sure each pattern starts off where the last successful pattern stopped matching. That way you go through the string left-to-right. In case of `s///g`, this means `s///g` will usually be replacing exactly one single occurence, despite the `/g`. `\G \\S\\` will only match multiple times if the control sequences appears multiple times back-to-back as in the string `\S\\S\\S\`. Unfortunately, if a `s///g` matches at least once, it also resets the end-of-last-successful-match position when it eventually fails. Therefore, the next match would normally start over at the beginning of the string, leading to recursive replacement problems with `\E\S\` getting doubly translated to `^` instead of becoming just `\S\` as per the spec. That explains the manual bookkeeping with pos, which queries and sets that position. With `m//`, this is elegantly avoidable by use of the `/c` modifier which means "do not reset end-of-last-position on failure". All that said and done, for this problem, this solution is both much less efficient and harder to understand than the hash map based ones given by others. I post this merely as a trivial demonstration of `\G`, which really shines when the patterns you want to match in a coordinated fashion are very non-uniform, unlike the ones in this case. `m//gc` is how you build true parsers in Perl. Makeshifts last the longest.*	[reply] [d/l]
Re^2: Pattern Matching, left-to-right by ccn (Vicar) on Aug 22, 2004 at 08:52 UTC
fun with perl: `substr($_, pos() - 3, 3) =~ tr/FSTRE\\/\|^&~\\/d while m/\G.?\\[FSTRE]\\/gs;` [download] Note: s modifier is required in the regexp* Update: regexp was simplified `substr($_, pos() - 3, 3) =~ tr/FSTRE\\/\|^&~\\/d while m/\\[FSTRE]\\/g;` [download]	[reply] [d/l] [select]
Re^3: Pattern Matching, left-to-right by Aristotle (Chancellor) on Aug 22, 2004 at 09:07 UTC
Yep, although that wasn't really the point of my excercise. `:-)` I'd write that one a little differently: `my %xlat = ( F => '\|', S => '^', T => '&', R => '~' E => '\\', ); while( m/ \G .? \\([FSTRE])\\ /gsx ){ substr $_, pos() - 3, 3, $xlat{ $1 }; }` [download] Update:* same change as in parent node update applies. Makeshifts last the longest.	[reply] [d/l]
Re: Pattern Matching, left-to-right by chance (Beadle) on Aug 20, 2004 at 21:38 UTC
My bad. All I was shooting for was that if you want 'left to right' and you are having trouble getting something to work, I've often found that making explicitly 'left to right' (by putting it a while loop) can help me understand it. Once understanding has occurred, it can usually be re-written in a cooler way.	[reply]
Re^2: Pattern Matching, left-to-right by Aristotle (Chancellor) on Aug 21, 2004 at 13:31 UTC
I don't know how `1 while s///` is clearer than `s///g`. What's worse, it does not work correctly: `my %map = ( F => "\|", S => "^", T => "&", R => "~", E => "\\" ); $_ = "\\E\\S\\"; s/\\([FSTRE])\\/$map{$1}/g; print "s///g: [$_]\n"; $_ = "\\E\\S\\"; 1 while s/\\([FSTRE])\\/$map{$1}/; print "1 while s///: [$_]\n"; __END__ s///g: [\S\] 1 while s///: [^]` [download] `\E\S\` gets translated to `\S\`, as per the OP's spec, but the `1 while s///` solution then blithely goes on to translate that as well. Oops. (Note that there's a bunch of minor mistakes in your code. The last slash in your `s///` is missing and `(FSTRE)` should be `([FSTRE])`.) Makeshifts last the longest.	[reply] [d/l]
Re: Pattern Matching, left-to-right by ambrus (Abbot) on Aug 20, 2004 at 20:41 UTC
I offer a simple solution. I suppose that there are backslashes in the original string that are not part of an escape sequence you want to transform, this won't work at least not this way. `s/(\\.)\\/$1@/g; s/\\F@/\|/g; s/\\S@/^/g; s/\\T@/&/g; s/\\R@/-/g; s/\\E@/\\/g;` [download]	[reply] [d/l]
Re^2: Pattern Matching, left-to-right by antirice (Priest) on Aug 20, 2004 at 21:33 UTC
Don't know if it's legal but what about this situation? `$msgu = "\\F\\\\F@";` [download] antirice The first rule of Perl club is - use Perl The ith rule of Perl club is - follow rule i - 1 for i > 1	[reply] [d/l]
Re: Pattern Matching, left-to-right by Prior Nacre V (Hermit) on Aug 21, 2004 at 13:50 UTC
Tested successfully with the input you supplied: Update The last version did what I claimed but was still bad. Fair enough, thanks for spotting the problem Aristotle. The last version is behind the Read More .... The new version hopefully does what is wanted; here's my tests before and after changes: Update 2 But that version was the same solution that others had :-( Here's a third version that doesn't use regexes. TMTOWTDI! # hl7_conv use strict; use warnings; my $input = q(AB\T\F\S\CD\E\E\E\R\R\R); my $gen_out = ''; my $exp_out = q(AB&F^CD\\E\\R~R); my %convs = ( E => '\\', F => '\|', R => '~', S => '^', T => '&', ); $gen_out = hl7_replace($input); print "INPUT: $input\n"; print "GEN_OUT: $gen_out\n"; print "EXP_OUT: $exp_out\n"; print $0, ': ', $gen_out eq $exp_out ? 'SUCCESS!' : ' Z z . c8o, ', +"\n"; exit 0; sub hl7_replace { my $in = shift; my @input = split //, $in; my @output = (); for (my $i = 0; $i <= $#input; ++$i) { if ($input[$i] eq "\\" && ($i + 2) <= $#input && $convs{$input[$i + 1]} && $input[$i + 2] eq "\\") { push @output, $convs{$input[$i + 1]}; $i += 2; } else { push @output, $input[$i]; } } return join '', @output; } [download] Read more... (3 kB) Regards, PN5	[reply] [d/l] [select]
Re^2: Pattern Matching, left-to-right by Aristotle (Chancellor) on Aug 21, 2004 at 14:11 UTC
If you had tested it with more input, you'd have found it is wrong. You wrote: `foreach my $key (keys %convs) { $in =~ s/$convs{$key}/$key/g; }` [download] The OP wrote: And that almost works perfectly. Where it doesn't work is if there's two of these special characters grouped around a normal character... [...] And since I can get these in any order, there's no way to put them in an order that always works. Since you stored the patterns as values in a hash, the order in which they're substituted gets randomized such that with the input you tested with, the code happens to work. Makeshifts last the longest.	[reply] [d/l]
Re^3: Pattern Matching, left-to-right by Prior Nacre V (Hermit) on Aug 21, 2004 at 14:54 UTC
Thanks - fixed - see update. Regards, PN5	[reply]
Re^4: Pattern Matching, left-to-right by Aristotle (Chancellor) on Aug 21, 2004 at 15:09 UTC
Re^5: Pattern Matching, left-to-right by Prior Nacre V (Hermit) on Aug 21, 2004 at 15:29 UTC
Re: Pattern Matching, left-to-right by Anonymous Monk on Aug 21, 2004 at 12:18 UTC
Greetings, oh high ones... No drugs allowed in monastery	[reply]