Re: Pattern Matching, left-to-right
by antirice (Priest) on Aug 20, 2004 at 20:11 UTC
|
my %map = (
F => "|",
S => "^",
T => "&",
R => "~",
E => "\\"
);
$msgu =~ s/\\([FSTRE])\\/$map{$1}/g;
antirice The first rule of Perl club is - use Perl The ith rule of Perl club is - follow rule i - 1 for i > 1
| [reply] [d/l] |
|
|
This code blows up on the example. I thought of that too. {grin}
update: Ooops. My bad. You got it right.
| [reply] |
|
|
| [reply] |
Re: Pattern Matching, left-to-right
by ysth (Canon) on Aug 20, 2004 at 20:13 UTC
|
%trans = qw/ F | S ^ T & R ~ E \\ /;
$string =~ s/\\([FSTRE])\\/$trans{$1}/g;
| [reply] [d/l] |
|
|
This code fails on the example. See my answer.
update: Ooops. My bad. You got it right.
| [reply] |
•Re: Pattern Matching, left-to-right
by merlyn (Sage) on Aug 20, 2004 at 20:05 UTC
|
$string =~ s/(\\(?:[FSTRE]\\)+)/
my $x = $1;
$x =~ tr#FSTRE\\#|^&-\\#d;
$x;
/ge;
update:
Ooops... I saw the \T\F\S\ in the source string, and figured that this was supposed to translate to &^|, when in fact the example says it translates to &F|. Ignore me. Other answers are better. | [reply] [d/l] [select] |
Re: Pattern Matching, left-to-right
by ambrus (Abbot) on Aug 20, 2004 at 22:10 UTC
|
$ cat hl.l
%option noyywrap
%%
\\F\\ putchar('|');
\\S\\ putchar('^');
\\T\\ putchar('&');
\\R\\ putchar('~');
\\E\\ putchar('\\');
$ flex hl.l && gcc -lfl -O lex.yy.c -o hl
$ ./hl
AB\T\F\S\CD
AB&F^CD
$
There must be lots of DFA regexp packages available for perl on CPAN.
| [reply] [d/l] |
Re: Pattern Matching, left-to-right
by Aristotle (Chancellor) on Aug 21, 2004 at 15:01 UTC
|
Here's another way to do it, with distinct patterns instead of a hash map:
while( m/ \G .*? (?= \\[FSTRE]\\ ) /gx ) {
my $pos = pos;
s{ \G \\F\\ }{ "|" }egx
or s{ \G \\S\\ }{ "^" }egx
or s{ \G \\T\\ }{ "&" }egx
or s{ \G \\R\\ }{ "~" }egx
or s{ \G \\E\\ }{ "\\" }egx;
pos() = $pos + 1;
}
In this code, m//g does the actual work of finding the control sequences in the string. The trick is to anchor all patterns with \G, which makes sure each pattern starts off where the last successful pattern stopped matching. That way you go through the string left-to-right.
In case of s///g, this means s///g will usually be replacing exactly one single occurence, despite the /g. \G \\S\\ will only match multiple times if the control sequences appears multiple times back-to-back as in the string \S\\S\\S\.
Unfortunately, if a s///g matches at least once, it also resets the end-of-last-successful-match position when it eventually fails. Therefore, the next match would normally start over at the beginning of the string, leading to recursive replacement problems with \E\S\ getting doubly translated to ^ instead of becoming just \S\ as per the spec. That explains the manual bookkeeping with pos, which queries and sets that position. With m//, this is elegantly avoidable by use of the /c modifier which means "do not reset end-of-last-position on failure".
All that said and done, for this problem, this solution is both much less efficient and harder to understand than the hash map based ones given by others. I post this merely as a trivial demonstration of \G, which really shines when the patterns you want to match in a coordinated fashion are very non-uniform, unlike the ones in this case. m//gc is how you build true parsers in Perl.
Makeshifts last the longest.
| [reply] [d/l] |
|
|
substr($_, pos() - 3, 3) =~ tr/FSTRE\\/|^&~\\/d
while m/\G.*?\\[FSTRE]\\/gs;
Note: s modifier is required in the regexp
Update: regexp was simplified
substr($_, pos() - 3, 3) =~ tr/FSTRE\\/|^&~\\/d
while m/\\[FSTRE]\\/g;
| [reply] [d/l] [select] |
|
|
my %xlat = (
F => '|',
S => '^',
T => '&',
R => '~'
E => '\\',
);
while( m/ \G .*? \\([FSTRE])\\ /gsx ){
substr $_, pos() - 3, 3, $xlat{ $1 };
}
Update: same change as in parent node update applies.
Makeshifts last the longest.
| [reply] [d/l] |
Re: Pattern Matching, left-to-right
by chance (Beadle) on Aug 20, 2004 at 21:38 UTC
|
My bad.
All I was shooting for was that if you want 'left to right' and you are having trouble getting something to work, I've often found that making explicitly 'left to right' (by putting it a while loop) can help me understand it. Once understanding has occurred, it can usually be re-written in a cooler way. | [reply] |
|
|
my %map = (
F => "|",
S => "^",
T => "&",
R => "~",
E => "\\"
);
$_ = "\\E\\S\\";
s/\\([FSTRE])\\/$map{$1}/g;
print "s///g: [$_]\n";
$_ = "\\E\\S\\";
1 while s/\\([FSTRE])\\/$map{$1}/;
print "1 while s///: [$_]\n";
__END__
s///g: [\S\]
1 while s///: [^]
\E\S\ gets translated to \S\, as per the OP's spec, but the 1 while s/// solution then blithely goes on to translate that as well. Oops.
(Note that there's a bunch of minor mistakes in your code. The last slash in your s/// is missing and (FSTRE) should be ([FSTRE]).)
Makeshifts last the longest.
| [reply] [d/l] |
Re: Pattern Matching, left-to-right
by ambrus (Abbot) on Aug 20, 2004 at 20:41 UTC
|
I offer a simple solution. I suppose that there are backslashes in the original string that are not part of an escape sequence you want to transform, this won't work at least not this way.
s/(\\.)\\/$1@/g;
s/\\F@/|/g; s/\\S@/^/g; s/\\T@/&/g; s/\\R@/-/g; s/\\E@/\\/g;
| [reply] [d/l] |
|
|
$msgu = "\\F\\\\F@";
antirice The first rule of Perl club is - use Perl The ith rule of Perl club is - follow rule i - 1 for i > 1
| [reply] [d/l] |
Re: Pattern Matching, left-to-right
by Prior Nacre V (Hermit) on Aug 21, 2004 at 13:50 UTC
|
Tested successfully with the input you supplied:
Update
The last version did what I claimed but was still bad. Fair enough, thanks for spotting the problem Aristotle. The last version is behind the Read More .... The new version hopefully does what is wanted; here's my tests before and after changes:
Update 2
But that version was the same solution that others had :-(
Here's a third version that doesn't use regexes. TMTOWTDI!
# hl7_conv
use strict;
use warnings;
my $input = q(AB\T\F\S\CD\E\E\E\R\R\R);
my $gen_out = '';
my $exp_out = q(AB&F^CD\\E\\R~R);
my %convs = (
E => '\\',
F => '|',
R => '~',
S => '^',
T => '&',
);
$gen_out = hl7_replace($input);
print "INPUT: $input\n";
print "GEN_OUT: $gen_out\n";
print "EXP_OUT: $exp_out\n";
print $0, ': ', $gen_out eq $exp_out ? 'SUCCESS!' : ' Z z . c8o, ',
+"\n";
exit 0;
sub hl7_replace
{
my $in = shift;
my @input = split //, $in;
my @output = ();
for (my $i = 0; $i <= $#input; ++$i)
{
if ($input[$i] eq "\\" && ($i + 2) <= $#input
&& $convs{$input[$i + 1]} && $input[$i + 2] eq "\\")
{
push @output, $convs{$input[$i + 1]};
$i += 2;
}
else
{
push @output, $input[$i];
}
}
return join '', @output;
}
| [reply] [d/l] [select] |
|
|
If you had tested it with more input, you'd have found it is wrong.
You wrote:
foreach my $key (keys %convs)
{
$in =~ s/$convs{$key}/$key/g;
}
The OP wrote:
And that *almost* works perfectly. Where it doesn't work is if there's two of these special characters grouped around a normal character... [...] And since I can get these in any order, there's no way to put them in an order that always works.
Since you stored the patterns as values in a hash, the order in which they're substituted gets randomized such that with the input you tested with, the code happens to work.
Makeshifts last the longest.
| [reply] [d/l] |
|
|
| [reply] |
|
|
|
|
Re: Pattern Matching, left-to-right
by Anonymous Monk on Aug 21, 2004 at 12:18 UTC
|
Greetings, oh high ones...
No drugs allowed in monastery
| [reply] |