Re: perl pattern match for end of string using STDIN and chomp
by ikegami (Patriarch) on Oct 08, 2009 at 17:49 UTC
You are mistaken. Your first program works fine for the input you specified:
$ perl 800041a.pl
get some string: /xxxx/yyyy/ZZZ_xxxx.CCC
ZZZ
CCC
/xxxx/yyyy/ZZZ_xxxx
Note that you probably want to use \Q$type\E in patterns, not just $type, since $type doesn't contain a regex pattern but text to match literally.
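A minimal sketch of the \Q...\E advice, assuming a hypothetical helper named strip_suffix (not from the original post) that removes a literal ".$type" suffix:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: strip a literal ".$type" suffix from $string.
# Returns the part before the suffix, or undef if the suffix is absent.
sub strip_suffix {
    my ($string, $type) = @_;
    # \Q...\E escapes regex metacharacters in $type so it matches literally.
    return $string =~ m/(.+)\.\Q$type\E$/ ? $1 : undef;
}

print strip_suffix('/xxxx/yyyy/ZZZ_xxxx.CCC', 'CCC'), "\n";   # /xxxx/yyyy/ZZZ_xxxx

# Without \Q...\E the "++" in a type such as "c++" would be parsed as
# regex quantifiers rather than as literal text.
print strip_suffix('hello.c++', 'c++'), "\n";                  # hello
```

For a plain alphanumeric type like CCC the escaping changes nothing, so it is cheap insurance rather than a fix for the reported problem.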
Re: perl pattern match for end of string using STDIN and chomp
by kennethk (Abbot) on Oct 08, 2009 at 17:37 UTC
Running the first example code you posted and entering your string does not yield your posted output; rather, it yields the correct output from your second code sample. Have you tested your posted code as written?
#!/usr/bin/perl
print "get some string: ";
chomp($string = <STDIN>);
#$string = $ARGV[0];
#chomp($string);
$string =~ m/\/([[:alnum:]]+)_.*\.(.+)$/;
print "$1\n";
$type = $2;
print "$type\n";
$string =~ m/(.+)\.${type}$/;
#$string =~ m/(.+)\.${type}\Z/;
#$string =~ m/(.+)\.${type}\z/;
#$string =~ m/(.+)\.${type}/;
print "$1\n";
exit 0;
~/sandbox$ perl junk.pl
get some string: /xxxx/yyyy/ZZZ_xxxx.CCC
ZZZ
CCC
/xxxx/yyyy/ZZZ_xxxx
|
|
Perl 5.8.0 on Linux 2.4.21-40.ELsmp DOES NOT WORK.
You are correct: I ran it on Perl 5.8.7 on my Windows machine and it worked fine. I'm going to try to find a more current version on our Linux machine to determine whether this is a Perl version problem or something specific to Linux.
|
|
get some string: /xxxx/yyyy/ZZZ_xxxx.CCC
ZZZ
CCC
/xxxx/yyyy/ZZZ_xxxx
And there's no reason it shouldn't. $ not matching the end of the string would have been caught by tests. Your build is very broken if the last pattern doesn't match given the specified input.
|
|
Ditto to ikegami above: I tested with v5.8.8 built for x86_64-linux-gnu-thread-multi on Ubuntu 8.04 LTS as well as on Windows. What happens when you run this?
#!/usr/bin/perl
print "get some string: ";
$dropped = chop($string = <STDIN>);
print ord $dropped, " ", ord $/, "\n";
#$string = $ARGV[0];
#chomp($string);
$string =~ m/\/([[:alnum:]]+)_.*\.(.+)$/;
print "$1\n";
$type = $2;
print "$type\n";
$string =~ m/(.+)\.${type}$/;
#$string =~ m/(.+)\.${type}\Z/;
#$string =~ m/(.+)\.${type}\z/;
#$string =~ m/(.+)\.${type}/;
print "$1\n";
exit 0;
|
|
Re: perl pattern match for end of string using STDIN and chomp
by AnomalousMonk (Archbishop) on Oct 08, 2009 at 21:48 UTC
I tend to agree with the suggestions of others that the second regex in the chomped <STDIN> section of the OPed code is simply failing to match, and the previous value of $1 is persisting.
Try inserting the statement
print qq{\$1 reset: '$1'} if 'foo' =~ /(foo)/;
between the first and second regexes in that section and see what happens. If the second 'ZZZ' becomes 'foo', you will know what is happening (although not why the second regex fails to match, which I cannot explain myself).
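To illustrate the persistence of $1 being described, here is a small sketch. It hard-codes the input string instead of reading STDIN and uses a deliberately failing .BBB pattern; both are assumptions made for the demonstration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string = '/xxxx/yyyy/ZZZ_xxxx.CCC';

# First match succeeds and sets $1.
$string =~ m/\/([[:alnum:]]+)_.*\.(.+)$/;
my $first = $1;                      # ZZZ

# This match fails (the string does not end in .BBB), so $1 is left alone.
my $matched = $string =~ m/(.+)\.BBB$/;
my $after_failure = $1;              # still ZZZ

print "first: $first\n";
print "second match ", ($matched ? "succeeded" : "failed"),
      ", \$1 is '$after_failure'\n";

# Guarding on the match result avoids printing the stale capture.
if ($string =~ m/(.+)\.CCC$/) {
    print "stem: $1\n";
} else {
    print "no match\n";
}
```

Testing the return value of the match before touching $1 is the simplest way to tell a failed match from a successful one.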
|
|
This "previous value of $1" can be problematic. One practice I often follow is to NOT use $1, $2, etc. directly. I like to assign $1 right away to a variable with more contextual meaning (like $name or $cust_id). Why have $name = $1; at all?
One way to do this is illustrated below: put the match into list context and use a list slice to get $1, $2, etc. If the match fails, $thing gets undef in this case, not the previous value of $1.
print "match failed\n" unless $string =~ m/(.+)\.BBB$/;
print "dollar 1: $1\n"; #prints previous $1 value
my ($thing) = ($string =~ m/(.+)\.BBB$/)[0];
print "thing =$thing\n"; #$thing is undef
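A slightly fuller sketch of the same idea, assuming a hypothetical parse_path helper (the name and the returned fields are illustrative, not from the thread). It assigns the captures straight from a list-context match and bails out when either match fails:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return (name, type, stem), or an empty list on failure.
sub parse_path {
    my ($string) = @_;
    # In list context a successful match returns its captures;
    # a failed match returns the empty list, so the assignment counts 0.
    my ($name, $type) = $string =~ m/\/([[:alnum:]]+)_.*\.(.+)$/
        or return;
    my ($stem) = $string =~ m/(.+)\.\Q$type\E$/
        or return;
    return ($name, $type, $stem);
}

if (my ($name, $type, $stem) = parse_path('/xxxx/yyyy/ZZZ_xxxx.CCC')) {
    print "$name\n$type\n$stem\n";
} else {
    print "match failed\n";
}
```

Because $1 and $2 are never read, a failed second match cannot leak the first match's captures into the output.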
|
|
It seems pretty clear that the $1 in the code is falling through because the second regex is not matching, so I will use some of the code examples to avoid that in the future.
However, the big question is why the second regex does not match with a chomped <STDIN> but does match a chomped $ARGV[0]. It only appears to be an issue with 5.8.0, and appears to have been fixed by at least 5.8.3. Can anyone explain this behavior? I may need some ammo to get our Linux support to upgrade our standard Perl version.
Re: perl pattern match for end of string using STDIN and chomp
by Marshall (Canon) on Oct 08, 2009 at 21:06 UTC
I did some testing on my Perl 5.10 Win XP system. At first all appeared to be OK. Then I tried to replicate the poster's output and found that I could do so if the second match failed.
It appears that if a match fails, $1 keeps its earlier value, i.e. $1 is only valid if the match succeeds.
I experimented with ${type} vs. just $type and found that both worked on my Perl version. I wonder whether something about this differs on Perl 5.8.0, so that the second match does not succeed and the old $1 is still there?
I would suggest trying some of my experiments below to see what happens on the OP's system. I think the second match is failing for some reason and the "old $1" is getting printed.
#!/usr/bin/perl -w
print "get some string: ";
($string = <STDIN>);
#note: chomp not necessary, $ should count \n as
#end of string. should work with or without chomp.
$string =~ m/\/([[:alnum:]]+)_.*\.(.+)$/;
print "dollar 1: $1\n";
$type = $2;
print "type: $type\n";
#something weird here....
$string =~ m/(.+)\.BBB$/;
#get some string: /xxxx/yyyy/ZZZ_xxxx.CCC
#dollar 1: ZZZ
#type: CCC
#dollar 1:ZZZ
print "match failed\n" unless $string =~ m/(.+)\.BBB$/;
#$string =~ m/(.+)\.${type}$/; #ok
#$string =~ m/(.+)\.$type$/; #ok also
print "dollar 1:$1\n";
exit 0;
__END__
This is with the match failed code:
C:\TEMP>regextest.pl
get some string: /xxxx/yyyy/ZZZ_xxxx.CCC
dollar 1: ZZZ
type: CCC
match failed
dollar 1:ZZZ
This is from:
#$string =~ m/(.+)\.${type}$/; #ok
#$string =~ m/(.+)\.$type$/; #ok also
C:\TEMP>regextest.pl
get some string: /xxxx/yyyy/ZZZ_xxxx.CCC
dollar 1: ZZZ
type: CCC
dollar 1:/xxxx/yyyy/ZZZ_xxxx
C:\TEMP>perl -v
This is perl, v5.10.0 built for MSWin32-x86-multi-thread
(with 5 registered patches, see perl -V for more detail)
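The note in the script above, that $ counts a trailing \n as end of string, can be checked with a short sketch. The unchomped sample string is an assumption standing in for a line read from STDIN:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $with_newline = "ZZZ_xxxx.CCC\n";   # as read from STDIN without chomp

# $ and \Z match at the very end or just before a final newline;
# \z matches only at the very end of the string.
my %result = (
    '$'  => ($with_newline =~ m/\.CCC$/  ? 1 : 0),
    '\Z' => ($with_newline =~ m/\.CCC\Z/ ? 1 : 0),
    '\z' => ($with_newline =~ m/\.CCC\z/ ? 1 : 0),
);
print "$_ => $result{$_}\n" for sort keys %result;
```

On a correctly working perl only the \z variant should report 0 here, which is why the pattern is expected to match with or without chomp.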