in reply to regular expression.
AnonyMonk and JediWizard have accurate answers (Update: this originally read "AnonyMonk has an accurate answer") for the question you appear to be asking, but perhaps the question you intended is about why $& has values.
If so, note that $& is a special regex variable. Quoting from Chapter 7 of "Mastering Regular Expressions" (page 299 in my 2nd Ed. paperback): "A successful match or substitution sets a variety of global, read-only variables that are always automatically, dynamically scoped. These values never change if a match attempt is unsuccessful, and are always set when a match is successful." (Emphasis in the original; note especially the first clause of the last sentence.)
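To make the quoted behaviour concrete, here is a minimal sketch. The variable names and the sample string are mine, not from your script. It runs one successful match and then one failing match in the same scope, and records $& after each.

```perl
use strict;
use warnings;

my $text = 'alpha beta';    # hypothetical sample data

# A successful match sets $& to the text that matched.
my $first_ok      = $text =~ /alpha/;
my $after_success = $&;

# A failed match does not reset $&; the earlier value is still visible.
my $second_ok     = $text =~ /gamma/;
my $after_failure = $&;

print "after success: $after_success\n";
print "after failure: $after_failure\n";
```

Both prints should show the text from the first match, which is why a stale $& can look like a fresh result if you never check whether the latest match succeeded.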
Update#2 (see AnomalousMonk's reply below): Your line 8, if(~/^<dsk1/), matches on "<dsk1" and therefore sets $&. (Struck from my original text: your line 8 merely tests whether bitwise negation of the string is possible; since there is no match, the prior match is retained in $&, I think.) </Update#2> Execution (modified code below) produces this:
    perl 851032-orig.pl
    $& at line 9: <dsk1
     **********
    1

    $& at line 14: <dsk1 line1 id="123" lo="1" to="abc" rb="This is a long rb."
     ------------------
    $& at line 9: <dsk1
     **********
    2

    $& at line 14: <dsk1 line2 id="456" lo="2" to="def" rb=Short rb"
     ------------------
    $& at line 9: <dsk1
     **********
    3

    $& at line 14: <dsk1 line3 id="789" lo="3" to="ghi" rb="Medium long rb"
     ------------------
    $& at line 9: <dsk1
     **********
    Use of uninitialized value in concatenation (.) or string at 851032-orig.pl line 13, <DATA> line 4.


    $& at line 14: <dsk1
     ------------------
    $& at line 9: <dsk1
     **********
    5

    $& at line 14: <dsk2 line5 id="555" lo="5" to="jkl" rb="should not match"
     ------------------
    $& at line 9: <dsk1
     **********
    6

    $& at line 14: <dsk1 Line6 id="987" lo="6" to="mno" rb="This should be a match"
     ------------------
    $& at line 9: <dsk1
     **********
    7

    $& at line 14: <dsk1 line7 id="FFF" lo="7" to="pqr" rb="This is a very, very, very long are-bee."
     ------------------
using this, slightly modified code:
    #!/usr/bin/perl
    use strict;
    use warnings;
    # 851032-orig (but using data)
    # open FH,"data" or die "can't open the file";
    while(<DATA>)
    {
        if(~/^<dsk1/)
        {
            print "\$& at line 9: $& \n **********\n";    # added
            my $line=$_;                                   # added
            $line=~/.*\sid=\"(.*)\"\slo=\"(.*)\"\sto=\"(.*)\"\srb=.*$/;
            print "$2\n\n";
            print "\$& at line 14: $&\n ------------------\n";
        }
    }
    __DATA__
    <dsk1 line1 id="123" lo="1" to="abc" rb="This is a long rb."
    <dsk1 line2 id="456" lo="2" to="def" rb=Short rb"
    <dsk1 line3 id="789" lo="3" to="ghi" rb="Medium long rb"
    <dsk2 line4
    <dsk2 line5 id="555" lo="5" to="jkl" rb="should not match"
    <dsk1 Line6 id="987" lo="6" to="mno" rb="This should be a match"
    <dsk1 line7 id="FFF" lo="7" to="pqr" rb="This is a very, very, very long are-bee."
Note also that you are using conventional regex notation at line 12 but not in line 9. Update2: I'm unclear why line 9 passes a syntax check, but that just means I get to have more fun hunting up the answer. (As noted, AnomalousMonk answers below.)
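For what it's worth, here is a small sketch of how I understand the difference between the two notations. It assumes (and the output above seems to bear this out) that ~/.../ first matches against $_ and then bitwise-negates the result, so the condition comes out true whether or not the pattern matched. The hash names and the two sample strings are mine.

```perl
use strict;
use warnings;

my (%plain, %negated);    # hypothetical names, for illustration only

for my $input ('<dsk1 line1', '<dsk2 line4') {
    local $_ = $input;

    # Ordinary match against $_: true only when the pattern matches.
    $plain{$input} = /^<dsk1/ ? 'true' : 'false';

    # Match against $_, then bitwise negation of the match result.
    $negated{$input} = ~/^<dsk1/ ? 'true' : 'false';

    print "$input: /.../ is $plain{$input}, ~/.../ is $negated{$input}\n";
}
```

If that reading is right, the negated form is useless as a filter: the "<dsk2" line gets through the if just like the "<dsk1" lines do.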
Hence, I'm posting code and output with the match syntax (rather than bitwise negation):
    #!/usr/bin/perl
    use strict;
    use warnings;
    # 851032

    #open FH,"data" or die "can't open the file";
    while(<DATA>)
    {
        my $line=$_;
        if( $line =~ /^<dsk1/ )
        {
            print "\$line: $line";
            if ($line =~ /.*\sid=\"(.*)\"\slo=\"(.*)\"\sto=\"(.*)\"\srb=.*$/) {
                print "\$1: $1, \$2: $2\, \$3: $3 \n";
                print "-----------------\n";
            } else {
                print "Some of this will be uninitialized: ";
                print "\$1: $1, \$2: $2\, \$3: $3 \n";
                # print "\$&: $& \n\n---\n";
            }
        } else {
            print "\$line did NOT start with '^dsk1': $line \n =================\n\n";
        }
    }

    =head Output: (see also 851032-orig.pl)

    perl 851032.pl
    $line: <dsk1 line1 id="123" lo="1" to="abc" rb="This is a long rb."
    $1: 123, $2: 1, $3: abc
    -----------------
    $line: <dsk1 line2 id="456" lo="2" to="def" rb=Short rb"
    $1: 456, $2: 2, $3: def
    -----------------
    $line: <dsk1 line3 id="789" lo="3" to="ghi" rb="Medium long rb"
    $1: 789, $2: 3, $3: ghi
    -----------------
    $line did NOT start with '^dsk1': <dsk2 line4

     =================

    $line did NOT start with '^dsk1': <dsk2 line5 id="555" lo="5" to="jkl" rb="should not match"

     =================

    $line: <dsk1 Line6 id="987" lo="6" to="mno" rb="This should be a match"
    $1: 987, $2: 6, $3: mno
    -----------------
    $line: <dsk1 line7 id="FFF" lo="7" to="pqr" rb="This is a very, very, very long are-bee."
    $1: FFF, $2: 7, $3: pqr
    -----------------

    =cut

    __DATA__
    <dsk1 line1 id="123" lo="1" to="abc" rb="This is a long rb."
    <dsk1 line2 id="456" lo="2" to="def" rb=Short rb"
    <dsk1 line3 id="789" lo="3" to="ghi" rb="Medium long rb"
    <dsk2 line4
    <dsk2 line5 id="555" lo="5" to="jkl" rb="should not match"
    <dsk1 Line6 id="987" lo="6" to="mno" rb="This should be a match"
    <dsk1 line7 id="FFF" lo="7" to="pqr" rb="This is a very, very, very long are-bee."
</update2a>
A few other suggestions:
Re^2: regular expression.
by AnomalousMonk (Archbishop) on Jul 23, 2010 at 22:34 UTC