Re: Re: Re: How to remove the $1 hard coding

Re: Re: Re: Re: How to remove the $1 hard coding by graff (Chancellor) on Aug 22, 2003 at 23:56 UTC
"The split is not the same as the pattern. The pattern uses `\s*` as the delimiter."

Well, actually, since the regex in the OP was:

`m/\s*(\S+)\s*(\S+)\s*(\S+)\s*(\S+)\s*/`

I was going to assert that this would generally be equivalent to splitting on whitespace, with the obvious difference that, if the string began with whitespace, split would return a list that included an empty string as the first element -- the first element returned by the regex would be the second element returned by split.

But then I noticed another difference, which gave me pause, and I wondered if the OP had a clear grasp of the relevant detail -- that is, whether this regex is really doing what was intended. Consider the following:

`$s1="ABC D E";`
`$s2=" ABCD E "; # (leading and trailing spaces)`
`print join( ":", split /\s+/, $s1 ), $/;`
`print join( ":", split /\s+/, $s2 ), $/;`
`print $/;`
`print join( ":", ($s1=~/\s*(\S+)\s*(\S+)\s*(\S+)\s*/)), $/;`
`print join( ":", ($s2=~/\s*(\S+)\s*(\S+)\s*(\S+)\s*/)), $/;`
`__OUTPUT__`
`ABC:D:E`
`:ABCD:E`

`ABC:D:E`
`ABC:D:E`

The first two lines of output show that split will return an empty string as the first list item if the string begins with a delimiter, whereas it will (by default) ignore trailing delimiters (but you can control that).

The last two lines demonstrate the tenacity of the regex engine -- it does its best to match as much of the regex as possible. In this case, it takes the liberty of breaking up the "ABCD" portion of $s2, so that it can have non-empty values inside every set of capturing parens. The behavior is very different from split, indeed!

Personally, I wouldn't feel comfortable using that particular regex pattern -- split seems more suitable.
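A minimal sketch of the "you can control that" remark about trailing delimiters: split takes an optional third LIMIT argument, and a negative LIMIT keeps trailing empty fields. The helper name `fields` and its second parameter are made up for illustration; they are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split on runs of whitespace.
# A negative LIMIT tells split to keep trailing empty fields.
sub fields {
    my ($string, $keep_trailing) = @_;
    return $keep_trailing
        ? split(/\s+/, $string, -1)
        : split(/\s+/, $string);
}

my $s = " ABCD E ";    # leading and trailing space

print join(":", fields($s, 0)), "\n";    # trailing empty field dropped
print join(":", fields($s, 1)), "\n";    # trailing empty field kept
```

With the default behavior the list has three elements (the leading empty string, "ABCD", "E"); with the negative limit it has a fourth, empty, element at the end.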
Re: Re: Re: Re: Re: How to remove the $1 hard coding by tachyon (Chancellor) on Aug 23, 2003 at 10:28 UTC
"I was going to assert that this would generally be equivalent to splitting on whitespace, with the obvious difference that, if the string began with whitespace, split would return a list that included an empty string as the first element -- the first element returned by the regex would be the second element returned by split."

You obviously didn't test it :-) `split ' '` (as posted) is MAGICAL. It is not the same as `split /\s*/`.

`@v = split ' ', ' watch the magic ';`
`print "$_: $v[$_]\n" for 0..$#v;`
`print $/;`
`@v = split /\s+/, ' lost the magic ';`
`print "$_: $v[$_]\n" for 0..$#v;`
`__DATA__`
`0: watch`
`1: the`
`2: magic`

`0: `
`1: lost`
`2: the`
`3: magic`

As you see it is absolutely identical to the posted RE and drops leading whitespace.

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
('Re: ' x 6) How to remove the $1 hard coding by MidLifeXis (Monsignor) on Aug 23, 2003 at 20:40 UTC
The original RE is `m/\s*(\S+)\s*(\S+)\s*(\S+)\s*(\S+)\s*/`. If this is equivalent to splitting on `' '`, then I have some serious remedial RE work to do.

The original RE will match 'ABCD' and split it into 4 characters. Your solution using split will not. I am not sure that the original poster is 100% certain about what the initial RE does, and your solution may match the intent, but as written they are not the same.

Now, since you are using the pattern `\s+` as your split, then your splits are nearly equivalent (leading space and trailing space aside). However, your split is not equivalent to the original RE.

Nothing personal, just technically misleading. All of the info was correct (++), just not related to the OP (--).
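A quick sketch of the difference described here, assuming the original RE had `\s*` (optional whitespace) between and around its four captures, which is what the 'ABCD' behavior implies. The sub names `by_regex` and `by_split` are made up for illustration.

```perl
use strict;
use warnings;

# Assumed form of the original RE: optional whitespace around four captures.
my $re = qr/\s*(\S+)\s*(\S+)\s*(\S+)\s*(\S+)\s*/;

# Hypothetical helpers (not from the thread) so the two approaches
# can be compared on the same input.
sub by_regex { my ($string) = @_; return $string =~ $re; }
sub by_split { my ($string) = @_; return split ' ', $string; }

for my $string ('ABCD', 'w x y z') {
    print "'$string' regex: ", join(":", by_regex($string)),
          " | split: ", join(":", by_split($string)), "\n";
}
```

On 'w x y z' both give the same four fields. On 'ABCD' the regex backtracks until each capture holds one character, while `split ' '` returns the single field 'ABCD'.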
Re: ('Re: ' x 6) How to remove the $1 hard coding by tachyon (Chancellor) on Aug 25, 2003 at 12:25 UTC
Re: Re: Re: Re: How to remove the $1 hard coding by chunlou (Curate) on Aug 22, 2003 at 19:37 UTC
Probably you meant this:

`push @a, ('a b c'=~/(\w) /g)[1]; print "@a";`

You need `()` to capture something into the array. You need the `/g` switch if you want to put every match into the array, not just the first one (which is what lets the index `[$i]` work).
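To spell that out, here is a small sketch using the same sample string. The variable names `$text`, `@all` and `$second` are illustrative only.

```perl
use strict;
use warnings;

my $text = 'a b c';

# In list context, a /g match returns every capture from every match.
# Only 'a' and 'b' are followed by a space, so 'c' is not captured.
my @all = $text =~ /(\w) /g;

# Parentheses around the whole match turn it into a list that can be
# subscripted directly, with no need for $1.
my $second = ($text =~ /(\w) /g)[1];

print "@all\n";       # a b
print "$second\n";    # b
```

Without `/g` the list would hold only the first capture, so index 1 would be undefined.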