Re^2: Split ( ) is Giving Me a Splitting Headache

What's the reasoning behind that general recommendation? From your description, it appears that splitting on 'foo' and /foo/ (or even on $bar, if $bar eq 'foo') would be equivalent, so why prefer the pattern version?

(My current practice is to use 'foo' on the principle of never using regexes where a simple comparison is sufficient. On the surface, at least, it seems that splitting on /foo/ is the same manner of overkill as using if $bar =~ /^foo$/ instead of if $bar eq 'foo'.)


Replies are listed 'Best First'.


Re^3: Split ( ) is Giving Me a Splitting Headache
by blazar (Canon) on May 25, 2006 at 11:01 UTC

> What's the reasoning behind that general recommendation? From your description, it appears that splitting on 'foo' and /foo/ (or even on $bar, if $bar eq 'foo') would be equivalent, so why prefer the pattern version?

Because the documentation says

       split /PATTERN/,EXPR,LIMIT
       split /PATTERN/,EXPR
       split /PATTERN/
       split

In fact splitting on 'foo' and /foo/ are (currently) equivalent because perl converts the string to a regex:

$ perl -MO=Deparse -ne 'split "foo"'
LINE: while (defined($_ = <ARGV>)) {
    split(/foo/, $_, 0);
}
-e syntax OK

Will that always be the case? Who knows. I doubt it will change, but the docs do not specify what should happen with generic strings other than ' ', and maybe one day special behaviour will be implemented for them along the lines of what is currently available for ' '.
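A minimal sketch of both points, using made-up sample strings: a plain string passed as the first argument is compiled as a pattern (so regex metacharacters stay special), while ' ' is the one documented special case.

```perl
use strict;
use warnings;

# A string first argument is treated as a pattern: '|' is an empty
# alternation, so it matches between every pair of characters.
my @as_string = split '|',  'a|b';
my @as_regex  = split /|/,  'a|b';
my @literal   = split /\|/, 'a|b';   # escaped: split on a literal pipe

# ' ' is special: leading whitespace is skipped and runs of
# whitespace count as one separator. / / is an ordinary pattern.
my @awk_mode  = split ' ', "  a b  c";
my @one_space = split / /, "  a b  c";

print join(',', map { "[$_]" } @as_string), "\n";   # [a],[|],[b]
print join(',', map { "[$_]" } @as_regex),  "\n";   # [a],[|],[b]
print join(',', map { "[$_]" } @literal),   "\n";   # [a],[b]
print join(',', map { "[$_]" } @awk_mode),  "\n";   # [a],[b],[c]
print join(',', map { "[$_]" } @one_space), "\n";   # [],[],[a],[b],[],[c]
```

Writing the argument as /PATTERN/ makes it visible at the call site that metacharacters such as | or . are in play.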

> (My current practice is to use 'foo' on the principle of never using regexes where a simple comparison is sufficient. On the surface, at least, it seems that splitting on /foo/ is the same manner of overkill as using if $bar =~ /^foo$/ instead of if $bar eq 'foo'.)

Your principle is based on an assumption that is simply not true: as we've seen above, your string is converted to a regex anyway. And OTOH matches like /foo/ are internally optimized to basically do an index. So no overkill. If you don't trust me, just trust B::Concise:

$ perl -MO=Concise -pe '$_=split "foo"' >1.txt
-e syntax OK
$ perl -MO=Concise -pe '$_=split /foo/' >2.txt
-e syntax OK
$ diff [12].txt
20c20
< 6                          </> pushre(/"foo"/) s*/64 ->7
---
> 6                          </> pushre(/"foo"/) s/64 ->7

Notice that the only difference is an asterisk in the flags column, which B::Concise documents as OPf_SPECIAL: "Do something weird for this op".

I'm not saying that I don't agree with you on the $bar =~ /^foo$/ vs. $bar eq 'foo' issue. Because I do agree. (Also because they're not strictly equivalent:

$ perl -le 'print "bar\n" =~ /^bar$/ ? "ok" : "not ok"'
ok

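and people happen to use the former whereas they really want the latter.)

Simply, that's not the same for split's first argument.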
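To make the $bar =~ /^foo$/ vs. $bar eq 'foo' difference concrete, here is a small sketch. The value of $bar is a hypothetical input with a trailing newline, and the \A...\z form is shown only as the anchored pattern that behaves like eq here.

```perl
use strict;
use warnings;

my $bar = "foo\n";   # hypothetical input, e.g. a line that was not chomped

# $ matches at the end of the string OR just before a final newline.
my $dollar_match = $bar =~ /^foo$/   ? 1 : 0;

# \z matches only at the very end of the string.
my $strict_match = $bar =~ /\Afoo\z/ ? 1 : 0;

# eq compares the whole string, newline included.
my $eq_match     = $bar eq 'foo'     ? 1 : 0;

print "/^foo\$/   : $dollar_match\n";   # 1
print "/\\Afoo\\z/ : $strict_match\n";  # 0
print "eq 'foo'  : $eq_match\n";        # 0
```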
