Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I am really puzzled by this. When I have a string that I want to split on '|', I have to escape it, which seems really weird. Why is that? Let me give you an example:

my $string = 'user|name|password';

# doesn't work
#my @field = split ('|', $string);

# works
my @field = split ('\|', $string);

Now why is that? As far as I knew, any character within ' ' gets interpreted literally. So what's up with that?

Replies are listed 'Best First'.
Re: '|' need to be escaped?
by ikegami (Patriarch) on Sep 11, 2011 at 18:32 UTC
    The first argument of split is a regular expression. It is not weird that you need to escape "|" if you want it to match "|".

    You're probably confusing yourself by writing

    my @field = split ('\\|', $string);
    instead of
    my @field = split (/\|/, $string);
    or
    my @field = split (qr/\|/, $string);
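To illustrate the point about the first argument being a regular expression, here is a minimal sketch. It assumes the string from the question; the `$delim` variable is hypothetical and only there to show `\Q...\E`, which is one way to avoid escaping by hand.

```perl
use strict;
use warnings;

my $string = 'user|name|password';

# The string '|' is used as the regex /|/: an alternation of two
# empty patterns, so it matches between every pair of characters.
my @chars = split '|', $string;

# Escaping the metacharacter makes the pattern match a literal pipe.
my @escaped = split /\|/, $string;

# \Q...\E (quotemeta) escapes whatever the delimiter variable holds.
my $delim  = '|';    # hypothetical variable, not from the thread
my @quoted = split /\Q$delim\E/, $string;

print scalar(@chars), " fields from '|'\n";
print join(',', @escaped), "\n";
print join(',', @quoted), "\n";
```

The single quotes do protect the string from interpolation, but split still compiles the resulting string as a pattern, which is why the unescaped version breaks the input into single characters.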
      This split() stuff can get very confusing.

      Basically a split on (' ',$_) suppresses the leading NULL field if there is a leading white space character. I don't know of a split on a regex variation of \s that can do the same thing. Do you?

      #!/usr/bin/perl -w
      use strict;

      my $string = "    abc xyz";
      my @tokens = split(' ',$string);
      print "There are ".@tokens." tokens in \'$string\'\n";
      print "Using split on ' '\n";
      print join("|",@tokens),"\n\n";

      my @tokens2 = split(/\s+/,$string);
      print "There are ".@tokens2." tokens in \'$string\'\n";
      print "Using split on /\\s+/\n";
      print join("|",@tokens2),"\n\n";

      my @tokens3 = split(/ /,$string);
      print "There are ".@tokens3." tokens in \'$string\'\n";
      print "Using split on / /\n";
      print join("|",@tokens3),"\n\n";

      my @tokens4 = split(/\s/,$string);
      print "There are ".@tokens4." tokens in \'$string\'\n";
      print "Using split on /\\s/\n";
      print join("|",@tokens4),"\n";
      __END__
      The above code prints:

      There are 2 tokens in '    abc xyz'
      Using split on ' '
      abc|xyz

      There are 3 tokens in '    abc xyz'
      Using split on /\s+/
      |abc|xyz

      There are 6 tokens in '    abc xyz'
      Using split on / /
      ||||abc|xyz

      There are 6 tokens in '    abc xyz'
      Using split on /\s/
      ||||abc|xyz
      So when the first arg to split() is a single-quoted space, as in split(' ', $_), it behaves differently than a regex...
        ' ' is a special case, and no, there's no equivalent using a regex.
      thanks :) that was it
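On the ' ' special case discussed above: no single pattern handed to split reproduces it, but the same fields can be had in two steps. A minimal sketch, assuming an input with four leading spaces like the one in the test script; the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $string = "    abc xyz";    # four leading spaces

# The special case: leading whitespace yields no empty first field.
my @special = split ' ', $string;

# Workaround 1: trim a copy first, then split on runs of whitespace.
(my $copy = $string) =~ s/^\s+//;
my @from_trim = split /\s+/, $copy;

# Workaround 2: match the fields themselves instead of the separators.
my @matched = $string =~ /\S+/g;

print join('|', @special),   "\n";
print join('|', @from_trim), "\n";
print join('|', @matched),   "\n";
```

Both workarounds cost an extra step or a different idiom, so split ' ' remains the shortest way to get whitespace-separated words.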