mariaprabudass has asked for the wisdom of the Perl Monks concerning the following question:

I am splitting a record using the delimiter '|'. I have encountered a scenario where the pipe symbol (the delimiter) is preceded by an escape character; in that case the pipe should not be treated as a delimiter. How do I resolve this using split? A sample piece of code is posted below.

#!/usr/bin/perl
use strict;
my $id = 'Hi|Hello\|Sir';
my @code = split(/\|/,$id);
print $code[1]."\n";

The expected output for the above program is "Hello\|Sir" but the actual output is "**Hello**". How do I handle a delimiter preceded by an escape sequence using split?

Thank You

Replies are listed 'Best First'.
Re: Splitting the record using the delimiter
by kcott (Archbishop) on Sep 30, 2015 at 18:29 UTC

    G'day mariaprabudass,

    Welcome to the Monastery.

    The output I get is 'Hello\'; not the '"**Hello**"' you show.

    What you actually want to split on is a pipe (|) that is not preceded by a backslash (\). You can do this with what is called a negative look-behind assertion, which looks like (?<!PATTERN). See "perlre: Extended Patterns" for more details.

    Both the pipe and backslash characters are special in regexes and need to be escaped. This gives a split pattern of /(?<!\\)\|/.

    Here's my test code:

    #!/usr/bin/env perl -l
    use strict;
    use warnings;
    my $id = 'Hi|Hello\|Sir';
    my @code = split /\|/, $id;
    print $code[1];
    @code = split /(?<!\\)\|/, $id;
    print $code[1];

    Output:

    Hello\
    Hello\|Sir

    — Ken
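
A possible follow-up to the look-behind split: the fields it returns still contain the escaping backslash, so a caller may want to strip it afterwards. The sketch below is not from the thread; `split_unescaped` is a hypothetical helper name, and it assumes a backslash is only ever used to escape a pipe (a record such as `a\\|b`, where the backslash itself is escaped, would need a fuller parser).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split on unescaped pipes,
# then turn each escaped pipe "\|" back into a literal "|".
sub split_unescaped {
    my ($record) = @_;

    # A limit of -1 keeps trailing empty fields, which split drops by default.
    my @fields = split /(?<!\\)\|/, $record, -1;

    # Remove the escaping backslash in front of each remaining pipe.
    s/\\\|/|/g for @fields;

    return @fields;
}

my @code = split_unescaped('Hi|Hello\|Sir');
print "$_\n" for @code;
```

For the sample record this prints "Hi" and then "Hello|Sir". If the backslash should be kept in the output, as the original question asks, the substitution line can simply be dropped.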

Re: Splitting the record using the delimiter
by Corion (Patriarch) on Sep 30, 2015 at 18:06 UTC

    Instead of splitting away what you don't want, consider matching what you want to keep:

    my @code = ($id =~ /((?:[^|\\]+|\\[|\\])+)/g);

    Also see Text::CSV_XS.
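
The match-instead-of-split idea can be sketched as a small runnable script. This is one way to write it, not necessarily what the poster had in mind: a field is taken to be a run of ordinary characters or backslash-escaped characters, and the whole run is captured so that each match is one complete field. Treating a backslash followed by any character as an escape is an assumption made here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $id = 'Hi|Hello\|Sir';

# Each field is one or more of:
#   [^|\\]  any character that is neither a pipe nor a backslash
#   \\.     a backslash followed by any single character (an escaped char)
my @code = $id =~ /((?:[^|\\]|\\.)+)/g;

print "$_\n" for @code;
```

For the sample record this prints "Hi" and then "Hello\|Sir". One design consequence worth knowing: because the pattern needs at least one character per field, empty fields (as in `a||b`) are skipped rather than returned as empty strings.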

Re: Splitting the record using the delimiter
by CountZero (Bishop) on Sep 30, 2015 at 20:22 UTC
    Or just hand your records over to Text::CSV and set escape_char to '\' and sep_char to '|'.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics
Re: Splitting the record using the delimiter
by KurtSchwind (Chaplain) on Sep 30, 2015 at 18:29 UTC

    Your code tells split to break on every pipe, and when I run it I get "Hello\", which is exactly that.

    Are you trying to split on pipes except when preceded by a backslash? If so:

    #!/usr/bin/perl
    use strict;
    my $id = 'Hi|Hello\|Sir';
    my @code = split(/[^\\]\|/,$id);
    print $code[1]."\n";

    Note that since this now splits on a character followed by a pipe, $code[0] would be "H" and not "Hi". The short answer is that you are looking for split to have a variable delimiter in this case: a pipe sometimes and no pipe at other times.

    --
    “For the Present is the point at which time touches eternity.” - CS Lewis
      This is a case for a negative look-behind assertion:
      #!/usr/bin/perl
      use strict;
      my $id = 'Hi|Hello\|Sir';
      my @code = split(/(?<!\\)\|/,$id);
      print $code[1]."\n";
      UPDATE: Sorry, I should have realized that I just repeated what kcott said in the previous reply.
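
The difference between the two patterns in this sub-thread can be seen by running them side by side. This is a small illustrative script, with variable names chosen here for the comparison: the character class consumes the character in front of the pipe as part of the delimiter, while the look-behind only inspects it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $id = 'Hi|Hello\|Sir';

# [^\\] matches (and therefore removes) the character before the pipe,
# so the "i" of "Hi" becomes part of the delimiter.
my @consuming = split /[^\\]\|/, $id;

# (?<!\\) is zero-width: it checks the preceding character without
# consuming it, so only the pipe itself is removed.
my @lookbehind = split /(?<!\\)\|/, $id;

print "character class: $consuming[0] / $consuming[1]\n";
print "look-behind:     $lookbehind[0] / $lookbehind[1]\n";
```

The first line of output shows "H" as the first field, the second shows "Hi"; in both cases the second field is "Hello\|Sir". The character-class form also cannot split on a pipe at the very start of a record, since there is no preceding character for it to match.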