in reply to Re: Multiline Regex
in thread Multiline Regex

Note: split /\n/ is only an acceptable way of splitting text into lines if you don't mind it discarding trailing blank lines.

$str = "this\nis my short\nstring\n\n";

$i = 0;
for (split /\n/, $str) {
    print(++$i, ": $_\n");
}
print("\n");

$i = 0;
for ($str =~ /.*\n|.+/g) {   # Just like <>
    print(++$i, ": $_");
}
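
To make the difference concrete, here is a minimal sketch that wraps both approaches in subs and counts the lines each one returns. The sub names lines_by_split and lines_by_match are illustrative only and do not come from the original post.

```perl
use strict;
use warnings;

# Hypothetical helpers, named for illustration only.

# split with no limit silently drops trailing empty fields.
sub lines_by_split {
    my ($text) = @_;
    return split /\n/, $text;
}

# A global match in list context returns every match. Each line keeps
# its newline, as with <>, and a final unterminated line is still
# caught by the .+ alternative.
sub lines_by_match {
    my ($text) = @_;
    return $text =~ /.*\n|.+/g;
}

my $str = "this\nis my short\nstring\n\n";

my @split_lines = lines_by_split($str);
my @match_lines = lines_by_match($str);

printf "split: %d lines\n", scalar @split_lines;   # trailing blank line lost
printf "match: %d lines\n", scalar @match_lines;   # trailing blank line kept
```

With the sample string, split yields 3 lines and the match yields 4, the last one being the blank line.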

Re^3: Multiline Regex
by johngg (Canon) on Oct 09, 2008 at 10:43 UTC
    Note: split /\n/ is only an acceptable way of splitting text into lines if you don't mind it discarding trailing blank lines.

    You can perhaps get around that by supplying -1 as a third argument to split. This does produce a nonexistent empty line at the end of the file, but that can be coped with by splitting into an array and popping the last element if necessary.

    use strict;
    use warnings;

    my $str = qq{this\nis my short\nstring\n\n};

    my $count = 0;
    for ( split m{\n}, $str, -1 ) {
        print ++ $count, qq{: $_\n};
    }

    Produces

    1: this
    2: is my short
    3: string
    4: 
    5: 
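
    The pop-if-necessary step could look something like the sketch below. The sub name split_lines is made up for illustration, and the sketch assumes that a final empty element only ever comes from the text ending in a newline.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: split into lines, keeping trailing blank lines.
    sub split_lines {
        my ($text) = @_;

        # A limit of -1 keeps trailing empty fields.
        my @lines = split m{\n}, $text, -1;

        # If the text ended with a newline, the last field is an empty
        # string that is not a real line, so drop it.
        pop @lines if @lines && $lines[-1] eq q{};

        return @lines;
    }

    my $count = 0;
    for ( split_lines(qq{this\nis my short\nstring\n\n}) ) {
        print ++$count, qq{: $_\n};
    }
    ```

    For the sample string this prints four numbered lines, the fourth being the genuine blank line.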

    I hope this is of interest.

    Cheers,

    JohnGG

Re^3: Multiline Regex
by moritz (Cardinal) on Oct 09, 2008 at 09:56 UTC
    If the regex can match the empty string, the OP likely has other problems than trailing empty lines. If it can't match the empty string, the example code will never fail.

      I don't know to what kind of problems you are referring.

      And for an extra character, I'd use the more versatile code over the code that probably works in this situation.

        I don't know to what kind of problems you are referring.

        Many beginners accidentally write regexes that match the empty string (I've seen several such regexes that should match floating point numbers, for example).
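
        As an illustration of that kind of mistake (both patterns are made-up examples, not ones taken from this thread), a float regex in which every part is optional also matches the empty string, and so would match a blank line:

        ```perl
        use strict;
        use warnings;

        # Made-up example of the mistake: every part is optional,
        # so the pattern also matches the empty string.
        my $loose = qr/^[+-]?\d*\.?\d*$/;

        # One possible fix: require at least one digit somewhere.
        my $strict = qr/^[+-]?(?:\d+\.?\d*|\.\d+)$/;

        for my $input ( '3.14', '-.5', '' ) {
            printf "%-6s loose: %-8s strict: %s\n",
                "'$input'",
                ( $input =~ $loose  ? 'match' : 'no match' ),
                ( $input =~ $strict ? 'match' : 'no match' );
        }
        ```

        Only the loose pattern accepts the empty input; both accept 3.14 and -.5.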

        And for an extra character, I'd use the more versatile code over the code that probably works in this situation.

        Your code might be more versatile, but it's rather non-obvious. It took me perhaps 10 to 15 seconds to figure out what it does, because it relies on two non-obvious features (return of a list of captures in list context and . not matching newline) and has a special case for the end of the string.

        Maybe a good compromise would be to use split m/\n/, $str, -1 instead: you can see at first glance what it (roughly) does, and it relies on only one non-obvious behaviour (a negative limit suppresses trimming of trailing empty list items).