in reply to Bother (Re: Re: Split and empty strings)
in thread Split and empty strings

It's probably a Death to Dot Star! issue. You should read that node and see if you can recast your regex to better define the data that you're trying to match. At the moment, it's matching "string\n\n" as three separate strings when you probably only want two.

--
<http://www.dave.org.uk>

"Perl makes the fun jobs fun
and the boring jobs bearable" - me

Re: Re: Bother (Re: Re: Split and empty strings)
by Rhandom (Curate) on Apr 10, 2001 at 20:45 UTC
    Read the article, but it didn't seem to answer the question. Why is it matching twice, and more particularly, how did it match twice at the end of the string when there isn't a newline at the end of the string? And why does a '^' work (see next node) but not the '$'?

    UPDATE
    Nope, it isn't a dot star issue. If anything, it is a '*' issue, but what is the issue?
    #!/usr/bin/perl -w
    $_ = "The quick\n\nbrown fox\njumped.";
    print "-------------\n";
    foreach (/([^\n]*)/gm){
        print "[$_]\n";
    }
    print "-------------\n";
    Gave the following:
    -------------
    [The quick]
    []
    []
    [brown fox]
    []
    [jumped.]
    []
    -------------
    Same old problem ... new clothes.
      * matches 'zero or more' items. Your results are consistent with that.

      From the start of the string, the regex first encounters 'The quick'. That's zero or more non-newline characters, and the match stops at the \n. Next, it looks for zero or more such characters between the end of the previous chunk and the \n -- that is, zero characters. So that's also a match.

      Because there's another \n after the first, there's also a match for the zero non-newlines between the two newlines.

      It continues in that fashion.
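
To see where each of those matches lands, here is a minimal sketch that records the starting offset of every match. The variable names ($text, @star) are made up for illustration; the string and the pattern are the ones from the post above, and $-[0] is Perl's built-in offset of the start of the last successful match.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "The quick\n\nbrown fox\njumped.";

# Collect [offset, matched text] for every match of /[^\n]*/g.
# $-[0] holds the offset where the most recent match started.
my @star;
while ($text =~ /([^\n]*)/g) {
    push @star, [ $-[0], $1 ];
}

# Print one line per match so the empty ones are visible.
printf "offset %2d: [%s]\n", @$_ for @star;
```

The empty matches should show up at offsets 9, 10, 20 and 28: one sitting just before each newline, and one at the very end of the string, where there are again zero non-newline characters left to match.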

      That's the problem with dot-star -- if you ask it to match nothing, it will happily match all nothings, even those you don't normally see.
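
If the empty matches are not wanted at all, one possible fix (a sketch, assuming you never need to keep blank lines as empty fields) is to ask for one or more characters instead of zero or more. The names $text and @lines are just for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "The quick\n\nbrown fox\njumped.";

# + requires at least one non-newline character,
# so the pattern can no longer succeed on "nothing".
my @lines = $text =~ /([^\n]+)/g;

print "[$_]\n" for @lines;
```

This gives just the three non-empty chunks. Note that the blank line between 'The quick' and 'brown fox' disappears as well, so if that empty field matters to you, this is not the right tool.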