in reply to Special behavior for LF and CR in RegExs?

The “doesn’t DWIM” snip seems to be missing /g modifiers. Posting accident, or is that so in your code as well?

Anyway, if split /^/m works, it seems that split /\n/ also should. Does it not?

You can minimise that code quite a bit, btw, by simply saying chomp( @records = split /^/xms, $data );
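
A minimal sketch of that one-liner in context, for anyone who wants to try it. The contents of $data here are made up, since the original code isn't shown in this thread:

```perl
use strict;
use warnings;

# Hypothetical sample data; stands in for whatever $data holds in the real script.
my $data = "alpha\nbeta\ngamma\n";

# split /^/ breaks the string at the start of each line and keeps every
# line's trailing newline; chomp then strips those newlines from each element.
chomp( my @records = split /^/xms, $data );

print scalar(@records), " records: @records\n";   # 3 records: alpha beta gamma
```

The assignment happens inside chomp's parentheses, so the whole list is chomped in one go.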

Makeshifts last the longest.

Re^2: Special behavior for LF and CR in RegExs?
by Argel (Prior) on Jan 05, 2006 at 01:19 UTC
    Good catch on the missing 'g'!! You are right, that did work.

    I have seen splitting on \n work and have also seen it fail. I'm using perl 5.8.0, which I compiled myself, on Solaris 8, so perhaps there is a bug buried away in there?

    Looks like davidrw's $/ suggestion also works. Given the above \n problem I think I will use that instead.

    Thanks for all the help!!

    -- Argel

      Well, $/ is the input record separator; generally, in strings and patterns, \n is magically mapped to that behind the scenes – even if it consists of multiple characters on the platform in question, such as CR/LF on DOS.

      Basically, using \n will always work so long as the data you’re processing comes from the same platform that you’re running on. If not, you’ll need to convert end-of-line markers. There’s no way to avoid this.
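
      A rough sketch of such a conversion, assuming the data may contain any mix of CR LF, bare CR, and bare LF. The sample string and the variable names are invented for illustration, and the explicit octal escapes are used so the pattern means the same bytes on every platform:

      ```perl
      use strict;
      use warnings;

      # Hypothetical input mixing DOS (CR LF), old Mac (CR) and Unix (LF) endings.
      my $data = "one\015\012two\015three\012";

      # Rewrite every variant as a bare LF. CR LF is listed first so that it is
      # replaced as one line ending rather than as two separate ones.
      ( my $clean = $data ) =~ s/\015\012|\015|\012/\012/g;

      # With the endings normalized, a plain split on LF yields the records.
      my @records = split /\012/, $clean;

      print join( "|", @records ), "\n";   # one|two|three
      ```

      Whether to normalize once on input like this or to set $/ to the foreign separator while reading depends on where the data comes from.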

      So outside specific scenarios, you should use \n or $/ and let Perl handle the specifics. That will also yield the most portable scripts.

      Makeshifts last the longest.

        \n is not magically mapped to $/ "behind the scenes". There are so many misconceptions combined in that sentence that I'm at a loss at where to start.
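
        A small sketch of just the first point, with made-up separator values: changing $/ changes what chomp and readline treat as a record ending, but "\n" in a string stays the same single character throughout.

        ```perl
        use strict;
        use warnings;

        my @lengths;
        for my $sep ( "\n", "END", "\015\012" ) {
            local $/ = $sep;             # change the input record separator
            push @lengths, length "\n";  # "\n" is still exactly one character
        }

        # chomp follows $/, not "\n":
        my $record = "fooEND";
        {
            local $/ = "END";
            chomp $record;               # removes the trailing "END"
        }

        print "@lengths $record\n";      # 1 1 1 foo
        ```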

        My reaction is strong because these many misconceptions are common, I've railed against them several times, and I respect you enough to be truly shocked to hear this from you.

        It's late, I'm very tired, and I'm also rather busy, so I'll make the rude suggestion that you might want to Super Search for nodes by me regarding newline + $/ + \n + \r. That's not because I'm the only one who has anything useful to say on the subject, but because even with all of those terms I suspect you'd otherwise get a lot to sort through, while I'm sure I have several treatments of these all-too-common misconceptions under my name.

        Update: Struck out one search term to yield a more interesting search. Though the meat of it is mostly collected in Re^4: Line Feeds (rumor control).

        - tye