Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Given a string that may or may not have a platform-specific line-ending (\n), I need a regex to: A. add \015\012 if there is no line-ending; B. convert any existing line-ending to the network line-ending; C. add nothing if the correct line-ending is already there.

Replies are listed 'Best First'.
Re: regex to add network line-endings if required.
by ww (Archbishop) on Feb 15, 2005 at 14:44 UTC
    You may find help in a recent discussion, Quick and portable way to determine line-ending string? You may also wish to clarify the wording of your post. You use both "platform specific line-ending" and "network line-ending." Is the use of two different phrases accidental or purposeful (and if the latter, please explain)?

      " You use both "platform specific line-ending" and "network line-ending." Is the use of two different phrases accidental or purposeful?"

      Very purposeful.

      The sub will receive strings for transmission across the network. The caller may have terminated the string with "\n" (which will be the platform-specific line-ending, dependent upon what the current platform is), or they may not have done so.

      Before transmission, I need to:

      1. Add a network-specific line-ending (\015\012) if no line-ending is present.
      2. Convert whatever (platform-specific) line-ending is present to the network line-ending, if one is present.
      3. Leave the string alone, if the correct line-ending is already present.

      It seems like a single regex that used look-behind assertions properly would be able to do this, thereby avoiding messy conditional logic. It also seems like this is an oft-called for requirement and a well tried solution is probably already known.

      I've had a couple of attempts at constructing the regex, but each one falls over in one of the above three cases, either adding an extra, unnecessary line-ending or omitting to add one.

      I hoped someone in the know would point me at the correct regex to use?

        I would do that this way:

        s/\015\012|\012\015|\012|\015/\015\012/g;

        Update: I got that order wrong, as BrowserUk suggests. I had the shorter versions on the left, and the longer on the right. The above is fixed.
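To see why the order of the alternatives matters, here is a minimal sketch comparing the corrected substitution above with a shortest-first ordering. The sub names (to_network_good, to_network_bad, show) are made up for illustration and are not from this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: longest alternatives first, as in the corrected
# substitution above.
sub to_network_good {
    my ($s) = @_;
    $s =~ s/\015\012|\012\015|\012|\015/\015\012/g;
    return $s;
}

# Hypothetical helper: single characters first. A "\015\012" pair is now
# consumed as two separate one-character matches, each replaced in turn.
sub to_network_bad {
    my ($s) = @_;
    $s =~ s/\012|\015|\015\012|\012\015/\015\012/g;
    return $s;
}

# Show each byte as a decimal code so the line-endings are visible.
sub show { return join '|', unpack 'C*', $_[0] }

print show( to_network_good("a\015\012") ), "\n";   # 97|13|10
print show( to_network_bad("a\015\012") ),  "\n";   # 97|13|10|13|10
```

Note that neither ordering appends anything to a string that has no line-ending at all, so the first of the OP's three cases still needs separate handling.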

        It should help you with things like this to take a good, hard look at the perlre section of the standard Perl docs. You should have a copy of the standard Perl docs on your system, but if not, perlre is on search.cpan.org too.



        Christopher E. Stith
        Update: Deleted this and put it as a response to the main question. Do I need to do something to get this reaped or can I just delete the text and the title?
      I'm not the OP, but I took the two phrases to mean very different things.

      "platform specific line-ending" could be "\012" (LF) for Unix, "\015\012" (CR LF) or "\012\015" (LF CR) for DOS, or "\015" (CR) for at least older Macs. All of these, on their own platforms, are represented by the C library as "\n" (newline).

      "Network line-ending" means "\015\012", which is the standard for things like Telnet and HTTP.

      Any telnet or web implementation on any platform uses "\015\012" (carriage return followed by linefeed) as the end-of-line marker.

      See RFC 1945 for HTTP and RFC 854 (STD 0008) for Telnet for examples.



      Christopher E. Stith
        precisely my point.
Re: regex to add network line-endings if required.
by Corion (Patriarch) on Feb 15, 2005 at 14:40 UTC

    That's great! Now, what have you already tried?

    Personally, I would attack the problem through the following two steps:

    1. Remove any line ending, whatever it is. perldoc -f chomp and perldoc perlvar for $/ might be interesting reading for you.
    2. Add \015\012 to the string.
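The two steps above might be sketched like this. It is only a sketch under a couple of assumptions: the sub name add_network_le is made up, and step 1 uses a substitution rather than chomp, because chomp only removes the current value of $/ and so would leave, for example, a lone "\015" in place on a platform where "\n" is "\012".

```perl
use strict;
use warnings;

# Hypothetical helper name, not from the thread.
sub add_network_le {
    my ($string) = @_;

    # Step 1: remove one trailing line-ending, whatever it is
    # (CR LF, a lone LF, or a lone CR). \z anchors at the very end.
    $string =~ s/(?:\015\012|\012|\015)\z//;

    # Step 2: add the network line-ending.
    return $string . "\015\012";
}

# Print the trailing bytes as decimal codes for a few sample inputs.
for my $in ( "none", "unix\012", "mac\015", "net\015\012" ) {
    print join( '|', unpack 'C*', substr add_network_le($in), -3 ), "\n";
}
```

Because the ending is always stripped first and then re-added, a string that already ends in "\015\012" comes back unchanged.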
Re: regex to add network line-endings if required.
by Roy Johnson (Monsignor) on Feb 15, 2005 at 14:42 UTC
    You can do it with a simple replacement, where the pattern is any number of possible line-ending characters, and the replacement is the ending you want.
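One way to read that description, as a minimal sketch: match the (possibly empty) run of CR and LF characters at the very end of the string and replace it with the ending you want. The sub name fix_line_ending and the exact pattern are assumptions for illustration, not necessarily the code originally posted here.

```perl
use strict;
use warnings;

# Hypothetical helper name, not from the thread.
sub fix_line_ending {
    my ($string) = @_;

    # Replace the trailing run of CR/LF characters (which may be empty)
    # with a single network line-ending. No /g, so only one replacement.
    $string =~ s/[\015\012]*\z/\015\012/;

    return $string;
}

# Print the trailing bytes as decimal codes for a few sample inputs.
for my $in ( "none", "unix\012", "mac\015", "net\015\012" ) {
    print join( '|', unpack 'C*', substr fix_line_ending($in), -3 ), "\n";
}
```

One design consequence worth knowing: because the character class matches any number of trailing CR/LF characters, several trailing blank lines collapse into a single "\015\012". Line-endings in the middle of the string are left alone.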

    Caution: Contents may have been coded under pressure.
Re: regex to add network line-endings if required.
by BrowserUk (Patriarch) on Feb 15, 2005 at 18:05 UTC

    Something like this?

    #! perl -slw
    use strict;

    sub fixLE {
        my $string = shift;
        $string =~ s[(?:\n?\r?\n?)$][\015\012];
        return $string;
    }

    print
        join('|', unpack 'C*', substr $_ , -8 ), $/,
        join('|', unpack 'C*', substr fixLE( $_ ), -8 ), $/
        for
            'No line ending needs one',
            "unix style needs fixing\x0a",
            "mac style needs fixing\x0d",
            "windows style needs fixing\x0d\x0a",
            "network style: leave alone\015\012";

    __END__
    P:\test>junk
    101|101|100|115|32|111|110|101
    100|115|32|111|110|101|13|10

    32|102|105|120|105|110|103|10
    102|105|120|105|110|103|13|10

    32|102|105|120|105|110|103|13
    102|105|120|105|110|103|13|10

    102|105|120|105|110|103|13|10
    102|105|120|105|110|103|13|10

    32|97|108|111|110|101|13|10
    32|97|108|111|110|101|13|10

    Examine what is said, not who speaks.
    Silence betokens consent.
    Love the truth but pardon error.
Re: regex to add network line-endings if required.
by tphyahoo (Vicar) on Feb 15, 2005 at 17:37 UTC
    I recommend you post a script with your regex attempts, matching against various test input data. And comments on what is wrong with each of the regexes. And ideally the output.

    Ideally all in a script you post in "code" tags. Then it would be easier for someone on PM to write one regex that does everything you want to do.

    Celsius to Fahrenheit using Regexp::Common has some simple code that would get you started reformulating your question in a more helpful way.