in reply to regex to add network line-endings if required.

You may find help in a recent discussion, Quick and portable way to determine line-ending string? You may also wish to clarify the wording of your post. You use both "platform specific line-ending" and "network line-ending." Is the use of two different phrases accidental or purposeful (and if the latter, please explain)?

Re^2: regex to add network line-endings if required.
by Anonymous Monk on Feb 15, 2005 at 17:16 UTC

    " You use both "platform specific line-ending" and "network line-ending." Is the use of two different phrases accidental or purposeful?"

    Very purposeful.

    The sub will receive strings for transmission across the network. The caller may have terminated the string with "\n" (which will be the platform-specific line-ending, dependent upon what the current platform is), or they may not have done so.

    Before transmission, I need to:

    1. Add a network-specific line-ending (\015\012) if no line-ending is present.
    2. Convert whatever (platform-specific) line-ending is present to the network line-ending, if one is present.
    3. Leave the string alone, if the correct line-ending is already present.
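
    The three cases above can be sketched as a single substitution, assuming only the end of the string matters. The sub name to_network_eol is hypothetical, and the pattern uses an optional alternation anchored with \z rather than the look-behind assertions mentioned in this post, so treat it as one possible approach rather than the known-good answer being asked for.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not from the thread).
# Replaces an optional trailing line-ending with CRLF, or appends CRLF
# when the string has no line-ending at all.
sub to_network_eol {
    my ($line) = @_;
    # Longer alternatives come first so "\015\012" is consumed as one unit.
    # \z anchors at the very end of the string, and there is no /g, so
    # exactly one substitution happens.
    $line =~ s/(?:\015\012|\012\015|\012|\015)?\z/\015\012/;
    return $line;
}

for my $in ("abc", "abc\012", "abc\015", "abc\015\012") {
    my $out = to_network_eol($in);
    # Print each byte in octal so the line-endings are visible.
    print join(' ', map { sprintf('\\%03o', ord($_)) } split(//, $out)), "\n";
}
```

    Using explicit octal escapes instead of "\n" in the pattern keeps the behaviour the same regardless of what "\n" means on the current platform. Embedded line-endings are left untouched because the match is anchored to the end of the string.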

    It seems like a single regex that used look-behind assertions properly would be able to do this, thereby avoiding messy conditional logic. It also seems like this is an oft-called-for requirement, and a well-tried solution is probably already known.

    I've had a couple of attempts at constructing the regex, but each one falls over in one of the three cases above, either adding an extra, unnecessary line-ending or omitting to add one.

    I hoped someone in the know could point me at the correct regex to use.

      I would do that this way:

      s/\015\012|\012\015|\012|\015/\015\012/g;

      Update: I got that order wrong, as BrowserUk suggests. I had the shorter versions on the left, and the longer on the right. The above is fixed.
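
      To illustrate why the order matters, here is a small sketch. The "shorter first" pattern is my reconstruction of the mistake described in the update, not the exact text that was originally posted, and the sample input is made up.

```perl
use strict;
use warnings;

my $input = "one\015\012two\012";

# Shorter alternatives first: "\015" matches on its own, then "\012"
# matches separately, so an existing CRLF gets doubled.
(my $short_first = $input) =~ s/\012|\015|\015\012|\012\015/\015\012/g;

# Longer alternatives first, as in the corrected substitution above.
(my $long_first = $input) =~ s/\015\012|\012\015|\012|\015/\015\012/g;

printf "short first: %d bytes\n", length $short_first;
printf "long first:  %d bytes\n", length $long_first;
```

      Perl tries alternatives left to right at each position and takes the first one that succeeds, so a two-character alternative never gets a chance if a one-character alternative listed before it already matches.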

      It should help you with things like this to take a good, hard look at the perlre section of the standard Perl docs. You should have a copy of the standard Perl docs on your system, but if not, perlre is on search.cpan.org too.



      Christopher E. Stith

        Isn't that always going to match \012 in preference to \012\015?

        Also, from the way I read the OP, they are only interested in fixing line-endings; ie. those at the end of the string rather than any embedded, but I could be wrong on that.


        Examine what is said, not who speaks.
        Silence betokens consent.
        Love the truth but pardon error.
      Update: Deleted this and put it as a response to the main question. Do I need to do something to get this reaped or can I just delete the text and the title?
Re^2: regex to add network line-endings if required.
by mr_mischief (Monsignor) on Feb 15, 2005 at 17:45 UTC
    I'm not the OP, but I took the two phrases to mean very different things.

    "platform specific line-ending" could be "\012" (LF) for Unix, "\015\012" (CR LF) or "\012\015" (LF CR) for DOS, or "\015" (CR) for at least older Macs. All of these, on their own platforms, are represented by the C library as "\n" (newline).

    "Network line-ending" means "\015\012", which is the standard for things like Telnet and HTTP.

    Any telnet or web implementation on any platform uses "\015\012" (carriage return followed by linefeed) as the end-of-line marker.

    See RFC 1945 for HTTP and RFC 854 (STD 0008) for Telnet for examples.
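
    As a rough illustration of what that means in practice, here is a sketch that builds an HTTP/1.0-style request with explicit octets instead of "\n". The path and the User-Agent value are made up for the example, and nothing is actually sent over a socket.

```perl
use strict;
use warnings;

# Explicit octets, independent of what "\n" means on the local platform.
my $CRLF = "\015\012";

# Hypothetical request for illustration; the path and header are assumptions.
my @lines = ('GET /index.html HTTP/1.0', 'User-Agent: example/0.1');

# Each line ends in CRLF, and an empty line (a bare CRLF) ends the headers.
my $request = join('', map { $_ . $CRLF } @lines) . $CRLF;

printf "%d bytes\n", length $request;
```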



    Christopher E. Stith
      precisely my point.