in reply to Re: regex to add network line-endings if required.
in thread regex to add network line-endings if required.

" You use both "platform specific line-ending" and "network line-ending." Is the use of two different phrases accidental or purposeful?"

Very purposeful.

The sub will receive strings for transmission across the network. The caller may have terminated the string with "\n" (which will be the platform-specific line-ending, dependent upon what the current platform is), or they may not have done so.

Before transmission, I need to:

  1. Add a network-specific line-ending (\015\012) if no line-ending is present.
  2. Convert whatever (platform-specific) line-ending is present to the network line-ending, if one is present.
  3. Leave the string alone, if the correct line-ending is already present.
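
As a rough illustration of the three cases above, here is a minimal sketch. It assumes that a trailing \015\012, \012 or \015 each counts as "a line-ending", and the sub name to_network_eol is made up for the example. It uses an optional group anchored with \z rather than a look-behind, so it is one possible approach, not necessarily the well-tried one.

```perl
use strict;
use warnings;

# Hypothetical helper: normalise only the trailing line-ending.
sub to_network_eol {
    my ($s) = @_;
    # Match one trailing line-ending (CRLF, LF or CR) or nothing at all,
    # anchored at the true end of the string, and replace it with CRLF.
    # Without /g only this single match at the end is substituted.
    $s =~ s/(?:\015\012|\012|\015)?\z/\015\012/;
    return $s;
}
```

With these assumptions, a string with no line-ending gains \015\012, a string ending in \012 or \015 has it converted, and a string already ending in \015\012 comes back unchanged. Embedded line-endings are not touched.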

It seems like a single regex that used look-behind assertions properly would be able to do this, thereby avoiding messy conditional logic. It also seems like this is an oft-called-for requirement, and a well-tried solution is probably already known.

I've had a couple of attempts at constructing the regex, but in each case it falls over in one of the above three cases, either adding an extra, unnecessary line-ending or omitting to add one.

I hoped someone in the know would point me at the correct regex to use?

Re^3: regex to add network line-endings if required.
by mr_mischief (Monsignor) on Feb 15, 2005 at 18:28 UTC
    I would do that this way:

    s/\015\012|\012\015|\012|\015/\015\012/g;

    Update: I got that order wrong, as BrowserUk suggests. I had the shorter versions on the left, and the longer on the right. The above is fixed.
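
    To show why the order matters: Perl tries alternatives left to right and takes the first one that matches at a given position, not the longest. A small sketch (the sub names short_first and long_first are made up for the comparison):

    ```perl
    use strict;
    use warnings;

    # Shorter alternatives first: \015 wins at the CR, and the following
    # \012 is then matched and replaced separately.
    sub short_first {
        my ($s) = @_;
        $s =~ s/\015|\012|\015\012|\012\015/\015\012/g;
        return $s;
    }

    # Longer alternatives first, as in the corrected substitution above.
    sub long_first {
        my ($s) = @_;
        $s =~ s/\015\012|\012\015|\012|\015/\015\012/g;
        return $s;
    }

    my $crlf = "a\015\012b";
    # short_first($crlf) yields "a\015\012\015\012b" (line-ending doubled)
    # long_first($crlf)  yields "a\015\012b"         (left as it was)
    ```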

    It should help you with things like this to take a good, hard look at the perlre section of the standard Perl docs. You should have a copy of the standard Perl docs on your system, but if not, perlre is on search.cpan.org too.



    Christopher E. Stith

      Isn't that always going to match \012 in preference to \012\015?

      Also, from the way I read the OP, they are only interested in fixing line-endings; ie. those at the end of the string rather than any embedded, but I could be wrong on that.
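
      A quick sketch of what I mean, wrapping the substitution in a sub (fix_all is just a name for this example):

      ```perl
      use strict;
      use warnings;

      # The global substitution applied to a copy of the string.
      sub fix_all {
          my ($s) = @_;
          $s =~ s/\015\012|\012\015|\012|\015/\015\012/g;
          return $s;
      }

      # fix_all("one\012two\012") yields "one\015\012two\015\012":
      #   the embedded \012 is rewritten as well as the trailing one.
      # fix_all("abc") yields "abc":
      #   nothing matches, so no line-ending is added.
      ```

      So if only the end of the string matters, the embedded conversion is unwanted, and the "add one if missing" case is not covered either.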


      Examine what is said, not who speaks.
      Silence betokens consent.
      Love the truth but pardon error.
        I updated my original post to have the shorter versions after the longer ones, to alleviate the problem you mentioned of matching the "\012" instead of "\012\015".

        As for only needing line endings, since each of these could be the marker for a line ending, one could only know the difference if the source of the original line is known. In the case that one knows the line ending which needs to be fixed, one would only need to fix that type of line ending. Since multiple types of line endings need to be addressed, there must not be an idea of "end of line" within the strings themselves other than the line ending sequences. If there's some other way to know what is a line, via some outside data, then things could be different. The whole idea of what is a line according to the application in question may require more thought, but that information has not been provided here.



        Christopher E. Stith
Re^3: regex to add network line-endings if required.
by tphyahoo (Vicar) on Feb 15, 2005 at 17:30 UTC
    Update: Deleted this and put it as a response to the main question. Do I need to do something to get this reaped or can I just delete the text and the title?