in reply to Re: CR-LF Newlines as 2 distinct characters
in thread CR-LF Newlines as 2 distinct characters

My problem has been integrating "my" newline sequence with the rest of the string, which also needs to get mangled. If I stringify it all as is, the CR-LF gets treated as a single character instead of being split with a \x00. If I pre-treat the newlines (s/\n/\x{0D}\x{00}\x{0A}/g), an extra \x00 gets inserted in between when I mangle it.
As you suggest, I'll try a more explicit approach.

"One is enough. If you are acquainted with the principle, what do you care for the myriad instances and applications?"
- Henry David Thoreau, Walden

Re^3: CR-LF Newlines as 2 distinct characters
by radiantmatrix (Parson) on May 18, 2006 at 21:20 UTC

    Ah, I see. I don't know exactly what you need to "mangle", but since you seem to have UTF-16-like representations of purely 8-bit chars, you might take a risk and do s[(?<!\x00)\x00][]g; on the way in [0] to make a "real ASCII" string. You can then mangle the 8-bit ASCII string comfortably.

    Then do something like this on the way out:

    $str = join( '', map{ "\x00$_" } split('',$str) );

    That should pad you appropriately. It's cheating, but it might work.

    [0]: The negative lookbehind is to make sure a "\x00\x00" doesn't get chopped away; it's untested, though.
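
    Put together, the round trip might look something like the sketch below. The helper names strip_pad and add_pad are made up for illustration, and it assumes every character in the input is 8-bit and preceded by exactly one \x00 pad byte. The lookbehind is written in front of the \x00 here, since a lookbehind placed after it would always see the \x00 that was just matched.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper names; input is assumed to be pure 8-bit chars,
    # each preceded by a \x00 pad byte.

    # On the way in: drop every \x00 that does not itself follow a \x00.
    sub strip_pad {
        my ($wide) = @_;
        (my $narrow = $wide) =~ s/(?<!\x00)\x00//g;
        return $narrow;
    }

    # On the way out: put a \x00 in front of every character again.
    sub add_pad {
        my ($narrow) = @_;
        return join '', map { "\x00$_" } split //, $narrow;
    }

    my $wide   = "\x00H\x00i\x00\x0D\x00\x0A";   # "Hi" followed by CR, LF
    my $narrow = strip_pad($wide);               # mangle this 8-bit string
    my $again  = add_pad($narrow);

    print $again eq $wide ? "round trip ok\n" : "round trip differs\n";
    ```

    CR and LF stay two separate characters this way, each with its own pad byte. It is still cheating: a string that contains real NUL characters won't necessarily survive the trip intact. If the data really is UTF-16BE, going through Encode's decode('UTF-16BE', ...) and encode('UTF-16BE', ...) would probably be the sturdier route.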

    <radiant.matrix>
    A collection of thoughts and links from the minds of geeks
    The Code that can be seen is not the true Code
    I haven't found a problem yet that can't be solved by a well-placed trebuchet