in reply to Re: Windows file read
in thread Windows file read

While this did preserve the CRLF at the end of each line, I still ran into the same problem as when I opened the file simply doing:

open(IN, '<:raw', $file) or die "Unable to open file: $file: $!";

The problem is that, despite setting $/ to "\x0D\x0A", the lines with a rogue LF (\x0A) are still split into two lines. I have used a hex editor to confirm that ONLY an LF is present at that position, so I am not misreading the data.

Sadly, this puts me right back at the original problem. I appreciate all your assistance so far, ikegami. I now know about layering the file modes, but unfortunately my line break problem still exists.
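
As a cross-check on the hex editor, the same question can be asked from within Perl. The following is only a minimal sketch, not code from this thread: the helper name bare_lf_offsets is hypothetical, and it assumes the file is small enough to read into memory in one go. It opens the file with the :raw layer and reports the byte offset of every LF that is not preceded by a CR.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the byte offsets of
# every LF (\x0A) that is not immediately preceded by a CR (\x0D).
# Assumption: the whole file fits in memory.
sub bare_lf_offsets {
    my ($path) = @_;

    open(my $fh, '<:raw', $path)
        or die("open<: $path: $!\n");
    local $/;                          # slurp mode: read everything at once
    my $data = <$fh>;
    close($fh);
    $data = '' unless defined $data;

    my @offsets;
    while ($data =~ /(?<!\x0D)\x0A/g) {
        push @offsets, pos($data) - 1; # pos() points just past the matched LF
    }
    return @offsets;
}

# Usage: perl bare_lf.pl FILE
if (@ARGV) {
    my @offsets = bare_lf_offsets($ARGV[0]);
    print "bare LF at byte offset $_\n" for @offsets;
    print "no bare LF found\n" unless @offsets;
}
```

If this reports no bare LF at the position where a line is being split, the bytes on disk are probably not what the hex editor showed earlier.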

Re^3: Windows file read
by ikegami (Patriarch) on May 01, 2006 at 17:15 UTC
    It works for me. Show your code, please. Mine is the following:
        {
            open(my $fh, '>:raw', 'file')
                or die("open>: $!\n");
            print $fh ("abc\x{0D}\x{0A}de\x{0A}fg\x{0D}\x{0A}");
            #           [------5------][---------7----------]
            #           [------5------][---3--][------4-----]
        }

        {
            open(my $fh, '<:raw', 'file')
                or die("open<: $!\n");
            local $/ = "\x0D\x0A";
            print length, "\n" while <$fh>;
        }
    outputs

        5
        7

    If you remove the assignment to $/, the output is

        5
        3
        4
Re^3: Windows file read
by thedoe (Monk) on May 01, 2006 at 19:02 UTC

    Oddly enough, when I extracted the hex information to a simple text file containing only a problem line and one line before and after it, the same code I was having a problem with worked.

    I am now re-running on the much larger, original file. If I run into another problem now, though, I will know that there must be some type of extra character which, for some reason or another, is not being reported by my hex editor.

    Thank you again for your help, ikegami. I will update this post with the results of the larger run.

    Update: Unfortunately, the source file I have is still giving me this problem. I am looking into what could be doing this. I have extracted the lines around it into a temporary file, but do not have this problem with the temp file. It seems to only happen in the main source. Thank you again for your help, as I now know where I need to look for the (hopeful) solution to my dilemma.

    Update 2: Wow... after spending two days on this, I have just learned that someone modified the input file after I had looked at it in hex, adding a true line break in that position. Why? I have no idea. But the mystery has finally been solved. Thank you again to ikegami for pointing me back towards where I had been looking. At least now I know I'm not too crazy.