omadawn has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to read from a text file and process each line. The problem is that some of these lines may end up containing a bit of binary data (it's a log file from someone else's code), say:
11/22/2004 04:40:03 - AD [useraccountcontrol]: nnn
11/22/2004 04:40:03 - AD [userparameters]: m: +d PCtxCfgPrese
11/22/2004 04:40:03 - AD [userprincipalname]:
None of my Windows text editors have any real problem displaying the file; I just get garbage like the above in it. However, when I try to read it from a Perl filehandle with @array = <$filehandle>; it treats the second line above (userparameters) as the last line.

Replies are listed 'Best First'.
Re: Problems with binary data in text documents
by waswas-fng (Curate) on Nov 22, 2004 at 20:25 UTC
    If this is on Windows, are you setting binmode?


    -Waswas
      One thing I don't get is that, while I still seem to have access to individual lines, the $ anchor in a regex doesn't do what I expect it to.
      open(my $fh, '<', $path_to_file) or die "Cannot open $path_to_file: $!";
      binmode($fh);
      my @log = <$fh>;
      # Walk the log from the end, stopping at the line of interest.
      while (defined(my $line = pop @log)) {
          last if ($line =~ /^.*\ -\ line I'm looking for$/);
      }
      had to become:
      last if ($line =~ /^.*\ -\ line I'm looking for/);
      perldoc -f binmode ...reading.... I am now! Thanks.

        Some relevant details...

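        If DOS/Windows encounters chr(26) in a text file, it considers that the end of the text file. Under binmode, chr(26) is not treated specially. However, \r\n is not converted to \n under binmode either, so every line you read still ends in \r\n. That is why $ stopped matching: it matches just before the trailing \n, but the \r now sits between your text and that \n. You may want to set $/ to "\015\012".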
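
To make the advice above concrete, here is a minimal sketch rather than a definitive fix. It assumes the log uses CRLF line endings throughout. The sample file contents and the helper name read_log_lines are made up for illustration and are not from the original post. It reads the file under binmode with $/ set to "\015\012", so chomp strips the whole CRLF and the original $-anchored regex works again.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Build a small sample log with CRLF line endings and an embedded chr(26).
# These contents are invented for the example.
my ($out, $path) = tempfile(UNLINK => 1);
binmode($out);
print {$out} "11/22/2004 04:40:03 - AD [useraccountcontrol]: nnn\015\012",
             "11/22/2004 04:40:03 - AD [userparameters]: m:\x1A+d\015\012",
             "11/22/2004 04:40:03 - line I'm looking for\015\012";
close($out) or die "Cannot close $path: $!";

# Hypothetical helper: slurp a CRLF-terminated file as raw bytes.
sub read_log_lines {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    binmode($fh);              # no CRLF translation; chr(26) is ordinary data
    local $/ = "\015\012";     # a record ends at CRLF
    my @lines = <$fh>;
    close($fh);
    chomp(@lines);             # chomp removes the current $/, i.e. the full CRLF
    return @lines;
}

my @log = read_log_lines($path);

# Search from the end of the log, as in the original loop.
my ($hit) = grep { /^.*\ -\ line I'm looking for$/ } reverse @log;

print scalar(@log), " lines read\n";
print "found: $hit\n" if defined $hit;
```

If the file might mix CRLF and bare LF endings, an alternative is to leave $/ alone and strip the endings yourself with s/\015?\012\z// on each line.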