BaldGorilla has asked for the wisdom of the Perl Monks concerning the following question:

In the example below, $file is lines of data that can contain a null, \x00 in hex, within a string. When the lines of data are written out to $new_file, the null value is being replaced by a regular space, \x20 in hex. I tried opening both files with :utf8, but I don't even know if that has anything to do with the problem. I want to write the lines of data as-is from $file to $new_file. Any help is greatly appreciated.
use strict;
use warnings;
use Data::Dumper;

my $file = $ARGV[0];
my $new_file = $file . '.new';

open (my $fh, "<", $file) or die $!;
open (my $new_fh, ">", $new_file) or die $!;

while (my $line = <$fh>){
    print $new_fh "$line";
}

close $fh;
close $new_fh;

Replies are listed 'Best First'.
Re: Null \x00 being replaced by space \x20
by Athanasius (Archbishop) on Nov 10, 2018 at 03:38 UTC

    Hello BaldGorilla, and welcome to the Monastery!

    I assume the line print $new_gpd_fh "$line"; is meant to be print $new_fh "$line";? With this change made, I cannot reproduce your problem.

    Specifically, I created a text file, then used a hex editor to insert a null (\x00) at a suitable place in the text, and ran the script. The resulting new file has a null character in the correct place, as expected. (Confirmed both in the hex editor and in Notepad++, which displays a NUL control character.)

    Are you sure that your new file contains a space character in place of the null? Or could the space you are seeing be an artefact of the way you are viewing the file contents?

    Cheers,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

      Yes, I fixed the typo, sorry about that. I'm viewing the files in UltraEdit and Notepad++, and I see a null in the original and a space in the new file. Is there any way to send screen shots?
        Is there any way to send screen shots?

        Better just to get perl to tell you what's in the file.
        As Athanasius suggested, I suspect that the file really does contain a NULL, and the tool you're using to look at it converts it to a space.

        On Windows 7 (perl-5.28.0), I created a file (null.txt) by running:
        C:\_32> perl -e "$s = 'x' . chr(0) . 'y';print $s;" >null.txt
        I then examined the contents of null.txt by running:
        C:\_32> perl -le "open RD, '<', 'null.txt'; $x = <RD>; chomp $x; print ord substr($x, 0, 1); print ord substr($x, 1, 1);print ord substr($x, 2, 1); print length $x;"
        120
        0
        121
        3
        After using your script to create null.txt.new, I then ran the same one-liner to ascertain the contents of null.txt.new:
        C:\_32> perl -le "open RD, '<', 'null.txt.new'; $x = <RD>; chomp $x; print ord substr($x, 0, 1); print ord substr($x, 1, 1);print ord substr($x, 2, 1); print length $x;"
        120
        0
        121
        3
        So it looks to me that I also am unable to reproduce the problem.

        Cheers,
        Rob
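The "let perl tell you what's in the file" check above can be generalised into a small script. This is only a sketch: the sub names `hex_bytes` and `slurp_raw` and the script name in the usage comment are made up for illustration. It reads each file named on the command line with the `:raw` layer, so nothing is translated on the way in, and prints every byte as two hex digits. A null shows up as `00` and a space as `20`, regardless of how an editor chooses to display them.

```perl
use strict;
use warnings;

# Return the bytes of a string as space-separated two-digit hex codes.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', unpack '(H2)*', $bytes;
}

# Read a whole file with no I/O layers applied
# (no CRLF translation, no character decoding).
sub slurp_raw {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;                 # slurp mode: read the file in one go
    my $bytes = <$fh>;
    close $fh;
    return $bytes;
}

# Hypothetical usage: perl hexbytes.pl null.txt null.txt.new
print "$_: ", hex_bytes(slurp_raw($_)), "\n" for @ARGV;
```

Running it on both the original and the `.new` file and comparing the two lines of output should show directly whether the byte at the position in question is `00` or `20`.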
Re: Null \x00 being replaced by space \x20
by BillKSmith (Monsignor) on Nov 10, 2018 at 05:00 UTC
    I am also unable to duplicate your problem. I displayed my input and output files with the hex dump utility xxd (supplied with vim), and I compared the two files with the Windows file compare utility fc.
    $xxd null_test.txt
    00000000: 736f 6d65 2074 6578 7400 6d6f 7265 2074  some text.more t
    00000010: 6578 740d 0a0d 0a                        ext....
    $perl BaldGorilla.pl null_test.txt
    $fc null_test.txt null_test.txt.new
    Comparing files null_test.txt and NULL_TEST.TXT.NEW
    FC: no differences encountered

    $xxd NULL_TEST.TXT.NEW
    00000000: 736f 6d65 2074 6578 7400 6d6f 7265 2074  some text.more t
    00000010: 6578 740d 0a0d 0a                        ext....
    $
    Bill
Re: Null \x00 being replaced by space \x20
by Anonymous Monk on Nov 10, 2018 at 06:43 UTC
    The :crlf IOLayer might have something to do with this, since you are probably on Windows. Are there any changes if you open your files with :raw appended to the second argument?

      :crlf transforms 0D 0A into 0A on read, and 0A into 0D 0A on write. That's it.
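For anyone who wants to try the `:raw` suggestion anyway, here is a minimal sketch of the original script with `:raw` appended to both open modes. The sub name `copy_raw` is invented for this example. As noted above, `:crlf` only touches CR/LF pairs, so this is not expected to change how a null byte is handled; what it does guarantee is that line endings are copied untouched on Windows as well.

```perl
use strict;
use warnings;

# Copy $src to $dst byte for byte. The :raw layer disables CRLF
# translation (and any encoding layer) on both handles.
sub copy_raw {
    my ($src, $dst) = @_;
    open my $in,  '<:raw', $src or die "Cannot read $src: $!";
    open my $out, '>:raw', $dst or die "Cannot write $dst: $!";
    while (my $line = <$in>) {
        print {$out} $line;
    }
    close $in;
    close $out or die "Cannot close $dst: $!";
    return;
}

# Same calling convention as the script in the question:
# the input file is the first argument, output goes to "<input>.new".
copy_raw($ARGV[0], "$ARGV[0].new") if @ARGV;
```

If the output of this version still shows a space where the input has a null, the substitution is happening somewhere other than in Perl's I/O layers.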