Re: Is there a way to open a memory file with binmode :raw?

I don't have MSWin available, but I can fake it sufficiently for this test with:

use open IO => ':crlf';

Here are four ways to do what you want (plus a fifth that uses none of them, just to show what happens otherwise). There may, of course, be other ways I didn't think of.

#!/usr/bin/env perl -l

use strict;
use warnings;
use autodie;
use open IO => ':crlf';

my ($f, $fh); 
my $mf = \$f;
my $test_file = 'pm_1144333_test_file.txt';

open $fh, '>', $mf;
print $fh 'hello';
close $fh;
print_raw_1('mem:  ', $mf);
print_raw_2('mem:  ', $mf);
print_raw_3('mem:  ', $mf);
print_raw_4('mem:  ', $mf);
print_raw_5('mem:  ', $mf);

open $fh, '>', $test_file;
print $fh 'hello';
close $fh;
print_raw_1('file: ', $test_file);
print_raw_2('file: ', $test_file);
print_raw_3('file: ', $test_file);
print_raw_4('file: ', $test_file);
print_raw_5('file: ', $test_file);


sub print_raw_1 {
    my ($prompt, $file) = @_;

    open my $fh, '<:raw', $file;
    print '1. ', $prompt, unpack 'H*' while (<$fh>);
    close $fh;
}

sub print_raw_2 {
    my ($prompt, $file) = @_;

    open my $fh, '<', $file;
    binmode $fh, ':raw';
    print '2. ', $prompt, unpack 'H*' while (<$fh>);
    close $fh;
}

sub print_raw_3 {
    my ($prompt, $file) = @_;

    use open IN => ':raw';
    open my $fh, '<', $file;
    print '3. ', $prompt, unpack 'H*' while (<$fh>);
    close $fh;
}

sub print_raw_4 {
    my ($prompt, $file) = @_;

    use open IO => ':raw';
    open my $fh, '<', $file;
    print '4. ', $prompt, unpack 'H*' while (<$fh>);
    close $fh;
}

sub print_raw_5 {
    my ($prompt, $file) = @_;

    open my $fh, '<', $file;
    print '5. ', $prompt, unpack 'H*' while (<$fh>);
    close $fh;
}

Here's the output:

1. mem:  68656c6c6f0d0a
2. mem:  68656c6c6f0d0a
3. mem:  68656c6c6f0d0a
4. mem:  68656c6c6f0d0a
5. mem:  68656c6c6f0a
1. file: 68656c6c6f0d0a
2. file: 68656c6c6f0d0a
3. file: 68656c6c6f0d0a
4. file: 68656c6c6f0d0a
5. file: 68656c6c6f0a

And if I hadn't "faked it", i.e. had left out the use open IO => ':crlf'; line, I would just get:

1. mem:  68656c6c6f0a
2. mem:  68656c6c6f0a
3. mem:  68656c6c6f0a
4. mem:  68656c6c6f0a
5. mem:  68656c6c6f0a
1. file: 68656c6c6f0a
2. file: 68656c6c6f0a
3. file: 68656c6c6f0a
4. file: 68656c6c6f0a
5. file: 68656c6c6f0a

See also:

and, if you're interested in internals:

— Ken


Re^2: Is there a way to open a memory file with binmode :raw? by stevieb (Canon) on Oct 10, 2015 at 15:40 UTC
This is very informative, kcott, thanks :) However, it doesn't help me understand why, on Windows, printing to a file-based file handle writes `\n` into the file as `\r\n` by default (without any `binmode` or `IO` trickery; it just does so naturally), whereas printing to a memory-file-based handle does not: there, `\n` is written only as `\n`. I would expect the default OS record separator to be used regardless of the type of handle. I can't find anything that documents this discrepancy between a real file and printing the exact same thing to a scalar reference acting as a file handle. That, or I'm missing something very basic.
Re^3: Is there a way to open a memory file with binmode :raw? (\r, consistency) by tye (Sage) on Oct 10, 2015 at 16:28 UTC
The problem is only with your expectations. Did you know that, in Unix, writing "\n" also becomes "\r\n" by default, just not in ordinary files? For example, it does that when writing to a TTY (having the default configuration). Writing to a Perl scalar is not handled by the Windows clib, obviously, so there is no requirement that such writes emulate the default behavior of Windows' clib. "\r\n" is the default text record separator for Windows text files. - tye
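One way to see the difference described above is to ask Perl which PerlIO layers sit on each kind of handle. This is only a minimal sketch, not something posted in the thread: it uses the core `PerlIO::get_layers` function, and the file name and variable names are made up for illustration. The layer names reported depend on the platform and on how perl was built, so no output is shown.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A handle on a real file, opened with the platform's default layers.
# (The directory and file name are hypothetical, for illustration only.)
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/layers_demo.txt";
open my $file_fh, '>', $path or die "Cannot open $path: $!";

# A handle on an in-memory file, i.e. a reference to a scalar.
my $buffer = '';
open my $mem_fh, '>', \$buffer or die "Cannot open in-memory file: $!";

# PerlIO::get_layers returns the names of the layers on a handle.
my @file_layers = PerlIO::get_layers($file_fh);
my @mem_layers  = PerlIO::get_layers($mem_fh);

print "file:   @file_layers\n";
print "memory: @mem_layers\n";

close $mem_fh;
close $file_fh;
```

I would expect a `crlf` layer to show up only on the Windows file handle, with the memory handle reporting just a `scalar` layer, but that is worth checking on your own system.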
Re^3: Is there a way to open a memory file with binmode :raw? by BrowserUk (Patriarch) on Oct 10, 2015 at 18:05 UTC
CRLF translation is a feature of the Windows file systems, not the Perl language. The PerlIO layers emulate it when writing to the Windows file system. One fairly typical usage of memory files is to reduce IO overheads by accumulating lines together into a single scalar and then writing the entire file in one go. If Perl applied the CRLF translation when writing to the memory file, then when the scalar is written to the file, the file system (or file system emulation) would apply the CRLF translation a second time and you would end up with \r\r\n. Of course that could be avoided by applying the non-default :raw layer to the actual output file, or by applying binmode, but that means extra steps are required. Better to only apply CRLF translations when actually writing to actual file system files; then the default behaviours work together to produce the desired result.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :) In the absence of evidence, opinion is indistinguishable from prejudice.
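The double translation described above can be reproduced on any platform by applying the `:crlf` layer explicitly at both stages. This is a minimal sketch under that assumption, not code from the thread; the file name and variable names are hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Step 1: accumulate a line in a memory file with :crlf applied explicitly.
my $buffer = '';
open my $mem_fh, '>:crlf', \$buffer or die "Cannot open in-memory file: $!";
print {$mem_fh} "hello\n";
close $mem_fh;

# Step 2: write the accumulated scalar to a real file that also has :crlf.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/double_crlf_demo.txt";
open my $out_fh, '>:crlf', $path or die "Cannot open $path: $!";
print {$out_fh} $buffer;
close $out_fh;

# Step 3: read the file back untranslated and show the bytes as hex.
open my $in_fh, '<:raw', $path or die "Cannot open $path: $!";
my $bytes = do { local $/; <$in_fh> };
close $in_fh;

my $buffer_hex = unpack 'H*', $buffer;
my $file_hex   = unpack 'H*', $bytes;
print "buffer: $buffer_hex\n";   # 68656c6c6f0d0a   (hello \r \n)
print "file:   $file_hex\n";     # 68656c6c6f0d0d0a (hello \r \r \n)
```

Opening the output file with `'>:raw'` in step 2 instead would leave the buffer's single `\r\n` intact, which is the "extra step" mentioned above.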
Re^3: Is there a way to open a memory file with binmode :raw? by kcott (Archbishop) on Oct 11, 2015 at 02:21 UTC
Update: The accuracy of the information I linked to (perlport: Newlines) is in question. See tye's response to this node.

In Perl, `\n` is a logical newline. It *does not necessarily* represent the single ASCII character whose decimal value is `10`. Perhaps a read of "perlport: Newlines" will help clarify the situation for you. — Ken
Re^4: Is there a way to open a memory file with binmode :raw? ("\n") by tye (Sage) on Oct 11, 2015 at 07:26 UTC
Yeah, perlport has caused more wrong conclusions than enlightenment on newlines in my experience. For example:

"In Perl, \n is a logical newline. It does not necessarily represent the single ASCII character"

In Perl, "\n" is actually always exactly one character. On an ASCII system, it is also always ASCII linefeed... except for the single case of old Macs, which took the unprecedented route of being "almost ASCII". "\n" is not much more a "logical" newline than "a" is a logical letter A. "a" is also always exactly one character and is also not always the ASCII lower-case letter A. - tye
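The "exactly one character" point is easy to check for yourself. This is a small sketch with made-up variable names; the code point in the comment assumes an ASCII-based platform, which covers the systems discussed in this thread.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $newline = "\n";
my $crlf    = "\r\n";

# Inside a Perl string, "\n" is a single character; "\r\n" is two.
my $newline_length = length $newline;
my $crlf_length    = length $crlf;

# On an ASCII-based platform this is 10 (linefeed).
my $newline_code = ord $newline;

printf "length of \\n:   %d\n", $newline_length;
printf "length of \\r\\n: %d\n", $crlf_length;
printf "ord of \\n:      %d\n", $newline_code;
```

Any turning of that one character into two bytes happens later, in whatever layer sits on the handle being written to, not in the string itself.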
Re^5: Is there a way to open a memory file with binmode :raw? ("\n") by kcott (Archbishop) on Oct 12, 2015 at 03:44 UTC