
Thank you for the hint about the I/O layers. Very interesting. Because I had never used object-oriented programming in Perl and knew nothing about layers, I first had to do some reading to understand it.

Now I understand your code completely, and I tried it in my environment: it works. Then I tried to implement your suggestion of avoiding the corner case by using the crlf layer.

If I understand you right the solution is as follows:

package PerlIO::via::AnyCRLF;   # save as PerlIO/via/AnyCRLF.pm

sub PUSHED {
    my ($class) = @_;
    my $dummy;
    return bless \$dummy, $class;
}

sub FILL {
    my ($self, $fh) = @_;
    my $len = read $fh, my $buf, 4096;
    if (defined $buf) {
        $buf =~ s/\r/\n/g;
    }
    return $len > 0 ? $buf : undef;
}

1;

#!/usr/bin/perl

use strict;
use warnings;
use PerlIO::via::AnyCRLF;

open my $f, "<:crlf:via(AnyCRLF)", "le.txt" or die $!;

print while <$f>;
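
To try the script above, a test file with mixed line endings is needed. The following is only a sketch: the thread does not show what le.txt contains, so the three sample lines here are an assumption made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical test data: one line per line-ending convention.
my $file = "le.txt";
my $data = "unix line\n" . "dos line\r\n" . "old mac line\r";

# :raw keeps the bytes exactly as written, even on platforms whose
# default layers would translate "\n".
open my $out, ">:raw", $file or die "Cannot write $file: $!";
print {$out} $data;
close $out or die "Cannot close $file: $!";

# Read the bytes back untranslated to confirm what is on disk.
open my $in, "<:raw", $file or die "Cannot read $file: $!";
my $bytes = do { local $/; <$in> };
close $in;

printf "%d bytes, %d CR, %d LF\n",
    length($bytes), ($bytes =~ tr/\r//), ($bytes =~ tr/\n//);
```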

Greetings,

Dirk

Re^3: line ending troubles
by almut (Canon) on Dec 22, 2009 at 22:55 UTC
    If I understand you right the solution is as follows: ...

    Exactly.

    Maybe it's worth pointing out that when you have multiple layers, the order in which they are applied (which does matter here) is from left to right when reading, and from right to left when writing (which you aren't doing in this case, but it's good to know anyway :)

                  ----- reading ---->
    external side ":crlf:via(AnyCRLF)"
    (file)
                  <---- writing -----
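
    Why the order matters when reading can be sketched with two plain substitutions. This is only a model: the helper subs crlf_layer and any_crlf are hypothetical stand-ins that approximate what each layer does to the data, not the real layer code.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Model of what each layer does to the data on reading (an
    # approximation for illustration, not the real layer code).
    sub crlf_layer { my ($s) = @_; $s =~ s/\r\n/\n/g; return $s }  # CRLF -> LF
    sub any_crlf   { my ($s) = @_; $s =~ s/\r/\n/g;   return $s }  # CR   -> LF

    my $raw = "dos\r\nmac\runix\n";

    # ":crlf:via(AnyCRLF)": crlf sees the data first, then AnyCRLF.
    my $good = any_crlf(crlf_layer($raw));

    # ":via(AnyCRLF):crlf": AnyCRLF first, so each CRLF becomes LF LF.
    my $bad = crlf_layer(any_crlf($raw));

    for my $result ($good, $bad) {
        (my $shown = $result) =~ s/\n/\\n/g;   # make newlines visible
        print "$shown\n";
    }
    ```

    In this model the first ordering yields one LF per line, while the reversed ordering leaves an extra empty line after "dos".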

      Thank you again for your answer. The IO layers are a great construct. And it is good to know about them.

      But now I want to add the following behaviour to the AnyCRLF module:

      If the crlf layer is not on the stack, I want the AnyCRLF module to put it on the stack automatically, so that the module always works regardless of whether the user specified the crlf layer in the open call.

      But I have no idea how to achieve this goal. I tried to override the OPEN method as follows:

      sub OPEN {
          my ($self, $path, $mode, $fh) = @_;
          print "Path: " . $path . "\n";
          print "Mode: " . $mode . "\n";
          print "FH: " . $fh . "\n";
          open $fh, "<:crlf", $path;
      }

      My idea was to do an open with the crlf layer and thereby put this layer on the stack. But it does not work. For one thing, the only argument I get in the OPEN method is the path ("le.txt"); the mode and the fh are undefined.

      I would be very interested to know how to arrange that the AnyCRLF module is automatically putting the crlf-layer on the stack if it is not already available.

      Thank you very much

      Dirk

        ...that the AnyCRLF module is automatically putting the crlf-layer on the stack if it is not already available.

        Good question.  Like you, I couldn't figure out a way to get at the file handle of the lower layer in either OPEN or PUSHED.  OTOH, as you do have access to the handle in the FILL method, you could use binmode to compose the following hack:

        package PerlIO::via::AnyCRLF;

        sub PUSHED {
            my ($class) = @_;
            my $have_crlf;
            return bless \$have_crlf, $class;
        }

        sub FILL {
            my ($self, $fh) = @_;
            binmode $fh, ":crlf" unless $$self;
            $$self = 1;
            my $len = read $fh, my $buf, 4096;
            if (defined $buf) {
                $buf =~ s/\r/\n/g;
            }
            return $len > 0 ? $buf : undef;
        }

        1;

        Although this works, it doesn't feel right. For one, it violates the principle of least surprise...  as you can see with the debug prints before and after reading from the file handle.

        #!/usr/bin/perl

        use PerlIO::via::AnyCRLF;

        open my $f, "<:via(AnyCRLF)", "le.txt" or die $!;

        # debug
        # print "layers before reading :", join(':',PerlIO::get_layers($f)), "\n";   # :unix:perlio:via

        print while <$f>;

        # print "layers after reading :", join(':',PerlIO::get_layers($f)), "\n";   # :unix:perlio:crlf:via
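
        The effect the hack relies on, binmode changing the layer stack of a handle that is already open, can also be seen without the custom module. A minimal sketch, assuming a scratch file created with File::Temp (the file and its one-line content are made up for illustration); the exact layer names printed depend on the platform and the PERLIO setting:

        ```perl
        #!/usr/bin/perl
        use strict;
        use warnings;
        use File::Temp qw(tempfile);

        # Scratch file with one CRLF-terminated line (hypothetical test data).
        my ($tmp, $path) = tempfile(UNLINK => 1);
        binmode $tmp;
        print {$tmp} "line\r\n";
        close $tmp or die "Cannot close $path: $!";

        open my $fh, "<", $path or die "Cannot open $path: $!";

        my @before = PerlIO::get_layers($fh);
        binmode $fh, ":crlf" or die "binmode failed: $!";   # pushes crlf on top
        my @after  = PerlIO::get_layers($fh);

        print "before: @before\n";
        print "after:  @after\n";

        # With crlf on the stack, the CRLF pair is read as a single LF.
        my $line = <$fh>;
        print "line ends with a bare LF\n" if $line eq "line\n";
        ```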

        The other (similarly suboptimal) way would be to take care of opening a handle yourself:

        package PerlIO::via::AnyCRLF;

        sub PUSHED {
            my ($class) = @_;
            my $fh_ref;
            return bless \$fh_ref, $class;
        }

        sub FILL {
            my ($self) = @_;
            my $len = read $$self, my $buf, 4096;
            if (defined $buf) {
                $buf =~ s/\r/\n/g;
            }
            return $len > 0 ? $buf : undef;
        }

        sub OPEN {
            my ($self, $path) = @_;
            open $$self, "<:crlf", $path or die $!;
            # debug
            # print "AnyCRLF layers :", join(':',PerlIO::get_layers($$self)), "\n";
            return 1;
        }

        1;

        but in this case, the layer can no longer be properly stacked, as it always becomes the bottommost layer. (Might not be a problem in your particular case, though.)

        Unfortunately, the PerlIO::via docs leave a number of questions unanswered (e.g. what is OPEN supposed to return? why does it fail to pass a handle, even if there is in fact a lower layer?), and code samples using OPEN are hard to find.  Well, maybe the "is subject to change" note for the *OPEN methods is meant seriously, and no one has yet worked out in what way they should be changed :)