Nordikelt has asked for the wisdom of the Perl Monks concerning the following question:

I have a snippet of code the output of which makes no sense to me:
    #!/usr/bin/perl
    use strict;
    use warnings FATAL => 'all';

    while (reverse <DATA>) {
        warn defined($_) ? "defined\n" : "undefined\n";
        next unless defined($_);
        warn "$_\n";
    }
    exit 0;
    __DATA__
    this is line 1
    this is line 2
    this is line 3
The DATA is just for convenience. I see the same output is produced if I use STDIN, and that output is:
    undefined
    Use of uninitialized value in reverse at text.pl line 6, <DATA> line 3.

If I change the while loop to foreach, then I do not see this odd behavior, but in my real world code, I have a much larger data set, and I would like to keep the memory footprint lower (i.e., see related).

Can someone tell me what is actually happening here? Thanks much!

Replies are listed 'Best First'.
Re: Uninitialized value in reverse
by davido (Cardinal) on Apr 24, 2020 at 21:10 UTC

    Consider this: while(<DATA>) is about the same as while(defined($_ = <DATA>)). We should be able to agree that there are two pieces of magic going on here. The first is the implicit assignment to $_, and the second is the implicit defined test. perlop can help shed some light. The implicit assignment is interesting:

    If and only if the input symbol is the only thing inside the conditional of a while statement (even if disguised as a for(;;) loop), the value is automatically assigned to the global variable $_

    That explains one of the bugs in your code. You have more than just the input operator in your while(...) construct. Therefore, there is no implicit assignment to $_. Your loop's body expects there to be a useful value in $_, but there isn't, because it's not being assigned to.
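
    To see the difference side by side, here is a minimal sketch. The in-memory filehandle and the sub names are only illustrative assumptions, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: a fresh in-memory filehandle over two sample lines.
sub sample_fh {
    my $text = "line 1\nline 2\n";
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    return $fh;
}

# <$fh> alone in the condition: treated as while (defined($_ = <$fh>)).
sub lines_seen_with_bare_readline {
    my $fh = sample_fh();
    local $_;
    my @seen;
    while (<$fh>) {
        chomp;
        push @seen, $_;
    }
    return @seen;
}

# <$fh> wrapped in reverse: no implicit assignment, so $_ stays undefined.
sub topic_defined_with_wrapped_readline {
    my $fh = sample_fh();
    local $_;
    my $was_defined;
    while (reverse <$fh>) {
        $was_defined = defined $_ ? 1 : 0;
        last;    # leave after the first pass
    }
    return $was_defined;
}

print join(',', lines_seen_with_bare_readline()), "\n";    # line 1,line 2
print topic_defined_with_wrapped_readline(), "\n";         # 0
```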

    Next consider reverse. Perl's reverse is actually a rather bizarre operator in how it handles context. The context of a conditional expression in a while(COND) construct is scalar context. Therefore, you get this behavior, as described in reverse:

    reverse LIST ... in scalar context, concatenates the elements of LIST and returns a string value with all characters in the opposite order.

    The documentation provides this example:

    print scalar reverse "dlrow ,", "olleH"; # Hello, world

    All the lines get concatenated together, and then everything is reversed. So if you resolved the first bug, you would hit the second: instead of getting the lines in reverse order, you get the characters of the entire slurped-in file in reverse.
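
    A small sketch of the two contexts, using a hard-coded array in place of the file (the variable names are just for illustration):

```perl
use strict;
use warnings;

my @lines = ("this is line 1\n", "this is line 2\n", "this is line 3\n");

# List context: the order of the elements is reversed, each line is untouched.
my @by_line = reverse @lines;

# Scalar context: the elements are concatenated, then the characters reversed.
my $by_char = scalar reverse @lines;

print @by_line;        # line 3 first, line 1 last
print "$by_char\n";    # starts with a newline, then "3 enil si siht"
```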

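    The third bug is a little harder. Since you're pulling in the entire file in one fell swoop, your loop body will only execute once. On the second pass <DATA>, which is in list context, returns an empty list, and reverse with nothing to reverse in scalar context falls back to reversing $_. $_ was never assigned, which is where the "Use of uninitialized value in reverse" warning comes from; because warnings are FATAL, the script dies there.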
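
    What happens on that second pass can be reproduced in isolation. This sketch relies on the documented rule that reverse, given nothing to reverse in scalar context, reverses $_; the empty_list sub is a hypothetical stand-in for a filehandle that has hit end of file.

```perl
use strict;
use warnings;

# Stand-in for <DATA> at end of file in list context: it returns nothing.
sub empty_list { return () }

# With a defined $_, scalar reverse of an empty list reverses $_.
sub scalar_reverse_of_empty_list {
    local $_ = "topic";
    return scalar reverse empty_list();
}

# With an undefined $_, as in the question, the same call warns.
sub warning_for_undefined_topic {
    my $warning = '';
    local $SIG{__WARN__} = sub { $warning .= $_[0] };
    local $_;    # undefined
    my $reversed = reverse empty_list();
    return $warning;
}

print scalar_reverse_of_empty_list(), "\n";    # cipot
print warning_for_undefined_topic();
```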

    Your goal, I believe, is to read the lines in reverse order. There are two reasonable ways to do this, depending on how big the input file is. If the file never grows too big, just slurp the whole thing in: my @lines = reverse <DATA>. This invokes reverse in list context, so it leaves the character order within each line intact and reverses only the order of the lines. But if the file could grow large enough that memory is a consideration, use File::ReadBackwards. This example is a little awkward since File::ReadBackwards isn't really designed for reading the __DATA__ segment, but it should convey the simplicity of using this module:

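        #!/usr/bin/env perl
        use strict;
        use warnings;
        use FindBin qw($Bin $Script);
        use File::Spec::Functions qw(catfile);
        use File::ReadBackwards;

        # Read this script itself, last line first.
        # $Script is the script's basename, so the path stays valid even when $0 includes a directory.
        my $path = catfile($Bin, $Script);
        my $bw = File::ReadBackwards->new($path)
            or die "Can't read $path: $!\n";

        while ( defined( my $line = $bw->readline ) ) {
            last if $line =~ m/^__(?:END|DATA)__/;    # stop once the data segment is consumed
            print $line;
        }
        __DATA__
        1234
        5678
        9ABC
        DEFG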

    This produces:

        DEFG
        9ABC
        5678
        1234

    The File::ReadBackwards module reads the file in chunks of 8192 bytes, seeking from the end upward. Therefore, it never needs to hold the entire file in memory at once.
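
    To illustrate the general idea of seeking from the end upward, here is a rough sketch using only core Perl. It is not the module's actual implementation; backward_line_reader, the temporary file, and the tiny chunk size are assumptions made for the demonstration.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical helper (not File::ReadBackwards itself): returns a closure that
# yields the lines of $path from last to first, reading $chunk_size bytes at most
# per read.
sub backward_line_reader {
    my ($path, $chunk_size) = @_;
    open my $fh, '<:raw', $path or die "Can't open $path: $!";
    my $pos    = -s $fh;    # file offset where the buffered data begins
    my $buffer = '';        # data read so far but not yet returned

    return sub {
        while (1) {
            # Find the newline that ends the previous line, skipping the
            # newline that terminates the buffer itself.
            my $len = length $buffer;
            my $idx = $len >= 2 ? rindex($buffer, "\n", $len - 2) : -1;
            if ($idx >= 0) {
                my $line = substr($buffer, $idx + 1);
                substr($buffer, $idx + 1) = '';
                return $line;
            }
            if ($pos > 0) {
                # No complete line buffered yet: prepend the preceding chunk.
                my $size = $pos < $chunk_size ? $pos : $chunk_size;
                $pos -= $size;
                seek($fh, $pos, SEEK_SET) or die "seek failed: $!";
                my $got = read($fh, my $chunk, $size);
                die "read failed: $!" unless defined $got && $got == $size;
                $buffer = $chunk . $buffer;
                next;
            }
            # Start of file reached: whatever remains is the first line.
            return undef unless $len;
            my $line = $buffer;
            $buffer = '';
            return $line;
        }
    };
}

# Demonstration on a temporary file with the same four lines as above.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "1234\n5678\n9ABC\nDEFG\n";
close $tmp_fh or die "close failed: $!";

my $next = backward_line_reader($path, 4);    # tiny chunk to force several reads
my @out;
while (defined(my $line = $next->())) {
    push @out, $line;
}
print @out;
```

    Only the current buffer is held in memory, at the cost of a seek per chunk.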


    Dave

Re: Uninitialized value in reverse
by choroba (Cardinal) on Apr 24, 2020 at 19:41 UTC
    while doesn't iterate $_ over the expression. while (<>) is special: it's an abbreviation of
    while (defined($_ = <>))
    By using anything more than the diamond operator (or an assignment of it), the magic is gone. So, reverse <DATA> reverses DATA, but doesn't assign anything to $_.

    See also I/O Operators in perlop.
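
    As a quick check of the "assignment of it" case, this sketch counts lines with an explicit assignment in the condition. The sub name and sample text are made up for the example.

```perl
use strict;
use warnings;

sub count_lines_with_assignment {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my $count = 0;
    # Treated as while (defined(my $line = <$fh>)), so a final "0" line
    # without a trailing newline does not end the loop early.
    while (my $line = <$fh>) {
        $count++;
    }
    return $count;
}

print count_lines_with_assignment("first\n0"), "\n";    # 2
```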

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: Uninitialized value in reverse
by Fletch (Bishop) on Apr 24, 2020 at 19:55 UTC

    Never used it but remember seeing mention of PerlIO::reverse which might work on your real dataset.

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

Re: Uninitialized value in reverse
by 1nickt (Canon) on Apr 24, 2020 at 19:47 UTC

    I may be dense but I don't see how reverse can work with while to reduce memory footprint as it would need to know the full list before it can provide you the last element.

    Hope this helps!


    The way forward always starts with a minimal test.