gone2015 has asked for the wisdom of the Perl Monks concerning the following question:

UPDATE: I answered my own question here. And, the moral of the story is: those that live by $_ will die by $_ -- or, at least, are in danger of quite a nasty cut, requiring stitches and everything.


I was footling around looking at IO layers. So I set up some strings containing various test patterns, and used open FH, '<', \$foo to read each one -- so I could see what happened to each under different conditions.

The code below illustrates the general shape of what I was doing.

The interesting thing is that when I run this (v5.10.0) I get:

Fits: 5, 6, 7
 Read: 'Fit I'
 Read: 'Fit II'
 Read: 'Fit III'
Fits: 5, 6, 7
 Read: 'Fit I'
 Read: 'Fit II'
 Read: 'Fit III'
Fits: undef, undef, undef
...so, somehow or other the entries in the array @fits have been set undef during the second foreach. Uh ?

The only difference between the first and second foreach is the use (or not) of a named variable. This is a serious surprise to me ! (I tried: our and my work the same way.)

Using the string as a "file" is obviously material. I haven't spotted any indication that this "eats" the string being "input". But, even if it does, my working model for Perl's parameter passing is that the my ($what) = @_ makes a copy of $_[0] (not another alias for whatever $_[0] is an alias for)...
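A minimal sketch of that working model, for anyone who wants to check it. The sub names copy_arg and alias_arg are invented for illustration and are not part of the code in this post. Assigning from @_ takes a copy, whereas writing to $_[0] goes through the alias to the caller's variable.

```perl
use strict ;
use warnings ;

# Hypothetical helpers, invented for this sketch.
sub copy_arg  { my ($what) = @_ ; $what = 'changed' ; return $what ; }   # works on a copy
sub alias_arg { $_[0] = 'changed' ; return $_[0] ; }                     # writes through the alias

my $copied = 'Fit I' ;
copy_arg($copied) ;
print "after copy_arg:  $copied\n" ;    # caller's value is untouched

my $aliased = 'Fit I' ;
alias_arg($aliased) ;
print "after alias_arg: $aliased\n" ;   # caller's value has been overwritten
```

If the model holds, the copy protects the caller's string from anything done to $what. It offers no protection to other variables the caller may be relying on, such as globals.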

So who mangled the entries in @fits, and was it with the lead pipe or the rope ?

Colour me curious -- any colour you like.

UPDATE: sorry ! The penny just dropped. It's a side effect of the file reading setting $_ to undef... and $_ is global, of course. Silly me. I'll get my coat.

use strict; use warnings;

my $file_a = "Fit I" ;
my $file_b = "Fit II" ;
my $file_c = "Fit III" ;

my @fits = ($file_a, $file_b, $file_c) ;

show_fits() ;
foreach my $f (@fits) { suck_this($f) ; } ;
show_fits() ;
foreach (@fits) { suck_this($_) ; } ;
show_fits() ;

sub show_fits {
  print "Fits: ", join(', ', map { defined($_) ? length($_) : 'undef' } @fits), "\n" ;
} ;

sub suck_this {
  my ($what) = @_ ;
  open my $FH, '<', \$what ;
  print " Read:" ;
  while (<$FH>) {
    s/\r/\\r/g ;
    s/\n/\\n/g ;
    print " '$_'" ;
  } ;
  print "\n" ;
} ;

Replies are listed 'Best First'.
Re: Just when you thought you'd got it: 'foreach $f (@foo)' vs 'foreach (@foo)'
by ikegami (Patriarch) on Oct 15, 2008 at 15:42 UTC
    while (<$FH>) modifies global variable $_. Whenever you modify a global variable, localize it!
    sub suck_this {
      my ($what) = @_ ;
      open my $FH, '<', \$what ;
      print " Read:" ;
      local $_;
      while (<$FH>) {
        s/\r/\\r/g ;
        s/\n/\\n/g ;
        print " '$_'" ;
      } ;
      print "\n" ;
    } ;

      I personally believe that it would be very, very fussy of me to mention it, but just for completeness, it has to be said that there is a well-known bug with localising $_ and tied objects, which (I just tested) is still there in 5.10.0, although I had hoped it would have been fixed by now. But then, of course, one need not worry unless she's using tied objects...

      --
      If you can't understand the incipit, then please check the IPB Campaign.

        Worse. There are two bugs associated with local $var and neither is specific to $_. I wrote about them.

Re: Just when you thought you'd got it: 'foreach $f (@foo)' vs 'foreach (@foo)'
by wol (Hermit) on Oct 15, 2008 at 12:59 UTC
    So who mangled the entries in @fits, and was it with the lead pipe or the rope ?
    Clearly, it was oshalla, but with the candlestick. Luckily he was acquitted on a technicality.

    --
    .sig : File not found.

Re: Just when you thought you'd got it: 'foreach $f (@foo)' vs 'foreach (@foo)'
by repellent (Priest) on Oct 15, 2008 at 17:35 UTC
      I would recommend the idiom:
      while (local $_ = <$FH>)

      This is certainly best practice -- it stops the while from mangling the caller's precious $_.

      Unfortunately, being a good citizen doesn't prevent some mad person from running amuck with an Uzi and blowing away the firstborn.

      Consider:

      use strict ;
      use warnings ;

      my $mock = "I\nII\nIII\nIV" ;
      open my $FH, '<', \$mock ;

      $_ = 'firstborn' ;
      while (local $_ = <$FH>) {
        chomp ;
        print "$_=" ;
        tr/IV/iv/ ;
        amuck() ;
        print "$_ " ;
      } ;
      if (!defined($_)) { $_ = '*undef*' ; } ;
      print " -- $_\n" ;

      sub amuck { $_ = '!!!' } ;
      which gives:
        I=!!! II=!!! III=!!! IV=!!!   -- firstborn
      
      so we managed to keep the firstborn safe, but the loop is littered with bodies.

      Of course, without the local $_, ie: the common or garden:

      while (<$FH>)
      everybody gets it:
         I=!!! II=!!! III=!!! IV=!!!   -- *undef*

      Better is to lexically scope the $_, thus:

      while (my $_ = <$FH>)
      so that we can still implicitly use $_ in the loop, AND it now works (hurrah!):
         I=i II=ii III=iii IV=iv   -- !!!
      OK. The firstborn has gone, but that serves you right for leaving it unattended outside in the push-chair.

        The use of my on Perl special variables didn't come till a recent version of Perl, I believe.

        Nevertheless, I think it is a matter of usage whether to lexically scope my $_ within while or dynamically scope it local $_. I can foresee having an amuck() function wanting to read $_ instead of clobbering it.
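A small sketch of that read-only case, with local $_ in the loop. The sub name peek is made up here. It stands in for an amuck() that only looks at $_ rather than assigning to it. Because local gives the global $_ a temporary value, any sub called from inside the loop sees the current line.

```perl
use strict ;
use warnings ;

# Hypothetical reader: it only inspects the global $_, it never assigns to it.
sub peek { return defined $_ ? "saw '$_'" : 'saw undef' ; }

my $mock = "I\nII\nIII" ;
open my $FH, '<', \$mock or die "Cannot open in-memory file: $!" ;

$_ = 'firstborn' ;
my @seen ;
while (local $_ = <$FH>) {
  chomp ;
  push @seen, peek() ;      # peek() sees the localized line through the global $_
}
close $FH ;

print join(', ', @seen), " -- $_\n" ;
```

With a named lexical in the loop instead, peek() would see whatever the global $_ held outside the loop, so the choice depends on whether the called code is meant to share the line.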

        When I see code like $_ = ... alarm bells in me start ringing! ... and I drone into the need to localize ... :)

      I was about to tell you that I'm not sure whether I would recommend it because the

      while (<>) {...}

      is associated with quite a lot of dwimmy magic that would vanish if you amend the construct that way, but when I was about to give you an example it turned out that it was actually a counter-example showing I was plainly wrong:

      pilsner:~ [11:58:42]$ perl -MO=Deparse -e 'while (<>) {}'
      while (defined($_ = <ARGV>)) {
          ();
      }
      -e syntax OK
      pilsner:~ [11:59:05]$ perl -MO=Deparse -e 'while (local $_=<>) {}'
      while (defined(local $_ = <ARGV>)) {
          ();
      }
      -e syntax OK

      Thus perl is smart enough to still add the definedness test even in that case: I stand corrected, and you taught me something new. I am fairly sure this was not the case with all perls, but I have now tested with 5.8.8, which is the oldest one I currently have access to, and it supports this feature. I don't know when it was actually introduced. I used to believe that it would only work with a plain variable assignment, but not if a local (or anything else) was prepended to it...

      I still feel like claiming that in most cases using while (<>) {...} responsibly would be enough.

      --
      If you can't understand the incipit, then please check the IPB Campaign.
        Yes, the dwimmy magic that adds the definedness test is still present. That's one thing I noticed, much to my relief.

        The issue with while (<>) { ... } is that the while clobbers $_. This makes debugging harder especially when it is used in some function/module somewhere else.

        Sins of Perl Revisited explains that module authors must be educated to localize $_ before changing it.

        If by "most cases", you mean not using while (<>) { ... } in a function/module, then yeah, I suppose it would be enough. But who's to say that some day, the existing code won't be packaged up wholesale and used elsewhere, causing interesting bugs ;-)

Re: Just when you thought you'd got it: 'foreach $f (@foo)' vs 'foreach (@foo)'
by blazar (Canon) on Oct 16, 2008 at 10:38 UTC

    I personally believe that perhaps this is a good situation to remind yet another time that 5.10.0 supports a lexical $_: if I only change

    foreach (@fits) { suck_this($_) ; } ;
    in your code to
    show_fits() ;
    {
      my $_;
      foreach (@fits) { suck_this($_) ; } ;
    }

    then I get

    C:\temp>perl osha.pl
    Fits: 5, 6, 7
     Read: 'Fit I'
     Read: 'Fit II'
     Read: 'Fit III'
    Fits: 5, 6, 7
     Read: 'Fit I'
     Read: 'Fit II'
     Read: 'Fit III'
    Fits: 5, 6, 7

    Here, I limited the lexical $_ to the scope closest to the loop. But if I like it, I may declare it once and for all for the whole script, say at the top of it. Of course, I may also do the same thing for one loop only:

    while (my $_ = <$fh>) { ... }

    but I don't like to assign explicitly to $_, and in that case I would probably use a named lexical variable.
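A sketch of that named-lexical alternative, applied to the suck_this from the original post. The variable name $line is chosen here for illustration, and the sub returns its text instead of printing it, purely so the result is easy to inspect. It also sidesteps my $_, which may not be available in every perl.

```perl
use strict ;
use warnings ;

# Variant of suck_this from the original post; $line is a name chosen for this sketch.
sub suck_this {
  my ($what) = @_ ;
  open my $FH, '<', \$what or die "Cannot open in-memory file: $!" ;
  my $out = " Read:" ;
  while (my $line = <$FH>) {        # perl still adds the defined() test here
    $line =~ s/\r/\\r/g ;
    $line =~ s/\n/\\n/g ;
    $out .= " '$line'" ;
  }
  close $FH ;
  return "$out\n" ;
}

my @fits = ("Fit I", "Fit II", "Fit III") ;
foreach (@fits) { print suck_this($_) ; }     # $_ is aliased to each element of @fits
print "Fits: ", join(', ', map { defined($_) ? length($_) : 'undef' } @fits), "\n" ;
```

Since the loop never touches $_, the elements of @fits that foreach has aliased to it are left alone, and the last line should report the lengths rather than undef.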
