jsrn has asked for the wisdom of the Perl Monks concerning the following question:

Explaining the redo operator, The Camel (3rd edition, top of page 122) presents the following code snippet:
# Taken from The Camel, 3rd ed.
while (<>) {
    chomp;
    if (s/\\//) {
        $_ .= <>;
        redo unless eof;    # don't read past each file's eof
    }
    # now process $_
}
What's the effect of the redo unless eof statement? Certainly it's *not* preventing the code from reading past each file's eof (as the comment suggests). Consider an input file like this (presuming that the last visible character of each line of the file is immediately followed by a newline):
# file_1 (this line doesn't belong to the file's content)
Line one\
Line two
Packaging the above code snippet in a file named readbackslash.pl, making this file executable (not forgetting to insert the shebang, of course) and executing it with the input file as an argument (./readbackslash.pl file_1) has the following effect. After the conditional of the while loop has been evaluated, $_ holds "Line one\" followed by a newline. It is then chomped and the trailing backslash is removed. The next line is appended, so $_ now contains "Line oneLine two\n". Reading once more from file_1 would return end-of-file, so eof returns true and the loop is not redone. That means $_ is not chomped a second time, and the newline stays in place.

Thus, the effect of the redo unless eof statement is to prevent the chomping of the last newline of the file if the second-to-last line ends in a backslash. *This* is the effect of the redo unless eof statement. It does *not* prevent the loop from crossing file boundaries; just consider the following two input files:
# file_2 (this line doesn't belong to the file's content)
Line three
Line four\

# file_3 (this line doesn't belong to the file's content)
Line five
Line six
Executing the script as readbackslash.pl file_2 file_3, $_ successively becomes "Line three", "Line fourLine five" (file boundary clearly crossed), and "Line six".

IMHO, the code as presented in The Camel is not very useful (but, on the other hand, as I'm not very experienced yet, I could easily have missed something). I would rewrite the above snippet as:
# My code
while (<>) {
    chomp;
    if (s/\\//) {
        unless (eof) {    # don't read past each file's eof
            $_ .= <>;
            redo;
        }
    }
    # now process $_
}
This makes the code match its comment, and it prevents $_ from being chomped in some cases and not in others.
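
The walkthrough above can be checked with a small self-contained script. This is only a sketch: the sub names camel_loop, rewritten_loop and show are made up for illustration, the three input files are recreated in a temporary directory, and @ARGV is localized inside each sub so that <> reads those files rather than real command-line arguments.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch copies of the three input files from the post.
my $dir = tempdir(CLEANUP => 1);
my %content = (
    file_1 => "Line one\\\nLine two\n",
    file_2 => "Line three\nLine four\\\n",
    file_3 => "Line five\nLine six\n",
);
my %path;
for my $name (sort keys %content) {
    $path{$name} = "$dir/$name";
    open my $fh, '>', $path{$name} or die "Cannot write $path{$name}: $!";
    print {$fh} $content{$name};
    close $fh or die "Cannot close $path{$name}: $!";
}

# The loop as quoted from the Camel. Returns every value of $_ that
# reaches the "now process $_" point.
sub camel_loop {
    local @ARGV = @_;    # hypothetical stand-in for the command line
    local $_;
    my @seen;
    while (<>) {
        chomp;
        if (s/\\//) {
            $_ .= <>;
            redo unless eof;
        }
        push @seen, $_;
    }
    return @seen;
}

# The rewritten loop: test eof *before* appending the next line.
sub rewritten_loop {
    local @ARGV = @_;
    local $_;
    my @seen;
    while (<>) {
        chomp;
        if (s/\\//) {
            unless (eof) {
                $_ .= <>;
                redo;
            }
        }
        push @seen, $_;
    }
    return @seen;
}

# Print a labelled list of values with newlines made visible as \n.
sub show {
    my ($label, @values) = @_;
    my @quoted = map { (my $v = $_) =~ s/\n/\\n/g; qq{"$v"} } @values;
    print "$label: ", join(', ', @quoted), "\n";
}

show('Camel,   file_1       ', camel_loop($path{file_1}));
show('Camel,   file_2 file_3', camel_loop(@path{qw(file_2 file_3)}));
show('Rewrite, file_1       ', rewritten_loop($path{file_1}));
show('Rewrite, file_2 file_3', rewritten_loop(@path{qw(file_2 file_3)}));
```

With the Camel loop, file_1 should yield a single un-chomped "Line oneLine two\n", and file_2 file_3 should yield "Line fourLine five" as the second value. With the rewritten loop, every value should come out chomped and "Line four" should stay separate from "Line five".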

I looked at the errata page of The Camel but I didn't find anything concerning this issue.

Any explanations -- Or have I completely missed something?

Aside: I have looked through the perlsyn manpage and found the same example as in The Camel, except that eof() is used instead of eof. AFAIK, this shows the same behaviour as described above (occasionally leaving the last line of a file un-chomped), but only for the last file in @ARGV.
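
A small sketch for comparing the two spellings, under the same assumptions as the example files in this post: the helper names write_file, loop_with_eof, loop_with_eof_parens and show are hypothetical, the input files are recreated in a temporary directory, and @ARGV is localized so that <> reads them.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# Create a scratch input file and return its path.
sub write_file {
    my ($name, $text) = @_;
    my $path = "$dir/$name";
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} $text;
    close $fh or die "Cannot close $path: $!";
    return $path;
}

my $file_1 = write_file('file_1', "Line one\\\nLine two\n");
my $file_3 = write_file('file_3', "Line five\nLine six\n");

# Camel spelling: plain eof tests the file that was read last.
sub loop_with_eof {
    local @ARGV = @_;
    local $_;
    my @seen;
    while (<>) {
        chomp;
        if (s/\\//) {
            $_ .= <>;
            redo unless eof;
        }
        push @seen, $_;
    }
    return @seen;
}

# perlsyn spelling: eof() tests the end of the whole @ARGV list.
sub loop_with_eof_parens {
    local @ARGV = @_;
    local $_;
    my @seen;
    while (<>) {
        chomp;
        if (s/\\//) {
            $_ .= <>;
            redo unless eof();
        }
        push @seen, $_;
    }
    return @seen;
}

# Print a labelled list of values with newlines made visible as \n.
sub show {
    my ($label, @values) = @_;
    my @quoted = map { (my $v = $_) =~ s/\n/\\n/g; qq{"$v"} } @values;
    print "$label: ", join(', ', @quoted), "\n";
}

show('eof,   file_1 file_3', loop_with_eof($file_1, $file_3));
show('eof(), file_1 file_3', loop_with_eof_parens($file_1, $file_3));
show('eof(), file_3 file_1', loop_with_eof_parens($file_3, $file_1));
```

If the reading of eof() above is right, only the first and third calls should leave "Line oneLine two\n" un-chomped: plain eof does so whenever file_1 is read, while eof() does so only when file_1 comes last.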

Jonas

Edited 2001-01-21 by Ovid. View source to see details.

Replies are listed 'Best First'.
Re: Comment not matching code in a href
by trs80 (Priest) on Jan 22, 2002 at 06:10 UTC
    I think that, given the context of the example in the Camel, the behavior is correct. I do think it is a poor example for someone who is new to Perl.
    Just before the example, the book says to suppose we had a file in which a \ at the end of a line indicates that it is continued on the next line. This is common in shell scripts. Your examples assume that there is no space before the \ character, which is not typical in shell scripts. So perhaps a better example file would be:
    find ./ -name igloo.txt | \
    xargs ls -l $1 | \
    grep 'Dec'
    Now that is a pointless shell script, but it illustrates the use of the \ at the end of a line, which I think might be what the example in the book is referring to.
    A more readable example would be:
    Hello Cold Cruel \
    World. Are you ready to take a trip to \
    the mall with me for some very delicious ice cream? I think this will make for a delightful day, \
    don't you?
    With input like this, the code would "do the right thing": whenever the pattern match succeeds, it gets the next line of the file and appends it to the previous lines before reaching the # now process $_ part.

    So the eof test is needed to keep the loop from reading past the end of the file, since the redo in effect bypasses the while condition. That is, if I am understanding redo correctly.

    Revised example:
    while (<DATA>) {
        chomp;
        if (s/\\//) {
            $_ .= <DATA>;
            redo unless eof;    # don't read past each file's eof
        }
        # now process $_
        print $_ . "\n";
    }
    __DATA__
    Hello Cold Cruel \
    World. Are you ready to take a trip to \
    the mall with me for some very delicious ice cream? I think this will make for a delightful day, \
    don't you?
Re: Comment not matching code in a href
by goldclaw (Scribe) on Jan 22, 2002 at 04:00 UTC
    Try removing the "unless eof", then add a \ to the end of the second file, and you'll see why it's there. It's to prevent you from reading on after you have read the last line of the last file.

    goldclaw