
Sure. Doing so has also highlighted an error.

The diamond operator, <$var>, reads lines from the file that the handle $var points at. As with many operators and function calls in Perl, the diamond operator is sensitive to where it is being used (termed the context), which controls some aspects of its behaviour. For example, if the variable (or filehandle) $fh has been opened to a file, then

my $line = <$fh>;

reads one line from that file and assigns it to the (scalar) variable $line. This is termed a scalar context; a single (scalar) value is assigned.

However, if the same filehandle is used this way:

my @lines = <$fh>;

then instead of one line being read from the file, all the lines are read from the file and the array @lines is extended to hold as many lines as there are available in the file. This is termed a list context; multiple (a list of) variables are assigned. It's tempting to think of this as an array context, but this is a frowned upon term as the variables needn't be an array:

my( $v1, $v2, $v3, $v4, $v5 ) = <$fh>;

Here, a list of individually named scalar variables (rather than an array of collectively named scalar variables) is assigned. Hence the statement is a list assignment. In this case, the first five lines from the file will be assigned to the five named variables, but all the lines from the file will have been read into memory. All those after the 5th line will be discarded.
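A minimal sketch of that behaviour, assuming an in-memory filehandle as a stand-in for a real file (the sample data and the use of two variables instead of five are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical sample data; the in-memory filehandle stands in for a real file.
my $data = "line 1\nline 2\nline 3\nline 4\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

# List assignment: <$fh> is called in list context and reads every line,
# but only the first two lines are kept; the rest are discarded.
my( $v1, $v2 ) = <$fh>;

print $v1;                                             # line 1
print $v2;                                             # line 2
print eof( $fh ) ? "file exhausted\n" : "more to read\n";   # file exhausted
```

Although only two variables were assigned, the handle is already at end of file, which shows that the whole file was read.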

The scalar operator is used to force an operator or function into a scalar context, when it would otherwise be called in a list context. So this

my @lines = scalar <$fh>;

will cause the diamond operator to be called in a scalar context even though its results are being assigned to an array. After this statement, @lines will contain just a single element ($lines[ 0 ]), which will hold the first line from the file. Only that line will have been read from the file and the filehandle ($fh) will be pointing at the start of the second line, ready for another call to the diamond operator.
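This can be checked with a small sketch, again assuming a made-up in-memory file in place of a real one:

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for a real file.
my $data = "first\nsecond\nthird\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

# scalar forces <$fh> into scalar context despite the array on the left.
my @lines = scalar <$fh>;

# The handle has only advanced past the first line.
my $next = <$fh>;

print scalar( @lines ), "\n";   # 1
print $lines[ 0 ];              # first
print $next;                    # second
```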

In the snippet of code, @fhs is an array of filehandles, each opened to a different file whose name was read from the command line via @ARGV. Having opened all the files, the task is to read one line from each file in turn, append those lines together to form a single line of output, and repeat that process for all the lines in all the file(handle)s.

To do this, we need to call the diamond operator on each of the filehandles (@fhs) in turn, in a scalar context to ensure that only one line is read from each, and then append those lines together (after removing the newline ("\n") from each), and then print the composite line out.

So, I range over @fhs using map, assigning each to $_ in turn. Applying scalar to the diamond operator, scalar <$_>, ensures that it is called in a scalar context and only returns one line at a time from each filehandle. These lines are assigned to @lines. I can then use chomp to remove the newlines from all of the lines and then join to concatenate them together before printing them out for redirection to the composite file.
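The steps just described can be sketched roughly as follows. This is a reconstruction for illustration rather than the original snippet: the in-memory strings in @sources are hypothetical stand-ins for the files named in @ARGV, the results are collected in a made-up @output array before printing, and all inputs are assumed to have the same number of lines.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the files that would be named in @ARGV.
my @sources = ( "a1\na2\n", "b1\nb2\n", "c1\nc2\n" );

# Open one (in-memory) filehandle per source.
my @fhs;
for my $text ( @sources ) {
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    push @fhs, $fh;
}

my @output;
while( 1 ) {
    # One line from each handle; scalar keeps <$_> from slurping the file.
    my @lines = map { scalar <$_> } @fhs;

    # Assumes equal-length inputs: stop as soon as the first one runs dry.
    last unless defined $lines[ 0 ];

    chomp @lines;                       # strip the newline from every line
    push @output, join( '', @lines );   # build the composite line
}

print "$_\n" for @output;   # prints a1b1c1 then a2b2c2
```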

However, if one or more of the files is shorter than the others, then the diamond operator will return undef when attempting to read another line from a file that has already been exhausted. Pragmatically, Perl will allow the loop to continue, but it will issue a warning when you attempt to join that undefined value with the other values read in:

print join '', 'fred', 'bill', undef, 'john';;
Use of uninitialized value in join or string at (eval 3)
fredbilljohn

As I posted the example code, I thought about that situation and (too quickly) added a "quick fix" to deal with it. I made a mistake! It doesn't work at all.

I've corrected that in the snippet above.
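For illustration only, and not necessarily the same as the corrected snippet, one possible way to cope with files of unequal length is to replace each undef with an empty string and to stop once every handle is exhausted. The @sources data and the @output array are again hypothetical:

```perl
use strict;
use warnings;

# Hypothetical inputs of unequal length.
my @sources = ( "a1\na2\na3\n", "b1\n" );

my @fhs;
for my $text ( @sources ) {
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    push @fhs, $fh;
}

my @output;
while( 1 ) {
    my @lines = map { scalar <$_> } @fhs;

    # Stop only when no handle returned a line.
    last unless grep { defined $_ } @lines;

    # Exhausted handles contribute an empty string instead of undef,
    # so join does not warn about uninitialized values.
    @lines = map { defined $_ ? $_ : '' } @lines;

    chomp @lines;
    push @output, join( '', @lines );
}

print "$_\n" for @output;   # prints a1b1, then a2, then a3
```

Whether a short file should contribute nothing, or a placeholder such as a tab or padding, depends on what the composite file is for; the empty string is just the simplest choice.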

