filehandle in an array

bestfa has asked for the wisdom of the Perl Monks concerning the following question:

I want to use an array of filehandles to split an input file according to its data (so the number of output files can vary with the data). For example, the input file is input.txt:

----------input--------------- 
street-cat animal
apple plant
strawberry plant
muffler goods
-----------------------

Then I want to get 3 output files like input.txtanimal.txt, input.txtplant.txt, input.txtgoods.txt. So I wrote:

---------simplified code--------------
@filehandle_array=@data_type_array;
$data_type_number=@data_type_array;
for($i=0;$i<$data_type_number;$i++)
{
 open $filehandle_array[$i], '>', $filename_array[$i];
}
----------------------

But this code produced neither an error message nor any output files. Is there an easy way to use filehandles in an array?

Thank you for the replies. I think I found my real problem: my real data contains characters such as \ / : * ? " < > | that cannot be used in file names.
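One possible way to deal with such characters is to replace them before building the file name. The following is only a sketch: `safe_name` is a hypothetical helper, not something from the original code, and the choice of an underscore as the replacement is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each character that is not allowed in
# Windows file names (\ / : * ? " < > |) with an underscore.
sub safe_name {
    my ($type) = @_;
    (my $safe = $type) =~ s{[\\/:*?"<>|]}{_}g;
    return $safe;
}

print safe_name('a/b:c?'), "\n";    # prints "a_b_c_"
```

Note that two different data types (for example `a/b` and `a:b`) would map to the same name this way, so their records would end up in the same file unless the names are made unique by some other means.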


Re: filehandle in an array
by kcott (Archbishop) on Feb 22, 2016 at 09:17 UTC

G'day bestfa,

Welcome to the Monastery.

"... the number of output files can vary according to data ..."

In that case, I'd only open files on demand.

"But this code didn't make an error message ..."

Perl will provide you with diagnostics if you ask it. Always put these two lines at the start of all your code:

use strict;
use warnings;

See strict and warnings for details.

You should also check your I/O operations. You can do this manually, but it's time-consuming and error-prone: just let Perl do it for you (with the autodie pragma) by adding this line:

use autodie;
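As a small illustration of what autodie does, here is a sketch that tries to open a file in a directory assumed not to exist. The path is hypothetical; with autodie in effect, the failed open throws an exception instead of failing silently.

```perl
use strict;
use warnings;
use autodie;

# Hypothetical path: the directory is assumed not to exist.
my $path = 'no_such_dir_for_demo/out.txt';

# With autodie, a failed open dies; eval catches the exception here
# so that the script can report it and carry on.
my $ok  = eval { open my $fh, '>', $path; 1 };
my $err = $@;

print "open failed: $err\n" unless $ok;
```

Without the eval, the script would stop at the open with a message naming the file and the reason.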

"... and didn't produce output files."

You didn't declare or populate @filename_array. You also didn't attempt to write any output data.

Also note:

You didn't declare or populate @data_type_array.
@filehandle_array=@data_type_array; is almost certainly wrong.

These long-winded, C-like lines of code:

$data_type_number=@data_type_array;
for($i=0;$i<$data_type_number;$i++)
{
    ...
}

Can be written far more succinctly with this Perl-like code:

for (0 .. $#data_type_array) {
    ...
}

The use of global variables throughout your code is a disaster waiting to happen. Prefer lexical variables, and control their scope, for far less error-prone code.
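To tie these points together, here is a minimal sketch of lexical filehandles stored in an array, using the shorter loop form. The type list and the use of a temporary directory are assumptions made for the demonstration only.

```perl
use strict;
use warnings;
use autodie;
use File::Temp qw(tempdir);

my $dir   = tempdir(CLEANUP => 1);     # scratch directory for the demo
my @types = qw(animal plant goods);    # assumed data types
my @handles;                           # one lexical filehandle per type

# open() stores a new filehandle in each (undefined) array element.
for my $i (0 .. $#types) {
    open $handles[$i], '>', "$dir/$types[$i].txt";
}

# A filehandle held in an array element must be wrapped in a block
# when it is passed to print.
print { $handles[1] } "apple\n";

close $_ for @handles;
```

Everything is declared with `my`, so the handles go out of scope with the array that holds them.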

More information on the points I've raised can be found in perlintro. Each section has links to further discussion and advanced usage: follow as required.

Your idea of storing filehandles in a data structure is fine; however, your code is problematic (and incomplete). Here's how I might have approached this task:

#!/usr/bin/env perl -l

use strict;
use warnings;
use autodie;

use constant {
    FILENAME   => 0,
    FILEHANDLE => 1,
}; 

my %file = (
    animal => [ 'pm_1155794_animal.txt' ],
    plant  => [ 'pm_1155794_plant.txt' ],
    goods  => [ 'pm_1155794_goods.txt' ],
);

while (<DATA>) {
    my ($item, $type) = split;
    print { get_fh($type) } $item;
} 

sub get_fh {
    my $type = shift;

    open $file{$type}[FILEHANDLE], '>', $file{$type}[FILENAME]
        unless defined $file{$type}[FILEHANDLE];

    return $file{$type}[FILEHANDLE];
} 

__DATA__
cat animal
apple plant
strawberry plant
muffler goods

The following, simple checks show this worked:

$ cat pm_1155794_animal.txt
cat
$ cat pm_1155794_plant.txt
apple
strawberry
$ cat pm_1155794_goods.txt
muffler

— Ken


Re: filehandle in an array
by Athanasius (Archbishop) on Feb 22, 2016 at 08:23 UTC

Hello bestfa, and welcome to the Monastery!

Please put your code in <code> tags — see Markup in the Monastery

If you’re reading the input file line-by-line, you’ll want to create filehandles on the fly, one for each unique data type. And when you open each output file, you’ll have to open it for appending (>>), not writing (>), to allow successive iterations of the loop to add new entries to the end of the file.
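The difference between the two modes shows up whenever a file is opened more than once, or already exists from an earlier run. Here is a small sketch of that behaviour; the file name, the temporary directory and the `slurp` helper are all assumptions made for the demonstration.

```perl
use strict;
use warnings;
use autodie;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);    # scratch directory for the demo
my $file = "$dir/plant.txt";         # hypothetical output file

# Reopening with '>' truncates the file on every open.
for my $item (qw(apple strawberry)) {
    open my $fh, '>', $file;
    print {$fh} "$item\n";
    close $fh;
}
my $after_write = slurp($file);      # only the last item survives

unlink $file;

# Reopening with '>>' adds to the end of the file instead.
for my $item (qw(apple strawberry)) {
    open my $fh, '>>', $file;
    print {$fh} "$item\n";
    close $fh;
}
my $after_append = slurp($file);     # both items are present

# Hypothetical helper: read a whole file into one string.
sub slurp {
    my ($path) = @_;
    open my $in, '<', $path;
    local $/;
    my $text = <$in>;
    close $in;
    return $text;
}
```

A handle that stays open for the whole loop keeps writing at its current position in either mode.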

The data structure you need is a hash, not an array. Specifically, a hash that maps output ~~file names~~ data types to their corresponding filehandles. Then you can use Perl’s built-in exists function on the hash to determine whether a given data type already has an associated filehandle. Here is how I would approach this task:

#! perl
use strict;
use warnings;

my $in_file = 'input.txt';
my %out_files;

open(my $in, '<', $in_file)
    or die "Cannot open file '$in_file' for reading: $!";

while (my $line = <$in>)
{
    my ($datum, $type) = split ' ', $line;

    unless (exists $out_files{$type})
    {
        my $filename = $in_file . $type . '.txt';

        open(my $fh, '>>', $filename)
            or die "Cannot open file '$filename' for appending: $!";

        $out_files{$type} = $fh;
    }

    print { $out_files{$type} } $datum . "\n";
}

for (keys %out_files)
{
    close $out_files{$_}
        or die "Cannot close the output file for type '$_': $!";
}

close $in
    or die "Cannot close file '$in_file': $!";

Update: Fixed die message for closing output files; also corrected fourth paragraph.

Hope that helps,

Athanasius <°(((>< contra mundum Iustus alius egestas vitae, eros Piratica,


Re^2: filehandle in an array

by bestfa (Novice) on Feb 23, 2016 at 03:49 UTC

You are awesome. ^-^ I think I found my real problem: my real data contains characters such as \ / : * ? " < > | that cannot be used in file names.
