Thank you for the help. I modified my code as follows:
my %seen;
$/ = "";
while (<>) {
    chomp;
    my ($key, $value) = split ('\t', $_);
    my @lines = split /\n/, $key;
    my $key1 = $lines[1];
    $seen{$key1}[Key] //= $key;
    $seen{$key1}[Sum] += $value;
}
my $file_count = @ARGV;
foreach my $key1 ( keys %seen ) {
    if ( @{ $seen{$key1} } >= $file_count) {
        print join( "\t", @{$seen{$key1}});
        print "\n\n";
    }
}
But please also help me get the names of the files in which a particular read exists. I mean that, along with the total count, the output should also tell me which files the read is present in.

Re^3: merge multiple files giving out of memory error
by Eily (Monsignor) on Feb 27, 2017 at 12:31 UTC

    my $file_count = @ARGV;
    foreach my $key1 ( keys %seen ) {
        if ( @{ $seen{$key1} } >= $file_count) {
            print join( "\t", @{$seen{$key1}});
            print "\n\n";
        }
    }
    This still doesn't make sense. If you add print "File count is: $file_count \n"; you'll find that $file_count is always 0, because after reading the files with while (<>), @ARGV is always empty. You also check the size of the array in $seen{$key1}, but it is always 2 (there are two elements, Key and Sum).
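
To see the @ARGV behaviour in isolation, here is a minimal sketch. The two temporary files and their contents are made up for the demonstration; the only point is to compare the size of @ARGV before and after the read loop.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create two small throwaway input files (hypothetical data).
my @files;
for my $text ("a\t1\n", "b\t2\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close $fh;
    push @files, $name;
}

# Pretend the files were given on the command line.
@ARGV = @files;

my $count_before = @ARGV;   # number of files, taken before <> consumes @ARGV
my $lines = 0;
while (<>) {
    $lines++;
}
my $count_after = @ARGV;    # <> shifts each file name off @ARGV as it opens it

print "before: $count_before, after: $count_after, lines: $lines\n";
```

This should print "before: 2, after: 0, lines: 2", which is why the file count has to be saved before the while (<>) loop rather than after it.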

    When you use while (<>) to read from a list of files, the name of the file currently being read is in $ARGV.

    # At the top of the file
    use constant {
        Key   => 0,
        Sum   => 1,
        Count => 2, # Remove this if you don't use the total count
        Files => 3  # Should be 2 if Count is not used.
    };
    # In the read loop
    $seen{$key1}[Key] //= $key;
    $seen{$key1}[Sum] += $value;
    $seen{$key1}[Count]++;        # Total count for the number of times this value exists
    $seen{$key1}[Files]{$ARGV}++; # Count in this file
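
Put together, the pieces above might look like the following sketch. The record layout (a header line, then the read, a tab, and a value, with records separated by blank lines) is an assumption inferred from the posted code, and the temporary files and their contents are invented for the example. Keeping only reads that occur in every file is one possible reading of the original >= $file_count test.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

use constant { Key => 0, Sum => 1, Count => 2, Files => 3 };

# Two throwaway input files with made-up records.
my @inputs = (
    ">r1\nACGT\t2\n\n>r2\nTTTT\t1\n\n",
    ">r9\nACGT\t3\n\n",
);
my @files;
for my $text (@inputs) {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close $fh;
    push @files, $name;
}

@ARGV = @files;
my $file_count = @ARGV;    # taken before the loop empties @ARGV

my %seen;
{
    local $/ = "";         # paragraph mode: records end at blank lines
    while (<>) {
        chomp;
        my ($key, $value) = split /\t/;
        my $key1 = (split /\n/, $key)[1];   # second line of the key is the read
        $seen{$key1}[Key] //= $key;
        $seen{$key1}[Sum] += $value;
        $seen{$key1}[Count]++;              # total number of occurrences
        $seen{$key1}[Files]{$ARGV}++;       # occurrences in the current file
    }
}

# Keep only the reads that were seen in every input file.
my @in_all = grep { scalar(keys %{ $seen{$_}[Files] }) == $file_count }
             sort keys %seen;
print "$_\t$seen{$_}[Sum]\n" for @in_all;
```

With this sample data only ACGT appears in both files, so the script prints a single line with ACGT and the summed value 5. Counting the keys of the Files hash replaces the array-size test, which could never distinguish one read from another.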

    You don't seem to want a particular format for your output (you changed it when adapting my suggestion), so you could try just dumping the whole structure using either Data::Dumper (nothing to install) or YAML (needs to be installed, but can be nicer to read).

    use Data::Dumper;
    while (<>) {
        # Your code here
    }
    print Dumper(\%seen);
    Or
    use YAML;
    while (<>) {
        # Your code here
    }
    print YAML::Dump(\%seen);
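
If a dump is too noisy, a small report routine is another option. The sketch below is only an illustration: report_line is a hypothetical helper, the output layout (read, total, then file=count pairs) is an arbitrary choice, and %seen is filled with made-up data in the same shape the read loop would produce.

```perl
use strict;
use warnings;

use constant { Key => 0, Sum => 1, Count => 2, Files => 3 };

# Hypothetical structure, shaped like the one the read loop builds.
my %seen = (
    ACGT => [ ">r1\nACGT", 5, 2, { 'a.txt' => 1, 'b.txt' => 1 } ],
    TTTT => [ ">r2\nTTTT", 1, 1, { 'a.txt' => 1 } ],
);

# Build one report line per read: the read, its total, then "file=count" pairs.
sub report_line {
    my ($key1, $entry) = @_;
    my $files    = $entry->[Files];
    my @per_file = map { $_ . "=" . $files->{$_} } sort keys %$files;
    return join "\t", $key1, $entry->[Sum], join(",", @per_file);
}

print report_line($_, $seen{$_}), "\n" for sort keys %seen;
```

Each output line then names every file the read was found in, together with the number of times it occurred there.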

      Thank you for the help. I am sorry I didn't mention it earlier: I also want the names of the files, with the count in each. Can you please help me with this as well? I am very sorry if this irritates you.

        Well, it's there: the total is in $seen{$key1}[Count], and the file names with the count in each are in $seen{$key1}[Files]. If that's not what you want, you'll have to be more explicit.