Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
eg:

file1
@ns
ATTGCGTTC
+
//#$@TMSQ 2

@ns
GGAGCGTTC
+
//#$@TMSQ 3

file2
@ns
ATTGCGTTC
+
//#$@#//A 1
The program should give the count if the second line of each read matches. I have written the following code for this. The code works perfectly well but consumes a large amount of memory. When I tried it on huge files, it hangs with no output. Can anyone modify my code so that it uses as little memory as possible?

output:

@ns
ATTGCGTTC
+
//#$@TMSQ 2 1 count:3

@ns
GGAGCGTTC
+
//#$@TMSQ 3 count:3
#!/usr/bin/env perl
use strict;
use warnings;
no warnings qw( numeric );

my %seen;
$/ = "";
while (<>) {
    chomp;
    my ($key, $value) = split ('\t', $_);
    my @lines = split /\n/, $key;
    my $key1 = $lines[1];
    $seen{$key1} //= [ $key ];
    push (@{$seen{$key1}}, $value);
}

foreach my $key1 ( sort keys %seen ) {
    my $tot = 0;
    my $file_count = @ARGV;
    for my $val ( @{$seen{$key1}} ) {
        $tot += ( split /:/, $val )[0];
    }
    if ( @{ $seen{$key1} } >= $file_count) {
        print join( "\t", @{$seen{$key1}});
        print "\tcount:". $tot."\n\n";
    }
}
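For comparison, here is a minimal sketch of one lower-memory variant. It is not a tested replacement. It assumes the same input layout the script above relies on: records separated by blank lines, with a tab before the trailing count. Instead of storing the full text of every record, it reads the files twice. The first pass keeps only a running total and the list of counts per sequence line; the second pass prints the first record seen for each sequence along with its total. The sub names `tally_counts` and `print_totals` are made up for this illustration, and the output comes out in file order rather than sorted.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Split one paragraph-mode record into (record text, sequence line, count field).
# Returns an empty list if the record does not have the assumed layout.
sub parse_record {
    my ($rec) = @_;
    chomp $rec;                                  # strips trailing newlines in paragraph mode
    my ( $key, $value ) = split /\t/, $rec, 2;   # text before the first tab, count after it
    return unless defined $value;
    my $seq = ( split /\n/, $key )[1];           # second line of the read
    return unless defined $seq;
    return ( $key, $seq, $value );
}

# Pass 1: per sequence, keep only [ running total, tab-joined counts ].
sub tally_counts {
    my (@files) = @_;
    my %stat;
    local $/ = "";                               # paragraph mode, as in the original script
    for my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        while ( my $rec = <$fh> ) {
            my ( $key, $seq, $value ) = parse_record($rec) or next;
            my ($n) = $value =~ /^(\d+)/;        # leading integer of the count field
            my $entry = $stat{$seq} //= [ 0, '' ];
            $entry->[0] += $n // 0;
            $entry->[1] .= ( length $entry->[1] ? "\t" : '' ) . $value;
        }
        close $fh;
    }
    return \%stat;
}

# Pass 2: print the first record found for each sequence, then drop its entry
# so that later duplicates are skipped.
sub print_totals {
    my ( $stat, $out, @files ) = @_;
    local $/ = "";
    for my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        while ( my $rec = <$fh> ) {
            my ( $key, $seq, $value ) = parse_record($rec) or next;
            my $entry = delete $stat->{$seq} or next;
            print {$out} "$key\t$entry->[1]\tcount:$entry->[0]\n\n";
        }
        close $fh;
    }
}

# Run only when invoked as a script with file arguments.
if ( !caller() && @ARGV ) {
    my $stat = tally_counts(@ARGV);
    print_totals( $stat, \*STDOUT, @ARGV );
}
```

This still needs one hash entry per distinct sequence, so it reduces memory rather than bounding it. If even that is too large, sorting the records on the sequence line with an external sort and then summing adjacent matches would be another option.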