Re: speeding up parsing, jump to line

The answers to dave_the_m's questions will determine what sort of solution to look for. For example, if the $data file is relatively small, you can load it into memory, read each *.score.txt file exactly once to collect the relevant scores for each item in $data, and then do stats on those scores, something like this:

use strict;
use warnings;

my $data = "some/file.name";
open( my $fh, '<', $data ) or die "$data: $!\n";

# Group the target ranges by chromosome.
my %targets;
while (<$fh>) {
    next if ( /Header/ );
    chomp;
    my ( $chr, $start, $end ) = ( split( /\t/ ))[0,2,3];
    push @{$targets{$chr}}, { table => $_, start => $start, end => $end };
}
close( $fh );

for my $chr ( keys %targets ) {
    open( my $r, '<', "$chr.score.txt" ) or die "$chr.score.txt: $!\n";
    while (<$r>) {
        chomp;
        my ( $pos, $score ) = ( split( /\t/ ))[1,2];
        # Collect the score under every range that contains this position.
        for my $range ( @{$targets{$chr}} ) {
            if ( $pos >= $range->{start} and $pos <= $range->{end} ) {
                push @{$range->{scores}}, $score;
            }
        }
    }
    close( $r );
}

# do statistics on contents of %targets…
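For the statistics step, here is a minimal sketch, assuming all you want is a count and a mean of the scores per range. The sample %targets data and the range_stats helper are made up for illustration; swap in whatever stats you actually need.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical sample data, in the same shape the loops above build.
my %targets = (
    chr1 => [
        { table => "chr1\tx\t10\t20", start => 10, end => 20, scores => [ 1, 2, 3 ] },
        { table => "chr1\tx\t30\t40", start => 30, end => 40 },    # no positions matched
    ],
);

# Return (count, mean) for an array ref of scores; mean is undef when empty.
sub range_stats {
    my ($scores) = @_;
    my $n = $scores ? scalar @$scores : 0;
    return ( $n, $n ? sum(@$scores) / $n : undef );
}

for my $chr ( sort keys %targets ) {
    for my $range ( @{ $targets{$chr} } ) {
        my ( $n, $mean ) = range_stats( $range->{scores} );
        # Print the original $data line, then the count and mean (NA if no scores).
        printf "%s\t%d\t%s\n", $range->{table}, $n,
            defined $mean ? sprintf( "%.3f", $mean ) : "NA";
    }
}
```

Ranges that never matched a position have no scores key at all, which is why the helper checks for that before dereferencing.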

If $data is too large, or the scores pulled from the *.score.txt files are too many to hold in memory for each $chr, then you may have to write output files for each $chr (or for each start-end range in $data), so that it'll be quick and easy to do stats on those files afterwards.
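A rough sketch of that output-file variant, assuming one output file per $chr with a line for every position that lands in a range. The write_hits name, the output layout and the "$chr.hits.txt" file name are my own inventions, not anything from the original problem.

```perl
use strict;
use warnings;

# Stream one score file and write "start<TAB>end<TAB>pos<TAB>score" for every
# position that falls in a range, so no scores accumulate in memory.
# $ranges is an array ref of { start => ..., end => ... } hashes.
sub write_hits {
    my ( $score_file, $out_file, $ranges ) = @_;
    open( my $in,  '<', $score_file ) or die "$score_file: $!\n";
    open( my $out, '>', $out_file )   or die "$out_file: $!\n";
    my $hits = 0;
    while (<$in>) {
        chomp;
        my ( $pos, $score ) = ( split /\t/ )[ 1, 2 ];
        for my $range (@$ranges) {
            next unless $pos >= $range->{start} and $pos <= $range->{end};
            print {$out} join( "\t", $range->{start}, $range->{end}, $pos, $score ), "\n";
            $hits++;
        }
    }
    close($in);
    close($out) or die "$out_file: $!\n";
    return $hits;    # number of lines written
}

# Usage, assuming %targets was built from $data as in the script above:
# write_hits( "$_.score.txt", "$_.hits.txt", $targets{$_} ) for keys %targets;
```

Each hits file can then be read back one range at a time for the stats, so only the ranges (not the scores) have to fit in memory.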
