in reply to Re^3: How to make a hash to evaluate columns between large datasets
in thread How to make a hash to evaluate columns between large datasets

This is great! Thanks a ton! As a test, I ran my old script against the improved one using your method, on just 10 lines, on my not-so-high-end work computer:

Elapsed time with your script: 00:00:00.959781

Elapsed time with my original: 00:00:02.324184

Multiply this difference by a few hundred thousand lines for the complete input files, and you can really see the improvement.
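For what it's worth, an elapsed-time readout in that HH:MM:SS.ssssss form can be produced with Time::HiRes; this is only a sketch of one way to do it, not necessarily how the timings above were taken:

    use strict;
    use warnings;
    use Time::HiRes qw(gettimeofday tv_interval);

    my $t0 = [gettimeofday];
    # ... run the work being timed here ...
    my $elapsed = tv_interval($t0);   # seconds, with microsecond resolution

    printf "Elapsed time: %02d:%02d:%09.6f\n",
        int($elapsed / 3600),                  # hours
        int($elapsed / 60) % 60,               # minutes
        $elapsed - 60 * int($elapsed / 60);    # seconds, incl. fraction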


Replies are listed 'Best First'.
Re^5: How to make a hash to evaluate columns between large datasets
by FreeBeerReekingMonk (Deacon) on Aug 25, 2018 at 00:30 UTC
    If you are still implementing threads: don't forget to use flock when writing to the file from inside the threads, something like this:

     use threads;
     use Fcntl qw(:flock SEEK_END);

     if ($end >= $ref->{start} && $end <= $ref->{end}) {
         lock_fh($out_fh);    # take an exclusive lock before writing
         say $out_fh join("\t",
             $ID, $strand, $chr, $start, $end, $sequence,
             $numPositions, $mismatches || "", $ref->{info});
         unlock_fh($out_fh);  # release the lock for the other threads
     }

     # Named lock_fh/unlock_fh rather than lock/unlock so they do not
     # collide with Perl's built-in lock() keyword under threads.
     sub lock_fh {
         my ($fh) = @_;
         flock($fh, LOCK_EX) or die $!;
         # another thread may have appended since we last wrote
         seek($fh, 0, SEEK_END) or die $!;
     }

     sub unlock_fh {
         my ($fh) = @_;
         flock($fh, LOCK_UN) or die $!;
     }

     Alternatively, use shared objects: collect the output in a shared array or hash and write it out afterwards, as in the sketch below.
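     A minimal sketch of that approach, assuming hypothetical names such as @results and output.tsv (not from the original script): each worker pushes its lines into a shared array, and the main thread writes the file once after joining, so no flock is needed.

     use strict;
     use warnings;
     use threads;
     use threads::shared;

     my @results :shared;   # all threads push their output lines here

     my @workers = map {
         threads->create(sub {
             my @local;
             # ... do the matching work, collecting output lines in @local ...
             {
                 lock(@results);           # serialize access to the shared array
                 push @results, @local;
             }
         });
     } 1 .. 4;

     $_->join() for @workers;

     # single writer after all threads have finished
     open my $out_fh, '>', 'output.tsv' or die $!;
     print {$out_fh} "$_\n" for @results;
     close $out_fh;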