Consider this a pseudo-solution, to be adapted to your exact requirements, which I do not fully know as of this writing.

Even though it is now a bit more complex, I stick to the idea of reading each of the two files only once, because that is usually much more efficient. So, after having read file 1 once and closed it, we need to read file 2 line by line and store the information collected in a nested data structure (probably a hash of arrays or a hash of hashes). Once file 2 has been read completely, we output the content of the data structure.
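For readers who do not have the earlier post at hand, the first pass over file 1 might look roughly like the sketch below. It assumes tab-separated "ID, value" lines, as in the second loop; the helper name read_reference and the sample data are made up for illustration and are not necessarily what the earlier code used.

```perl
use strict;
use warnings;

# Hypothetical helper: read "ID<TAB>value" lines from a filehandle
# and return them as a flat hash (ID => value).
sub read_reference {
    my ($fh) = @_;
    my %hash;
    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;            # skip empty lines
        my ($id, $val) = split /\t/, $line;
        $hash{$id} = $val;
    }
    return %hash;
}

# Made-up sample data, read through an in-memory filehandle
# so that the sketch runs without any external file.
my $data = "geneA\t1500\ngeneB\t4200\n";
open my $fh, "<", \$data or die "Cannot open in-memory file: $!";
my %hash = read_reference($fh);
close $fh;

print "$_ => $hash{$_}\n" for sort keys %hash;
```

The point is only that file 1 is read once and closed; everything the second loop needs is then available through %hash.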

I am only displaying below the second while loop of my previous code, as there is no need to change the first loop on file 1.

    my $margin = 500;
    my $step   = 100;
    open my $SC, "<", $file2 or die "Error: could not open $file2 $!";
    my %result;
    while (my $line2 = <$SC>) {
        chomp $line2;
        my ($id, $val) = split /\t/, $line2;
        my $val_file1 = $hash{$id};
        next unless defined $val_file1;              # ID not seen in file 1
        my ($low, $high) = ($val_file1 - $margin, $val_file1 + $margin);
        next unless $val > $low and $val < $high;    # value not within range, just discard it
        my $delta = int(($val - $low) / $step);      # delta: slot number
        $result{$id}{$delta}++;
    }
    close $SC;

    # Now %result has, for each $id, a frequency distribution by steps
    # of 100 (slots 0 to 9); we just need to extract the data from it.
    for my $id (keys %result) {
        my $low = $hash{$id} - $margin;
        for my $slot (0..9) {
            my $range = sprintf "%d-%d", $low + $slot * $step, $low + ($slot + 1) * $step;
            my $frequency = $result{$id}{$slot} // 0;
            print "ID $id: $range : $frequency\n";
        }
    }

I *think* it should work more or less the way you want, but I cannot currently test that code on my tablet, so there may be a typo or an off-by-one mistake somewhere. The basic idea should be there, though, and it should be easy to get it straight with a bit of testing.

If your "sliding window" differs from what I have done, it should take only minor changes to the values of the parameters ($margin, $step) and perhaps a bit more work in the final printing of the results, provided the %result hash holds sufficiently detailed information.
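As one possible variation, here is a minimal sketch of overlapping windows, where a value can fall into more than one slot. The helper name window_indices and the parameters $width and $stride are my own assumptions for illustration; with $width equal to $stride it gives the same slot number as the int(($val - $low)/$step) computation above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the indices of all windows containing $val.
# Window k covers [$low + k*$stride, $low + k*$stride + $width), and only
# windows lying entirely within [$low, $high] are considered.
sub window_indices {
    my ($val, $centre, $margin, $width, $stride) = @_;
    my ($low, $high) = ($centre - $margin, $centre + $margin);
    my @hits;
    for (my $k = 0; $low + $k * $stride + $width <= $high; $k++) {
        my $start = $low + $k * $stride;
        push @hits, $k if $val >= $start and $val < $start + $width;
    }
    return @hits;
}

# Made-up sample: one ID with reference value 1500, three values from file 2.
my %hash = (geneA => 1500);
my %result;
for my $val (1120, 1175, 1960) {
    # windows of width 100 that advance by 50, so neighbours overlap
    $result{geneA}{$_}++ for window_indices($val, $hash{geneA}, 500, 100, 50);
}

for my $slot (sort { $a <=> $b } keys %{ $result{geneA} }) {
    print "geneA window $slot: $result{geneA}{$slot}\n";
}
```

The loop over every window is deliberately naive; it keeps the boundary logic easy to check, at the cost of a little extra work per line of file 2.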


In reply to Re^5: Sliding window perl program by Laurent_R
in thread Sliding window perl program by genome
