As you have seen, the first code doesn't make it easy to run multiple queries against the DATA: you would have to re-read the file, or in some cases deal with the fact that the line that ends one input record is also the line that starts the next.

If the data fits into memory, I prefer to load it all first rather than deal with these things. There are of course many ways to do this; here is one:

#!/usr/bin/perl -w
use strict;
use Data::Dumper;

my %data;    # a hash of arrays
my $chrom;

while ( defined( my $line = <DATA> ) ) {
    chomp($line);
    if ( $line =~ /chrom=(\w+)$/ ) { $chrom = $1; next; }
    push( @{ $data{$chrom} }, $line );
}

my @triples = (
    "chr1 9837 9840",      # same as your @triples
    "chr1 99998 99999",    # just different spacing
    "chr2 9838 9840",
);

#print Dumper \%data;    # uncomment this line and see what it does
                         # a very powerful tool

foreach (@triples) {
    my ( $chrom, $start, $stop ) = split;
    my @values = get_values( \%data, $chrom, $start, $stop );
    if ( !@values ) {
        print "No values for $chrom tags found between "
            . "$start and $stop inclusive\n";
    }
    else {
        print "mean for $chrom tags {$start..$stop} is ",
            average( \@values ), "\n";
        print " values were: @values\n";
    }
}

sub get_values {
    my ( $HoA_ref, $chrom, $start, $stop ) = @_;
    my @result;
    foreach my $number_string ( @{ $HoA_ref->{$chrom} } ) {
        my ( $tag, $value ) = split( /\s+/, $number_string );
        push( @result, $value ) if ( $tag >= $start and $tag <= $stop );
    }
    return @result;
}

sub average    # your average (mean) routine
{
    my ($array_ref) = @_;
    my $sum;
    my $count = scalar @$array_ref;
    foreach (@$array_ref) {
        $sum += $_;
    }
    return $sum / $count;
}

=prints

mean for chr1 tags {9837..9840} is 0.00725
 values were: 0.010 0.008 0.007 0.004
No values for chr1 tags found between 99998 and 99999 inclusive
mean for chr2 tags {9838..9840} is 0.033
 values were: 0.038 0.017 0.044

=cut

__DATA__
variableStep chrom=chr1
9837 0.010
9838 0.008
9839 0.007
9840 0.004
9841 0.002
9842 0.001
variableStep chrom=chr2
9837 0.090
9838 0.038
9839 0.017
9840 0.044
9841 0.052
9842 0.091
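If the hash of arrays is new to you, here is a cut-down sketch of just the loading step. The @lines array is a made-up stand-in for the DATA section (an assumption for illustration only, not part of the program above), so that the structure can be inspected on its own:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the DATA section: two short records.
my @lines = (
    'variableStep chrom=chr1',
    '9837 0.010',
    '9838 0.008',
    'variableStep chrom=chr2',
    '9837 0.090',
);

my %data;     # chromosome name => array of raw "tag value" lines
my $chrom;    # name from the most recent header line

foreach my $line (@lines) {
    # A header line both closes the previous record and opens the next one.
    if ( $line =~ /chrom=(\w+)$/ ) { $chrom = $1; next; }
    push @{ $data{$chrom} }, $line;
}

# Show what ended up under each key.
foreach my $name ( sort keys %data ) {
    print "$name: ", join( ' | ', @{ $data{$name} } ), "\n";
}
```

Once everything is keyed by chromosome name like this, each query only has to walk one array, and the file is read just once no matter how many queries you run.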

In reply to Re^5: extract relevent lines according to array by Marshall
in thread extract relevent lines according to array by coldy
