This is my go at your questions. For the first one:

#!/usr/bin/perl
use strict;
use warnings;

my %hash;
while (<DATA>) {
    my ($coord, $dist) = split;
    # Store the new distance unless a smaller one is already saved.
    $hash{$coord} = $dist
        unless defined $hash{$coord} && $dist > $hash{$coord};
}

# Print the coordinates in ascending numeric order.
print "$_\t$hash{$_}\n" foreach (sort { $a <=> $b } keys %hash);

__DATA__
567 344
1345 567
2346 78
3456 67
3456 789
4678 45
5349 6
6700 124
6700 50
8964 560

What's important to notice here is that the distance for a coordinate is saved every time, except when the coordinate already has a distance and the new distance is larger than the saved one. Each coordinate therefore ends up with its smallest distance.

For the second question, where you add more columns, you need either a hash or an array to hold them. But the idea is the same:

#!/usr/bin/perl
use strict;
use warnings;

my %hash;
while (<DATA>) {
    my ($coord, @cols) = split;
    # $cols[0] is the distance; keep the whole row with the smallest one.
    $hash{$coord} = \@cols
        unless (defined $hash{$coord} && $cols[0] > $hash{$coord}[0]);
}

# Parenthesize join so "\n" is not joined in with a trailing tab.
print join("\t", qw(coord dist chr exons palindromes)), "\n";
print join("\t", $_, @{ $hash{$_} }), "\n"
    foreach (sort { $a <=> $b } keys %hash);

__DATA__
567 344 5 7 8
1345 567 5 8 123
2346 78 12 1 567
3456 67 10 1 5
3456 789 10 3 6
4678 45 6 2 0
5349 6 8 2 14
6700 124 13 8 56
6700 50 13 1 4
8964 560 2 18 8

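Here I chose a hash of arrays: each coordinate maps to an array reference holding the remaining columns, and the comparison uses the first element, which is the distance.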
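If you would rather refer to the columns by name, the other option mentioned above, a hash, works too. What follows is only a rough sketch of that variant, not something from the original question: the sub name closest_rows is made up for illustration, and the field names are simply borrowed from the header line printed by the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: column names borrowed from the header printed earlier.
my @fields = qw(dist chr exons palindromes);

# Hypothetical helper: build a hash of hashes keyed by coordinate,
# keeping for each coordinate the row with the smallest distance.
sub closest_rows {
    my @lines = @_;
    my %best;
    for my $line (@lines) {
        my ($coord, @cols) = split ' ', $line;
        next unless defined $coord;      # skip blank lines
        my %row;
        @row{@fields} = @cols;           # hash slice: name each column
        # Same rule as before: skip only if the stored distance is smaller.
        next if exists $best{$coord}
             && $row{dist} > $best{$coord}{dist};
        $best{$coord} = \%row;
    }
    return \%best;
}

my $best = closest_rows(
    "3456 67 10 1 5",
    "3456 789 10 3 6",
    "6700 124 13 8 56",
    "6700 50 13 1 4",
);

# Print each surviving row with its columns in the declared order.
for my $coord (sort { $a <=> $b } keys %$best) {
    print join("\t", $coord, @{ $best->{$coord} }{@fields}), "\n";
}
```

The trade-off is a little more memory per row in exchange for being able to write $best->{$coord}{exons} instead of remembering that exons is index 2.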


In reply to Re: Repeats exclusion by jfraire
in thread Repeats exclusion by Grig
