To handle multiple fields, I make each value in the hash a reference to an inner hash, rather than a single value. That way I can access the fields by name. The top level is still keyed by coordinate. The changes from the previous version are minor:

    use warnings;
    use strict;
    use feature 'say';    # say() is not enabled by default

    open my $data, '<', 'alldata.txt'
        or die "Could not open 'alldata.txt': $!\n";

    my %dataset;

    RECORD:
    while ( my $line = <$data> ) {
        chomp $line;

        # Skip headers and blank lines: anything without a digit.
        next RECORD unless $line =~ m{\d};

        my ( $coord, $dist, $chr, $exons, $pals ) = split /\s+/, $line;

        # Keep the stored record if its dist is smaller than this one's.
        next RECORD
            if exists $dataset{$coord} and $dataset{$coord}{dist} < $dist;

        $dataset{$coord} = {
            dist        => $dist,
            chr         => $chr,
            exons       => $exons,
            palindromes => $pals,
        };
    }

    say "coord\tdist\tchr\texons\tpalindromes";
    say "$_\t$dataset{$_}{dist}\t$dataset{$_}{chr}\t$dataset{$_}{exons}\t"
        . "$dataset{$_}{palindromes}"
        for sort { $a <=> $b } keys %dataset;
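To see the duplicate rule in isolation, here is a minimal sketch that runs a few made-up lines through the same test. The @lines array and its values are hypothetical, not taken from alldata.txt, and I am assuming the intent is to keep the record with the smallest dist for each coordinate.

```perl
use strict;
use warnings;

# Hypothetical sample records: coord dist chr exons palindromes
my @lines = (
    "100 50 1 2 3",
    "100 20 1 4 5",    # same coord, smaller dist: replaces the first
    "100 70 1 6 7",    # same coord, larger dist: skipped
    "200 10 2 1 0",
);

my %dataset;
for my $line (@lines) {
    my ( $coord, $dist, $chr, $exons, $pals ) = split /\s+/, $line;

    # Skip this record if the stored one already has a smaller dist.
    next if exists $dataset{$coord} and $dataset{$coord}{dist} < $dist;

    $dataset{$coord} = {
        dist => $dist, chr => $chr, exons => $exons, palindromes => $pals,
    };
}

# Fields are reached by name through the inner hash.
printf "%s: dist=%s exons=%s\n",
    $_, $dataset{$_}{dist}, $dataset{$_}{exons}
    for sort { $a <=> $b } keys %dataset;
```

Coordinate 100 ends up with dist 20 and exons 4: the second line replaces the first, and the third is skipped.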

The entire data structure, according to Data::Dumper:

    $VAR1 = {
        '6700' => {
            'palindromes' => '4',
            'chr' => '13',
            'exons' => '1',
            'dist' => 50
        },
        '4678' => {
            'palindromes' => '0',
            'chr' => '6',
            'exons' => '2',
            'dist' => '45'
        },
        '2346' => {
            'palindromes' => '567',
            'chr' => '12',
            'exons' => '1',
            'dist' => '78'
        },
        '5349' => {
            'palindromes' => '14',
            'chr' => '8',
            'exons' => '2',
            'dist' => '6'
        },
        '3456' => {
            'palindromes' => '5',
            'chr' => '10',
            'exons' => '1',
            'dist' => 67
        },
        '567' => {
            'palindromes' => '8',
            'chr' => '5',
            'exons' => '7',
            'dist' => '344'
        },
        '1345' => {
            'palindromes' => '123',
            'chr' => '5',
            'exons' => '8',
            'dist' => '567'
        },
        '8964' => {
            'palindromes' => '8',
            'chr' => '2',
            'exons' => '18',
            'dist' => '560'
        }
    };
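A dump like the one above can be produced along these lines. This is only a sketch: the two-record %dataset is a hypothetical stand-in in the same shape, and the Sortkeys and Indent settings are my own choice for readable, stable output, not something the original program set.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical two-record dataset in the same shape as above.
my %dataset = (
    567  => { dist => 344, chr => 5, exons => 7, palindromes => 8 },
    4678 => { dist => 45,  chr => 6, exons => 2, palindromes => 0 },
);

$Data::Dumper::Sortkeys = 1;    # sort hash keys so the output is stable
$Data::Dumper::Indent   = 1;    # simpler, fixed-width indentation

# Pass a reference so the whole hash comes out as a single $VAR1.
my $dump = Dumper( \%dataset );
print $dump;
```

Passing \%dataset rather than %dataset matters: with the plain hash, Dumper receives a flat list and prints a separate $VARn for every key and value.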

P.S. The debugger is your friend: perl -d program.pl.

As Occam said: Entia non sunt multiplicanda praeter necessitatem.


In reply to Re: Repeats exclusion by TomDLux
in thread Repeats exclusion by Grig
