As virtualsue and GrandFather have advised, a hash is the way to go. You can always recreate an array from your hash afterwards if you need to. The code below uses a form of Schwartzian Transform: it splits each tab-separated data line into its three fields, then sorts the lines in descending numerical order on the number field. Thus, for each protein, the entry with the smallest value comes last. Finally, the sorted items are placed in turn into a hash keyed on protein, so successively smaller values overwrite any previous larger "duplicates". I then rebuild an array from the hash, but that may not be what you actually want.
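use strict;
use warnings;
use Data::Dumper;

# Split each line into [protein, organism, value], sort by value descending,
# then load the hash so the smallest value per protein is assigned last.
my %smallest =
    map {
        $_->[0] => {
            org => $_->[1],
            val => $_->[2] }
    }
    sort { $b->[2] <=> $a->[2] }
    map { chomp; [ split m{\t} ] }
    <DATA>;

# Rebuild an array of [protein, organism, value] rows, ordered by protein name.
my @sorted = ();
foreach my $protein ( sort keys %smallest )
{
    push @sorted, [
        $protein,
        $smallest{$protein}->{org},
        $smallest{$protein}->{val} ];
}

print Data::Dumper->Dump([\@sorted], [qw{*sorted}]);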
__END__
protein1 organism1 0.843534
protein2 organism2 2.45
protein3 organism3 9.5322
protein4 organism4 0.3475474
protein1 organism6 9.4534
protein2 organism7 0.43534
protein2 organism8 1.2434
protein3 organism9 0.000003
protein3 orgnanism10 1.23325
Here's the output:
@sorted = (
[
'protein1',
'organism1',
'0.843534'
],
[
'protein2',
'organism7',
'0.43534'
],
[
'protein3',
'organism9',
'0.000003'
],
[
'protein4',
'organism4',
'0.3475474'
]
);
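If the sort is more than you need, the same "smallest per protein" hash can also be built in a single pass by comparing each value against what is already stored. This is only a sketch: `smallest_per_protein` is a name made up here for illustration, and it assumes the same three tab-separated fields that the split in the code above expects.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the code above): for each protein keep
# the record with the smallest value, in one pass and without sorting.
sub smallest_per_protein {
    my @lines = @_;
    my %smallest;
    for my $line (@lines) {
        chomp( my $copy = $line );
        my ( $protein, $org, $val ) = split m{\t}, $copy;
        next unless defined $val;    # skip blank or malformed lines
        if ( !exists $smallest{$protein}
            || $val < $smallest{$protein}{val} )
        {
            $smallest{$protein} = { org => $org, val => $val };
        }
    }
    return \%smallest;
}

# A few lines taken from the sample data, with explicit tabs.
my $smallest = smallest_per_protein(
    "protein1\torganism1\t0.843534\n",
    "protein1\torganism6\t9.4534\n",
    "protein2\torganism2\t2.45\n",
    "protein2\torganism7\t0.43534\n",
);

# Print one tab-separated row per protein, ordered by protein name.
for my $protein ( sort keys %$smallest ) {
    print join( "\t", $protein, @{ $smallest->{$protein} }{qw(org val)} ), "\n";
}
```

The trade-off is mostly one of taste: the sort-then-overwrite version is more compact, while this one avoids sorting data that is only going into a hash anyway.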
I hope this is of use.
Cheers,
JohnGG