rickerl has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to sort values stored in a hash from a large dataset similar to the following:

Id, Loc, Value, Etime
N1:1, S5H1, 2.0, 1112146781
N1:1, S23H1, 1.2, 1112198223
N1:1, S100H1, 1.1, 1112149258
N1:1, S19H2, 0.9, 1112176607

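The output should be grouped by Id and, within each Id, sorted by Loc: all H1 entries come before H2 entries, and entries with the same H number are ordered numerically by the middle term (the number after the S).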

I'm not too familiar with map, but I believe it's what is needed. I just haven't figured out how to make it do what I need it to and would like some help.

Here is my test code snippet and a small dataset:
#!/usr/bin/perl -w
use strict;

my %dataset;
while (<>) {
    # $_ => Id, Loc, Value, Etime
    my ($id, $loc, $value, $etime) = split(/,\s+/, $_);
    $dataset{$id}{$etime} = {
        "Value" => $value,
        "Loc"   => $loc,
    };
}

foreach my $id (sort keys %dataset) {
    foreach my $etime (
        sort { $dataset{$id}{$a}{Loc} cmp $dataset{$id}{$b}{Loc} }
        keys %{$dataset{$id}}
    ) {
        my $loc   = $dataset{$id}{$etime}{Loc};
        my $value = $dataset{$id}{$etime}{Value};
        printf(STDOUT "%2s, %6s, %.1f, %i\n", $id, $loc, $value, $etime);
    }
}

Input dataset:
N1:1, S100H1, 1.1, 1112149258
P1:1, S10H1, 1.1, 1112149258
N3:2, S102H1, 1.1, 1112149258
P4:2, S10H1, 1.1, 1112149258
N1:1, S19H2, 0.9, 1112176607
P1:1, S9H2, 0.9, 1112176607
N3:2, S29H2, 0.9, 1112176607
P4:2, S10H2, 0.9, 1112176607
N1:1, S23H1, 1.2, 1112198223
P1:1, S2H1, 1.2, 1112198223
N3:2, S33H1, 1.2, 1112198223
P4:2, S21H1, 1.2, 1112198223
N1:1, S5H1, 2.0, 1112146781
P1:1, S15H1, 2.0, 1112146781
N3:2, S5H1, 2.0, 1112146781
P4:2, S1H1, 2.0, 1112146781

Desired output:
N1:1, S5H1, 2.0, 1112146781
N1:1, S23H1, 1.2, 1112198223
N1:1, S100H1, 1.1, 1112149258
N1:1, S19H2, 0.9, 1112176607
N3:2, S5H1, 2.0, 1112146781
N3:2, S33H1, 1.2, 1112198223
N3:2, S102H1, 1.1, 1112149258
N3:2, S29H2, 0.9, 1112176607
P1:1, S2H1, 1.2, 1112198223
P1:1, S10H1, 1.1, 1112149258
P1:1, S15H1, 2.0, 1112146781
P1:1, S9H2, 0.9, 1112176607
P4:2, S1H1, 2.0, 1112146781
P4:2, S10H1, 1.1, 1112149258
P4:2, S21H1, 1.2, 1112198223
P4:2, S10H2, 0.9, 1112176607

Thanks,
Ryan

Replies are listed 'Best First'.
Re: Alpha numeric sort
by tlm (Prior) on Mar 30, 2005 at 23:08 UTC

    You'll be deluged with responses containing the string "Schwartz(ian)? Transform" or "ST". See this post for an explanation. Basically, using the terminology of that post, you want a function

    sub property {
        my $loc = ( split /,\s+/, $_[ 0 ] )[ 1 ];
        $loc =~ /^S\d+H(\d+)$/;
        return $1;
    }

    Then, do the ST thang:
    my @sorted =
        map  { $_->[0] }
        sort { $a->[1] <=> $b->[1] }
        map  { [ $_, property( $_ ) ] } @unsorted;

    the lowliest monk
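
The property function above returns only the H number, so on its own it does not order entries that share an H value or keep Ids together. A minimal sketch of one way to extend the same transform to three keys (Id, then H, then S) follows. The helper name sort_keys and the inline sample lines are assumptions for illustration, not part of the reply.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the Id, the H number
# and the S number for one "Id, Loc, Value, Etime" line.
sub sort_keys {
    my ($line) = @_;
    my ( $id, $loc ) = split /,\s*/, $line;
    my ( $s, $h ) = $loc =~ /^S(\d+)H(\d+)$/
        or die "unexpected Loc field: $loc";
    return ( $id, $h, $s );
}

# A few lines taken from the question's input dataset.
my @unsorted = (
    "N1:1, S100H1, 1.1, 1112149258",
    "P1:1, S10H1, 1.1, 1112149258",
    "N1:1, S19H2, 0.9, 1112176607",
    "P1:1, S9H2, 0.9, 1112176607",
    "N1:1, S23H1, 1.2, 1112198223",
    "P1:1, S2H1, 1.2, 1112198223",
    "N1:1, S5H1, 2.0, 1112146781",
    "P1:1, S15H1, 2.0, 1112146781",
);

# Schwartzian Transform: decorate each line with its keys, sort on the
# keys (Id as a string, H and S as numbers), then strip the decoration.
my @sorted =
    map  { $_->[0] }
    sort { $a->[1] cmp $b->[1] or $a->[2] <=> $b->[2] or $a->[3] <=> $b->[3] }
    map  { [ $_, sort_keys($_) ] } @unsorted;

print "$_\n" for @sorted;
```

Each comparison falls through to the next key only when the previous one compares equal, which is what `or` does with the 0 returned by `cmp` and `<=>`.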

Re: Alpha numeric sort
by thekestrel (Friar) on Mar 30, 2005 at 23:05 UTC
    Hi,
    There was a post just the other day on doing alphanumeric sorts like the one you describe. It can be found here and has some good approaches.

    Regards Paul
Re: Alpha numeric sort
by rickerl (Acolyte) on Mar 31, 2005 at 07:44 UTC
    Cool, I just needed to do more thinking. I think I have it figured out...

    How's this?
    use strict;

    my %dataset;
    while (<>) {
        # $_ => Id, Loc, Value, Etime
        my ($id, $loc, $value, $etime) = split(/,/, $_);
        $dataset{$id}{$etime} = {
            "Value" => $value,
            "Loc"   => $loc,
        };
    }

    foreach my $id (sort keys %dataset) {
        foreach my $etime (
            map  { $_->[0] }
            sort sort_loc
            map  { [$_, $dataset{$id}{$_}{Loc}] }
            keys %{$dataset{$id}}
        ) {
            my $loc   = $dataset{$id}{$etime}{Loc};
            my $value = $dataset{$id}{$etime}{Value};
            printf(STDOUT "%2s, %6s, %.1f, %i\n", $id, $loc, $value, $etime);
        }
    }

    sub sort_loc {
        my ($loc_a, $head_a) = $a->[1] =~ /S(\d+)H(\d)/;
        my ($loc_b, $head_b) = $b->[1] =~ /S(\d+)H(\d)/;
        if (($head_a <=> $head_b) == 0) {
            $loc_a <=> $loc_b;
        }
        else {
            $head_a <=> $head_b;
        }
    }

    Thanks again!
    Ryan
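
In the version above, sort_loc still runs both regexes on every comparison, so the decorate step saves little. A possible variation, sketched here under the assumption that every Loc matches S<digits>H<digits>, does the parsing once per key in the first map and leaves only numeric comparisons in the sort block. The inline %dataset and the @rows array are stand-ins for illustration.

```perl
use strict;
use warnings;

# Same nested layout as the thread's %dataset, filled in by hand here
# with the P4:2 rows from the question instead of reading from <>.
my %dataset = (
    'P4:2' => {
        1112149258 => { Value => 1.1, Loc => 'S10H1' },
        1112176607 => { Value => 0.9, Loc => 'S10H2' },
        1112198223 => { Value => 1.2, Loc => 'S21H1' },
        1112146781 => { Value => 2.0, Loc => 'S1H1' },
    },
);

my @rows;    # collected output lines (assumption: kept in memory for the demo)
foreach my $id ( sort keys %dataset ) {
    my $by_etime = $dataset{$id};

    # Decorate each etime with [ etime, S number, H number ]; the regex
    # in list context returns its two captures.
    my @etimes =
        map  { $_->[0] }
        sort { $a->[2] <=> $b->[2] or $a->[1] <=> $b->[1] }
        map  { [ $_, $by_etime->{$_}{Loc} =~ /^S(\d+)H(\d+)$/ ] }
        keys %$by_etime;

    foreach my $etime (@etimes) {
        my $rec = $by_etime->{$etime};
        push @rows,
            sprintf( "%s, %s, %.1f, %d", $id, $rec->{Loc}, $rec->{Value}, $etime );
    }
}

print "$_\n" for @rows;
```

A Loc that fails to match would leave the keys undefined and trigger warnings in the sort block, so real input may need a validation step first.
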
Re: Alpha numeric sort
by rickerl (Acolyte) on Mar 31, 2005 at 07:24 UTC
    Well, I have something working, but it isn't pretty.
    use strict;

    my %dataset;
    while (<>) {
        # $_ => Id, Loc, Value, Etime
        my ($id, $loc, $value, $etime) = split(/,/, $_);
        $dataset{$id}{$etime} = {
            "Value" => $value,
            "Loc"   => $loc,
        };
    }

    foreach my $id (sort keys %dataset) {
        foreach my $etime (
            sort {
                my ($loc_a, $head_a) = $dataset{$id}{$a}{Loc} =~ /S(\d+)H(\d)/;
                my ($loc_b, $head_b) = $dataset{$id}{$b}{Loc} =~ /S(\d+)H(\d)/;
                if (($head_a <=> $head_b) == 0) {
                    $loc_a <=> $loc_b;
                }
                else {
                    $head_a <=> $head_b;
                }
            } keys %{$dataset{$id}}
        ) {
            my $loc   = $dataset{$id}{$etime}{Loc};
            my $value = $dataset{$id}{$etime}{Value};
            printf(STDOUT "%2s, %6s, %.1f, %i\n", $id, $loc, $value, $etime);
        }
    }

    Thanks for your help guys,
    Ryan
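
The inline sort block above parses two Loc strings on every comparison. If the comparator is to stay a plain sort sub, one option is to cache the parsed keys per Loc string, a trick often called the Orcish Maneuver. The sketch below is only an illustration: %keys_for, loc_keys and by_loc are hypothetical names, and it sorts bare Loc strings rather than the thread's nested hash.

```perl
use strict;
use warnings;

my %keys_for;    # hypothetical cache: Loc string => [ H number, S number ]

# Parse a Loc such as "S23H1" once and remember the result.
sub loc_keys {
    my ($loc) = @_;
    return $keys_for{$loc} ||= do {
        my ( $s, $h ) = $loc =~ /^S(\d+)H(\d+)$/
            or die "unexpected Loc: $loc";
        [ $h, $s ];
    };
}

# Comparator: H number first, then S number, both compared numerically.
sub by_loc {
    my ( $ka, $kb ) = ( loc_keys($a), loc_keys($b) );
    return $ka->[0] <=> $kb->[0] || $ka->[1] <=> $kb->[1];
}

my @input = qw(S100H1 S19H2 S23H1 S5H1);
my @locs  = sort by_loc @input;

print "@locs\n";
```

The cache trades a little memory for fewer regex matches, which tends to matter only when the list is long or Loc values repeat often.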