Alpha numeric sort

rickerl has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to sort values stored in a hash from a large dataset similar to the following:

Id, Loc, Value, Etime
N1:1, S5H1, 2.0, 1112146781
N1:1, S23H1, 1.2, 1112198223
N1:1, S100H1, 1.1, 1112149258
N1:1, S19H2, 0.9, 1112176607

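Within each Id, the rows should be sorted by Loc: all H1 entries come before H2 entries, and within each H group the rows are ordered numerically by the middle term (the number after S).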

I'm not too familiar with map, but I believe it's what is needed. I just haven't figured out how to make it do what I need it to and would like some help.

Here is my test code snippet and a small dataset:

#!/usr/bin/perl -w
use strict;

my %dataset;

while (<>) {
   # $_ => Id, Loc, Value, Etime

   my ($id, $loc, $value, $etime) = split(/,\s+/, $_);

   $dataset{$id}{$etime} = { "Value" => $value,
                             "Loc" => $loc,
                           };
}

foreach my $id (sort keys %dataset) {
   foreach my $etime ( sort { $dataset{$id}{$a}{Loc} cmp $dataset{$id}{$b}{Loc} } keys %{$dataset{$id}}) {
      my $loc = $dataset{$id}{$etime}{Loc};
      my $value = $dataset{$id}{$etime}{Value};

      printf(STDOUT "%2s, %6s, %.1f, %i\n", $id, $loc, $value, $etime);
   }
}

Input dataset:
N1:1, S100H1, 1.1, 1112149258
P1:1, S10H1, 1.1, 1112149258
N3:2, S102H1, 1.1, 1112149258
P4:2, S10H1, 1.1, 1112149258
N1:1, S19H2, 0.9, 1112176607
P1:1, S9H2, 0.9, 1112176607
N3:2, S29H2, 0.9, 1112176607
P4:2, S10H2, 0.9, 1112176607
N1:1, S23H1, 1.2, 1112198223
P1:1, S2H1, 1.2, 1112198223
N3:2, S33H1, 1.2, 1112198223
P4:2, S21H1, 1.2, 1112198223
N1:1, S5H1, 2.0, 1112146781
P1:1, S15H1, 2.0, 1112146781
N3:2, S5H1, 2.0, 1112146781
P4:2, S1H1, 2.0, 1112146781

Desired output:
N1:1, S5H1, 2.0, 1112146781
N1:1, S23H1, 1.2, 1112198223
N1:1, S100H1, 1.1, 1112149258
N1:1, S19H2, 0.9, 1112176607
N3:2, S5H1, 2.0, 1112146781
N3:2, S33H1, 1.2, 1112198223
N3:2, S102H1, 1.1, 1112149258
N3:2, S29H2, 0.9, 1112176607
P1:1, S2H1, 1.2, 1112198223
P1:1, S10H1, 1.1, 1112149258
P1:1, S15H1, 2.0, 1112146781
P1:1, S9H2, 0.9, 1112176607
P4:2, S1H1, 2.0, 1112146781
P4:2, S10H1, 1.1, 1112149258
P4:2, S21H1, 1.2, 1112198223
P4:2, S10H2, 0.9, 1112176607

Thanks,
Ryan


Replies are listed 'Best First'.
Re: Alpha numeric sort by tlm (Prior) on Mar 30, 2005 at 23:08 UTC
You'll be deluged with responses containing the string "Schwartz(ian)? Transform" or "ST". See this post for an explanation. Basically, using the terminology of that post, you want a function

sub property {
    my $loc = ( split /,\s+/, $_[ 0 ] )[ 1 ];
    $loc =~ /^S\d+H(\d+)$/;
    return $1;
}

Then, do the ST thang:

my @sorted = map  { $_->[0] }
             sort { $a->[1] <=> $b->[1] }
             map  { [ $_, property( $_ ) ] } @unsorted;

the lowliest monk
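The property function above pulls out only the H number, while the question also wants rows ordered by the number after S within each H group. Here is a rough sketch of the same transform with a two-part key. The helper name loc_parts and the sample list are made up for illustration (the Loc values are copied from the N1:1 rows in the question), and it assumes every Loc looks like S<digits>H<digits>.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample Loc values copied from the question's N1:1 rows (illustrative input).
my @unsorted = qw(S100H1 S19H2 S23H1 S5H1);

# Hypothetical helper: split "S<number>H<number>" into its two numeric parts.
sub loc_parts {
    my ($loc) = @_;
    my ($s, $h) = $loc =~ /^S(\d+)H(\d+)$/
        or die "Unexpected Loc format: $loc";
    return ($s, $h);
}

# Schwartzian Transform: decorate each item with its keys, sort on the
# cached keys (H number first, then S number), then strip the keys off.
my @sorted =
    map  { $_->[0] }
    sort { $a->[2] <=> $b->[2] or $a->[1] <=> $b->[1] }
    map  { [ $_, loc_parts($_) ] }
    @unsorted;

print "$_\n" for @sorted;
```

The `or` in the sort block falls through to the second comparison only when the first returns 0, which is how the tie on the H number gets broken by the S number.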
Re: Alpha numeric sort by thekestrel (Friar) on Mar 30, 2005 at 23:05 UTC
Hi, there was a post just the other day on doing alpha sorts like the one you describe. It can be found here and has some good approaches to this. Regards, Paul
Re: Alpha numeric sort by rickerl (Acolyte) on Mar 31, 2005 at 07:44 UTC
Cool, I just needed to do more thinking. I think I have it figured out... How's this?

use strict;

my %dataset;

while (<>) {
   # $_ => Id, Loc, Value, Etime

   my ($id, $loc, $value, $etime) = split(/,/, $_);

   $dataset{$id}{$etime} = { "Value" => $value,
                             "Loc" => $loc,
                           };
}

foreach my $id (sort keys %dataset) {
   foreach my $etime ( map { $_->[0] }
                       sort sort_loc
                       map { [$_, $dataset{$id}{$_}{Loc}] }
                       keys %{$dataset{$id}} ) {
      my $loc = $dataset{$id}{$etime}{Loc};
      my $value = $dataset{$id}{$etime}{Value};

      printf(STDOUT "%2s, %6s, %.1f, %i\n", $id, $loc, $value, $etime);
   }
}

sub sort_loc {
   my ($loc_a, $head_a) = $a->[1] =~ /S(\d+)H(\d)/;
   my ($loc_b, $head_b) = $b->[1] =~ /S(\d+)H(\d)/;

   if (($head_a <=> $head_b) == 0) {
      $loc_a <=> $loc_b;
   } else {
      $head_a <=> $head_b;
   }
}

Thanks again!
Ryan
Re: Alpha numeric sort by rickerl (Acolyte) on Mar 31, 2005 at 07:24 UTC
Well, I have something working, but it isn't pretty.

use strict;

my %dataset;

while (<>) {
   # $_ => Id, Loc, Value, Etime

   my ($id, $loc, $value, $etime) = split(/,/, $_);

   $dataset{$id}{$etime} = { "Value" => $value,
                             "Loc" => $loc,
                           };
}

foreach my $id (sort keys %dataset) {
   foreach my $etime ( sort {
         my ($loc_a, $head_a) = $dataset{$id}{$a}{Loc} =~ /S(\d+)H(\d)/;
         my ($loc_b, $head_b) = $dataset{$id}{$b}{Loc} =~ /S(\d+)H(\d)/;

         if (($head_a <=> $head_b) == 0) {
            $loc_a <=> $loc_b;
         } else {
            $head_a <=> $head_b;
         }
      } keys %{$dataset{$id}} ) {
      my $loc = $dataset{$id}{$etime}{Loc};
      my $value = $dataset{$id}{$etime}{Value};

      printf(STDOUT "%2s, %6s, %.1f, %i\n", $id, $loc, $value, $etime);
   }
}

Thanks for your help guys,
Ryan
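For what it's worth, the if/else in that sort block can probably be folded into a single expression, because `<=>` returns 0 on a tie and `||` then falls through to the next comparison. A minimal sketch, assuming the same id -> etime -> { Loc, Value } layout as above; the by_loc helper and the hard-coded P1:1 rows (copied from the input dataset) are only there so it runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy data in the same shape as %dataset above: id -> etime -> { Loc, Value }.
my %dataset = (
    'P1:1' => {
        1112149258 => { Loc => 'S10H1', Value => 1.1 },
        1112176607 => { Loc => 'S9H2',  Value => 0.9 },
        1112198223 => { Loc => 'S2H1',  Value => 1.2 },
        1112146781 => { Loc => 'S15H1', Value => 2.0 },
    },
);

# Hypothetical helper: compare two Loc strings by H number, then by S number.
sub by_loc {
    my ($x, $y) = @_;
    my ($sx, $hx) = $x =~ /S(\d+)H(\d+)/;
    my ($sy, $hy) = $y =~ /S(\d+)H(\d+)/;
    return $hx <=> $hy || $sx <=> $sy;
}

my @locs;    # Loc values in printed order, kept only to make checking easy
for my $id (sort keys %dataset) {
    my $rows = $dataset{$id};
    for my $etime (sort { by_loc($rows->{$a}{Loc}, $rows->{$b}{Loc}) } keys %$rows) {
        push @locs, $rows->{$etime}{Loc};
        printf "%s, %s, %.1f, %d\n",
            $id, $rows->{$etime}{Loc}, $rows->{$etime}{Value}, $etime;
    }
}
```

This still re-parses each Loc on every comparison, so for a large dataset the map/sort/map version is likely the better fit; the point here is only the shorter comparator.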