sorting a table columns using hashes

ihperlbeg has asked for the wisdom of the Perl Monks concerning the following question:

Re: sorting a table columns using hashes by Marshall (Canon) on May 05, 2011 at 15:56 UTC
I am also a bit fuzzy on the requirements as the data doesn't appear to exercise all of the limits. But it appears that the data can be sorted in some order and then a filter (grep) operation run to select which lines are of relevance. See attached code. If I don't have it exactly right, I think you will be able to modify this pattern to do what you want.

I'm not quite sure about the relationship between col 1 and col 2. The example data had only "matching sets" of those two columns. Matching up the lowest col 5 with highest col 7 is accomplished by reversing the sort order of col 7 as shown below.

    #!/usr/bin/perl -w
    use strict;
    use Data::Dump qw(pp);

    my @AoA;
    while(<DATA>)
    {
       my @cols = split;
       push @AoA, [@cols];
    }

    @AoA = sort{ $a->[1] <=> $b->[1]
                 or
                 $a->[0] cmp $b->[0]
                 or
                 $a->[4] <=> $b->[4]
                 or
                 $b->[6] <=> $a->[6]
               }@AoA;
    pp \@AoA;

    =prints
    [
      ["xhahhxha", 60, 3, "shdgehsh", 8, 1, 150],   #this one
      ["xhahhxha", 60, 3, "jrthtahtat", 8, 1, 110],
      ["xhahhxha", 60, 3, "hahaghagah", 10, 1, 101],
      ["hsghtahs", 100, 19, "shdgehsh", 10, 20, 400],  #this one
      ["hsghtahs", 100, 19, "jrthtahtat", 10, 20, 300],
      ["hsghtahs", 100, 19, "hahaghagah", 10, 20, 200],
    ]
    =cut

    my %seen;
    @AoA = grep{!$seen{"$_->[1]"."$_->[0]"}++}@AoA; #first of new col1,2 combo

    pp \@AoA;

    =prints
    [
      ["xhahhxha", 60, 3, "shdgehsh", 8, 1, 150],
      ["hsghtahs", 100, 19, "shdgehsh", 10, 20, 400],
    ]
    =cut

    __DATA__
    xhahhxha 60 3 hahaghagah 10 1 101
    xhahhxha 60 3 jrthtahtat 8 1 110
    xhahhxha 60 3 shdgehsh 8 1 150
    hsghtahs 100 19 hahaghagah 10 20 200
    hsghtahs 100 19 jrthtahtat 10 20 300
    hsghtahs 100 19 shdgehsh 10 20 400
Re^2: sorting a table columns using hashes (weaving output) by LanX (Saint) on May 05, 2011 at 16:09 UTC
I like the way you are weaving the test output as POD into your code. I wonder if there is already a CPAN module that automates this, since ~~`$.`~~ `__LINE__` and `caller` give the current line of source code. Might be handy for PM posts.

Cheers Rolf

PS: Talking about fuzzy requirements, why does the second column have precedence over the first in your sort?

UPDATE: corrected `$.` (which is only INPUT_LINE_NUMBER)

UPDATE: see "weaving output into code" for a proof of concept
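To make the remark about `__LINE__` and `caller` concrete, here is a minimal sketch of how a helper can learn the source line it was called from. This is not an existing CPAN module; the helper name `tagged` is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: prefix a value with the source line it was called from.
# In list context, caller() returns (package, filename, line) of the calling code.
sub tagged {
    my ($value) = @_;
    my (undef, undef, $line) = caller;
    return "line $line: $value";
}

# __LINE__ is a compile-time token holding the current source line number.
print tagged("sorted ok"), "\n";
print "this statement is on line ", __LINE__, "\n";
```

A tool that weaves output back into a script would presumably use that line number to decide where to insert the POD block, but that part is left out here.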
Re^3: sorting a table columns using hashes (weaving output) by Marshall (Canon) on May 05, 2011 at 16:35 UTC
Glad you like my little POD trick. I don't know of any automated modules for this - interesting idea!

I'm still unsure about the col 1 and col 2 stuff. I guess I keyed in on this phrase "(sorted based on second column..." and thought that was the primary key. My brain had a bit of trouble interpreting the spec. The test data doesn't have enough cases to unambiguously demonstrate the desired behavior. I hope the code is clear enough that the OP can make these precedence tweaks or other desired changes.

cheers, Marshall
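For what it's worth, precedence tweaks like these can be made by reordering a list instead of editing the sort block. A minimal sketch follows; `make_cmp` and its spec format are made up for illustration and are not part of any module, and the rows are a subset of the thread's sample data. Here column 1 is assumed to be the primary key, which is only one possible reading of the spec.

```perl
use strict;
use warnings;

# Build a comparator from a list of [column index, 'num'|'str', 'asc'|'desc'].
# Earlier entries take precedence; later ones only break ties.
sub make_cmp {
    my @spec = @_;
    return sub {
        my ($x, $y) = @_;
        for my $s (@spec) {
            my ($col, $type, $dir) = @$s;
            my $r = $type eq 'num' ? $x->[$col] <=> $y->[$col]
                                   : $x->[$col] cmp $y->[$col];
            $r = -$r if $dir eq 'desc';
            return $r if $r;
        }
        return 0;
    };
}

my @rows = map { [split] } (
    'xhahhxha 60 3 hahaghagah 10 1 101',
    'hsghtahs 100 19 shdgehsh 10 20 400',
    'xhahhxha 60 3 shdgehsh 8 1 150',
    'hsghtahs 100 19 jrthtahtat 10 20 300',
);

# Column 1 (index 0) first this time, then column 2,
# then column 5 ascending and column 7 descending.
my $by_name_first = make_cmp(
    [0, 'str', 'asc'], [1, 'num', 'asc'], [4, 'num', 'asc'], [6, 'num', 'desc'],
);
my @sorted = sort { $by_name_first->($a, $b) } @rows;
print join(' ', @$_), "\n" for @sorted;
```

Swapping the first two entries of the spec gives back the column-2-first ordering.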
Re: sorting a table columns using hashes by LanX (Saint) on May 05, 2011 at 15:25 UTC
> suggestions?

Sure: the perldocs for while, split, and sort. Searching for "Hash of Array", "Array of Array", and "Schwartzian transform" might help. Showing us your attempts, instead of just a fuzzy requirement hidden in the code area, will help us give you more constructive advice.

Cheers Rolf

UPDATE: If I understand your data, you should:

- parse your data into a hash (first column = key) of arrays of arrays (split lines);
- sort the arrays of arrays, combining differently weighted criteria with or (search sort examples for `||`);
- output the top entry for every key of the hash.
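The three steps in the update could look roughly like this. It is only a sketch: the variable names are made up, the sample rows are the ones posted elsewhere in this thread, and the weighting (lowest column 5 first, then highest column 7) is an assumption about what the OP wants.

```perl
use strict;
use warnings;

my @lines = (
    'xhahhxha 60 3 hahaghagah 10 1 101',
    'xhahhxha 60 3 jrthtahtat 8 1 110',
    'xhahhxha 60 3 shdgehsh 8 1 150',
    'hsghtahs 100 19 hahaghagah 10 20 200',
    'hsghtahs 100 19 jrthtahtat 10 20 300',
    'hsghtahs 100 19 shdgehsh 10 20 400',
);

# Step 1: hash of arrays of arrays, keyed on the first column.
my %rows_for;
for my $line (@lines) {
    my @cols = split ' ', $line;
    push @{ $rows_for{ $cols[0] } }, \@cols;
}

# Steps 2 and 3: sort each group by the weighted criteria, keep the top row.
my %best_for;
for my $key (keys %rows_for) {
    my ($top) = sort {
        $a->[4] <=> $b->[4]      # lowest column 5 first
            ||
        $b->[6] <=> $a->[6]      # ties: highest column 7 first
    } @{ $rows_for{$key} };
    $best_for{$key} = $top;
}

print join(' ', @{ $best_for{$_} }), "\n" for sort keys %best_for;
```

Sorting every group just to take one row is not the cheapest way, but it keeps the criteria in one readable place.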
Re: sorting a table columns using hashes by Utilitarian (Vicar) on May 05, 2011 at 15:39 UTC
Does the lowest value in column 5 override the highest value in column 7, or vice versa? I.e., with the data

    xhahhxha 60 3 hahaghagah 7 1 101
    xhahhxha 60 3 jrthtahtat 8 1 110
    xhahhxha 60 3 shdgehsh 10 1 150

what do you output?

`print "Good ",qw(night morning afternoon evening)[(localtime)[2]/6]," fellow monks."`
Re^2: sorting a table columns using hashes by ihperlbeg (Novice) on May 05, 2011 at 15:46 UTC
With this data:

    xhahhxha 60 3 hahaghagah 7 1 101
    xhahhxha 60 3 jrthtahtat 8 1 110
    xhahhxha 60 3 shdgehsh 10 1 150

Output will be:

    xhahhxha 60 3 hahaghagah 7 1 101
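Read that way, a lower column 5 wins outright and column 7 only breaks ties. A minimal single-pass sketch of that rule, with no sorting, run against the three rows above; the sub name `beats` is made up for illustration:

```perl
use strict;
use warnings;

# True if row $x should replace the current best row $y:
# a lower column 5 wins outright; column 7 only breaks ties.
sub beats {
    my ($x, $y) = @_;
    return 1 if $x->[4] < $y->[4];
    return 0 if $x->[4] > $y->[4];
    return $x->[6] > $y->[6] ? 1 : 0;
}

my %best;
for my $line (
    'xhahhxha 60 3 hahaghagah 7 1 101',
    'xhahhxha 60 3 jrthtahtat 8 1 110',
    'xhahhxha 60 3 shdgehsh 10 1 150',
) {
    my @row = split ' ', $line;
    my $key = $row[0];
    # Keep the whole row, so the columns of one record never get mixed with another.
    $best{$key} = \@row if !$best{$key} || beats(\@row, $best{$key});
}

print join(' ', @{ $best{$_} }), "\n" for sort keys %best;
# prints: xhahhxha 60 3 hahaghagah 7 1 101
```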
Re^3: sorting a table columns using hashes by Utilitarian (Vicar) on May 05, 2011 at 15:56 UTC
    #!/usr/bin/perl
    use strict;
    use warnings;

    my %records;
    while (<DATA>) {
        chomp;
        # Everything after __DATA__ is data, including the __END__ line
        # and the sample output below it, so stop reading there.
        last if /^__END__/;
        my @record = split(/\s+/, $_);
        if (defined $records{$record[0]}) {
            # We've seen it before and need to compare the data
            if ($record[1] < $records{$record[0]}->[1]) {
                # we have a smaller value and so should use this
                $records{$record[0]}->[1] = $record[1];
            }
            if ($record[4] < $records{$record[0]}->[4]) {
                # lower col 5: take cols 4, 5 and 7 from this line
                $records{$record[0]}->[3] = $record[3];
                $records{$record[0]}->[4] = $record[4];
                $records{$record[0]}->[6] = $record[6];
            }
            elsif (($record[4] == $records{$record[0]}->[4])
                && ($record[6] > $records{$record[0]}->[6])) {
                # same col 5 but higher col 7: take cols 4 and 7 from this line
                $records{$record[0]}->[3] = $record[3];
                $records{$record[0]}->[6] = $record[6];
            }
        }
        else {
            @{$records{$record[0]}} = @record;
        }
    }
    for my $key (reverse sort keys %records) {
        print join("\t", @{$records{$key}}), "\n";
    }
    __DATA__
    xhahhxha 60 3 hahaghagah 10 1 101
    xhahhxha 60 3 jrthtahtat 8 1 110
    xhahhxha 60 3 shdgehsh 8 1 150
    hsghtahs 100 19 hahaghagah 10 20 200
    hsghtahs 100 19 jrthtahtat 10 20 300
    hsghtahs 100 19 shdgehsh 10 20 400
    __END__
    xhahhxha 60 3 shdgehsh 8 1 150
    hsghtahs 100 19 shdgehsh 10 20 400

Ugly but functional