in reply to sorting a table columns using hashes

I am also a bit fuzzy on the requirements, as the sample data doesn't appear to exercise all of the cases. But it appears that the data can be sorted in some order and then filtered with grep to select the relevant lines. See attached code.

If I don't have it exactly right, I think you will be able to modify this pattern to do what you want. I'm not quite sure about the relationship between col 1 and col 2: the example data had only "matching sets" of those two columns. Matching up the lowest col 5 with the highest col 7 is accomplished by reversing the sort order of col 7, as shown below.

#!/usr/bin/perl -w
use strict;
use Data::Dump qw(pp);

my @AoA;
while (<DATA>)
{
   my @cols = split;
   push @AoA, [@cols];
}

# col 2 ascending, then col 1, then col 5 ascending, then col 7 descending
@AoA = sort{ $a->[1] <=> $b->[1]
             or
             $a->[0] cmp $b->[0]
             or
             $a->[4] <=> $b->[4]
             or
             $b->[6] <=> $a->[6]
           }@AoA;
pp \@AoA;

=prints
[
  ["xhahhxha", 60, 3, "shdgehsh", 8, 1, 150],       #this one
  ["xhahhxha", 60, 3, "jrthtahtat", 8, 1, 110],
  ["xhahhxha", 60, 3, "hahaghagah", 10, 1, 101],
  ["hsghtahs", 100, 19, "shdgehsh", 10, 20, 400],   #this one
  ["hsghtahs", 100, 19, "jrthtahtat", 10, 20, 300],
  ["hsghtahs", 100, 19, "hahaghagah", 10, 20, 200],
]
=cut

my %seen;
@AoA = grep{!$seen{"$_->[1]"."$_->[0]"}++}@AoA; # first of new col1,2 combo
pp \@AoA;

=prints
[
  ["xhahhxha", 60, 3, "shdgehsh", 8, 1, 150],
  ["hsghtahs", 100, 19, "shdgehsh", 10, 20, 400],
]
=cut

__DATA__
xhahhxha 60 3 hahaghagah 10 1 101
xhahhxha 60 3 jrthtahtat 8 1 110
xhahhxha 60 3 shdgehsh 8 1 150
hsghtahs 100 19 hahaghagah 10 20 200
hsghtahs 100 19 jrthtahtat 10 20 300
hsghtahs 100 19 shdgehsh 10 20 400
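
One caveat on the %seen key: gluing the two columns together with no separator could, in principle, let two different pairs produce the same key (for example "12" and "3" versus "1" and "23"). It doesn't happen with the sample data, but here is a rough sketch of a safer variant. The helper name first_per_group is made up for illustration, and the tab separator assumes the fields came from a whitespace split, so they can't contain a tab themselves.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original post): keep the first row
# seen for each (col 1, col 2) pair. Assumes the rows are already sorted
# so that the wanted row of each group comes first.
sub first_per_group {
    my @rows = @_;
    my %seen;
    # Join the two key columns with a tab; fields produced by a plain
    # split cannot contain whitespace, so distinct pairs give distinct keys.
    return grep { !$seen{ join "\t", @{$_}[0, 1] }++ } @rows;
}

# A few already-sorted rows taken from the sample data.
my @rows = (
    [ 'xhahhxha', 60,  3,  'shdgehsh',   8,  1,  150 ],
    [ 'xhahhxha', 60,  3,  'jrthtahtat', 8,  1,  110 ],
    [ 'hsghtahs', 100, 19, 'shdgehsh',   10, 20, 400 ],
    [ 'hsghtahs', 100, 19, 'jrthtahtat', 10, 20, 300 ],
);

my @kept = first_per_group(@rows);
print join(' ', @$_), "\n" for @kept;
```

The grep itself is unchanged in spirit; only the way the key is built differs.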

Re^2: sorting a table columns using hashes (weaving output)
by LanX (Saint) on May 05, 2011 at 16:09 UTC
    I like the way you are weaving the test output as POD into your code.

    I wonder if there is already a CPAN module that automates this, since __LINE__ and caller give the current line of source code.
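
    A minimal sketch of just the line-number part, not of any existing module: the helper name here() is invented for illustration. It only shows that __LINE__ and caller report source lines, which is what a tool weaving output back into the code would need to know.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: return the source line it was called from.
    # In list context, caller returns (package, filename, line).
    sub here {
        return (caller)[2];
    }

    my $direct     = __LINE__;   # line number of this statement
    my $via_caller = here();     # line number of this call, one line below

    print "direct=$direct via_caller=$via_caller\n";
    ```

    The two values differ by exactly one here because the call sits on the line right after the __LINE__ statement.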

    Might be handy for PM posts.

    Cheers Rolf

    PS: talking about fuzzy requirements, why does the second column have precedence over the first in your sort?

    UPDATE: corrected $. (which is only INPUT_LINE_NUMBER)

    UPDATE: see weaving output into code for a proof of concept

      Glad you like my little POD trick. I don't know of any automated modules for this - interesting idea!

      I'm still unsure about the col 1 and col 2 stuff. I guess I keyed in on this phrase "(sorted based on second column..." and thought that was the primary key. My brain had a bit of trouble interpreting the spec. The test data doesn't have enough cases to unambiguously demonstrate the desired behavior. I hope the code is clear enough that the OP can make these precedence tweaks or other desired changes.
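
      For what it's worth, a sketch of the precedence tweak, assuming col 1 should be the primary key instead of col 2. Only the order of the first two comparisons changes; the variable name @sorted is mine, and whether this is what the OP actually wants is still a guess.

      ```perl
      use strict;
      use warnings;

      # A few rows from the sample data, deliberately out of order.
      my @AoA = (
          [ 'xhahhxha', 60,  3,  'hahaghagah', 10, 1,  101 ],
          [ 'xhahhxha', 60,  3,  'jrthtahtat', 8,  1,  110 ],
          [ 'xhahhxha', 60,  3,  'shdgehsh',   8,  1,  150 ],
          [ 'hsghtahs', 100, 19, 'hahaghagah', 10, 20, 200 ],
          [ 'hsghtahs', 100, 19, 'shdgehsh',   10, 20, 400 ],
      );

      # Same four keys as the posted sort, but col 1 (index 0) is now
      # compared before col 2 (index 1). Col 5 stays ascending and
      # col 7 stays descending.
      my @sorted = sort {
             $a->[0] cmp $b->[0]
          or $a->[1] <=> $b->[1]
          or $a->[4] <=> $b->[4]
          or $b->[6] <=> $a->[6]
      } @AoA;

      print join(' ', @$_), "\n" for @sorted;
      ```

      With this data the "hsghtahs" rows now come out first, because the string comparison on col 1 wins over the numeric comparison on col 2.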

      cheers, Marshall