
This is basically one of the solutions I offered above, but with a number of caveats about the data.

Re^3: Sorting based on any column
by Anonymous Monk on May 20, 2015 at 11:07 UTC

    Sorry for not being clear about the requirement. Suppose my OBJD array data looks like this:

    1 ab 2 3 cd 4 5 6 6
    9 rc 4 5 ef 6 3 4 1
    7 fa 5 2 tg 5 9 9 0
    3 bg 3 9 jh 5 2 2 1

    I want to create a subroutine that takes two arguments: the first is the array and the second is the column number to sort on. For example, if sort_array is the subroutine and I pass the OBJD array and the column to sort on, something like sort_array(\@OBJD,2), it should give me the output below:

    1 ab 2 3 cd 4 5 6 6
    3 bg 3 9 jh 5 2 2 1
    9 rc 4 5 ef 6 3 4 1
    7 fa 5 2 tg 5 9 9 0

    Or something like sort_array(\@OBJD,6), which should give me the output below:

    3 bg 3 9 jh 5 2 2 1
    9 rc 4 5 ef 6 3 4 1
    1 ab 2 3 cd 4 5 6 6
    7 fa 5 2 tg 5 9 9 0

    I would like to do it with a regular sort as well as with a 'Schwartzian transform', just to learn it.

      It depends on whether your sample data represents an array-of-arrays, with each non-whitespace token as an element of a sub-array, or a single-level array with each line as an element. In the first case, it's fairly simple, something like this:

      sub sort_aoa {
          my( $array, $column ) = @_;
          return sort { $a->[$column] cmp $b->[$column] } @$array;
      }

      If each element is a whole line, you'll have to split them into words before sorting. This is where a Schwartzian Transform is likely to help the most, but I'll show the basic idea and you can add that:

      sub sort_lines_by_column {
          my( $array, $column ) = @_;
          return sort {
              return( (split ' ', $a)[$column] cmp (split ' ', $b)[$column] );
          } @$array;
      }

      (Untested. In both cases, replace the 'cmp' comparison with whatever you want.)
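
The Schwartzian Transform mentioned above could look roughly like the sketch below. It splits each line once, caches the chosen column as a sort key, sorts on that key, and then unwraps the original lines. The name sort_lines_by_column_st is made up for this illustration, the column index is assumed to be 0-based (which matches the sample output in the question), and the numeric `<=>` is an assumption that suits the numeric sample columns; swap in `cmp` for text columns.

```perl
use strict;
use warnings;

# Sort whole-line strings by one whitespace-separated column (0-based),
# splitting each line only once (Schwartzian Transform).
# The sub name is hypothetical; it is not from the original post.
sub sort_lines_by_column_st {
    my ( $array, $column ) = @_;
    return
        map  { $_->[1] }                            # 3. unwrap the original line
        sort { $a->[0] <=> $b->[0] }                # 2. compare the cached keys
        map  { [ (split ' ', $_)[$column], $_ ] }   # 1. pair each key with its line
        @$array;
}

my @OBJD = (
    '1 ab 2 3 cd 4 5 6 6',
    '9 rc 4 5 ef 6 3 4 1',
    '7 fa 5 2 tg 5 9 9 0',
    '3 bg 3 9 jh 5 2 2 1',
);

print "$_\n" for sort_lines_by_column_st( \@OBJD, 2 );
```

Read the pipeline from the bottom up: the last `map` runs first. For a handful of short lines the plain version is just as good; the transform only pays off when computing the key is expensive or the list is long.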

      Aaron B.
      Available for small or large Perl jobs and *nix system administration; see my home node.

        Wow, thanks a lot, Aaron!!!

        This is not an array of arrays, so I used the second piece of code you provided, and it works perfectly. But I have a few questions:

        1. In split you specified ' ', which I thought would split on a single space, but in reality it splits on any number of spaces.

        2. How can I do it using a 'Schwartzian transform'? I am still a novice in Perl, so please bear with me.

        3. How can I get the flexibility to pass the sort order, ascending or descending, to this subroutine? I tried the code below, but it's not working.

        sub sort_lines_by_column {
            my( $array, $column, $order ) = @_;
            my $ab;
            my $cd;
            if ($order eq 'asc') {$ab = "\$a"; $cd = "\$b";}
            elsif ($order eq 'dsc') {$ab = "\$b"; $cd = "\$a";};
            return sort {
                return( (split ' ', $ab)[$column] <=> (split ' ', $cd)[$column] );
            } @$array;
        }

        It's giving the error "Use of uninitialized value in numeric comparison (<=>)".
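
A likely cause of that warning: "\$a" is the two-character string '$a', not the sort variable, so split returns a single token and indexing it with a column above 0 yields undef. One possible fix, sketched below, is to keep $a and $b in the comparison and invert the result for descending order. The third argument and its 'dsc' value follow the attempt above and are not part of the original answer. The sketch also shows the answer to the first question: `split ' '` is a documented special case that skips leading whitespace and splits on runs of whitespace.

```perl
use strict;
use warnings;

# Sort whole-line strings by a 0-based column.
# $order is a hypothetical third argument: 'dsc' for descending,
# anything else (or nothing) for ascending.
sub sort_lines_by_column {
    my ( $array, $column, $order ) = @_;

    # +1 keeps the comparison result, -1 inverts it.
    my $sign = ( defined $order && $order eq 'dsc' ) ? -1 : 1;

    return sort {
        $sign * ( (split ' ', $a)[$column] <=> (split ' ', $b)[$column] )
    } @$array;
}

# split ' ' skips leading whitespace and splits on runs of
# whitespace, not on one literal space character.
my @fields = split ' ', "  3   bg\t3 ";    # ('3', 'bg', '3')

my @OBJD = (
    '1 ab 2 3 cd 4 5 6 6',
    '9 rc 4 5 ef 6 3 4 1',
    '7 fa 5 2 tg 5 9 9 0',
    '3 bg 3 9 jh 5 2 2 1',
);

print "$_\n" for sort_lines_by_column( \@OBJD, 6, 'dsc' );
```

Swapping $a and $b inside the block for the descending case would work too; multiplying by a sign just avoids writing the comparison twice.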

      Another piece of the puzzle: sort accepts code refs as the sorting function.

      my @l = (10, 2, 200, 23, 3, 9, 11, 1);
      my $num = sub { $a <=> $b };
      my $str = sub { $a cmp $b };
      print join(", ", sort $num @l ), "\n";
      print join(", ", sort $str @l ), "\n";
      __END__
      1, 2, 3, 9, 10, 11, 23, 200
      1, 10, 11, 2, 200, 23, 3, 9

      The thread linked to by the other anon above contains some ideas. Personally I like Scalar::Util's looks_like_number, since AFAIK that's the same function Perl uses internally. If you know that each column contains either only numbers or only non-numbers, testing the first value of the column should be enough to determine which comparison to use for that column. If however your columns contain a mix of numbers and non-numbers (e.g. "1 a","foo","3.14","42","b5ar","93b"), you'll have to make a decision on how to sort a column like that.
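
As a rough sketch of that idea, the code below tests the first value of the chosen column with looks_like_number and picks a numeric or string comparison accordingly. The name sort_lines_auto is invented for this example, and it assumes whole-line elements, a 0-based column index, and a column that is never a mix of numbers and non-numbers.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: choose the comparison by inspecting the first
# value of the column. Assumes the column is all numbers or all
# non-numbers, and that every line has that column.
sub sort_lines_auto {
    my ( $array, $column ) = @_;
    return unless @$array;

    my $first = ( split ' ', $array->[0] )[$column];
    my $cmp = looks_like_number($first)
        ? sub { $_[0] <=> $_[1] }    # numeric column
        : sub { $_[0] cmp $_[1] };   # text column

    return sort {
        $cmp->( (split ' ', $a)[$column], (split ' ', $b)[$column] )
    } @$array;
}

my @OBJD = (
    '1 ab 2 3 cd 4 5 6 6',
    '9 rc 4 5 ef 6 3 4 1',
    '7 fa 5 2 tg 5 9 9 0',
    '3 bg 3 9 jh 5 2 2 1',
);

print "$_\n" for sort_lines_auto( \@OBJD, 1 );   # text column, uses cmp
print "$_\n" for sort_lines_auto( \@OBJD, 0 );   # numeric column, uses <=>
```

A mixed column would still need an explicit decision, for example sorting numbers ahead of everything else; this sketch does not attempt that.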