v15 has asked for the wisdom of the Perl Monks concerning the following question:

HI Eevryone, I have a tab delimited table. Something like this
a 23 11 b 24 15 c 16 19 d 8 2 e 40 41
What i want to do is first arrange column 2 in the descending order of values acoording to column 1 so the result would be
e 40 b 24 a 23 c 16 d 8
Similarly for third and first column it will be
e 41 c 19 b 15 a 11 d 2
The output should be like this
a 3 4 b 2 3 c 4 2 d 5 5 e 1 1
Now in the output the values are basically the line numbers. So for column 2 in the input file a was on 3rd line when we sorted it according to the values. Similarly for b,c,d,e. How can i do this. THanks V15

Replies are listed 'Best First'.
Re: manipulating a data table
by choroba (Cardinal) on Aug 16, 2016 at 16:32 UTC
    No need for Perl, you can use a simple bash script:
    #!/bin/bash file=$1 paste <(join -1 1 -2 2 -o 1.1 2.1 "$file" \ <(cut -f1,2 "$file" | sort -k2nr | nl | sort -k2) ) \ <( cut -f1,3 "$file" | sort -k2nr | nl | sort -k2 | cut -f +1)

    But jokes aside:

    #!/usr/bin/perl use warnings; use strict; use feature qw{ say }; # Read the input. my @input; while (<>) { push @input, [ split ]; } # Sort by columns, save the line number. my @output = map [ $_->[0] ], @input; for my $c (1, 2) { my $x = 1; $output[$_][$c] = $x++ for sort { $input[$b][$c] <=> $input[$a][$c +] } 0 .. $#input; } # Print the line numbers using the original order. for my $l (0 .. $#input) { say join "\t", @{ $output[$l] }; }
    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,

      I admit you got me good with this one choroba.

      I started at anger and prepared to load up my rarely used down vote gun, then went on to irritation when I saw I was out of ammunition, then went on to a sinking feeling in my stomach when I checked who the offending monk was, then to confusion as I checked to make sure it wasn't somehow April 1st again already, and then returned to a state of relaxation with an audible laugh when I actually finished reading past the opening line.

      Well played sir, well played...

      I love it when things get difficult; after all, difficult pays the mortgage. - Dr. Keith Whites
      I hate it when things get difficult, so I'll just sell my house and rent cheap instead. - perldigious
      Hi, The perl code worked really well and for the columns more than 2 i can just use
      for my $c(1..16)
      Can you explain me a little bit what some portions of your code is doing. In the while loop you are adding each line of the input file as an array ref into an array named input
      while (<>) { chomp; my @s = split /\t/,$_; push @input ,\@s; }
      Can you explain me the use of map and what exactly is that line doing? Also can you right this statement in the loop form rather than one line. It is easy to understand
      $output[$_][$c] = $x++ for sort { $input[$b][$c] <=> $input[$a][$c] } +0 .. $#input;
      Thanks
        To see what structure the first loop creates, use
        use Data::Dumper; print Dumper(\@input);

        You'll get the following output:

        The map line

        my @output = map [ $_->[0] ], @input;

        creates the output, it maps each element of @input to a reference to a new array containing its first element, i.e.

        The last cited line sorts the array indices (that represent input line numbers) by $c-th element (reversed), and than sets $c-th element of the output to be the order number. Thus, the first iteration processes the second column:

        and the second one the next one:

        ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
Re: manipulating a data table
by erix (Prior) on Aug 16, 2016 at 17:51 UTC

    Just for the hack of it, can I peddle a postgres version of a 'solution'?

    Assuming a postgres instance, with module file_fdw installed, and your 5 text lines tab-separated saved under '/tmp/table.txt':

    create server abc_server foreign data wrapper file_fdw; create foreign table file_table ( c1 text, i2 int, i3 int) server abc_server options ( filename '/tmp/table.txt', format 'csv', delimiter E'\t' + ); select c1, i2, i3 from file_table order by c1; select c1, i2 from file_table order by i2 desc; select c1, i3 from file_table order by i3 desc; select c1 , rank() over (order by i2 desc) , rank() over (order by i3 desc) from file_table order by c1;

    Output (mimicking yours):

    $ psql -qXf table.sql c1 | i2 | i3 ----+----+---- a | 23 | 11 b | 24 | 15 c | 16 | 19 d | 8 | 2 e | 40 | 41 (5 rows) c1 | i2 ----+---- e | 40 b | 24 a | 23 c | 16 d | 8 (5 rows) c1 | i3 ----+---- e | 41 c | 19 b | 15 a | 11 d | 2 (5 rows) c1 | rank | rank ----+------+------ a | 3 | 4 b | 2 | 3 c | 4 | 2 d | 5 | 5 e | 1 | 1 (5 rows)

    Not entirely serious as a solution but I thought is was cute all the same: it is also the shortest till now, with only 3 lines :)

      OK, who can present a solution that at least involves ...

      • Web Services
      • a NoSQL database
      • a Linux cluster
      • two different ORM
      • Java
      • C#
      • Oracle

      Bonus points for using MUMPS or FORTRAN.

      SCNR ;-)

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

        I was tempted to go for a Fortran web service with SQL (http://fortran.io), but I settled for a solution using Data::Table.

        #!/usr/bin/env perl use strict; use warnings; use Data::Table; # Giving '0' as the second arg to Data::Table::fromTSV, # results in a Data::Table with column names of col1, col2 and col3 my $t = Data::Table::fromTSV("data.tsv", 0); my $nrow = $t->nofRow; my @row_numbers = 1..$nrow; # Sort by 'col2' as numeric values in decending order $t->sort( 'col2', 0, 1 ); # Add a new column that contains the row indices for the 'col2' sort o +rder $t->addCol( \@row_numbers, 'col2order' ); # Sort by 'col3' as numeric values in decending order $t->sort( 'col3', 0, 1 ); # Add a new column that contains the row indices for the 'col3' sort o +rder $t->addCol( \@row_numbers, 'col3order' ); # Sort by 'col1' as string values in ascending order $t->sort( 'col1', 1, 0 ); # Get a subtable with the columns of interest my $result = $t->subTable( undef, [ 'col1', 'col2order', 'col3order' ] + ); print $result->tsv; exit;
        This input is in data.tsv:
        a 23 11 b 24 15 c 16 19 d 8 2 e 40 41
        The output is:
        col1 col2order col3order a 3 4 b 2 3 c 4 2 d 5 5 e 1 1
Re: manipulating a data table
by Anonymous Monk on Aug 16, 2016 at 16:46 UTC
    #!/usr/bin/perl # http://perlmonks.org/?node_id=1169831 use strict; use warnings; $_ = <<END; a 23 11 b 24 15 c 16 19 d 8 2 e 40 41 END s/ +/\t/g; # replace tabs if pasted out my @array; @array[sort {$b <=> $a} /\t(\d+)\t/g] = 1..tr/\n//; s/\t\K(\d+)(?=\t)/$array[$1]/ge; @array[sort {$b <=> $a} /\b(\d+)$/gm] = 1..tr/\n//; s/\b(\d+)$/$array[$1]/gem; print;