http://qs1969.pair.com?node_id=512942

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

Here is what I am trying to do.

Read the CSV (excel sheet saved as a comma separated file) file into a perl data structure (either hash of array OR array of arrays). And sort based on one of the columns. When this is done, I want the rest of the columns too to be re-ordered based on the sorted column. That is the columns of a row must remain same.

For eg: This is the CSV file.
Name,Score,State
Mike,67,CA
Rob,63,FL
Jim,72,IL
Chan,32,AZ

When this is read into a perl data structure, and suppose I sort the column for Score in ascending order, I want the rest of the fields corresponding to the column to be re-ordered. So the result should be

Chan,32,AZ
Rob,63,FL
Mike,67,CA
Jim,72,IL

1. What data structure is best suited for above?
2. And how to achieve the above?

Thanks in advance.

Replies are listed 'Best First'.
Re: sorting CSV files
by ptum (Priest) on Nov 30, 2005 at 14:35 UTC
    Depending on how flexible you need this to be, you might consider using an array of hash references, such that each row would be represented as a reference to an associative array with the column names as keys. Then you could sort the array of rows according to a particular key with a comparison subroutine:
    foreach my $this_record (sort byname @record_set) { # do something } sub byname { $a->{'NAME'} cmp $b->{'NAME'} }
    See pages 789-793 of Programming Perl (3rd Edition)

    No good deed goes unpunished. -- (attributed to) Oscar Wilde
Re: sorting CSV files
by derby (Abbot) on Nov 30, 2005 at 14:36 UTC

    TMTOWTDI ... but an array of arrays would work:

    #!/usr/bin/perl my $sheet; my $count = -1; while( <DATA> ) { chomp; $count++; # skip header next unless $count; my $row; @$row = split( /,/, $_ ); push @$sheet, $row; } foreach my $row ( sort { $a->[1] <=> $b->[1] } @$sheet ) { print join( ',', @$row ), "\n"; } __DATA__ Name,Score,State Mike,67,CA Rob,63,FL Jim,72,IL Chan,32,AZ
    -derby
      Hello derby, Im working in a similar data sctructure. In my case the csv files are around 26Mb. And I want to sort them with reference to a particular column, and then taking a sorted group, i want to sort them again using a different column. Looking at your earlier solution, I am unable to sort them in a hash table. (Since that is the best way I can think of)

        What hash? There's no hash there, just references to arrays. You can always do a secondary sort:

        foreach my $row ( sort { $a->[1] <=> $b->[1] || $a->[2] <=> $b->[2] } +@$sheet ) {
        That construct will sort first by second col and then by the third col.

        -derby
Re: sorting CSV files
by dorward (Curate) on Nov 30, 2005 at 14:59 UTC
    DBD::CSV might do the job. It lets you run SQL queries over the data.
Re: sorting CSV files
by swkronenfeld (Hermit) on Nov 30, 2005 at 18:37 UTC
    Don't write a script, just use your *IX command line!

    sort -n -t , -k 2 fileName.csv

    Slightly expanded, that's

    sort --numeric-sort --field-separator=, --key=2 fileName.csv

    Of course, if you're on Windows...I'm sorry.
Re: sorting CSV files
by hakkr (Chaplain) on Nov 30, 2005 at 14:37 UTC

    Please at least try.

    the command to read and open a file is 'open' :)
    the command to split the columns on comma is 'split' :)
    an array or hash will do The command to sort is 'sort' :)
Re: sorting CSV files
by crashtest (Curate) on Dec 01, 2005 at 02:04 UTC

    Am I the only one this reeks of homework to? In general, people are happy to help if you at least start with some code and specific questions about it (and why it's not working).

    Maybe you can turn in the following one liner (for Windows)...?

    C:\temp>perl -F, -ane "BEGIN{$c = shift} push @a, [@F]; END{print join +(',' => @$_) for sort {($a->[abs($c)-1] cmp $b->[abs($c)-1]) * $c} (@ +a)};" -- -3 grades.csv Jim,72,IL Rob,63,FL Mike,67,CA Chan,32,AZ
    ... where the only argument (before the input file) specifies on which column to sort and in which direction (negative column value for descending sort, positive for ascending).

      Hi, Thanks for wonderful one liner file sorting. Can you please explain this in detail that how it works ?

        I can, but I won't. The OP looked like a homework assignment. My goal was to post a working solution that was as cryptic, compressed, unreadable and definitely un-turn-in-able as possible.

        If you are genuinely interested, I would recommend reading:

        • perlrun for an explanation of -a -F, and -n.
        • perlmod to understand what BEGIN and END blocks do.
        • perlfunc for an explanation of the sort function.