in reply to Help with converting Python script to Perl for CSV sort

Once your $sheet array ref is populated with your data (which your program is doing more or less correctly), sorting it is fairly easy.

Sorting numerically on the first field:

my @sorted = sort {$a->[0] <=> $b->[0]} @$sheet;
Sorting alphabetically on the second field:
my @sorted = sort {$a->[1] cmp $b->[1]} @$sheet;
Note that populating a $sheet array ref (rather than a plain @sheet array) is making things a little bit more complicated than they need to be.
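As a minimal, self-contained sketch of the two sorts above (the sample rows are made up for illustration, not taken from your file):

```perl
use strict;
use warnings;

# Hypothetical sample rows: [ id, title ]
my $sheet = [
    [ 12, 'navy blue' ],
    [ 3,  'Python Intro' ],
    [ 7,  'Ace hardware' ],
];

# Numeric sort on field 0: <=> compares its operands as numbers
my @by_id = sort { $a->[0] <=> $b->[0] } @$sheet;

# String sort on field 1: cmp compares its operands as strings
my @by_title = sort { $a->[1] cmp $b->[1] } @$sheet;

print join( ',', @$_ ), "\n" for @by_id;
print join( ',', @$_ ), "\n" for @by_title;
```

Note that cmp is case-sensitive, so titles starting with an uppercase letter sort before those starting with a lowercase one. With a plain @sheet array, the only change would be sorting @sheet instead of @$sheet.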

Update: poj typed faster than I did. ;)

Re^2: Help with converting Python script to Perl for CSV sort
by jasonwolf (Sexton) on Jan 31, 2017 at 15:54 UTC
    Let me take a moment and see if I fully understand the code, before I move on. I think I am getting excited and ahead of myself. So
    1. my $sheet;
    2. my $count = -1;
    3.
    4. while( <DATA> ) {
    5.     chomp;
    6.     $count++;
    7.     # skip header
    8.     next unless $count;
    9.     my $row;
    10.     @$row = split( /,/, $_ );
    11.     push @$sheet, $row;
    12. }
    The above code is using “strict” since I am using “my” in front of “$sheet” and “$count”. “$sheet” and “$count” are scalars/variable and ‘count’ is getting assigned a negative value. However, I believe “$sheet” is getting set to an empty or undefined scalar that can be defined later.
    Line 4. Is the start of the while loop; however, not sure what to call ‘<DATA>’
    Line 5. Is chomping the ‘\n’ from end of line
    Line 6. Is adding one to $count to skip the column headers/column titles
    Line 7. Comment
    Line 8. Not sure but I think this is saying to skip the top row if there is something in count??
    Line 9. Read in row from file???
    Line 10. Assign row data into array?? And split on comma
    Line 11. Use the PUSH command to add what is in $row to @$sheet??? Or do I have that backwards?
    foreach my $row ( sort { $a->[1] <=> $b->[1] } @$sheet ) {
        print join( ',', @$row ), "\n";
    }
    This is where I do my sort as you pointed out in your post, and where I need to format my SORT syntax, which I am reading up on, but you have already assisted with it.
    Not sure I understand what you mean about “Note that populating a $sheet array ref (rather than a plain @sheet array) is making things a little bit more complicated than they need to be.” What is more complicated?
    Note – the example I am using is just an example I found on the Internet that I am trying to understand.
    Thank you
    JW
      The fact that you're using "my" does not prove that you're using strict, and the use strict; is not there in the code you show. It goes the other way around: if you use strict, then you have to use my (or some other declarator).

      If you use the $sheet variable (with a $ sigil), then you declare a scalar variable. If you had used @sheet, you would have declared an array variable. This is true regardless of whether the variable is assigned a value immediately or left undefined for the moment and populated later.

      DATA is a special file handle referring to some data put at the end of your script, after a __DATA__ tag (see examples at the bottom of this post).

      If you want to read from a file, then you would have to open the file first and read from the file handle used for opening the file, with something like:

      open my $FH, "<", "file.txt" or die "Cannot open file.txt $!";
      while (<$FH>) {
          # ...
      }

      On the $count variable: you initialize it to -1, and you increment it in your while loop. The first time through the loop, its value becomes 0, and the line with the header is skipped because a value of 0 evaluates to false in Boolean context.
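To see the header-skipping logic in isolation, here is a small sketch; the @lines array is a hypothetical stand-in for the lines read from the file handle:

```perl
use strict;
use warnings;

# Hypothetical input standing in for the lines read from DATA
my @lines = ( "HEADER", "1,Beginning C", "2,Beginning C++" );

my $count = -1;
my @kept;

for my $line (@lines) {
    $count++;
    next unless $count;    # $count is 0 (false) only on the first pass
    push @kept, $line;
}

print scalar(@kept), " data lines kept\n";
```

On every later pass $count is 1 or more, which is true, so the line is kept.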

      Line 10 splits the input line stored in $_ and stores the resulting list in the array that $row refers to. The next line of code then pushes the $row array ref onto the array that $sheet refers to.
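This works even though $row and $sheet start out undefined, because Perl creates the underlying arrays on first use (autovivification). A short sketch, using one sample line as the input:

```perl
use strict;
use warnings;

my $row;      # undefined scalar at this point
@$row = split( /,/, '1,Beginning C,Beginning C1' );    # $row becomes an array ref

my $sheet;    # also undefined
push @$sheet, $row;    # $sheet becomes a ref to an array holding one row ref

print ref($row), "\n";         # the kind of reference held by $row
print $sheet->[0][1], "\n";    # second field of the first row
```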

      Here is how you could minimally change your code to get the sorted result:

      use strict;
      use warnings;
      use Data::Dumper;

      my $sheet;
      my $count = -1;

      while( <DATA> ) {
          chomp;
          $count++;
          # skip header
          next unless $count;
          my $row;
          @$row = split( /,/, $_ );
          push @$sheet, $row;
      }
      #my @sorted = sort {$a->[0] <=> $b->[0]} @$sheet;
      my @sorted = sort {$a->[1] cmp $b->[1]} @$sheet;
      print Dumper \@sorted;

      __DATA__
      HEADER
      1,Beginning C,Beginning C1
      2,Beginning C++,Beginning C++1
      12,navy blue,navy blue1
      3,Python Intro,Python Intro1
      8,Baker's dozon,Baker's dozon1
      9,Jumbo frames,Jumbo frames1
      4,Acme cook book,Acme cook book1
      5,Jumping Jack Flash,Jumping Jack Flash1
      6,Zebra,Zebra1
      7,Ace hardware,Ace hardware1
      10,Attack show,Attack show1
      11,car 54 where are you,car 54 where are you1
      13,navy gold,navy gold1

      And this is a slightly improved (simpler) version using arrays instead of array refs. It also uses the $. builtin input line counter instead of $count.

      use strict;
      use warnings;
      use Data::Dumper;

      my @sheet;

      while( <DATA> ) {
          chomp;
          # skip header
          next if $. == 1;
          my @row = split( /,/, $_ );
          push @sheet, [@row];
      }
      my @sorted = sort {$a->[0] <=> $b->[0]} @sheet;
      #my @sorted = sort {$a->[1] cmp $b->[1]} @sheet;
      print Dumper \@sorted;

      __DATA__
      HEADER
      1,Beginning C,Beginning C1
      2,Beginning C++,Beginning C++1
      12,navy blue,navy blue1
      3,Python Intro,Python Intro1
      8,Baker's dozon,Baker's dozon1
      9,Jumbo frames,Jumbo frames1
      4,Acme cook book,Acme cook book1
      5,Jumping Jack Flash,Jumping Jack Flash1
      6,Zebra,Zebra1
      7,Ace hardware,Ace hardware1
      10,Attack show,Attack show1
      11,car 54 where are you,car 54 where are you1
      13,navy gold,navy gold1
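The $. counter is not specific to DATA; it tracks the line number of whichever file handle was read last. Here is a small sketch using an in-memory file handle (the $text contents are made up for illustration) so that it runs without any file on disk:

```perl
use strict;
use warnings;

# Hypothetical input; opening a reference to a scalar gives an in-memory file
my $text = "HEADER\n1,Beginning C\n2,Beginning C++\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @sheet;
while ( my $line = <$fh> ) {
    chomp $line;
    next if $. == 1;    # $. is the current input line number; 1 is the header
    push @sheet, [ split /,/, $line ];
}
close $fh;

print scalar(@sheet), " rows\n";
```

Replacing \$text with a file name would read from a real file in the same way.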

      I hope this helps.

        This is helping me a great deal; however, I am now confused at another step. When I run the above code you provided - I get the following output.

        __DATA___
        E:\code\perl>cli-pl-csvProcess.pl a.cs
        $VAR1 = [
          [ '7', 'Ace hardware', 'Ace hardware1' ],
          [ '4', 'Acme cook book', 'Acme cook book1' ],
          [ '10', 'Attack show', 'Attack show1' ],
          [ '8', 'Baker\'s dozon', 'Baker\'s dozon1' ],
          [ '1', 'Beginning C', 'Beginning C1' ],
          [ '2', 'Beginning C++', 'Beginning C++1' ],
          [ '9', 'Jumbo frames', 'Jumbo frames1' ],
          [ '5', 'Jumping Jack Flash', 'Jumping Jack Flash1' ],
          [ '3', 'Python Intro', 'Python Intro1' ],
          [ '6', 'Zebra', 'Zebra1' ],
          [ '11', 'car 54 where are you', 'car 54 where are you1' ],
          [ '12', 'navy blue', 'navy blue1' ],
          [ '13', 'navy gold', 'navy gold1' ]
        ];

        How can I format this back into a normal CSV format output? My goal is to create a new file called output like in my hacked python script..

        I can handle the file appending part; however, I do not fully understand what is taking place right now. It looks like an array of arrays was printed out, but I am not sure if that is what I want or need.
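For what it's worth, one way to turn such an array of arrays back into CSV lines is to join each row's fields with commas, as in the foreach loop earlier in this thread. A minimal sketch, assuming no field itself contains a comma or a quote (the two rows are copied from the Dumper output above):

```perl
use strict;
use warnings;

# Two rows shaped like the Dumper output above
my @sorted = (
    [ 7, 'Ace hardware',   'Ace hardware1' ],
    [ 4, 'Acme cook book', 'Acme cook book1' ],
);

# Join each row's fields with commas to get one CSV line per row
my @csv_lines = map { join( ',', @$_ ) } @sorted;

print "$_\n" for @csv_lines;
```

To write to a file instead of the screen, open a file handle for writing and print to that handle.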

        No CPAN modules?