http://qs1969.pair.com?node_id=584607


in reply to Sorting files you Have read

That's a bit like asking "How should I drink?". It depends a great deal on what it is you are drinking!

First up:

  1. What does your data look like?
  2. How big is the file?
  3. How often does it need to be done?
  4. Is sort time important?
  5. Will the sorted data be reused in some fashion?

Actually, many of those are related to each other.

Because you were good and supplied some code, I'll show you some:

my @contents = sort <SOMELIST>;
print @contents;
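The two lines above assume SOMELIST is already an open filehandle. If it was never opened, or the open failed without being checked, the sort has nothing to read and nothing is printed. Here is a minimal sketch with the open made explicit and checked. The sub name `sorted_lines` and the file name `somelist.txt` are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line of a file and return the lines sorted.
sub sorted_lines {
    my ($path) = @_;

    # Three-argument open with a lexical filehandle; die with the OS error on failure.
    open my $fh, '<', $path or die "Cannot open '$path': $!\n";

    # Default sort: plain string comparison of whole lines.
    my @lines = sort <$fh>;
    close $fh;

    return @lines;
}

# Usage (hypothetical file name):
# print sorted_lines('somelist.txt');
```

With `use warnings` in place, reading from a filehandle that was never opened produces a warning rather than failing silently, which usually points straight at the problem.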

and because I'm kind I'll give you some hints:


DWIM is Perl's answer to Gödel

Re^2: Sorting files you Have read
by brusimm (Pilgrim) on Nov 16, 2006 at 21:30 UTC
    My file is comma delimited
    about 50 lines of text
    How often - for my initial purposes, to run and make it happen.
    At the moment, due to the small file size, sort time is unimportant.
    At some point, when I get to that stage, the sorted data will be reused.

    I tried your code, and it seems quite simple, hence, effective, BUT
    It runs with no errors, but nothing prints
    Neither to the screen or to a file.

    Thank you - I have read through various tutorials,
    and sort and its various ways of handling data,
    I saw the Schwartzian Transform, but it made my brain hurt at this point in time..
    remember, newbie here

    and either I did not operate it right, or there is little on the simpler process, but super search did not turn up anything I could make sense of...

    again, newbie, injured brain, etc, etc.
    Thanks.

      OK, let's give that "simple effective" sample code some data:

      use strict;
      use warnings;
      my @contents = sort <DATA>;
      print @contents;
      __DATA__
      At the moment, due to the small file size, sort time is unimportant.
      At some point, when I get to that stage, the sorted data will be reused.
      I tried your code, and it seems quite simple, hence, efective.
      BUT, It runs with no errors, but nothing prints.

      Prints:

      At some point, when I get to that stage, the sorted data will be reused.
      At the moment, due to the small file size, sort time is unimportant.
      BUT, It runs with no errors, but nothing prints.
      I tried your code, and it seems quite simple, hence, efective.

      which is sorted on the whole line. It works, but ain't what you want. So let's add in some brain hurty code to sort by the "second column": :)

      use strict;
      use warnings;
      my @contents =
          map  { $_->[0] }
          sort { $a->[1] cmp $b->[1] }
          map  { [$_, extractColumn (1, $_)] } <DATA>;
      print @contents;
      sub extractColumn {
          my ($columnIndex, $line) = @_;
          my ($key) = $line =~ /(?:[^,]*,){$columnIndex}([^,]*)/;
          return $key;
      }
      __DATA__
      At the moment, due to the small file size, sort time is unimportant.
      At some point, when I get to that stage, the sorted data will be reused.
      I tried your code, and it seems quite simple, hence, efective.
      BUT, It runs with no errors, but nothing prints.

      Prints:

      BUT, It runs with no errors, but nothing prints.
      I tried your code, and it seems quite simple, hence, efective.
      At the moment, due to the small file size, sort time is unimportant.
      At some point, when I get to that stage, the sorted data will be reused.

      However, if you are dealing with CSV (comma-separated values) data, then you really want to be using a module such as Text::CSV to read the file. You may like to check out a few nodes that have asked the "sort CSV" question before (Super Search SoPW, remember): Sorting a CSV file and sorting CSV files may help too.
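For illustration, here is a rough sketch of what sorting by one column with Text::CSV might look like. It assumes Text::CSV is installed from CPAN (it is not a core module). The sub name `sort_csv_rows` and the file name in the usage note are hypothetical. Unlike a plain split on commas, the module copes with quoted fields that themselves contain commas.

```perl
use strict;
use warnings;
use Text::CSV;    # CPAN module, not part of core Perl

# Hypothetical helper: read CSV rows from an open filehandle and return them
# (as array references of fields) sorted by one 0-based column index.
sub sort_csv_rows {
    my ($fh, $column) = @_;

    my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 })
        or die "Cannot use Text::CSV: " . Text::CSV->error_diag;

    # getline parses one record per call and returns an array ref of fields.
    my @rows;
    while (my $row = $csv->getline($fh)) {
        push @rows, $row;
    }

    # String comparison on the chosen column; a missing field sorts as empty.
    my @sorted = sort {
        ($a->[$column] // '') cmp ($b->[$column] // '')
    } @rows;

    return @sorted;
}

# Usage (hypothetical file name), printing rows sorted by the second column:
# open my $in, '<', 'somelist.csv' or die "Cannot open file: $!\n";
# my $out = Text::CSV->new({ binary => 1 });
# for my $row (sort_csv_rows($in, 1)) {
#     $out->combine(@$row) and print $out->string, "\n";
# }
```

For a 50-line file, reading every row into memory like this is fine. Use `<=>` in place of `cmp` if the column holds numbers.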


        I will tackle this snippet and see what happens. Thank you for your time.
      I'll modify your example code to help you do some simple sorting. Now, this is likely not the most efficient way to do it, but since you have a small dataset and you are new to Perl, this can help you get started.

      use strict;
      use warnings;
      open (SOMELIST, "somelist") or die "Cannot open file $!\n";
      my %sort_data;
      while (my $record = <SOMELIST>) {
          my @one_line = split(/,/, $record);
          while (exists $sort_data{$one_line[1]}) {
              $one_line[1] = "$one_line[1]" . " ";   # add a blank for uniqueness
          }
          $sort_data{$one_line[1]} = $record;        # store it by 2nd column
      }
      close (SOMELIST);
      foreach my $line (sort {$a cmp $b} keys %sort_data) {
          print "$sort_data{$line}";
      }


      Like I said, this isn't the most efficient or even best method, but it is simple enough that you can hopefully see what is going on.


      (2006-11-18 17:21 GMT) Edited my perl code to remove a couple of syntax errors. - Thanks Grandfather for pointing them out.

        If you find yourself "uniquifying" keys for a hash you should probably be using an array. Consider:

        use strict;
        use warnings;
        use constant KEY => 1;
        my @sort_data;
        while (my $record = <DATA>) {
            my $key = (split(/,/, $record))[KEY];
            push @sort_data, [$record, $key];
        }
        foreach my $pair (sort {$a->[1] cmp $b->[1]} @sort_data) {
            print "$pair->[0]";
        }

        using the same data as in previous samples prints:

        BUT, It runs with no errors, but nothing prints.
        I tried your code, and it seems quite simple, hence, efective.
        At the moment, due to the small file size, sort time is unimportant.
        At some point, when I get to that stage, the sorted data will be reused.
