Sort then conditionally sort

lukez has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Sort then conditionally sort by ikegami (Patriarch) on Apr 08, 2009 at 20:48 UTC
There's no way to know whether to sort Y ascending or descending before X is fully sorted, so you'll need to do multiple passes.

    use strict;
    use warnings;

    my @data;
    while (<DATA>) {
        chomp;
        push @data, [ split /\s+|,/ ];
    }

    @data = sort {
        $a->[0] cmp $b->[0]
            ||
        $a->[1] <=> $b->[1]
    } @data;

    my %order;
    my $last_f;
    my $last_x;
    for (@data) {
        if (!defined($last_f) || $last_f ne $_->[0]) {
            $order{$_->[0]}{$_->[1]} = +1;
            $last_f = $_->[0];
            $last_x = $_->[1];
        }

        if ($last_x ne $_->[1]) {
            $order{$_->[0]}{$_->[1]} = -$order{$last_f}{$last_x};
            $last_x = $_->[1];
        }
    }

    @data = sort {
        $a->[0] cmp $b->[0]
            ||
        $a->[1] <=> $b->[1]
            ||
        ( $a->[2] <=> $b->[2] ) * $order{$a->[0]}{$a->[1]}
    } @data;

    print("$_->[0] $_->[1],$_->[2]\n") for @data;

    __DATA__
    ...

Update: By sorting only one dimension at a time, we can avoid multiple passes. It actually makes the program simpler:

    use strict;
    use warnings;

    my %data;
    while (<DATA>) {
        chomp;
        my ($f,$x,$y) = split /\s+|,/;
        push @{ $data{$f}{$x} }, $y;
    }

    my @data;
    my $order = +1;
    for my $f (sort keys %data) {
        my $xs = $data{$f};
        for my $x (sort { $a <=> $b } keys %$xs) {
            my $ys = $xs->{$x};
            push @data,
                map [ $f, $x, $_ ],
                sort { $order * ( $a <=> $b ) }
                @$ys;
            $order *= -1;
        }
    }

    print("$_->[0] $_->[1],$_->[2]\n") for @data;

    __DATA__
    ...

In both cases, the data was

    aaa 1,2
    aaa 2,1
    aaa 2,3
    aaa 3,1
    aaa 3,2
    aaa 4,1
    aaa 4,5
    bbb 2,2
    bbb 2,5
    bbb 2,1
    bbb 4,3
    bbb 4,6
    bbb 4,1
    bbb 4,2
    ccc 3,3
    ccc 3,6
    ccc 1,3
    ccc 1,1
    ccc 6,4
    ccc 6,6
    ccc 2,2
    ccc 2,4
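The direction flip in the update comes from multiplying the comparison result by a sign that changes after each X value. A minimal sketch of that idea in isolation may make it easier to follow; the helper name `sort_alternating` and the small data set are made up for illustration and are not part of the posted program.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a hash ref of { x => [ y values ] } and returns
# [x, y] pairs with X ascending and the Y direction flipped for each new X.
sub sort_alternating {
    my ($ys_by_x) = @_;
    my @out;
    my $order = +1;                      # +1 = ascending, -1 = descending
    for my $x (sort { $a <=> $b } keys %$ys_by_x) {
        push @out, map { [ $x, $_ ] }
                   sort { $order * ($a <=> $b) } @{ $ys_by_x->{$x} };
        $order *= -1;                    # flip direction for the next X
    }
    return @out;
}

# Made-up input: two X values with three Y values each.
my @pairs = sort_alternating({ 1 => [ 3, 1, 2 ], 2 => [ 5, 4, 6 ] });
print join(' ', map { "$_->[0],$_->[1]" } @pairs), "\n";
# prints: 1,1 1,2 1,3 2,6 2,5 2,4
```

Multiplying by -1 swaps the meaning of "less than" and "greater than" in the comparison, so the same sort block serves both directions.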
Re^2: Sort then conditionally sort by lukez (Initiate) on Apr 10, 2009 at 17:18 UTC
I get this error:

    Global symbol "%data" requires explicit package name

I've been trying to figure out the file in/out code, but I get the error above before getting to where the file-input code even starts. Is something missing or not in the correct order?
Re^3: Sort then conditionally sort by ikegami (Patriarch) on Apr 11, 2009 at 06:44 UTC
My program doesn't generate that error. You'd get that error if you used the hash `%data` without declaring it.
Re^2: Sort then conditionally sort by lukez (Initiate) on Apr 10, 2009 at 04:22 UTC
I want to thank all of you for your help on this.

Hi ikegami, I think your updated code will be easier for me to eventually understand than the other versions. I am open to any explanation or comments.

One thing I need to mention is that the data will be in an external file. Do I need to change the way your code opens the data? That is, change this part

    while (<DATA>)

to something like this?

    open (my $IN, 'myfile.dat') or die "$!";
    my @data = <$IN>;
    close $IN;

I hope I am saying this correctly. I will also need to write all the sorted data to a different file. I think something like

    open (my $OUT, ">", 'output.dat') or die "$!";

and then perhaps add $OUT to your print-output code, changing

    print ("$_->[0] $_->[1],$_->[2]\n") for @data;

to

    print $OUT("$_->[0] $_->[1],$_->[2]\n") for @data;

Would that work, or am I out in left field?

You guys are terrific, thank you all again. By the way, is Perl the best way to do this? I am curious about why there are so many languages if 3 or 4 can do it all. Not sure if that is true, though.

Thanks so much, everyone.
Luke

P.P.S. Is there a way to know when I get a response from you guys, as in an email notice? Peace!
Re^3: Sort then conditionally sort by ikegami (Patriarch) on Apr 11, 2009 at 06:52 UTC
Why would you think that reading a line from a file handle should be replaced with reading the entire file into an array and closing the file?

By the way, my data wasn't in exactly the same format as yours. I thought the first column was actually a file name and not in the file itself. That means you'll need to adjust the input parsing and output format.

"would that work or am I out in left field?"

Yes, that's how you write to a file.

"I am curious about why there are so many languages, if 3 or 4 can do it all...."

Because no language does it all, or does it the same way.
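Putting those two points together, a file-based version might look roughly like the sketch below. It keeps the line-by-line `while` loop and only swaps `DATA` for an opened file handle. It assumes each input line looks like `group x,y`, as in the sample data earlier in the thread; the real file layout may differ, so the `split` and `print` lines would need adjusting. The sub name `sort_file` and the script name in the usage comment are hypothetical.

```perl
use strict;
use warnings;

# Sketch only: assumes each input line looks like "group x,y".
# The sub name and argument order are hypothetical.
sub sort_file {
    my ($in_name, $out_name) = @_;

    open(my $in, '<', $in_name) or die "$in_name: $!";
    my %data;
    while (my $line = <$in>) {           # one line at a time, no slurping
        chomp $line;
        next unless $line =~ /\S/;       # skip blank lines
        my ($f, $x, $y) = split /\s+|,/, $line;
        push @{ $data{$f}{$x} }, $y;
    }
    close $in;

    open(my $out, '>', $out_name) or die "$out_name: $!";
    my $order = +1;
    for my $f (sort keys %data) {
        for my $x (sort { $a <=> $b } keys %{ $data{$f} }) {
            print $out "$f $x,$_\n"
                for sort { $order * ($a <=> $b) } @{ $data{$f}{$x} };
            $order *= -1;                # flip Y direction for the next X
        }
    }
    close($out) or die "$out_name: $!";
}

# Usage (script name is hypothetical): perl sortxy.pl myfile.dat output.dat
sort_file(@ARGV) if @ARGV == 2;
```

The input file is only opened for reading, so the original data stays untouched and the sorted result goes to the second file.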
Re^4: Sort then conditionally sort by lukez (Initiate) on Apr 15, 2009 at 00:10 UTC
Re^5: Sort then conditionally sort by ikegami (Patriarch) on Apr 15, 2009 at 02:31 UTC
Re: Sort then conditionally sort by ig (Vicar) on Apr 08, 2009 at 20:50 UTC
The sort function allows you to define your own sorting criteria, as an expression, a block of code or a subroutine.

You can split your records into three fields with something like `split(/[\s,]+/)`. Then all you have to do is decide how to compare the fields. You can use a lexical comparison (`cmp`) for the first field and a numeric comparison (`<=>`) for the second. These operators are described in perlop.

The third field is more challenging, and how to proceed depends on details that are not clear to me. You say "each new X value" but not the context in which the value is "new". Are you concerned with the order in which they appear in the input file, or the order they appear after sorting the first two fields, or something else?
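As a rough illustration of the first two steps, the sketch below splits a few records and sorts them on the first two fields only. The sample lines and their `name x,y` layout are assumptions made for the example, and the third field is deliberately not compared because its rule is the open question.

```perl
use strict;
use warnings;

# Made-up sample records in an assumed "name x,y" layout.
my @lines = ('bbb 2,5', 'aaa 10,1', 'aaa 9,4', 'bbb 2,1');

# Split each record into three fields on runs of whitespace and commas.
my @records = map { [ split /[\s,]+/ ] } @lines;

# Compare the first field as a string (cmp), then the second as a number (<=>).
# The third field is not compared here.
my @sorted = sort {
    $a->[0] cmp $b->[0]
        ||
    $a->[1] <=> $b->[1]
} @records;

print join(',', @$_), "\n" for @sorted;
```

With `<=>` the X value 9 sorts before 10; a string comparison with `cmp` would put "10" first.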
Re^2: Sort then conditionally sort by lukez (Initiate) on Apr 10, 2009 at 16:57 UTC
Hi ig, thank you.

The data in the file is a lot larger, with longer group names between the quote marks (''). This data file is only read in, and the code will be used to sort it as described in my first note. The X-sorted, then conditionally Y-sorted version is then saved to a new file. The original file is untouched.

The 'Groups' can remain in whatever original order they were in, or they can be sorted if that is easier (it doesn't matter). To recap: the X's for each 'Group' are sorted ascending, then for each new X value the direction (ascending/descending) of the Y sort is changed.

My other question was the proper way to read in the data file in Perl, and how to print to another file. I gave my code guesses on how to do this in my other note. How far off was I on my guesses? Can you fix or confirm the code I thought I would need? Thanks.

I'm sorry to bother you again, but I just realized the 'Group' names may contain alphanumeric characters: 'aaa3', '5aaa2', '43bbb', etc. Sorry I didn't mention this before. Will this affect the type of sort in the code? By the way, these groups don't need to be sorted; they can be left in the original order. Only their XYs need sorting.
Re^3: Sort then conditionally sort by ig (Vicar) on Apr 16, 2009 at 02:34 UTC
I think you have answers to most of your questions from others by now, but briefly...

Ikegami's update looks good to me. There are many ways to do everything - a bit confusing in the beginning but good in the long run.

To read your data from another file you can let some perl "magic" do it for you, perhaps something like the following:

    #!/usr/local/bin/perl
    use strict;
    use warnings;

    foreach my $line (<>) {
        print "$line";
    }

The above script will read every line of every file named on the command line or, if no files are named on the command line, will read from standard input (STDIN). It does nothing but print the contents of the file, but you can put anything you like inside the loop.

Alternatively, and perhaps a bit less mysteriously, you can open the file explicitly yourself. The following would do it:

    #!/usr/local/bin/perl
    use strict;
    use warnings;

    my $filename = shift;  # get the filename from the command line

    open(my $fh, '<', $filename) or die "$filename: $!";

    foreach my $line (<$fh>) {
        print "$line";
    }

You might read open and perlopentut for more on opening files for input and output.

You may have realized that `<DATA>` is special: it reads the data in your program file that appears after a line containing "__DATA__" (without the quotes) or "__END__". This is convenient for test scripts and otherwise. You can read more about this in perldata.

Having some numbers in the group names won't be a problem. If you use Ikegami's examples, the group names are sorted lexically. It is easier to sort them so that all the records for a group come together.
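To see what "sorted lexically" means for names that contain digits, here is a small sketch using the three group names mentioned in the question. It only demonstrates the `cmp` comparison and is not taken from any of the posted programs.

```perl
use strict;
use warnings;

# Group names taken from the question; cmp compares them as strings,
# character by character, so digits are not treated as numbers.
my @groups = ('aaa3', '5aaa2', '43bbb');
my @sorted = sort { $a cmp $b } @groups;

print "@sorted\n";    # prints: 43bbb 5aaa2 aaa3
```

"43bbb" comes before "5aaa2" because the character "4" sorts before "5", even though 43 is larger than 5 as a number. For grouping records together that ordering is fine, since identical names still end up adjacent.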
Re: Sort then conditionally sort by kyle (Abbot) on Apr 08, 2009 at 20:38 UTC
That's kind of an interesting problem. As such, I've written a solution, even though you haven't shown any work. My solution could be a lot more readable, but it does the job. Some explanation is in the comments. Read more... (3 kB) Since you're new, you might understand what I wrote better if you look at perldsc, perlreftut, perlref, Test::More, References quick reference, and (what the heck) PerlMonks FAQ.
Re^2: Sort then conditionally sort by lukez (Initiate) on Apr 10, 2009 at 04:46 UTC
Hi Kyle, thank you.

Sorry I didn't have code. I am just learning, and I learn by looking at code solutions and reading FAQs, books, etc.

I was confused about this part of your code:

    my $op_io = <<'OP_INPUT_AND_OUTPUT';

and all the example before-and-after columns listed up to the closing OP_INPUT_AND_OUTPUT line.

My request for help had the columns on the left as an example of the input data file to be sorted. The two columns on the right are the sorted/conditionally sorted data that needs to go in a separate file. This confused me.

Thank you for taking the time to help me.
Re^3: Sort then conditionally sort by kyle (Abbot) on Apr 10, 2009 at 15:20 UTC
The construct is called a "here-document", and you can find them documented in perlop. It's basically a way to include some large chunk of text as a value in your program. In this case, I used it to hold your example data. After setting `$op_io` to that value, I use split to cut it into individual lines, and I loop over those lines to pull the individual values out. When I'm done, I have your inputs and desired output. I did it that way so I wouldn't have to reformat what you posted. I just pasted it in and wrote some code to pull out what I wanted.
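For readers who have not met a here-document before, a minimal sketch of the pattern described above follows. The variable names, the `SAMPLE_DATA` tag, and the three sample lines are made up for illustration; this is not the code from the collapsed solution.

```perl
use strict;
use warnings;

# Made-up sample text. The single-quoted tag means no variable interpolation,
# and the text ends at the line containing only the tag.
my $text = <<'SAMPLE_DATA';
aaa 1,2
aaa 2,1
bbb 2,5
SAMPLE_DATA

# Cut the value into individual lines, then pull the fields out of each one.
my @rows;
for my $line (split /\n/, $text) {
    my ($name, $x, $y) = split /[\s,]+/, $line;
    push @rows, [ $name, $x, $y ];
}

print scalar(@rows), " rows, first group: $rows[0][0]\n";
```

The closing tag has to appear on a line by itself, starting in the first column, or Perl will not recognise the end of the text.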
Re^4: Sort then conditionally sort by lukez (Initiate) on Apr 10, 2009 at 16:41 UTC
Re^5: Sort then conditionally sort by kyle (Abbot) on Apr 10, 2009 at 16:56 UTC
Some notes below your chosen depth have not been shown here