simonozzy has asked for the wisdom of the Perl Monks concerning the following question:

Hello wise monks

We have a bespoke Perl scheduler that calls around 100 ETL tasks to load data into a data warehouse. The tasks are defined in an array and each taskname has its own hash of values, which can be quite extensive, including file, time and SQL dependencies.

We need to run the tasks in order of priority, but the problem is that priority needs to be a value within the taskname hash, so I can't see a way of sorting by it? I've searched a few sites for sort options but can't seem to get this to work. Below is what I've tried so far. Any ideas?

    #!/usr/bin/perl -w
    use strict;

    my $tasks = {
        "A_Task_2" => { priority => 2, },
        "A_Task_1" => { priority => 1, },
        "A_Task_3" => { priority => 3, },
        "B_Task_1" => {
            priority     => 1,
            task_depends => [ 'A_Task_1', 'A_Task_2', 'A_Task_3', 'A_Task_4', ],
        },
        "A_Task_4" => { priority => 4, },
    };

    # for my $taskname (sort keys %$tasks) # works fine
    for my $taskname (sort { $tasks->{priority} <=> $tasks->{priority} } keys %$tasks ) # not working - can you see what I'm trying to do?
    {
        my $task = $tasks->{$taskname};
        print $task->{priority} , "-", $taskname, "\n" ; # works
        sleep(1) ;
    }

Replies are listed 'Best First'.
Re: Sorting an array of hashes by a value of hash
by Ratazong (Monsignor) on Apr 13, 2011 at 11:26 UTC
    When defining your own sort-function, you have access to two special variables, $a and $b, which contain the values to be sorted. Using them, your solution could look like
        for my $taskname (sort { $tasks->{$a}{priority} <=> $tasks->{$b}{priority} } keys %$tasks )
    Rata
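
A minimal, self-contained sketch of the same idea, using a cut-down copy of the task table from the question. The tie-break on task name is an addition that is not in the original reply; it is only there so that tasks sharing a priority (A_Task_1 and B_Task_1 here) come out in a predictable order.

```perl
use strict;
use warnings;

# Cut-down task table modelled on the one in the question.
my $tasks = {
    A_Task_2 => { priority => 2 },
    A_Task_1 => { priority => 1 },
    A_Task_3 => { priority => 3 },
    B_Task_1 => { priority => 1 },
};

# $a and $b hold task names (hash keys), so each one is looked up in $tasks.
# The "or" falls through to a name comparison only when priorities are equal.
my @ordered = sort {
    $tasks->{$a}{priority} <=> $tasks->{$b}{priority}
        or $a cmp $b
} keys %$tasks;

print "$tasks->{$_}{priority}-$_\n" for @ordered;
```

This prints 1-A_Task_1, 1-B_Task_1, 2-A_Task_2 and 3-A_Task_3, one per line.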

      Thanks Rata, that works perfectly. I looked at the FAQ, but the special variables confused me, so your explanation helped.

      If some of the tasks have no specified priority, I get an "uninitialized" warning, but it still works, assuming low priority if it's missing (that was after I switched $a and $b).
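
One way to get "low priority if it's missing" without the warning is to substitute an explicit default before comparing. This is only a sketch: the helper priority_of, the constant $DEFAULT_PRIORITY and the task C_Task_1 are hypothetical names invented for the illustration, and the // (defined-or) operator assumes Perl 5.10 or later.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical data: C_Task_1 has no priority key at all.
my $tasks = {
    A_Task_2 => { priority => 2 },
    A_Task_1 => { priority => 1 },
    C_Task_1 => {},
};

# Assumed convention: a missing priority sorts after every real one.
my $DEFAULT_PRIORITY = 999_999;

# Return the task's priority, or the default when none is defined.
sub priority_of {
    my ($name) = @_;
    return $tasks->{$name}{priority} // $DEFAULT_PRIORITY;
}

my @ordered = sort { priority_of($a) <=> priority_of($b) or $a cmp $b } keys %$tasks;

say join ' ', @ordered;
```

Because the comparison never sees undef, no "uninitialized" warning is raised and there is no need to swap $a and $b to push the unprioritised tasks to the end.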

        You may be making a mistake if there are undefs in your hash. That warning is telling you something. If you don't mind that undef and zero (0) get sorted as the same value, you have nothing to worry about with Rata's solution, and the warnings can be ignored. But if undef should be ordered differently from zero, you need to do an additional check.

        Please have a look at the following test code that demonstrates three ways to handle your sort, while also dealing with the warnings.

            use 5.012_002;
            use strict;
            use warnings;

            my %hash = (
                this  => 1,
                that  => 4,
                the   => undef,
                other => undef,
                those => 2,
                it    => 0,
            );

            # First simply disable the warning temporarily.
            {
                no warnings qw/uninitialized/;
                my @warnsorted = sort { $hash{$a} <=> $hash{$b} } keys %hash;
                say "Without Warnings:\t@warnsorted";
            }

            # Second, check each item for definedness.
            my @defsorted = sort {
                ( defined( $hash{$a} ) && $hash{$a} )
                    <=>
                ( defined( $hash{$b} ) && $hash{$b} )
            } keys %hash;
            say "Testing Definedness:\t@defsorted";

            # Third, sort with definedness as a criterion
            my @critsorted = sort {
                defined( $hash{$a} ) <=> defined( $hash{$b} )
                    or
                ( defined( $hash{$a} ) && $hash{$a} )
                    <=>
                ( defined( $hash{$b} ) && $hash{$b} )
            } keys %hash;
            say "Defined as Criteria:\t@critsorted";

        And the output:

            Without Warnings:       the it other this those that
            Testing Definedness:    the it other this those that
            Defined as Criteria:    the other it this those that

        The first method shown simply disables the warning, lexically scoped to a very narrow scope surrounding the sort. But this doesn't deal with the fact that 'undef' and '0' will get sorted together as the same effective value.

        The second method eliminates the warning by not directly sorting undefs. What it does is test each side of the comparison for definedness. If defined, then compare the value on that side of the <=>. If undefined, compare the value returned by defined(), which is false and counts as zero in a numeric comparison without triggering the warning. This preserves the possibly errant artifact of undef being comparatively the same as zero.

        The third method first does a definedness comparison on both sides of the <=>. If those are equal, then it moves on to a regular sort similar to my second example. This method will promote undef to the top of the heap, above '0'. Of course you could demote it to the bottom of the pile just as easily. But the point is that it sorts undef as a value different from zero.
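
As a sketch of the "demote it to the bottom" variant mentioned above: swapping $a and $b in the definedness comparison makes defined values sort first and undef last. The final name comparison is an addition for this illustration only, so that equal entries come out in a fixed order regardless of hash ordering.

```perl
use strict;
use warnings;

my %hash = (
    this  => 1,
    that  => 4,
    the   => undef,
    other => undef,
    those => 2,
    it    => 0,
);

# Same shape as the third method, but with $b and $a swapped in the first
# comparison, so defined entries (true) sort ahead of undefined ones.
# The trailing "or $a cmp $b" only breaks ties between equal entries.
my @undef_last = sort {
    defined( $hash{$b} ) <=> defined( $hash{$a} )
        or ( defined( $hash{$a} ) && $hash{$a} ) <=> ( defined( $hash{$b} ) && $hash{$b} )
        or $a cmp $b
} keys %hash;

print "@undef_last\n";
```

With the data above this prints "it this those that other the": zero still sorts ahead of the other defined values, and the two undef entries land at the end.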

        There's a final solution which reduces the number of calls to defined, by combining the first method with the third:

            {
                no warnings qw/uninitialized/;
                my @sorted = sort {
                    defined( $hash{$a} ) <=> defined( $hash{$b} )
                        or
                    $hash{$a} <=> $hash{$b}
                } keys %hash;
                say "Defined as Criteria:\t@sorted";
            }

        Here we are still sorting with undef as a distinctly different value from zero, but warnings would still be generated whenever two undefined values are compared with each other. Since the definedness check has already dealt with the ordering problem the warning points at, it is safe to ignore it, which is what the no warnings qw/uninitialized/ does for us.


        Dave

Re: Sorting an array of hashes by a value of hash
by Corion (Patriarch) on Apr 13, 2011 at 11:20 UTC