in reply to Re: Crazy hash sorting issue.
in thread Crazy hash sorting issue.

You can also use a sieve. It's potentially faster than a complicated sort() algorithm to place things matching certain patterns before others.
The code you post maybe faster than a 'sort', but compared to your "sieve", the sort needed is very simple.

I fail to understand the reason of this awfully complicated code. For this case, I'd just iterate over the hash twice:

my ($k, $v); $v =~ /C5$/ and print "$k $v\n" while ($k, $v) = each %tapes; $v =~ /P5$/ and print "$k $v\n" while ($k, $v) = each %tapes;
If you want to iterate once, you need additional storage:
my (@C5, @P5); while (my ($k, $v) = each %tapes) { $v =~ /C5$/ ? push @C5, "$k $v\n" : push @P5, "$k $v\n"; } print @C5, @P5;
Although you can reduce storage costs by printing the first set while iterating:
my @P5; while (my ($k, $v) = each %tapes) { $v =~ /C5$/ ? print "$k $v\n" : push @P5, "$k $v\n"; } print @P5;
Perl --((8:>*

Replies are listed 'Best First'.
Re^3: Crazy hash sorting issue.
by japhy (Canon) on Nov 07, 2005 at 13:08 UTC
    The more cases you have, the less efficient it is to loop over the data set each time. That's where a sieve comes in handy. It short-circuits the loop over the filters, and loops through the data only once.

    Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
    How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
      The more cases you have, the less efficient it is to loop over the data set each time.
      That's why I said that in this case, I'd iterate over the hash twice - because there are only 2 'buckets'. Note also that I gave three solutions, two of them iterating over the data only once (and they only do one test per iteration, unlike your sieve). Realise also that if the number of sieves grows, the gain of using a 'sieve' or 'buckets' over sorting deminishes.

      Would you have had an unknown number of cases, all distinguished by their last 2 characters, I'd used:

      my %buckets; while (my ($k, $v) = each %tapes) { push @{$buckets{substr $v, -2}}, "$k $v\n"; } print @{$bucket{$_}} for sort keys %buckets;
      Which is a lot more efficient (and simpler) than your solution. Your solution isn't very efficient - you critize my "special 2-bucket solution" to not being very efficient because it loops over the data set each time.

      But let's analyze that, ok? Suppose your have a dataset of N elements, and you have M buckets to "sieve" them into. Generalizing my "special 2-bucket solution" to an M-bucket one, requires M loops over the data - but doing one test in each iteration, for a total of N * M tests. Your complicated sieving solution loops over the data only once, but you are (potentially) doing M tests per iteration. So you're still doing N * M tests. That's the same amount of tests. The difference is that my solution doesn't require any extra storage.

      On the other hand, my solution can quickly be adapted to deal with an unknown number of buckets. It requires more storage than the "special 2-bucket solution", but not more than your solution. But what's more interesting, it's doing one pass over the data, and doing just one test per iteration. Hence, N tests. It takes more work to adapt your solution to deal with an unknown number of buckets.

      Perl --((8:>*