in reply to Can Schwartzian transform be modified to sort an arrayref uniquely?

Well, before we get to your question, there are a few things that need to be cleared up. First, it's good to see you trying out advanced techniques and getting more tools in your box, but you do have a few issues.

Your first data structure has parentheses around it. That makes it a comma expression which, when evaluated in list context, will return a list. In scalar context, its value is whatever the last item evaluates to. You only have one item, so your scalar gets the array ref and your code works, but you should avoid doing things like this unless you're positive that this is what you need. Drop the parentheses.
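
To make the context difference concrete, here's a small sketch. The values and variable names are made up for illustration and aren't from your code:

```perl
use strict;

# With warnings enabled, perl flags the first assignment with
# "Useless use of a constant in void context".
my $last    = ( 4, 5, 6 );    # scalar context: comma operator, keeps the last value
my @all     = ( 4, 5, 6 );    # list context: all three values
my ($first) = ( 4, 5, 6 );    # list assignment: takes the first value

# With a single item the parentheses change nothing, which is why your code works.
my $ref = ( [ 'a', 'b' ] );

print "$last $first ", scalar(@all), " ", scalar(@$ref), "\n";
```

The parentheses only start to bite once a second item sneaks in, at which point the scalar silently gets the last one.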

The Schwartzian is overkill. A Schwartzian transform is great if you have an expensive step to extract the data and put it in a sortable format. You do not have such a step. Your Schwartzian reduces to this:

my @sorted = sort { $a->{num} cmp $b->{ num } } @$data;
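
For contrast, here's a rough sketch of a case where the transform would earn its keep. Note that expensive_key is a hypothetical stand-in I made up for a costly extraction step (parsing, a stat call and so on); nothing like it exists in your code:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a costly key extraction.
sub expensive_key {
    my ($record) = @_;
    return lc $record->{num};
}

my $data = [
    { num => 'OF1234', title => 'title OF1234' },
    { num => 'AF1234', title => 'title AF1234' },
];

my @sorted =
    map  { $_->[1] }                       # 3. unwrap the original record
    sort { $a->[0] cmp $b->[0] }           # 2. compare the cached keys
    map  { [ expensive_key($_), $_ ] }     # 1. compute each key exactly once
    @$data;

print join( ', ', map { $_->{num} } @sorted ), "\n";
```

The point of the wrapping is that the key is computed once per record instead of once per comparison. A plain hash lookup is already about as cheap as the lookup into the wrapper, so the transform buys you nothing here.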

If it makes you feel better, I was caught by the same issue. See Sort on Boredom :)

Here's how I shortened your code:

#!/usr/local/bin/perl -w
use strict;

my $data = [
    { num => 'OF1234', title => 'title OF1234', },
    { num => 'AF1234', title => 'title AF1234', },
    { num => 'AF1234', title => 'title AF1234', },
];

my %saw;
my @sorted = grep { ! $saw{$_->{num}}++ }
             sort { $a->{num} cmp $b->{ num } } @$data;

print join ', ', map { $_->{num} } @sorted;

Hope that helps.

Cheers,
Ovid

Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

(tye)Re: Can Schwartzian transform be modified to sort an arrayref uniquely?
by tye (Sage) on Jan 04, 2002 at 22:13 UTC

    You can also either do the grep first so that the sort will be faster or leave it there and make use of the fact that the data is now sorted to avoid creating a possibly large %saw hash:

    my $prev= "";
    my @sorted=
        grep { ( $_->{num} ne $prev, $prev= $_->{num} )[0] }
        sort { $a->{num} cmp $b->{num} } @$data;

    Which doesn't matter for such a small data set, of course. (:

            - tye (but my friends call me "Tye")

      Indeed. There are trade-offs everywhere: by default, I always write

      sort {} grep {} @foo;

      so that the grep happens first. The speed is more important than the memory for almost everything I do.
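
      Applied to the records from the original question, that habit looks roughly like this. It's a sketch; I'm assuming the same $data layout as in Ovid's post:

```perl
use strict;
use warnings;

my $data = [
    { num => 'OF1234', title => 'title OF1234' },
    { num => 'AF1234', title => 'title AF1234' },
    { num => 'AF1234', title => 'title AF1234' },
];

my %saw;    # starts empty, so the first record for each num is kept
my @sorted =
    sort { $a->{num} cmp $b->{num} }    # sorts only the survivors
    grep { !$saw{ $_->{num} }++ }       # drops repeats before the sort sees them
    @$data;

print join( ', ', map { $_->{num} } @sorted ), "\n";
```

      The sort now works on two records instead of three, a saving that grows with the number of duplicates.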

      Here's a little bit of lies/damn-lies/statistics demonstrating the speed difference on an array in which most values appear twice.

      #!/usr/local/bin/perl -w
      use Benchmark;
      use strict;

      my @data;
      # NOTE: %seen is shared by both subs and never cleared between calls,
      # so only the very first call returns any elements; every later grep
      # finds each value already seen and returns an empty list.
      my %seen;

      foreach ( 1..100_000 ) { push @data, int($_/2) }
      fisher_yates_shuffle ( \@data );

      #my @gs = gs(); print join ( "\n", @gs );
      #my @sg = sg(); print join ( "\n", @sg );

      timethese ( 20, {
          sort_grep => \&sg,
          grep_sort => \&gs,
      } );

      # grep first, then sort the survivors
      sub gs {
          return sort { $a <=> $b } grep { ! $seen{$_} ++ } @data;
      }

      # sort everything, then grep
      sub sg {
          return grep { ! $seen{$_} ++ } sort { $a <=> $b } @data;
      }

      # fisher_yates_shuffle( \@array ) : from Perl Cookbook
      sub fisher_yates_shuffle {
          my $array = shift;
          my $i;
          for ($i = @$array; --$i; ) {
              my $j = int rand ($i+1);
              next if $i == $j;
              @$array[$i,$j] = @$array[$j,$i];
          }
      }

      And the output I get:

      Benchmark: timing 20 iterations of grep_sort, sort_grep...
       grep_sort:  5 wallclock secs ( 4.70 usr +  0.02 sys =  4.72 CPU) @  4.24/s (n=20)
       sort_grep: 16 wallclock secs (15.56 usr +  0.03 sys = 15.59 CPU) @  1.28/s (n=20)

      As a side note, I regard this kind of optimization as "habitual", rather than of the premature kind. <laugh>

Re: (Ovid) Re: Can Schwartzian transform be modified to sort an arrayref uniquely?
by gbarr (Monk) on Jan 05, 2002 at 01:24 UTC
    One of the "features" of the Schwartzian transform is that you don't have to create any named temporary variables. So instead of using %saw or $prev you could do

    my $data = [
        { num => 'OF1234', title => 'title OF1234', },
        { num => 'AF1234', title => 'title AF1234', },
        { num => 'AF1234', title => 'title AF1234', },
    ];

    my @sorted = sort { $a->{num} cmp $b->{ num } }
                 values %{ +{ map { ($_->{num},$_) } @$data }};

    print join ', ', map { $_->{num} } @sorted;
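
    One difference worth knowing about, sketched here with made-up titles so the two copies can be told apart: the anonymous hash keeps the last record for each num, because later pairs overwrite earlier ones, whereas the %saw grep keeps the first.

```perl
use strict;
use warnings;

# Two records with the same num but different (hypothetical) titles.
my $data = [
    { num => 'AF1234', title => 'first copy'  },
    { num => 'AF1234', title => 'second copy' },
];

my %saw;
my @via_grep = grep { !$saw{ $_->{num} }++ } @$data;                  # first one wins
my @via_hash = values %{ +{ map { ( $_->{num}, $_ ) } @$data } };     # last one wins

print "grep kept: $via_grep[0]{title}\n";
print "hash kept: $via_hash[0]{title}\n";
```

    When the duplicates are identical, as in the original data, it makes no difference which one survives.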