
I'd guard against calling splice() repeatedly from the beginning of an array. I'd much rather call it from the end of the array. "But Jeff," you'd say, "what if the order of the hash references matters? You couldn't do:
while (my ($a,$b,$c) = splice(@data, -3)) {
    push @hashrefs, { a => $a, b => $b, c => $c };
    pop @data;  # null field
}
because then @hashrefs would be in reverse!" Yes, that's absolutely right. And it might be inefficient to reverse() the array when we're done with it, and it's also inefficient to keep unshift()ing to the array. So what possible efficient solution could I come up with to combine the splice() speed with the insertion speed?

Pre-extend the array! (Dun, dun, DUN!)
my @hashrefs;
$#hashrefs = int(@data / 4);  # pre-extend to the final size
my $i = $#hashrefs;           # fill from the last slot backwards
while (@data and my ($a,$b,$c) = splice(@data, -3)) {
    $hashrefs[$i--] = { a => $a, b => $b, c => $c };
    pop @data;  # null field
}
What a sneaky trick I've done.
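
For anyone who wants to run the trick, here is a minimal self-contained sketch. The sample data, the sub name records_from_end, and the layout (three fields per record, one empty field between records, no separator after the last record) are assumptions made for illustration; the real data in the thread may differ.

```perl
use strict;
use warnings;

# Assumed layout (hypothetical sample): three fields per record, records
# separated by one empty field, no separator after the last record.
my @data = (
    'title1', 'author1', 'link1', '',
    'title2', 'author2', 'link2', '',
    'title3', 'author3', 'link3',
);

sub records_from_end {
    my @data = @_;                   # work on a copy
    return unless @data;             # nothing to do for empty input
    my @hashrefs;
    $#hashrefs = int(@data / 4);     # pre-extend: 11 elements -> last index 2
    my $i = $#hashrefs;
    while (@data) {
        my ($x, $y, $z) = splice(@data, -3);    # last three fields
        $hashrefs[$i--] = { a => $x, b => $y, c => $z };
        pop @data;                   # drop the separator (harmless once empty)
    }
    return @hashrefs;
}

my @hashrefs = records_from_end(@data);
print join(' ', map { $_->{a} } @hashrefs), "\n";    # title1 title2 title3
```

The records come out in their original order even though the array was consumed from the end, because each one is written into its final slot directly.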

Update

Oops, used -4 above, when I meant -3. Thanks, jcwren.

Update

splice() will wig out when the array is empty. The while loop has been adjusted.

japhy -- Perl and Regex Hacker

Re (tilly) 3: Sort this data
by tilly (Archbishop) on Nov 19, 2000 at 20:42 UTC
    I would have to check, but I thought that splice was only slow if you have to move the array around. But I don't know whether pulling from the beginning of the array has been optimized. And (like jcwren) I am too lazy to benchmark it at the moment. In any case shift is fast and has the benefit of avoiding tracking indices. (Note that you had a bug in your code which jcwren caught? Yes, I am talking about that.)
    while (@big_array) {
        my $href;
        @$href{'title', 'author', 'link'} = map shift(@big_array), 1..4;
        push @structs, $href;
    }
    This might be faster than your sneaky trick. It might be slower. It certainly has fewer indices.
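
A runnable version of the shift approach might look like the sketch below. The sub name records_from_front and the sample values are hypothetical, and the record layout (three fields followed by an empty separator) is an assumption.

```perl
use strict;
use warnings;

sub records_from_front {
    my @big_array = @_;              # work on a copy
    my @structs;
    while (@big_array) {
        my %rec;
        # Four shifts per record; the hash slice keeps the first three values
        # and the fourth (the separator, or undef at the very end) is dropped.
        @rec{qw(title author link)} = map { shift @big_array } 1 .. 4;
        push @structs, \%rec;
    }
    return @structs;
}

my @structs = records_from_front('t1', 'a1', 'l1', '', 't2', 'a2', 'l2');
print scalar(@structs), " records, first title: $structs[0]{title}\n";
```

Because the array is read front to back, the records land in order with no index bookkeeping and no reverse.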

    Also the cost of reverse is overstated. You have just walked through a list of n things in Perl. You then want to reverse a list of n/4 things. What is the relative cost of those two operations? Right.

    Pick up good material on optimization, such as this sample chapter from Code Complete, or RE: Efficient Perl Programming. You will find that experienced people understand that maintainable code with good algorithms can yield better overall speed wins than trying to optimize every line.

    Now, the splice: that matters. If it isn't optimized, then that is an order n operation performed n times, which is order n^2 and therefore likely to be slow. But one reverse at the end is an order n operation performed once. Should the body of the loop be slightly more efficient from doing the slice rather than repeated manipulation of indices (something I would have to benchmark to have a feeling for either way), then your attempt to optimize would actually lose.

    To summarize, don't worry about slow operations, worry about bad algorithms. A slow operation inside a loop may matter. A slow operation outside a loop which speeds up the loop can go either way. An order n (or worse) operation inside a loop - that is the only one which should cause you to want to care up front about optimizing the structure of the code!
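
Since nobody in the thread actually ran the numbers, here is a sketch of how the comparison could be set up with the core Benchmark module. The sub names, the record count, and the generated data are all hypothetical, and the layout (three fields plus an empty separator, no trailing separator) is an assumption. No results are claimed here; they will vary by Perl version and machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: three-field records separated by an empty field.
my $records = 5_000;
my @source  = map { ("title$_", "author$_", "link$_", '') } 1 .. $records;
pop @source;    # no separator after the last record

# Read from the front with shift, append with push.
sub from_front {
    my @in = @source;
    my @out;
    while (@in) {
        my %rec;
        @rec{qw(title author link)} = map { shift @in } 1 .. 4;
        push @out, \%rec;
    }
    return \@out;
}

# Read from the end with splice, write into a pre-extended array by index.
sub from_end_indexed {
    my @in = @source;
    my @out;
    $#out = int(@in / 4);
    my $i = $#out;
    while (@in) {
        my %rec;
        @rec{qw(title author link)} = splice(@in, -3);
        $out[$i--] = \%rec;
        pop @in;    # separator
    }
    return \@out;
}

# Read from the end with splice, push, then reverse once at the end.
sub from_end_reversed {
    my @in = @source;
    my @out;
    while (@in) {
        my %rec;
        @rec{qw(title author link)} = splice(@in, -3);
        push @out, \%rec;
        pop @in;    # separator
    }
    return [ reverse @out ];
}

cmpthese(100, {
    front_shift     => \&from_front,
    end_preextended => \&from_end_indexed,
    end_reverse     => \&from_end_reversed,
});
```

Each sub copies @source first so that every iteration starts from the same input; that copy costs the same in all three, so it should not skew the comparison.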

    EDIT
    I had messed up the final paragraph.

(jcwren) Re: (3) Sort this data
by jcwren (Prior) on Nov 19, 2000 at 19:55 UTC
    Nice solution! But a couple of questions:

    It would be interesting to benchmark if using unshift() into the $hashrefs array would be more or less efficient than messing with $i. If nothing else, it would be two lines shorter, and more Perlish.
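
The unshift() variant suggested here could be sketched as follows. The sub name records_unshift and the sample values are hypothetical, and the same record layout (three fields, one empty separator between records) is assumed. Whether it beats the index version is exactly the open question.

```perl
use strict;
use warnings;

sub records_unshift {
    my @data = @_;                   # work on a copy
    my @hashrefs;
    while (@data) {
        my ($x, $y, $z) = splice(@data, -3);    # last three fields
        # Prepending keeps the original record order without tracking $i.
        unshift @hashrefs, { a => $x, b => $y, c => $z };
        pop @data;                   # drop the separator
    }
    return @hashrefs;
}

my @hashrefs = records_unshift('t1', 'a1', 'l1', '', 't2', 'a2', 'l2');
print join(' ', map { $_->{a} } @hashrefs), "\n";    # t1 t2
```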

    And perhaps this is too early in the morning, but why are you splicing by -4, and then popping @data?

    It's Sunday morning where I am, and I'm too lazy to try to write a coherent benchmark...

    --Chris

Re: Re: Re: Sort this data
by extremely (Priest) on Nov 20, 2000 at 01:47 UTC
    What if there is a blank line at the END of the list? You might want to sniff for that and pop off blank lines first.

    Also, what if fields are allowed to be null? If so, you HAVE to read from the front...
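
Sniffing for trailing blank lines could look like the sketch below. The helper name strip_trailing_blanks is hypothetical, and it is only safe under the assumption that real fields are never empty; if fields may be null, this would eat a legitimately empty last field.

```perl
use strict;
use warnings;

sub strip_trailing_blanks {
    my @data = @_;                   # work on a copy
    # Pop for as long as the last element is undef or the empty string.
    pop @data while @data && (!defined $data[-1] || $data[-1] eq '');
    return @data;
}

my @clean = strip_trailing_blanks('t1', 'a1', 'l1', '', 't2', 'a2', 'l2', '', '');
print scalar(@clean), "\n";    # 7
```

After this step the array ends on the last field of the last record, which is what the read-from-the-end loop expects.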

    --
    $you = new YOU;
    honk() if $you->love(perl)

      Yes, and we've just seen that reading from the front is not all that slow an operation. Sorry, I didn't understand how the array was implemented internally. Now that I do, I see that splice()ing at the front of an array is a damn nice operation, and makes unshift()ing later less of a headache.

      japhy -- Perl and Regex Hacker
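
Reading from the front with splice(), which also copes with null fields and a trailing separator, might be sketched like this. The sub name records_front_splice and the sample values are hypothetical; the layout of three fields followed by one separator is an assumption.

```perl
use strict;
use warnings;

sub records_front_splice {
    my @data = @_;                   # work on a copy
    my @hashrefs;
    while (@data) {
        # Take a whole record (three fields plus the separator) off the front;
        # the separator is the fourth value and is simply not assigned.
        my ($x, $y, $z) = splice(@data, 0, 4);
        push @hashrefs, { a => $x, b => $y, c => $z };
    }
    return @hashrefs;
}

# Second field of record 1 and third field of record 2 are deliberately empty.
my @recs = records_front_splice('t1', '', 'l1', '', 't2', 'a2', '', '');
print scalar(@recs), "\n";    # 2
```

Because records are counted off from the front by position, an empty field cannot be mistaken for a separator.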