http://qs1969.pair.com?node_id=427861


in reply to Re^3: Better mousetrap (getting top N values from list X)
in thread Better mousetrap (getting top N values from list X)

Oh! A bubble sort. That's essentially what I did in my first attempt 427046, but it didn't fair very well. It would reasonable for very small N, but it very rapidly gets overtaken.

And you'll notice, it still has to process the whole list on the first pass, so lazy lists don't really come into play.


Examine what is said, not who speaks.
Silence betokens consent.
Love the truth but pardon error.

Replies are listed 'Best First'.
Re^5: Better mousetrap (getting top N values from list X)
by sleepingsquirrel (Chaplain) on Feb 03, 2005 at 23:20 UTC
    You can do lazy heap/merge/quick sorts.


    -- All code is 100% tested and functional unless otherwise noted.

      True, but you still need the whole list for the first pass.


      Examine what is said, not who speaks.
      Silence betokens consent.
      Love the truth but pardon error.
        So?


        -- All code is 100% tested and functional unless otherwise noted.
Re^5: Better mousetrap (getting top N values from list X)
by sleepingsquirrel (Chaplain) on Feb 03, 2005 at 23:39 UTC
    And you'll notice, it still has to process the whole list on the first pass, so lazy lists don't really come into play.
    Yes, lazy lists are what make the whole scheme elegant. The sort returns a lazy list of values, which is nothing more than a thunk, a promise to do some work in the future (i.e. there is no computation done, and there are no elements sitting around in memory). When you end up needing the value in @sorted[0] that thunk is evaluated (you get the minimal value in O(n) time). And when you need @sorted[1] you get that in another O(log(n)) time. Etc. You might want to check out the lazy language Haskell for all kinds of funky stuff that can be done lazily (and all perl monks know the value of Laziness!)


    -- All code is 100% tested and functional unless otherwise noted.

      Okay. I see what you are saying now. The slice across the return from sort generates a lazy list.

      So the code:

      my @data = getdata( 100000 ); my @topN = ( sort( @data ) )[ 1 .. 10 ]; undef( @data ); for my $top ( @topN ) { ## << COrrect per 427878 ## do something with $top }

      Passes the whole list of data to sort, (which takes a copy of it in case we modify or discard it) and returns an iterator which gets bound to @top, so that each time we access the next element of @top in the for loop, the iterator is called via the magic of tieing, and the copy of the 100,000 elements that we passed to sort gets scanned to produce the next of our 5 return values.

      Lazy huh?!


      Examine what is said, not who speaks.
      Silence betokens consent.
      Love the truth but pardon error.
        Essentially. Although I think you probably meant...
        for my $top ( @topN ) { ## do something with $top }


        -- All code is 100% tested and functional unless otherwise noted.