You're right, Bart.
When I did my testing, for simplicity I used an array rather than a hash (or rather, three arrays), so as to get a feel for both average and worst-case performance.
The numbers of sorts and comparisons I gave above were for the worst-case scenario, where the 1_000_000 values are pre-ordered ascending and every new value requires the keep array to be re-sorted. This is slow, as I indicated.
The bit I missed with your code was that, as the values are coming from a hash, they are processed in "hash order", which of course is undefined but does a pretty good job of approximating random.
This means that the keep array only ends up getting sorted an average of around 900 times instead of the worst case of 999_000 times in the 100-from-1_000_000 case, with the obvious benefit to performance.
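To make that concrete, here's a rough sketch of the kind of keep-array selection we're talking about (my own illustrative rendering, not the exact code from either of our tests; the key range, the random sample values and the $sorts counter are only there to show the shape of it):

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $N = 100;

    # Sample data: random values keyed by short strings.
    my %hash;
    $hash{$_} = int rand 1_000_000 for 'AAAAA' .. 'AAZZZ';

    my @keep;        # [ key, value ] pairs, kept sorted descending by value
    my $sorts = 0;   # how often the keep array gets re-sorted

    while ( my ( $key, $value ) = each %hash ) {
        # Cheap rejection: skip unless the value beats the current minimum.
        next if @keep == $N and $value <= $keep[-1][1];

        push @keep, [ $key, $value ];
        @keep = sort { $b->[1] <=> $a->[1] } @keep;   # re-sort the small array
        $sorts++;
        pop @keep if @keep > $N;                      # drop the new minimum
    }

    print "$sorts sorts for ", scalar keys %hash, " keys\n";
    print "$_->[0] => $_->[1]\n" for @keep;

Because the hash feeds the loop in effectively random order, the cheap rejection test discards almost everything once the keep array has filled, which is where the ~900 sorts figure comes from.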
Whilst the worst-case scenario still exists, the chances of it happening naturally are astronomically small and it can therefore be discounted. A quick test using your code on 'AAAAA' .. 'BAAAA' (913_952 keys) used 908 sorts and took around 45 seconds on my 233MHz machine. That compares pretty well with my code taking 15 seconds on a 1_000_000 element array, given that mine doesn't have the extra level of indirection in the comparator that it would need in order to retain the keys.
I'd like to see a comparison of the two run on 10 million records, as I don't think they would scale linearly, but I agree that this makes the benefits of my algorithm look much less compelling. Leastwise while it is coded in pure Perl.
An XS or C implementation of bsearch() would improve things dramatically. Maybe I'll have a go. I think that it would be a useful addition to List::Util. What d'ya think?
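For reference, a pure-Perl version of the idea might look something like this (the name bsearch_insert and its interface are mine, purely for illustration; an XS version would do the same search in C):

    use strict;
    use warnings;

    # Find the insertion point for $value in an ascending @$aref
    # by binary search, and splice it in so the array stays sorted.
    sub bsearch_insert {
        my ( $aref, $value ) = @_;
        my ( $lo, $hi ) = ( 0, scalar @$aref );
        while ( $lo < $hi ) {
            my $mid = int( ( $lo + $hi ) / 2 );
            if ( $aref->[$mid] < $value ) {
                $lo = $mid + 1;
            }
            else {
                $hi = $mid;
            }
        }
        splice @$aref, $lo, 0, $value;
        return $lo;
    }

    # Keep the top 100 of a stream of random values without ever
    # re-sorting: insert in place, drop the smallest when over-full.
    my @keep;
    for my $v ( map { int rand 1_000_000 } 1 .. 10_000 ) {
        next if @keep == 100 and $v <= $keep[0];   # ascending, so [0] is the smallest
        bsearch_insert( \@keep, $v );
        shift @keep if @keep > 100;
    }
    print scalar @keep, " values kept, largest is $keep[-1]\n";

That replaces the repeated sort with an O(log N) search plus a splice.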