#! perl -sw
use strict;
$| = 1;
# In-place Fisher-Yates shuffle. Takes an array reference (or a flat list),
# reorders the elements in place, and returns nothing in void context.
sub shuffle {
    my $ref = @_ == 1 ? $_[ 0 ] : \@_;
    my $n   = @$ref;
    for( 0 .. $#$ref ) {
        my $p = $_ + rand( $n-- );    # fractional index; truncated when used
        my $t = $ref->[ $p ];         # swap element $_ with element $p
        $ref->[ $p ] = $ref->[ $_ ];
        $ref->[ $_ ] = $t;
    }
    return unless defined wantarray;
    return wantarray ? @{ $ref } : $ref;
}
warn 'Start : '.localtime() .$/;
my @data = <>;
warn 'Read : '.localtime() .$/;
shuffle \@data;
warn 'Shuffled: '.localtime().$/;
print for @data;
warn 'Written : '.localtime().$/;
__END__
P:\test>shuffleFile.pl data\1millionlines.dat >nul
Start : Fri Apr 15 19:41:44 2005
Read : Fri Apr 15 19:41:53 2005
Shuffled: Fri Apr 15 19:41:55 2005
Written : Fri Apr 15 19:41:57 2005
Can Haskell do this? I believe the answer is no. The 1 million line file in my example was chosen deliberately: when loaded, it pushes memory consumption (on my machine) very close to the physical limit.
The hand-coded shuffle routine is used in preference to List::Util::shuffle because it operates in-place rather than duplicating the list. It could equally well be an in-place qsort.
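For contrast, this is roughly what the copying alternative looks like (a minimal sketch, reusing the @data array from the script above). List::Util::shuffle returns a new list rather than reordering its argument, so for a moment the original array and the shuffled copy exist side by side, which is exactly the duplication the in-place routine avoids:

use List::Util qw( shuffle );

# Copying approach: @data and @shuffled coexist until @data is freed,
# roughly doubling peak memory for large inputs.
my @shuffled = shuffle( @data );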
The equivalent Haskell program would be less efficient because:
- It could not do the shuffle/sort in-place, as that involves side effects. It would therefore need to duplicate the data at least once which, given the parameters of the program above, would push my machine into swapping.
- A Haskell equivalent would probably require twice as much memory or more, as the only fair shuffle mechanism I have seen that doesn't operate in-place uses a tree structure, which must take up more storage than an array or list (I think).
Equally, the popular and celebrated Haskell qsort implementation uses huge amounts of RAM as it divides and subdivides the input list into a zillion smaller lists during recursion. It's an extremely neat implementation, and really quite fast on smallish lists, but efficiency isn't just about operations that run well within the limits; it is also about what happens when those limits are breached, and about where those limits are.
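To make that allocation pattern concrete, here is a rough Perl transcription of that copying, divide-and-subdivide style. The sub name copying_qsort is mine, not from any library, and this is only a sketch of the technique being criticised, not anyone's production code. Every level of recursion greps out fresh sublists, so intermediate storage piles up instead of the sort happening within one array:

# Hypothetical sketch of the copying quicksort style: each call builds
# two new sublists via grep, so memory use grows with recursion depth
# rather than the data being rearranged in place.
sub copying_qsort {
    my @list = @_;
    return @list if @list <= 1;
    my $pivot = shift @list;
    return (
        copying_qsort( grep { $_ lt $pivot } @list ),
        $pivot,
        copying_qsort( grep { $_ ge $pivot } @list ),
    );
}

Called as my @sorted = copying_qsort( @data ), this allocates many short-lived lists on the way to the result; an in-place sort works within a single array instead.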
RAM may be cheap, but disk space is even cheaper, and the volumes of data being routinely processed are growing much faster than typical RAM.
Haskell, and other FP languages that categorically deny the possibility of programming with side effects (anything that operates in-place), will always be at a disadvantage when dealing with large volumes of data that must be in memory for an algorithm to work.
Perl is memory-hungry, but it also provides the tools to bypass that hunger through alternative techniques. Haskell doesn't provide this fallback position.
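As a minimal sketch of one such alternative technique (not part of the benchmark above, and making no claims about its speed): the core Tie::File module presents the file as an array without slurping it, so only the list of line indices, not the lines themselves, has to live in RAM:

#! perl -sw
# Hedged sketch: shuffle a file by index using Tie::File, so the lines
# themselves stay on disk and only the index list is held in memory.
use strict;
use Tie::File;
use List::Util qw( shuffle );

tie my @lines, 'Tie::File', $ARGV[ 0 ]
    or die "Cannot tie $ARGV[ 0 ]: $!";

# Shuffle the line numbers, then emit the records in that order.
# (Tie::File strips the record separator, so it is re-appended here.)
print $lines[ $_ ], $/ for shuffle( 0 .. $#lines );

untie @lines;

Random access through the tie is slower than an in-memory array, so this trades speed for footprint; the point is only that the fallback exists.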