Randomize CSV word lists

Grendel2112 has asked for the wisdom of the Perl Monks concerning the following question:

Re: Randomize CSV word lists by Juerd (Abbot) on Apr 04, 2002 at 18:14 UTC
- DBI and DBD::CSV
- Text::CSV or Text::CSV_XS
- How do I shuffle an array randomly?
- It's Perl, not PERL.

Good luck!

U28geW91IGNhbiBhbGwgcm90MTMgY W5kIHBhY2soKS4gQnV0IGRvIHlvdS ByZWNvZ25pc2UgQmFzZTY0IHdoZW4 geW91IHNlZSBpdD8gIC0tIEp1ZXJk
Re: Re: Randomize CSV word lists by RMGir (Prior) on Apr 04, 2002 at 18:16 UTC
Good list, Juerd. But you forgot "Make sure you have -w and use strict" :) -- Mike
Re: Re: Re: Randomize CSV word lists by Juerd (Abbot) on Apr 04, 2002 at 18:21 UTC
> But you forgot "Make sure you have -w and use strict" :)

I assume every beginner has already seen that thousands of times. Besides, I'm convinced one has to experience strictless hell before enjoying strict - that's how I did it, and it made me love strict even more.
Re: Re: Re: Re: Randomize CSV word lists by RMGir (Prior) on Apr 04, 2002 at 18:34 UTC
Re: Re: Randomize CSV word lists by Grendel2112 (Initiate) on Apr 04, 2002 at 19:07 UTC
I saw "How do I shuffle an array randomly?", but my concern is that it might shuffle the words out of their original columns. I don't know enough about this to tell whether that is the case.
Re: Randomize CSV word lists by traveler (Parson) on Apr 04, 2002 at 19:08 UTC
Juerd's list is good, but there is no need to copy and paste the shuffle algorithm. You can use Algorithm::Numerical::Shuffle. Despite the name, it does shuffle lists of strings. This module has been of great use to me. HTH, --traveler
Re: Randomize CSV word lists by graff (Chancellor) on Apr 05, 2002 at 03:29 UTC
Previous comments were all informative, but I gather you may still be wondering how to deal with 80 columns or so of data... You want to transpose the CSV array, so that each column is stored in its own array and can be shuffled on its own. I tried the following on (a copy of) a csv dump of my last bank statement -- seems to do the job (could make taxes interesting this year...)

BTW, I suppose Fisher_Yates is good enough, but my own favorite has always been to prepend a random number to the string (default output of rand() is between 0.0 and 0.999...), then sort, then remove the random number.

    use strict;

    my @transpose;   # this will be an array of arrays, one per column
    my $ncols = 0;

    while (<>) {
        chomp;
        my @cols = split(/,/);
        if ( $ncols ) {
            die "Line $. doesn't have $ncols columns\n"
                if ( $ncols != scalar @cols );
        }
        else {
            $ncols = scalar @cols;   # first line fixes the column count
        }
        foreach my $i (0..$#cols) {
            push( @{$transpose[$i]}, $cols[$i] );
        }
    }
    my $nrows = $.;

    # shuffle each column independently, so words stay in their own column
    for (0..$ncols-1) {
        &fisher_yates_shuffle( $transpose[$_] );
    }

    # transpose back and print the rows
    foreach my $i (0..$nrows-1) {
        my @cols = ();
        foreach my $j (0..$ncols-1) {
            push( @cols, $transpose[$j][$i] );
        }
        print join( ",", @cols ) . "\n";
    }

    # Assumed here to be the in-place shuffle from the FAQ entry
    # "How do I shuffle an array randomly?"; takes an array reference.
    sub fisher_yates_shuffle {
        my $array = shift;
        for ( my $i = $#$array; $i > 0; $i-- ) {
            my $j = int rand( $i + 1 );   # 0 <= $j <= $i
            @$array[ $i, $j ] = @$array[ $j, $i ];
        }
    }
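For reference, the "random number, sort, strip" approach described above could look roughly like the sketch below. It is only an illustration: the name sort_shuffle is made up, and instead of literally prepending the number to the string it pairs each element with a random key in a small array, which avoids having to cut the number back off afterwards.

```perl
use strict;
use warnings;

# Hypothetical helper: shuffle a list by sorting on random keys.
# Each element is wrapped as [ random_key, value ], the pairs are
# sorted numerically by key, and the values are unwrapped again.
sub sort_shuffle {
    my @items = @_;
    return map  { $_->[1] }
           sort { $a->[0] <=> $b->[0] }
           map  { [ rand(), $_ ] } @items;
}

my @words = qw(apple banana cherry date);
print join( ",", sort_shuffle(@words) ), "\n";
```

The input list is left untouched and a shuffled copy is returned, so in the script above a column would be replaced with something like @{$transpose[$_]} = sort_shuffle( @{$transpose[$_]} ).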
Re: Re: Randomize CSV word lists by Juerd (Abbot) on Apr 05, 2002 at 06:33 UTC
> BTW, I suppose Fisher_Yates is good enough, but my own favorite has always been to prepend a random number to the string (default output of rand() is between 0.0 and 0.999...), then sort, then remove the random number.

Not only is it good enough, it's also a lot more efficient and scalable. The Fisher_Yates algorithm shuffles in place by swapping array elements, in a single pass over the array. Your solution first alters all elements, then sorts them, assigns the result of the sort to an array, and finally removes the random number again. I have not benchmarked it, but it sounds like a slow procedure - which may still be very useful for small arrays.

> my @cols = split(/,/);

That's not CSV parsing. CSV isn't just comma-separated: the format also supports quoted strings and escaping of quotes with other quotes. See Re (tilly) 1: csv output.
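To make the split objection concrete, here is a minimal sketch of what goes wrong with a quoted field. The helper parse_csv_line is hypothetical and hand-rolled purely for illustration; it handles quoted fields and doubled quotes on a single line, but not embedded newlines or other separators. For real data, one of the CSV modules mentioned earlier in the thread is the safer choice.

```perl
use strict;
use warnings;

# Hypothetical, simplified CSV line parser (illustration only).
sub parse_csv_line {
    my ($line) = @_;
    my @fields   = ('');
    my $in_quote = 0;
    my @chars    = split //, $line;
    for ( my $i = 0; $i < @chars; $i++ ) {
        my $c = $chars[$i];
        if ($in_quote) {
            if ( $c eq '"' ) {
                if ( $i + 1 < @chars && $chars[ $i + 1 ] eq '"' ) {
                    $fields[-1] .= '"';    # doubled quote is a literal quote
                    $i++;
                }
                else {
                    $in_quote = 0;         # closing quote
                }
            }
            else {
                $fields[-1] .= $c;         # commas inside quotes are data
            }
        }
        elsif ( $c eq '"' ) { $in_quote = 1 }          # opening quote
        elsif ( $c eq ',' ) { push @fields, '' }       # field separator
        else                { $fields[-1] .= $c }
    }
    return @fields;
}

my $line = '"Smith, John",42,"said ""hi"""';

my @naive = split( /,/, $line );
print scalar(@naive), " pieces from split\n";     # 4 pieces

my @fields = parse_csv_line($line);
print scalar(@fields), " fields when quotes are honoured\n";   # 3 fields
```

The plain split cuts "Smith, John" in two at the comma inside the quotes, so the column count check in the script above would die on such a line, or worse, silently misalign the columns if every line had the same number of stray commas.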
Re: Re: Randomize CSV word lists by Grendel2112 (Initiate) on Apr 05, 2002 at 13:19 UTC
Thank you very much. That was very useful. Oh, frabjous day, Calloo, callay. :)