comment on

One could argue in great length whether additional shuffling in the array improves the randomness, or whether it produces patterns in the longer term,

If your RNG was perfect, additional shuffles would neither improve or degrade the randomness.

It seems that additional "shuffling" is required to prevent a permutation of the array where there is a correlation between items that are swapped.

It seems the "additional shuffling" is required.

Consider the shuffling of an array of three items without the so called "additional shuffling": Starting from the left:

Initial:   abc

1 move:    abc             bac(1/3)        cba(1/3)

2 move:    abc(1/6)  acb(1/6)
[download]

Updated: Per Abigail's criticism of a misstyped table and obscurity. To clarify:

I take issue with:

On the surface, each array entry has the appearance of being given a single opportunity at being swapped with another array entry. In practice, array entries nearer to the beginning of the array have additional opportunities of being swapped (as a result of later swaps), meaning that less shuffling happens at the end of the array than at the beginning.

One could argue in great length whether additional shuffling in the array improves the randomness, or whether it produces patterns in the longer term, but it isn't difficult to argue that regardless of which conclusion is valid, the fact that the entries nearer to the beginning have additional chances of being shuffled, while entries nearer the end have fewer chances, that *one* of these ends will be improved more than the other, therefore the algorithm is not optimal.

Which seems to say that the possible movement of an already moved item is a weakness of the algorithm. But an array cannot be randomly shuffled in place without allowing elements to be moved more than once.

The idea that the errors of a RNG will more adversely affect the items to which that RNG is more often applied is an argument that is more subtle. I reject it out of hand. If the RNG is incorrect the results will not be random and we are quibling about how obvious the error is. This distinction may be very important in practice and there are practical solutions to get a pseudo-randomness that is okay for varying definitions of okay.

In reply to (Re:)+ Fisher-Yates theory by rir
in thread Fisher-Yates theory by Anonymous Monk

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.