You can do this either by sampling the population randomly or by using the entire set as a base. Either way, a straightforward approach (let's assume the whole set) would be to:
Determine the optimum number of files (or subdirectories) per directory; pick a number. Let's say 50, for example.
Load an array with all of your data. Let's say there are 60000 ISBNs.
Determine which integer root of 60000 comes closest to 50: the square root is about 245, the cube root is about 39. Make your directory depth 3 (the cube root); a sketch of this calculation follows the steps below.
Sort your list (or your samples).
Split your list into 39 sub-list ranges. The first element in each of these 39 sections is the lower bound for that section, the last element the upper bound. These ranges become your first-level directories.
Split each of those into 39 sub-list ranges again. The first and last element of each of these sublists define the range of files that belong in that second-level directory. (These last two steps are nicely recursive; see the second sketch after the steps.)
If you used a sample, you're going to need one large enough to come close to 39^2 (roughly 1500) items.
Distribute the items into the matching sub-subdirectories. You should end up with 39 first-level directories, each with 39 subdirectories, each holding roughly 39 files.
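Here's a minimal sketch of the depth calculation from the third step, in Python; the helper name `choose_depth` and the search limit of ten levels are my own assumptions, not anything prescribed above:

```python
def choose_depth(total_items, target_fanout):
    """Return (depth, fanout): the directory depth whose integer root of
    total_items comes closest to the desired per-directory count."""
    best_depth, best_root = 1, total_items
    for depth in range(1, 11):
        root = round(total_items ** (1.0 / depth))
        if abs(root - target_fanout) < abs(best_root - target_fanout):
            best_depth, best_root = depth, root
    return best_depth, best_root

# 60000 items with a target of ~50 per directory
print(choose_depth(60000, 50))   # -> (3, 39): cube root, fan-out ~39
```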
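And a sketch of the recursive split-and-distribute steps, again under my own assumptions (directory names built as `lower_upper` from each range's bounds, actual file creation left to the leaf level):

```python
import os

def distribute(sorted_keys, fanout, levels, base_dir):
    """Recursively split a sorted key list into `fanout` roughly equal
    ranges and create one directory per range, named lower_upper."""
    if levels == 0 or not sorted_keys:
        # Leaf level: here you would create or move one file per key.
        return
    chunk_size = -(-len(sorted_keys) // fanout)   # ceiling division
    for i in range(0, len(sorted_keys), chunk_size):
        chunk = sorted_keys[i:i + chunk_size]
        sub_dir = os.path.join(base_dir, f"{chunk[0]}_{chunk[-1]}")
        os.makedirs(sub_dir, exist_ok=True)
        distribute(chunk, fanout, levels - 1, sub_dir)

# Example: ~60000 sorted ISBN strings, two directory levels of fan-out 39,
# which works out to roughly 39 files per leaf directory.
# isbns = sorted(all_isbns)                      # however you load them
# distribute(isbns, fanout=39, levels=2, base_dir="isbn_tree")
```

Because each recursive call only sees its own chunk, the same function handles both directory levels; adding more levels is just a matter of passing a larger `levels` value.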