Not_a_Number,
...are we allowed to sort our datastructure by frequency of paragrams..

Yes. In fact, the mystery text remains secret precisely so that this technique cannot be tuned to that one text, which would skew the results.
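For anyone wondering what sorting by frequency might look like in practice, here is a minimal sketch, not a reference solution to the challenge. It assumes a standard phone keypad layout, a dictionary containing only lowercase a-z words, and frequencies counted from some training text of your own choosing. The names `key_sequence` and `build_index`, the word list and the sample text are all hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict

# Standard phone keypad layout (assumed; digits 2-9 carry the letters).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}


def key_sequence(word):
    """Return the digit string typed to enter `word` on the keypad."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


def build_index(dictionary_words, training_text):
    """Group dictionary words by key sequence, most frequent first."""
    # Frequencies come from the training text, not from the mystery text.
    freq = Counter(training_text.lower().split())
    index = defaultdict(list)
    for word in dictionary_words:
        index[key_sequence(word)].append(word)
    for candidates in index.values():
        # Higher training frequency first; ties broken alphabetically.
        candidates.sort(key=lambda w: (-freq[w], w))
    return index


# Hypothetical data: all five words share the key sequence 4663.
words = ["good", "home", "gone", "hood", "hoof"]
sample = "gone home and home is good and home again"
index = build_index(words, sample)
print(index["4663"])  # ['home', 'gone', 'good', 'hood', 'hoof']
```

The first candidate offered for a key sequence is then simply the head of its list, so the fewer times the user has to cycle to the next candidate, the better the ordering.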

If so, does anybody know of a freely available list of word frequencies in US English?

I am fairly certain I came across one this morning while researching, but I can't be sure that it was US English.

will the mystery text consist of (a) more or less 'normal' English prose (albeit with punctuation and capitalisation removed) or (b) a more or less random string of words (in which case frequency considerations will be otiose)?

More or less US English prose.

... - which means that it would be pretty difficult to construct a coherent text of any length consisting of words only to be found in the list.

You are quite correct. The 2of12inf.txt list does a much better job in this area. On the other hand, if an entire book can be written without using the letter e in two different languages, I am sure that it will not be too difficult to provide a mystery text of between 3000 and 5000 words that meets the constraints.

Thanks once again for an interesting, thought-provoking challenge.

You're welcome.

Cheers - L~R

In reply to Re^2: Challenge: Predictive Texting by Limbic~Region
in thread Challenge: Predictive Texting by Limbic~Region
