Yes. In fact, the reason the mystery text remains secret is so that this technique cannot be applied to just that text, which would skew the results.
If so, does anybody know of a freely available list of word frequencies in US English?

I am fairly certain I came across one this morning when researching, but I can't be sure that it was US English.
will the mystery text** consist of (a) more or less 'normal' English prose (albeit with punctuation and capitalisation removed) or (b) a more or less random string of words (in which case frequency considerations will be otiose)?

More or less US English prose.
... - which means that it would be pretty difficult to construct a coherent text of any length consisting of words only to be found in the list.

You are quite correct. The 2of12inf.txt list does a much better job in this area. On the other hand, if an entire book can be written without using the letter e in two different languages, I am sure that it will not be too difficult to provide a mystery text of between 3000 and 5000 words that meets the constraints.
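For anyone who wants to test a candidate text against that constraint, here is a minimal sketch in Python. It assumes the text has already had punctuation and capitalisation removed, as described above; the function name `words_outside_list` and the tiny word list are hypothetical stand-ins, not part of the challenge.

```python
def words_outside_list(text, allowed_words):
    """Return the sorted, de-duplicated words in text that are not in allowed_words."""
    allowed = {w.lower() for w in allowed_words}
    # Assumes punctuation is already stripped, so whitespace splitting is enough.
    tokens = text.lower().split()
    return sorted(set(tokens) - allowed)

# Hypothetical mini word list standing in for a real file such as 2of12inf.txt.
allowed = ["the", "cat", "sat", "on", "mat"]
print(words_outside_list("the cat sat on the mat", allowed))
print(words_outside_list("the cat sat on the sofa", allowed))
```

An empty result means every word of the text appears in the list; anything returned would have to be reworded or the list extended.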
Thanks once again for an interesting, thought-provoking challenge.

You're welcome.
Cheers - L~R
In reply to Re^2: Challenge: Predictive Texting
by Limbic~Region
in thread Challenge: Predictive Texting
by Limbic~Region