Hi bliako,

You might be interested in Section 11.13 (Markov Analysis) of my Perl 6 book (http://greenteapress.com/thinkperl6/thinkperl6.pdf), which suggests an exercise on a quite similar subject, and in Subsection A.9.5, which presents a solution to that exercise. The exercise and its solution do much simpler things than your module, but they go in the same direction: scanning a text to work out, for a given sequence of words, the probability of each word that might come next, then using those probabilities to generate random sentences that almost look like English (at least much more so than just picking random words). For example, running the program on Emma, the novel by Jane Austen, produced the following random text:

it was a black morning’s work for her. the friends from whom she could not have come to hartfield any more! dear affectionate creature! you banished to abbey mill farm. now i am afraid you are a great deal happier if she had no hesitation in approving. dear harriet, i give myself joy of so sorrowful an event;

As you can see, the result is almost syntactically correct, but not quite. And, semantically, it almost makes sense, but not quite.
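
For anyone who wants to try the idea without opening the book, here is a minimal sketch in Python. It is not the book's Perl 6 solution, just an illustration of the same technique under my own assumptions: the names build_chain and generate, the default prefix length of two words, and the toy sample text are all made up for this example. Storing every follower in a list (duplicates included) means that picking one at random reproduces the probabilities observed in the text.

```python
import random
from collections import defaultdict


def build_chain(words, order=2):
    """Map each run of `order` consecutive words to the words seen right after it.

    Followers are kept with duplicates, so a uniform random pick from the
    list matches the frequencies observed in the text.
    """
    chain = defaultdict(list)
    for i in range(len(words) - order):
        prefix = tuple(words[i:i + order])
        chain[prefix].append(words[i + order])
    return chain


def generate(chain, length=30, seed=None):
    """Random-walk the chain, starting from a randomly chosen prefix."""
    rng = random.Random(seed)
    prefix = rng.choice(list(chain))
    out = list(prefix)
    while len(out) < length:
        followers = chain.get(prefix)
        if not followers:
            # Dead end: this prefix only occurred at the very end of the text.
            break
        nxt = rng.choice(followers)
        out.append(nxt)
        # Slide the window: drop the oldest word, append the new one.
        prefix = prefix[1:] + (nxt,)
    return " ".join(out)


# Toy input (hypothetical); a whole novel gives far more interesting output.
text = "the cat sat on the mat and the cat ate the fish on the mat"
chain = build_chain(text.lower().split(), order=2)
print(generate(chain, length=12, seed=1))
```

A longer prefix (order=3 or more) gives output that stays closer to the source text, while a shorter one gives more random-looking sentences.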

I hope you find it fun.


In reply to Re: n-dimensional statistical analysis of DNA sequences (or text, or ...) by Laurent_R
in thread n-dimensional statistical analysis of DNA sequences (or text, or ...) by bliako
