Hello

I am attempting a task that is (for me) quite difficult: adapting the module Lingua::EN::Tagger for use with another language. To do so, I need to train the probability values on a corpus in my language. The probability values are saved in several YAML files. Unfortunately, as far as I can tell, there is no documentation describing how to do this. In fact, I have trouble understanding how the probabilities are stored at all. I have some background in linguistics and corpus linguistics, but without documentation this is hard for me.

I have seen that there is also a German version (Lingua::EN::Tagger) derived from the EN one, so the task should be doable, given a corpus and some manual tagging to train the model. I have written to the authors asking how to proceed, but have received no response.

Has anybody already tried something like this? If so, did you find any documentation online on how to train the model? Any suggestion would be very much appreciated. Best.

