
Re^2: OT - Hemingway Editor (was: Re^4: How to count the vocabulary of an author?)

by LanX (Saint)
on Jun 14, 2021 at 15:20 UTC


in reply to Re: OT - Hemingway Editor (was: Re^4: How to count the vocabulary of an author?)
in thread How to count the vocabulary of an author?

> But generally, English is one of the easier languages to process.

For a stemmer, sure!

But the lack of grammar makes context and interpretation harder...

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

Re^3: OT - Hemingway Editor (was: Re^4: How to count the vocabulary of an author?)
by choroba (Cardinal) on Jun 14, 2021 at 15:35 UTC
    Ever tried saying it to an average English speaker?

    The lack of grammar keeps related phrases closer to each other which helps parsing a lot.

    For free word order languages, grammar seems to help, but due to homonymy (or homography) you usually don't have a solid foundation to base the grammar on.

    The most advanced systems nowadays are based on machine learning, so there's no grammar involved at all; you just need a large amount of training data.

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

      I've noticed that when a Google Translate translation (EN->NL is most conspicuous to me) is ridiculously wrong, it's often fixed a few months later. I think (or at least hope) the reason is that more data has been processed, i.e., more 'training data', or at the very least that better statistical decisions are being made.

        Google Translate is (was?) problematic when translating between two non-English languages, because it pivots through English.

        Cheers Rolf

      > Ever tried saying it to an average English speaker?

      Sure, that's how I normally greet my friend John from Buffalo! ;-) °

      > The lack of grammar keeps related phrases closer to each other which helps parsing a lot.

      It ain't necessarily so, try deciphering the headlines of newspapers like the Guardian ...

      Cheers Rolf

      °) actually he is from Hamburg NY, but that's too confusing for the locals here ...
