"But in the days of automatic translations this should be covered." From a few days ago... It has also been a problem when the language was English (insofar as the stuff was/is on CPAN/BackPAN).
| [reply] |
> has also been a problem when the language was English
Well, in the age of AI censoring to ensure political correctness this could be flagged too. (Sarcasm batteries included)
(Though I'm not too sure about biological intelligence either. I remember the case of Romanian referees attacked at a Champions League game for using the word "negru", which simply means "black" in their language and shouldn't be measured by Anglo-Saxon standards.)
Update:
https://valahia.news/romanian-referee-black-negru-demba-ba/
| [reply] |
This corresponds very nicely with a statement I read on Mastodon:
The two big problems with artificial intelligence are that it doesn’t exist and the real thing is in pretty short supply as well.
| [reply] |
Regarding machine translations, that may still be somewhat of an issue outside of European languages. Asian languages differ so radically from European languages that any machine translation between them is of poor quality. It may suffice, for those languages where translation is available, to understand the main topic, but the details are unreliable (some keywords can be totally mangled). Use of idioms will reduce translation quality considerably, and a skilled linguist could potentially manipulate the machine translation to obscure the actual intent of the message. True, great strides have been made in recent years, but there are still many shortcomings, and there are languages for which no machine translation is available--for several of which (Karen, Shan, etc.) I am already in the process of coding utilities.
Regarding inclusion of non-Roman scripts in POD, there may be some encoding issues, especially for those unaccustomed to or unprepared for working with Unicode text and fonts.
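A minimal sketch of the POD side of that preparation, assuming the file itself is saved as UTF-8 (the module name and the Thai sample word are purely illustrative, not taken from any real distribution). The `=encoding` command is the one described in perlpod; it tells POD processors how to decode the text that follows:

```pod
=encoding UTF-8

=head1 NAME

My::Example - ตัวอย่าง (hypothetical module; the Thai word means "example")

=head1 DESCRIPTION

With the encoding declared above, non-Roman text in this section
can be decoded correctly by POD formatters.

=cut
```

Whether the result displays properly still depends on the reader's terminal and fonts, which this declaration cannot fix.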
| [reply] |
There are also issues between western languages, because English is often the intermediate bridge language and lacks some grammatical distinctions that the other languages make.
As an example:
Most western languages differentiate between a respectful you and an intimate you (vous/tu, Sie/du, usted/tú ...) when addressing another person.°
But Google Translate will default to respectful, so an informal "I like you" in French will become something like "Sir/Ma'am, I like you" in Spanish, which is weird when flirting.
°) On a side note, the archaic English "thou" is the direct counterpart of tu/du, so it's the informal you. But because people mostly know it from biblical texts, they think it's a respectful form.
(Surprise: God doesn't respect you! ;)
Bottom line: the Old English of Beowulf would be a much better intermediate language for Indo-European translations.
| [reply] |
In Biblical language, pronouns do not vary between persons based on status of any type, only by gender or number. There is, of course, the familiar versus the formal, but this does not affect the language used either by or for God, for example. This is unlike several of the Asian languages, which have a "royal language" that must be used for deity or royalty and which is entirely distinct from the commoners' language. If Google Translate detects Biblical language, it will adjust the pronouns accordingly; but there is a threshold of similarity to the Biblical text beneath which this adjustment is not made.
But there are so many differences in languages. Thai and Lao, for example, simply do not have all of the vocabulary which English has, including a number of the key "glue" words that give basic structure to the grammar. For some examples, the following words have no translation equivalent: of, lest, neither, nor, either, never, etc. There is no such thing as verb conjugations, so certain concepts, such as the future perfect tense in English, are not translatable. There is no such thing as plurals, so distinguishing between a singular form and a plural form requires additional words. There is no such thing as articles (a, an, the), but there is such a thing as a noun classifier word, which varies by item and has no English equivalent.
Some words are surprisingly absent, seemingly along with their entire concept: ignore/ignorance, brother/sister (must specify if older/younger), sibling, parent (must specify father/mother), etc. Other words that are unified in English diverge, such as grandfather/grandmother (two words each, one for paternal, one for maternal). Machine translation cannot possibly add information that did not exist in the source language, so it is forced to guess: when translating from English, "brother" becomes "younger brother," "grandma" becomes "mother's mother," etc. Even a human translator is forced to make these same calls, of course, so this is not a criticism of the machine translation so much as of the potential accuracy of such a translation in the first place. Unlike a human translator, though, the machine will not ask for clarification where the opportunity exists.
The lack of the word "of" is one that irritates me. Bible translations here, for example, may translate "And he that taketh not his cross, and followeth after me, is not worthy of me" (Matthew 10:38) to something like "the one who does not receive his cross and follow me is not worthy." The "of me" is simply not translated, because...what else would it be? "for me"? "to me"? "from me"? "about me"? "by me"? "in me"? This issue with the Biblical translation holds true with multiple Asian languages, including Thai, Lao, Malay, and Bengali--none of these languages has a true equivalent for "of."
| [reply] |