in reply to De-googleizing translation scripts

Nowadays DeepL is considered to produce better results than Google Translate. It even has an HTTP API, for which you can register for free (for some value of free): https://www.deepl.com/pro-api?cta=header-pro-api

I don't know about this "trans" command of yours, but coding a wrapper script for an HTTP API is trivial in Perl.

Re^2: De-googleizing translation scripts
by Aldebaran (Curate) on Nov 06, 2022 at 04:54 UTC
    Nowadays DeepL is considered to produce better results than Google Translate. It even has an HTTP API, for which you can register for free (for some value of free):

    Thx for your reply, kikuchiyo, I think this is gonna work out for me. They do make you put a credit card on record, but I'm willing to offer that kind of skin in this game. DeepL seems more trustworthy than Google. I'm excited to see what capabilities this service can provide.

    I don't know about this "trans" command of yours, but coding a wrapper script for an HTTP API is trivial in Perl.

    Well, I don't know about "trivial." Maybe for corion and bliako, but I'm a garden-variety human who fumbles the ball and needs to consult. I was proud that I remembered corion's curl converter, from which I got this:

        #!perl
        use strict;
        use warnings;
        use HTTP::Tiny;

        my $ua = HTTP::Tiny->new( 'verify_SSL' => '1' );
        my $res = $ua->request(
            'POST' => 'https://api-free.deepl.com/v2/translate',
            {
                headers => {
                    'Authorization'  => 'DeepL-Auth-Key redacted',
                    'Content-Length' => '37',
                    'Accept'         => '*/*',
                    'Content-Type'   => 'application/x-www-form-urlencoded',
                    'User-Agent'     => 'curl/7.55.1'
                },
                content => "text=Hello\x252C\x2520world!&target_lang=DE"
            },
        );

        __END__

        Created from curl command line
        curl -X POST 'https://api-free.deepl.com/v2/translate' -H 'Authorization: DeepL-Auth-Key redacted' -d 'text=Hello%2C%20world!' -d 'target_lang=DE'

    But I run into trouble decoding the json:

        fritz@laptop:~/Documents$ ./3.trans.pl
        Hello neighbor on Watercress lane,
        {"translations":[{"detected_source_language":"EN","text":"Hola vecino de Watercress lane,"}]}content is {"translations":[{"detected_source_language":"EN","text":"Hola vecino de Watercress lane,"}]}
        data is HASH(0x55fbf7cf4140)
        ...
        Anyways, I start getting letters saying that I have not complied with this declaration, which had the bizarre predicate that we had to come to their residence to prove that we had complied. One thing I can promise you: I will never cross their threshold, because I don't want to know them at all based on what they stuffed into my mailbox.
        {"translations":[{"detected_source_language":"EN","text":"De todos modos, empiezo a recibir cartas diciendo que no he cumplido con esta declaración, que tenía el extraño predicado de que teníamos que ir a su residencia para demostrar que habíamos cumplido. Una cosa puedo prometer: Nunca cruzaré su umbral, porque no quiero conocerlos en absoluto basándome en lo que me metieron en el buzón."}]}content is {"translations":[{"detected_source_language":"EN","text":"De todos modos, empiezo a recibir cartas diciendo que no he cumplido con esta declaración, que tenía el extraño predicado de que teníamos que ir a su residencia para demostrar que habíamos cumplido. Una cosa puedo prometer: Nunca cruzaré su umbral, porque no quiero conocerlos en absoluto basándome en lo que me metieron en el buzón."}]}
        data is HASH(0x55fbf8768858)
        fritz@laptop:~/Documents$ ^C

    Source:

        #!/usr/bin/perl
        use v5.030;    # strictness implied
        use warnings;
        use Path::Tiny;
        use HTTP::Tiny;
        use JSON::MaybeXS;

        my $file_in  = path("/home/fritz/Desktop/1.enchanto.txt");
        my $file_out = path('/home/fritz/Desktop/1.enc_trans.txt');
        my $lang     = 'es';

        my $guts = $file_in->slurp_utf8;
        my @spl  = split( '\n', $guts );

        my $ua = HTTP::Tiny->new( 'verify_SSL' => '1' );

        for my $para (@spl) {
            say $para;
            my $payload    = "text=$para&target_lang=$lang";
            my $payloadlen = length($payload);
            my $response   = $ua->request(
                'POST' => 'https://api-free.deepl.com/v2/translate',
                {
                    headers => {
                        'Authorization'  => 'DeepL-Auth-Key redacted',
                        'Content-Length' => $payloadlen,
                        'Accept'         => '*/*',
                        'Content-Type'   => 'application/x-www-form-urlencoded',
                        'User-Agent'     => 'curl/7.55.1'
                    },
                    content => $payload,
                },
            );

            die "Failed!\n" unless $response->{success};
            print $response->{content} if length $response->{content};

            my $content = $response->{content};
            say "content is $content";
            my $data = decode_json($content);
            say "data is $data";
            $file_out->spew_utf8( $para, $data );
        }

        __END__
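
    A side note on the $payload line in the script above: interpolating $para straight into the form body leaves characters such as & or = unescaped, so a paragraph containing them may be cut short or misread on the server side. A minimal sketch of one way to encode the body, assuming your HTTP::Tiny is recent enough to provide www_form_urlencode (the sample paragraph is made up):

```perl
use strict;
use warnings;
use HTTP::Tiny;

my $ua = HTTP::Tiny->new;

# Made-up sample paragraph with characters that are special in a form body.
my $para = 'Salt & pepper = seasoning';
my $lang = 'ES';

# www_form_urlencode escapes keys and values and joins the pairs with "&".
my $payload = $ua->www_form_urlencode( { text => $para, target_lang => $lang } );
print "$payload\n";
```

    The resulting string can go into the content slot of the request as before. HTTP::Tiny also has a post_form method that does the encoding and sets the Content-Type header in one step.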

    I typically use bliako's software for this, but I couldn't reconcile that with HTTP::Tiny:

        use LWP::UserAgent;
        use HTTP::Request;
        use Data::Roundtrip;
        ...
        my $req = HTTP::Request->new(
        ...
        $response = $ua->request($req);
        die "Error fetching: " . $response->status_line unless $response->is_success;
        my $content = $response->decoded_content;
        my $data = Data::Roundtrip::json2perl($content);
        die "failed to parse received data:\n$content\n" unless exists $data->{'elevation'};
        return $data->{'elevation'};

    In particular I don't see how to do this without these modules:

        my $content = $response->decoded_content;
        my $data = Data::Roundtrip::json2perl($content);

    Anyways, I'm elated that I have Spanish that I don't understand already and hope that someone can help me over the finish line with the json.

    Cheers from the Rocky Mountains,

      data is HASH(0x55fbf8768858)

      You are receiving a JSON string from the remote server with your script (great!). With HTTP::Tiny that string is in $response->{content}; with LWP::UserAgent it would be $response->decoded_content. Then you correctly convert that string, using decode_json(), into a perl data structure and store it in variable $data, in this case, of type HASH. You can use this data structure ($data) as usual, e.g. my $text1 = $data->{'translations'}->[0]->{'text'}. The data structure is this, for my case:

          {
            'translations' => [
              {
                'text' => 'vencino hola',
                'detected_source_language' => 'ES'
              }
            ]
          };
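
      To make the access pattern above concrete, here is a small sketch that decodes a sample body (copied from the output earlier in this thread, not fetched live) and pulls out every translated text. It assumes JSON::PP from core; translated_texts is a hypothetical helper name, not something DeepL or any module provides:

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);    # core module since Perl 5.14

# Sample body in the shape shown in this thread (not a live API call).
my $content = '{"translations":[{"detected_source_language":"EN","text":"Hola vecino de Watercress lane,"}]}';

# decode_json expects UTF-8 encoded bytes and returns a hash reference.
my $data = decode_json($content);

# Hypothetical helper: collect the "text" field of every translation.
sub translated_texts {
    my ($decoded) = @_;
    return map { $_->{text} } @{ $decoded->{translations} // [] };
}

my @texts = translated_texts($data);
print "$_\n" for @texts;
```

      With HTTP::Tiny, $content would simply be $response->{content}, so no extra modules are needed beyond a JSON decoder.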

      If your question is how to print this data structure ($data) and get something meaningful instead of data is HASH(0x55fbf8768858), then there are lots of choices, I know of 2: Data::Dumper's Dumper() and Data::Roundtrip's perl2dump()*, which you mentioned already. Pick your poison.
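
      For the first of those choices, a minimal sketch with Data::Dumper could look like the following. The data is a hand-built copy of the structure from this thread, and the three settings are optional tweaks I picked for readability, not anything required:

```perl
use strict;
use warnings;
use Data::Dumper;

# Same shape as the decoded DeepL response in this thread (sample data).
my $data = {
    translations => [
        { detected_source_language => 'EN', text => 'Hola vecino de Watercress lane,' },
    ],
};

# Optional tweaks: stable key order, compact indentation, no "$VAR1 =" prefix.
$Data::Dumper::Sortkeys = 1;
$Data::Dumper::Indent   = 1;
$Data::Dumper::Terse    = 1;

my $dump = Dumper($data);
print "data is $dump";
```

      Interpolating $data itself into a string only stringifies the reference, which is where HASH(0x...) comes from; Dumper() walks the structure instead.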

      Of course you can write your own "data dumper", and that would be a nice climb up Recursion Peak and the Monastery is right behind you.

      Note that you have included an auth-key in your SCSE. You don't want that. *They* have now linked your CC, your translations and your monk handle and thus your comments. Brrrr (but hey the danger is not with "They" but with evil dictators outside Western Democracies /sic/ /sarcasm-off)
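
      On keeping the key out of posted code: one common approach is to read it from the environment at run time. A minimal sketch, assuming an environment variable called DEEPL_AUTH_KEY (a name I made up) and a hypothetical helper deepl_auth_header; only the "DeepL-Auth-Key ..." header format is taken from the scripts above:

```perl
use strict;
use warnings;

# Hypothetical helper: build the Authorization header value from a hash of
# environment variables, dying if the (assumed) DEEPL_AUTH_KEY entry is missing.
sub deepl_auth_header {
    my ($env) = @_;
    my $key = $env->{DEEPL_AUTH_KEY};
    die "DEEPL_AUTH_KEY is not set\n" unless defined $key && length $key;
    return "DeepL-Auth-Key $key";
}

# Usage: pass the real environment; the key never appears in the source file.
my $header = eval { deepl_auth_header( \%ENV ) };
print defined $header ? "auth header ready\n" : "no key: $@";
```

      The returned string would go where 'DeepL-Auth-Key redacted' sits in the headers hash, and the script can then be pasted anywhere without leaking the key.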

      bw, bliako

      Edit: *) Data::Roundtrip depends on Data::Dumper, so it would be simpler to use the latter. The former offers data converters and an easy way to "not-bloody-escape-unicode" which the latter does incessantly, to my eyeballs' irritation.
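
      Another way to get a readable dump with the accented characters intact, not mentioned above and offered only as a sketch, is to pretty-print the structure back to JSON with core JSON::PP. Without the ascii or latin1 options it leaves non-ASCII characters unescaped; the sample word is taken from the Spanish output earlier in the thread and written as an escape so the snippet stays ASCII-only:

```perl
use strict;
use warnings;
use JSON::PP;

# Sample data; "\x{f3}" is the character o-acute, as in "declaracion" with an accent.
my $data = {
    translations => [
        { detected_source_language => 'EN', text => "declaraci\x{f3}n" },
    ],
};

# pretty adds indentation, canonical sorts the keys; with no utf8/ascii option
# set, encode returns a character string with the accent left as-is.
my $pretty = JSON::PP->new->pretty->canonical->encode($data);

# Encode to UTF-8 only at the output boundary.
binmode STDOUT, ':encoding(UTF-8)';
print $pretty;
```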

        "an easy way to "not-bloody-escape-unicode" which the latter does incessantly, to my eyeballs' irritation."

        See Data::Dumper::AutoEncode.

        Hope this helps!


        The way forward always starts with a minimal test.
        Note that you have included an auth-key in your SCSE. You don't want that.

        Thanks everyone who told me directly that I had left my fly undone. I was amazed that the example code worked right out of the gate... I guess it wasn't just an example, I got a little fooled as I hadn't searched for my key yet. (now changed) It's a pretty slick operation at DeepL.

        First of all, bliako, thank you for your response, and it's good to hear from you. I had feared for your welfare with your proximity to ...Charybdis, but you sound no worse for the wear. I got some better results and then tried to extend it, make it more bliako-esque, and didn't quite get there. The writeup will be better in readmores:

        una velada agradable para el monasterio,

      See here for some more ideas for your client. See also WWW::Curl, if I didn't mention this already.

      And remember from the DeepL API:

      "… You should not put the key in publicly-distributed code… If your authentication key becomes compromised, you can recreate a new key and discard the old one in your account settings."

      Regards, Karl

      «The Crux of the Biscuit is the Apostrophe»

Re^2: De-googleizing translation scripts
by karlgoethebier (Abbot) on Nov 07, 2022 at 01:28 UTC
    "…better results…"

    Out of curiosity, let's compare:

    «The Crux of the Biscuit is the Apostrophe»

      I'm really impressed by both of those. (Caveat: I don't really know German, I mostly understand it by analogy with Dutch - I'm interested to know what a native speaker thinks.)

      I'm not sure, but I suspect the DeepL transcript is slightly the more idiomatic. I'm slightly confused, however, by "Sie kann die Welt beeinflussen" - the switch from "er" used in the preceding sentences to "sie" seems odd (contrasted with the consistent "es" in the Google translation), but maybe there's some grammatical requirement.

      FWIW I've been struggling over the last few months with Google and Yandex translations of English <-> Russian, in correspondence with a group of mathematicians. Both of them seem to do a pretty terrible job translating those mathematical discussions - certainly much worse than the quality of these English-German translations might cause me to expect. Restricting myself to short, idiom-free sentences does not appear to have helped.

        Well, I would say the use of personal pronouns in both translations is just plain wrong. Regards, Karl

        «The Crux of the Biscuit is the Apostrophe»