ultranerds has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to get my head around this. I have created a CSV file:

https://steampunkjunkies.net/it_other.csv

...yet when I import into Amazon, the description is screwed:

https://www.amazon.it/Steampunk-Junkies-alimentazione-connettore-micro-USB/dp/B01CO672AS/

The .csv file shows as "ANSI as UTF-8" for me ("Encoded in UTF8, without BOM" in Notepad++) ... and Excel / LibreCalc show all the characters fine.

I'm at a loss as to where this issue is coming from. Any suggestions are very welcome.

The code is pretty simple:

open (WRITE_IT, ">$CFG->{admin_root_path}/amazon_template_tmp/it_other.csv") || die "Can't write $CFG->{admin_root_path}/amazon_template_tmp/it_other.csv. Reason: $!";
binmode(WRITE_IT, ":utf8");
print WRITE_IT $the_contents;
close(WRITE_IT);


The MySQL table is stored in UTF-8, so there shouldn't be any need to convert.

Thanks!

Andy

Replies are listed 'Best First'.
Re: Stupid UTF-8 issue with CSV file
by ikegami (Patriarch) on Dec 05, 2016 at 18:17 UTC

    It is properly encoded using UTF-8. Even though "quindi è perfetto" shows up as "quindi Ú perfetto" in the page you linked, "è" is encoded as C3 A8 in the file you linked. So the question becomes: What encoding does Amazon expect? There could also be an issue in how the data was passed to Amazon.

    Update: Removed bit about the file not being a CSV file and how to fix that. That's obviously not relevant since Amazon is actually able to read your data. Replaced it with more information about the encoding.
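
    To make the byte-level claim concrete, here is a minimal sketch (not part of the original reply) that encodes "è" as UTF-8 and then shows what a reader gets if it wrongly decodes those bytes as ISO-8859-1. The choice of ISO-8859-1 is only an assumption for illustration; which encoding Amazon actually applies is not known here.

    ```perl
    use strict;
    use warnings;
    use Encode qw( encode decode );

    # "è" is U+00E8; written as an escape so the example does not depend
    # on the encoding of this source file.
    my $char = "\x{E8}";

    # Encode to UTF-8 and show the resulting bytes as hex.
    my $bytes = encode('UTF-8', $char);
    my $hex   = join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;

    # Decode the same bytes with the wrong encoding (assumed ISO-8859-1):
    # each byte becomes its own character.
    my $misread    = decode('ISO-8859-1', $bytes);
    my $codepoints = join ' ', map { sprintf('U+%04X', ord($_)) } split //, $misread;

    print "UTF-8 bytes: $hex\n";                   # UTF-8 bytes: C3 A8
    print "Misread as Latin-1: $codepoints\n";     # Misread as Latin-1: U+00C3 U+00A8
    ```

    One character turning into two (or into a different single character) is the usual sign that the writer and the reader disagree about the encoding, even when the file itself is valid.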

        What happens if you force the file to be written as UTF-8?

        open(my $output, '>:utf8', $filename) or die "Cannot write $filename: $!";

        That assumes the CSV data should be in UTF-8, which is a good idea anyway unless something restricts you from using it.

        You should be using open with three arguments anyway; it is good practice.
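
        As a rough sketch of that suggestion, the following writes a line through a three-argument open with an encoding layer and then reads the file back as raw bytes to check what landed on disk. The temporary file and the sample text are stand-ins, not the real CSV path or data, and :encoding(UTF-8) is used here as the stricter relative of :utf8.

        ```perl
        use strict;
        use warnings;
        use File::Temp qw( tempfile );

        # Hypothetical temporary file standing in for the real CSV path.
        my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
        close($tmp_fh);

        # Three-argument open: mode and layer are kept separate from the file name.
        open(my $out, '>:encoding(UTF-8)', $filename)
            or die "Cannot write $filename: $!";
        print {$out} "quindi \x{E8} perfetto\n";
        close($out) or die "Cannot close $filename: $!";

        # Read the file back as raw bytes to see what was actually written.
        open(my $in, '<:raw', $filename) or die "Cannot read $filename: $!";
        my $bytes = do { local $/; <$in> };
        close($in);

        # The character U+00E8 should appear as the two bytes C3 A8.
        print "found C3 A8\n" if $bytes =~ /\xC3\xA8/;
        ```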

        Alceu Rodrigues de Freitas Junior
        ---------------------------------
        "You have enemies? Good. That means you've stood up for something, sometime in your life." - Sir Winston Churchill

        Don't just replace :utf8 with :encoding(iso-latin-1).

        Instead of

        my $qfn = $CFG->{admin_root_path}."/amazon_template_tmp/it_other.csv";
        open(my $fh, '>:encoding(iso-latin-1)', $qfn)
            or die("Can't create \"$qfn\": $!\n");
        print($fh $the_contents);

        use

        use Encode qw( encode );
        my $qfn = $CFG->{admin_root_path}."/amazon_template_tmp/it_other.csv";
        open(my $fh, '>:raw', $qfn)
            or die("Can't create \"$qfn\": $!\n");
        print($fh encode('iso-latin-1', $the_contents, Encode::FB_HTMLCREF));

        That way, characters that can't be encoded using iso-latin-1 will be replaced with HTML escapes (e.g. &#9824; for ♠).
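
        A small self-contained sketch of that fallback, using a made-up sample string rather than the real CSV contents: "è" exists in ISO-8859-1 and becomes a single byte, while U+2660 does not and is replaced by a decimal character reference. The encoding name iso-8859-1 is used here as the canonical spelling.

        ```perl
        use strict;
        use warnings;
        use Encode qw( encode );

        # Hypothetical sample: U+00E8 fits in ISO-8859-1, U+2660 (spade) does not.
        my $text = "perfetto \x{E8} \x{2660}";

        # Some CHECK values let encode() modify its input, so working on a
        # copy is a cheap precaution.
        my $copy  = $text;
        my $bytes = encode('iso-8859-1', $copy, Encode::FB_HTMLCREF);

        # Make non-ASCII bytes visible as \xHH before printing.
        (my $visible = $bytes) =~ s/([^\x20-\x7E])/sprintf('\\x%02X', ord($1))/ge;
        print "$visible\n";    # perfetto \xE8 &#9824;
        ```

        Whether the receiving side renders such escapes as characters or shows them literally depends on how it treats the field, so this is worth checking with a single test row first.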

Re: Stupid UTF-8 issue with CSV file
by kennethk (Abbot) on Dec 05, 2016 at 18:41 UTC
    Not to be a jerk, but you've been around long enough that you should know to use <code> tags and not <pre> tags. See Code: and Tags You Should NOT Use in Markup in the Monastery.

    Update: Corrected. Thank you.


    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

      Oops - it's been a long day! (13 hours+). I've fixed it up. I think I need to call it a day and come back to this fresh tomorrow.
Re: Stupid UTF-8 issue with CSV file
by 1nickt (Canon) on Dec 05, 2016 at 18:11 UTC

    Your code looks fine to me, and I see accented characters on the amazon.it page. Can you restate your problem, please?

    The way forward always starts with a minimal test.
      Thanks for the reply. So you don't see this?

      Questo caricatore super elegante ti darà la carica extra necessaria durante la giornata. Abbastanza per una carica completa per ogni tipo di telefono cellulare. E anche per una carica ampia di tablet. Consegnato con un connector multi-uso molto pratico (certificato Apple).
      
      Pack batteria e Caricatore per iPad®, iPhone®, iPod®, e altri oggetti mobili USB (3000mAh)
      
      Tieniti il tuo iPad®, smartphone, Playstation Vita?, e ogni altro oggetto mobile completamente carico in qualunque posto andrai con questo pack caricatore!
      
      Consegnato in una custodia di velluto, e una scatola di legno (per mantenerlo al sicuro o per regalarlo!)


      For me, every single character is broken.

      Cheers

      Andy

        Hi, I had not scrolled down far enough to see the broken text: there is a large amount above that is properly rendered.

        The way forward always starts with a minimal test.