in reply to accents and diacritical marks in a web page

...renders like crap, unless I go to my browser and change the text encoding to Mac-Roman
Ok,

1. what does "like crap" mean?

2. Your HTML <meta> tag claims that the HTML printed by your Perl code is UTF-8 encoded. Usually that means you should call binmode(STDOUT,":utf8") before printing anything. You should also make sure you read any non-7-bit-ASCII input correctly.
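
A minimal sketch of what that output layer does, assuming the script itself is saved as UTF-8. The helper name bytes_sent is made up for illustration, and it uses :encoding(UTF-8), the stricter relative of :utf8. It writes through the layer into an in-memory buffer so you can inspect the bytes a browser would receive.

```perl
use strict;
use warnings;
use utf8;    # assumption: this source file is saved as UTF-8

# Hypothetical helper: print $text through a UTF-8 output layer into an
# in-memory buffer and return the raw bytes that would go over the wire.
sub bytes_sent {
    my ($text) = @_;
    my $buffer = '';
    open my $fh, '>:encoding(UTF-8)', \$buffer
        or die "cannot open in-memory buffer: $!";
    print {$fh} $text;
    close $fh;
    return $buffer;
}

my $bytes = bytes_sent("dinámico");

# The same idea in a CGI script: set the layer on STDOUT once, before any
# output, and make the declared charset agree with it.
binmode(STDOUT, ':encoding(UTF-8)');
print "Content-Type: text/html; charset=utf-8\n\n";
print "<p>dinámico</p>\n";
```

The á goes out as the two bytes C3 A1, which is what a browser expects once the page declares utf-8.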

Re^2: accents and diacritical marks in a web page
by punkish (Priest) on Sep 10, 2007 at 00:28 UTC
    1. what does "like crap" mean?
    like crap means
    explora el complejo dinámico entre la gente y la conservación como parte de la misión instead of explora el complejo dinámico entre la gente y la conservación como parte de la misión
    2. your html <meta> tag claims the HTML printed from your perl code is utf-8 encoded...
    that means I don't know UTF from my butt. I was simply trying different meta tags to try and make my web page claim something that would be understood by my browser -- I tried utf-8 along with x-mac-roman as well as iso-8859-1. I was just shooting in the dark hoping something would stick until I got the brilliant idea that I should ask those who know more than I do, that is, you monks.

    update: and note my OP above... this same utf-8 incantation exists in the plain HTML file with different languages, and that renders just fine on the same computer, same web server, everything. It's just that when put through my Perl application, the languages get mangled.

    --

    when small people start casting long shadows, it is time to go to bed
      Seeing this, and reading your update, it's clear that the Perl script itself is UTF-8 encoded. In that case you should probably use the utf8 pragma to say so (basically it just means you don't have to decode() all your string literals).

      AFAIK you can then use HTML::Entities directly, or set STDOUT to :utf8 as I suggested above, or both.
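
A small sketch of what the utf8 pragma changes, assuming the file is saved as UTF-8. It affects only string literals in the source, not data read from elsewhere. The variable names are illustrative.

```perl
use strict;
use warnings;

# Count what Perl sees in the same literal, with and without the pragma.
# Assumption: this file is saved as UTF-8, so "ó" is two bytes on disk.
my $as_bytes = do { no utf8;  length("misión") };   # source read as bytes
my $as_chars = do { use utf8; length("misión") };   # source read as UTF-8

print "without utf8: $as_bytes\n";
print "with utf8:    $as_chars\n";
```

Without the pragma the literal is seven separate bytes, and with it six characters. Text that comes in from a file, form or database is not touched by the pragma and still needs its own decoding.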

        Hi Joost,

        use utf8; # and even adding... binmode(STDOUT,":utf8");

        doesn't do it. Stuff comes out garbled. It would be nice if it worked, as I wouldn't have to use Encode::decode_utf8 on everything, but right now only the latter seems to work.

        If you have any suggestions, let me know. In the meantime I will continue to proceed with Encode and then HTML::Entities
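
For reference, a sketch of that Encode-then-HTML::Entities route. It assumes the text arrives as raw UTF-8 bytes (from a file, form field or database; the thread doesn't say which), and the byte string here is a stand-in for that input. HTML::Entities comes from CPAN, not from core Perl.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);
use HTML::Entities qw(encode_entities);

# Raw UTF-8 bytes for "dinámico", standing in for undecoded input
# (an assumption about where the data comes from).
my $raw = "din\xC3\xA1mico";

# Without decoding, each byte is treated as its own character.
my $mangled = encode_entities($raw);

# Decoding first turns the two bytes C3 A1 into the single character á.
my $correct = encode_entities(decode_utf8($raw));

print "$mangled\n";
print "$correct\n";
```

The undecoded string comes out as din&Atilde;&iexcl;mico and the decoded one as din&aacute;mico. Since the entity output is plain ASCII, it renders the same whichever charset the page declares.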

        Many thanks to you and shmem.

        --

        when small people start casting long shadows, it is time to go to bed