carcassonne has asked for the wisdom of the Perl Monks concerning the following question:

Well, in this worldwide community that is the internet, what could be considered 'foreign'? Undoubtedly something that's out of this world.

Not:

How can I use French characters in POD files like, for instance, in the following lines, so that once converted to HTML they are OK:

=head1 Filières

=head1 Objet: Création

I use emacs. When I open the POD file in hex mode, the extended characters are represented using two bytes. For instance 'è' is 0xC3 0xA8 (or 0xA8 0xC3) and not a single ASCII char above 128. So I guess some kind of encoding is going on.
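
Two bytes for one accented letter is what UTF-8 produces: 'è' is code point U+00E8, and its UTF-8 form is the byte pair 0xC3 0xA8. A minimal sketch using the core Encode module to check this (the variable names are only illustrative):

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# U+00E8 is LATIN SMALL LETTER E WITH GRAVE (è).
my $char  = "\x{E8}";
my $bytes = encode('UTF-8', $char);

# Show each byte of the encoded form in hex.
my $hex = join ' ', map { sprintf '0x%02X', ord($_) } split //, $bytes;
print "$hex\n";              # prints "0xC3 0xA8"

# Decoding the two bytes gives back a single character.
my $back = decode('UTF-8', "\xC3\xA8");
print length($back), "\n";   # prints 1
```

The byte order is fixed: the lead byte 0xC3 always comes first, so 0xA8 0xC3 would not be a valid encoding of 'è'.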

Do I have to give other arguments when using pod2html to convert the file ?

Thanks !

UPDATE

Found it.

The HTML file generated by pod2html has to be fixed by adding the following line after the DOCTYPE declaration:

<meta http-equiv="Content-type" content="text/html;charset=UTF-8" />

There must be a way to do this automatically every time pod2html is called. Maybe a wrapper...
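
One way such a wrapper could look, as a rough sketch and not a tested tool: it runs pod2html, then patches the output file. The helper name add_charset_meta and the script name are made up for illustration, and the sketch assumes pod2html is on the PATH and is called with its --infile and --outfile options. It puts the tag right after the opening <head> tag, which is where META elements normally belong, and leaves the file alone if a charset is already declared.

```perl
use strict;
use warnings;

# Hypothetical helper: add a UTF-8 charset declaration to an HTML string.
sub add_charset_meta {
    my ($html) = @_;
    my $meta = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />';

    # Leave the document alone if it already declares a charset.
    return $html if $html =~ /<meta[^>]+charset=/i;

    # Preferred spot: right after the opening <head> tag.
    return $html if $html =~ s/(<head\b[^>]*>)/$1\n$meta/i;

    # Fallback (assumption): no <head> found, so prepend the tag.
    return "$meta\n$html";
}

# Usage sketch: perl pod2html_utf8.pl in.pod out.html
if (@ARGV == 2) {
    my ($in, $out) = @ARGV;

    system('pod2html', "--infile=$in", "--outfile=$out") == 0
        or die "pod2html failed: $?";

    # Read and write raw bytes; the inserted tag is plain ASCII,
    # so the UTF-8 content passes through untouched.
    open my $rfh, '<:raw', $out or die "Cannot read $out: $!";
    my $html = do { local $/; <$rfh> };
    close $rfh;

    open my $wfh, '>:raw', $out or die "Cannot write $out: $!";
    print {$wfh} add_charset_meta($html);
    close $wfh or die "Cannot close $out: $!";
}
```

Keeping the patching in its own function means it can be run more than once on the same file without adding a second tag.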

Re: POD and foreign characters
by TimToady (Parson) on Nov 28, 2006 at 19:24 UTC
    I believe a recent version of pod2html will recognize an initial
    =encoding utf8
    and do the right thing.
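
    For the headings from the question, a minimal sketch of such a file might look like the following. It assumes the file itself is saved as UTF-8 and that the installed pod2html is new enough to honour the directive:

    ```pod
    =encoding utf8

    =head1 Filières

    =head1 Objet: Création

    =cut
    ```

    The =encoding line goes near the top, before any non-ASCII text, and like every POD command it needs a blank line after it.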
      I don't know much about how encodings work, but if what we've finally got, after all those years of extended ASCII charsets, is a universal way of representing characters using multibyte sequences, shouldn't UTF-8 then be _the_ encoding, and thus always be specified in the documents concerned?

      Let me guess... There's still no all-encompassing way of representing all known characters. 32 bits (4294967295) is still not enough.

Re: POD and foreign characters
by shmem (Chancellor) on Nov 28, 2006 at 15:25 UTC
    The sequence chr(0xc3) . chr(0xa8) is the UTF-8 encoded è. You should open your emacs in UTF-8 mode (I don't know how that's done, I use vi ;-)

    Other arguments for pod2html? But you wrote "once converted to HTML they are OK" ?

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      emacs is fine. There are no problems with emacs and these characters. The original POD file edited using emacs shows the characters as described in hex in the original query.

      The problem reads:

      How can I ... (do this, do that) ... so that once converted to HTML they are OK.

      ...Meaning that the POD HTML output is currently _not_ OK.

      Sorry for any misunderstanding.

        the POD HTML output is currently _not_ OK

        Oh, I think it probably is. It's just UTF-8. You need to configure your web server to tell browsers that these files are UTF-8. Try adding the appropriate META tag to the HTML files. See if that improves things.

        --
        <http://dave.org.uk>

        "The first rule of Perl club is you do not talk about Perl club."
        -- Chip Salzenberg