Can perldoc support unicode?

woosley has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I am used to writing documentation in POD, but if I type some Chinese in the POD file, the perldoc command cannot display it correctly.

=encoding utf8

=head1 NAME
    
    &#27979;&#35797;perldoc

Running `perldoc test.pod` gives:

TEST(1)               User Contributed Perl Documentation              TEST(1)

NAME
           XXperldoc

perl v5.10.0                      2009-12-30

All the Unicode characters are replaced by 'X'. How can I fix this?
Man, perlmonk does not support unicode either?


Replies are listed 'Best First'.
Re: Can perldoc support unicode? by ikegami (Patriarch) on Dec 30, 2009 at 14:22 UTC
"Man, perlmonk does not support unicode either"

Yes and no. The form contents must be encoded using iso-8859-1, but since PerlMonks accepts HTML, you can use entities (`&` sequences) to represent any Unicode character. You tried to post characters outside of iso-8859-1, so your browser submitted them as entities. The problem is that those entities were inside code tags, so they were presumed to be text, not HTML.

测试

As for `perldoc`, `perldoc -t` worked... kinda. It issued a warning, and it will only work if you have a UTF-8 terminal. I don't know if the failure to work without `-t` is a bug, but it sounds like it from the earlier reply to your post. There's no excuse for `-t`'s behaviour, though. A bug report is in order.
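To illustrate the entity substitution described in this reply, here is a minimal sketch of what a browser roughly does with characters outside iso-8859-1. The function name `to_entities` is made up for this example, and the code makes no attempt to escape `&` or other HTML-special characters.

```perl
use strict;
use warnings;

# Replace every character above U+00FF (i.e. outside iso-8859-1)
# with a decimal numeric character reference.
# to_entities is a hypothetical helper, not part of any module.
sub to_entities {
    my ($text) = @_;
    $text =~ s/([^\x{00}-\x{FF}])/sprintf('&#%d;', ord $1)/ge;
    return $text;
}

# U+6D4B and U+8BD5 are the two Chinese characters from the question.
print to_entities("\x{6D4B}\x{8BD5}perldoc"), "\n";   # prints &#27979;&#35797;perldoc
```

Outside code tags the site renders those references as the original characters; inside code tags they are shown literally, which is what happened in the question.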
Re: Can perldoc support unicode? by almut (Canon) on Dec 30, 2009 at 13:40 UTC
The issue seems to be specific to the man page formatter. For example, `perldoc -o text ...` (i.e. Pod::Text) appears to work fine for me.

From Pod::Man:

utf8
    By default, Pod::Man produces the most conservative possible roff output to try to ensure that it will work with as many different roff implementations as possible. Many roff implementations cannot handle non-ASCII characters, so this means all non-ASCII characters are converted either to a roff escape sequence that tries to create a properly accented character (at least for troff output) or to `X`.

    If this option is set, Pod::Man will instead output UTF-8. If your roff implementation can handle it, this is the best output format to use and avoids corruption of documents containing non-ASCII characters. However, be warned that roff source with literal UTF-8 characters is not supported by many implementations and may even result in segfaults and other bad behavior.

So I guess the question becomes how to tell `perldoc` to pass the `utf8` option to the Pod::Man constructor...
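This does not answer how to make `perldoc` itself pass the option, but as a workaround sketch one can call Pod::Man directly with `utf8` set. The wrapper name `pod_to_man_utf8` and the file names `test.pod` and `test.1` are assumptions for illustration. Newer releases of Pod::Man may treat the `utf8` option differently, so check the documentation for your installed version.

```perl
use strict;
use warnings;
use Pod::Man;

# Convert a POD file to roff, keeping non-ASCII characters as UTF-8
# instead of letting Pod::Man replace them with "X".
# pod_to_man_utf8 is a hypothetical wrapper written for this example.
sub pod_to_man_utf8 {
    my ($pod_file, $man_file) = @_;

    # utf8 => 1 is the constructor option quoted from the Pod::Man docs.
    my $parser = Pod::Man->new(utf8 => 1);
    $parser->parse_from_file($pod_file, $man_file);
    return $man_file;
}

# Example file names; only runs if test.pod exists in the current directory.
pod_to_man_utf8('test.pod', 'test.1') if -e 'test.pod';
```

The resulting `test.1` can then be viewed with something like `man ./test.1`, provided the terminal is UTF-8 and the local roff implementation accepts literal UTF-8 input, which the quoted documentation warns is not always the case.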