PDF::FromHTML and UTF8

Tobiwan has asked for the wisdom of the Perl Monks concerning the following question:

Has anyone used the module PDF::FromHTML with Unicode characters? My HTML contains German characters like 'ö', and it will contain Russian text soon. It doesn't matter whether I write them in UTF-8, as ö, or XML-encoded: all non-ASCII characters are deleted.

I tried to set a Unicode font, but the module uses the PDF::API2::Resource::Font::CoreFont::* fonts, which never have any Unicode definitions. Any ideas? Or does anybody know another module for generating PDF from HTML with Unicode characters?

Tobiwan
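
For anyone trying to reproduce the problem, the "XML-encoded" variant mentioned above can be generated from a test string with core Perl only. This is a minimal sketch: the helper name `to_numeric_entities` and the sample text are made up for illustration and are not part of PDF::FromHTML.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every non-ASCII character with a
# numeric character reference such as &#252;
sub to_numeric_entities {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord($1))/ge;
    return $text;
}

# Sample input with German and Cyrillic characters, written with
# escapes so the script does not depend on the source file encoding.
my $html = "<p>Gr\x{FC}\x{DF}e, \x{43C}\x{438}\x{440}</p>";

print to_numeric_entities($html), "\n";
```

The output is pure ASCII, so it can be fed to the converter without worrying about the file's byte encoding.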

Re: PDF::FromHTML and UTF8 by Joost (Canon) on Dec 07, 2006 at 21:57 UTC
Reading the documentation:

```perl
use PDF::FromHTML;

my $pdf = PDF::FromHTML->new( encoding => 'utf-8' );
$pdf->load_file('source.html');
$pdf->convert(
    # With PDF::API2, font names such as 'traditional' also work
    Font       => 'font.ttf',
    LineHeight => 10,
    Landscape  => 1,
);
$pdf->write_file('target.pdf');
```

It seems you might need a UTF-8 TrueType font (ttf) or PDF::API2.

"What should it profit a man, if he should win a flame war, yet lose his cool?"
Re^2: PDF::FromHTML and UTF8 by Anonymous Monk on Dec 08, 2006 at 14:58 UTC
It's not that simple with a UTF-8 TTF :( Internally the module uses character sets from PDF::API2::Resource::Font::CoreFont::*. These tables describe every character. If I use a UTF-8 TTF, I have to write those descriptions myself, and I don't want to do that for thousands of characters. Can I use PDF::API2 to create PDFs from HTML directly?
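
On the PDF::API2 question: PDF::API2 does not lay out HTML by itself, but it can embed a TrueType font with `ttfont` and draw Unicode text without any hand-written character tables. The following is only a sketch under assumptions: the sub name `write_unicode_pdf`, the font file, and the output path are hypothetical, and the font must actually contain the German and Cyrillic glyphs.

```perl
use strict;
use warnings;
use PDF::API2;

# Hypothetical helper: draw one line of Unicode text with an
# embedded TrueType font and save the result.
sub write_unicode_pdf {
    my ($font_path, $out_path, $string) = @_;

    my $pdf  = PDF::API2->new();
    my $page = $pdf->page();
    $page->mediabox(0, 0, 595, 842);       # A4 size in points

    my $font = $pdf->ttfont($font_path);   # load and embed the TTF
    my $text = $page->text();
    $text->font($font, 14);
    $text->translate(50, 780);             # position near the top left
    $text->text($string);                  # expects a decoded Perl string

    $pdf->saveas($out_path);
    return $out_path;
}

# Usage: perl script.pl /path/to/some-unicode-font.ttf target.pdf
# (both arguments are placeholders supplied by the caller)
if (@ARGV == 2) {
    write_unicode_pdf($ARGV[0], $ARGV[1],
        "Gr\x{FC}\x{DF}e, \x{43C}\x{438}\x{440}");
}
```

This only shows that the font side works in PDF::API2; turning HTML markup into positioned text would still need a converter on top.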
Re^3: PDF::FromHTML and UTF8 by Joost (Canon) on Dec 09, 2006 at 00:05 UTC
I'm sorry, I don't really understand what you are telling me. Do you mean you need to write more code to use utf8 fonts? If so, show some of it, maybe we can reduce it a bit.

"What should it profit a man, if he should win a flame war, yet lose his cool?"