in reply to Re: PDF::API2 / unicode characters
in thread PDF::API2 / unicode characters

I believe the core fonts (i.e. what you get via $pdf->corefont(...)) simply don't work with Unicode. Reading the sources and a fair amount of experimenting essentially confirmed my suspicion that the generated font descriptions are always classical "Type 1" with an 8-bit encoding vector. I'd be happy to be proven wrong, though!
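
To illustrate why an 8-bit encoding vector is the limiting factor, here is a minimal sketch using only core Perl. It is not what PDF::API2 does internally; ISO-8859-1 is merely my stand-in for "some 8-bit encoding", and the variable names are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Greek test string written as escapes (five Greek letters plus "!")
my $ustring = "\x{03a7}\x{03b1}\x{03b9}\x{03c1}\x{03b5}!";

# Count the characters whose code point does not fit into a single byte
my $wide = grep { ord($_) > 0xFF } split //, $ustring;

# Assumption: ISO-8859-1 stands in for an 8-bit encoding vector.
# With Encode's default settings, unmappable characters are substituted.
my $bytes = encode('iso-8859-1', $ustring);

printf "%d of %d characters lie above U+00FF\n", $wide, length $ustring;
print "8-bit result: $bytes\n";
```

Only the "!" survives the round trip; the five Greek letters have no slot in a 256-entry table and get replaced by Encode's substitution character.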

OTOH, when you use an appropriate TTF font which has the required glyphs, things work just fine:

#!/usr/bin/perl
use strict;
use warnings;

use PDF::API2;

# Create a blank PDF file
my $pdf = PDF::API2->new();

# Add a blank page
my $page = $pdf->page();

my $font = $pdf->ttfont('DejaVuSans.ttf');

my $ustring = "\x{03a7}\x{03b1}\x{03b9}\x{03c1}\x{03b5}!";

# Add some text to the page
my $text = $page->text();
$text->font($font, 20);
$text->translate(80, 710);
$text->text($ustring);

# Save the PDF
$pdf->saveas('test.pdf');

DejaVu is a free font with very good Unicode coverage (see the unicover.txt file that comes with the package for details).  And if you feel like designing your own glyphs, you can even download the font's FontForge source files...

P.S.: Older versions of PDF::API2 (up to the maintainer change with 2.016) shipped with the DejaVu fonts included, so you may already have them under PDF/API2/fonts/. Otherwise, if you install them somewhere else, you may need to specify the full path to the respective .ttf file.
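
If you'd rather not hard-code the location, something along these lines could look for the file first. It's only a sketch: find_font is a hypothetical helper of mine (not part of PDF::API2), and the two system directories are assumptions about typical installations, so adjust them to your setup.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the first existing path to $file
# found in @dirs, or undef if none of the directories contains it.
sub find_font {
    my ($file, @dirs) = @_;
    for my $dir (@dirs) {
        my $path = File::Spec->catfile($dir, $file);
        return $path if -f $path;
    }
    return;
}

# Candidates: a PDF/API2/fonts directory under each @INC entry,
# followed by assumed system font locations.
my @dirs = (
    (map { File::Spec->catdir($_, 'PDF', 'API2', 'fonts') } grep { !ref } @INC),
    '/usr/share/fonts/truetype/dejavu',   # assumption: common Linux location
    'C:/Windows/Fonts',                   # assumption: Windows font folder
);

my $ttf = find_font('DejaVuSans.ttf', @dirs);
print defined $ttf
    ? "found: $ttf\n"
    : "DejaVuSans.ttf not found, please pass a full path yourself\n";
```

When a path is found, it can be handed to $pdf->ttfont($ttf) in place of the bare file name used in the script above.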

Re^3: PDF::API2 / unicode characters
by LonelyPilgrim (Beadle) on Feb 17, 2012 at 20:11 UTC

    It works! I think my (and perhaps Andrea's) mistake was assuming that the PDF::API2 "core fonts" were the same as the standard TrueType fonts of the same name -- in my case, the version of Times New Roman in the Windows Fonts folder. When I changed your font line in the script above to this:

    my $font = $pdf->ttfont('C:/Windows/Fonts/times.ttf');

    I got the appropriate Unicode text in my PDF, in Times New Roman. The standard TrueType versions of Times New Roman, Verdana, and other fonts that come with Windows and probably other systems do support at least the basic Greek character set. That was why I assumed that the core fonts "should" support Greek. Thanks for your help!

    Thanks, too, for the recommendation for the DejaVu fonts. They seem like nice ones! I am kind of a font hoarder, especially for useful Unicode ones. For the extended Greek character set, diacritics and such, I also like the New Athena fonts (one of which supports the inverted breve circumflex, as opposed to the tilde circumflex many use).

    http://www.fontspace.com/american-philological-association/new-athena-unicode

    Correction once again: New Athena is a nice one, but it's Gentium and GentiumAlt that have the inverted breve circumflex:

    http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=gentium