in reply to PDF::API2 printing non ascii characters

Hello Anonymous Monk,

One possible way could be with HTML::Entities.

Sample of code:

#!/usr/bin/perl
use strict;
use warnings;
use HTML::Entities;
use open ':std', ':encoding(UTF-8)';

my $html = "Character one: &#969; character two: &#8734;";
print decode_entities($html), "\n";

__END__

$ perl test.pl
Character one: ω character two: ∞

Update: adding a complete answer, based on the sample code in PDF::API2 / unicode characters. The solution to your problem is to load a font with the appropriate font method. From the documentation, PDF::API2/FONT_METHODS:

FONT METHODS

@directories = PDF::API2::addFontDirs($dir1, $dir2, ...)
    Adds one or more directories to the search path for finding font files. Returns the list of searched directories.

$font = $pdf->corefont($fontname, [%options])
    Returns a new Adobe core font object.

In my sample code I load only one font, but if you follow the documentation you can add more. I downloaded the font files from DejaVu Fonts.

Sample of working code:

#!/usr/bin/perl
use strict;
use warnings;
use PDF::API2;
use HTML::Entities;

# Create a blank PDF file
my $pdf = PDF::API2->new();

# Add a blank page
my $page = $pdf->page();

# Load a TrueType font that contains the required glyphs
my $font = $pdf->ttfont('DejaVuSans.ttf');

# Add some text to the page
my $text = $page->text();
$text->font($font, 20);
$text->translate(80, 710);

# Turn the numeric entities into real characters before drawing them
my $html = "Character one: &#969; character two: &#8734;";
my $decoded_string = decode_entities($html);
$text->text($decoded_string);

# Save the PDF
$pdf->saveas('test.pdf');

Let us know if this works for you. BR / Thanos.

Seeking for Perl wisdom...on the process of learning...not there...yet!

Re^2: PDF::API2 printing non ascii characters
by Anonymous Monk on Mar 13, 2018 at 15:08 UTC

    What if the submitted html input is "%CF%89%20%E2%88%9E" (ω ∞) instead of the numeric codes below?

    &#969; &#8734;

    How do I decode that before handing over to the pdf text method?

      Hello again Anonymous Monk,

      In this case you can use URI::Escape. See the sample below:

      #!/usr/bin/perl
      use strict;
      use warnings;
      use URI::Escape;
      use feature 'say';
      
      my $str = "Character one: ω character two: ∞";
      my $hex_code = uri_escape( $str );
      say $hex_code;
      
      my $string = uri_unescape( $hex_code );
      say $string;
      
      __END__
      
      $ perl test.pl
      Character%20one%3A%20%CF%89%20character%20two%3A%20%E2%88%9E
      Character one: ω character two: ∞
      

      Hope this helps, BR.


        Hi, thanos1983. I'd like to point out that using non-Latin-1 characters in Perl source without use utf8; is misleading. Perhaps your example could be modified as follows (note the differences, even though the output is deceptively the same):

        #!/usr/bin/perl
        use strict;
        use warnings;
        use URI::Escape;
        use feature 'say';
        
        use utf8;
        use Encode qw/ encode decode /;
        binmode STDOUT, ':utf8';
        
        my $str = "Character one: ω character two: ∞";
        my $hex_code = uri_escape_utf8( $str );
        say $hex_code;
        
        my $string = decode 'UTF-8', uri_unescape( $hex_code );
        say $string;
        
        __END__
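
A minimal sketch of the character/byte distinction behind the example above, assuming URI::Escape and Encode are installed. The string is written with \x{...} escapes (an illustrative choice, not taken from the post) so that the result does not depend on how the source file is saved:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI::Escape qw(uri_escape uri_escape_utf8);
use Encode qw(encode);

# "ω ∞" written as code point escapes: a 3-character string
my $chars  = "\x{3C9} \x{221E}";
# The same text as UTF-8 octets: CF 89 20 E2 88 9E (6 bytes)
my $octets = encode('UTF-8', $chars);

# uri_escape_utf8 encodes the characters as UTF-8 itself, then percent-escapes
my $from_chars  = uri_escape_utf8($chars);

# uri_escape expects bytes, so it is given the already encoded octets
my $from_octets = uri_escape($octets);

# uri_escape on the character string croaks for code points above 255
my $error = eval { uri_escape($chars); 1 } ? '' : $@;

printf "%d characters, %d bytes\n", length($chars), length($octets);
print "$from_chars\n$from_octets\n";
print "uri_escape on characters failed: $error" if $error;
```

Both escaped strings come out as %CF%89%20%E2%88%9E, which is why the two versions of the script look identical on screen even though only one of them is working with real characters.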
        
        

        To answer Anonymous Monk's direct question:

        use strict;
        use warnings;
        use PDF::API2;
        use URI::Escape;
        use Encode qw/ decode /;

        my $percent_encoded_str = '%CF%89%20%E2%88%9E';
        my $octets = uri_unescape $percent_encoded_str;
        my $proper_unicode_str = decode 'UTF-8', $octets;

        my $pdf = PDF::API2-> new;
        my $page = $pdf-> page;
        my $text = $page-> text;
        my $ttf_font = $pdf-> ttfont( 'DejaVuSans.ttf' );
        $text-> font( $ttf_font, 20 );
        $text-> translate( 50, 700 );
        $text-> text( $proper_unicode_str );
        $pdf-> saveas( 'test.pdf' );

        I'm scratching my head now with the code below:

        # $sometext is a web input
        my $line = uri_escape($sometext);
        # $line prints $VAR1 = 'Hello%20%CF%89%20%E2%88%9E';
        $line = uri_unescape($line);
        # $line prints Hello Ï‰ âˆž
        # instead of Hello ω ∞

        What am I missing?
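
A likely explanation, sketched under the assumption that the web input arrives as UTF-8 bytes: uri_unescape only reverses the percent-escaping and returns raw octets, so the result still has to be decoded before Perl treats it as characters. The helper name unescape_to_characters is hypothetical and not part of URI::Escape:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI::Escape qw(uri_unescape);
use Encode qw(decode);

# Hypothetical helper: percent-encoded UTF-8 in, Perl character string out
sub unescape_to_characters {
    my ($percent_encoded) = @_;
    my $octets = uri_unescape($percent_encoded);   # still raw bytes
    return decode('UTF-8', $octets);               # bytes -> characters
}

my $line   = 'Hello%20%CF%89%20%E2%88%9E';
my $octets = uri_unescape($line);             # 12 bytes
my $chars  = unescape_to_characters($line);   # 9 characters

# Encode characters back to UTF-8 on the way out
binmode STDOUT, ':encoding(UTF-8)';
printf "octets: %d, characters: %d\n", length($octets), length($chars);
print "$chars\n";
```

The decoded string is what should go to the PDF text method. Printing the undecoded octets is what typically shows up as garbled characters when they are displayed or re-encoded as if they were Latin-1.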

Re^2: PDF::API2 printing non ascii characters
by Anonymous Monk on Mar 13, 2018 at 13:09 UTC

    It works (bouncing!!!)

    Thank you so much!!!