stolara:

I couldn't reproduce the problem here with your code. I keep getting the message:

$ perl 979790.pl
Wide character in syswrite at /usr/local/share/perl/5.10.1/PDF/Reuse.pm line 977, <FILE> line 212.

I'm supposing that either you're not handling UTF-8 cleanly enough in your program, or you're not providing adequate test data to reproduce it. (Try adding a single UTF-8 string in your program, so we don't need to locate a UTF-8 file.)
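For what it's worth, here is a minimal sketch of the kind of self-contained test data I mean. It is not a diagnosis of what PDF::Reuse is doing; it only illustrates the usual way to avoid a "Wide character" complaint, which is to encode character strings to UTF-8 octets before a raw write. The output file name (utf8_sample.txt) and the sample string are my own assumptions, and the string uses \x{...} escapes so the script itself stays plain ASCII:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical sample text, written with \x{...} escapes so that no
# UTF-8 input file (or UTF-8 source encoding) is needed.
my $content = "\x{222E} E\x{22C5}da = Q, n \x{2192} \x{221E}";

# Turn the character string into UTF-8 octets. syswrite works on bytes,
# so handing it characters above 255 is what triggers "Wide character".
my $bytes = encode('UTF-8', $content);

# Hypothetical output file, opened without any encoding layer.
my $file = 'utf8_sample.txt';
open my $fh, '>:raw', $file or die "Cannot open $file: $!";
my $written = syswrite($fh, $bytes);
defined $written or die "syswrite to $file failed: $!";
close $fh or die "Cannot close $file: $!";

printf "%d characters became %d bytes\n", length($content), $written;
```

If your program reads its text from a file instead, the same idea applies: decode on input, work with characters, and encode again just before the bytes go out.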

I went ahead and bodged something up to create a simple PDF with UTF-8 to see if I could find the text:

#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use PDF::API2;

# Test string containing the word to search for in the viewer
my $content='&#8750; E&#8901;da = Q, roboticus n &#8594; &#8734;, &#8721; f(i) = &#8719; g(i)';

# One Letter-sized page
my $pdf = PDF::API2->new();
my $page = $pdf->page();
$page->mediabox('Letter');

# TrueType font that has glyphs for the symbols above
my $font = $pdf->ttfont('/usr/share/cups/fonts/FreeMono.ttf', -encoding=>'utf-8');

# Place the string near the bottom-left corner of the page
my $text = $page->text();
$text->font($font, 20);
$text->translate(15,15);
$text->text($content);

$pdf->saveas('test_603.pdf');

It properly created the file, and Adobe Reader displayed it as I expected. But, as you reported in your setup, Adobe Reader wouldn't find a simple, little robot embedded in the text.

$ perl --version
This is perl, v5.10.1 (*) built for i486-linux-gnu-thread-multi
Linux Boink 2.6.32-39-generic #86-Ubuntu SMP Mon Feb 13 21:47:32 UTC 2012 i686 GNU/Linux
PDF::API2 2.019
Adobe Reader 9.3.2 04/01/2010

Update: Fixed the broken code tag. (I can't believe I missed it in preview!) Also: I cut & pasted the code into the text window, but it seems to have recoded the example string: '∮ E⋅da = Q, marco n → ∞, ∑ f(i) = ∏ g(i)'. I couldn't find a way to get the UTF-8 characters into the code listing. (Google wasn't much help; I couldn't exclude enough irrelevant nodes.)

...roboticus

When your only tool is a hammer, all problems look like your thumb.


In reply to Re: PDF search problem by roboticus
in thread PDF search problem by stolara
