in reply to Print unicode strings to pdf

In the case that thanos1983's reply is not what you're looking for, maybe showing us some code would help us understand the situation better.

I think that you currently have code like the following:

    #!perl -w
    use strict;

    my $hardcoded_string_in_perl     = "\x{0052}\x{0044}\x{0024}";
    my $string_as_read_from_database = '\x{0052}\x{0044}\x{0024}';

    print $hardcoded_string_in_perl;     # RD$
    print $string_as_read_from_database; # \x{0052}\x{0044}\x{0024}

    print_to_pdf( $hardcoded_string_in_perl );     # works and shows the correct characters in the PDF
    print_to_pdf( $string_as_read_from_database ); # shows the literal escape sequences in the PDF

If that is true, you will likely just need to convert the backslashed escape sequences to Perl characters. For example, you could use something like the following subroutine:

    sub unescape {
        my( $str ) = @_;
        $str =~ s!\\x\{([0-9a-f]{4})\}!chr(hex $1)!ge;
        $str
    };

    print unescape('\x{0052}\x{0044}\x{0024}');
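The subroutine above matches exactly four lowercase hex digits. If the stored strings may also contain shorter or uppercase sequences such as '\x{a5}' or '\x{20AB}', a slightly more permissive variant could look like the sketch below. The name unescape_any is made up for this example, and the range of 1 to 6 hex digits is an assumption about the data.

```perl
use strict;
use warnings;

# Sketch: like unescape(), but accepts 1 to 6 hex digits in either case.
# unescape_any is a hypothetical name, not part of any module.
sub unescape_any {
    my ($str) = @_;
    # Replace every literal \x{HEX} sequence with the character it names.
    $str =~ s/\\x\{([0-9a-fA-F]{1,6})\}/chr(hex($1))/ge;
    return $str;
}

binmode STDOUT, ':encoding(UTF-8)';    # avoids "Wide character" warnings
print unescape_any('\x{0052}\x{0044}\x{0024}'), "\n";    # RD$
```

Text that does not contain such sequences passes through unchanged, so the function can be applied to every value read from the column.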

Re^2: Print unicode strings to pdf
by Anonymous Monk on Mar 26, 2018 at 13:56 UTC
    Hello Corion,

    With your unescaping it works! :) Thank you so much.

    This is what I see when I run a SELECT in mysql:
    +-----+------+-----------------------+
    | 1 | \x{043b}\x{0432} | |
    | 2 | \x{0042}\x{0073} | |
    | 3 | \x{005a}\x{0024} | |
    +-----+------+-----------------------+
    How should I save the unicode strings in the database so that the unescaping is not necessary?

      You haven't told us what database you are talking to and what you are using to load data into the database. We will also need to know the type of the column you're inserting into.

      When loading the data from Perl, you will need to use Encode to decode the data from its source representation to Unicode.
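As a minimal sketch of that decoding step, assuming the source delivers UTF-8 encoded octets (the sample bytes below are the UTF-8 form of the first row shown earlier):

```perl
use strict;
use warnings;
use Encode qw(decode);

# Assumption: the data arrives as UTF-8 encoded octets.
my $octets = "\xD0\xBB\xD0\xB2";          # UTF-8 bytes of U+043B U+0432
my $chars  = decode('UTF-8', $octets);     # now a string of Perl characters

printf "%d octets, %d characters\n", length($octets), length($chars);
# prints "4 octets, 2 characters"
```

If the source uses a different encoding, its name goes in place of 'UTF-8'.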

      You will likely need to tell the database driver when loading data into the database and when reading data out of the database that it should consider the column as UTF-8 encoded Unicode (if that's what your original data is).

      Alternatively, encode the data to UTF-8 encoded octets and then write those raw octets into the database. You will lose the ability to query the data from within the database in a nice way. LIKE queries and UPPER() will likely not work in the way you expect them.
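The trade-off can be illustrated in Perl itself. The sketch below encodes a string to octets and compares how uc() treats the characters and the raw octets; this is only an analogy for what a database does with UPPER() on undecoded bytes, not a statement about any particular server.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $text  = "\x{043b}\x{0432}";          # two Perl characters
my $bytes = encode('UTF-8', $text);       # four raw octets, ready for storage

# uc() knows about characters, but leaves these raw octets untouched.
my $upper_text  = uc $text;               # "\x{041b}\x{0412}"
my $upper_bytes = uc $bytes;              # same four octets as before

printf "characters changed: %s, octets changed: %s\n",
    ( $upper_text  ne $text  ? 'yes' : 'no' ),
    ( $upper_bytes ne $bytes ? 'yes' : 'no' );
```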

      When reading from the database, set up your database so it decodes the data from your database format to Unicode in Perl.

      If you are using MySQL, read the DBD::mysql documentation for its UTF-8 options.
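A connection using one of those options might be sketched as follows. The DSN, user name, and password are placeholders invented for this example; mysql_enable_utf8 is the DBD::mysql attribute that asks the driver to treat text as UTF-8 when sending and receiving data.

```perl
use strict;
use warnings;

# Hypothetical connection settings; replace them with your own.
my %config = (
    dsn      => 'DBI:mysql:database=test;host=localhost',
    user     => 'someuser',
    password => 'secret',
);

# Connection attributes: die on errors, and treat text as UTF-8.
my %attr = (
    RaiseError        => 1,
    mysql_enable_utf8 => 1,
);

sub connect_utf8 {
    require DBI;    # loaded lazily, so the sketch compiles without a database
    return DBI->connect( @config{qw(dsn user password)}, \%attr );
}
```

Calling connect_utf8() returns a database handle whose text columns should come back as decoded Perl strings, provided the table and column character sets are UTF-8 as well.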

        I am using MySQL and using the source command with a source file to insert into the table:

        insert_test (plaintext source file):
        INSERT INTO test VALUES(NULL, "\\x{20ab}");
        INSERT INTO test VALUES(NULL, "\\x{005a}\\x{0024}");
        INSERT INTO test VALUES(NULL, "\\x{0042}\\x{0073}");

        mysql> source C:/Program Files/MySQL/MySQL Server 5.0/sql/insert_test
        Could you enlighten me on a string like this '\x{005a}'? Is it called a unicode string? And is that what we should store in the database (MySQL)?
        Could you enlighten me on this related issue?

        I have a json string from perl (encoded with to_json) which goes to my Javascript code. When I view the json string, it looks like this:
        { "hex_code":"\\x{a5}" }
        I want to display the hex_code as an actual symbol in an input field. But what I see in the input field is '\x{a5}' and not the symbol. In Javascript when I set a variable like this:
        var hex_code = '\x{a5}';
        document.getElementById('some_field').value = hex_code;
        The symbol gets displayed in the input field.

        What do I need to do in the Perl code (or in the JavaScript) to display the hex code as a symbol in the input field?

        I'm quite confused by this. Hope you can shed some light :)
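One possible approach on the Perl side, sketched under the assumption that the value still holds the literal text '\x{a5}' when it is serialised: convert the escape sequence to a real character first, then encode to JSON. The sketch uses the core module JSON::PP instead of to_json, and unescape_hex is a hypothetical helper along the lines of the unescape subroutine earlier in the thread.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: turn literal \x{HEX} text into the character it names.
sub unescape_hex {
    my ($str) = @_;
    $str =~ s/\\x\{([0-9a-fA-F]{1,6})\}/chr(hex($1))/ge;
    return $str;
}

my $from_db = '\x{a5}';    # seven literal characters, as stored

# ascii() makes JSON::PP write non-ASCII characters as JSON \u escapes,
# so the output is safe regardless of the page encoding.
my $json = JSON::PP->new->ascii->canonical->encode(
    { hex_code => unescape_hex($from_db) }
);

print "$json\n";
```

A JSON parser on the JavaScript side then yields the actual character U+00A5, which can be assigned to the input field directly.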