Sainathuni has asked for the wisdom of the Perl Monks concerning the following question:

We have encountered an issue with a Perl script on a new Linux server running Fedora 38 and Perl 5.36.0, and are hoping you might be able to assist.
The exact same script, run with the same parameters on an old Linux server running CentOS and Perl 5.16.3, completes successfully. The scripts are in sync between the two servers.
Below are the warning/error messages on the new server:
1) Warning: "Wide character in print at XX_VERTICA.pm line 951."
2) Error: "Error: [Vertica] [Support] (50310) Unrecognized ICU conversion error. (SQL-HY000)"
This does not appear to be an issue with the data files, as the exact same report pulling the same data executes successfully on the old server. It seems to be an issue with how the Perl environment on the new server handles non-ASCII characters. Our database is Vertica, and the Perl script on both servers connects to the same database.

Replies are listed 'Best First'.
Re: Unrecognized ICU conversion error
by choroba (Cardinal) on Aug 09, 2023 at 20:13 UTC
    What module do you use to connect to Vertica? I've seen DBD::ODBC being used. What version of the module do you use for each perl version?
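
    One way to answer the version question on each host is a short script like the sketch below. It assumes the connection goes through DBI and DBD::ODBC (an assumption until the poster confirms it); `module_version` is a helper name made up for this example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Return the installed version of a module, or undef if it cannot be loaded.
# (module_version is a hypothetical helper, not part of any library.)
sub module_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return unless eval { require $file; 1 };
    return $module->VERSION;
}

# DBI and DBD::ODBC are assumed to be the modules in use; Encode ships with Perl.
for my $module (qw(DBI DBD::ODBC Encode)) {
    my $version = module_version($module);
    printf "%-10s %s\n", $module, defined $version ? $version : 'not installed';
}
printf "%-10s %vd\n", 'perl', $^V;
```

    Running it on both servers and comparing the output side by side shows at a glance which pieces differ.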

    Note that if the old version is buggy and stores invalid characters, the fixed version might be unable to fetch them. Rewriting the problematic columns might be needed to fix the problems.

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: Unrecognized ICU conversion error
by ewcarroll (Initiate) on Aug 10, 2023 at 19:23 UTC

    The Perl versions were inadvertently swapped in the original post, corrected info is as follows.

    CURRENT HOST
    CentOS Linux 7 (Core)
    Perl version 5.16.3
    Perl DBD::ODBC Version : 1.58
    Vertica Analytic Database v9.2.1-28
    vertica-client-8.1.1-0.x86_64

    NEW HOST
    Fedora Linux 38 (Thirty Eight)
    Perl version 5.36.0
    Perl DBD::ODBC Version : 1.61
    Vertica Analytic Database v9.2.1-28
    vertica-client-8.1.1-0.x86_64

    Example of data causing the issue: SLAPŘ


    OLD HOST LOG EXTRACT
    [06/28/2023 13:03:23] loading ul_config ... <br>
    [06/28/2023 13:03:26] user_level_l_topic.pl started: custom, 20230628130317, 6738, 42149 <br>
    [06/28/2023 13:03:26] work_dir: /project/tmp/std_user_level/custom/20230628130317/6738/42149 <br>
    [06/28/2023 13:03:26] 6738 xxxxx Weekly 31552 6582 42149 xxxxx Newsletter 99175 custom N <br>
    [06/28/2023 13:03:26] tactic_name='xxxxx_NR_381703.2' <br>
    [06/28/2023 13:03:26] create_ul_target_list <br>
    [06/28/2023 13:03:27] SELECT ANALYZE_STATISTICS('UL_TARGET_LIST') <br>
    [06/28/2023 13:03:28] fill ul_cohort... <br>
    [06/28/2023 13:06:49] 31342 records added <br>
    [06/28/2023 13:06:49] starting fill_ul_report_detail... <br>
    [06/28/2023 13:06:49] deleting ul_report_detail for REPORT_ID = 6738 and ID = 160497671... <br>
    [06/28/2023 13:06:49] 0 records deleted <br>
    [06/28/2023 13:06:49] inserting into ul_report_detail for 6738 and ID = 160497671... <br>
    [06/28/2023 13:06:49] 1 records added <br>
    [06/28/2023 13:06:49] USER <br>
    [06/28/2023 13:06:52] ACTION <br>
    [06/28/2023 13:06:54] Validating reports... <br>
    [06/28/2023 13:06:54] USER: /project/tmp/std_user_level/custom/20230628130317/6738/42149/6738_xxxxx_USER_DATA_20230628130317.txt size=6083513 bytes <br>
    [06/28/2023 13:06:54] ACTION: /project/tmp/std_user_level/custom/20230628130317/6738/42149/6738_xxxxx_USER_ACTION_DATA_20230628130317.txt size=10097634 bytes <br>
    [06/28/2023 13:06:54] file size validation - passed <br>
    [06/28/2023 13:06:55] unique user counts in USER and ACTION - passed <br>
    [06/28/2023 13:06:55] QUESTION is not applicable to this product <br>
    [06/28/2023 13:06:55] 21050 records in /project/tmp/std_user_level/custom/20230628130317/6738/42149/6738_xxxxx_USER_DATA_20230628130317.txt <br>
    [06/28/2023 13:06:55] 31342 records in /project/tmp/std_user_level/custom/20230628130317/6738/42149/6738_xxxxx_USER_ACTION_DATA_20230628130317.txt <br>
    [06/28/2023 13:06:55] update ul_run_status... <br>
    [06/28/2023 13:06:55] user_level_l_topic.pl ended <br>
    <br>
    [06/28/2023 13:07:03] Connected to v_xxxxx_node0010 <br>
    [06/28/2023 13:07:03] FILE_STATUS_ID=410508751 <br>
    [06/28/2023 13:07:03] Load Format Data... <br>
    [06/28/2023 13:07:03] Extract report data... <br>
    [06/28/2023 13:07:07] Generate data <br>
    <br>
    [06/28/2023 13:07:07] generate_data: Processing format detail 1 <br>
    [06/28/2023 13:07:07] Metrics=4 <br>
    [06/28/2023 13:07:08] generate_data: done with format detail 1:User Action Media Data <br>
    <br>
    [06/28/2023 13:07:08] generate_data: Processing format detail 2 <br>
    [06/28/2023 13:07:08] Metrics=40 <br>
    [06/28/2023 13:07:22] generate_data: done with format detail 2:User Action Data <br>
    <br>
    [06/28/2023 13:07:22] Generate files <br>
    <br>
    [06/28/2023 13:07:22] generate_file: Processing format detail 1 <br>
    [06/28/2023 13:07:22] generate_file: done with format detail 1:User Action Media Data <br>
    <br>
    [06/28/2023 13:07:22] generate_file: Processing format detail 2 <br>
    [06/28/2023 13:07:24] New file: /mnt/xxxxx/PromoUserLevelReporting/xxxxx/xxxxx/custom/xxxxx/6738_xxxxx_USER_LEVEL_20230628130701.txt <br>
    [06/28/2023 13:07:27] New file: /mnt/xxxxx/PromoUserLevelReporting/xxxxx/xxxxx/custom/xxxxx/6738_xxxxx_CTL_20230628130701.ctl <br>
    [06/28/2023 13:07:27] generate_file: done with format detail 2:User Action Data <br>
    <br>
    [06/28/2023 13:07:27] Moving 2 report files to target dir <br>
    [06/28/2023 13:07:27] mv /project/tmp/generate_report_files/20230628130701/6738/104/* '/mnt/xxxxx/PromoUserLevelReporting/xxxxx/xxxxx/custom/xxxxx' 2>>/dev/null <br>
    [06/28/2023 13:07:27] generate_report_files.pl ended <br>



    NEW HOST LOG EXTRACT
    [06/28/2023 12:56:45] loading ul_config ... <br>
    [06/28/2023 12:56:45] user_level_l_topic.pl started: custom, 20230628125643, 6738, 42149 <br>
    [06/28/2023 12:56:45] work_dir: /project/tmp/std_user_level/custom/20230628125643/6738/42149 <br>
    [06/28/2023 12:56:45] 6738 xxxxx Weekly 31552 6582 42149 xxxxx Newsletter 99175 custom N <br>
    [06/28/2023 12:56:45] tactic_name='xxxxx_NR_381703.2' <br>
    [06/28/2023 12:56:45] create_ul_target_list <br>
    [06/28/2023 12:56:45] SELECT ANALYZE_STATISTICS('UL_TARGET_LIST') <br>
    [06/28/2023 12:56:46] fill ul_cohort... <br>
    [06/28/2023 12:59:43] 31342 records added <br>
    [06/28/2023 12:59:43] starting fill_ul_report_detail... <br>
    [06/28/2023 12:59:43] deleting ul_report_detail for REPORT_ID = 6738 and ID = 160493471... <br>
    [06/28/2023 12:59:43] 0 records deleted <br>
    [06/28/2023 12:59:43] inserting into ul_report_detail for 6738 and ID = 160493471... <br>
    [06/28/2023 12:59:43] 1 records added <br>
    [06/28/2023 12:59:43] USER <br>
    Wide character in print at UL_VERTICA.pm line 951. <br>
    Wide character in print at UL_VERTICA.pm line 951. <br>
    Wide character in print at UL_VERTICA.pm line 951. <br>
    [06/28/2023 12:59:46] ACTION <br>
    [06/28/2023 12:59:49] Validating reports... <br>
    [06/28/2023 12:59:49] USER: /project/tmp/std_user_level/custom/20230628125643/6738/42149/6738_xxxxx_USER_DATA_20230628125643.txt size=6083486 bytes <br>
    [06/28/2023 12:59:49] ACTION: /project/tmp/std_user_level/custom/20230628125643/6738/42149/6738_xxxxx_USER_ACTION_DATA_20230628125643.txt size=9990561 bytes <br>
    [06/28/2023 12:59:49] file size validation - passed <br>
    [06/28/2023 12:59:50] unique user counts in USER and ACTION - passed <br>
    [06/28/2023 12:59:50] QUESTION is not applicable to this product <br>
    [06/28/2023 12:59:50] 21050 records in /project/tmp/std_user_level/custom/20230628125643/6738/42149/6738_xxxxx_USER_DATA_20230628125643.txt <br>
    [06/28/2023 12:59:50] 31342 records in /project/tmp/std_user_level/custom/20230628125643/6738/42149/6738_xxxxx_USER_ACTION_DATA_20230628125643.txt <br>
    [06/28/2023 12:59:50] update ul_run_status... <br>
    [06/28/2023 12:59:50] user_level_l_topic.pl ended <br>
    <br>
    [06/28/2023 13:00:00] Connected to v_xxxxx_node0005 <br>
    [06/28/2023 13:00:00] FILE_STATUS_ID=410504550 <br>
    [06/28/2023 13:00:00] Load Format Data... <br>
    [06/28/2023 13:00:00] Extract report data... <br>
    [06/28/2023 13:00:07] Generate data <br>
    <br>
    [06/28/2023 13:00:07] generate_data: Processing format detail 1 <br>
    [06/28/2023 13:00:07] Metrics=4 <br>
    [06/28/2023 13:00:12] generate_data: done with format detail 1:User Action Media Data <br>
    <br>
    [06/28/2023 13:00:12] generate_data: Processing format detail 2 <br>
    [06/28/2023 13:00:12] Metrics=40 <br>
    [06/28/2023 13:00:36] generate_data: done with format detail 2:User Action Data <br>
    <br>
    [06/28/2023 13:00:36] Generate files <br>
    <br>
    [06/28/2023 13:00:36] generate_file: Processing format detail 1 <br>
    [06/28/2023 13:00:37] generate_file: done with format detail 1:User Action Media Data <br>
    <br>
    [06/28/2023 13:00:37] generate_file: Processing format detail 2 <br>
    [06/28/2023 13:00:38] Error: [Vertica][Support] (50310) Unrecognized ICU conversion error. (SQL-HY000) <br>
    [06/28/2023 13:00:38] generate_report_files.pl ended <br>

      G'day ewcarroll,

      Welcome to the Monastery.

      ++ for your post but did you notice that all of your timestamps have become links?

      Links are autogenerated for any plain text in square brackets. It's better to wrap code, data, exception messages, and other program output in <code>...</code> tags. This will not create links and also handles characters that are special to HTML (e.g. &, <, and so on). See "Writeup Formatting Tips" for more details about this.

      — Ken

      Wide character in print at UL_VERTICA.pm line 951.

      As I wrote in Re^5: Unrecognized ICU conversion error, this looks like a Unicode/UTF-8 problem.

      Basically, Perl internally uses Unicode code points for characters, i.e. the "number" of a character can be greater than 255. Example:

      #!/usr/bin/env perl

      use strict;
      use warnings;
      use utf8;

      use Encode;

      # Let's use the "Medium shade" block, Unicode point 0x2592
      # https://www.unicode.org/charts/beta/nameslist/n_2580.html
      my $unicodechar = "\N{MEDIUM SHADE}";

      print "Character code: ", ord($unicodechar), "\n";

      print "Character: ", $unicodechar, "\n"; # "Wide character in print at unicode_perlmonks.pl line 15."

      my $utf8 = encode('UTF-8', $unicodechar, Encode::FB_CROAK);
      print "Character as UTF8: ", $utf8, "\n";

      In line 15, when you try to print the internal (character) representation, problems happen. Without an encoding layer, STDOUT expects every character to fit into a single byte (0 to 255), but here you try to output a code point that is too large for one byte.

      With proper encoding, in this case UTF8, you can turn the single character into a bytestream that encodes the character into multiple valid bytes. This isn't just splitting up the internal bytes, it is a "proper" encoding that works around multiple issues. Like, for example, preventing bytes that have the value of zero (so as not to mess up zero terminated string handling in C-like languages).
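
      To make the "multiple valid bytes" point concrete, here is a minimal sketch that dumps the UTF-8 bytes of two characters: the Ř (U+0158) from the sample data "SLAPŘ" and the MEDIUM SHADE block from the example above. `utf8_hex` is a helper name invented for this illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(encode);

# Return the UTF-8 encoding of a character string as space-separated hex bytes.
# (utf8_hex is a hypothetical helper for this illustration.)
sub utf8_hex {
    my ($chars) = @_;
    my $bytes = encode('UTF-8', $chars);
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

# U+0158 is LATIN CAPITAL LETTER R WITH CARON, the last character of "SLAPŘ".
print "U+0158 -> ", utf8_hex("\x{158}"),  "\n";   # C5 98
# U+2592 is MEDIUM SHADE.
print "U+2592 -> ", utf8_hex("\x{2592}"), "\n";   # E2 96 92
# Plain ASCII stays one byte per character.
print "A      -> ", utf8_hex("A"),        "\n";   # 41
```

      Every byte of a multi-byte sequence has its high bit set, so none of them can be mistaken for an ASCII character or a NUL terminator.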

      Tom Scott has a nice video on this if you are interested how this actually works: Characters, Symbols and the Unicode Miracle - Computerphile

      PerlMonks XP is useless? Not anymore: XPD - Do more with your PerlMonks XP

      It's fine to update your post; however, it's important to indicate that you've done so — especially when your update invalidates an existing response. See "How do I change/delete my post?" for more about that.

      I also note that all lines of your log extracts end with " <br>". I suspect these don't reflect the original and were probably added initially to format the log data for paragraph text.

      I am aware that this was your first post here. My comments are intended to be informational; not any kind of rebuke. :-)

      — Ken

        The 'Unrecognized ICU conversion error' disappeared when the Vertica client was upgraded to 23.3.0. However, the 'Wide character in print at XXXX line XX' warning now appears at line 137 in addition to line 951. This does indeed look like a Unicode/UTF-8 issue; any help with a solution would be appreciated.
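
        The usual way to silence "Wide character in print" is to put an encoding layer on the handle being printed to. The code at lines 137 and 951 has not been posted, so the following is only a sketch under the assumption that those lines print database values to a file handle (or STDOUT) opened without such a layer. `write_utf8` and the scratch file are made up for the example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write character strings to a file through a UTF-8 encoding layer.
# (write_utf8 is a hypothetical helper, not code from UL_VERTICA.pm.)
sub write_utf8 {
    my ($path, @lines) = @_;
    open my $fh, '>:encoding(UTF-8)', $path or die "Cannot open $path: $!";
    print {$fh} "$_\n" for @lines;
    close $fh or die "Cannot close $path: $!";
    return -s $path;    # file size in bytes
}

# For output that goes to STDOUT, the layer can be added with binmode.
binmode(STDOUT, ':encoding(UTF-8)');

# Scratch file for the demonstration; removed automatically at exit.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

my $size = write_utf8($path, "SLAP\x{158}");
print "wrote 'SLAP\x{158}' as $size bytes\n";
```

        With the layer in place, Perl encodes each character on output, and the warning should no longer be triggered for that handle. Whether the report files are actually expected to be UTF-8 is something only the consumers of those files can confirm.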
Re: Unrecognized ICU conversion error
by Anonymous Monk on Aug 09, 2023 at 19:54 UTC
    So it works on the Perl released last year and doesn't work on the Perl released 10 years ago?

    Well there's your problem.

        No, it comes with a modern Perl. Fedora 38 (which Sainathuni claims to be using) comes with 5.36.1. I don't know why anyone would choose to run 5.16 on such a modern O/S. Perhaps Sainathuni is simply mistaken.


        🦛

        5.34.1