Hi,

I have a script that is running on two different Linux machines. It is parsing a datafile containing some Japanese doublebyte chars (usually Shift_Jis). The problem, is that on one machine, the script isn't correctly, but it is on the other. On the bad machine, at least one portion of the script which I know is not being executed correctly, is a line which substitutes Japanese doublebyte spaces with a regular ASCII space:

$fields =~ s!¥x81¥x40! !g;

I am not sure if this is a clue to the incorrect output of the script, but I am getting warnings such as

Malformed UTF-8 character (overflow at 0xbe26482c, byte 0x20, after st +art byte 0xbf) in substitution (s///) at archives21.pm line 103, <DF> + line 118.

Line 103 simply says

$keywords =~ s!¥n|¥r!!g;

Of course, $keywords contains Japanese text as well.

Since both machines have the same OS and the same version of perl, I am wondering what is different. Although both *should* have been installed the same, when I do #perl -V on each of them, I get the output below. Obviously, the installs are different, but does anyone have any idea of what should I do to make the "bad" machine work like the "good" machine?

On the "good" machine:

Summary of my perl5 (revision 5.0 version 8 subversion 0) configuratio +n: Platform: osname=linux, osvers=2.4.21-1.1931.2.393.entsmp, archname=i386-lin +ux-thread-multi uname='linux porky.devel.redhat.com 2.4.21-1.1931.2.393.entsmp #1 +smp thu aug 14 14:47:21 edt 2003 i686 i686 i386 gnulinux ' config_args='-des -Doptimize=-O2 -g -pipe -march=i386 -mcpu=i686 - +Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red + Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux - +Dvendorprefix=/usr -Dsiteprefix=/usr -Dotherlibdirs=/usr/lib/perl5/5. +8.0 -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosui +d -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dm +an3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiono +nly -Dpager=/usr/bin/less -isr' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef useithreads=define usemulti +plicity=define useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=undef use64bitall=undef uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS + -DDEBUGGING -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_S +OURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm', optimize='-O2 -g -pipe -march=i386 -mcpu=i686', cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBUGGI +NG -fno-strict-aliasing -I/usr/local/include -I/usr/include/gdbm' ccversion='', gccversion='3.2.3 20030502 (Red Hat Linux 3.2.3-19)' +, gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=1 +2 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', + lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='gcc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib libs=-lnsl -lgdbm -ldb -ldl -lm -lpthread -lc -lcrypt -lutil perllibs=-lnsl -ldl -lm -lpthread -lc -lcrypt -lutil libc=/lib/libc-2.3.2.so, so=so, useshrplib=true, libperl=libperl.s +o gnulibc_version='2.3.2' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynami +c -Wl,-rpath,/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE' cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_ +FILES PERL_IMPLICIT_CONTEXT Locally applied patches: MAINT18379 Built under linux Compiled at Sep 15 2003 10:03:52 @INC: /usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0 /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl /usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0 .

On the "bad" machine

Summary of my perl5 (revision 5.0 version 8 subversion 0) configuratio +n: Platform: osname=linux, osvers=2.4.21-1.1931.2.393.entsmp, archname=i386-lin +ux-thread-multi uname='linux por' config_args='-des -Doptimize=-O2 -g -pipe -march=i386 -mcpu=i686 - +Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red + Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux - +Dvendorprefix=/usr -Dsiteprefix=/usr -Dotherlibdirs=/usr/lib/perl5/5. +8.0 -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosui +d -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dm +an3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiono +nly -Dpager=/usr/bin/less -isr' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef' useithreads=define usemultiplicity= useperlio= d_sfio=undef uselargefiles=define usesocks=undef use64bitint=undef use64bitall=un uselongdouble= usemymalloc=, bincompat5005=undef Compiler: cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS + -DDEBUGGING -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_S +OURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm', optimize='', cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBUGGI +NG -fno-strict-aliasing -I/usr/local/include -I/usr/include/gdbm' ccversion='', gccversion='3.2.3 20030502 (Red Hat Linux 3.2.3-19)' +, gccosandvers='' gccversion='3.2.3 200305' intsize=o, longsize=s, ptrsize=l, doublesize=8, byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=1 +2 ivtype='long' k', ivsize=4' ivtype, nvtype='double' o_no', nvsize=, Off_t='', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='gcc' l', ldflags =' -L/usr/local/lib' ldflags_use' libpth=/usr/local/lib /lib /usr/lib libs=-lnsl -lgdbm -ldb -ldl -lm -lpthread -lc -lcrypt -lutil perllibs= libc=/lib/libc-2.3.2.so, so=so, useshrplib=true, libperl=libper gnulibc_version='2.3.2' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so', d_dlsymun=undef, ccdlflags='-rdynam +ic -Wl,-rpath,/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE' cccdlflags='-fPIC' ccdlflags='-rdynamic -Wl,-rpath,/usr/lib/perl5', lddlflags='s Unicode/ +Normalize XS/A' Characteristics of this binary (from libperl): Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_ +FILES PERL_IMPLICIT_CONTEXT Locally applied patches: MAINT18379 Built under linux Compiled at Sep 15 2003 10:03:52 @INC: /usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0 /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl /usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0

In reply to Perl version / compilation by Anonymous Monk

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.