graff has asked for the wisdom of the Perl Monks concerning the following question:
Whenever I try to use "perl -d" on a script that handles utf8 data, I get the damndest bad behavior from perl 5.8.6 on my mac. The error messages when the script dies are not always the same, but they seem to share a common theme of running out of memory. Here's the test script:

    #!/usr/bin/perl

    use strict;
    use warnings;

    my $filedata = "Here is XYZ_ABC_123456.7890 Foo Bar";
    my $mode = ( @ARGV and $ARGV[0] =~ /u/ ) ? ":utf8" : '';
    open I, "<$mode", \$filedata ;
    while (<I>) {
        my ( $id ) = ( /(XYZ_ABC_[\d.]+)/ );
        print "ID is $id\n";
    }

That will use utf8 mode to read $filedata if there's a "u" on the command line. Here's the test sequence to show what I'm up against -- everything works fine till I use the debugger with utf8 mode:

    $ perl test.pl       # no debugger, not utf8
    ID is XYZ_ABC_123456.7890
    $ perl test.pl u     # no debugger, utf8
    ID is XYZ_ABC_123456.7890
    $ perl -d test.pl    # debugging, not utf8

    Loading DB routines from perl5db.pl version 1.28
    ...
    main::(test.pl:6):      my $filedata = "Here is XYZ_ABC_123456.7890 Foo Bar";
      DB<1> c
    ID is XYZ_ABC_123456.7890
    Debugged program terminated...
      DB<1> q

    ## and now the kicker:

    $ perl -d test.pl u  # debugging, utf8

    Loading DB routines from perl5db.pl version 1.28
    ...
    main::(test.pl:6):      my $filedata = "Here is XYZ_ABC_123456.7890 Foo Bar";
      DB<1> c
    perl(18841) malloc: *** vm_allocate(size=4294832128) failed (error code=3)
    perl(18841) malloc: *** error: can't allocate region
    perl(18841) malloc: *** set a breakpoint in szone_error to debug
    Out of memory!
    Debugged program terminated.  Use q to quit or R to restart,
      use O inhibit_exit to avoid stopping after program termination,
      h q, h R or h O to get additional info.
      DB<1> q
    $ perl -v

    This is perl, v5.8.6 built for darwin-thread-multi-2level
    (with 2 registered patches, see perl -V for more detail)

Note that actual presence of wide characters is not required to break the debugger. As for "perl -V"...
    Summary of my perl5 (revision 5 version 8 subversion 6) configuration:
      Platform:
        osname=darwin, osvers=8.0, archname=darwin-thread-multi-2level
        uname='darwin b28.apple.com 8.0 darwin kernel version 7.5.0: thu mar 3 18:48:46 pst 2005; root:xnuxnu-517.99.13.obj~1release_ppc power macintosh powerpc '
        config_args='-ds -e -Dprefix=/usr -Dccflags=-g -pipe -Dldflags=-Dman3ext=3pm -Duseithreads -Duseshrplib'
        hint=recommended, useposix=true, d_sigaction=define
        usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
        useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
        use64bitint=undef use64bitall=undef uselongdouble=undef
        usemymalloc=n, bincompat5005=undef
      Compiler:
        cc='cc', ccflags ='-g -pipe -fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing -I/usr/local/include',
        optimize='-Os',
        cppflags='-no-cpp-precomp -g -pipe -fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing -I/usr/local/include'
        ccversion='', gccversion='3.3 20030304 (Apple Computer, Inc. build 1809)', gccosandvers=''
        intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=4321
        d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=8
        ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
        alignbytes=8, prototype=define
      Linker and Libraries:
        ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags ='-L/usr/local/lib'
        libpth=/usr/local/lib /usr/lib
        libs=-ldbm -ldl -lm -lc
        perllibs=-ldl -lm -lc
        libc=/usr/lib/libc.dylib, so=dylib, useshrplib=true, libperl=libperl.dylib
        gnulibc_version=''
      Dynamic Linking:
        dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
        cccdlflags=' ', lddlflags='-bundle -undefined dynamic_lookup -L/usr/local/lib'

    Characteristics of this binary (from libperl):
      Compile-time options: MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
      Locally applied patches:
        23953 - fix for File::Path::rmtree CAN-2004-0452 security issue
        33990 - fix for setuid perl security issues
      Built under darwin
      Compiled at Mar 20 2005 16:34:19
      @INC:
        /System/Library/Perl/5.8.6/darwin-thread-multi-2level
        /System/Library/Perl/5.8.6
        /Library/Perl/5.8.6/darwin-thread-multi-2level
        /Library/Perl/5.8.6
        /Library/Perl
        /Network/Library/Perl/5.8.6/darwin-thread-multi-2level
        /Network/Library/Perl/5.8.6
        /Network/Library/Perl
        /System/Library/Perl/Extras/5.8.6/darwin-thread-multi-2level
        /System/Library/Perl/Extras/5.8.6
        /Library/Perl/5.8.1/darwin-thread-multi-2level
        /Library/Perl/5.8.1
        .
Well, maybe this is moot... I just tried the same test script with perl 5.8.7 and 5.8.8 on a couple different i386-freebsd boxes, and there seems to be no problem there (so far).
(update:) when I tried a more realistic script on 5.8.7, I got a different error, which is similar to one of the variations I had seen on the mac -- line 61 of my real script had just the sort of regex match shown in the test script above:

    panic: pp_match start/end pointers at my_real_script.perl line 61, <I> chunk 2.
     at my_real_script.perl line 61

    ### or, alternatively (if I try to single-step past line 61:

    Out of memory during ridiculously large request at my_real_script.perl line 61, <I> chunk 2.

I'm puzzled why it didn't fail till chunk 2 of my input data -- i.e. the regex match succeeded once on chunk 1. Ick. (end of update)
Still, if anyone can offer suggestions on what the hell is going on here, and/or other ideas for work-arounds or fixes, I'm all ears. I could try getting a newer perl installed on my mac... (but why doesn't this just work?!)
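One way to check the observation that wide characters are not needed to trigger the problem is to see what the ":utf8" layer does to plain ASCII input. The following is only a diagnostic sketch, not a fix: `first_line_is_utf8` is a hypothetical helper that is not part of the test script above, and whether the flag it reports has anything to do with the debugger failure is an assumption that would still need confirming.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original test script): read the first line
# of an in-memory file through the given layer and report whether perl has
# marked the resulting string as character (utf8-flagged) data.
sub first_line_is_utf8 {
    my ($layer) = @_;
    my $filedata = "Here is XYZ_ABC_123456.7890 Foo Bar";
    open my $fh, "<$layer", \$filedata or die "open failed: $!";
    my $line = <$fh>;
    close $fh;
    # utf8::is_utf8 is a core function; no "use utf8" is needed to call it.
    return utf8::is_utf8($line) ? 1 : 0;
}

# Compare the plain layer with the :utf8 layer on the same ASCII-only data.
for my $layer ( '', ':utf8' ) {
    printf "layer '%s': utf8 flag is %s\n",
        $layer, first_line_is_utf8($layer) ? 'on' : 'off';
}
```

If the flag is on for the ":utf8" case even though the data is pure ASCII, then the regex in the test script is matching against a utf8-flagged string in that mode, which would at least be consistent with the failure depending on the mode rather than on the content.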
Replies are listed 'Best First'.

Re: Perl debugging vs. utf8: I'm losing
by graff (Chancellor) on Nov 08, 2006 at 06:30 UTC

Re: Perl debugging vs. utf8: I'm losing
by Tanktalus (Canon) on Nov 08, 2006 at 06:02 UTC

Re: Perl debugging vs. utf8: I'm losing
by jbrugger (Parson) on Nov 08, 2006 at 06:42 UTC

Re: Perl debugging vs. utf8: I'm losing
by jethro (Monsignor) on Nov 08, 2006 at 12:55 UTC