scunacc has asked for the wisdom of the Perl Monks concerning the following question:

Dear folks,

I have an address matching app. Runs on Linux and AIX.

On AIX it uses a reasonably fixed amount of memory.

F      S UID PID     PPID    C   PRI NI ADDR     SZ    RSS   WCHAN TTY TIME    CMD
240001 A 352 3567696 3465364 119 119 20 201d28be 30256 29960       -   2299:11 /lapps/perl32a/bin/perl ./match_nearest.pl addresses.csv

On Linux the exact same app uses up all available memory and then segvs - signal 11.

Any ideas why?

Uses the following modules:

use Text::Soundex;
use Text::Metaphone;
use Text::Compare;
use Text::LevenshteinXS qw(distance);
use String::KeyboardDistance qw(:all);

Here's how things are built:

AIX:
Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
  Platform:
    osname=aix, osvers=5.3.0.0, archname=aix-thread-multi
    uname='aix portia 3 5 000d043d4c00 '
    config_args=''
    hint=previous, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc_r', ccflags ='-D_ALL_SOURCE -D_ANSI_C_SOURCE -D_POSIX_SOURCE -qmaxmem=-1 -qnoansialias -DUSE_NATIVE_DLOPEN -DNEED_PTHREAD_INIT -I/usr/local/include -q32 -D_LARGE_FILES -qlonglong',
    optimize='-O',
    cppflags='-D_ALL_SOURCE -D_ANSI_C_SOURCE -D_POSIX_SOURCE -DUSE_NATIVE_DLOPEN -DNEED_PTHREAD_INIT -I/usr/local/include -D_ALL_SOURCE -D_ANSI_C_SOURCE -D_POSIX_SOURCE -DUSE_NATIVE_DLOPEN -DNEED_PTHREAD_INIT -I/usr/local/include -D_LARGE_FILES -D_ALL_SOURCE -D_ANSI_C_SOURCE -D_POSIX_SOURCE -DUSE_NATIVE_DLOPEN -DNEED_PTHREAD_INIT -I/usr/local/include -D_LARGE_FILES'
    ccversion='7.0.0.5', gccversion='', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=4321
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=8
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='ld', ldflags =' -brtl -bdynamic -bmaxdata:0x80000000 -L/usr/local/lib -b32'
    libpth=/lapps/local/lib /usr/local/lib /lib /usr/lib /usr/ccs/lib
    libs=-lbind -lnsl -lgdbm -ldbm -ldb -ldl -lld -lm -lcrypt -lpthreads -lc -lbsd
    perllibs=-lbind -lnsl -ldl -lld -lm -lcrypt -lpthreads -lc -lbsd
    libc=/lib/libc.a, so=a, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_aix.xs, dlext=so, d_dlsymun=undef, ccdlflags=' -bE:/lapps/perl32a/lib/5.8.8/aix-thread-multi/CORE/perl.exp -bE:/lapps/perl32a/lib/5.8.8/aix-thread-multi/CORE/perl.exp -bE:/lapps/perl32a/lib/5.8.8/aix-thread-multi/CORE/perl.exp'
    cccdlflags=' ', lddlflags='-bhalt:4 -bexpall -G -bnoentry -lpthreads -lc -L/usr/local/lib'

Characteristics of this binary (from libperl):
  Compile-time options: MULTIPLICITY PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP USE_ITHREADS USE_LARGE_FILES USE_PERLIO USE_REENTRANT_API
  Built under aix
  Compiled at Dec 21 2006 14:17:17
  %ENV:
    PERL_USERNAME="Derek Jones - CTG"
  @INC:
    /lapps/perl32a/lib/5.8.8/aix-thread-multi
    /lapps/perl32a/lib/5.8.8
    /lapps/perl32a/lib/site_perl/5.8.8/aix-thread-multi
    /lapps/perl32a/lib/site_perl/5.8.8
    /lapps/perl32a/lib/site_perl
    .

Linux:
Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
  Platform:
    osname=linux, osvers=2.6.12-12mdkcustomvm, archname=-linux-thread-multi-64int-ld
    uname='linux headnode1.uspa.ibm.com 2.6.12-12mdkcustomvm #3 smp mon mar 13 20:23:24 est 2006 i686 intel(r) xeon(tm) cpu 3.20ghz unknown gnulinux '
    config_args=''
    hint=previous, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=define use64bitall=undef uselongdouble=define
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -I/usr/include/gdbm -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm'
    ccversion='', gccversion='4.0.1 (4.0.1-5mdk for Mandriva Linux release 2006.0)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long long', ivsize=8, nvtype='long double', nvsize=12, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lndbm -lgdbm -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.3.5.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.3.5'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

Characteristics of this binary (from libperl):
  Compile-time options: MULTIPLICITY PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP THREADS_HAVE_PIDS USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES USE_LONG_DOUBLE USE_PERLIO USE_REENTRANT_API
  Built under linux
  Compiled at May 25 2006 10:19:17
  @INC:
    /lapps/perl32/lib/5.8.8/-linux-thread-multi-64int-ld
    /lapps/perl32/lib/5.8.8
    /lapps/perl32/lib/site_perl/5.8.8/-linux-thread-multi-64int-ld
    /lapps/perl32/lib/site_perl/5.8.8
    /lapps/perl32/lib/site_perl
    .

Re: AIX vs. Linux memory use
by glasswalk3r (Friar) on Jan 29, 2007 at 17:19 UTC

    Linux usually tries to use all available memory and then starts using swap. From your description, it looks like the process is being killed before that.

    My only guess is that the Linux box is using a compiled version of Perl that is not working properly, or that the modules you're using have a specific bug. Either way, this looks like an issue related more to C code than to Perl.

    Maybe you can run perl -d:DProf on the Linux box with a smaller set of data and see what happens. If the program is not killed, you may be able to identify where the program is consuming so much memory.

    Alceu Rodrigues de Freitas Junior
    ---------------------------------
    "You have enemies? Good. That means you've stood up for something, sometime in your life." - Sir Winston Churchill
      Thanks, Alceu. See my reply to Tom above. We cross-replied :-) Kind regards, Derek.
Re: AIX vs. Linux memory use
by Melly (Chaplain) on Jan 29, 2007 at 16:41 UTC

    If you don't get anything more useful from the other monks, you might try running the script under the debugger and observing memory usage in a separate window to try and track down what's going on. Sorry I can't be more help...
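As an alternative to watching the process from a separate window, the script can report its own memory use as it runs. The following is a minimal, Linux-only sketch that assumes /proc/self/status is readable; the helper name rss_kb and the loop that stands in for the real matching work are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return this process's resident set size in kB,
# read from /proc/self/status (Linux only); returns undef if unavailable.
sub rss_kb {
    open my $fh, '<', '/proc/self/status' or return;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
    }
    return;
}

# Stand-in for the real work: allocate something at each step and
# report the resident size afterwards.
my @data;
for my $step (1 .. 3) {
    push @data, 'x' x 1_000_000;
    my $kb = rss_kb();
    printf "after step %d: RSS %s kB\n", $step, defined $kb ? $kb : 'n/a';
}
```

Calling rss_kb() before and after each suspect call in the real script should show which call makes the number climb.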

    map{$a=1-$_/10;map{$d=$a;$e=$b=$_/20-2;map{($d,$e)=(2*$d*$e+$a,$e**2 -$d**2+$b);$c=$d**2+$e**2>4?$d=8:_}1..50;print$c}0..59;print$/}0..20
    Tom Melly, pm@tomandlu.co.uk
      Hi Tom,

      Thanks for the reply,

      Well, I think I *might* have narrowed it down. I think it may be worth a post on the cpan forum site for the module. I didn't go to the debugger, but I did do some incremental code removal in the Linux version.

      It *looks* like Text::Compare is leaking badly. I removed the call to that and that appears to have stabilized the memory at more or less the same as on AIX. What remains is to figure out why the two platforms behave differently.

      I thought it could be a version problem, but, no, the module versions look the same.
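One way to make that version check repeatable on both machines is a short script that prints the installed version of each module from the original post. This is only a sketch: the helper name module_version is hypothetical, and a module that fails to load is reported as not installed rather than stopping the script.

```perl
use strict;
use warnings;

# Hypothetical helper: return the installed version of a module,
# or undef if the module cannot be loaded (or declares no version).
sub module_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; $module->VERSION };
}

# The modules listed in the original post.
for my $module (qw(Text::Soundex Text::Metaphone Text::Compare
                   Text::LevenshteinXS String::KeyboardDistance)) {
    my $version = module_version($module);
    printf "%-26s %s\n", $module, defined $version ? $version : 'not installed';
}
```

Running it on each box and comparing the two outputs would confirm whether the versions really match. Identical version numbers do not rule out differences in how any compiled (XS) parts were built on each platform.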

      Anybody else hit that?

      Kind regards, Derek.
Re: AIX vs. Linux memory use
by samtregar (Abbot) on Jan 29, 2007 at 18:18 UTC
    Your Linux Perl is compiled with threading enabled. I suggest you try recompiling to match AIX. Perl's memory management is quite different with threading enabled and I wouldn't be at all surprised to find that it still has a few leaks.

    -sam

      Dear Sam,

      If you look at the perl -V output I supplied for AIX, you'll see that the AIX one is already compiled with threads.

      (From the original post:

      Here's how things are built: AIX: ... ... osname=aix ... archname=aix-thread-multi ... usethreads=define ... useithreads=define ...

      )

      (I already have a large multi-platform suite of modules and scripts that I am developing and using, and the environments are set up as nearly identically as possible between Linux and AIX, since most of the scripts are able to run on either platform. So, I appreciate the input, but that's not the issue here.)

      Kind regards

      Derek.
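Both perl -V listings in the original post show threads enabled, but they do differ in a few other build settings: use64bitint, uselongdouble, ivsize, nvsize and alignbytes. A small sketch that prints only those settings through the core Config module, so the output from the two machines can be compared line by line; the selection of keys and the name build_summary are illustrative.

```perl
use strict;
use warnings;
use Config;

# Illustrative selection of build settings worth comparing across machines.
my @keys = qw(archname usethreads useithreads usemultiplicity
              use64bitint uselongdouble usemymalloc
              ivsize nvsize ptrsize alignbytes);

# Return one formatted "key value" line per setting.
sub build_summary {
    return map {
        sprintf "%-16s %s", $_, defined $Config{$_} ? $Config{$_} : 'undef'
    } @keys;
}

print "$_\n" for build_summary();
```

Whether any of these differences is related to the leak is not established in this thread; the sketch only makes them easy to see.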

        Right you are - must have misread the AIX config lines. Still, if I were you I'd try dropping threads from the Linux config. Perhaps AIX's threading libs are more stable than Linux's.

        -sam

      Are you suggesting that, for environments that don't use threads but need high performance, it would be better to turn off threading by compiling Perl from source?

      Not that I'm a heavy thread user, but I read that Perl 5.8's support for threading is better than 5.6's. Does it still have bugs?

      Alceu Rodrigues de Freitas Junior
      ---------------------------------
      "You have enemies? Good. That means you've stood up for something, sometime in your life." - Sir Winston Churchill
        Dear Alceu,

        I don't *think* the overhead of compiling with threads on is a big issue for code that does not use threads. I am open to correction on that of course :-) I do have both a threaded and a non-threaded build available. Maybe I can test at some point.

        I've had to jump thru' hoops to get my *threaded* performance (I mean - in a genuinely threaded application) up to scratch though (basically reducing the memory footprint per thread) because it clones the environment in the new thread, including loaded modules, and so on. But I do have threaded Perl apps that run with 100's of threads just fine.
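The per-thread cost described here comes from each new ithread receiving its own copy of whatever the parent interpreter has loaded. A minimal sketch of one common way to keep that copy small, assuming the heavy modules are needed only inside the workers: create the threads before loading them and require them in the worker itself. The names run_workers and $job are hypothetical, List::Util merely stands in for a heavier module, and the code falls back to a plain loop on a perl built without ithreads.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: run $job->($id) for ids 1..$n, each in its own
# ithread when available, and return the results in id order.
sub run_workers {
    my ($n, $job) = @_;
    # Fall back to a plain loop on a perl built without ithreads.
    return [ map { $job->($_) } 1 .. $n ] unless $Config{useithreads};
    require threads;
    my @workers = map { threads->create($job, $_) } 1 .. $n;
    return [ map { $_->join } @workers ];
}

# The worker loads what it needs itself, so the parent interpreter that
# is cloned for each thread stays small.
my $job = sub {
    my ($id) = @_;
    require List::Util;                 # stand-in for a heavier module
    return List::Util::sum(1 .. $id);
};

my $results = run_workers(3, $job);
print "worker results: @$results\n";
```

The trade-off is that each worker pays the module load time itself, which may or may not matter for long-running threads.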

        But, again, if you don't actually *use* threads, I'm not sure that matters.

        Kind regards

        Derek.

        Yes, I do recommend that. I haven't run any tests lately, but historically (early v5.8) threaded Perl has been slower than non-threaded Perl even when not using threads.

        -sam