awkmonk has asked for the wisdom of the Perl Monks concerning the following question:
I'll apologise now for the length of this one, but I'm in need of some serious monkery.
I've created a script that reads in several files, splits each line into separate fields, and stores them all in a hash; it then loops over these hashes to produce an output file. All works well (for once).
The problem comes when I try to run it on a new box. Under AIX 5.1 it all works fine; under AIX 5.3 the process is killed with 'out of memory'. Both boxes have 1GB of RAM and 2GB of swap space, which should be more than enough.
Using a cut-down set of input files, the working box uses about 53MB of memory to run this job; the new one uses just over 1GB.
This points to altered memory usage, and indeed the build options are different between the two boxes. The trouble is that I have no idea what might cause this.
Any thoughts welcome.
UPDATE: upgrading to 5.8.8 did indeed cure the problem.
On the old box, perl -V gives:
Summary of my perl5 (revision 5.0 version 6 subversion 0) configuration:
Platform:
osname=aix, osvers=5.0.0.0, archname=aix
uname='aix shaq 1 5 006044854c00 '
config_args='-de'
hint=recommended, useposix=true, d_sigaction=define
usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
useperlio=undef d_sfio=undef uselargefiles=define
use64bitint=undef use64bitall=undef uselongdouble=undef usesocks=undef
Compiler:
cc='cc', optimize='-O', gccversion=
cppflags='-D_ALL_SOURCE -D_ANSI_C_SOURCE -D_POSIX_SOURCE -qmaxmem=16384'
ccflags ='-D_ALL_SOURCE -D_ANSI_C_SOURCE -D_POSIX_SOURCE -qmaxmem=16384 -q3'
stdchar='unsigned char', d_stdstdio=define, usevfork=false
intsize=4, longsize=4, ptrsize=4, doublesize=8
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=8
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
alignbytes=8, usemymalloc=n, prototype=define
Linker and Libraries:
ld='ld', ldflags ='-b32'
libpth=/lib /usr/lib /usr/ccs/lib
libs=-lbind -lnsl -lgdbm -ldbm -ldb -ldl -lld -lm -lC -lC_r -lc -lcrypt -lbv
libc=/lib/libc.a, so=a, useshrplib=false, libperl=libperl.a
Dynamic Linking:
dlsrc=dl_aix.xs, dlext=so, d_dlsymun=undef, ccdlflags=' -bE:/usr/opt/perl5'
cccdlflags=' ', lddlflags='-bhalt:4 -bM:SRE -bI:$(PERL_INC)/perl.exp -bE:$('
Characteristics of this binary (from libperl):
Compile-time options: USE_LARGE_FILES
Built under aix
Compiled at Nov 22 2000 08:49:49
@INC:
/usr/opt/perl5/lib/5.6.0/aix
/usr/opt/perl5/lib/5.6.0
/usr/opt/perl5/lib/site_perl/5.6.0/aix
/usr/opt/perl5/lib/site_perl/5.6.0
/usr/opt/perl5/lib/site_perl
.
On the new box:
Summary of my perl5 (revision 5.0 version 8 subversion 2) configuration:
Platform:
osname=aix, osvers=5.2.0.0, archname=aix-thread-multi
uname='aix perlfly 2 5 000ad7df4c00 '
config_args=''
hint=previous, useposix=true, d_sigaction=define
usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='cc_r', ccflags ='-D_ALL_SOURCE -D_ANSI_C_SOURCE -D_POSIX_SOURCE -qmaxmem=16384 -qnoansialias -DUSE_NATIVE_DLOPEN -DNEED_PTHREAD_INIT -q32 -D_LARGE_FILES -qlonglong',
optimize='-O',
cppflags='-D_ALL_SOURCE -D_ANSI_C_SOURCE -D_POSIX_SOURCE -qmaxmem=16384 -qnoansialias -DUSE_NATIVE_DLOPEN -DNEED_PTHREAD_INIT -D_ALL_SOURCE -D_ANSI_C_SOURCE -D_POSIX_SOURCE -qmaxmem=16384 -qnoansialias -DUSE_NATIVE_DLOPEN -DNEED_PTHREAD_INIT -q32 -D_LARGE_FILES -qlonglong -D_ALL_SOURCE -D_ANSI_C_SOURCE -D_POSIX_SOURCE -qmaxmem=16384 -qnoansialias -DUSE_NATIVE_DLOPEN -DNEED_PTHREAD_INIT -q32 -D_LARGE_FILES -qlonglong -D_ALL_SOURCE -D_ANSI_C_SOURCE -D_POSIX_SOURCE -qmaxmem=16384 -qnoansialias -DUSE_NATIVE_DLOPEN -DNEED_PTHREAD_INIT -q32 -D_LARGE_FILES -qlonglong -D_ALL_SOURCE -D_ANSI_C_SOURCE -D_POSIX_SOURCE -qmaxmem=16384 -qnoansialias -DUSE_NATIVE_DLOPEN -DNEED_PTHREAD_INIT -q32 -D_LARGE_FILES -qlonglong -D_ALL_SOURCE -D_ANSI_C_SOURCE -D_POSIX_SOURCE -qmaxmem=16384 -qnoansialias -DUSE_NATIVE_DLOPEN -DNEED_PTHREAD_INIT -q32 -D_LARGE_FILES -qlonglong -D_ALL_SOURCE -D_ANSI_C_SOURCE -D_POSIX_SOURCE -qmaxmem=16384 -qnoansialias -DUSE_NATIVE_DLOPEN -DNEED_PTHREAD_INIT -q32 -D_LARGE_FILES -qlonglong'
ccversion='', gccversion='', gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=4321
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=8
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='ld', ldflags =' -brtl -b32 -bmaxdata:0x80000000'
libpth=/lib /usr/lib /usr/ccs/lib
libs=-lbind -lnsl -ldbm -ldl -lld -lm -lpthreads -lc_r -lcrypt -lbsd -lPW
perllibs=-lbind -lnsl -ldl -lld -lm -lpthreads -lc_r -lcrypt -lbsd -lPW
libc=/lib/libc.a, so=a, useshrplib=true, libperl=libperl.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_aix.xs, dlext=so, d_dlsymun=undef, ccdlflags='-bE:/usr/opt/perl5/lib/5.8.2/aix-thread-multi/CORE/perl.exp -bE:/usr/opt/perl5/lib/5.8.2/aix-thread-multi/CORE/perl.exp -bE:/usr/opt/perl5/lib/5.8.2/aix-thread-multi/CORE/perl.exp -bE:/usr/opt/perl5/lib/5.8.2/aix-thread-multi/CORE/perl.exp'
cccdlflags=' ', lddlflags='-bhalt:4 -bM:SRE -bI:$(PERL_INC)/perl.exp -bE:$(BASEEXT).exp -bnoentry -lpthreads -lc_r'
Characteristics of this binary (from libperl):
Compile-time options: MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
Built under aix
Compiled at Feb 13 2004 13:18:17
@INC:
/usr/opt/perl5/lib/5.8.2/aix-thread-multi
/usr/opt/perl5/lib/5.8.2
/usr/opt/perl5/lib/site_perl/5.8.2/aix-thread-multi
/usr/opt/perl5/lib/site_perl/5.8.2
/usr/opt/perl5/lib/site_perl
.
'I think the problem lies in the fact that your data
doesn't fit my program'.
Re: Hashing Memory Usage
by Fletch (Bishop) on Jul 12, 2006 at 15:50 UTC
|
Without seeing the code in question (or at least something stripped down that produces similar bloat under the different perls) I don't know if you're going to get a good response.
Having said that, one alternative when you start running out of RAM when processing a hash is to start tossing the data into something on disk using BerkeleyDB or the like. You'll lose some speed but you shouldn't hit the same memory wall.
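As a rough illustration of the disk-backed idea, here is a minimal sketch that ties a hash to a DBM file. It uses SDBM_File only because that module ships with perl; BerkeleyDB or DB_File can be tied in much the same way. The scratch directory, the file name "fields" and the put_field/get_field helpers are made up for the example, and the two-level hash is flattened into a single composite key because a DBM file stores plain strings only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT
use SDBM_File;                  # bundled with perl
use File::Temp qw(tempdir);

# Hypothetical scratch location; a real job would pick a fixed path.
my $dir = tempdir( CLEANUP => 1 );

# Tie the hash to files on disk so the entries no longer live in process memory.
tie my %store, 'SDBM_File', "$dir/fields", O_RDWR | O_CREAT, 0666
    or die "Cannot tie SDBM file: $!";

# DBM files hold flat strings, so $h{$line}{$field} becomes one composite key.
sub put_field {
    my ( $line, $field, $value ) = @_;
    $store{"$line\t$field"} = $value;
    return;
}

sub get_field {
    my ( $line, $field ) = @_;
    return $store{"$line\t$field"};
}

for my $line ( 1 .. 100 ) {
    put_field( $line, "$_$line", $line ) for 'AA' .. 'AZ';
}

print get_field( 42, 'AB42' ), "\n";    # prints 42
```

Lookups now go through the file on every access, which is the speed cost mentioned above.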
|
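#!/usr/bin/perl -w
use strict;

# Build a two-level hash (19000 x 104 entries) and print the process's
# memory figures from ps before and after.
my %a = ();
my $res = `ps v $$`;    # memory usage before the hash is filled
print "$res\n";
for my $line ( 1 .. 19000 ){
    for ( "AA" .. "DZ" ){
        $a{$line}{"$_$line"} = $line;
    }
}
$res = `ps v $$`;       # memory usage after
print "$res\n";
exit 0;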
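For comparing the two boxes it may be easier to print a single number than the whole ps table. The sketch below wraps the same before/after measurement in two helpers. The names rss_kb and rss_growth are invented here, and it assumes a ps that accepts "-o rss= -p PID"; if the ps on a given box does not, the RSS column of `ps v` would have to be parsed instead.

```perl
use strict;
use warnings;

# Resident set size of this process in KB, or undef if ps gave nothing usable.
# Assumption: this ps understands "-o rss= -p PID".
sub rss_kb {
    my $out = `ps -o rss= -p $$ 2>/dev/null`;
    return unless defined $out && $out =~ /(\d+)/;
    return $1;
}

# Run a code ref and return how many KB the RSS grew while it ran.
sub rss_growth {
    my ($code) = @_;
    my $before = rss_kb();
    $code->();
    my $after = rss_kb();
    return unless defined $before && defined $after;
    return $after - $before;
}

my %data;
my $grew = rss_growth( sub {
    for my $line ( 1 .. 1000 ) {
        $data{$line}{"$_$line"} = $line for 'AA' .. 'DZ';
    }
} );

print defined $grew
    ? "RSS grew by $grew KB\n"
    : "could not read RSS from ps\n";
```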
Re: Hashing Memory Usage
by kwaping (Priest) on Jul 12, 2006 at 15:53 UTC
|
|
64 bit is certainly going to use more memory. Your integers will be bigger. Also, you're using 5.8.2 on the new box and 5.6 on the old. That is likely to make a difference.
Re: Hashing Memory Usage
by nothingmuch (Priest) on Jul 12, 2006 at 16:32 UTC
|
IIRC perl 5.8 changed the hash function... It may be that it's performing differently for your key set in such a way that more buckets are needed to store the values (arguably a good thing), to the point where Perl cannot allocate a contiguous chunk of memory for the bucket array (it probably needs to do that but I'm not sure).
Another problem could be that the new machine is shipping with soft/hard ulimits set differently by default. Check ulimit on the command line before running. Tweaking the hard limits may require a kernel recompilation.
Good luck!
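A quick way to run the ulimit check suggested above on both boxes is to ask the shell from inside perl and compare the output side by side. This is only a sketch: the soft_limit helper is made up, and the flag letters (-d, -s, -m) are the ones most Bourne-style shells accept, which is an assumption for any particular system.

```perl
use strict;
use warnings;

# Limits that usually matter for a large in-memory hash.
# The flag letters are an assumption about the local shell.
my %flag = ( data => 'd', stack => 's', memory => 'm' );

# Return the soft limit reported by the shell, or undef if it printed nothing.
sub soft_limit {
    my ($letter) = @_;
    my $out = `sh -c 'ulimit -$letter' 2>/dev/null`;
    return unless defined $out;
    chomp $out;
    return length $out ? $out : undef;
}

for my $name ( sort keys %flag ) {
    my $value = soft_limit( $flag{$name} );
    printf "%-6s %s\n", $name, defined $value ? $value : 'n/a';
}
```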
Re: Hashing Memory Usage
by traveler (Parson) on Jul 12, 2006 at 16:25 UTC
|
It would be most helpful if either (or both) were compiled with PERL_DEBUG_MSTATS or DEBUGGING. The former gives access to Devel::Peek's mstat(); while the latter enables -DL and the warn("!") stuff. Both allow in-program examination of memory usage. If you can rebuild, that might help.
HTH, --traveler
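Before rebuilding anything, the build options of each perl can be compared from inside a script with the core Config module, which exposes the same values that perl -V prints. The list of options below is just a guess at the ones likely to matter for memory use, and build_summary is a name invented for this sketch. A DEBUGGING build usually, though not necessarily, shows -DDEBUGGING in its compiler flags.

```perl
use strict;
use warnings;
use Config;    # core module holding the values shown by perl -V

# Build options that plausibly affect memory behaviour (a guess, not a definitive list).
my @options = qw(version archname usethreads useithreads usemultiplicity
                 useperlio usemymalloc ivsize ptrsize);

# One "name=value" string per option, with 'undef' for unset ones.
sub build_summary {
    return map {
        sprintf '%s=%s', $_, defined $Config{$_} ? $Config{$_} : 'undef'
    } @options;
}

print "$_\n" for build_summary();

# Assumption: a DEBUGGING build carries -DDEBUGGING in ccflags or optimize.
my $flags = join ' ', map { defined $_ ? $_ : '' } @Config{qw(ccflags optimize)};
print "looks like a DEBUGGING build\n" if $flags =~ /-DDEBUGGING\b/;
```

Running this under both perls and diffing the output gives a shorter list to stare at than the two full perl -V dumps.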
Re: Hashing Memory Usage
by mrd (Beadle) on Jul 12, 2006 at 18:49 UTC
|
I think this has nothing to do with perl.
You might want to compare your user's memory quotas on those machines (you might need a sysadmin for that).
Also, you might check what applications are running on those machines. It just might be that there is another app that eats a lot of memory on the AIX 5.3 box.
HTH.
|
Also take a look at any ulimit settings, or anything else that may limit how much memory is available to a process. Can you see if the new process is actually using more memory or running into other limitations?
|
I've hijacked a friendly sysadmin bloke - the ulimits are set the same way, there was no other process running on either box during these tests. I've just stumbled into a big problem though - I don't think I'm going to be able to recompile the Perl without major internal wranglings over support on the box. Arrrrrrgh.
Re: Hashing Memory Usage
by vhold (Beadle) on Jul 12, 2006 at 19:56 UTC
|
I think I ran into this a long time ago on AIX 32-bit compiled programs.
Check out this page: Large Program Support
Basically, give this a shot: make a copy of your 32-bit perl and run: /usr/ccs/bin/ldedit -bmaxdata:0x80000000/dsa perl
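To put the maxdata value in perspective, here is a small worked calculation, taking the 256MB AIX segment size mentioned elsewhere in this thread as given. The helper name maxdata_segments is made up for the example. Note that the perl -V output for the new box already lists -bmaxdata:0x80000000 in ldflags, so this setting may already be in effect there.

```perl
use strict;
use warnings;

use constant SEGMENT_BYTES => 256 * 1024 * 1024;    # 256MB per segment, as stated in the thread

# Number of 256MB segments that a -bmaxdata value sets aside for data.
sub maxdata_segments {
    my ($maxdata) = @_;          # e.g. '0x80000000'
    return hex($maxdata) / SEGMENT_BYTES;
}

for my $value ( '0x10000000', '0x80000000' ) {
    printf "%s -> %d segment(s), %d MB\n",
        $value, maxdata_segments($value), hex($value) / ( 1024 * 1024 );
}
# prints:
# 0x10000000 -> 1 segment(s), 256 MB
# 0x80000000 -> 8 segment(s), 2048 MB
```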
Re: Hashing Memory Usage
by shmem (Chancellor) on Jul 13, 2006 at 06:49 UTC
|
As an Irish joke goes: "Could you tell me the way to Tipperary?" - "Well, I wouldn't start from here".
I don't have an idea either, but I see too many changes to identify "the guilty party": OS release change, hardware change, perl release change. To make sure it isn't (or is) perl's fault I would install the same version of perl (5.8.2) on the old box, run the program and start from there.
--shmem
_($_=" "x(1<<5)."?\n".q·/)Oo. G°\ /
/\_¯/(q /
---------------------------- \__(m.====·.(_("always off the crowd"))."·
");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
|
The problem is that the old box is the current production machine. No-one would let me go anywhere near changing that. It's beginning to look like I either need new Perl binaries, which they just might let me load, or I re-write my routine.
|
Then do it the other way round. Build a 5.6.0 perl on the new box, and test that.
--shmem
|
|
Compile perl with a prefix, and store it in your home directory, where nobody else can touch it. It should be safe.
Re: Hashing Memory Usage
by freakingwildchild (Scribe) on Jul 14, 2006 at 08:36 UTC
|
I've had similar problems, in my case between 5.6.1 and 5.8.7.
My program was using twice as much memory, purely because of the difference in Perl versions. Same compilation flags, same way of installing.
|
If you're dropping 256MB core files (the segment size in AIX), then you're essentially growing past the segment size and the -bmaxdata flag should fix the issue. I've been experiencing this recently and have not yet recompiled, but I am making many efficiency changes (true iterators have come in very handy) to combat the problem right now.
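The post does not say what "true iterators" looked like in practice, so the following is only a guess at the general idea: a closure that hands back one parsed record per call, so that only the current line is held in memory rather than the whole file. The name record_iterator and the ':' separator are invented for the sketch.

```perl
use strict;
use warnings;

# Return a closure that yields one parsed record (an array ref) per call
# and undef once the input is exhausted.
sub record_iterator {
    my ( $fh, $sep ) = @_;
    return sub {
        defined( my $line = <$fh> ) or return;
        chomp $line;
        return [ split /\Q$sep\E/, $line, -1 ];
    };
}

# Demo on an in-memory filehandle; a real job would open the input file.
my $data = "1:AA:foo\n2:AB:bar\n";
open my $fh, '<', \$data or die "open: $!";

my $next = record_iterator( $fh, ':' );
while ( my $rec = $next->() ) {
    print join( '|', @$rec ), "\n";
}
# prints:
# 1|AA|foo
# 2|AB|bar
```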
Re: Hashing Memory Usage
by skyknight (Hermit) on Jul 16, 2006 at 18:09 UTC
|
It sounds like you might want something called a "database". On another random note, maybe the Perl garbage collector never runs in the case where it gets to 1GB. For the one that stays at 53MB, do you cycle through 1GB worth of data but perhaps manage to reuse memory? Sometimes garbage collectors are sloppy when they aren't pressed up against the wall.
|
Ah a red herring - it's not hashes at all! I'm changing values in a large hash as I loop through files. The problem occurs when I come to create the key for the hash. The following code works properly on 5.6 but grows on 5.8. Delete the substitution line, and hey presto, no more leak.
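use strict;

# Print memory figures from ps, run the match-then-substitute 1000 times, print again.
my $ret = `ps v $$`;
print "$ret\n";
for ( 1 .. 1000 ){
    my $key = ':PERSON-NUM(1:*):NI-NUM(1:*):';
    for my $bb ( "PERSON-NUM", "NI-NUM" ){
        if ( $key =~ /:$bb *\( *([0-9]+) *: *([0-9\*]+) *\) *:/ ){
            # Deleting this substitution line stops the memory growth seen on 5.8.
            $key =~ s/:$bb *\( *[0-9]+ *: *[0-9\*]+ *\) *:/:$bb=1:/;
        }
    }
}
$ret = `ps v $$`;
print "$ret\n";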
Now I'm confused!
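One thing that might be worth trying if upgrading is not an option: the separate match is not needed, because s/// simply does nothing when the pattern fails, and the pattern can be compiled once per field name with qr//. The sketch below produces the same key as the loop above. Whether it sidesteps the growth on 5.8.2 is untested, and normalise_key is a name made up for the example.

```perl
use strict;
use warnings;

my @names = ( 'PERSON-NUM', 'NI-NUM' );

# Compile one pattern per field name, once, outside the hot loop.
my %pattern;
for my $name (@names) {
    $pattern{$name} = qr/:\Q$name\E *\( *[0-9]+ *: *[0-9*]+ *\) *:/;
}

# Rewrite ":NAME(n:m):" to ":NAME=1:" for each known field name.
sub normalise_key {
    my ($key) = @_;
    for my $name (@names) {
        my $re = $pattern{$name};
        # s/// leaves $key untouched when nothing matches, so no prior m// test.
        $key =~ s/$re/:$name=1:/;
    }
    return $key;
}

print normalise_key(':PERSON-NUM(1:*):NI-NUM(1:*):'), "\n";
# prints :PERSON-NUM=1:NI-NUM=1:
```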
|
I ran your code with the loop iterator set to 1e6, and see no memory growth under either 5.8.6 or 5.8.8.
I seem to recall that 5.8.2 (the version shown in your OP), was a particularly buggy, short-lived release. I strongly suggest you try upgrading to something newer. My memory tells me that 5.8.3 was a pretty good build as is 5.8.6 which I still use as my default install. I encountered a few problems with 5.8.8 but I think they may have been limited to the AS-Win distribution I use.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.