in reply to Where's the leak?

One way of preventing the accumulation of memory, and one that is slightly easier than having to save-state/exec/restore-state, is to spin the part of the processing that consumes large chunks off into a separate process. When that process terminates, its memory is returned to the OS for re-use.

In your case, as you only wish to know if the file contains the string, you put a line something like this at the top of your program...

$cmd = q[ perl -MIO::File -we "exit( do{ local $/; my $io=new IO::File; $io->open( $ARGV[0] ); <$io> } =~ /$ARGV[1]/ ) " ];

If you're using Unix, you could probably split that over several lines using single quotes instead of doubles.

Then replace these 4 lines...

undef $/;
my $read = new IO::File;
if($read->open("< $file")) {
    if(<$read> =~ /$searchString/g) {

with...

if ( system( $cmd, $file, $searchString ) ) {

Note: You will need to adjust the $cmd string to suit your OS. There are also many ways to improve it.

For instance, you could do as dws suggested and process the file one line at a time rather than slurping it, and bail out as soon as you find the search string.
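A minimal sketch of that line-at-a-time approach follows. The helper name file_contains is made up for illustration, and the sketch assumes the search string never needs to match across a line boundary (slurping would still be needed for that).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original code): returns 1 if any
# single line of $file matches $pattern, otherwise 0.
sub file_contains {
    my ($file, $pattern) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    while (my $line = <$fh>) {
        if ($line =~ /$pattern/) {
            close $fh;
            return 1;    # stop reading as soon as a match is found
        }
    }
    close $fh;
    return 0;
}
```

Only one line is held in memory at a time, so the size of the file no longer matters much.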

You'll notice that I have removed the /g option from the match. There is no point in looking for more than one occurrence unless you are going to do something with the knowledge.

Also, as coded above, any failure in the script will be seen as a successful search. You should decode the return value from system and separate the status of perl itself from the value passed to exit in the one-liner. Or you could use the C-style double-negative test: exit( !... ); and then if ( !system(...) ) {.
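Decoding the return value could look something like the sketch below. The helper name decode_system_status is invented for illustration; the bit layout is the one perlfunc documents for system and $?.

```perl
use strict;
use warnings;

# Hypothetical helper: split a system() return value (the same value as $?)
# into a description and a number.
sub decode_system_status {
    my ($status) = @_;
    # -1 means the child could not be started at all.
    return ('failed to start', undef) if $status == -1;
    # The low 7 bits hold the signal number if the child was killed.
    return ('killed by signal', $status & 127) if $status & 127;
    # Otherwise the high byte holds the value passed to exit().
    return ('exited', $status >> 8);
}

# Possible use (left commented out, since $cmd and friends live in your script):
# my ($how, $value) = decode_system_status( system( $cmd, $file, $searchString ) );
```

With that in place, the caller can treat only 'exited' with the expected value as a real answer and report everything else as an error.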

This way, your maximum memory usage should be perl X 2 + the biggest file you process.


Examine what is said, not who speaks.

Re^2: Where's the leak?
by Aristotle (Chancellor) on Dec 24, 2002 at 01:34 UTC
    A good suggestion, but it's begging for some decoupling.
    sub grepfile {
        my ($rx, $file) = @_;
        my $ret = system(
            qw/ perl -0777 /,
            -e => q{$a = shift; exit (<> !~ /$a/)},
            $rx, $file
        );
        return $ret == 0 if $ret != -1;
        require Carp;
        Carp::croak "Failed invoking perl for grepfile(): $!";
    }
    Note how using system(LIST) eliminates any quoting headaches in one easy step.

    Makeshifts last the longest.

      Yup! Makes it much easier. That covers most of the "many ways to improve it" I hinted at, though I'm surprised you didn't throw in the -n and exit early. Still, it is much clearer how to add that with your version.
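One possible shape for the -n variant is sketched below. The name grepfile_early is made up, $^X stands in for "perl" so the child is the same interpreter as the parent, and the child's exit convention (0 for a match, 1 for none) is a choice made for this sketch. Note that a file the child cannot open will simply report "no match".

```perl
use strict;
use warnings;

# Hypothetical variant of grepfile(): the child reads line by line (-n)
# and leaves the loop at the first match instead of slurping the file.
sub grepfile_early {
    my ($rx, $file) = @_;
    my $ret = system(
        $^X, '-n',
        '-e', q{BEGIN { $rx = shift @ARGV }
                if (/$rx/) { $found = 1; last }
                END { $? = $found ? 0 : 1 }},
        $rx, $file
    );
    # -1 means the child never ran; otherwise 0 means a match was found.
    return $ret == 0 if $ret != -1;
    require Carp;
    Carp::croak("Failed invoking perl for grepfile_early(): $!");
}
```

The END block sets $? because that is the value the child hands back when it finishes, whether it left the loop early or ran off the end of the file.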


      Examine what is said, not who speaks.