in reply to Out of Memory Error : V-Lookup on Large Sized TEXT File

Rather than read $LARGEFILE once for each line in $REFFILELIST, wouldn't it be more efficient to read it once and check each line against each line of $REFFILELIST?

Something like:

open(FILE, $ReferenceFilePath) or die "Can't open file: $!";
chomp(@REFFILELIST = <FILE>);
close(FILE);

open(OUTFILE, ">$OUTPUTFILE") or die "Can't open output file: $!";
open(LARGEFILE, $LARGESIZEDFILE) or die "Can't open file: $!";
while (<LARGEFILE>) {
    foreach my $line (@REFFILELIST) {
        # index() returns -1 when $line is not found, so test for >= 0
        print OUTFILE $_ if index($_, $line) >= 0;
    }
}
close(LARGEFILE);
close(OUTFILE);

N.B. untested since the original is incomplete and doesn't provide any data.
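
To make the single-pass idea concrete, here is a minimal, self-contained sketch. The sub name contains_any and the sample data are made up for illustration; in the real script the grep would be replaced by the while loop over the large file. It also stops at the first matching reference string, so a line is kept only once even if several reference strings occur in it.

```perl
use strict;
use warnings;

# Return true if $line contains any of the strings in @$needles.
# index() returns -1 when the substring is absent and 0 when it sits
# at the very start of the line, so the test has to be ">= 0".
sub contains_any {
    my ($line, $needles) = @_;
    for my $needle (@$needles) {
        return 1 if index($line, $needle) >= 0;
    }
    return 0;
}

# Hypothetical sample data standing in for the reference list and the large file.
my @ref   = ('ABC123', 'XYZ789');
my @lines = ("ABC123,first row\n", "no match here\n", "row with XYZ789\n");

# Only the reference list is held in memory; each line is checked once.
my @kept = grep { contains_any($_, \@ref) } @lines;
print @kept;
```

The first sample line matches at offset 0, which is exactly the case a bare "if index(...)" would get wrong.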

Re^2: Out of Memory Error : V-Lookup on Large Sized TEXT File
by lonewolf28 (Beadle) on Apr 24, 2015 at 22:41 UTC

    Hi, with the limited information given, I have put together a script. Maybe you can use it to improve yours.

    use strict;
    use warnings;

    open( my $fh, '<', "input.txt" ) or die "Cannot open input file: $!";
    chomp( my @input_data = <$fh> );
    close($fh);

    open( my $frh, '<', "reference.txt" ) or die "Cannot open reference file: $!";
    chomp( my @ref_data = <$frh> );
    close($frh);

    # For each input line, keep the reference entries that match it exactly
    my @output = map {
        my $value = $_;
        grep { $value eq $_ } @ref_data;
    } @input_data;

    open( my $wh, '>', "output.txt" ) or die "Cannot open the output file: $!";
    # The lines were chomped above, so put the newline back when printing
    print {$wh} "$_\n" for @output;
    close($wh);
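
If exact matching is really what is wanted, a hash lookup is a possible alternative to the nested map/grep, which compares every input line against every reference line. This is only a sketch on made-up data; the sub name exact_matches is hypothetical. Note that it behaves slightly differently when the reference list contains duplicates: each input line is emitted at most once.

```perl
use strict;
use warnings;

# Return the elements of @$input that also appear in @$ref (exact string
# match), in input order. The hash gives constant-time lookups on average.
sub exact_matches {
    my ($input, $ref) = @_;
    my %in_ref;
    @in_ref{@$ref} = ();    # hash slice: one key per reference string
    return grep { exists $in_ref{$_} } @$input;
}

# Hypothetical sample data; the real script would read these from files.
my @input_data = ('apple', 'banana', 'cherry', 'apple');
my @ref_data   = ('apple', 'cherry');

print "$_\n" for exact_matches(\@input_data, \@ref_data);
```

Only the reference list needs to live in the hash, so the input file could be read line by line instead of being slurped into an array.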
Re^2: Out of Memory Error : V-Lookup on Large Sized TEXT File
by marinersk (Priest) on Apr 25, 2015 at 03:02 UTC

    Oh, sheesh, thargas -- your post made me realize I'd missed something basic in the original post. The first file he opens isn't the list of files -- it's the list of strings.

    On a gut feeling, I'd say he's buffering a Cartesian product of (lines per file) x (lines in REFFILE). Can't prove it without the actual source code -- but it sure would fit the memory consumption pattern being presented.

    This only enhances what everyone has been saying -- post the actual code, not this mock-up of it -- there's something structurally wrong and we'll need to see the steel to find the rust.