dexter29 has asked for the wisdom of the Perl Monks concerning the following question:

Three years ago I wrote a script to merge and sort 2 logfiles, and it worked perfectly until this week.

Unfortunately, each of the 2 logfiles is now nearing 1 GB in size, and when I try to combine the files into an array (to sort later), the script crashes in the code below (with no error). I have run the script with smaller logfiles, and it combines and sorts them correctly.

Any ideas on a workaround?

The code where I try to combine the files is below.

sub GroupRows {
    my ($LogName) = @_;
    open LOGA, "< $LogName" or die "Cannot open $LogName\n$!\n";
    while (! eof(LOGA)) {
        chomp($line_a = <LOGA>);
        if (($line_a =~ /^\#/) {
            undef $line_a;
            next;
        }
        else {
            push(@LogVals, $line_a);
        }
    }
    close(LOGA);
}

Replies are listed 'Best First'.
Re: Issue merging 2 log files.
by Fletch (Bishop) on Jan 09, 2008 at 15:47 UTC

    Rather than merging in memory, open the two input files and keep the next line from each available. Write the next line from those two choices directly to your output file and replace it from the corresponding source file. Repeat until you don't have any more source lines available.
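
    The approach described above can be sketched roughly as follows. This is only a minimal sketch, not the original script: the names next_line, sort_key and merge_logs are made up for illustration, and it assumes both files are already sorted and that whole lines order correctly under plain string comparison (for example, each line starts with a sortable timestamp). Comment lines are skipped the same way GroupRows skips them.

```perl
use strict;
use warnings;

# Return the next non-comment line from a filehandle, or undef at EOF.
sub next_line {
    my ($fh) = @_;
    while (defined(my $line = <$fh>)) {
        next if $line =~ /^#/;    # skip comment lines, as GroupRows does
        chomp $line;
        return $line;
    }
    return;
}

# ASSUMPTION: the whole line is its own sort key (plain string order).
# Replace this with whatever extracts the key from your log format.
sub sort_key { return $_[0] }

# Merge two already-sorted log files into $out_name, holding only
# one pending line from each input in memory at any time.
sub merge_logs {
    my ($name_a, $name_b, $out_name) = @_;
    open my $in_a, '<', $name_a   or die "Cannot open $name_a: $!\n";
    open my $in_b, '<', $name_b   or die "Cannot open $name_b: $!\n";
    open my $out,  '>', $out_name or die "Cannot open $out_name: $!\n";

    my $line_a = next_line($in_a);
    my $line_b = next_line($in_b);
    while (defined $line_a or defined $line_b) {
        # Take from A when B is exhausted, or when A's line sorts first.
        if (!defined $line_b
            or (defined $line_a and sort_key($line_a) le sort_key($line_b))) {
            print {$out} "$line_a\n";
            $line_a = next_line($in_a);
        }
        else {
            print {$out} "$line_b\n";
            $line_b = next_line($in_b);
        }
    }
    close $in_a;
    close $in_b;
    close $out or die "Cannot close $out_name: $!\n";
    return;
}
```

    Called as merge_logs('a.log', 'b.log', 'merged.log') (file names hypothetical), memory use stays at roughly two lines regardless of file size. If the inputs are not fully sorted, the output will not be either.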

    The cake is a lie.

Re: Issue merging 2 log files.
by ww (Archbishop) on Jan 09, 2008 at 15:51 UTC
    Your snippet won't compile (it reports multiple syntax errors) because you have either one too many "(" after the if, around the condition, or one too few ")" at the end of line 6. Your inconsistent placement of the curly braces may make that harder for you to spot. (Update: clarified where the extra "(" is.)

    Assuming that's only a cut-and-paste error in the post, might the problem be insufficient memory? The failure only after growth of the logfiles into the area of a gigabyte sounds like a clue.

Re: Issue merging 2 log files.
by salva (Canon) on Jan 10, 2008 at 08:41 UTC
    If the log files are already sorted (sometimes they are not fully sorted), you can read and merge them on the fly.

    There are several CPAN modules that allow you to do so (search CPAN for "merge"). For example:

    use Sort::Key::Merger qw(filekeymerger);

    my $merger = filekeymerger {
        chomp;
        return undef if /^#/;
        make_sorting_key($_)
    } @filenames;

    while (defined(my $line = $merger->())) {
        whatever($line);
    }

    You can also use Sort::External.
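
    Since an on-the-fly merge only gives sorted output when each input is itself sorted, it may be worth checking that first. The following is a small sketch using core Perl only; first_unsorted_line is a hypothetical helper name, and it assumes that plain string comparison of whole lines is the intended order (substitute your own key comparison if not).

```perl
use strict;
use warnings;

# Return the line number of the first out-of-order line in $name,
# or 0 if every non-comment line is in non-decreasing order.
# ASSUMPTION: whole lines compare correctly as plain strings.
sub first_unsorted_line {
    my ($name) = @_;
    open my $fh, '<', $name or die "Cannot open $name: $!\n";
    my $prev;
    while (defined(my $line = <$fh>)) {
        next if $line =~ /^#/;    # ignore comment lines
        chomp $line;
        # $. holds the current line number of the handle just read
        return $. if defined $prev and $line lt $prev;
        $prev = $line;
    }
    return 0;
}
```

    A nonzero result means the file needs a real sort (for example with Sort::External) rather than a plain merge.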
Re: Issue merging 2 log files.
by MidLifeXis (Monsignor) on Jan 09, 2008 at 19:11 UTC

    In addition to per-process memory limits, is it also possible that the merged data is larger than 2 GB and you are on a system where that is an issue?

    Update: Never mind - this is an in-memory solution, not a disk-based solution.

    --MidLifeXis