in reply to Log parsing by timestamp dilema

What you are describing sounds like a merge sort, and there is a module, File::MergeSort, designed for exactly that. It takes a list of files along with a key-extraction subroutine.

I'm sure there's a straightforward way to implement the getFilename operation, if only by tinkering with the File::MergeSort code.

# Mostly extracted from the documentation. I have not tried it.
use File::MergeSort;

my $sort = File::MergeSort->new( \@file_list, \&index_extract_function );

my $line;
while (defined($line = $sort->next_line)) {
    my $fname = getFilename();
    print "$line/$fname\n";
}

Re: Re: Log parsing by timestamp dilema
by Limbic~Region (Chancellor) on Feb 01, 2003 at 20:58 UTC
    tall_man,
    THANKS!
    I have no idea what it will take to hack File::MergeSort to give me file name/path, but it is certainly worth a shot. I had already begun writing code to perform the logic I came up with as an interpretation of adrianh's solution when DaveH provided a fully integrated solution with some extra bells and whistles.

    For completeness (and because I think your solution will give the fastest bang for my buck), I will bench all 3 solutions and maybe post the results.

    Cheers - L~R

      I have taken a quick look through the File::MergeSort code. Here are my suggestions about how to get the file names. In the constructor, there is a loop that opens the files and saves the file handles. I would add a line to save the file names too:
      $self->{files}->[$n]->{fh} = $fh;
      $self->{files}->[$n]->{name} = $file;
      In next_line, capture the value of $self->{sorted}->[0]->{name} at the point where you read from the handle, and return a list containing both the line and the file name. That should do it.
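
If patching the module turns out to be awkward, the same idea can be sketched standalone. The code below does not use File::MergeSort and makes no claim about how it works internally; make_merger and $key_of are names made up for illustration. It assumes every input file is already sorted, that the extracted keys compare correctly as strings (for example, zero-padded timestamps), and that there are few enough files for a linear scan to be acceptable.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of File::MergeSort.
# Takes an array ref of file names and a key-extraction sub, and returns
# an iterator. Each call to the iterator yields ($line, $filename) for the
# line with the smallest key, or an empty list once every file is exhausted.
sub make_merger {
    my ($files, $key_of) = @_;

    my @heads;    # one record per file that still has lines left
    for my $file (@$files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        my $line = readline($fh);
        next unless defined $line;    # skip empty files
        push @heads, {
            fh   => $fh,
            name => $file,            # keep the name next to the handle
            line => $line,
            key  => $key_of->($line),
        };
    }

    return sub {
        return unless @heads;

        # Linear scan for the smallest key (string comparison).
        my $min = 0;
        for my $i (1 .. $#heads) {
            $min = $i if $heads[$i]{key} lt $heads[$min]{key};
        }

        my $src    = $heads[$min];
        my @result = ($src->{line}, $src->{name});

        # Refill from the same file, or drop it once it runs dry.
        my $next = readline($src->{fh});
        if (defined $next) {
            @$src{qw(line key)} = ($next, $key_of->($next));
        }
        else {
            close $src->{fh};
            splice @heads, $min, 1;
        }
        return @result;
    };
}
```

Calling it would look like my $next = make_merger(\@file_list, \&index_extract_function); followed by while (my ($line, $fname) = $next->()) { ... }. Lines come back with their trailing newline intact, so chomp before appending the file name.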