xiaoyafeng has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, forgive me for not having posted a question in so long. I'm completely taken with Parrot and Perl 6, and I'm too impatient to wait for Perl 6 to arrive at Christmas. ;) OK, back to my question.
I have a very large log file. Its format is shown below:
    20071202231202    #yyyymmddhhmiss
    blahblah
    ........
    ........
    20071302110230
    ........
    ........
(Please note that the number of lines between two timestamps is not fixed.)
I'd like to build an index keyed by line number and date. Rather than storing the data in a database (the stupid way), creating an index manually seems better, although I've never tried that before. ;) Could anyone enlighten me with suggestions or snippets?
Thanks in advance!



I am trying to improve my English skills, if you see a mistake please feel free to reply or /msg me a correction

Re: How can I make a index for log?
by GrandFather (Saint) on Jan 16, 2008 at 09:54 UTC

    It's not exactly clear what you want to achieve. However, creating an index at run time from the log file is fairly straightforward. Something based on the following sketch code ought to get you started:

    open my $inFile, '<', $logFileName
        or die "Failed to open $logFileName: $!";
    my %index;
    my $lastPos = 0;

    while (<$inFile>) {
        next unless /^(\d{14})$/;      # only lines holding just a time stamp
        $index{$1} = [$., $lastPos];   # [line number, offset of line start]
    }
    continue {
        $lastPos = tell ($inFile);     # where the next line starts
    }

    which builds an index keyed by time stamp, with each value holding the line number and the byte offset into the file of the start of that line.
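To show how such an index might be used afterwards, here is a minimal sketch that wraps the same loop in a sub and then uses seek to jump straight to one time stamp's entry. The sub names build_index and read_entry, the temporary file, and the sample log contents are all hypothetical and only there to make the example runnable; it also assumes time stamps are unique, since a repeated stamp would overwrite the earlier entry.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical sample log: a 14-digit time stamp line followed by
# any number of detail lines.
my ($tmp, $logFileName) = tempfile(UNLINK => 1);
print {$tmp} "20071202231202\nblah\nblah blah\n20071203110230\nmore\n";
close $tmp or die "Failed to close $logFileName: $!";

# Build the index: time stamp => [line number, byte offset of line start].
sub build_index {
    my ($fileName) = @_;
    open my $inFile, '<', $fileName or die "Failed to open $fileName: $!";
    my %index;
    my $lastPos = 0;
    while (<$inFile>) {
        next unless /^(\d{14})$/;
        $index{$1} = [$., $lastPos];
    }
    continue {
        $lastPos = tell($inFile);    # start of the next line
    }
    close $inFile;
    return \%index;
}

# Return the detail lines that follow the given time stamp, stopping at
# the next time stamp line or at end of file.
sub read_entry {
    my ($fileName, $index, $stamp) = @_;
    my $entry = $index->{$stamp} or return;
    open my $inFile, '<', $fileName or die "Failed to open $fileName: $!";
    seek($inFile, $entry->[1], SEEK_SET) or die "Failed to seek: $!";
    my $stampLine = <$inFile>;       # skip the time stamp line itself
    my @lines;
    while (my $line = <$inFile>) {
        last if $line =~ /^\d{14}$/;
        chomp $line;
        push @lines, $line;
    }
    close $inFile;
    return @lines;
}

my $index = build_index($logFileName);
my @body  = read_entry($logFileName, $index, '20071203110230');
print "Entry starts on line $index->{'20071203110230'}[0]: @body\n";
```

Because only the time stamp lines are kept in the hash, the index stays small even when the log itself is very large.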


    Perl is environmentally friendly - it saves trees
Re: How can I make a index for log?
by cdarke (Prior) on Jan 16, 2008 at 13:38 UTC
    Creating an index of line numbers is fairly easy using an array. For this you need a unique key for each line, and the line number fits the bill. To create the index, loop through the file and store the position at which each line starts. For example:
    my @index = (0); while (<FH>) { push @index, tell(FH); }
    $. is the current line number, and tell returns the offset just past the line that was read, which is where the next line starts. Seeding the array with 0 means $index[$n - 1] is the starting offset of line $n (the final element is the end-of-file offset). The array can now be stored using different methods, including Data::Dumper. You could use a hash instead if you wanted to use a key other than the line number.
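As an illustration of the line-number approach, the following sketch builds an array of line-start offsets, fetches a line by number with seek, and round-trips the array through Data::Dumper. The sub names build_line_index and fetch_line, the temporary file, and its contents are hypothetical; a real script would write the dumped text to a file rather than keep it in a variable.

```perl
use strict;
use warnings;
use Data::Dumper;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical sample file, used only to make the sketch runnable.
my ($tmp, $fileName) = tempfile(UNLINK => 1);
print {$tmp} "first\nsecond line\nthird\n";
close $tmp or die "Failed to close $fileName: $!";

# Return an array ref in which element $n - 1 is the byte offset at
# which line $n starts.
sub build_line_index {
    my ($name) = @_;
    open my $fh, '<', $name or die "Failed to open $name: $!";
    my @offsets = (0);               # line 1 starts at offset 0
    while (<$fh>) {
        push @offsets, tell($fh);    # where the next line starts
    }
    close $fh;
    pop @offsets;                    # last entry is end of file, not a line start
    return \@offsets;
}

# Fetch line $n (1-based) by seeking directly to its stored offset.
sub fetch_line {
    my ($name, $offsets, $n) = @_;
    return if $n < 1 || $n > @$offsets;
    open my $fh, '<', $name or die "Failed to open $name: $!";
    seek($fh, $offsets->[$n - 1], SEEK_SET) or die "Failed to seek: $!";
    my $line = <$fh>;
    close $fh;
    chomp $line;
    return $line;
}

my $offsets = build_line_index($fileName);
my $second  = fetch_line($fileName, $offsets, 2);
print "Line 2: $second\n";

# Serialise the index as Perl source text and load it back with eval.
$Data::Dumper::Terse = 1;            # emit a bare structure, no "$VAR1 ="
my $dumped   = Dumper($offsets);
my $restored = eval $dumped;
die "Failed to restore index: $@" if $@;
```

The stored offsets are only valid while the file is unchanged, so the index would need rebuilding (or extending from the last offset) whenever the log grows or is rotated.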