in reply to Reading from a fast logfile and storing on Oracle
The tail command is likely to be quicker than the Perl module (much more so on my system), and grep -f is likely to be quicker than multiple invocations of the regex engine.
And pre-filtering the lines before they ever reach Perl provides some buffering (two levels of it), which gives the Perl script more time to upload the filtered data.
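For what it's worth, the pre-filter can be tried out on its own at the shell before wiring it into Perl. A minimal sketch, assuming a hypothetical patterns.file holding one regex per line for the lines you care about (the patterns shown are made up):

    $ cat patterns.file
    ^ERROR\b
    ORA-\d{5}

    $ tail -f /path/to/the/log | grep -Pf patterns.file

Only lines matching one of the patterns make it through the pipe, so Perl never sees the rest.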
It's usually quicker to upload medium-sized batches of data rather than small ones, so collect lines into arrays until you have a decent-sized batch before uploading.
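As an aside on the batch upload itself, DBI's execute_array can push an entire batch through one prepared statement in a single call, which may be worth benchmarking against per-row executes. A minimal sketch, assuming a hypothetical log_lines table with a single line column (DSN, credentials, table and column names are all placeholders):

    use strict;
    use warnings;
    use DBI;

    ## DSN and credentials are placeholders
    my $dbh = DBI->connect( 'dbi:Oracle:MYDB', 'user', 'pass' )
        or die $DBI::errstr;

    ## Hypothetical single-column table for the log lines
    my $sth = $dbh->prepare( 'INSERT INTO log_lines ( line ) VALUES ( ? )' );

    my @batch;
    while( my $line = <STDIN> ) {
        chomp $line;
        push @batch, $line;

        ## 100 is a guess; tune the batch size against your own DB
        if( @batch >= 100 ) {
            ## One call uploads the whole batch, column-wise
            $sth->execute_array( {}, \@batch );
            @batch = ();
        }
    }

    ## Flush whatever is left when the input ends
    $sth->execute_array( {}, \@batch ) if @batch;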
Something along these lines might be worth testing:
#! perl -slw
use strict;
use DBI;

my $dbi = DBI->connect( ... );

## -s lets you set this from the command line with -DBS=n; 2 is just a default
our $DBS ||= 2;

## Prepare multiple statements for different DBs
my @sth = map{ $dbi->prepare( ... ) } 1 .. $DBS;

## Piped open pre-filters data thru tail and grep -f
my $pid = open TAIL, "tail -f /path/to/the/log | grep -Pf patterns.file |"
    or die $!;

## Uploading medium sized batches of data is usually quickest
my @collected;

while( <TAIL> ) {
    ## Decide which DB this line is destined for
    my $dbn = m[some selection criteria];

    ## and put it into that batch
    push @{ $collected[ $dbn ] }, $_;   ## subselect if you don't want the whole line

    ## And when that batch reaches the optimum(?) size, upload it
    if( @{ $collected[ $dbn ] } >= 100 ) {   ## 100 is a guess
        $sth[ $dbn ]->execute( @{ $collected[ $dbn ] } );
        @{ $collected[ $dbn ] } = ();
    }
}

close TAIL;