in reply to faster with threads?

Both reading the log and updating the database are apt to be I/O bound, so threads or a forked process make sense (particularly if the log and db are on different spindles). I'm not too familiar with Perl threads, so let's fork the db updater with a pipe from the parent to ship data over:

    sub update_db {
        # ...
    }
    sub mung_logline {
        # ...
    }

    pipe(my $rd, my $wr) or die $!;
    my $cpid;
    {
        $cpid = fork;
        die $! if not defined $cpid;
        unless ($cpid) {
            # child: read munged lines from the pipe and update the db
            close $wr;
            my $dbh = connect(,,,);    # connection arguments elided
            while (<$rd>) {
                update_db( $dbh, $_);
            }
            close $rd;
            exit 0;
        }
    }
    # parent: read the log and ship each munged line to the child
    close $rd;
    {
        open my $fh, '<', '/path/to/log.file' or die $!;
        while (<$fh>) {
            $_ = mung_logline($_);
            print $wr $_;
        }
        close $fh;
    }
    close $wr;

That leaves out your desire to accumulate a large instruction, possible autoflushing of the pipe, @SIG{'CHLD','PIPE'} handling and/or wait, and much error handling. Take it as a skeleton.
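
Some of the pieces left out of the skeleton could be filled in roughly as follows. This is only a sketch: the helper names spawn_consumer, send_line and finish_consumer are made up for illustration, and POSIX::_exit is used in the child so that it skips any END blocks inherited from the parent.

```perl
use strict;
use warnings;
use IO::Handle;    # gives filehandles the autoflush method
use POSIX ();      # for POSIX::_exit

# Hypothetical helper: fork a child that runs $consumer->($rd) on the read
# end of a pipe; return the child's pid and the parent's write end.
sub spawn_consumer {
    my ($consumer) = @_;
    pipe(my $rd, my $wr) or die "pipe failed: $!";
    my $cpid = fork;
    die "fork failed: $!" unless defined $cpid;
    if ($cpid == 0) {
        # child: it only reads, so drop the write end
        close $wr;
        my $ok = eval { $consumer->($rd); 1 };
        close $rd;
        POSIX::_exit($ok ? 0 : 1);
    }
    # parent: it only writes; unbuffer the pipe so lines go out promptly
    close $rd;
    $wr->autoflush(1);
    return ($cpid, $wr);
}

# Hypothetical helper: write one line, turning a dead child into a
# catchable error (EPIPE) instead of a fatal SIGPIPE.
sub send_line {
    my ($wr, $line) = @_;
    local $SIG{PIPE} = 'IGNORE';
    print {$wr} $line or die "write to child failed: $!";
    return;
}

# Hypothetical helper: signal end of input, reap the child, and return
# its exit code.
sub finish_consumer {
    my ($cpid, $wr) = @_;
    close $wr;
    waitpid($cpid, 0);
    return $? >> 8;
}
```

The parent would call spawn_consumer with a code reference holding the db loop, call send_line for each munged log line, and finally check that finish_consumer returns 0. Waiting explicitly like this avoids the need for a CHLD handler when there is only one child.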

It may be best to produce a $sth = $dbh->prepare('whatever ?') with placeholders right after the $dbh is obtained. Then you can pass the $sth already set up to update_db, instead of $dbh.
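
A minimal sketch of that arrangement, assuming $dbh is an already connected DBI handle. The table log_entries, its columns, and the tab-separated line format are all invented for illustration; only prepare and execute are actual DBI calls.

```perl
use strict;
use warnings;

# Child-side loop: prepare the statement once, then execute it per line.
# $dbh is assumed to be a connected DBI database handle and $rd the read
# end of the pipe. Table and column names are hypothetical.
sub run_updater {
    my ($dbh, $rd) = @_;
    my $sth = $dbh->prepare(
        'INSERT INTO log_entries (stamp, message) VALUES (?, ?)'
    );
    while (my $line = <$rd>) {
        update_db($sth, $line);
    }
    return;
}

# Takes the prepared $sth instead of $dbh. Assumes mung_logline emitted
# "stamp<TAB>message\n"; that format is an assumption, not from the thread.
sub update_db {
    my ($sth, $line) = @_;
    chomp $line;
    my ($stamp, $message) = split /\t/, $line, 2;
    $sth->execute($stamp, $message);
    return;
}
```

Preparing once means the SQL is parsed a single time rather than for every log line, and the placeholders take care of quoting the values.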

After Compline,
Zaxo

Re^2: faster with threads?
by hakkr (Chaplain) on Jun 09, 2004 at 13:16 UTC

    One thing to watch out for: is DBI thread/fork safe?

    I seem to recall having problems with database connections 'going away' when using a database handle inside forked-off child processes. I resolved this by having each child process create its own database handle.

    Another thing to watch: if the DB and log reside on the same disk, doing major I/O on both at once could decrease performance, so maybe try reading the whole log into memory first.

      The issue with DBI and forking is this: you need to set $dbh->{InactiveDestroy} = 1 in all of your child processes. Otherwise, when one child process dies, it may kill database connections in other processes. (This is because, as part of the DESTROY for the handle, it tells the database server that it is done with the connection... even if there's another process that is not done with it.)

      Also be warned of the greater conceptual issue that you can only use a DB connection in one process. The connection may exist in other processes (hence the warning above about setting InactiveDestroy), but it can only be used in one.
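
The two points above could be combined along these lines. This is a sketch only: child_dbh is a made-up helper name and $connect is a hypothetical callback wrapping whatever DBI->connect call the program already uses. InactiveDestroy itself is the DBI attribute named above.

```perl
use strict;
use warnings;

# Call this in the child immediately after fork.
# $inherited is the handle copied from the parent (may be undef);
# $connect is a hypothetical callback that opens a fresh connection.
sub child_dbh {
    my ($inherited, $connect) = @_;

    # Keep this process from shutting down the parent's connection when
    # its copy of the handle is destroyed.
    $inherited->{InactiveDestroy} = 1 if $inherited;

    # Use a connection that belongs to this process alone.
    return $connect->();
}
```

In the child this might be called as child_dbh($parent_dbh, sub { DBI->connect($dsn, $user, $pass, { RaiseError => 1 }) }), where $dsn, $user and $pass stand for whatever the program already uses to connect.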

      Also, no, DBI is not threadsafe (unless this has changed recently... and that would be a big deal).

      ------------ :Wq Not an editor command: Wq
      Necroshine

      Well, I've tried to solve this with this concept, and it doesn't work for me. After some time, the connections were broken and the process stayed inactive. Why?


      Cheers!