cormanaz has asked for the wisdom of the Perl Monks concerning the following question:

Howdy bros. I have a multi-threaded script using Parallel::ForkManager, LWP, and DBI/DBD-MySQL to get the html from blog posts. I have not posted the code because it's completely ordinary: load URL list, loop thru, fork inside loop, get html, connect db, stick html into db, disconnect db. I am running on windoze, activeperl 5.8.8 build 820.

I am randomly crashing with "Free to wrong pool 225b28 not 51129d98 at c:/perl/lib/constant.pl line -1". One difference between my case and others I've found on Super Search and Google is that my error is popping up in c:/perl/lib/constant.pl. The others seem to say "on global destruct."

Anyway the posts I found on this say it is because the windoze implementation of perl fork kind of sucks. On the other hand most of the posts on this subject are old. I did find this one thing where a guy thinks the problem is with LWP somewhere.

So basically I am posting to check if anyone knows any more about this or of a way to debug or fix. It's a real pain in the posterior.

Thx......Steve

Re: LWP, DBI and Free to Wrong Pool error
by zentara (Cardinal) on Apr 28, 2007 at 15:37 UTC
    That error generally means you are trying to access an object from another thread, and the object isn't thread-safe. Since you haven't shown any code, remember, every thread gets a copy of the parent at the time of thread creation, so try to localize your objects to individual threads, try to create them before the main thread gets created, so you don't get cross-over copying of object code. Also, don't try to access objects across threads. Finally, on win32, why not just use threads, and avoid the pseudo forking?
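
    A minimal sketch of the copying behaviour described above, assuming a perl built with ithreads. The %config hash and the worker sub are hypothetical names used only for illustration; a plain hash stands in for a real object such as an LWP::UserAgent.

    ```perl
    use strict;
    use warnings;
    use threads;

    # Hypothetical stand-in for an object created in the parent
    # before any thread is started.
    my %config = ( timeout => 30 );

    # Each thread receives its own clone of %config at creation time,
    # so changes made inside the thread never reach the parent.
    sub worker {
        $config{timeout} = 5;      # modifies the thread's private copy only
        return $config{timeout};
    }

    my $thr            = threads->create( \&worker );
    my $seen_in_thread = $thr->join;

    print "thread saw: $seen_in_thread\n";
    print "parent has: $config{timeout}\n";
    ```

    The thread reports 5 while the parent still holds 30. For a plain hash the copy is harmless; for objects that wrap non-Perl resources, the clone is where trouble can start.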

    I'm not really a human, but I play one on earth. Cogito ergo sum a bum
      I was sharing an LWP::UserAgent object between threads, based on advice from some monks that it was OK to do so. I changed that, and just tested again, and same problem. Here's the code for the main loop:
      my $pm = new Parallel::ForkManager(20);
      for my $i (0..$#itemstoget) {
          $pm->start and next; # do the fork
          my ($id,$url,$title,$excerpt) = @{ $itemstoget[$i] };
          my $user_agent = LWP::UserAgent->new;
          $user_agent->timeout(30);
          my $request = HTTP::Request->new('GET', $url);
          my $response = $user_agent->request($request);
          my $dbh = connectdb('blogdb');
          if ($response->is_success) {
              unless (isforeignlanguage($response,$title,$excerpt,$url)) {
                  my $html = resolve_charset($response->content);
                  # if the html meets the criteria for at least one client that claims it, extract the text
                  my $itemok = checkhtml($dbh,$id,$html);
                  if ($itemok) {
                      inserthtml($dbh,$id,$html);
                      print "OK $id ",substr($url,0,50),"\n";
                  } else {
                      print "SKIP $url\n";
                      dosql($dbh,"update blogitems set getattempts=999 where id=$id");
                  }
              } else {
                  print "FOREIGN $url\n";
                  dosql($dbh,"update blogitems set getattempts=999 where id=$id");
              }
          } else {
              print "FAILED $url\n";
              dosql($dbh,"update blogitems set getattempts=getattempts + 1 where id=$id");
          }
          $dbh->disconnect;
          undef $user_agent;
          $pm->finish;
      }
      $pm->wait_all_children;
      As for the subs, resolve_charset figures out and decodes the charset, isforeignlanguage applies Lingua::Identify to see if it's English or not, checkhtml makes sure the post contains requisite keywords, and inserthtml puts it into the database.

      As for threads vs. pseudofork, it's bc I'm still learning this multi-threaded stuff and pseudofork seems more straightforward. Would using threads instead solve this problem?

        Would using threads instead solve this problem?

        Maybe. You could try something like this, which compiles clean but is obviously untested.

        #! perl -slw
        use strict;
        use threads;
        use threads::shared;
        use Thread::Queue;
        use LWP::UserAgent;   ## Needed for the fetch in the worker threads
        use HTTP::Request;

        use constant NTHREADS => 20;

        my @itemstoget = ();  ## Get the items from somewhere?

        ## Queue each item as a chr(0)-joined string, then one undef per thread as a terminator
        my $Qwork = new Thread::Queue;
        $Qwork->enqueue( join chr(0), @{ $_ } ) for @itemstoget;
        $Qwork->enqueue( (undef) x NTHREADS );

        my $Qresults = new Thread::Queue;
        my $running : shared = 0;

        threads->new( \&thread, $Qwork, $Qresults )->detach for 1 .. NTHREADS;

        ## Only the main thread talks to the database
        my $dbh = connectdb( 'blogdb' );

        sleep 1 until $Qresults->pending;

        while( $running or $Qresults->pending ) {  ## Modified condition
            sleep( 1 ), next unless $Qresults->pending;
            my( $id, $url, $html ) = split chr(0), $Qresults->dequeue;
            if( $html ne 'FAILED' ) {
                # if the html meets the criteria for at least one client that claims it,
                # extract the text
                if( checkhtml( $dbh, $id, $html ) ) {
                    inserthtml( $dbh, $id, $html );
                    print "OK $id ",substr( $url, 0, 50 ),"\n";
                }
                else {
                    print "SKIP $url\n";
                    dosql( $dbh,"update blogitems set getattempts=999 where id=$id" );
                }
            }
            else {
                print "FOREIGN or FAILED $url\n";
                dosql($dbh,"update blogitems set getattempts=999 where id=$id");
            }
        }

        $dbh->disconnect;
        exit;

        sub thread {
            { lock $running; ++$running }
            my( $Qwork, $Qresults ) = @_;
            my $user_agent = LWP::UserAgent->new;   ## One user agent per thread
            $user_agent->timeout( 30 );
            while( my $item = $Qwork->dequeue ) {   ## An undef terminator ends the loop
                my( $id, $url, $title, $excerpt ) = split chr(0), $item;
                my $request  = HTTP::Request->new( 'GET', $url );
                my $response = $user_agent->request( $request );
                my $html = (
                    $response->is_success
                    and not isforeignlanguage( $response, $title, $excerpt, $url )
                ) ? resolve_charset( $response->content )
                  : 'FAILED';
                $Qresults->enqueue( join chr(0), $id, $url, $html );
            }
            undef $user_agent;
            { lock $running; --$running }
        }

        It should avoid the reentrancy problem with DBI by only accessing the DB from the main thread. It queues up the work items in a shared queue and starts 20 threads to fetch the urls. Each thread then performs the sanity checks on the response before posting either the fetched html or a failure code back to the main thread, which does the database processing.
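
        To illustrate how a record crosses the queues in the code above, here is a minimal sketch of the join/split step on its own, with no threads involved. The pack_record and unpack_record helper names and the sample values are assumptions for illustration only. The explicit field count passed to split is an addition that is not in the code above; it keeps a stray chr(0) inside the last field from cutting that field short.

        ```perl
        use strict;
        use warnings;

        # Hypothetical helpers: flatten a record into one string so it can
        # travel through a queue as a plain scalar, then rebuild it afterwards.
        sub pack_record { return join chr(0), @_ }

        sub unpack_record {
            my( $nfields, $packed ) = @_;
            # The limit stops split from breaking the final field
            # at any chr(0) embedded in it.
            return split chr(0), $packed, $nfields;
        }

        my $packed = pack_record( 42, 'http://example.com/post', '<html>hi</html>' );
        my( $id, $url, $html ) = unpack_record( 3, $packed );

        print "$id $url $html\n";
        ```

        Passing flat strings keeps the queue traffic to simple scalars, at the cost of having to agree on the field order at both ends.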

        The threads will stop once the queue empties and the main thread will stop once there are no further results to process.
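
        The shutdown behaviour can be seen in isolation with a minimal sketch, assuming a perl built with ithreads. Each worker leaves its loop when it dequeues one of the undef terminators, one per thread. Unlike the code above, this sketch joins its threads rather than detaching them, and the squaring step is a hypothetical placeholder for the fetch.

        ```perl
        use strict;
        use warnings;
        use threads;
        use Thread::Queue;

        use constant NTHREADS => 3;

        my $Qwork    = Thread::Queue->new;
        my $Qresults = Thread::Queue->new;

        # Queue the work, then one undef per thread so every worker
        # eventually sees a terminator.
        $Qwork->enqueue( 1 .. 10 );
        $Qwork->enqueue( (undef) x NTHREADS );

        sub worker {
            # dequeue blocks until an item arrives; an undef ends the loop.
            while( defined( my $n = $Qwork->dequeue ) ) {
                $Qresults->enqueue( $n * $n );   # placeholder for the real fetch
            }
        }

        my @workers = map { threads->create( \&worker ) } 1 .. NTHREADS;
        $_->join for @workers;

        # All workers have finished, so the results queue is complete.
        my @squares;
        while( defined( my $r = $Qresults->dequeue_nb ) ) {
            push @squares, $r;
        }
        @squares = sort { $a <=> $b } @squares;

        print "@squares\n";
        ```

        Joining makes it unnecessary to track a shared count of running threads, but the main thread then cannot process results until every worker has finished.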


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.