As for the subs, resolve_charset figures out and decodes the charset, isforeignlanguage applies Lingua::Identify to see if it's English or not, itemok makes sure the post contains requisite keywords, and inserthtml puts it into the database.

    my $pm = new Parallel::ForkManager(20);
    for my $i (0..$#itemstoget) {
        $pm->start and next; # do the fork
        my ($id,$url,$title,$excerpt) = @{ $itemstoget[$i] };
        my $user_agent = LWP::UserAgent->new;
        $user_agent->timeout(30);
        my $request = HTTP::Request->new('GET', $url);
        my $response = $user_agent->request($request);
        my $dbh = connectdb('blogdb');
        if ($response->is_success) {
            unless (isforeignlanguage($response,$title,$excerpt,$url)) {
                my $html = resolve_charset($response->content);
                # if the html meets the criteria for at least one client that claims it, extract the text
                my $itemok = checkhtml($dbh,$id,$html);
                if ($itemok) {
                    inserthtml($dbh,$id,$html);
                    print "OK $id ",substr($url,0,50),"\n";
                }
                else {
                    print "SKIP $url\n";
                    dosql($dbh,"update blogitems set getattempts=999 where id=$id");
                }
            }
            else {
                print "FOREIGN $url\n";
                dosql($dbh,"update blogitems set getattempts=999 where id=$id");
            }
        }
        else {
            print "FAILED $url\n";
            dosql($dbh,"update blogitems set getattempts=getattempts + 1 where id=$id");
        }
        $dbh->disconnect;
        undef $user_agent;
        $pm->finish;
    }
    $pm->wait_all_children;
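To give an idea of the keyword check, here is a minimal sketch of the kind of test itemok performs. It is not the actual sub: the name has_keywords, the hard-coded keyword list, and the whole-word, case-insensitive matching are all assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not the real itemok): returns 1 if $text contains
# at least one of the given keywords as a whole word, 0 otherwise.
sub has_keywords {
    my ($text, @keywords) = @_;
    for my $kw (@keywords) {
        # \Q...\E escapes regex metacharacters in the keyword; /i ignores case
        return 1 if $text =~ /\b\Q$kw\E\b/i;
    }
    return 0;
}

# Assumed keyword list, for illustration only
my @keywords = ('perl', 'fork manager');

print has_keywords('<p>Notes on Perl and DBI</p>', @keywords) ? "OK\n" : "SKIP\n";
print has_keywords('<p>Nothing relevant here</p>', @keywords) ? "OK\n" : "SKIP\n";
```

In the real code the keywords presumably come from the database per client, which is why the check takes $dbh and $id as well as the HTML.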
As for threads vs. pseudofork, it's because I'm still learning this multi-threaded stuff and pseudofork seems more straightforward. Would using threads instead solve this problem?
In reply to Re^2: LWP, DBI and Free to Wrong Pool error
by cormanaz
in thread LWP, DBI and Free to Wrong Pool error
by cormanaz