in reply to Multithreaded Script CPU Usage
A disk (with one disk head) is basically a single-threaded application. Trying to keep 10 threads happy results in a lot of seeks and very slow performance. Two threads are sometimes OK: one waits for a disk seek while the other parses a different file.
I'd redesign the application to be pure Perl and single-threaded. If you need to scan several disks, I'd start one scanner per disk and feed the results directly into the database using DBI.
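A rough sketch of the "one scanner per disk" idea, assuming fork() is usable on your platform (Windows perls only emulate it) and that each root in the list sits on its own physical disk. The name scan_roots and the per-file callback are made up for illustration; in a real scanner each child would open its own DBI connection and insert rows instead of calling a callback.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: fork one child per root directory (ideally one
# root per physical disk). Each child walks its root single-threaded
# and hands every plain file to $per_file. Returns the number of
# children that exited cleanly.
sub scan_roots {
    my ($roots, $per_file) = @_;
    my @pids;
    for my $root (@$roots) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: a real scanner would call DBI->connect here, because
            # database handles should not be shared across a fork.
            find({
                no_chdir => 1,    # $_ is then the full path
                wanted   => sub { $per_file->($File::Find::name) if -f $_ },
            }, $root);
            exit 0;
        }
        push @pids, $pid;
    }
    # Parent: wait for every scanner and count the clean exits.
    my $ok = 0;
    for my $pid (@pids) {
        waitpid($pid, 0);
        $ok++ if $? == 0;
    }
    return $ok;
}
```

Each disk still sees exactly one reader, so the heads are not fighting over seeks, and the parent does nothing but wait.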
The following "code" just shows the Perl way to do it with prebuilt modules. It is mostly copied from perldoc, won't run out of the box and has no error handling. I could not find a good way to get author information from MS Office files. Win32::OLE and some digging into MS Explorer OLE should help.
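P.S.: Basically you have reinvented the unix 'locate' tool and ignored Microsoft's file indexing for faster search.

    use strict;
    use warnings;
    use File::Find;
    use DBI;

    # Placeholders: fill in your own connection details and search roots.
    my ($database, $hostname, $port, $user, $password);
    my @directories_to_search;

    my $dsn = "DBI:mysql:database=$database;host=$hostname;port=$port";
    my $dbh = DBI->connect($dsn, $user, $password);
    # "table" and foo/bar/baz stand in for your real table and columns.
    my $sth = $dbh->prepare("INSERT INTO table(foo,bar,baz) VALUES (?,?,?)");

    sub wanted {
        my @stats = stat $_;
        # Example values only: full path, size in bytes, modification time.
        my ($foo, $bar, $baz) = ($File::Find::name, $stats[7], $stats[9]);
        $sth->execute( $foo, $bar, $baz );
    }

    find(\&wanted, @directories_to_search);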
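To make the wanted callback a bit more concrete, here is a minimal sketch of what it could pull out of stat before handing the values to $sth->execute. The helper names file_record and scan, and the choice of fields (path, size, mtime), are my assumptions and not taken from the original script. DBI is left out so the snippet runs with core modules only.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: turn one path into the values for one INSERT.
# stat returns a 13-element list; index 7 is the size in bytes and
# index 9 the modification time in epoch seconds.
sub file_record {
    my ($path) = @_;
    my @stats = stat($path);
    return unless @stats;    # file vanished or is unreadable
    return ($path, $stats[7], $stats[9]);
}

# Walk the given directories once, single-threaded, and collect one
# array ref [path, size, mtime] per plain file.
sub scan {
    my @dirs = @_;
    my @rows;
    find({
        no_chdir => 1,    # keep paths valid without following chdir
        wanted   => sub {
            return unless -f $File::Find::name;
            my @rec = file_record($File::Find::name);
            push @rows, \@rec if @rec;
        },
    }, @dirs);
    return @rows;
}
```

Each row can then go straight into the prepared statement, e.g. `$sth->execute(@$_) for scan(@directories_to_search);`.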
Re^2: Multithreaded Script CPU Usage
by Zenshai (Sexton) on Aug 26, 2008 at 22:13 UTC
by NiJo (Friar) on Sep 01, 2008 at 21:05 UTC
by Zenshai (Sexton) on Sep 03, 2008 at 00:59 UTC