Re: Multithreading Parsers

by inq123 (Sexton)
on Mar 22, 2005 at 08:56 UTC ( [id://441402]=note )


in reply to Multithreading Parsers

I totally agree with the suggestion to parallelize the input/output, not the algorithm.

For example, if your parser is parsing XML files in GAME format, simply glob all the files that need to be parsed, split that array into 28 smaller arrays, then fork 27 children so that the parent and each child work on one array apiece. I do similar things all the time. Most of the time the outputs are also independent of each other, so the only thing to worry about is contention for shared resources like DB connections/transactions (though changing the settings would help).
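To make the glob/split/fork idea concrete, here is a rough sketch, not a drop-in solution. The `*.xml` pattern, the worker count of 28, and the sub names (`split_into_chunks`, `parse_file`, `run_in_parallel`) are all assumptions for illustration; `parse_file` is an empty stand-in for whatever parsing module you actually use. For simplicity it forks one child per chunk and lets the parent only wait, rather than having the parent take a chunk itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Deal the items round-robin into at most $n chunks.
# With fewer items than $n, fewer chunks are returned.
sub split_into_chunks {
    my ($n, @items) = @_;
    my @chunks;
    for my $i (0 .. $#items) {
        push @{ $chunks[$i % $n] }, $items[$i];
    }
    return @chunks;
}

# Hypothetical stand-in: call your real parser here.
sub parse_file {
    my ($file) = @_;
    return;
}

# Fork one child per chunk, run $worker on every item of that chunk,
# wait for all children, and return how many exited with an error.
sub run_in_parallel {
    my ($worker, @chunks) = @_;
    my %running;
    for my $chunk (@chunks) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: process its own chunk, then exit without returning.
            my $ok = eval { $worker->($_) for @$chunk; 1 };
            warn $@ unless $ok;
            exit($ok ? 0 : 1);
        }
        $running{$pid} = 1;
    }
    my $failed = 0;
    while (%running) {
        my $pid = wait();
        last if $pid == -1;                  # no children left
        next unless delete $running{$pid};   # not one of ours
        $failed++ if $? != 0;
    }
    return $failed;
}

my @files  = glob('*.xml');                  # assumed file pattern
my @chunks = split_into_chunks(28, @files);  # 28 as in the example above
my $failed = run_in_parallel(\&parse_file, @chunks);
warn "$failed worker(s) exited with an error\n" if $failed;
```

Each child only touches its own files, so there is nothing to lock as long as the outputs go to separate places. If the children write to a database, have each one open its own connection after the fork rather than sharing the parent's handle.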

Or, say, you're parsing the whole GenBank release - even easier, just run 28 copies of your program on 28 files! Very simple divide and conquer strategy :)

Trying to parallelize the algorithm, especially since you're using modules others developed, would not be a good investment of time when you could simply parallelize the input and achieve the same savings.
