Re: Using fork for reading and processing a large file in ActiveState perl
by almut (Canon) on Feb 24, 2010 at 18:03 UTC
When trying to speed up a program, the first step is usually to figure out what exactly is slow by profiling it (e.g. with Devel::NYTProf). Only then can you take appropriate measures. For example, if the bottleneck is mainly IO (reading/writing files), multithreading is unlikely to be of much help.
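Before profiling with a full tool, even a crude timing of each phase can show where the time goes. A minimal sketch using only core modules (the two workloads here are hypothetical stand-ins for the OP's read and process phases):

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my %elapsed;

# Run a phase and record how long it took.
sub timed {
    my ($label, $code) = @_;
    my $t0 = time;
    $code->();
    $elapsed{$label} += time - $t0;
}

# Stand-ins for the real phases (hypothetical workloads):
timed('read',    sub { my @lines = map { "line $_\n" } 1 .. 10_000 });
timed('process', sub { my $x = 0; $x += $_ for 1 .. 100_000 });

printf "%-8s %.4f s\n", $_, $elapsed{$_} for sort keys %elapsed;
```

If the 'process' phase dominates and its iterations are independent, parallelism can help; if 'read' dominates, it usually can't.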
IO is very small compared to the other processing, which includes an HTTP request and processing the web page's HTML.
Re: Using fork for reading and processing a large file in ActiveState perl
by BrowserUk (Patriarch) on Feb 24, 2010 at 18:14 UTC
How big is "huge"?
What are you doing between reading the data in and writing it out?
(I assume you must be doing something complex, because on my very ordinary system with a so-so disk, Perl can read and write 3 GB/minute, which would make your file at least 18 terabytes.)
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
OK, let's put it this way.
I want to read a small file into an array once. Then, in a loop, I do HTTP requests, process the results, and write a portion to the file. Each portion is small.
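Since each iteration is independent and network-bound, the usual shape on ActiveState perl is a worker pool with threads and Thread::Queue: the waits on the HTTP requests overlap, and a single consumer writes the results so only one thread touches the output file. A hedged sketch, where fetch_and_process is a placeholder for the real LWP request plus the HTML processing:

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Placeholder for the real per-item work: in the actual program this
# would be an LWP::UserAgent request plus the HTML processing.
sub fetch_and_process {
    my ($item) = @_;
    return "result for $item";
}

my $n_workers = 4;
my $in  = Thread::Queue->new;
my $out = Thread::Queue->new;

my @workers = map {
    threads->create(sub {
        while (defined(my $item = $in->dequeue)) {
            $out->enqueue(fetch_and_process($item));
        }
    });
} 1 .. $n_workers;

# The "small file read into an array once" (fake data here):
my @items = map { "url_$_" } 1 .. 20;
$in->enqueue(@items);
$in->enqueue((undef) x $n_workers);   # one end-of-work marker per worker
$_->join for @workers;

# A single drain of the results, so only one place writes the file.
my @results;
while (defined(my $r = $out->dequeue_nb)) {
    push @results, $r;
}
printf "collected %d results\n", scalar @results;
```

Note that results come back in completion order, not input order; if the output file must preserve the input order, enqueue an index with each item and sort before writing.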
Re: Using fork for reading and processing a large file in ActiveState perl
by zentara (Cardinal) on Feb 25, 2010 at 14:04 UTC
See character-by-character in a huge file for how to split your file. Unless you have a real multicore computer, threads won't speed you up; but if you do, break your file into chunks and hand them off to your threads.
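The chunking step itself can be sketched in a few lines of core Perl: deal the lines round-robin into one bucket per thread, then hand each bucket to a worker. This is an illustrative helper, not from the linked node:

```perl
use strict;
use warnings;

# Deal the items round-robin into $n_chunks buckets.
sub chunk {
    my ($n_chunks, @items) = @_;
    my @chunks = map { [] } 1 .. $n_chunks;
    push @{ $chunks[ $_ % $n_chunks ] }, $items[$_] for 0 .. $#items;
    return @chunks;
}

my @lines  = map { "line $_" } 1 .. 10;   # stand-in for the file's lines
my @chunks = chunk(3, @lines);

printf "chunk %d: %d lines\n", $_, scalar @{ $chunks[$_] } for 0 .. $#chunks;
```

Each arrayref in @chunks can then be passed to threads->create, so every worker gets its own share of the lines.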
Dude, read the other posts before replying; your answer is completely wrong for the OP's problem.
"Dude, read the other posts before replying." I did, and no one mentioned to him how to bring in his huge file and split it effectively in order to hand the pieces off to his threads for parallel processing. The OP asked: "I am reading and processing a huge file and recording results to another file which takes hundreds of hours. I want to run this task in multithreads." How is it wrong to show how to get his input file split into bite-sized chunks for his threads? I question whether you understand what needs to be done in an actual program. Maybe you didn't actually look at the link I provided? I showed him the various ways to achieve the first step needed for his code. See How to break up a long running process for some parallel-processing usage.