Re: how to split huge file reading into multiple threads

I have a huge file of millions of record.... But it takes huge time around 2+ hours

How many millions?

Perl can process 4 million records in 2.5 seconds:

perl -MTime::HiRes=time -E"BEGIN{$t=time()}" 
-nle"++$n }{ printf qq[$n records in %f seconds\n], time-$t" 250MB.CSV
4194304 records in 2.518000 seconds
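For anyone who finds the one-liner cryptic, here is roughly the same measurement spelled out as a script. It is only a sketch: the sub name count_records is made up for illustration, and the file is taken from the first command-line argument. It does nothing but read lines and count them, so it gives a baseline for raw read speed on your machine.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Count the lines of a file; no per-record processing at all.
# (count_records is a hypothetical name, not from the original post.)
sub count_records {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot open $path: $!";
    my $n = 0;
    $n++ while <$in>;
    close $in;
    return $n;
}

# Only run the timing when a file name is given on the command line.
if (@ARGV) {
    my $start = time();
    my $n     = count_records( $ARGV[0] );
    printf "%d records in %f seconds\n", $n, time() - $start;
}
```

If this baseline runs in seconds on your file while the real program takes hours, the time is going into the per-record work, not into reading.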

So, how about you post your code and let us help you fix it?

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.


Replies are listed 'Best First'.
Re^2: how to split huge file reading into multiple threads by sagarika (Novice) on Aug 30, 2011 at 09:39 UTC
Thanks! Yes, I too was under the impression that Perl, being designed for pattern extraction and reporting, would be fast and reliable for text processing. Please see my reply above to AR (dated Aug 30, 2011 at 09:05 UTC) for what my code does. Thank you for offering to help. However, as of now, I really don't think I am doing much in processing the records that would consume the time. I don't want other monks to get misdirected by pasting the code.
Re^3: how to split huge file reading into multiple threads by BrowserUk (Patriarch) on Aug 30, 2011 at 09:57 UTC
I don't want other monks to get misdirected by pasting the code.
Let "other monks" look after themselves. If your code is taking 2 1/2 hours to process 20 million records against 600 records stored in a hash, then it is your code that has problems. Should we try and guess what mistakes you are making? Are you, for instance, treating the hash as an array? Or re-opening the output files for every record you write? Post the code and we won't have to make such guesses.
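To make the two guesses above concrete, here is a minimal sketch of each mistake next to its fix. Everything in it is assumed for illustration: the key format, the sample records, and the names found_by_scan, found_by_hash and out_handle are all hypothetical, and the output goes to in-memory handles so that no real files are touched. It is not the poster's code.

```perl
use strict;
use warnings;

# Hypothetical lookup table: 600 keys, as in the thread; the key format is made up.
my %lookup = map { ( sprintf( 'K%04d', $_ ) => 1 ) } 1 .. 600;

# Slow: treating the hash as an array, scanning every key for each record.
sub found_by_scan {
    my ($key) = @_;
    for my $k ( keys %lookup ) {
        return 1 if $k eq $key;
    }
    return 0;
}

# Fast: a direct hash lookup, constant time on average.
sub found_by_hash {
    my ($key) = @_;
    return exists $lookup{$key} ? 1 : 0;
}

# Open each output handle once and cache it, instead of re-opening per record.
my %out;       # cached filehandles, keyed by output name
my %buffer;    # in-memory targets standing in for real output files

sub out_handle {
    my ($name) = @_;
    unless ( $out{$name} ) {
        $buffer{$name} = '';
        open my $fh, '>', \$buffer{$name} or die "Cannot open $name: $!";
        $out{$name} = $fh;
    }
    return $out{$name};
}

# Made-up sample records: key, then payload.
for my $record ( 'K0001,a', 'K9999,b', 'K0600,c' ) {
    my ($key) = split /,/, $record;
    my $name = found_by_hash($key) ? 'matched' : 'unmatched';
    print { out_handle($name) } "$record\n";
}
close $_ for values %out;
```

With 20 million records, the scanning version does up to 600 string comparisons per record where the hash lookup does one probe, and an open/close pair per record costs far more than the write itself.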
Re^4: how to split huge file reading into multiple threads by sagarika (Novice) on Sep 02, 2011 at 08:34 UTC
Hey, I posted the code. Can you please suggest something now?
Re^5: how to split huge file reading into multiple threads by BrowserUk (Patriarch) on Sep 02, 2011 at 09:06 UTC
Re^6: how to split huge file reading into multiple threads by sagarika (Novice) on Sep 07, 2011 at 06:06 UTC