Hello Monks, I am here to seek your wisdom in the following matter.

I am developing a program that reads large input files (up to 2 GB each, at most 5 files, all containing the same type of data) and splits them into 18 different parts, with the relevant info going to the relevant file. As of now it is a non-threaded process, and I am thinking about making it multi-threaded to speed it up. However, when I tested the potential benefit with the test program below, which uses threads (it reads 5 files and writes slightly manipulated contents to an output file), it in fact showed me that using threads has slowed the process down.

Env: Perl 5.8.8 build 824, Win XP

Code:
```perl
#!/usr/bin/perl
use threads;
use Benchmark qw(:all);

my $line_var :shared = 0;

# Worker: copy one input file to one output file, prefixing each line.
sub main_func {
    my ($tid, $in_fh, $out_fh, $start, $stop) = @_;
    # Synchronised block
    ++$line_var;
    while (<$in_fh>) {
        print $out_fh " LineVar.. $line_var\t" . $_;
    }
    return $tid;
}

# Threaded version ('after'): one thread per input file.
sub super_main {
    open (OUTFH1, "< out1.txt");
    open (OUTFH2, "< out2.txt");
    open (OUTFH3, "< out3.txt");
    open (OUTFH4, "< out4.txt");
    open (OUTFHO1, "> outO1.txt");
    open (OUTFHO2, "> outO2.txt");
    open (OUTFHO3, "> outO3.txt");
    open (OUTFHO4, "> outO4.txt");

    $thr1 = threads->create(\&main_func, '1', OUTFH1, OUTFHO1, '1', '1000000');
    $thr2 = threads->create(\&main_func, '2', OUTFH2, OUTFHO2, '1000000', '2000000');
    $thr3 = threads->create(\&main_func, '3', OUTFH3, OUTFHO3, '2000000', '3000000');
    $thr4 = threads->create(\&main_func, '4', OUTFH4, OUTFHO4, '3000000', '4000000');

    $tid1 = $thr1->join();
    $tid2 = $thr2->join();
    $tid3 = $thr3->join();
    $tid4 = $thr4->join();
}

# Non-threaded version ('before'): read the input files one after another.
sub main_func2 {
    my $line_var2 = 0;
    open (OUTFHO5, "> outO5.txt");
    my ($tid, $out_fh, $start, $stop) = (5, OUTFHO5, '1', '4000000');
    for ($i = 1; $i < 5; $i++) {
        open (INFH, "< out$i.txt");
        while (<INFH>) {
            $line_var2++;
            print $out_fh " Line.. $line_var2\t" . $_;
        }
    }
    return $tid;
}

#timethese ( 20,
#    {'before' => \&main_func2 ,
#     'after'  => \&super_main }
#    );

cmpthese ( 20,
    {'before' => \&main_func2,
     'after'  => \&super_main }
    );
```
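The "Synchronised block" comment in main_func suggests that a locked update of a shared counter was intended. A minimal sketch of what that could look like is given here for comparison. It assumes that threads::shared is loaded (the posted code loads only threads) and that the perl in use was built with ithreads; the names $line_count and count_lines are hypothetical and not part of the posted program.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use threads::shared;    # assumption: the posted code does not load this

# Counter visible to all threads.
my $line_count :shared = 0;

# Hypothetical worker: bump the shared counter $n times, once per "line".
sub count_lines {
    my ($n) = @_;
    for (1 .. $n) {
        lock($line_count);    # held until the end of this loop iteration
        $line_count++;
    }
    return $n;
}

# Four workers, mirroring the four threads in the posted test.
my @workers = map { threads->create(\&count_lines, 1000) } 1 .. 4;
$_->join for @workers;

print "total: $line_count\n";
```

Taking a lock for every line adds work to each thread, so a per-thread counter that is summed after join() may be the cheaper choice when only the total matters.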
Both timethese and cmpthese show poorer performance for 'after' (the threaded version):
```
---------- Perl ----------
       s/iter  after before
after    10.9     --   -18%
before   8.99    22%     --

Output completed (12 min 45 sec consumed) - Normal Termination
```
So the question is: have I made a mistake in the program, or is threading not that beneficial?
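For context, the splitting job described at the top of the post (each input line routed to one of several output files) could be sketched single-threaded as follows. The routing rule used here, where the first tab-separated field names the output part, is an assumption for illustration, as are the sub name split_by_key and the part_*.txt file names; the real program's rule is not shown in the post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical routing rule: the first tab-separated field of each line
# names the part it belongs to.
sub split_by_key {
    my ($out_dir, @in_paths) = @_;
    my (%out_fh, %count);

    for my $in_path (@in_paths) {
        open my $in_fh, '<', $in_path or die "Cannot read $in_path: $!";
        while (my $line = <$in_fh>) {
            my ($key) = split /\t/, $line, 2;
            chomp $key;    # covers lines that contain no tab at all

            # Open each output file once, on first use, and keep it open.
            unless ($out_fh{$key}) {
                open $out_fh{$key}, '>', "$out_dir/part_$key.txt"
                    or die "Cannot write part_$key.txt: $!";
            }
            print { $out_fh{$key} } $line;
            $count{$key}++;
        }
        close $in_fh;
    }
    close $_ or die "Close failed: $!" for values %out_fh;
    return \%count;    # lines written per part
}

# Usage: perl split.pl OUT_DIR in1.txt in2.txt ...
if (@ARGV >= 2) {
    my $counts = split_by_key(@ARGV);
    printf "%s: %d lines\n", $_, $counts->{$_} for sort keys %$counts;
}
```

Each input line is read once and each output file is opened once, which gives a simple baseline to time any threaded variant against.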

In reply to Threads Doubt by Anonymous Monk
