Hm. I guess I read the OP's post differently.

His task description is: read records from a (single) huge file, and write them to one of many (600) output files depending upon their contents. He asked how he could use threads to improve the performance.
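As I read it, the task boils down to something like the loop below. This is only a minimal single-threaded sketch, written in Python rather than Perl. The function name `route_records`, the one-record-per-line assumption, the `key_of` callback and the `<key>.out` file naming are all my own placeholders, not anything the OP specified.

```python
import os
import tempfile


def route_records(input_path, out_dir, key_of):
    """Read one record per line and write each to the file chosen by key_of.

    key_of is a hypothetical callback mapping a record to the name of
    its output file; the real selection rule is whatever the OP needs.
    """
    handles = {}  # key -> open output handle, opened on first use
    try:
        with open(input_path) as src:
            for line in src:
                key = key_of(line)
                fh = handles.get(key)
                if fh is None:
                    fh = open(os.path.join(out_dir, f"{key}.out"), "w")
                    handles[key] = fh
                fh.write(line)
    finally:
        # Close every output file, even if reading failed part way.
        for fh in handles.values():
            fh.close()
    return sorted(handles)
```

The point of the sketch is that there is one sequential reader and exactly one writer per output file, so no output file ever has two writers competing for it.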

You suggested splitting the huge file into several smaller files so that each thread could work on a different part.

He pointed out that would mean he would have many threads writing to each of the output files.

You are suggesting that he use many pipes, plus another thread running a select loop to coalesce the records for each output file before writing them.

Let's say he has split the huge file into 10 parts and he runs 10 threads. Using your scheme, he would require one pipe for each of the 600 output files in each of the 10 threads; and another 600 threads running select loops to coalesce the records and write them to the 600 output files. So 610 threads and 6000 pipes!
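The arithmetic behind those numbers, as a small sketch (Python; the function name `resources_needed` is made up for illustration, and it assumes one coalescing thread per output file, as in the scenario above):

```python
def resources_needed(parts, outputs):
    """Count threads and pipes for the split-file-plus-pipes scheme.

    Assumes one reader thread per input part, one coalescing/writer
    thread per output file, and one pipe from every reader to every
    writer.
    """
    reader_threads = parts
    writer_threads = outputs
    pipes = parts * outputs
    return reader_threads + writer_threads, pipes


threads, pipes = resources_needed(parts=10, outputs=600)
print(threads, pipes)  # 610 6000
```

Note that the pipe count grows with the product of the two numbers, so splitting the input into more parts makes things worse, not better.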

And that's before we consider that he has exacerbated the problem by reading the input from 10 separate files concurrently, which will cause the read head to be dancing all over the disk just to get the input.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

In reply to Re^10: how to split huge file reading into multiple threads by BrowserUk
in thread how to split huge file reading into multiple threads by sagarika
