As mentioned previously, Perl offers the LWP modules (e.g. LWP::UserAgent), which can fetch all 5000 rar files from within a single process. There is also an Archive::Rar module, though I haven't used it and I'm not sure it would help you. Depending on how intricate the move/rename step is (e.g. are you changing the name of each data file in a large set, as well as putting it into a separate directory?), Perl could do this much faster than a shell script -- see my reply above.
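To make the "single process" point concrete, here is a rough sketch of the download step using LWP::UserAgent's mirror method. The file naming scheme (set_0001.rar and so on), the base URL and the destination directory are all made up for illustration, since I don't know how your archives are actually named or hosted.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical naming scheme: set_0001.rar .. set_5000.rar
sub archive_name {
    my ($n) = @_;
    return sprintf 'set_%04d.rar', $n;
}

# Fetch $count archives from $base_url into $dest_dir, all from this one
# process. Returns the names of any archives that could not be fetched.
sub fetch_all {
    my ($base_url, $dest_dir, $count) = @_;
    my $ua = LWP::UserAgent->new(timeout => 30);
    my @failed;
    for my $n (1 .. $count) {
        my $name = archive_name($n);
        # mirror() skips the download (HTTP 304) when the local copy is current
        my $res = $ua->mirror("$base_url/$name", "$dest_dir/$name");
        push @failed, $name unless $res->is_success || $res->code == 304;
    }
    return @failed;
}

# Only touch the network when called as: script.pl BASE_URL DEST_DIR
if (@ARGV == 2) {
    my @failed = fetch_all(@ARGV, 5000);
    warn "failed: $_\n" for @failed;
}
```

The same user agent object is reused for every request, so there is no per-file startup cost of launching a separate wget or curl.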

I believe it's very likely that "a series of sed filters", applied iteratively to thousands of files to alter their contents, would be slower than a single Perl script that applies all the filtering over the full list of files in one process. The File::Copy module might also compare quite favorably to "cp" commands in a shell script -- again, depending on how complicated the process is. (On the other hand, a single "rsync" job might be best for this last step.)
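As an illustration of the filtering idea, here is a minimal sketch in which one Perl process runs every filter over every file named on the command line. The three substitutions are placeholders I invented; I don't know what your actual sed filters do, so swap in your own.

```perl
use strict;
use warnings;

# Placeholder filters standing in for "a series of sed filters".
# Each one edits the line passed to it in place (via the @_ alias).
my @filters = (
    sub { $_[0] =~ s/\r$// },          # strip a DOS carriage return
    sub { $_[0] =~ s/\bfoo\b/bar/g },  # hypothetical word replacement
    sub { $_[0] =~ s/[ \t]+$// },      # trim trailing blanks
);

# Read one file into memory, apply every filter to every line, and write
# the result back to the same path, so no intermediate file is created.
sub filter_file {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot read $path: $!";
    my @lines = <$in>;
    close $in;

    for my $line (@lines) {
        $_->($line) for @filters;
    }

    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} @lines;
    close $out or die "Cannot finish writing $path: $!";
    return;
}

# One process handles the whole list of files.
filter_file($_) for @ARGV;
```

Run it as something like `perl filter.pl *.dat` (the script name is arbitrary). Note that it overwrites each file directly, so try it on copies first.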

When manipulating files by the thousands, it really makes a difference when you can run just a few distinct processes to do it all, rather than thousands of distinct processes. Also, whenever you can do anything to reduce the total number of "intermediate" files created and destroyed in the overall procedure (e.g. keeping whole archive sets in memory and/or doing in-place edits), you will find this to be worthwhile.
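For the move/rename step, the same "one process instead of thousands" idea might look like the sketch below, which uses File::Copy's move rather than launching mv or cp once per file. The renaming rule (data_123.txt becomes DEST/123/data.txt) is purely hypothetical; substitute whatever your real layout requires.

```perl
use strict;
use warnings;
use File::Copy     qw(move);
use File::Path     qw(make_path);
use File::Basename qw(basename);

# Hypothetical rule: NAME_ID.EXT is moved to DEST_ROOT/ID/NAME.EXT.
# Returns the new path, or nothing if the file name doesn't fit the pattern.
sub relocate {
    my ($src, $dest_root) = @_;
    my ($stem, $id, $ext) = basename($src) =~ /^(.+)_(\d+)(\.[^.]+)$/
        or return;
    my $dir = "$dest_root/$id";
    make_path($dir);                   # no-op if the directory already exists
    my $target = "$dir/$stem$ext";
    move($src, $target) or die "Cannot move $src to $target: $!";
    return $target;
}

# Usage: script.pl DEST_ROOT file1 file2 ...
if (@ARGV > 1) {
    my ($dest_root, @files) = @ARGV;
    relocate($_, $dest_root) for @files;
}
```

Files that don't match the pattern are silently skipped here; you may prefer to warn about them.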


In reply to Re: BASH vs Perl performance by graff
in thread BASH vs Perl performance by jcoxen
