Well, I am a bit puzzled by the idea that subroutine calls are eating the processor.

You are recursively traversing a subtree, opening every file and generating an MD5 checksum for each. That will consume a lot of processor time, because the math involved in calculating MD5s is CPU-intensive. The cost of a subroutine call is minuscule by comparison and is a complete red herring.

You say it is taking 10-15 minutes as if that is too long. How many files, and how big are they? It doesn't sound unreasonable to me.

Other than that, it is not clear to me exactly what problem you are asking for help with. I have your code, but I obviously cannot run it without creating a subdirectory tree that contains files with the names of those you are looking for, and I could not verify your timing without having the same number and sizes of files as you have.

The biggest problem I see with your code is that you read all the directory entries into an array at each level of recursion, and then recurse whenever you encounter a nested directory. Every array from every level above stays in memory until the recursion unwinds, so if your directories hold lots of files and/or the directory structure is very deep, you consume large amounts of memory as you descend the tree.

I think that perhaps your process is consuming so much memory that it is pushing your machine into swapping?

If you are determined to continue to use your own directory traversal routine, then you should avoid "slurping" the whole directory into an array. Instead, call readdir in a while loop and process one entry at a time. This will require that you avoid using a BAREWORD directory handle (like DIRECTORY) and use a lexical instead. Otherwise you will run into conflicts during recursion.
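For what it's worth, here is a minimal sketch of that approach, not a drop-in replacement for your code. The names walk_tree and md5_of_file are my own invention, and I am assuming you want an MD5 for every plain file and are happy to skip symlinks. Each level of recursion gets its own lexical handle ($dh), and entries are processed one at a time instead of being slurped:

```perl
use strict;
use warnings;
use Digest::MD5;
use File::Spec;

# Hypothetical helper: walk $dir and call $callback->($path) for each plain file.
sub walk_tree {
    my ( $dir, $callback ) = @_;

    # Lexical handle: each recursive call has its own $dh, so nothing clashes.
    opendir( my $dh, $dir ) or do {
        warn "Cannot open directory '$dir': $!";
        return;
    };

    # One entry at a time; no array holding the whole directory.
    while ( defined( my $entry = readdir $dh ) ) {
        next if $entry eq '.' or $entry eq '..';
        my $path = File::Spec->catfile( $dir, $entry );

        next if -l $path;    # assumption: skip symlinks to avoid loops
        if ( -d $path ) {
            walk_tree( $path, $callback );
        }
        elsif ( -f $path ) {
            $callback->($path);
        }
    }
    closedir $dh;
    return;
}

# Hypothetical helper: hex MD5 of one file, read in binary mode.
sub md5_of_file {
    my ($path) = @_;
    open( my $fh, '<', $path ) or die "Cannot open '$path': $!";
    binmode $fh;
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $digest;
}

# Usage: perl walk.pl /some/dir
if (@ARGV) {
    walk_tree( $ARGV[0], sub { print md5_of_file( $_[0] ), "  $_[0]\n" } );
}
```

Note that this keeps one directory handle open per level of depth, which is a far smaller cost than one array of names per level.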

If none of that previous paragraph makes sense to you, then you should probably consider using File::Find or similar instead.
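A rough sketch of the File::Find route, under the same assumption that you want an MD5 per plain file. The sub name md5_tree and the hash it returns are mine, purely for illustration; find() and the no_chdir and wanted options are the module's documented interface:

```perl
use strict;
use warnings;
use File::Find;
use Digest::MD5;

# Hypothetical helper: return a hashref of { path => md5 hex } for plain files under $root.
sub md5_tree {
    my ($root) = @_;
    my %digest_for;

    find(
        {
            no_chdir => 1,    # do not chdir into each directory while walking
            wanted   => sub {
                my $path = $File::Find::name;    # full path of the current entry
                return unless -f $path;
                open( my $fh, '<', $path ) or do {
                    warn "Cannot open '$path': $!";
                    return;
                };
                binmode $fh;
                $digest_for{$path} = Digest::MD5->new->addfile($fh)->hexdigest;
                close $fh;
            },
        },
        $root,
    );
    return \%digest_for;
}

# Usage: perl md5tree.pl /some/dir
if (@ARGV) {
    my $digests = md5_tree( $ARGV[0] );
    print "$digests->{$_}  $_\n" for sort keys %$digests;
}
```

File::Find does the traversal bookkeeping for you, so there is no recursion or directory handle for you to manage.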

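BTW, you should have use strict; (not use Strict;). Pragma names are case-sensitive. On a case-insensitive filesystem, use Strict; can load without any error and still leave strictures switched off.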


Examine what is said, not who speaks.
Silence betokens consent.
Love the truth but pardon error.

In reply to Re: Subroutine speed by BrowserUk
in thread Subroutine speed by prad_intel
