I've just run your program (with slight modifications) under Linux on a dual SMP P2-350 machine, on my home directory, whose subdirectories contain about 20 text files and quite a lot (about 500MB) of html files in several directories. The results amazed me. So I did run this test four times in a row, and the last three results were identical but really amazing :

t1:  7 wallclock secs ( 2.43 usr +  4.27 sys =  6.70 CPU)
t2:  7 wallclock secs ( 2.43 usr +  4.32 sys =  6.75 CPU)
t3: 14 wallclock secs ( 8.25 usr +  5.73 sys = 13.98 CPU)
t4:  7 wallclock secs ( 1.62 usr +  4.77 sys =  6.39 CPU)
t5:  1 wallclock secs ( 0.84 usr  0.21 sys +  0.00 cusr  0.01 csys =  0.00 CPU)

The trend we can see is, that everything is faster in general, about the factor 3 or 4, but what really is amazing is, how little time &t5(); takes, only 1 wallclock second. So I did interchange &t4() and &t5() to see if that result was order dependant :

...
t4:  1 wallclock secs ( 0.95 usr  0.18 sys +  0.00 cusr  0.01 csys =  0.00 CPU)
t5:  7 wallclock secs ( 1.75 usr +  4.65 sys =  6.40 CPU)

But it wasn't. This is really strange and sheds some new light on File::Find which I always considered clumsy, and which is one of the slower routines under Win32. Wonders of Perl :).

To see how the results would change, I then reran your test for files that match .html (while going through the source code, there were some things with your regular expressions - the ".txt" RE will match anything consisting of at least four letters with "txt" not at the start and the directory matching will leave out directories which start with a "." (so unix "hidden" directories will not be searched). I ran the test three times and threw away the first test results on about 500 MB of html files.

t1:  8 wallclock secs ( 2.59 usr +  4.65 sys =  7.24 CPU)
t2:  8 wallclock secs ( 2.47 usr +  4.66 sys =  7.13 CPU)
t3: 17 wallclock secs ( 8.65 usr +  5.90 sys = 14.55 CPU)
t4:  9 wallclock secs ( 1.67 usr +  5.42 sys =  7.09 CPU)
t5:  2 wallclock secs ( 1.04 usr  0.23 sys +  0.00 cusr  0.01 csys =  0.00 CPU)

And amazingly, the trend continues, with &t5() beating the rest by far, even though I had thought the whole results should have become console bound anyway, but that wasn't so.

I wonder what my tests under NT 4 will bring us :)


In reply to Benchmarks under Linux 2.2.x by Corion
in thread Getting a List of Files Via Glob by Zoogie

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.