I was giving this some thought while stuck in traffic on my commute this morning, and the list produced by your code has convinced me even more: I think ignoring the revision number is a problem.

As I looked over your list, I noticed a module of mine, IOC, at the bottom. Now, rather than be flattered by this, I know that its height on the list is quite artificial. I decided to adopt the XP idea of "release early and release often" with this module. The first released version (0.01) was on Oct. 15th of this year, and there have been 18 subsequent versions released, the last one on the 15th of Dec. At least 10 versions have been released within the range of this log file (Nov. 1 - Dec. 15th). When I ran my script (see above) like this:

grep 'IOC' ~/Desktop/cpan-gets | perl test.pl
I got the following output:
+---------------------------------------
| Total Downloads by Module
+------+--------------------------------
|  609 | IOC
+---------------------------------------
| Total Downloads by Distro
+------+--------------------------------
|   10 | IOC-0.06.tar.gz
|   10 | IOC-0.01.tar.gz
|   10 | IOC-0.17.tar.gz
|   10 | IOC-0.03.tar.gz
|   10 | IOC-0.04.tar.gz
|   10 | IOC-0.05.tar.gz
|   11 | IOC-0.02.tar.gz
|   18 | IOC-0.07.tar.gz
|   44 | IOC-0.09.tar.gz
|   46 | IOC-0.13.tar.gz
|   50 | IOC-0.12.tar.gz
|   52 | IOC-0.14.tar.gz
|   54 | IOC-0.10.tar.gz
|   59 | IOC-0.08.tar.gz
|   64 | IOC-0.15.tar.gz
|   66 | IOC-0.11.tar.gz
|   85 | IOC-0.16.tar.gz
+------+--------------------------------
Clearly this module is not one of the top 100 on CPAN.

I think we need to give some thought to how to include revisions in the analysis. My first thought is to take the number of revisions found on the list and use that to somehow weight the results: basically, the more revisions, the less weight. Another thought is to somehow account for the number of downloads per revision. The fact that each revision is being downloaded shows that someone is following the development of the module, and that should be taken into account.
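To make the weighting idea a bit more concrete, here is a rough sketch using the IOC numbers from the output above. Everything about the scoring is an assumption for illustration only: the helper names (distro_to_module, revision_stats) are hypothetical, the regex for stripping the version is deliberately simplistic, and "total divided by number of revisions" is just one possible weighting, not something anyone has settled on.

```perl
use strict;
use warnings;

# Per-distro download counts, copied from the output above.
my %downloads_by_distro = (
    'IOC-0.01.tar.gz' => 10, 'IOC-0.02.tar.gz' => 11, 'IOC-0.03.tar.gz' => 10,
    'IOC-0.04.tar.gz' => 10, 'IOC-0.05.tar.gz' => 10, 'IOC-0.06.tar.gz' => 10,
    'IOC-0.07.tar.gz' => 18, 'IOC-0.08.tar.gz' => 59, 'IOC-0.09.tar.gz' => 44,
    'IOC-0.10.tar.gz' => 54, 'IOC-0.11.tar.gz' => 66, 'IOC-0.12.tar.gz' => 50,
    'IOC-0.13.tar.gz' => 46, 'IOC-0.14.tar.gz' => 52, 'IOC-0.15.tar.gz' => 64,
    'IOC-0.16.tar.gz' => 85, 'IOC-0.17.tar.gz' => 10,
);

# Hypothetical helper: strip "-<version>.tar.gz" to get the module name.
# Real CPAN distro names are messier than this regex assumes.
sub distro_to_module {
    my ($distro) = @_;
    (my $name = $distro) =~ s/-v?[\d._]+\.tar\.gz$//;
    return $name;
}

# Hypothetical helper: total downloads, revision count, and the single
# most-downloaded revision for each module.
sub revision_stats {
    my ($downloads) = @_;
    my %stats;
    for my $distro (keys %$downloads) {
        my $count = $downloads->{$distro};
        my $s = $stats{ distro_to_module($distro) }
            ||= { total => 0, revisions => 0, peak => 0 };
        $s->{total} += $count;
        $s->{revisions}++;
        $s->{peak} = $count if $count > $s->{peak};
    }
    return \%stats;
}

my $stats = revision_stats(\%downloads_by_distro);
for my $module (sort keys %$stats) {
    my $s = $stats->{$module};
    # Assumed weighting: average downloads per revision.
    printf "%-6s total=%d revisions=%d per-revision=%.1f peak=%d\n",
        $module, $s->{total}, $s->{revisions},
        $s->{total} / $s->{revisions}, $s->{peak};
}
```

For IOC this gives a total of 609 over 17 revisions, so about 35.8 downloads per revision with a peak of 85 for a single revision. Either of those figures seems like a more honest measure of interest than the raw 609.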

In the end I agree: coming up with this top 100 is going to be a mixture of art and science.

-stvn

In reply to Re^2: Help update the Phalanx 100 by stvn
in thread Help update the Phalanx 100 by petdance
