in reply to Gathering module usage statistics

To my mind, fascination with number of downloads is like fascination with XP. Large amounts of both might provide an ego boost, but the correlation of both with quality of product is weak. Even if you could get a reliable count, what would it tell you?

Personally, I would have to look hard at a module that uploads personal information to the author before I used it. If you are interested in the breadth of systems that your code is being subjected to, try looking into the CPAN Testers. They run smoke tests on a wide range of systems and can provide invaluable feedback.

-Mark

Re: Re: Gathering module usage statistics
by dragonchild (Archbishop) on May 04, 2004 at 19:30 UTC
    I support three templating modules that work with HTML::Template. I also have a very limited amount of time to do open-source work. If I know that Excel::Template, for instance, was downloaded 1500 times in the past month, but Graph::Template was only downloaded 3 times, I know where I'll be putting the few hours I have. And, vice versa.

    Also, if I know that a bunch of people on Darwin are downloading one of my modules, I'll work a little harder to get a Darwin testing platform. But, if I know that not a single VMS user has downloaded it, I won't care so much.

    The other point is that I, as the user, would like this information. How heavily a module is installed correlates, if only weakly, with how heavily it is used. If it's heavily used, then it's more likely to be actively supported.

    Is it a strong correlation? Probably not. Is it more info than we have? Yes.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

    I shouldn't have to say this, but any code, unless otherwise stated, is untested

      If I know that Excel::Template, for instance, was downloaded 1500 times in the past month, but Graph::Template was only downloaded 3 times, I know where I'll be putting the few hours I have.

      The problem here is that you don't know why Graph::Template was only downloaded 3 times. Speaking purely hypothetically, maybe people looked at the documentation of Graph::Template, decided it sucked, and moved on to something else. But the people looking at Excel::Template (which targets an almost completely different format, and thus likely has a different userbase) thought it was pretty good as it is and use it all the time. In that case, you'd probably want to put more effort into fixing Graph::Template.

      ----
      : () { :|:& };:

      Note: All code is untested, unless otherwise stated

        That's one way to look at it. Here's another:

        I once was a participant in a seminar on testing strategies. The speaker was discussing various ways of allocating testing time. He touched on one strategy which basically went like this:

        1. 90% of all usage of Quicken will be in the ledger.
        2. 90% of all user perception of our product will be based on the performance of the ledger.
        3. Maybe we should focus a larger proportion of our testing of Quicken on the ledger.

        This isn't to say that the other 90% of the code shouldn't be tested. But the part that accounts for a greater share of consumer perception should be tested more heavily.

        This applies directly to open-source development. If a module isn't being downloaded, I should not spend my extremely small amount of time on it. Instead, I should focus where I get the greater bang for the buck.

        Yes, Graph::Template may have all sorts of reasons why it isn't being downloaded. But, the fact is that it's not being downloaded (in your example). So, until it is, or I have more free time, it's not a priority.

        It's basic triage of development time.

        ------
        We are the carpenters and bricklayers of the Information Age.

        Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

        I shouldn't have to say this, but any code, unless otherwise stated, is untested

Re: Re: Gathering module usage statistics
by Juerd (Abbot) on May 04, 2004 at 20:08 UTC

    Even if you could get a reliable count, what would it tell you?

    It would tell me that what I do has meaning.

    Personally, I would have to look hard at a module that uploads personal information to the author, before I used it.

    So would I, but I do not consider the OS name and Perl version personal information.

    Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

      When information like Perl and OS characteristics is uploaded, the originating IP address is also uploaded. What could one do with this information? You can sometimes discover who owns this machine, starting with the IP.
      • Ah, I know an exploit for perl 5.x.x. I'll bang on this IP to get root privileges.
      • What? He's still using a 2.2 kernel? Wait till the chat forum hears about this!
      • This fellow is using Linux? I bet he'd be a good addition to the targeted list I am selling to Microsoft.
      Now, I am not saying that you would do such dastardly deeds, but giving out such information does potentially decrease security.

      If you implement something like this, it really should be opt-in, as people do install using CPAN, sometimes in an unattended fashion.
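      A minimal sketch of what opt-in could look like, assuming the question is asked from the module's Makefile.PL. The helper names wants_to_report and build_report are hypothetical, and the actual upload is deliberately left out. It leans on prompt() from ExtUtils::MakeMaker, which falls back to the supplied default when the install is not interactive or PERL_MM_USE_DEFAULT is set, so an unattended install answers "n" and reports nothing.

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker;    # exports prompt() by default

# Hypothetical helper: ask the installer before reporting anything.
# The default answer is 'n', so unattended installs opt out.
sub wants_to_report {
    my $answer = prompt(
        'Send your OS name and Perl version to the module author?', 'n');
    return $answer =~ /^y/i ? 1 : 0;
}

# Hypothetical helper: collect only the two items under discussion,
# the OS name ($^O) and the Perl version ($^V).
sub build_report {
    return {
        os   => $^O,
        perl => sprintf('%vd', $^V),
    };
}

if (wants_to_report()) {
    my $report = build_report();
    # A real module would transmit the report here; this sketch only prints it.
    print "Would send: os=$report->{os} perl=$report->{perl}\n";
}
else {
    print "Not sending anything.\n";
}
```

      Nothing leaves the machine unless the person installing types "y" at the prompt.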

      -Mark

        the originating IP address is also uploaded.

        Not necessarily, but for this discussion, let's assume direct connections.

        You can sometimes discover who owns this machine, starting with the IP.

        I can do the same by scanning networks. Why would I wait for you to install my module? :)

        Ah, I know an exploit for perl 5.x.x. I'll bang on this IP to get root privileges.

        Perl is not a network service, and network services written in Perl usually cannot be identified as such.

        What? He's still using a 2.2 kernel? Wait till the chat forum hears about this! This fellow is using Linux? I bet he'd be a good addition to the targeted list I am selling to Microsoft.

        Unrelated to the module installation.

        decrease security

        If you really think connecting to another host and thereby letting the other party know your IP address decreases security, please DISCONNECT IMMEDIATELY! THE WEB IS A DANGEROUS PLACE!

        sometimes in an unattended fashion.

        That's their fault. They shouldn't do that. Those who choose to do so take enormous risks already. They already implicitly agree to whatever license the module has, as not every module has the same license.

        Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }