pileofrogs has asked for the wisdom of the Perl Monks concerning the following question:

Greetings, monks, both great and small!

I'm a sysadmin and I want to keep track of what modules get used and which ones don't. The obvious way to do that is to wander through them and check the access times on each file. Unfortunately, if I use cpan's autobundle to make a list of all modules, it reads each file's VERSION which updates the access time, making access times useless for my purposes.

What do you think is the best way to go here? I'm thinking:

Just to restate in case I've been unclear (as usual), I want to be able to make a list of all my installed Perl modules and a list of which ones get used the most. I want to be able to do this on a regular (nightly?) basis.
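For illustration, here is a rough sketch of the atime scan described above, written in Python rather than Perl. It relies on the fact that a plain stat() call reads only inode metadata and so does not touch the access time, unlike opening each file to read its VERSION. The function name `module_atimes` and the library path are hypothetical, and note that filesystems mounted with `noatime` or `relatime` update access times rarely or not at all, so check the mount options before trusting the numbers.

```python
import os
import time


def module_atimes(lib_dirs):
    """Map each .pm file under lib_dirs to its last access time (epoch seconds)."""
    atimes = {}
    for lib_dir in lib_dirs:
        for root, _dirs, files in os.walk(lib_dir):
            for name in files:
                if name.endswith(".pm"):
                    path = os.path.join(root, name)
                    # stat() reads metadata only; it never opens the file,
                    # so the file's own atime is left untouched.
                    atimes[path] = os.stat(path).st_atime
    return atimes


if __name__ == "__main__":
    # Hypothetical path: substitute the directories listed in your @INC.
    found = module_atimes(["/usr/share/perl5"])
    for path, atime in sorted(found.items(), key=lambda kv: kv[1]):
        print(time.strftime("%Y-%m-%d", time.localtime(atime)), path)
```

Run nightly from cron, the output could simply be appended to a log and compared over time.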

Thanks!
--Pileofrogs

Re: tracking module usage
by tirwhan (Abbot) on Dec 03, 2008 at 22:42 UTC

    If you're on Linux (with kernel 2.6.13 or newer) you might be able to use inotify via Linux::Inotify2 for this. You'd have to write a daemon which runs continually and logs/counts accesses as they occur, but this should not be a significant drain on system resources. (Disclaimer: I've not used Linux::Inotify2, so don't know how well it works, but at least it seems to be somewhat actively maintained). If you're willing to use a non-perl tool, maybe inotifywatch from the inotify-tools will do you.
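To give an idea of what such a daemon involves, here is a minimal sketch. It is in Python, calling the libc inotify functions through ctypes, and is not a Linux::Inotify2 example. The names `parse_events` and `watch_dirs` are made up for this sketch. It assumes Linux, and it watches only the directories it is given: inotify watches are not recursive, so a real daemon would need one watch per subdirectory.

```python
import ctypes
import ctypes.util
import os
import struct
from collections import Counter

IN_OPEN = 0x00000020  # inotify mask bit: a file was opened

# Header of struct inotify_event: int wd; uint32 mask, cookie, len.
EVENT_HEADER = struct.Struct("iIII")


def parse_events(buf):
    """Yield (wd, mask, name) for each inotify event packed in buf."""
    offset = 0
    while offset + EVENT_HEADER.size <= len(buf):
        wd, mask, _cookie, name_len = EVENT_HEADER.unpack_from(buf, offset)
        offset += EVENT_HEADER.size
        # The name is NUL-padded up to name_len bytes.
        name = os.fsdecode(buf[offset:offset + name_len].rstrip(b"\0"))
        offset += name_len
        yield wd, mask, name


def watch_dirs(dirs, counts):
    """Block forever, counting opens of .pm files directly inside dirs."""
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    fd = libc.inotify_init()
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init failed")
    wd_to_dir = {}
    for d in dirs:
        wd = libc.inotify_add_watch(fd, os.fsencode(d), IN_OPEN)
        if wd < 0:
            raise OSError(ctypes.get_errno(), "inotify_add_watch failed: " + d)
        wd_to_dir[wd] = d
    while True:
        for wd, _mask, name in parse_events(os.read(fd, 4096)):
            if name.endswith(".pm"):
                counts[os.path.join(wd_to_dir[wd], name)] += 1


if __name__ == "__main__":
    # Hypothetical path: substitute the directories listed in your @INC.
    watch_dirs(["/usr/share/perl5"], Counter())
```

A real daemon would also write the counts out periodically instead of keeping them only in memory.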

    Mind you, I do see a problem with the whole approach concerning long-running processes using your Perl modules. For example, if this is a web server running Apache and mod_perl, it will only access the Perl module once (the first time it is loaded) and not change the access time of the module again unless Apache is restarted (which for a well-maintained server could be weeks), regardless of how often the module gets called. For this kind of scenario you'll need a different approach.

    Update: Ah, I see almut beat me to it ;-). Just a note concerning performance: from what I know of inotify, I'd guess that having a process watch all Perl modules (with fewer than many thousands of accesses a day) would be less resource intensive than going through the whole of @INC once a night. Granted, you probably don't care about resources used at night, but still, inotify is more elegant.

    And to be honest, now that you've said why you want to do this, I'd question the wisdom of wanting to do it anyway. Are your servers really so constrained for disk space that you need to cut down on the number of Perl modules installed? On any modern machine the disk space consumed thereby should be far too little to really make it worth the bother.


    All dogma is stupid.

      Hmm.... Excellent point about the long running processes...

      It's not about disk space.

      I have to upgrade my systems' OS on a regular basis, and because I can't have downtime for more than a few minutes, I build a new system with the new OS version and swap it with the old one. I need the new system to act just like the old one, so obviously I want to install all the Perl modules I had on the old one. Except, it can be a pain trying to install a massive autobundle when certain flaky modules don't install correctly or fail to list the correct dependencies, etc. So, it's nice to leave out any modules I don't actually need.

      Also, it's nice to keep things tidy, and it's just fun to know what gets used.

      --Pileofrogs
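A small sketch of that trimming step, in Python for illustration: turn the paths of the .pm files that were seen in use into module names, then keep only the installed modules that appear in that set. The names `path_to_module` and `modules_to_reinstall` are hypothetical. The sketch assumes each path lies under a known @INC directory, and it ignores dependencies, so a module loaded only by another module still has to show up as used on its own.

```python
import os


def path_to_module(path, lib_dir):
    """Turn <lib_dir>/Foo/Bar.pm into the module name Foo::Bar."""
    rel = os.path.relpath(path, lib_dir)
    return rel[:-len(".pm")].replace(os.sep, "::")


def modules_to_reinstall(installed, used):
    """Return the installed module names that were also seen in use, sorted."""
    return sorted(set(installed) & set(used))
```

For example, with `Foo::Bar`, `Baz` and `Qux` installed and only `Foo::Bar` and `Qux` seen in use, `Baz` would be left out of the list for the new machine.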

Re: tracking module usage
by almut (Canon) on Dec 03, 2008 at 22:25 UTC

    Don't know how well it would scale for a large number of files, but just to throw in another idea: maybe you could try inotify (or the respective Perl module Linux::Inotify2).  In case it doesn't turn out to become a resource hog... it might give much more detailed access statistics than a nightly scan of atimes would provide.

      That's a good suggestion. If I don't find myself happy with atimes, I may go that route. I don't need very good resolution, just a general idea. Basically, when I do an OS upgrade or whatever, I don't want to keep every single PM, just the ones I actually use. So I only need to know: has this PM been used at all in the last X days/months/whatever?

      Good suggestion though.
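The "used at all in the last X days" test boils down to comparing each access time with a cutoff. A minimal sketch in Python, assuming the access times have already been collected into a mapping from path to epoch seconds (the function name `unused_since` and the sample paths are hypothetical):

```python
import time


def unused_since(atimes, days, now=None):
    """Return, sorted, the paths whose last access is more than `days` days old."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # 86400 seconds per day
    return sorted(path for path, atime in atimes.items() if atime < cutoff)


if __name__ == "__main__":
    # Made-up data: one module read just now, one untouched for a year.
    sample = {"Fresh.pm": time.time(), "Stale.pm": time.time() - 365 * 86400}
    print(unused_since(sample, days=90))
```

Anything the function returns is a candidate for leaving off the new system.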