Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I want to write Perl modules that keep a dbm file (or other disk-based hash) of the programs that use them, so I can find out "who uses this code?"

I have perl 5.10 and want to use the "best" dbm for the job.

So, what is the best dbm to use? gdbm?

I think I would store keys along the lines of "__PACKAGE__ $FindBin::Bin/$FindBin::Script" -> localtime().

And I would remove keys with time values sufficiently old, say older than 2 years.

I need to support concurrency to a degree, since programs that use the same module may run simultaneously. It's okay if a record occasionally doesn't get written (I just want a way to inspect who the likely consumers of a module's code are). It is *not* okay if the dbm gets corrupted or "breaks" the rest of the code.

Maybe I wrap the whole thing in a module that uses caller() to keep track... "use ModuleAuditor;" Is there an existing CPAN solution? Thanks, Wise Monks!
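Something like this rough sketch is what I have in mind (GDBM_File, the db path, and the ModuleAuditor name are just guesses on my part, not an existing CPAN module):

# ModuleAuditor.pm -- a rough sketch, not a real CPAN module.
# Assumes GDBM_File is available and /var/tmp/module_audit.gdbm is writable.
package ModuleAuditor;

use strict;
use warnings;
use GDBM_File;
use FindBin;

my $DB = '/var/tmp/module_audit.gdbm';    # hypothetical shared location

sub import {
    my ($pkg) = caller;    # the package that said "use ModuleAuditor;"
    # Everything lives in an eval so a failure can never break the caller.
    eval {
        # GDBM_WRCREAT takes a writer lock; if another process holds it,
        # tie simply fails and we lose one record, which is acceptable.
        tie my %audit, 'GDBM_File', $DB, &GDBM_WRCREAT, 0644
            or die "tie: $!";
        # Store epoch seconds rather than localtime() so pruning entries
        # older than 2 years becomes a simple numeric comparison.
        $audit{"$pkg $FindBin::Bin/$FindBin::Script"} = time;
        untie %audit;
    };
}

1;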


Re: best dbm for small dictionaries? write locking or concurrency needed
by Corion (Patriarch) on Oct 14, 2010 at 14:33 UTC

    A very simplistic approach would be to simply append all information to a log file and keep the reporting separate. This means you won't really have to deal with locking, as long as you append "short" lines to the logfile, since those are written atomically. The downside, of course, is that reporting on the programs becomes a bit more tedious, and you may have to write a script that runs every week to prune unwanted information from the logs.
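    A rough sketch of what I mean (the log path and helper name are made up):

    # Append-only audit log; assumes /var/tmp/module_audit.log is writable.
    use strict;
    use warnings;
    use FindBin;

    sub log_module_use {
        my ($pkg) = caller;    # the package whose use we are recording
        # Opening in append mode means each print seeks to the end and
        # writes in one step; a single short line won't interleave with
        # lines from other processes, so no explicit locking is needed.
        open my $log, '>>', '/var/tmp/module_audit.log' or return;
        print {$log} time(), " $pkg $FindBin::Bin/$FindBin::Script\n";
        close $log;
    }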

      The problem is that some programs get run frequently and some infrequently. I'd have to prune at a certain length, but the last 50k lines of output might be dominated by a single program. I guess I could prune using "uniq" to coalesce duplicate lines... not exactly what I wanted, however.
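      Something along these lines, say (assuming the timestamp-first line format from the sketch above):

      # prune.pl -- keep only the newest entry per "package program" pair.
      use strict;
      use warnings;

      my %latest;
      while (<>) {
          my ($when, $key) = split ' ', $_, 2;   # epoch time, then the rest
          $latest{$key} = $when
              if !exists $latest{$key} || $when > $latest{$key};
      }
      print "$latest{$_} $_" for sort keys %latest;   # $_ still ends in "\n"

      Run as "perl prune.pl module_audit.log > module_audit.new" and swap the files.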
Re: best dbm for small dictionaries? write locking or concurrency needed
by roboticus (Chancellor) on Oct 14, 2010 at 15:34 UTC

    You could use a directory tree to hold your data, and then have a little thing like this to log your module usage:

    # plugh.pl
    my @module  = caller;      # info about the module that required us
    my @program = caller(1);   # info about the program that used the module

    # $module[1] and $program[1] are file paths and contain '/', so flatten
    # them before using them as a directory and a file name:
    (my $mod  = $module[1])  =~ tr{/}{_};
    (my $prog = $program[1]) =~ tr{/}{_};

    mkdir "/module/logger/base/path/$mod";   # one directory per module
    open X, '>', "/module/logger/base/path/$mod/$prog"
        or die "Can't log module $module[1] use by program $program[1]\nReason: $!\n";
    print X time(), "\n";   # content doesn't really matter... whatever you like
    close X;

    Then your modules need only do something like:

    # MyModule
    ...
    require "plugh.pl";
    ...

    Obviously, if you want to use this, you'll have to do a little setup & such. The reason I proposed it is that you don't need to rely on any modules since the OS would provide your "database". You could even use find to do your reporting...
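    For instance, a reporting pass with core File::Find might look like this (roughly what "find /module/logger/base/path -type f" would list; the base path is the one assumed above):

    # report.pl -- list which programs touched which module, per the
    # directory layout above (one directory per module, one file per program).
    use strict;
    use warnings;
    use File::Find;

    my %programs;
    find(sub {
        return unless -f;
        push @{ $programs{$File::Find::dir} }, $_;   # $_ is the file name here
    }, '/module/logger/base/path');

    for my $module (sort keys %programs) {
        print "$module is used by:\n";
        print "    $_\n" for sort @{ $programs{$module} };
    }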

    ...roboticus

    Yes, it's small and crufty--but simple, too....

      I like this solution, except that it potentially creates lots of little files.
Re: best dbm for small dictionaries? write locking or concurrency needed
by jethro (Monsignor) on Oct 14, 2010 at 16:58 UTC

    It seems you only want the last time script X used your module Y. Not usage statistics like "Y was accessed 2050 times from X".

    If that is the case, roboticus's solution seems a good fit. On average, scripts will use (let's say) 5 modules, so the number of small files is just 5 times the number of scripts you have in use.

    But if you want usage statistics, the log file seems the better idea. Just accumulate the numbers into a summary file whenever you prune. Another advantage of the log is that you can easily gather statistics from more than one computer, since it is easy (on Linux, at least) to send log messages to a central machine.
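    A hypothetical summary pass over such a log, for instance:

    # summarize.pl -- tally how often each "package program" pair appears.
    use strict;
    use warnings;

    my %count;
    while (<>) {
        my (undef, $key) = split ' ', $_, 2;   # drop the timestamp field
        chomp $key;
        $count{$key}++;
    }
    printf "%6d  %s\n", $count{$_}, $_
        for sort { $count{$b} <=> $count{$a} } keys %count;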