Recently, the Perl module system has been receiving a major overhaul. CPANPLUS, Module::Build, and the recent MakeMaker refactoring are all examples of this. But one thing that hasn't changed is storage of metadata. Metadata in Perl still consists of setting $VERSION in your module and plugging some values into your Makefile.PL (or Build.PL).

I've recently written a module that could change that, but I'm not sure if I should pursue it.

It's not even in alpha--I haven't actually tried it out--but here's what its metadata would look like:

use Module::Metadata
    author  => 'BRENTDAX',
    version => ($VERSION=0.01),   #compatibility
    license => 'perl',
    threads => 'none',
    thrsafe => 'unknown',
    depends => {
        perl => 5.005   #might work in earlier versions
    };

No fancy attributes or formats involved--just a use line. There isn't a global or per-machine metadata database, either--the metadata just sits in the module source, ready to be extracted. Module::Metadata would know how to extract its own metadata from any module text, somewhat like the $VERSION extraction code that's in half a dozen modules, but more sophisticated and general.

Obviously, to make it work as well as it could I would need cooperation from MakeMaker, Module::Build, and CPAN and PAUSE (to modify the indexing scripts). Is this idea worth pursuing, or is it a total pipe dream?

=cut
--Brent Dax
There is no sig.

Re: Module::Metadata
by adrianh (Chancellor) on Jan 07, 2003 at 09:44 UTC

    Getting at meta-data is a useful thing. However, the meta-data is usually associated with distributions rather than individual modules.

    You can get at the info with things like Module::Info and CPANPLUS::Backend, and (IMHO) extending these APIs and making them easier to use is the way to go, rather than adding per-module information.

      Getting at meta-data is a useful thing. However, the meta-data is usually associated with distributions rather than individual modules.
      I dunno about that. Larry has at least hinted that Perl 6 will have metadata attached to each module. Besides, what if Foo::A is (say) threadsafe but Foo::B isn't? (This isn't as far-fetched as it seems--an example would be Foo::FastC vs. Foo::SafePerl.) Or what if you have half a dozen modules in your distro, each of which has a different author, and you want the machine to be able to list them all?

      Of course, if you did want distro-wide metadata, you could set it up in whatever you consider to be the main file. And I can make special allowances for distro-wide data, if it'll help.

      You can get at the info with things like Module::Info and CPANPLUS::Backend, and (IMHO) extending these APIs and making them easier to use is the way to go, rather than adding per-module information.
      On this point, I understand what you're saying, but I don't know if I agree with it. Module::Info is (currently) more for information the computer can detect by itself--whether it uses eval STRINGs or gotos, not whether it's threadsafe or who wrote it. In its current role, at least, it's basically used to probe the computer's installed module set. A useful thing to be sure, but a different one than what I'm talking about.

      CPANPLUS is almost a neutral party in this--it gets its information from the CPAN, so we should really be talking about what the CPAN uses. It already has things organized by author, and it scans for a $VERSION to get the version number, but most of the other stuff indicated by this module it understands poorly at best. Dependencies have always been a MakeMaker-level affair (with CPAN(PLUS)? detecting and responding to such errors) and AFAIK threadsafety and thread use have never been indicated in a computer-readable form. Module::Build indicates licensing terms, but it so far hasn't been widely adopted.

      Part of the effort to make this module workable would be to get all these other modules talking to it. I can imagine Module::Info interfacing with Module::Metadata and presenting the metadata as part of its own interface--it just makes too much sense--but I can't imagine Module::Info doing this by itself.

      =cut
      --Brent Dax
      There is no sig.

        Fair points :-)

        My brain seems to have been disengaged when I read your original post - you're talking about setting metadata, and I was thinking about getting it.

        Getting the relationship between the per-module and per-distribution meta-data right could be interesting (e.g. modules that are in more than one distribution, modules that have different licences depending on the distribution, etc.)...

Re: Module::Metadata
by Jenda (Abbot) on Jan 07, 2003 at 13:11 UTC

    Well ... this is an interesting idea. It IMHO would be better to keep those in a specially formatted =pod section. (Like =begin PerlCtrl for ActiveState's PerlCtrl or =begin PDKcompile for my PDKcompile).

    That way the data are easy to find, yet they do not get in the way unless you search for them. Your format would require the module to be installed everywhere where you want to use the module. If you put the data into POD, then you only need some module for their extraction if you really want the metadata.

    Jenda

      Your format would require the module to be installed everywhere where you want to use the module. If you put the data into POD, then you only need some module for their extraction if you really want the metadata.
      Hmm, that's a good point. OTOH, if you're worried about not having the module, it's always possible to support something like this:
      eval { register Module::Metadata ... };
      That way it doesn't even try to save the metadata unless the module is already loaded.
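A minimal sketch of what that guard does, assuming a class method called register (the method name comes from the line above; My::Metadata is a hypothetical stand-in that is deliberately not loaded here). The method call dies at run time when the class isn't there, and the eval block swallows that:

```perl
use strict;
use warnings;

our $VERSION = 0.01;

# My::Metadata is hypothetical and intentionally absent, so the method
# call below dies with "Can't locate object method ..." inside the eval.
my $registered = eval {
    My::Metadata->register(
        author  => 'BRENTDAX',
        version => $VERSION,
    );
    1;    # reached only if register() ran without dying
};

print $registered
    ? "metadata registered\n"
    : "metadata module not loaded; carrying on without it\n";
```

Note that $VERSION has to be set separately here, since nothing sets it for you when the metadata module is missing.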

      There's also the question of the version number. If you give Module::Metadata the version number, presumably it should set $VERSION so you can say use Module 1.0.

      I haven't thought out all the implications of this, but I could provide a "micro" implementation of this module that searched for the full version elsewhere in @INC and, if it couldn't find it, just set up the version number. Such a module would be perhaps a kilobyte.
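Roughly, the version-setting half of such a "micro" module might look like the sketch below. The package names are made up for illustration, the search of @INC for the full module is left out, and import() is called from a BEGIN block only so that everything fits in one file (a real module would be pulled in with a use line):

```perl
use strict;
use warnings;

package My::MicroMetadata;    # hypothetical stand-in for the "micro" module

# import() receives the key => value list that follows the module name
# on the "use" line.
sub import {
    my ($class, %meta) = @_;
    return unless defined $meta{version};

    my $caller = caller;
    no strict 'refs';
    # Set $VERSION in the calling package unless it is already set.
    ${"${caller}::VERSION"} = $meta{version}
        unless defined ${"${caller}::VERSION"};
    return;
}

package My::Module;           # hypothetical module using the stub

# Stands in for: use My::MicroMetadata author => ..., version => ...;
BEGIN {
    My::MicroMetadata->import(
        author  => 'BRENTDAX',
        version => 0.01,
        license => 'perl',
    );
}

package main;

print "My::Module version: ", My::Module->VERSION, "\n";

# The same check that "use My::Module 0.01" performs; dies if too old.
My::Module->VERSION(0.01);
```

Everything except the version is simply ignored by the stub, which is what keeps it small.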

      =cut
      --Brent Dax
      There is no sig.

        BEGIN eval require or similar constructs are a solution, but IMHO that gets in the way of laziness. On the other hand, every good module should have at least a minimum of POD, and the metadata, conceptually, is nothing other than documentation. So it'd be only natural to store it in the POD, which I'm going to write anyway. After some thinking, it seems to me the format could be very simple - rather than cooking up some special syntax, just use the key => value list you intended for Module::Metadata's import to expect. It would effectively be a section of POD-commented-out Perl code - something like this:
        =begin metadata

        author  => 'BRENTDAX',
        version => 0.01,
        license => 'perl',
        threads => 'none',
        thrsafe => 'unknown',
        depends => { perl => 5.005 },

        =end metadata
        Note how this also eliminates the source filtering headaches since a POD parser can unambiguously extract the section, unlike any manual attempts to find the semicolon which terminates a use statement. The data found could be reval'd just as per the example in your other post. A metadata-aware POD browser could effortlessly present this information along with the documentation, too.
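A rough sketch of that extraction, under some assumptions: a plain regex stands in for a real POD parser, the function names are invented for illustration, and the Safe compartment uses the default opcode mask rather than a hand-tuned restrictive one.

```perl
use strict;
use warnings;
use Safe;

# Return the text between "=begin metadata" and "=end metadata", or undef.
# A regex is used here for brevity; a real POD parser would be more robust.
sub metadata_text {
    my ($source) = @_;
    return $source =~ /^=begin[ \t]+metadata[ \t]*\n(.*?)^=end[ \t]+metadata\b/ms
        ? $1
        : undef;
}

# Evaluate the key => value list inside a Safe compartment and return
# the resulting hash reference (or nothing if there is no section).
sub extract_pod_metadata {
    my ($source) = @_;
    my $text = metadata_text($source);
    return unless defined $text;

    my $cpt  = Safe->new;                  # default opcode mask
    my $meta = $cpt->reval("+{ $text }");
    die "Could not evaluate metadata: $@" if $@;
    return $meta;
}

my $source = <<'END_SOURCE';
package Foo;

=begin metadata

author  => 'BRENTDAX',
version => 0.01,
license => 'perl',
depends => { perl => 5.005 },

=end metadata

=cut

1;
END_SOURCE

my $meta = extract_pod_metadata($source);
print "$_ => $meta->{$_}\n"
    for grep { !ref $meta->{$_} } sort keys %$meta;
```

The module itself never gets compiled or run; only the text inside the section is evaluated.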

        Makeshifts last the longest.

Re: Module::Metadata
by theorbtwo (Prior) on Jan 07, 2003 at 22:31 UTC

    IMHO, it's important to be able to extract the metadata without running the module, for several reasons: security, speed, interoperability. The last requires some explanation: if a module is only designed to work on Win32, it should be possible to detect that from the given metadata easily, without having to code around it.

    This is, of course, in direct conflict with having modules be able to determine their own metadata dynamically. I'm inclined to sacrifice that completely, but most people don't seem to see it that way.


    Warning: Unless otherwise stated, code is untested. Do not use without understanding. Code is posted in the hopes it is useful, but without warranty. All copyrights are relinquished into the public domain unless otherwise stated. I am not an angel. I am capable of error, and err on a fairly regular basis. If I made a mistake, please let me know (such as by replying to this node).

      Because it's in one use call, it's relatively easy to extract--just pull out everything between use Module::Metadata and the next semicolon, rework it into something a bit safer, and eval it. Or even better, reval it. That's the approach the module's extract_metadata function takes--it constructs a very restrictive Safe container, massages the code into a call to Module::Metadata->new, and evaluates it in the Safe container.

      $safe->permit_only(qw(:base_core :base_mem));
      $safe->deny(qw(repeat range));

      The actual code is a bit more complicated than this, because it handles nesting and (basic) quoting correctly in case a future metadata field accepts a coderef or something, but that's the gist of it. It also responds to use Module::Metadata 1.0 (version numbering) correctly by calling Module::Metadata->VERSION.
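For illustration only, here is a naive sketch of the "everything up to the next semicolon" approach. It is not the module's extract_metadata: it ignores nesting and quoting (a semicolon inside a string or comment would cut the text short), it builds a plain hash reference instead of calling Module::Metadata->new, it merely skips a leading version number, and it assumes the default Safe opcode mask rather than the restrictive one shown above. The function name is made up.

```perl
use strict;
use warnings;
use Safe;

# Naive extraction: grab everything between "use Module::Metadata" and
# the next semicolon, then evaluate it as a hash in a Safe compartment.
sub extract_use_metadata {
    my ($source) = @_;
    my ($args) = $source =~ /^[ \t]*use[ \t]+Module::Metadata\b(.*?);/ms
        or return;

    # Skip an optional version number ("use Module::Metadata 1.0 ...").
    $args =~ s/\A\s*v?\d[\d._]*(?:\s+|\z)//;

    my $cpt  = Safe->new;    # default opcode mask (an assumption)
    # The newline before "}" keeps a trailing comment from swallowing it.
    my $meta = $cpt->reval("+{ $args\n}");
    die "Could not evaluate metadata: $@" if $@;
    return $meta;
}

my $source = <<'END_SOURCE';
package Foo;
use Module::Metadata 1.0
    author  => 'BRENTDAX',
    version => ($VERSION = 0.01),   # compatibility
    license => 'perl',
    depends => {
        perl => 5.005,   # might work in earlier versions
    };

1;
END_SOURCE

my $meta = extract_use_metadata($source);
printf "%s => %s\n", $_, $meta->{$_} for qw(author version license);
printf "needs perl %s\n", $meta->{depends}{perl};
```

The ($VERSION = 0.01) assignment runs inside the compartment, so it only touches the compartment's own $VERSION, not the caller's.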

      By the way, thanks for mentioning interoperability. I should add in a field for operating systems.

      =cut
      --Brent Dax
      There is no sig.

Re: Module::Metadata
by PodMaster (Abbot) on Jan 08, 2003 at 01:21 UTC
    I like it.

    I don't know if you're aware of Module::MetaInfo, so I'd like to quote from it:

    This module is designed to provide the primary interface to Perl meta information for perl module distribution files (this, however, is a prototype and hasn't yet been accepted by the "perl community", so don't count on it yet). The module is designed to allow perl modules to be easily and accurately packaged by other package systems such as RPM and DPKG.

    The Module::MetaInfo module isn't actually designed to get any meta information from the perl module. Instead it serves as an entry point to other modules which have their own way of doing that. Since there isn't yet any agreed way to store meta-information in perl modules this may not be very reliable.

    Currently there are two ways of getting meta information: a) guessing from the contents of the module and b) using a directory structure which has not yet been accepted by the perl community. The default way this module works is to first try b) then try a) then to give up.

    I'm all for this idea.

    The CPAN community has been very friendly, but it has also been very loose.

    We need official community accepted/defined standards.

    You can depend on ExtUtils::MakeMaker for installing perl modules, well that's not where it should stop.

    A Module::Signature is also a good idea.


    I use Module::ScanDeps
    MJD says you can't just make shit up and expect the computer to know what you mean, retardo!
    ** The Third rule of perl club is a statement of fact: pod is sexy.

      Interesting. I hadn't noticed this module in the CPAN before.

      OTOH, it seems to be all about divining metadata, whereas mine is about encoding it. I would imagine that there could be a Module::MetaInfo::ModMetadata (or some such) backend for this module.

      =cut
      --Brent Dax
      There is no sig.