Re: Project Metadata Model

I've heard it said, and believe it to be true, that the accuracy of meta data decreases with the distance between the storage of the data and meta data.

For example, if I edit a file, I type its name (or a part thereof, followed by the tab key) on the command line, so I am basically forced to notice any divergence between the file's name and its purpose. But if the purpose of a file were written down in a separate meta data file, I simply wouldn't notice the divergence unless I made a point of remembering to update the meta data.

That's why I find that storing meta data in so many different places (file names, comments and code in the file, plus some extra in Build.PL/Makefile.PL) isn't actually such a bad idea. It might be a bad idea for the author of a tool that needs the meta data, but for the author or maintainer of the distribution I believe it's the only viable way. After all, I want to program, not spend much time keeping my meta data in sync.

(I also believe that this is the deeper reason for the hatred that many Windows users and developers feel towards the "registry", a huge central place for meta data that is in many ways too disconnected from the data it describes.)

That said, I don't believe we have found the optimal sweet spot for meta data yet. Dist::Zilla and similar tools are an exploration in the opposite direction to what you propose: they try to derive meta data (for example dependencies) from the code as much as possible, allowing the author to focus even more on the code itself. I've never quite followed that path, though I'm not sure why. Maybe because it feels like giving up control. (Yes, dzil is flexible enough to let you decide which parameters you want to control yourself and which it derives on its own. But it requires learning yet another complex system, and somehow I haven't yet felt the need to do so.)
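To make "derive meta data from the code" a bit more concrete, here is a minimal sketch of the idea. It is an illustration only and not how Dist::Zilla does it: guess_dependencies is a hypothetical helper, the list of pragmas to ignore is my own assumption, and a regex scan like this will miss or misread plenty of real code (string evals, multi-line statements, POD that happens to contain "use").

```perl
use strict;
use warnings;

# Hypothetical helper: guess a distribution's dependencies by looking for
# `use Module` and `require Module` statements in a string of Perl source.
# This is a crude line-by-line regex scan, not a real parser.
sub guess_dependencies {
    my ($source) = @_;
    my %seen;
    for my $line (split /\n/, $source) {
        next if $line =~ /^\s*#/;    # skip comment lines
        # Module names start with a letter or underscore, so `use 5.010;`
        # and similar version requirements do not match.
        next unless $line =~ /^\s*(?:use|require)\s+([A-Za-z_][\w:]*)/;
        my $module = $1;
        # Assumed list of pragmas that should not count as dependencies.
        next if $module =~ /^(?:strict|warnings|utf8|lib)$/;
        $seen{$module} = 1;
    }
    return sort keys %seen;
}

my $code = <<'END';
use strict;
use warnings;
use JSON::PP;
# use Not::Really;
require File::Spec;
use JSON::PP;
END

print join(', ', guess_dependencies($code)), "\n";
# prints "File::Spec, JSON::PP"
```

The point is only that the dependency list lives in the code already; whether a tool or the author extracts it is the part people disagree on.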

Given a single, consistent project metadata store, I begin a project by writing some of that metadata first.

I know this is just a use case, but it does sound pretty much like a top-down approach. My projects usually work quite differently: they start as some throwaway .pl file, and if I happen to run it more often I copy it to ~/bin/, and if I expand it and think it might become beneficial to others, I start to extract most subroutines into a module and then CPANize it. That's also the reason why I don't care too much for Module::Starter and the like: they assume I start from scratch, but I don't. Adding the boilerplate usually seems like less work than starting with the boilerplate and adding my code to it.

I don't know what the summary of my somewhat disconnected ramblings is; maybe it is that I find the idea intriguing, but that I don't think it will work for me. I have the feeling that it will violate the motto "don't repeat yourself"; the proposed meta data scheme seems to encourage repeating information that is already there in some form.

Perl 6 - second systems done right


Replies are listed 'Best First'.

Re^2: Project Metadata Model
by Xiong (Hermit) on Jan 29, 2012 at 07:40 UTC

I agree absolutely that the accuracy of metadata decreases with distance from its subject. I'll go further: The usefulness of metadata decreases with distance from its subject. That's why I favor one file : one metafile. It's the shortest practical distance. I think of each metafile as the flip side of the subject file. You have a document, you edit the document; you want to make notes about the document, you flip it over and scribble there. I am opposed to the fat, bloated, trapdoor-spider-sucking-goo-from-everywhere approach.

Synchronization is exactly my issue. moritz doesn't want to keep metadata in sync, but that's exactly what we're doing manually when we copy from one place to another, with or without some format translation. My projects tend to fall apart as little pockets of metadata desynchronize. When I do release, I later find getting back into an old project a staggering task. I simply can't remember all the informal relationships and unwritten rules.

Developers generally seem to dislike writing tests and documentation. Some will feel that writing yet more metadata only increases the burden of not-code. But my object is to streamline all of the not-code tasks as well as some of the coding by providing a means of interchange.

Multiple points of entry and exit mean that you are not required or expected to master yet another language and grand interface. Rather, the benefit comes incrementally. You continue to work as you always have. Perhaps you pull a feature branch from a contributor. You now have the opportunity to accept updates to your project's README, POD, and test suite. You can dismiss the offers, or accept them and go straight to the updated elements and re-tailor them. Of course, if your contributor is using PMM then he may already have accepted these offers and delivered a complete, all-files patch instead of a vague bug report. Which means less work for you.

If your projects don't have a single boilerplate start then you won't want that kind of tool. You may be more interested in a tool that tracks metadata as it accumulates and assists you later on when you want to throw on another Jenga block.

You may not want any of this. Picasso could paint with a toothbrush. I need more structure.

I'm not the guy you kill, I'm the guy you buy. —Michael Clayton
