in reply to Re^2: RFC: A YAML config module to be posted to CPAN
in thread RFC: A YAML config module to be posted to CPAN

Assuming your YAML is correct, there will never be a confusion between 'the second item from the section named servers' and 'the section named 2'. The reason for this is that the value at 'dbconfig.servers' is either a scalar, a hash ref, or an array ref, so assuming it is not a scalar, it has to be one of the two - not ambiguous.

The YAML in question will be a config file, which means it's likely that it will be created by hand by a person -- and not auto generated by code, so assuming it is "correct" is a bad assumption. Frankly I can't possibly imagine how you can consider that syntax unambiguous: you tell me, is "a.b.2.3" referring to $conf->{'a'}->{'b'}->{'2'}->{'3'} or $conf->{'a'}->{'b'}->[2]->[3] ?
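
To make the two readings concrete, here is a minimal sketch of a dotted-path resolver that picks hash or array access from the type of the node it has reached. The lookup function and both sample structures are made up for illustration (they are not the module's actual code); the point is that the same path string resolves against either structure, so the path alone does not say which one the file contains.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not the module's real implementation):
# walk a dotted path, choosing hash or array access from the node's type.
sub lookup {
    my ($data, $path) = @_;
    my $node = $data;
    for my $key (split /\./, $path) {
        if (ref $node eq 'HASH') {
            die "No key '$key' in '$path'\n" unless exists $node->{$key};
            $node = $node->{$key};
        }
        elsif (ref $node eq 'ARRAY') {
            die "Index '$key' is not a number in '$path'\n" unless $key =~ /^\d+$/;
            die "Index $key out of range in '$path'\n" if $key > $#$node;
            $node = $node->[$key];
        }
        else {
            die "Cannot descend into a scalar at '$key' in '$path'\n";
        }
    }
    return $node;
}

# Two structures a hand-edited config file could plausibly produce.
my $as_hashes = { a => { b => { 2 => { 3 => 'from hashes' } } } };
my $as_arrays = { a => { b => [ 0, 1, [ 0, 1, 2, 'from arrays' ] ] } };

print lookup($as_hashes, 'a.b.2.3'), "\n";   # prints "from hashes"
print lookup($as_arrays, 'a.b.2.3'), "\n";   # prints "from arrays"
```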

...because you're working with configuration data, you generally know what it looks like before you run the program. It's not the same as (eg) parsing user submitted data from a website.

You may know what the data looks like, because you wrote the config parsing module, and the config files, and the apps that use the config parsing module. Joe Blo who writes an app using your module may not understand it as well, and John Smith who uses Joe Blo's app and needs to write the config files for it REALLY may not understand what's going on.

In practice, it is really easy to use ... the only slight annoyance being that you have to create a file as follows ... Then you load the data once (eg for mod_perl, in startup.pl) ... Then for any module that needs access to the config, you just add this to module file

That does not sound easy to use ... particularly the part about having to create a separate mini little perl module for each config dir i want to read from, or the fact that i have to use the module one way in exactly one place in my app, but every other place in my app i have to use it a different way.

Re^4: RFC: A YAML config module to be posted to CPAN
by clinton (Priest) on May 13, 2007 at 00:07 UTC
    ... assuming it is "correct" is a bad assumption.

    If somebody is editing the config file, they're probably changing something that you originally wrote. They are much more likely to get the values themselves wrong, rather than to change the type of a data structure. So obviously, if you're in this situation, you would need to validate the data before using it. Just as with any data from a suspect source. Fine. Being certain that it's a hash or an array will be nowhere near enough to be sure that you have good data coming in.

    ...I can't possibly imagine how you can consider that syntax unambiguous: you tell me, is "a.b.2.3" referring to $conf->{'a'}->{'b'}->{'2'}->{'3'} or $conf->{'a'}->{'b'}->[2]->[3]?

    I'm not suggesting that we replace all references to variables in Perl with this, but for the kind of data that I need to use, this notation makes for very easy reading.

    The typical use of lists of values in configuration data (in my experience), is that you want the whole list so that you can iterate through it. So you ask for the whole list:

    @hostnames = C('db.servers.search'); foreach my $hostname.... etc

    So practically speaking, I do not find this confusing. I can't be so way out there - lots of Template::Toolkit users do exactly the same thing without complaining.

    ...having to create a separate mini little perl module for each config dir i want to read from

    You misunderstand me. You need one subclass per application, and one directory tree with all your config data.

    I've been using this module for mod_perl applications, and there it makes sense to load all your data once at startup, so that the memory is shared between the child processes. I have all my config data in one directory tree. I have a single config module (which subclasses Burro::Config) for the entire application. I specify use MyApp::Config 'dir'; once in my application, and in every other module where I want to use it, I specify use MyApp::Config;

    Is that really so difficult?
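
The arrangement described above (load once with a directory argument, then import C everywhere else) could be sketched roughly as follows. This is only a guess at the shape, not Burro::Config's real code: Sketch::Config, its load and lookup methods, MyApp::Search and the hard-coded data are all assumptions, and explicit import calls stand in for the use lines so that everything fits in one file.

```perl
use strict;
use warnings;

package Sketch::Config;    # hypothetical base class, standing in for Burro::Config

my %DATA;                  # one loaded data tree per subclass

sub import {
    my ($class, $dir) = @_;
    # Load only when a directory is given, i.e. once per application.
    $DATA{$class} = $class->load($dir) if defined $dir;
    # Export a C() function into whichever package called import.
    my $caller = caller;
    no strict 'refs';
    *{"${caller}::C"} = sub { $class->lookup(@_) };
}

sub lookup {
    my ($class, $path) = @_;
    my $node = $DATA{$class};
    $node = $node->{$_} for split /\./, $path;
    # Hand back the whole list in list context, as in the examples in this thread.
    return wantarray() && ref $node eq 'ARRAY' ? @$node : $node;
}

package MyApp::Config;     # the one subclass per application
our @ISA = ('Sketch::Config');

# Assumption: a real load() would read the YAML files under $dir.
sub load {
    my ($class, $dir) = @_;
    return { db => { servers => [ 'db1.example.com', 'db2.example.com' ] } };
}

package main;
MyApp::Config->import('/my/conf/dir');   # stands in for: use MyApp::Config '/my/conf/dir';

package MyApp::Search;     # any other module that needs the config
MyApp::Config->import;     # stands in for: use MyApp::Config;
sub hostnames { return C('db.servers') }

package main;
print join(', ', MyApp::Search::hostnames()), "\n";   # prints "db1.example.com, db2.example.com"
```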

    UPDATE : That said, I realise that this is not the normal way that Perl modules work, and this is the part I have the most doubt about. It would be easy enough to change it to returning a config object which you could then pass around all of your modules. It pretty much does that already, behind the scenes. I just figured that this way was shorter and more convenient, but I concede that it may not fit into the way others like to do things.

    Clint

      You misunderstand me. You need one subclass per application, and one directory tree with all your config data.

      Again, you're making assumptions that may make sense to you as the author and the user ... but if you want this to be generally reusable for other people (which i'm assuming is your reason for posting it as an RFC and ultimately to CPAN) you have to acknowledge that not everyone wants to do this ... there are *lots* of good reasons why an application may want to use multiple unrelated directories to store configuration.

      even if it truly is one per application, why should i rewrite the same 3 line module over and over for every application with the only difference being the directory name?

      Is that really so difficult?

      No, but it's cumbersome, and limiting -- not things that generally make a module reusable. Loading data only once at startup may make sense to you, but i may want to make it possible to reload my configs on demand ... i can't do that with your module as is. Loading the data in a way that makes it globally accessible may make sense in some applications, but in other applications it may make sense to have the config data more protected, and only pass it to the methods that need it.

      For the use case you seem to want to make really easy (parse config files once on startup, make available globally) your approach actually seems like it requires the application writer to write a lot more code than if you just returned an object and let them assign it to a global variable...

      use Burro::Config;
      $main::config = Burro::Config->parse('/my/conf/dir');

      Two lines. Done. No custom subclass per config directory, no importing the custom subclass in all the other modules that need to know about the config -- every piece of code that you suggest needs to "use My::Config" before calling your C function can now just know to call $main::config->C instead.
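
A rough sketch of what that object-returning style might look like, including the reload-on-demand case. Everything here is hypothetical: Sketch::Config, the loader callback passed to parse, and the reload method are assumptions made so the example is self-contained, not an existing Burro::Config API. The loader stands in for whatever would read the YAML files under the directory.

```perl
package Sketch::Config;    # hypothetical stand-in, not Burro::Config's real API
use strict;
use warnings;

# parse($dir, $loader): $loader is an assumed callback that turns a
# directory name into a nested data structure (e.g. by reading YAML files).
sub parse {
    my ($class, $dir, $loader) = @_;
    my $self = bless { dir => $dir, loader => $loader }, $class;
    return $self->reload;
}

# Re-read the config on demand by calling the loader again.
sub reload {
    my ($self) = @_;
    $self->{data} = $self->{loader}->($self->{dir});
    return $self;
}

# Dotted-path lookup; returns the whole list in list context.
sub C {
    my ($self, $path) = @_;
    my $node = $self->{data};
    $node = ref $node eq 'ARRAY' ? $node->[$_] : $node->{$_} for split /\./, $path;
    return wantarray() && ref $node eq 'ARRAY' ? @$node : $node;
}

package main;

my $loads  = 0;            # counts how often the loader ran
my $loader = sub {
    my ($dir) = @_;
    $loads++;
    return { db => { servers => [ 'db1.example.com', 'db2.example.com' ] } };
};

our $config = Sketch::Config->parse('/my/conf/dir', $loader);
my @hostnames = $main::config->C('db.servers');
print "$_\n" for @hostnames;

$main::config->reload;     # pick up edits without restarting
```

Because the object is an ordinary value, the same code works whether it lives in a global as here or is passed only to the methods that need it.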

        ...actually seems like it requires the application writer to write a lot more code than if you just returned an object and let them assign it to a global variable...

        Do you know, I think I may agree with you. In mod_perl, people tend to preach against globals, because of the number of people trying to convert their CGI to run under mod_perl. But of course, there are times when it makes sense to use a global.

        I remember now, the other reason I wanted to import a sub rather than using an object directly was for more conciseness, as in:

        @hostnames = C('db.servers');
        as opposed to:
        @hostnames = $c->C('db.servers');

        ... not much of a difference, I'd agree, but at the time, I was trying to get it as small as possible.

        It suits me to use it this way (ie the subclassing), but I can imagine more people than yourself experiencing the same objections. It wouldn't take much for me to change it to be able to work in either way, as the user desires.

        So given that change, would you recommend that I release this module? And if so, called what? There is an old module of mine called Config::Loader which I doubt anyone has ever used - would this be a reasonable name? It doesn't mention the YAML, but that may be no bad thing, in case I want to add the ability to read other types of data.

        thoughts?

        thanks clint