in reply to Re: Converting a Flat-File to a Hash
in thread Converting a Flat-File to a Hash

Get someone else to do the work for you - use a module...

I'd say take that advice with a grain of salt... For things that are basically pretty simple -- easy to set up and easy to validate, like the OP's case -- it can be quicker and more reliable to roll your own. A module posted by someone else may have been written to do a slightly different task, and finding that out, and figuring out whether/how it can be shoe-horned into your particular task, might end up being more work with a less satisfying outcome.

But for the harder things that make you scratch your head and say "I'm not sure how to solve this", definitely go to CPAN and look for help. Even if there is no single module that does exactly what you need, you're likely to learn about how to break the problem down into manageable chunks, and/or find useful references, and so on.

Re^3: Converting a Flat-File to a Hash
by GrandFather (Saint) on Aug 14, 2006 at 04:19 UTC

    Many things look simple, few things are simple. Take OP's "simple requirements" for example: "I'd like my program to read in some data at start, rather than hard-code it into the program". Ok, OP is talking about storing some configuration information and the sample data given indicates simple key:value data.

    But how simple is that? The code given breaks in all sorts of ways: empty lines, lines without a colon, values containing colons, duplicate keys, nasty configuration file names (two-argument open rather than three-argument), a missing or otherwise unreadable configuration file, and very likely other things as yet unthought of.
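    To make those failure modes concrete, here is a rough sketch of a hand-rolled reader that tries to cover each one. It is not the OP's code and not any CPAN module's behaviour; the names parse_config and load_config are made up for illustration, and keeping the first value for a duplicate key is an arbitrary assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: read "key:value" lines from an open filehandle.
# Returns a hashref of settings and an arrayref of problem reports.
sub parse_config {
    my ($fh) = @_;
    my (%config, @problems);

    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;    # skip empty or whitespace-only lines

        # Split on the first colon only, so values may contain colons.
        my ($key, $value) = split /:/, $line, 2;
        if (!defined $value) {
            push @problems, "line $.: no colon";
            next;
        }
        s/^\s+|\s+$//g for $key, $value;    # trim surrounding whitespace

        if (exists $config{$key}) {
            # Assumption: the first definition wins and the repeat is reported.
            push @problems, "line $.: duplicate key '$key'";
            next;
        }
        $config{$key} = $value;
    }
    return (\%config, \@problems);
}

# Hypothetical wrapper: three-argument open, so an odd file name cannot
# be interpreted as a mode or a pipe, plus an explicit failure message.
sub load_config {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open '$path': $!\n";
    my @result = parse_config($fh);
    close $fh;
    return @result;
}
```

    Even this much leaves questions open (empty keys, comments, character encodings), which is rather the point being made here.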

    Ok, OP spends a little time up front trawling through CPAN (with a little guidance) and comes up with a tool kit for solving the problem today and, well golly, solving the problem again in a different context tomorrow. Sounds like well spent time to me, and there is still time left this afternoon for a beer.

    Sure, at some point you have to write some code to solve your own specific problem, but the more glue and the less new code, the more likely it is that you don't have to deal with all the edge cases and stuff you've not thought of. In this case half an hour's research and five minutes' coding is likely to save several hours down the track bodging up the holes in the first implementation - and they are likely to be hours with people breathing down your neck as you sort out problems with a live system. Saving those sorts of hours is worth several up-front hours any day!


    DWIM is Perl's answer to Gödel

      Did you ever look at the Config::* namespace on CPAN? Even if you ignore the 500+ hits that don't start with Config::, that still leaves 170 to wade through. Whilst there are undoubtedly one or two excellent modules amongst that number, there is also an awful lot of dross.

      Pick the wrong one, and you can be letting yourself in for a great deal more grief than having to clean up a bug or two in your own code.

      Settle upon your own code, make it into a module for your own internal use and it can evolve to meet the requirements of new applications or extensions and new requirements to existing ones, as they arrive.

      Settle upon the wrong CPAN solution, even a good one for your initial requirements, and down the line you are faced with the problem of persuading the author that your new requirement fits with the nature of his module; or backing it out from existing applications; or maintaining multiple Config modules going forward.

      Many algorithms and requirements are clearly defined and universal enough that one or two good CPAN implementations are sufficient to encompass them.

      Others, like configuration--and a favorite bugbear of mine, command line argument processing--are sufficiently application specific, or subject to variation according to personal and/or corporate preference, that having YACM (Yet Another Config Module) for each variation means that the namespace is oversubscribed with a confusing and time-consuming array of possibilities. Some of these may be great modules, but many are ill thought through, over-engineered, or just downright shoddy.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        Yes. I have looked. And I asked in the CB and got good answers. As I implied in Re^3: Converting a Flat-File to a Hash, half an hour's research (including consulting the CB) got me a few CPAN modules to look at (about the same list bobf suggested), one of which solved the problem for me.

        To an extent the problem is similar to the command line problem, and the answer is similar: go with the prior art. It may not be an exact fit for your initial specification, but over time there are likely to be fewer surprises from adopting a standard technique. Certainly there is less learning of Yet Another Configuration Technique required if you adopt one technique and use it wherever possible.
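        For the command line case, "going with the prior art" might look something like the sketch below. It assumes the core Getopt::Long module; the option names, the default file name app.conf and the parse_args helper are all invented for illustration.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical option set: a configuration file name and a negatable
# verbosity flag. Returns the options and whatever arguments remain.
sub parse_args {
    my @args = @_;
    my %opt = (config => 'app.conf', verbose => 0);

    GetOptionsFromArray(
        \@args,
        'config=s' => \$opt{config},     # --config FILE
        'verbose!' => \$opt{verbose},    # --verbose or --no-verbose
    ) or die "Usage: $0 [--config FILE] [--[no-]verbose] [ARGS]\n";

    return (\%opt, \@args);
}
```

        Called as parse_args(@ARGV), this hands the abbreviation, negation and error reporting over to the module, which is code you neither write nor debug yourself.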

        Context alters most things of course and there are no hard and fast rules in providing configuration information. It's a whole lot closer to the gray areas than trying to parse HTML with regexen for example. However there is still great virtue in taking a look at the prior art, even if just as a way of sorting out edge cases earlier rather than later and for gaining insight into where your own code may need to evolve. Looking at CPAN to find out how stuff doesn't work and what it doesn't do can be just as rewarding as finding the module that does exactly what you need.

