Re: Parsing the command line: manual or module?
by Velaki (Chaplain) on Aug 17, 2006 at 17:48 UTC
In a word: Consistency.
By adhering to a standard method of parsing the command line, you ensure that future programs will conform to accepted standards, that best practices are followed, and that existing code is maintainable by anyone who picks it up.
Additionally, using a module such as Getopt::Long is advantageous in that it enforces consistent behavior for command line options, e.g. -v -f filename, which is notoriously time-consuming to code well by hand. It also frees the user from cryptic single-letter switches: a program can accept --history where a bare -h would be more easily misunderstood.
In all fairness, TMTOWTDI, but why not use a well-tested, code-proven module, like Getopt::Long? I see only advantages; no disadvantages with it -- other than maybe a small learning curve.
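For anyone who hasn't used the module, a minimal sketch of what that looks like (option and variable names invented for the example):

    use strict;
    use warnings;
    use Getopt::Long;

    my ( $verbose, $file, $history );
    GetOptions(
        'verbose|v' => \$verbose,    # -v or --verbose: a simple flag
        'file|f=s'  => \$file,       # -f filename or --file=filename: requires a string
        'history'   => \$history,    # a long, self-documenting option name
    ) or die "Usage: $0 [-v] [-f filename] [--history]\n";

One line per option, and the ordering, aliases, and error reporting are all handled for you.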
Pax vobiscum, -v.
"Perl. There is no substitute."
Re: Parsing the command line: manual or module?
by andyford (Curate) on Aug 17, 2006 at 17:52 UTC
Not at all trying to be disingenuous, but if you invert your arguments, you can restate them as "the risks of not using a module". I started out kidding, but by the time I finished I realized I was serious.
Here's a quick inversion of some of your points, just to give some flavor (with a short sketch after the list):
- positional params are inflexible and therefore hard to maintain
- positional params are error-prone because they are order-dependent
- positional params make life difficult for the caller, who has to remember the exact order of the arguments
- arbitrary and/or short param names inhibit memory and clarity
- positional params require manual validation (required/optional/string/integer/etc)
- hand-rolled command-line options make using flags and named params simultaneously difficult
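To make the inversion concrete: the positional version is one line, my ($host, $port, $file) = @ARGV;, every caller has to remember that exact order, and nothing is validated. The named equivalent fixes all of those points at once -- a minimal sketch, with option names invented for the example:

    use strict;
    use warnings;
    use Getopt::Long;

    my ( $host, $port, $file );
    GetOptions(
        'host=s' => \$host,    # =s demands a string value
        'port=i' => \$port,    # =i rejects a non-integer port for free
        'file=s' => \$file,
    ) or die "Usage: $0 --host HOST --port PORT --file FILE\n";

    # required-parameter validation becomes one explicit, readable line
    die "Usage: $0 --host HOST --port PORT --file FILE\n"
        unless defined $host and defined $port and defined $file;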
You're absolutely right. You've pointed out more clearly something I'd already noticed when reading the root node: that the list of risks didn't actually identify any opportunities for bugs to be introduced. Those items are real problems, but they read as symptoms of a bug rather than as openings for new bugs. That's not to say that hand-rolled option parsing isn't prone to introducing bugs -- it is, as Fletch indicated -- just that the root node's points didn't demonstrate it.
In other words, hand-rolled option parsing such as is described here is itself a bug. It introduces its own problems at runtime, and it should be fixed the way any bug is fixed. Luckily, it's a bug with a known, relatively easy solution.
On the other hand, I think the OP's "positive" approach was more diplomatic than the "negative" approach taken in your rephrasing, andyford. What you posted is excellent for illustrating the answer to the original question, but it's not how I'd address a coworker's questions (if I were thinking properly at the time), because it might be perceived as accusatory.
print substr("Just another Perl hacker", 0, -2);
- apotheon
CopyWrite Chad Perrin
Re: Parsing the command line: manual or module?
by Fletch (Bishop) on Aug 17, 2006 at 17:48 UTC
Your third point is sort of an outgrowth of your second (here's hoping you don't edit that unordered list in such a way that this sentence becomes meaningless). It's a darned good point: drifting code is one of the biggest disadvantages of violating the DRY principle. It's also the main opportunity for actual bugs to be introduced when command line option parsing is specified by hand in every single program individually.
Plus, y'know, a good programmer should be lazy enough about stuff like this to want to do it "right" in the first place, since it's less work to use someone else's command line option parsing module than to write your own code every time.
Of course, Getopt::Std is simpler to use, so that might end up being the really lazy answer.
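For the record, the really lazy version looks something like this (flag names invented for the example):

    use strict;
    use warnings;
    use Getopt::Std;

    # -v is a boolean flag; the colon means -f takes a value
    my %opts;
    getopts( 'vf:', \%opts ) or die "Usage: $0 [-v] [-f filename]\n";

    print "being verbose\n"      if $opts{v};
    print "config is $opts{f}\n" if defined $opts{f};

No long options and no fancy validation, but for quick scripts it's often all you need.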
print substr("Just another Perl hacker", 0, -2);
- apotheon
CopyWrite Chad Perrin |
Re: Parsing the command line: manual or module?
by davido (Cardinal) on Aug 17, 2006 at 17:56 UTC
I'll add another reason to advocate code reuse: collaboration.
Thousands of developers have used Getopt::Long. It has been proven, tested, debugged, refined, pondered, enhanced, and applied through countless test and real-world use cases. There is no way that a homemade solution will have evolved through as rigorous a refining forge as a core module. Code reuse, and in particular the use of widely popular modules, is inherently and consistently safer than inventing your own solution to a problem that was solved ages ago.
If you've got a new problem, not addressed by a trusted and proven module, you earn the fun of inventing your own solution. But command line parameter parsing has been done before, the right way, a lot. Unless you've got a unique need not met by existing solutions, there's no need to risk making a mistake building your own approach to a problem that's already been solved.
Re: Parsing the command line: manual or module?
by talexb (Chancellor) on Aug 17, 2006 at 18:38 UTC
Before I wrote code in Perl, I was a C programmer. One of the first C applications that I wrote started off as a prototype, and as often happens, the prototype became a working piece of Production code.
One of the routines started out with a few parameters, then grew, finally needing eight or ten parameters. Every time I'd add another, I'd think, "Gee, this is getting really unwieldy".
Well, that's hindsight.
In C, you have no choice but to pass in a long list of parameters ... but in Perl, there's no need to cripple your code with that kind of limitation. As soon as a function requires more than two or three parameters, make them an arg hash. If it's the command line arguments, use Getopt::Long.
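That is, something along these lines (field names invented for the example):

    # the caller passes name/value pairs instead of a fragile positional list
    process_file(
        name    => 'input.txt',
        verbose => 1,
    );

    sub process_file {
        my %args = (
            verbose => 0,    # defaults go first ...
            retries => 3,
            @_,              # ... so the caller's pairs override them
        );
        die "name is required\n" unless defined $args{name};
        # ... do the real work with $args{name}, $args{verbose}, $args{retries}
    }

Adding a tenth parameter later is then a one-line change at each end instead of a re-ordering exercise.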
The risk of not using a module is that you'll reinvent the wheel. This is not a bad thing in itself, except that your solution isn't going to have had the attention paid to it that the equivalent module did, and it'll take longer.
That could pay off in the long run, but it might lead to difficult discussions with your manager about why the three week development schedule has expanded out to eight months. Part of being a Senior Developer is knowing when to write it yourself, and when to use something that someone else has written.
Let CPAN make you look good today.
Alex / talexb / Toronto
"Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds
If I recall correctly, there's a refactoring (from the book of the same name) that is recommended for when a routine starts getting too many parameters: something along the lines of encapsulating some (or all) of the arguments into an object, and/or possibly moving the behavior onto that object so the current implementer becomes a client using an instance of the new class.
(Unfortunately I don't have my copy at hand, but someone may chime in that actually remembers it or does have a copy nearby . . .)
Update: Found it: Introduce Parameter Object, p295. The (simple) example given is a series of calls which all take a start and end Date; the refactoring is to create a DateRange class which encapsulates both.
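The book's examples are in Java, but a rough Perl analogue of that idea (the class and method names here are my own invention) might look like:

    package DateRange;

    sub new {
        my ( $class, $start, $end ) = @_;    # epoch seconds
        die "start must not follow end\n" if $start > $end;
        return bless { start => $start, end => $end }, $class;
    }

    sub start { $_[0]{start} }
    sub end   { $_[0]{end} }

    sub contains {
        my ( $self, $when ) = @_;
        return $when >= $self->{start} && $when <= $self->{end};
    }

    package main;

    # every routine that used to take ($start, $end) now takes one object,
    # and the validation lives in exactly one place
    my $range = DateRange->new( $^T - 3600, $^T );
    print "recent\n" if $range->contains(time);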
You can pass multiple arguments via a struct in C in just the same way you use an arg hash in Perl. You can use getopt(3C) in C in just the same way you use Getopt::Long in Perl. Similar techniques exist for other languages.
Yeah, I know that -- but every time you want to add a parameter you have to modify the struct definition and do a make, so it's not a simple thing to do. I didn't want to confuse the post with that situation.
In Perl, if you want to add something to an arg hash, the caller and the callee need to be modified; no one else cares, and that's the way it should be.
I deleted a paragraph from my original post that talked about how I wrote a device-independent video graphics module for two graphics cards (heh), CGA (if you can call 640x400 useful) and Hercules (720x348 or something like that -- ok, Wikipedia says it was 720x350, close enough). To use these two cards, I would call a subsystem with pointers to each of a dozen functions, when obviously a pointer to a structure containing function pointers would have been a far more efficient way to implement that.
Like I said, it was my first big project. My coding standards have improved immensely since then -- hey, I discovered make back then and thought it was a pretty advanced tool. It was only later that I discovered it had been ported from Unix.
Alex / talexb / Toronto
"Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds
Updated Sunday August 20, 2006 at 12:22: after re-reading, I realized that my purple prose needed a little clarification.
Re: Parsing the command line: manual or module?
by GrandFather (Saint) on Aug 17, 2006 at 19:36 UTC
In a way it is like parsing CSV or HTML/XML with regexen - it's easy for the easy stuff, but you will get bitten by the edge cases. With a good module someone has already thought about the edge cases and provided ways of managing them.
The downside with the few command line parsing modules I've glanced at is that they all focus on *nix-style command line conventions. They simply don't handle DOS/Windows conventions (that I've noticed). Because of that I tend to use a command line parsing "template" chunk of code that gets pasted (along with a help/error exit routine) into whatever new script I'm writing that needs command line processing. I should at least generate a module from it, but it hasn't happened yet.
The fish hooks in command line processing come from duplicate flag processing, quoted parameters, and interspersed flags and parameters. Handling defaults, required parameters, help processing, and error handling tend to be related issues. By the time you've handled all that, there is a fair chunk of code involved. Add in the test suite and you really have something worth a decent-sized module. At that point, letting someone else do the work starts to seem worthwhile!
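For what it's worth, Getopt::Long can reportedly be nudged toward DOS-style switches through its prefix_pattern configuration -- I haven't tested this against all the Windows conventions, so check your version's docs before relying on it:

    use Getopt::Long;

    # accept /v and /file as well as -v and --file
    Getopt::Long::Configure( 'prefix_pattern=(--|-|\/)' );

    my ( $verbose, $file );
    GetOptions(
        'verbose|v' => \$verbose,
        'file=s'    => \$file,
    ) or die "usage error\n";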
DWIM is Perl's answer to Gödel
Re: Parsing the command line: manual or module?
by CountZero (Bishop) on Aug 18, 2006 at 06:11 UTC
Re: Parsing the command line: manual or module?
by tilly (Archbishop) on Aug 18, 2006 at 02:14 UTC
Whenever caller and callee have to exactly synchronize, with no error checking, you will tend to get bugs because people will sometimes make mistakes.
The hand-rolled parsing in this case requires exactly this kind of synchronization, and lacks error checks.
Using Getopt::Long allows for more flexibility and better error checking. On the one hand you'll get fewer errors because, for instance, getting arguments in the wrong order won't be a problem. On the other hand, when there are errors, you're more likely to be told about them, so they won't survive. Both are Good Things.
Less code, easier maintenance, etc. are just icing on the cake.
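A minimal sketch of what that error checking looks like in practice (option names invented for the example):

    use Getopt::Long;

    my %opt;
    GetOptions( \%opt, 'file=s', 'verbose' )
        or die "Usage: $0 [--verbose] --file NAME\n";

GetOptions returns false on any unrecognized or malformed option, so a typo like --flie dies with a usage message instead of silently doing the wrong thing.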
Re: Parsing the command line: manual or module?
by graff (Chancellor) on Aug 18, 2006 at 00:41 UTC
# first argument is a -f; collect it and ignore it for now
my $optionalArgument = shift;
# get filename of the config file
my $configFile = shift;
If that's all there was in terms of handling command-line args (if there really was no checking, and no reporting about expected and invalid usage), then the script is nearly unusable. What if the first arg isn't "-f"? What if the next one isn't the name of a config file?
And as already pointed out, if there's a chance the script will need to be adapted someday to handle additional variations on its behavior, it will become even less usable (and unmaintainable as well) until some sort of Getopt treatment is brought into play.
I use either Getopt::Std or Getopt::Long in many of the command-line scripts I write, and even though I haven't taken the time to memorize all the techniques, even though I have to refer to a previous script or to the module's perldoc output just about every time I use them, it still saves me time, and makes it easier to add new options to my scripts when I need to.
(Having said that, I'll confess that there are also a few occasions when I somehow conclude that I can handle what's needed myself, without Getopt -- but even then, I at least allow for flexibility in the ordering of args, verify that args are as expected, and die with an appropriate error and usage summary when they aren't.)
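For completeness, even the no-Getopt version can be made order-flexible and self-reporting. A sketch of that kind of careful hand-rolling (my own illustration, not actual code from any of these scripts):

    use strict;
    use warnings;

    sub usage { die "Usage: $0 [-v] -f config_file\n" }

    my ( $verbose, $config );
    while (@ARGV) {
        my $arg = shift @ARGV;
        if    ( $arg eq '-v' ) { $verbose = 1 }
        elsif ( $arg eq '-f' ) {
            $config = shift @ARGV;
            usage() unless defined $config;    # -f must be followed by a filename
        }
        else { usage() }                       # die loudly on anything unexpected
    }
    usage() unless defined $config;            # -f is required

Which, of course, is already most of the way toward rewriting Getopt::Std by hand.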
Re: Parsing the command line: manual or module?
by eyepopslikeamosquito (Archbishop) on Aug 18, 2006 at 13:01 UTC
Re: Parsing the command line: manual or module?
by odha57 (Monk) on Aug 18, 2006 at 13:02 UTC
To all of you: thanks for a great thread! I am fairly new to this site, but have been working with Perl for about 10 years to do various things in telecom labs, so I am pretty much self-taught, bumping into new things as the need arises. I had never run across Getopt::Long; having just read the documentation, I see that I can do things a better (and easier) way.
Thanks!