passing arguments

In a recent thread, the monastery weighed the relative merits of using shift() to extract function parameters. That's fine as far as it goes, but I have some opinions about argument passing that pretty much moot the question of how to extract them.

My basic rule is: Never write a function that takes more than one argument.

By and large, function parameters fall into one of three categories:

Primary data -- what the code actually crunches.
Switching parameters -- options that regulate the crunching.
Tramp data -- information the function doesn't use at all. It's just passing through on its way to another function.

To illustrate those three ideas, let's look at a toy program:

    sub print_multiple {
        my ($count, $string, $format) = @_;

        for (1..$count) {
            print &format_string ($string, $format), "\n";
        }
        return;
    }

    sub format_string {
        my ($string, $format) = @_;

        my %templates = (
            'bold'   => '<b>%s</b>',
            'italic' => '<i>%s</i>',
        );

        return (sprintf ($templates{$format}, $string));
    }

    &print_multiple (5, 'hello, world.', 'bold');

(I admit this is a contrived example, but non-contrived examples tend to be big. If you want to see switching parameters in real code, flip through a copy of Numerical Recipes in C.)

In print_multiple(), $count is a switching parameter, while $string and $format are tramp data.

In format_string(), $string is primary data and $format is a switching parameter.

The whole point of using functions is to isolate tightly-cohesive bits of logic so you can call them from other parts of the program. That gives you primary data. Then you want your functions to be general enough that they're useful in lots of settings, which gives you switching parameters. Then you build high-level functions that call low-level functions, but you still have to get all the parameters to the low-level code, which creates tramp data. Then the whole thing gets so convoluted that you can't keep track of it any more, and you start buying books about OOP.

The way to break out of that trap is to think about data independently from the logic. Instead of thinking "what parameters does this function need?" think "what does this information describe?". Assemble your data in coherent structures, then pass those structures to functions that use whatever information they need.

Applying that concept to the toy program gives us this:

    sub print_multiple {
        my $b = shift;

        for (1..$b->{'count'}) {
            print &format_string ($b), "\n";
        }
        return;
    }

    sub format_string {
        my $b = shift;

        my %templates = (
            'bold'   => '<b>%s</b>',
            'italic' => '<i>%s</i>',
        );
        return (
            sprintf (
                $templates{ $b->{'format'} }, 
                $b->{'string'}
            )
        );
    }

    $block = {
        'string' => "hello, world.",
        'format' => "bold",
        'count'  => 5,
    };
    
    &print_multiple ($block);

Instead of dealing with random data, we now have a structure that represents a block of formatted text. We can add other attributes to that structure if we need to, without imposing any change on the functions we already have. Heck, we can store it and use it again, which isn't true for the previous version.
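To make the "add other attributes" and "store it and use it again" points concrete, here is a minimal sketch. The 'separator' attribute, render_multiple() and the @blocks list are made up for illustration and are not part of the toy program; format_string() reads the same keys as before and never notices the new attribute.

```perl
use strict;
use warnings;

# Same logic as format_string() in the toy program: it reads only the
# keys it needs and ignores everything else in the structure.
sub format_string {
    my $block = shift;
    my %templates = (
        'bold'   => '<b>%s</b>',
        'italic' => '<i>%s</i>',
    );
    return sprintf ($templates{ $block->{'format'} }, $block->{'string'});
}

# Hypothetical cousin of print_multiple(): returns the text instead of
# printing it, and honors an optional 'separator' attribute.
sub render_multiple {
    my $block = shift;
    my $sep   = exists $block->{'separator'} ? $block->{'separator'} : "\n";
    return join ($sep, map { format_string ($block) } 1 .. $block->{'count'});
}

# Blocks are plain data, so they can be stored and used again.
my @blocks = (
    { 'string' => 'hello, world.', 'format' => 'bold',   'count' => 2 },
    { 'string' => 'goodbye.',      'format' => 'italic', 'count' => 2,
      'separator' => ' | ' },
);

for my $block (@blocks, $blocks[0]) {    # the first block is rendered twice
    my $text = render_multiple ($block);
    print "$text\n";
}
```

Adding 'separator' changed one function; format_string() and every stored block without that key keep working as they did.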

The control flow and logical structure of a program are only one half of the story. The data model is the other half. Instead of worrying about how you're going to get parameters into a function, work out a data model, then build your logic to fit that model. Put the burden of organizing information into the data structures where it belongs, instead of smearing it out across your function signatures. You'll find that your parameter lists shrink dramatically, and that your functions will be easier to write.


Replies are listed 'Best First'.
Re: Passing Arguments by tadman (Prior) on Apr 24, 2002 at 23:17 UTC
Just an observation. grinder's remark about the perils of calling functions with ampersand, such as `&foo` versus `foo()`, got me thinking "Do people still use ampersand calls?" But I figured that was just me being silly. Of course they don't. Then, just moments later, I find a living, breathing example. Yikes. Anyway, the "one parameter" rule is kind of absurd. Sometimes a function needs to know a lot of things to get the job done, sometimes nothing. Any language which strictly mandated One Argument Only is probably stack based anyway, or is just an exercise in programmer abuse. Passing anonymous hashes around is fine and all, but sometimes it breaks down, and when that happens, it can get really, really bad. What if you want to merge parameters, or filter them? Imagine the pain, the agony, the disgusting things you would have to do. `format_string({foo => 'bar', %$set1, %$set2, format => $set3->{format}});` That seems to be just the tip of the iceberg. Hash-style parameters, though, are fun. Just look at CGI.
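For readers who want to see what merging and filtering hash-style parameters involves, here is a rough sketch. merge_params(), filter_params() and the three sets are hypothetical names invented for this example; whether wrapping the work in helpers like these removes the pain or just relocates it is a matter of taste.

```perl
use strict;
use warnings;

# Hypothetical helper: merge any number of hash refs into a new hash ref.
# When a key appears more than once, the later set wins.
sub merge_params {
    my %merged = map { %$_ } @_;
    return \%merged;
}

# Hypothetical helper: copy only the listed keys, skipping missing ones.
sub filter_params {
    my ($params, @keep) = @_;
    my %filtered = map  { $_ => $params->{$_} }
                   grep { exists $params->{$_} } @keep;
    return \%filtered;
}

my $set1 = { 'string' => 'hello', 'format' => 'italic' };
my $set2 = { 'count'  => 3 };
my $set3 = { 'format' => 'bold', 'debug' => 1 };

# Take everything from the first two sets, but only 'format' from the third.
my $merged   = merge_params ($set1, $set2, { 'format' => $set3->{'format'} });
my $filtered = filter_params ($merged, 'string', 'format');

for my $set ($merged, $filtered) {
    my $line = join (', ', map { $_ . '=' . $set->{$_} } sort keys %$set);
    print "$line\n";
}
```

Both helpers build new hashes, so the original sets are left untouched and can be merged again in a different order.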
Re: passing arguments by Super Monkey (Beadle) on Apr 24, 2002 at 22:25 UTC
I like your approach. It's clean and simple. However, I don't agree with your basic rule, "Never write a function that takes more than one argument." Never is a pretty strong word, and always following this rule can lead to some complicated solutions to simple problems. Still, for more complex problems, I think your suggestions could save a lot of people some headaches.
(MeowChow) Re: passing arguments by MeowChow (Vicar) on Apr 25, 2002 at 04:47 UTC
There's a time and place for everything. For some functions, it makes sense to accept an ordered parameter list. For others, it makes sense to accept a list of key/value pairs. And sometimes, it's useful to pass around an anonymous hash by reference, as you've suggested. Saying there's exactly one right way to do it is an open invitation to a holy war. As long as you understand the benefits and tradeoffs of each method, you'll do fine. Blind adherence to an arbitrary rule such as "never write a function that takes more than one argument" is just silly. MeowChow s aamecha.s a..a\u$&owag.print
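The three calling styles mentioned here can be put side by side. This is only a sketch: the function names are invented for the comparison, and the template table is borrowed from the toy program in the original post.

```perl
use strict;
use warnings;

my %templates = ('bold' => '<b>%s</b>', 'italic' => '<i>%s</i>');

# 1. Ordered parameter list: compact, but callers must remember the order.
sub format_positional {
    my ($string, $format) = @_;
    return sprintf ($templates{$format}, $string);
}

# 2. Flat list of key/value pairs: self-documenting at the call site.
sub format_named {
    my %args = @_;
    return sprintf ($templates{ $args{'format'} }, $args{'string'});
}

# 3. Anonymous hash passed by reference: a single argument that the
#    caller can also store and hand to other functions.
sub format_hashref {
    my $args = shift;
    return sprintf ($templates{ $args->{'format'} }, $args->{'string'});
}

my @results = (
    format_positional ('hi', 'bold'),
    format_named      ('string' => 'hi', 'format' => 'bold'),
    format_hashref    ({ 'string' => 'hi', 'format' => 'bold' }),
);
print "$_\n" for @results;
```

All three produce the same string; the difference is in what the call site has to know and in whether the argument bundle survives the call.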
Re: (MeowChow) Re: passing arguments by mstone (Deacon) on Apr 25, 2002 at 20:11 UTC
"Saying there's exactly one right way to do it is an open invitation to a holy war." That's very true. And I feel free to break any of my own rules if the situation warrants it. I state this one strongly because I try to avoid breaking it if at all possible. I find that reshuffling the data to support one-parameter logic helps me build better, more robust code. I'll be happy to scale this one back to "try to code for single parameters if possible" if the rest of the monastery will give up equivalent dicta, like "use CGI.pm or die." ;-)
Re: passing arguments by pdcawley (Hermit) on Apr 25, 2002 at 08:27 UTC
Um... am I missing something here? You seem to have merely reinvented Object Oriented Programming without the really good bits like polymorphic method dispatch. Also, what happens when you want to tell your `print_multiple` subroutine to print on a different file handle? Now your data structure is going to have to carry a file handle around with it, and everything that ever prints is going to have to worry about defaulting. I submit that in the example you give, an object structure like the following may prove to be a little more loosely coupled and, in the long run, more maintainable.

    package Printable;

    sub print_on {
        my $self = shift;
        my($fh) = @_;
        print $fh $self->as_string;
    }
    sub as_string { $_[0]->value }
    sub new {
        my $proto = shift;
        my $scalar;
        my $self = bless \$scalar, ref($proto) || $proto;
        return $self;
    }
    sub set_value {
        my $self = shift;
        $$self = $_[0];
        return $self;
    }
    sub value { ${$_[0]} }
    sub print { $_[0]->print_on(\*STDOUT) }

    package Decorator;
    use base 'Printable';

    sub decorate {
        my $proto = shift;
        my($target) = @_;
        $proto->new->set_target($target);
    }
    sub new {
        my $proto = shift;
        my $self = bless {}, ref($proto) || $proto;
        $self->init;
        return $self;
    }
    sub init { }
    # Accessors for the wrapped object, used by decorate() and by the
    # as_string() methods of the subclasses below.
    sub set_target {
        my $self = shift;
        $self->{target} = $_[0];
        return $self;
    }
    sub target { $_[0]->{target} }

    package MultiString;
    use base 'Decorator';

    sub count {
        my $self = shift;
        exists($self->{count}) ? $self->{count} : 1;
    }
    sub set_count {
        my $self = shift;
        $self->{count} = $_[0];
        return $self;
    }
    sub as_string {
        my $self = shift;
        $self->target->as_string x $self->count;
    }

    package FormatBold;
    use base 'Decorator';

    sub as_string {
        my $self = shift;
        '<b>' . $self->target->as_string . '</b>';
    }

    package main;

    $thing = MultiString->decorate(
        FormatBold->decorate(
            Printable->new->set_value("Hello, world")
        )
    )->set_count(5);
    $thing->print;

Yes, there's a lot of code here. But most of it's setup code that's reusable. Each class is responsible for a small, reasonably well defined fragment of the overall task, and I'd argue that as requirements grow, the whole thing is more maintainable.
In the 'real world' I'd probably factor the 'FormatBold' thing into a subclass of some Format class. Once I've got the Format class factored out it becomes relatively easy to bring in a Formatter strategy class which knows that if we're formatting for HTML then bold looks like `<b>...</b>`, but if we're formatting for LaTeX, it looks like `\textbf{...}`, and so on. Yes, there's slightly more initial setup involved. Yes, OO dispatch means things are going to be slower. But as a programmer I can go faster because, when it's done right, OO leaves you with small methods that take a small number of arguments and do exactly what they say on the tin. Done badly, well, Object Ravioli can be just as bad as (worse than?) procedural spaghetti. You just have to exercise Good Taste.
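A Formatter strategy along those lines might look roughly like this. It is a sketch only: the class names HTMLFormatter and LaTeXFormatter, the italic() method and the render() helper are assumptions made up for the example, not code from this reply.

```perl
use strict;
use warnings;

# Hypothetical strategy classes: each one knows how a single output
# language spells "bold" and "italic".
package HTMLFormatter;
sub new    { my $class = shift; return bless {}, $class }
sub bold   { my ($self, $text) = @_; return '<b>' . $text . '</b>' }
sub italic { my ($self, $text) = @_; return '<i>' . $text . '</i>' }

package LaTeXFormatter;
sub new    { my $class = shift; return bless {}, $class }
sub bold   { my ($self, $text) = @_; return '\textbf{' . $text . '}' }
sub italic { my ($self, $text) = @_; return '\textit{' . $text . '}' }

package main;

# The caller names a style; method dispatch on the formatter object
# picks the markup, so render() never mentions HTML or LaTeX.
sub render {
    my ($formatter, $style, $text) = @_;
    return $formatter->$style($text);
}

for my $formatter (HTMLFormatter->new, LaTeXFormatter->new) {
    my $line = render ($formatter, 'bold', 'Hello, world');
    print "$line\n";
}
```

Supporting another output language means adding one more class with the same method names; render() and its callers stay as they are.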
Re2: passing arguments by mstone (Deacon) on Apr 25, 2002 at 20:36 UTC
"Um... am I missing something here? You seem to have merely reinvented Object Oriented Programming without the really good bits like Polymorphic method dispatch."

Well spotted. Building a data model that's independent of the actual logic is indeed the first step toward grokking OOP. Taking this idea to a full OO perspective produces the Model-View-Control program architecture:

The Model is responsible for all data storage and manipulation.
The View is responsible for all communication across the boundary of the program's address space.
The Control accepts messages from the View, and tells the Model how to set or manipulate its stored values.

Ivar Jacobson discusses a similar architecture (at length) in Object-Oriented Software Engineering: A Use Case Driven Approach. More generally, we're talking about separation of concerns. You can apply the same concepts perfectly well within the framework of procedural programming. The OO/procedural/functional styles are all just syntactic sugar for the basic computing machinery -- state machines, pushdown automata and the like -- and establishing a clear separation of concerns between your data and the logic that manipulates it is a best practice of any programming style.
Re: passing arguments by drewbie (Chaplain) on Apr 25, 2002 at 13:39 UTC
I agree with your principle of keeping things as simple as possible. Simple is the right solution 99% of the time. But I disagree with your restriction of a single parameter 100% of the time. Technically, I would be in compliance with your rule if I passed in a hashref with 10 key=>value pairs. I just don't think that it makes sense to ALWAYS limit yourself so severely. As pdcawley mentioned, a lot of what you are trying to achieve can be done through good OO design principles and inheritance. And for me, I'll take the speed hit for a method call any day if it makes my life easier as a programmer. Is anyone going to notice if my program takes 100ms longer because of OO method lookups? I really don't think so. In the end, I always come back to doing what makes sense for the programmer. The latest hardware can usually take care of any speed problems.
Re: passing arguments by perrin (Chancellor) on Apr 25, 2002 at 17:58 UTC
Passing multiple parameters in a hash ref is still passing multiple parameters. It would be better to just say that using named parameters is a good technique, which has been discussed here before.
Idea vs realization by powerman (Friar) on Apr 25, 2002 at 15:13 UTC
I think many of the monks who replied here are paying too much attention to the implementation of a contrived example and, hmm, yeah, to the overly categorical statement about "more than one argument". As far as I understand, none of that is very important. The only important thing is the idea. Why not meditate on the idea rather than on its implementation? And the idea, as I understand it, is to build a program around the data structures it uses instead of the actions required of it. Yes, I agree, this is much like OOP. In fact we have three (and maybe more) different ways to design a program:

logic oriented
data oriented
object oriented

And I'm sure each of them is more useful than the others for some tasks. So, a question for mstone: please describe a real-life task where your style gives an advantage. P.S. I plan to post in Meditations about how a style like this one, as I use it, can help in writing CGIs.
Re: Idea vs realization by demerphq (Chancellor) on Apr 25, 2002 at 17:02 UTC
I can give you an example where you are going to have a hard time using only one parameter. I want you to write a routine that takes only one parameter and lets me determine whether several scalars are the same entity or not. So I have

    my ($x,$y,$z);
    $x='a'; $y='b'; $z='c';

I want to be able to take any selection (including duplicates) and determine the number of distinct values. Thus

    Input           Output
    $x,$y,$x,$y  => 2
    $x,$y,$z     => 3
    $x,$x,$x     => 1

Oh, don't bother trying too hard to do this with just one parameter. It can't be done. :-) And if that isn't a red rag to a bull then I don't know what is. Yves / DeMerphq --- Writing a good benchmark isn't as easy as it might look.
Re2: Idea vs realization by mstone (Deacon) on Apr 25, 2002 at 22:05 UTC
Aggregation is a wonderful thing: ;-)

    sub find_unique_items_in {
        my $set = shift;
        my (%uniq, @out) = ();

        # Count how often each reference appears in the set.
        for $i (@$set) {
            $uniq{ $i }++;
        }
        push @out, sprintf (
            "%d unique entities (%d duplicates):\n",
            scalar keys %uniq,
            scalar ( grep { $uniq{$_} > 1 } keys %uniq ),
        );
        for $i (sort keys %uniq) {
            push @out, sprintf ("%-16s -- %d", $i, $uniq{$i});
        }
        return (join ("\n", @out));
    }

    @list = (0) x 10;

    # Build ten sets of references to randomly chosen elements of @list.
    for (1..10) {
        $set = [];
        push @set_list, $set;

        for $i (0.. rand(5)+10) {
            push @$set, \$list[ rand @list ];
        }
    }

    for $s (@set_list) {
        print "\n", '-'x72 ,"\n\n";
        print join ("\n", @$s), "\n\n";
        print &find_unique_items_in ($s), "\n";
    }

Not only does it keep your parameter lists trim, it lets you store your data sets so you can use them again. Now, I don't pretend that the above is an exhaustive search for matching entities, but as far as I know, it's fundamentally impossible to create a parameter list that can't be stored as a data structure. And thinking about those data structures -- especially the possibility of using them again -- helps you build well-organized code.
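Stripped down to the bare challenge, the same aggregation trick can be sketched as follows. The name count_distinct_entities() is invented for this example, and using refaddr() from the core Scalar::Util module to compare identities is one possible choice, not something taken from the posts above.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Takes one argument: a reference to an array of scalar references.
# Two entries count as the same entity when they point at the same
# variable, whatever value that variable happens to hold.
sub count_distinct_entities {
    my $set = shift;
    my %seen;
    $seen{ refaddr ($_) }++ for @$set;
    return scalar keys %seen;
}

my ($x, $y, $z) = ('a', 'b', 'c');

my @cases = (
    [ \$x, \$y, \$x, \$y ],
    [ \$x, \$y, \$z ],
    [ \$x, \$x, \$x ],
);

for my $case (@cases) {
    my $n = count_distinct_entities ($case);
    print scalar (@$case), " references, $n distinct\n";
}
```

The caller has to pass references rather than plain values, because copying a value into a list throws away the information about which variable it came from.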
Re: Re2: Idea vs realization by demerphq (Chancellor) on Apr 26, 2002 at 09:57 UTC
Re: Idea vs realization by mstone (Deacon) on Apr 25, 2002 at 22:01 UTC
GUI programming is the first example. Web applications also benefit, because they're a subset of the Model-View-Control architecture. Signal-based programs and multi-threaded programs also benefit from keeping the data isolated from the logic. So does any code that needs to be reentrant -- scripts written for mod_perl, for instance. Think of it this way: your function signatures are your program's internal communication protocol. So how good is your program's protocol? Is it structured and well-organized, or is it more like the transcript of half a dozen random bar conversations?
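As a small illustration of the reentrancy point, here is a sketch in which every piece of state lives in the structure handed to the function. The next_line() function and the 'name' and 'line_no' keys are hypothetical; the idea is only that nothing is kept in a global that a second caller could trample.

```perl
use strict;
use warnings;

# All state lives in the structure passed in, so the function itself
# remembers nothing between calls.
sub next_line {
    my $state = shift;
    $state->{'line_no'}++;
    return sprintf ('%s:%d', $state->{'name'}, $state->{'line_no'});
}

# Two independent "sessions" whose calls are interleaved.
my $first  = { 'name' => 'first',  'line_no' => 0 };
my $second = { 'name' => 'second', 'line_no' => 0 };

my @log = (
    next_line ($first),
    next_line ($second),
    next_line ($first),
);
print join (' ', @log), "\n";
```

Had the counter been a package global, the interleaved calls would have shared it, which is the kind of surprise that shows up when the same code serves many requests in one long-running process.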