Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have a package with many subroutines that return arrays. In many cases the arrays are a fixed size and small (less than 100 elements). But in some cases the arrays can be very large (greater than 1_000_000). What is the best practice here in terms of returning a reference to the array or the array itself? As a user of the package, I'd rather get an array returned than a reference, as there are fewer syntactical hoops I have to jump through to use the result. However, returning very large arrays is not as efficient as returning a reference to an array. Furthermore, if in some cases I return an array and in others I return a reference, then a user of the package might be bothered by the inconsistencies. Do any monks know of a CPAN package that handles this issue well (as an example)?

Replies are listed 'Best First'.
Re: Return a Reference or Array?
by moritz (Cardinal) on Apr 06, 2009 at 20:58 UTC
    I don't know what your array elements are, but for example one million strings of a few characters length each can eat up quite some memory; I'd rather go with the references.

    There's some convenience you can provide: You can return an array reference in scalar context, and the array itself in list context:

    return wantarray ? @array : \@array;

    Then the caller can decide which version she wants.
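
    A minimal sketch of how that looks from the caller's side. The sub name get_items and its contents are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical sub: returns the list in list context,
# and a reference to the array in scalar context.
sub get_items {
    my @array = (1 .. 5);
    return wantarray ? @array : \@array;
}

my @copy = get_items();    # list context: the elements are copied
my $ref  = get_items();    # scalar context: an array reference

print scalar(@copy), "\n";    # number of elements in the copy
print scalar(@$ref), "\n";    # number of elements via the reference
print "$ref->[0]\n";          # first element via the reference
```

    One thing to watch: a call in boolean context, such as if (get_items()), is scalar context, so it receives a reference, which is true even when the array is empty.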

Re: Return a Reference or Array?
by GrandFather (Saint) on Apr 06, 2009 at 21:05 UTC

    Perl Best Practices is a little thin in this area. In (very brief) synopsis Damian suggests that you should "return what the caller expects".

    In this case that may mean returning a list in list context and an array reference in scalar context. You can achieve that by using wantarray:

    sub genMeggaArray {
        my @array;
        ...
        return wantarray ? @array : \@array;
    }

    True laziness is hard work
Re: Return a Reference or Array?
by mr_mischief (Monsignor) on Apr 06, 2009 at 21:40 UTC
    One way to keep people from getting confused when using your module is to have different versions of the methods or subroutines whose names make it clear what is returned. Compare, for example, the DBI methods fetchrow_array and fetchrow_arrayref.
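
    A rough sketch of that naming scheme, in the style of the DBI pair. The names rows_array and rows_arrayref and the data are hypothetical, not taken from any real module:

```perl
use strict;
use warnings;

# Hypothetical data shared by both accessors.
my @rows = ('a', 'b', 'c');

# Returns a reference; the elements are not copied.
sub rows_arrayref { return \@rows }

# Returns the elements as a list; built on the reference version.
sub rows_array { return @{ rows_arrayref() } }

my @list = rows_array();
my $ref  = rows_arrayref();
print "list has ", scalar(@list), " elements\n";
print "ref has ", scalar(@$ref), " elements\n";
```

    Note that the reference version hands out the module's own array here, so a caller could modify it. Whether that is acceptable is a design decision for the module.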

    Another way to help avoid the confusion is to, as mentioned already, return the ref in scalar context and the array in list context consistently across all your data-returning methods or subs and to document the fact that it is always that way across the whole API as part of your API documentation.

Re: Return a Reference or Array?
by morgon (Priest) on Apr 06, 2009 at 22:02 UTC
    Have you ever thought about returning neither but something else instead?

    I could imagine that you do pretty much the same things with all the arrays you return from your various subs (checking their sizes, iterating over them etc).

    So why not wrap your arrays in an object that provides nice convenience methods, iterators etc. to make life easy for the client code?

    In that way you ALWAYS return the same entity (an object - contextual returns are nice but can be confusing to some people) regardless of the list size (with the memory consumption of a returned reference), so there are no special cases, and you could build an interface like this:

    my $ret = &my_sub(@args);    # returns an object that wraps a list
    if ( !$ret->isEmpty ) {
        $ret->each( sub { print $_ } );
    }

    Just an idea...
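
    One way such a wrapper might be sketched. The class name ListWrapper and the size method are assumptions added for illustration; the other method names are modeled on the snippet above:

```perl
use strict;
use warnings;

package ListWrapper;    # hypothetical class name

sub new {
    my ($class, @items) = @_;
    return bless { items => \@items }, $class;
}

# True when the wrapped list has no elements.
sub isEmpty { my ($self) = @_; return !@{ $self->{items} } }

# Number of elements in the wrapped list.
sub size { my ($self) = @_; return scalar @{ $self->{items} } }

# Calls $code once per element, with the element in $_.
sub each {
    my ($self, $code) = @_;
    $code->() for @{ $self->{items} };
    return;
}

package main;

# Hypothetical sub that always returns a wrapper object.
sub my_sub { return ListWrapper->new(@_) }

my $ret = my_sub(qw(a b c));
if ( !$ret->isEmpty ) {
    $ret->each( sub { print "$_\n" } );
}
```

    The caller only ever holds one scalar (the object), so the cost of the return does not grow with the list size.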

      If the array actually represents a vector or matrix, then there's already a good library for it: PDL.

      If not, chances are that there's already a good library for it on CPAN anyway ;-)

      my $ret = &my_sub(@args);

      In general there's no need to add the & in front of sub calls. On the contrary it has some effects that aren't desirable most of the time, like disabling prototype checking.

        And I thought prototype checking was undesirable :-)

        I just added the "&" for clarity - and I sometimes use it myself in some one-off hacks where I call a sub before I define it (just to keep the parser happy).

        Apart from prototype checking, what other differences are there between

        &foo(@args);
        and
        foo(@args);
        ?
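
        As far as perlsub describes it, with an explicit argument list the two forms differ only in the prototype handling. The more surprising difference appears when the parentheses are left off: &foo; makes the current @_ visible to the called sub. A small sketch, with made-up sub names:

```perl
use strict;
use warnings;

# Reports how many arguments it received.
sub show_args { return scalar @_ }

sub caller_sub {
    my $amp_no_parens = &show_args;     # the current @_ is passed through
    my $plain_call    = show_args();    # a new, empty @_
    return ( $amp_no_parens, $plain_call );
}

my ( $implicit, $explicit ) = caller_sub( 1, 2, 3 );
print "$implicit $explicit\n";    # prints "3 0"
```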
Re: Return a Reference or Array?
by CountZero (Bishop) on Apr 07, 2009 at 06:07 UTC
    Actually, the subroutine will never return an array. Subroutines return a single list only.
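
    A small sketch of why the distinction matters. The return expression is evaluated in the caller's context, so the caller never receives the array itself. The sub names are made up for illustration:

```perl
use strict;
use warnings;

sub from_array { my @a = ( 4, 5, 6 ); return @a }
sub from_list  { return ( 4, 5, 6 ) }

my $x = from_array();    # array in scalar context: the element count
my $y = from_list();     # comma operator in scalar context: the last element
my @z = from_array();    # list context: the elements themselves
print "$x $y @z\n";      # prints "3 6 4 5 6"
```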

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James