PerlMonks  

RFC: Should join subsume reduce?

by Roy Johnson (Monsignor)
on Feb 21, 2006 at 17:06 UTC

Roy Johnson has asked for the wisdom of the Perl Monks concerning the following question:

It occurred to me today that join and List::Util reduce are not only similar functions, but they operate in separate problem spaces. The first argument to join is a string; the first argument to reduce is a coderef. So I thought that it would be a useful improvement to Perl to make join behave like reduce if you pass it a coderef.

Is there any reason that would be a bad idea? Is it an appealing idea? Worth submitting to P5P?

Update: Having received some input, I can flesh out the proposal a bit. I propose that there be a join BLOCK LIST form that would be easily distinguishable from the join LIST form with no possibility of affecting existing code.

The proposal to add reduce as a built-in has already been rejected by P5P on the grounds that it could break existing code. My proposal would not do so. So while reduce is a better name, going that route is a non-starter. My proposal is a zero-impact alternative.

The argument that it's completely unrelated (or too dissimilar) to the old join has some merit, but it's a matter of perspective. There's a string-join and a functional-join. Both put the separator between all the elements of the list. String-join concatenates the whole mess; functional-join effectively evaluates it as a big expression (where the BLOCK acts essentially like an infix operator). Considering that string-join is a special case of reduce, it is not too much of a stretch (IMO) to have a specific and a general form with the same name.
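A minimal sketch of that perspective, assuming List::Util's reduce stands in for the proposed join BLOCK LIST form (which does not exist in Perl); the variable names are illustrative only:

```perl
use strict;
use warnings;
use List::Util qw(reduce);

my @list = (1, 2, 3, 4);

# String-join: put the separator between the elements and concatenate.
my $string = join '+', @list;                  # "1+2+3+4"

# Functional-join: the block acts like an infix operator placed between
# the elements, so this evaluates ((1 + 2) + 3) + 4.
my $sum = reduce { $a + $b } @list;            # 10

# String-join expressed as a reduce, showing that it is a special case.
my $sep    = '+';
my $joined = reduce { $a . $sep . $b } @list;  # same value as $string

print "$string = $sum\n";
print "string-join and reduce agree\n" if $joined eq $string;
```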


Caution: Contents may have been coded under pressure.

Replies are listed 'Best First'.
Re: RFC: Should join subsume reduce?
by kvale (Monsignor) on Feb 21, 2006 at 18:15 UTC
    Although I agree that join and reduce have similar parameter signatures, the semantics of the two functions are very different.

    join concatenates elements together with a separation string sandwiched between all elements. That's it.

    reduce does arbitrarily complex mangling of elements to produce a single result. It could be anything.

    Technically, a join is a special type of reduce with a string and an implied coderef. But join semantically implies not mangling the elements, but simply concatenating them with a bit string. I think that association is so ingrained in the minds of perl programmers that you need this specially named reduction as a separate function called 'join'. Save reduce for the general, functional version.

    -Mark

Re: RFC: Should join subsume reduce?
by samtregar (Abbot) on Feb 21, 2006 at 17:51 UTC
    Worth submitting to P5P? I doubt it. Why not just do it in a module and put it on CPAN? You should have no trouble overriding CORE::join with code to call List::Util::reduce() when passed a code-ref. I doubt I'd use the module but I'm sure some people would take great joy in confusing the heck out of their co-workers with it.

    -sam

      One nice thing that putting it in core could do that I (with a module) couldn't is allow it to have two forms, like map. If I prototyped it as (&@), it wouldn't work with the old syntax. If I don't prototype it, I have to spell out sub.
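      A rough sketch of that trade-off, using two hypothetical module-level functions (fjoin_block and fjoin_any are made-up names, not existing functions). With a block prototype the bare-block form parses, but a plain string first argument is rejected at compile time; without a prototype both kinds of first argument are accepted, but the coderef has to be written with an explicit sub and a comma.

```perl
use strict;
use warnings;
use List::Util qw(reduce);

# Prototyped version (hypothetical name): accepts a bare block, like map
# or grep, but can no longer be called as fjoin_block(',', @list).
sub fjoin_block(&@) {
    my ($code, @list) = @_;
    # $a and $b are package variables, so $code sees the values reduce sets
    # (assuming the caller's block lives in the same package, as here).
    return reduce { $code->() } @list;
}

# Unprototyped version (hypothetical name): dispatches on the type of the
# first argument, so the old string form still works.
sub fjoin_any {
    my ($first, @list) = @_;
    return reduce { $first->() } @list if ref($first) eq 'CODE';
    return join $first, @list;
}

my @n = (1, 2, 3, 4);
my $sum    = fjoin_block { $a + $b } @n;       # bare block
my $sum2   = fjoin_any(sub { $a + $b }, @n);   # explicit sub required
my $string = fjoin_any(',', @n);               # old-style string join

print "$sum $sum2 $string\n";
```

      Only a core implementation could accept both `join { ... } @list` and `join ',', @list` under a single name, which is the point made above.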

      Even if that weren't an issue, a good chunk of the benefit is that you don't have to use any modules. Admittedly, having a built-in vs. a module is a small benefit, but it's a small change with no apparent downside, except that you apparently find it confusing as heck.

      Obviously, it would require an adjustment to the way one looks at join, but it's not a complete transformation, it's merely an extension. Currently, a join operation is a special case of reduce. My proposal is that it be given the ability to operate in a more general way. This introduces no keywords and doesn't affect backward compatibility (except for the very unlikely case that someone wanted to construct strings that look like aCODE(0x225e84)b or aHASH(0x2250f8)b, in which case they'd need to explicitly stringify).
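      That edge case can be seen with today's join: a reference passed as the first argument is silently stringified and used as the separator. A small sketch (the addresses will differ from run to run):

```perl
use strict;
use warnings;

my $code = sub { 1 };
my %hash;

# A reference used as the separator is stringified by join.
my $with_code = join $code,  'a', 'b';   # something like aCODE(0x225e84)b
my $with_hash = join \%hash, 'a', 'b';   # something like aHASH(0x2250f8)b

print "$with_code\n$with_hash\n";

# Code that really wants this result could stringify explicitly,
# which would keep working even if join dispatched on coderefs.
my $explicit = join "$code", 'a', 'b';
```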


      Caution: Contents may have been coded under pressure.
Re: RFC: Should join subsume reduce?
by Anonymous Monk on Feb 21, 2006 at 20:24 UTC
    Is there any reason that would be a bad idea?

    It would make join() more confusing.

    Is it an appealing idea?

    Not to me. You're taking a very simple, concrete, useful concept, and confounding it with a very abstract, complicated, and silly one. I can describe the purpose of a join in a single sentence: it joins together a list of stuff into a single string, using a given character as a separator.

    I can't do that with reduce(), because what reduce is useful for varies wildly depending upon the helper function I give it. All I really know is that it produces some sort of summary statistic as a scalar from a list; which isn't that descriptive or helpful.

    I don't like it. Taking something very simple and making it terribly complex is not the way perl should go, IMHO. Perl is hard enough to understand; there's too much overloading of concepts as it is. Making it worse is just, well, worse. ;-) --
    Ytrew

      I'm not entirely clear what your objection is. Are you saying that it would become terribly difficult to read code, because you wouldn't know what form of join was being called? If that's your concern, the general-reduce version could be limited to when join is called with an explicit block:
      join { $a + $b } @list; # Computes the sum of the list
      join foo, @list;        # builds a string in the old-fashioned way, even if foo returns a coderef
      I think that restriction is a Very Good Idea.

      If you think it would require you to write joins in a new and unfamiliar way, you've simply misread my proposal.

      If you hate the idea of people writing reduce expressions, and are afraid this would encourage it, I'm afraid that cow is already out of the barn. It's a fundamental function for array processing; a built-in one in many languages. The fact that it's included in List::Util (and that so many of the functions therein are merely applications of it) indicates that it's generally considered useful rather than silly. I don't consider it particularly abstract or complicated, either; certainly no more so than map.


      Caution: Contents may have been coded under pressure.
        You're making the definition of the function more complex. It no longer does one single thing; it now does different things when called in different contexts.

        If you hate the idea of people writing reduce expressions, and are afraid this would encourage it, I'm afraid that cow is already out of the barn. It's a fundamental function for array processing; a built-in one in many languages.

        That cow is called "LISP", and it's been the toy of academics, and a total commercial failure, for approximately the last forty years. The fundamentals of functional programming haven't changed since then. It's still too abstract, too pointless, and too far removed from the actual imperatives of daily life (there's a reason we think in an imperative style -- it's how we live).

        Don't get me started on map() -- I've seen more twisted code written using map() idioms than any other. In general, when I see a map() statement, I have to mentally start refactoring, because odds are, the code won't work.

        90% of the time, I'm right. People who use map() tend to be too clever for their own code, and write 80% solutions that have to be re-written from scratch to actually solve the real problem at hand. Maybe you're smart enough to pull off a functional language, but I've been bitten by too many people who weren't to be impressed.

        So, if you want to program in a functional language, do so. But don't ruin perl any more than it is already -- God knows how many messes I've had to clean up thanks to overzealous use of map() statements! :-( I shudder to think what life will be like if reduce() (by any name) becomes part of the perl core! :-(
        --
        Ytrew

Re: RFC: Should join subsume reduce?
by duff (Parson) on Feb 22, 2006 at 05:41 UTC

    FWIW, I'd submit a patch to add reduce as a built-in rather than conceptually overload join in what is potentially a confusing manner. If such a join existed, every time I saw it I would have to stop and think to myself, "oh, this is really a reduce with an odd name"

    Besides, this would be working against the trend towards perl6. Things like the 2 versions of select are going away AFAIK rather than multiplying :-)

      Notwithstanding that the smart-match operator, the crowning jewel of Perl6, is going to do about 15 times as many things as it currently does in Perl5, Perl6's trends are not particularly relevant to Perl5 development. It is not the goal of Perl5 to have it gradually become Perl6.
      If such a join existed, every time I saw it I would have to stop and think to myself, "oh, this is really a reduce with an odd name"
      Do you really believe that you have no ability to assimilate idioms into your understanding? Do you still see a map and think "Oh, that's really a foreach with an odd name and an accumulator"?

      I have updated the original post to address the issue of reduce as a built-in, and some other issues that have been raised (including a way of understanding the proposed form of join).


      Caution: Contents may have been coded under pressure.
Re: RFC: Should join subsume reduce?
by ambrus (Abbot) on Feb 22, 2006 at 09:44 UTC

    Firstly, I don't like the idea of overloading join as reduce. They're two quite different functions. But others have already said this, so I'll also show something else.

    As I see it, the builtin functions of perl are almost frozen, and almost no changes have been made to them since 5.004.

    • There's been a general rework of open (three-arg open and various other extensions).
    Apart from that, the only ones I know of are implications of how builtin data-types have changed:
    • delete and exists now work on arrays,
    • hash functions respect restricted hashes,
    • file functions work on lexical handles and autovivify them,
    • string functions (including pack and unpack) work on both character and byte strings depending on the utf8 flag,
    • binmode and open now accept perlio layers,
    • my and sub accept attributes.

    To back up my statement, I've compared the perlfunc of perl 5.004 and 5.8.8, and here are the other significant changes of semantics I've found.

    • The argument of close is optional and defaults to the selected handle.
    • The argument of cos, eval is optional and defaults to $_ (or this got documented because sin was already like this, I don't know).
    • The argument of dump is optional and I don't care what it defaults to.
    • Exec now accepts an indirect argument to tell which program to run, if not $ARGV[0].
    • The argument of localtime is optional and defaults to time (so that's why I see localtime(time()) so often).
    • Lock is made a new builtin function.
    • The second argument of mkdir is optional and defaults to 0777.
    • No now accepts a version (with or without a module; use did this already).
    • Our is a new builtin function.
    • Pack and unpack accept some new templates: parentheses, q Q j J F D, bracketed repeat counts, exclamation mark, x!.
    • The argument of package can be omitted but this feature is deprecated.
    • Prototype works for builtin functions.
    • Qr precompiled regular expression quoting is introduced.
    • Sort accepts a ($$) subroutine instead of implicitly passing values in $a and $b.
    • You can now call splice with only one argument, in which case it removes all elements (no-one knows this, so this is a nice obfu-feature).
    • Sprintf supports a new %b conversion to output an integer in binary, the v flag, the ll flag.
    • Substr accepts a fourth argument as a replacement.
    • Sysread accepts a fourth argument to specify an offset within the string (this is useful).
    • You can omit the length from syswrite, which now defaults to the length of the scalar.
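
    A few of the changes listed above can be tried directly. This is only an illustrative sketch; the sub name by_number and the sample data are made up for the example:

```perl
use strict;
use warnings;

# sort with a ($$)-prototyped sub: elements arrive in @_ instead of $a/$b.
sub by_number($$) { my ($x, $y) = @_; return $x <=> $y }
my @sorted = sort by_number 10, 9, 100;     # (9, 10, 100)

# Four-argument substr: replaces the substring and returns the old text.
my $text = 'Hello, world';
my $old  = substr $text, 0, 5, 'HELLO';     # $old is 'Hello'

# sprintf %b conversion and the v flag.
my $binary  = sprintf '%b', 10;             # '1010'
my $version = sprintf '%vd', '1.22.333';    # '1.22.333'

# splice with a single argument removes (and returns) every element.
my @items   = (1, 2, 3);
my @removed = splice @items;                # @items is now empty

# localtime with no argument uses the current time.
my $now = localtime;

print "@sorted | $old -> $text | $binary | $version | @removed\n";
```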

    Of course, I could have easily skimmed over some other difference.

    Update: listified changes. This way it seems much more.

    Update: see also Builtin functions defaulting to $_

Re: RFC: Should join subsume reduce?
by CountZero (Bishop) on Feb 22, 2006 at 12:07 UTC
    Who in his sane mind would ever think of join doing anything else than joining things?

    If we ever go that way, I suggest that we further extend join by allowing it to take a filehandle instead of a coderef and then make it print both to STDOUT and to a file (it would, in a certain way, join both output methods together).

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: RFC: Should join subsume reduce?
by Sec (Monk) on Feb 23, 2006 at 10:24 UTC
    As all the others seem to think this is a bad idea, I will voice my opinion that I think it is a good idea. It does not break existing code. It is similar enough that to me it seems logical that join should support more complex operations than only the simplest case. And finally I think that it would be great to have such functionality in base perl without installing yet another module.

      And finally I think that it would be great to have such functionality in base perl without installing yet another module.

      Just so you know this argument is a no-op. List::Util has been in core as of 5.8.x.

      ---
      $world=~s/war/peace/g

        In fact, the argument is not just a no-op, it contradicts the sentiment: List::Util has been in core since 5.7.3 (to be precise), whereas this newfangled join would only appear in some future version of Perl – so if you wanted broader coverage, you’d use reduce anyway.
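
        One way to check this on a given installation is Module::CoreList, which ships with later perls. A small sketch, assuming that module is available:

```perl
use strict;
use warnings;
use Module::CoreList;

# first_release returns the earliest perl version that bundled the module,
# as a numeric version string.
my $first = Module::CoreList->first_release('List::Util');
print "List::Util first shipped with perl $first\n";
```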

        Makeshifts last the longest.

Re: RFC: Should join subsume reduce?
by nothingmuch (Priest) on Feb 24, 2006 at 00:49 UTC
    reduce is also known as fold, and foldr and foldl can be used to implement any of grep, map, join, sum, etc etc etc.
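
    As a rough Perl illustration of that claim, here are sum, map, grep and join written as left folds with List::Util's reduce. This is a sketch for small lists (copying the accumulator on every step is not efficient), and the variable names are made up:

```perl
use strict;
use warnings;
use List::Util qw(reduce);

my @list = (1, 2, 3, 4);

# sum: fold with +, seeded with 0 so an empty list would yield 0.
my $sum = reduce { $a + $b } 0, @list;

# map: fold into an array reference, appending the transformed element.
my $doubled = reduce { [ @$a, $b * 2 ] } [], @list;

# grep: fold into an array reference, keeping only matching elements.
my $even = reduce { $b % 2 == 0 ? [ @$a, $b ] : $a } [], @list;

# join: fold with concatenation around a separator.
my $joined = reduce { "$a,$b" } @list;

print "$sum | @$doubled | @$even | $joined\n";
```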

    In Haskell Perl's join can be reimplemented as this:

    join _ [] = [] -- this means that join on the empty list is the empty string
    join delim strings = foldl1 (\left right -> left ++ delim ++ right ) strings
    -- this is join implemented with reduce
    -- also this could be written as
    join delim strings = foldl1 ((++) . (++ delim)) strings
    -- or more perlishly
    join delim strings = foldl1 (\left right -> concat [left, delim, right]) strings
    -- or with autocurrying fun
    join = foldl1 . ((++) .) . flip (++)
    And similarly we can implement this with List::Util's reduce very elegantly too:
    use List::Util qw(reduce);

    sub join {
        my ( $delim, @strings ) = @_;
        reduce { $a . $delim . $b } @strings;
    }
    but in this case the concatenation operator is not used directly as a curried higher order function
    A tutorial on the universality and expressiveness of fold is a wonderful article on this topic. There are some diagrams to assist you in understanding the article, too.

    Since you seemed to like to reference c2.com, see The Wheel Gets Reinvented.

    Lastly, the argument that adding reduce to Perl will break code is wrong - err, lock and others were added post factum as weak keywords - keywords that are only available if there is no sub by that name in the current package already.

    -nuffin
    zz zZ Z Z #!perl

Node Type: perlquestion [id://531739]
Approved by TStanley
Front-paged by ikegami