PerlMonks  

Undocumented join() feature, now defunct?

by johngg (Canon)
on Oct 29, 2014 at 22:55 UTC ( [id://1105563]=perlquestion )

johngg has asked for the wisdom of the Perl Monks concerning the following question:

I thought it might be useful to have an equivalent to join that could stitch a LIST together with an EXPR that actually varied. Just for fun, and not expecting it to work, I first tried supplying a tied scalar incrementor as the first argument to join but, to my great surprise, it did produce a string with a varying EXPR joining the LIST elements. This code

use strict;
use warnings;
use feature qw{ say };

# ===========
package Incrementor;
# ===========

use Tie::Scalar;
our @ISA = qw{ Tie::StdScalar };

sub TIESCALAR
{
    my( $pkg, $value ) = @_;
    $value //= 0;
    return bless \ $value, $pkg
}

sub FETCH
{
    my $self = shift;
    return ${ $self } ++;
}

package Main;

my @arr = qw{ a b c d e };
tie my $inc, q{Incrementor};
say join $inc, @arr;
say q{-} x 20;
say join $inc, @arr;

produces this output

a1b2c3d4e -------------------- a6b7c8d9e

Note that the invocation of join seems to consume the first iteration before actually constructing the string. The documentation says

Joins the separate strings of LIST into a single string with fields separated by the value of EXPR, and returns that new string.

and the wording looks to be the same for all versions. My expectation was that the EXPR would be evaluated once and the invariant result used between every element of the LIST. This doesn't appear to be the case for the following perl versions: 5.8.8, 5.10.1 and 5.14.2 on various Linuxen or Cygwin, and 5.16.1 on Windows 7. However, under 5.18.2 on Mint 17 I do get what I was expecting

a0b0c0d0e -------------------- a1b1c1d1e

I don't have 5.20 installed anywhere so can't test on that. Is the behaviour of join with a tied scalar prior to 5.18.2 correct but not clearly explained in the documentation or is it a bug that has now been fixed?

Cheers,

JohnGG

Replies are listed 'Best First'.
Re: Undocumented join() feature, now defunct? (optimization)
by tye (Sage) on Oct 30, 2014 at 00:48 UTC
    My expectation was that the EXPR would be evaluated once and the invariant result used between every element of the LIST.

    Then you should probably adjust (and loosen) your expectations about exactly how much optimization has or hasn't been done to a particular feature in a particular build of Perl.

    If you manage to construct code that behaves differently depending on whether the implementation of some feature (especially one that uses the value more than once) evaluates the value once or twice, then you've written code that is likely to break when an optimization gets done.

    If I had reason to pass a magical scalar that changes values to join, then I'd tell Perl to fix the value first via:

    say join "$inc", @arr;

    and the wording looks to be the same for all versions

    Yeah, people don't usually update the documentation of the basic functionality when an optimization is made.

    In your mind, one of these two has to be broken?

    sub join1 {
        my( $sep, @vals ) = @_;
        my $str = '';
        while( @vals ) {
            $str .= shift @vals;
            $str .= $sep if @vals;
        }
        return $str;
    }

    sub join2 {
        my( $sep, @vals ) = @_;
        $sep = "$sep";  # Added
        my $str = '';
        while( @vals ) {
            $str .= shift @vals;
            $str .= $sep if @vals;
        }
        return $str;
    }

    Now if I go and optimize out the copying of most of @_ into @vals, that breaks something too? Or are both of those broken because they don't avoid copying from @_?
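To make the point about copying out of @_ concrete, here is a minimal sketch. The names join_copy and join_alias are hypothetical, and the Incrementor class is a cut-down version of the one in the original post. Copying the separator into a lexical reads the tied value once, while reading $_[0] directly goes back to the tied variable on every use. The exact numbers printed are an implementation detail and may differ between perl builds.

```perl
use strict;
use warnings;
use feature qw{ say };

# Tied scalar in the spirit of the original post: each read yields the next integer.
package Incrementor;
sub TIESCALAR { my( $pkg, $value ) = @_; $value //= 0; return bless \ $value, $pkg }
sub FETCH     { my $self = shift; return ${ $self } ++ }

package main;

# Hypothetical: copies the separator out of @_ before using it.
sub join_copy
{
    my( $sep, @vals ) = @_;    # the tied value is read here and copied
    my $str = '';
    while( @vals )
    {
        $str .= shift @vals;
        $str .= $sep if @vals; # $sep is now an ordinary scalar
    }
    return $str;
}

# Hypothetical: skips the copy and reads the aliased $_[0] each time.
sub join_alias
{
    my $str = '';
    for my $i ( 1 .. $#_ )
    {
        $str .= $_[ $i ];
        $str .= $_[ 0 ] if $i < $#_;    # every read goes through FETCH
    }
    return $str;
}

tie my $inc, q{Incrementor};
my $copied  = join_copy(  $inc, qw{ a b c } );
my $aliased = join_alias( $inc, qw{ a b c } );
say $copied;     # same separator in both gaps
say $aliased;    # a different separator in each gap
```

Neither sub is "broken"; they simply look at the caller's scalar a different number of times.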

    Next you'll tell me that join2 is broken because it evaluates $sep even when @vals is empty and so $sep isn't actually used.

    IMO, none of these are bugs. They are implementation details that reasonably should be expected to change at any time when some bug gets fixed, some code gets optimized, some code gets refactored, etc.

    Making scalars that change every time you look at them is weird stuff. You have to put up with weird things happening and take extra care when you do stuff like that.
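A minimal sketch of the "fix the value first" advice, assuming the same kind of Incrementor tie as in the original post. Here the value is copied into an ordinary lexical ($sep, a name chosen for illustration) rather than stringified in place; either way join only ever sees a plain scalar, so the result no longer depends on how often join looks at its first argument.

```perl
use strict;
use warnings;
use feature qw{ say };

# Cut-down version of the Incrementor class from the original post.
package Incrementor;
sub TIESCALAR { my( $pkg, $value ) = @_; $value //= 0; return bless \ $value, $pkg }
sub FETCH     { my $self = shift; return ${ $self } ++ }

package main;

my @arr = qw{ a b c d e };
tie my $inc, q{Incrementor};

# Snapshot the current value; the copy is not tied.
my $sep    = $inc;
my $joined = join $sep, @arr;
say $joined;    # a0b0c0d0e
```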

    - tye        

      Thank you for your reply but I think you might have misunderstood the direction I'm coming from. I was not attempting to write code to exploit an undocumented, or unclearly documented, feature of join. Rather, I was going to write an equivalent function that did allow for an EXPR that evaluates more than once. The discovery that earlier versions of join also did this was purely by chance.
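For what it's worth, one way such an equivalent function might look is sketched below. The name join_with and its (&@) prototype are assumptions made for illustration, not anything from core Perl or from this thread. It takes a code block and calls it once per gap, so the varying separator is explicit rather than relying on tie magic.

```perl
use strict;
use warnings;
use feature qw{ say };

# Hypothetical helper: calls $sep_code once between each pair of elements.
sub join_with(&@)
{
    my( $sep_code, @vals ) = @_;
    return q{} unless @vals;
    my $str = shift @vals;
    $str .= $sep_code->() . $_ for @vals;
    return $str;
}

my $n = 0;
my $joined = join_with { ++ $n } qw{ a b c d e };
say $joined;    # a1b2c3d4e
```

Because the block is ordinary code, how many times it runs is under the caller's control: once per gap, and not at all for a list of zero or one elements.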

      Then you should probably adjust (and loosen) your expectations

      My expectation was based on the wording of the documentation which does not explicitly state that EXPR could be evaluated more than once.

      then you've written code that is likely to break when an optimization gets done.

      To my mind, an optimization improves performance but does not alter behaviour. A bug fix alters behaviour. I should, of course, have read perl5180delta before writing the OP. In it under "Selected Bug Fixes" we find

      join and "@array" now call FETCH only once on a tied $" [perl #8931].

      This answers my question: the behaviour was considered to be a bug which has now been fixed. Anonymonk's reply informs us that it was "a bug fixed by accident" as part of an optimization but perhaps the documentation ought to clarify the behaviour and how it has now changed.
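The delta entry mentions "@array" alongside join because interpolating an array into a string joins its elements with the value of $" (the list separator). A short sketch of that equivalence, with variable names chosen for illustration:

```perl
use strict;
use warnings;
use feature qw{ say };

my @arr = qw{ a b c };

my $interpolated;
{
    local $" = q{, };          # list separator used when interpolating arrays
    $interpolated = "@arr";
}

my $joined = join q{, }, @arr;

say $interpolated;    # a, b, c
say $joined;          # a, b, c
```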

      I should add that none of this was for production code but was just playing around exploring language features.

      Cheers,

      JohnGG

        Thank you for your reply

        You are most welcome.

        but I think you might have misunderstood the direction I'm coming from.

        No. I didn't assume you were trying to do any of the things you speculate about (including impacting production code). I was just commenting on your expectations that you explicitly expressed.

        My expectation was based on the wording of the documentation which does not explicitly state that EXPR could be evaluated more than once.

        Well, EXPR is certainly only evaluated once (per call). But it also didn't explicitly state that the (possibly tied) scalar resulting from EXPR could be accessed more than once.

        Of course, it doesn't explicitly state that the scalar might be accessed only once. Documentation doesn't generally (for good reason) specify exactly how many times the implementation might decide to just look at some value you gave to it. Documenting it would mean that you'd be breaking a documented feature if you came up with an optimization that involved just looking at the value one more or one fewer time. It is wise to not tie your implementers' hands so tightly.

        Documenting that there is no guarantee as to how many times join() (in particular) might look at the value of one specific argument would be quite silly. Documenting this rather mundane consequence of internal details of implementations changing in a general manner would be fine (and it may already be done somewhere in the Perl docs).

        The root problem is your expectation that documentation will mention how many times a value is looked at. It usually doesn't. It shouldn't.

        join and "@array" now call FETCH only once on a tied $" [perl #8931].
        This answers my question: the behaviour was considered to be a bug which has now been fixed.

        But you can tell that it was considered an optimization "bug" not a feature "bug", because no feature documentation was updated to assure users of this detail (so not really a "bug" by how I would use that word, just an optimization). This "behavior" wasn't even nailed down. The comment says "only once" not "exactly once". Based on that comment, how many times will FETCH be called for join( $tied, $one ) ? Maybe 1, maybe 0. Neither choice would be a feature bug. And the answer is fairly likely to change at some point (even if accessing it many times continues to be considered an unfortunate/inefficient implementation choice).
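Anyone curious about what a particular perl build does can measure it. The sketch below uses a hypothetical tie class, CountingSep, whose FETCH always returns '-' and bumps a counter ($fetches, also just an illustrative name). The joined strings are the same everywhere; the counts are deliberately not shown as expected output because they are exactly the implementation detail under discussion.

```perl
use strict;
use warnings;
use feature qw{ say };

my $fetches = 0;    # file-scoped counter, visible to both packages below

# Hypothetical tie class: constant value, but every read is counted.
package CountingSep;
sub TIESCALAR { my( $pkg ) = @_; my $unused; return bless \ $unused, $pkg }
sub FETCH     { $fetches ++; return q{-} }

package main;

tie my $sep, q{CountingSep};

$fetches = 0;
my $one         = join $sep, q{x};
my $fetches_one = $fetches;

$fetches = 0;
my $three         = join $sep, qw{ a b c };
my $fetches_three = $fetches;

say "one element:    '$one', FETCH calls: $fetches_one";
say "three elements: '$three', FETCH calls: $fetches_three";
```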

        To my mind, an optimization improves performance but does not alter behaviour.

        Then I guess you don't have much experience with optimizations. Optimizations very often change subtle behavior. Optimizations should not break feature behavior. How many times the code simply looks at something isn't feature behavior. The fact that using tie makes it possible to notice how many times your variable is looked at doesn't mean that how many times your variable is looked at is something that must be specified and controlled for every feature implementation. Far from it.

        Using tie to make a scalar whose value is different every time you look at it isn't hard and certainly can be cute, but it also is fundamentally fragile. And there are tons and tons of optimizations that can have an impact when such is done. That isn't due to a problem with those optimizations. It is due to somebody doing something so fragile. When you do that, you should expect weird stuff and/or be very careful about how you pass that value around.

        You most certainly should not expect to not be surprised by the results you get.

        - tye        

      Hi Tye,

      You make some good points with which I agree. But don't you think Perl (or join?) should reliably specify the behavior for say:

      print join ++$i,  qw( a b c);

        The first argument to join is not code to be executed over and over again like with map and grep. So your example of join ++$i, @a doesn't actually pose a problem, unlike passing in a magical scalar that changes each time you look at it.

        So, the behavior of your example code is already well defined.
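A quick sketch of that distinction: the expression in the first argument runs once per call, whether it is an increment or a sub call. The sub next_sep and the counter $calls are hypothetical names used only for this illustration.

```perl
use strict;
use warnings;
use feature qw{ say };

# An increment as the separator expression: evaluated once.
my $i = 0;
my $joined = join ++ $i, qw{ a b c };
say $joined;    # a1b1c
say $i;         # 1

# A sub call as the separator expression: also called once.
my $calls = 0;
sub next_sep { $calls ++; return q{-} }

my $dashed = join next_sep(), qw{ a b c };
say $dashed;    # a-b-c
say $calls;     # 1
```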

        - tye        

        What makes you think it isn't?
Re: Undocumented join() feature, now defunct? (bug)
by Anonymous Monk on Oct 29, 2014 at 23:24 UTC

      That commit is unrelated. There's no constant folding in the OP's code.

      There's nothing to submit with perlbug. The current behaviour is correct. As the OP said, it's expected.

      Thank you. I found this informative and helpful.

      Cheers,

      JohnGG

Re: Undocumented join() feature, now defunct?
by ikegami (Patriarch) on Oct 30, 2014 at 14:30 UTC

    My expectation was that the EXPR would be evaluated once and the invariant result used between every element of the LIST.

    The EXPR is only evaluated once, before join is even called.

    $ perl -MO=Concise,-exec -e'join($s, $x, $y, $z)'
    1  <0> enter
    2  <;> nextstate(main 1 -e:1) v:{
    3  <0> pushmark s
    4  <#> gvsv[*s] s
    5  <#> gvsv[*x] s
    6  <#> gvsv[*y] s
    7  <#> gvsv[*z] s
    8  <@> join[t5] vK/2
    9  <@> leave[1 ref] vKP/REFC
    -e syntax OK

    The result is a scalar, and scalars aren't invariant as you yourself demonstrated.

    Is the behaviour of join with a tied scalar prior to 5.18.2 correct but not clearly explained in the documentation or is it a bug that has now been fixed?

    The old behaviour is correct. The new behaviour is even more correct. The change was intentional.

    The documentation says "Joins the separate strings of LIST into a single string with fields separated by the value of EXPR, and returns that new string." and the wording looks to be the same for all versions

    That's an accurate description of all versions of join.

Front-paged by boftx