John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

I have a function that uses an array of strings, and before it works it needs to put the strings into canonocal form. Now, I only want to do that the first time I use that array, by "correcting" the elements in-place.

So, how can I tell if I've seen that array (passed by reference to the function) before? If I store a flag in a hash indexed by the ref's string, is that string always the same? That is, could I memo-ize it based on the ref's stringification? The other idea is to bless it, and then test to see if that's been done. But I wonder if there are other ways to "mark" something. Attributes currently work on variables, not values, right?

—John

Replies are listed 'Best First'.
Re: Marking an array as already seen
by sauoq (Abbot) on Jan 17, 2003 at 06:38 UTC

    Tie::RefHash comes with perl.

    I don't think I'd use Memoize.

    Untested:

    use Tie::RefHash; { tie my %refhash, 'Tie::RefHash'; sub canonicalize_array { my $ref = shift; return $ref if exists $refhash{$ref}; # Otherwise canonicalize the array. $refhash{$ref}++; $ref; } }

    Edit: Added the obvious $refhash{$ref}++; line.

    -sauoq
    "My two cents aren't worth a dime.";
    
Re: Marking an array as already seen
by PodMaster (Abbot) on Jan 17, 2003 at 07:02 UTC
    Try this on for size ;)
    sub foy { my $ref = shift; canonicalize($ref) unless $ref->isa('_Canonical'); }
    update: I guess I should've read the entire thing first ;) but not to fear, read on ...

    If I store a flag in a hash indexed by the ref's string, is that string always the same?

    Yes. A reference to the same structure will always stringify the same (in the same instance of the program of course), otherwise, the perl garbage collector couldn't work (reference counting is, well, reference counting, so it has to stringify the same).

    Also, to learn about attributes, check out the pragma.


    MJD says you can't just make shit up and expect the computer to know what you mean, retardo!
    ** The Third rule of perl club is a statement of fact: pod is sexy.

      Yes. A reference to the same structure will always stringify the same (in the same instance of the program of course), otherwise, the perl garbage collector couldn't work (reference counting is, well, reference counting, so it has to stringify the same).

      Unless, of course, it's a reference to a class that has overloaded ""... in which case you'll have to use refaddr from Scalar::Util :-)

Re: Marking an array as already seen (yes)
by tye (Sage) on Jan 17, 2003 at 18:07 UTC

    Yes, "$ref" is always the same if $ref= \@array; for the same array. Just go ahead and use it as the key in a hash. This is how Tie::RefHash is able to work.

    You can even use $hash{0+$ref} which is probably faster.

                    - tye
      OK, why is {0+$ref} faster? I can see the different semantics... $hash{$ref} converts the ref to a string, while $hash{0+$ref} converts the ref to a number and then to a string. A quick test shows that converting a ref to a number returns (no surprise) a number, probably the machine address of an underlying unique piece of data in the interpreter, the same as the hex value shown in the stringification form.

      So why would the latter be faster?

      —John

        Just a tiny bit faster, but really "simpler" might have been a better description. It won't get confused if someone (re)blesses the array, for example. It will (probably, as I said) be slightly faster because it doesn't even have to look up blessedness and put that into the string and it will use a shorter string. *shrug*

        It was a quick note I added at the end. I prefer 0+$ref over "$ref" in general. But the speed isn't really the point. Just not doing extra work that might get in the way.

                        - tye
Re: Marking an array as already seen
by shotgunefx (Parson) on Jan 17, 2003 at 07:40 UTC
    I'd go with sauog's suggestion but would use a regular hash. You already have the reference so you don't need the tie to preserve it. If you wanted to delve deeper, I do believe that there are some user defined flags you could set on the SV itself IIRC, but I would just use the hash method.

    -Lee

    "To be civilized is to deny one's nature."
Re: Marking an array as already seen ('~'magic virtual table)
by shotgunefx (Parson) on Jan 17, 2003 at 19:06 UTC
    John,
    I'd like to thank you for asking a good question that has blown my work schedule out of the water and ruined my day ;).

    I've wanted the same thing on occasion,(For lazy iterating, etc without tieing blessing, bla,bla) so I did it.
    I wrote an XS module using the '~' user defined magic vtable. It respect's other magic and let's specify a name space to avoid collisions. The only problem I have is that I'm trying to set it to use one context sensitive routine for get/set. Once I try and process the argument stack manually it blows up. (Just saying SvOK(target) core dumps.) I'm new to XS and guts so it is probably something stupid I'm doing. The main implementation was quick, but I've been slamming my head off the wall with the optional arugments for like 7 hours. I'll post later today about that problem and then I'll post the module.

    When it is it finished it will work like so.
    use Magic::Attach; sub bind { my ($ref,$shadow) = @_; attach($ref,$shadow, 'optional namespace'); } # Later sub foo { my $ref = shift; my $shadow = attach($ref); $shadow = attach($ref,{}) unless $shadow; my $count = $shadow->{Someattribute}; }
    -Lee

    "To be civilized is to deny one's nature."
      Interesting. You're saying that in the perl implementation, there is a user-defined "slot" that can hold ... what? A ref to another thing, which you choose to be a hash for your property table, it seems. But you called it a user-defined v-table, like it's a table rather than a single slot. Or maybe that's just the inteded use?

      Care to tell us more about that? Describing it might help gell your knowledge about XS guts.

      —John

        There are different types of magic (this is how tie is implemented amongst other things.), I am using the '~' user extension type. Multiple variables can be attached to a reference as well and as long as they specify a different 'namespace', they won't clobber each other.

        I choose to let my implementation use any ref blessed or not. The table portion refers to additional methods that can be implemented.
        Function pointer Action taken ---------------- ------------ svt_get Do something after the value of the SV is retr +ieved. svt_set Do something after the SV is assigned a value. svt_len Report on the SV's length. svt_clear Clear something the SV represents. svt_free Free any extra storage associated with the SV.
        To get the value bound to the variable, I'm simply returning it's pointer stored in mg_ptr. See the Magic Virtual Tables section in perldoc:perlguts for a better explanation (It's late and I'm bug-eyed tired). I'm only implementing free though it could be extended to open up the other functions. I probably will do that. If I did, let's say you were trying to make a cache of the output of some larger process, you would be able to add a callback to the data structure via set && clear magic that would invalidate that part of the cache when updated and it wouldn't require any direct action by the user.

        The implementation as it stands is exposed through a single routine but this might change as the function attach is almost too DWIM.
        Here's a run down of what it looks like.
        attach ( $somepointer, { seen=>1 }, 'opt_name_to_avoid_collision') or + die "Couldn't attach; # Later on... my $attached = attach($somepointer); $attached->{bar}= uc $attached->{bar};
        Not a compelling example above but I can think of some better ones which I'll post when it's cleaned up. If you think about how matching works and how it tracks where it is in a string, well you can do the same thing with any sub now without trying to compute unique keys from caller() + data and storing it in a hash. Lazily processing lists non-destructively is another good example.

        -Lee

        "To be civilized is to deny one's nature."
Re: Marking an array as already seen
by dragonchild (Archbishop) on Jan 17, 2003 at 15:14 UTC
    I like blessing it and then checking for the bless later. That seems to be the easiest solution. You could also just make the entire data structure a class and encapsulate the issues. (But, I'm OO-happy.)

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.