PerlMonks  

ref, no, maybe?

by Ovid (Cardinal)
on Jan 11, 2001 at 04:07 UTC ( #51024=perlquestion )

Ovid has asked for the wisdom of the Perl Monks concerning the following question:

Without naming names and starting another CB log/nolog war (just kidding), I noticed some well-respected monks asserting in the CB that most uses of ref were wrong. Essentially, ref is used for deep magic that is beyond the means of most monks (such as this blowhard, for example).

Recently, I found a need to create a query string based upon some data in a hash. I couldn't find a module that did this, so I wrote the following:

sub createQueryString {
    # This routine expects a hash as an argument and will return a valid query string with
    # hash keys as the names and hash values as query string values. If multiple values
    # are associated with one key, pass them as an array reference.
    my %data = ( @_ );
    my ( $key, $value );
    my $query_string = '';

    # The following is a hexadecimal list of the characters that should be
    # uri encoded. This will be passed as the second argument to the
    # uri_escape() function.
    my $cgi_chars = '\x00-\x29\x2b\x2c\x2f\x3a-\x40\x5b-\x5e\x60\x7b-\xff';

    while (($key, $value) = each %data) {
        $key = uri_escape( $key, $cgi_chars );
        if ( ref( $value ) eq "ARRAY" ) {    # <- that's a ref!
            # We have multiple values for this key
            foreach ( @$value ) {
                my $array_value = uri_escape( $_, $cgi_chars );
                $query_string .= "$key=$array_value&";
            }
        } else {
            $value = uri_escape( $value, $cgi_chars );
            $query_string .= "$key=$value&";
        }
    }
    chop $query_string if $query_string; # remove trailing ampersand
    return $query_string;
}

This code is very specific to my needs and does not guarantee to be appropriate for all uses. For example, whether or not a name is sent when there is no corresponding value depends upon the type of input field. Also, error checking is handled long before this routine gets the data, so this code is NOT robust. If you want it, use at your own risk.

So, putting those issues off to the side, is there something wrong with my use of ref here, or is there a better way to approach this?

Cheers,
Ovid

Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

Replies are listed 'Best First'.
(tye)Re: ref, no, maybe?
by tye (Sage) on Jan 11, 2001 at 05:00 UTC

    I say that the only valid use of ref() is doing something like ! ref($r). I think it is just fine to have string manipulating code that refuses to work on a reference (if you really wanted it to do that, then you can stringify the reference before passing it in). So your code would be fine with me if you wrote it either like this:

    if ( ref( $value ) ) {    # <- that's a ref!
        # We have multiple values for this key
        foreach ( @$value ) {
            my $array_value = uri_escape( $_, $cgi_chars );
            $query_string .= "$key=$array_value&";
        }
    } else {
        $value = uri_escape( $value, $cgi_chars );
        $query_string .= "$key=$value&";
    }

    but I'd prefer this:
    if ( ! ref( $value ) ) {
        $value = uri_escape( $value, $cgi_chars );
        $query_string .= "$key=$value&";
    } else {
        # We have multiple values for this key
        foreach ( @$value ) {
            my $array_value = uri_escape( $_, $cgi_chars );
            $query_string .= "$key=$array_value&";
        }
    }

    but note that these will die fatally and rather confusingly if given a hash reference.
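As a rough illustration of that failure mode, the sketch below feeds a hash reference to the same kind of blind `@$value` dereference and captures the error instead of letting the program die. The variable names are illustrative and not part of the code above.

```perl
use strict;
use warnings;

# Illustrative input: a hash reference where an array reference was expected.
my $value = { red => 1 };

# A bare "it is a reference, so treat it as an array" branch dereferences blindly.
my @values = eval { @$value };
my $error  = $@;

# The message starts with "Not an ARRAY reference", which says nothing
# about which key or which caller supplied the bad value.
print "Died with: $error";
```

The message names the kind of reference that was expected, but nothing about the query-string data that triggered it, which is why it can be confusing to track down.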

    The correct substitute for ref() is UNIVERSAL::isa, as in:

    if( UNIVERSAL::isa( $r, "ARRAY" ) ) {
        # access @$r here
    } elsif( UNIVERSAL::isa( $r, "HASH" ) ) {
        # access %$r here
    }

    The problem with "HASH" eq ref($r) is that it will fail if $r is a reference to a blessed hash. Why would you want to refuse to do useful hash stuff on something just because it is blessed?
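A minimal sketch of the difference, assuming a made-up class name `My::Widget` that does not appear in the thread: `ref` reports the class of a blessed hash, while the function form of `UNIVERSAL::isa` still recognises the underlying hash.

```perl
use strict;
use warnings;

my $plain  = { colour => 'red' };
my $object = bless { colour => 'blue' }, 'My::Widget';   # hypothetical class

my @report;
for my $r ( $plain, $object ) {
    push @report, sprintf '%s: eq=%d isa=%d',
        ref $r,                                   # "HASH" or the class name
        ( ref($r) eq 'HASH'            ? 1 : 0 ), # string comparison on ref()
        ( UNIVERSAL::isa( $r, 'HASH' ) ? 1 : 0 ); # looks at the underlying type
}

print "$_\n" for @report;
# HASH: eq=1 isa=1
# My::Widget: eq=0 isa=1
```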

    I also don't approve of using:

    if( ref($r) && "$r" =~ /(^|=)HASH/ ) {
    as this could break if stringification is overloaded.

    Another misuse of ref() is the old:

    sub method {
        my $self= shift;
        die "Invalid object" unless ref($self) eq "My::Package";

    I'd probably leave such a test out completely most of the time, but if you want to do that, then isa() is the right tool:
    die "Invalid object" unless UNIVERSAL::isa( $self, __PACKAGE__ );
    # or
    die "Invalid object" unless eval { $self->isa(__PACKAGE__) };

    The awkward syntax is required because you can't call a method on an unblessed reference (and calling a method on a non-reference tries to treat it like a string containing a package name, which is only what you want here for a class method such as a constructor).
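The three cases described here can be sketched as follows. `My::Package` is borrowed from the snippet above, but its constructor and the variable names are assumptions made only for this example.

```perl
use strict;
use warnings;

package My::Package;    # hypothetical class with a trivial constructor
sub new { return bless {}, shift }

package main;

my $object = My::Package->new;
my $plain  = {};        # an unblessed hash reference

# Function-call form: safe on any value, blessed or not.
my $object_ok = UNIVERSAL::isa( $object, 'My::Package' ) ? 1 : 0;
my $plain_ok  = UNIVERSAL::isa( $plain,  'My::Package' ) ? 1 : 0;

# Method-call form: dies on an unblessed reference, hence the eval.
my $guarded = eval { $plain->isa('My::Package') } ? 1 : 0;
my $error   = $@;

# On a plain string, the method call treats the string as a class name.
my $class_ok = 'My::Package'->isa('My::Package') ? 1 : 0;

print "object=$object_ok plain=$plain_ok guarded=$guarded class=$class_ok\n";
# object=1 plain=0 guarded=0 class=1
```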

    By the way, how do you tell if a reference is blessed without using the fragile hack of:

    if( ref($r) && "$r" =~ /=/ )
    ?

            - tye (but my friends call me "Tye")
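One way to answer this without parsing the stringified reference is `Scalar::Util`, which ships with Perl 5.8 and later and so postdates this thread. The sketch below uses its `blessed` and `reftype` functions; the `describe` helper and the class name `My::Class` are hypothetical.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

# Hypothetical helper: classify a value as non-reference, plain reference,
# or blessed reference, without looking at its stringified form.
sub describe {
    my ($thing) = @_;
    return 'not a reference' unless ref $thing;

    my $class = blessed $thing;    # undef for unblessed references
    my $type  = reftype $thing;    # underlying type, e.g. "HASH" or "ARRAY"

    return defined $class ? "$type blessed into $class" : "unblessed $type";
}

print describe('text'), "\n";                   # not a reference
print describe( [ 1, 2 ] ), "\n";               # unblessed ARRAY
print describe( bless {}, 'My::Class' ), "\n";  # HASH blessed into My::Class
```

Because neither function stringifies the reference, overloaded stringification does not affect the result.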
      >By the way, how do you tell if a reference is blessed without using the fragile hack

      I've always used variations of:

      if ($type = ref $t) {
          eval { $t->isa("UNIVERSAL"); };
          if ( $@=~/unblessed/ ) {
              print "It's a $type, not blessed.\n"
          } else {
              print "It's blessed into class $type";
          }
      } else {
          print "Not a reference at all\n";
      }

      with great success. It may not catch everything, and normally I'm not hunting around to figure out what's in a variable... If you don't know, you're in bigger trouble than this. :)
        This thread strongly reminds me of Readonly error on $_. In his response, merlyn displays some trick using the can method. /me thinks it can be applied here also. Saves the use of UNIVERSAL, but uses eval.
        if ( eval { $r->can('can') } ) {
            # we have an object
        } elsif ( ref( $r ) ) {
            # we have an unblessed reference
        } else {
            # no reference at all
        }

        Merlyn's solution looks somewhat cleaner to me. What do you think?

        Jeroen
        I was dreaming of guitarnotes that would irritate an executive kind of guy (FZ)

        Ah, thanks. I was wanting to use it to avoid the eval, so I guess I'll just keep the eval. (:

                - tye (but my friends call me "Tye")
      tye said:
      The problem with "HASH" eq ref($r) is that it will fail if $r is a reference to a blessed hash. Why would you want to refuse to do useful hash stuff on something just because it is blessed?
      Because a blessed hashref is not a hashref; it's an object. The whole point of an object is encapsulation; to quote The Camel, "In Perl culture, by contrast, you're expected to stay out of someone's home because you weren't invited in, not because there are bars on the windows" (pg. 278, 2nd ed.). That is, just because a blessed hashref is a hashref - with modifiable values - at heart doesn't mean I should manipulate it. That could break an object in spectacular ways.

        Yes, you shouldn't go messing with the hash behind my blessed ref when I give it to you as an object. But that isn't what I'm talking about.

        You have some function that does something useful with a hash. I have an object that consists of a blessed reference to a hash. Inside the code for that class, I have to mess with my own hash. Well, if this manipulation of my own hash would be made easier by using your useful function, then why shouldn't I be allowed to do that?

        Now, it'd be nice, being inside the class code, if I could get a non-blessed reference to my hash. But I can't. If that one thing were different, then I wouldn't mind the use of "HASH" eq ref($r) nearly so much.

                - tye (but my friends call me "Tye")
      Note that even UNIVERSAL::isa is not perfect.

      What happens if someone decides to create a package called ARRAY?

      At some point you have to assume some reasonableness on the part of the programmer using your library...

        What happens if someone decides to create a package called ARRAY?

        Then they deserve to be flogged and banned from CPAN and have no right to complain if anyone's code breaks with their stuff. Did you expect anything different?

        There are other cases like this. I'd rather that there weren't, but you can't expect things to work if you have packages named SUPER or CORE either. In fact, I've never seen an all-caps package name.

        Update: mdillon was kind enough to remind me that I actually have seen a few all-caps package names. He mentioned B and O and I'll add DB. These are all examples of horrid package names. Anyone who doesn't already know care to guess what any of these are for? mdillon also mentioned DBI and CGI. Well, those are reasonably named modules. But I reserve the right to shoot anyone who makes a module whose name is an ALL-CAPS word. ;)

                - tye (but my friends call me "Tye")
Re: ref, no, maybe?
by merlyn (Sage) on Jan 11, 2001 at 04:23 UTC
    It's the "pass them as an array reference sometimes" that's biting you. Perhaps two different interfaces would be useful... one where they're always scalars, and one where they're always array references for the other.

    -- Randal L. Schwartz, Perl hacker

      I'm not sure I follow you on this one. I have one basic need: to convert a hash to a query string. Are you saying that I should have two functions to build one query string? That doesn't strike me as an optimal programming solution. Currently, the hash looks something like this:
      my %hash = (
          name  => 'Ovid',
          color => [ 'red', 'blue' ],
          COBOL => 'sucks'
      );

      That's fed to the routine that creates the query string and everybody's happy. What you're suggesting seems to imply that I should break the hash in two and feed them in separately, or send them to different functions. That seems less efficient. Is there a benefit to approaching it that way, or did I misunderstand your response?

      Cheers,
      Ovid

      Update: Hmm... it occurs to me that I could have made *all* values into array refs. The code would be smaller and easier to follow. The while{} loop becomes this:

      while (($key, $value) = each %data) {
          $key = uri_escape( $key, $cgi_chars );
          foreach ( @$value ) {
              my $array_value = uri_escape( $_, $cgi_chars );
              $query_string .= "$key=$array_value&";
          }
      }

      Saved about six lines of code and made it cleaner, to boot. Damn. merlyn strikes again!

      So here's the interesting question: is it coincidence that I was able to improve this subroutine and eliminate the ref, or is seeing a ref in code generally indicative of a poor algorithm that bears further investigation? Here's another question: is that last sentence pompous enough for you?


        For what it's worth, in my opinion, your use of ref here is perfectly valid and makes your function very flexible and easy to use, for the uses you have for it.

        I remember something in a project I was involved in in the distant past that made use of data structures a little like this:

        $struct = {
            key => {
                sub_key  => 'value',
                sub_key2 => 'value2',
            },
            key2 => 'other_value',
        };

        Or maybe it was:
        @items = (
            'item1',
            [ 'nested1', 'nested2', 'nested3' ],
            [ 'nested4', [ 'nested5', 'nested5a' ] ],
            'item6',
            'item7',
            [ 'nested8', 'nested9' ],
        );

        I don't fully remember the rationale for building a data structure like that, but I think it had something to do with decision-making, where if you'd come across a reference, only one of the references would be used, with control passing off to the next item when it was completed (something like that). I just remember that it required us to use 'ref' when processing it, and perhaps a bit of recursion. I'm perfectly willing to accept that this is bad practice, and I've learned a lot of Perl between then and now and it's likely that I might have come up with an alternative way to accomplish what we were doing... *shrug*.
        My gut level feel is that being overly flexible on input parameters is often a sign of premature optimization. And yes, a "ref" in the code means you're looking in the mirror just a little too hard, most often. There are clearly valid uses for ref, but the first reaction I always have is "can we simplify this", and as you discovered, you could.

        -- Randal L. Schwartz, Perl hacker

        My way of implementing merlyn's suggestion of having multiple functions with different interfaces is to have one function that constructs your query string from one of those interfaces (e.g. arrays) and then have another that accepts a hash-ref, repackages that in terms of array refs, and then calls the first.

        So you have exactly one real implementation (should you want to change the construction of the query string it is easy to do so) yet you have two functions: one that can't be called with a hash ref, and another that needs an array ref. All of this comes without doing any test within the function or making the implementation of the critical code more complex.
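One possible reading of this design, following merlyn's split into an "always scalars" interface and an "always array references" interface, is sketched below. The function names are hypothetical, keys are sorted only to make the output predictable, and `escape_value` is a simplified stand-in for `uri_escape()` so that the example has no non-core dependencies.

```perl
use strict;
use warnings;

# Hypothetical stand-in for URI::Escape's uri_escape(): percent-encodes
# everything except letters, digits, and the characters _ . ~ -
sub escape_value {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.~-])/sprintf '%%%02X', ord $1/ge;
    return $text;
}

# The one real implementation: every value must be an array reference.
sub query_string_from_lists {
    my (%data) = @_;
    my @pairs;
    for my $key ( sort keys %data ) {
        my $name = escape_value($key);
        push @pairs, map { "$name=" . escape_value($_) } @{ $data{$key} };
    }
    return join '&', @pairs;
}

# Second interface: every value must be a plain scalar. It wraps each value
# in an array reference and delegates, so neither function needs ref().
sub query_string_from_scalars {
    my (%data) = @_;
    return query_string_from_lists( map { ( $_ => [ $data{$_} ] ) } keys %data );
}

print query_string_from_lists(
    name  => ['Ovid'],
    color => [ 'red', 'blue' ],
    COBOL => ['sucks'],
), "\n";
# COBOL=sucks&color=red&color=blue&name=Ovid

print query_string_from_scalars( name => 'Ovid', motto => 'a b' ), "\n";
# motto=a%20b&name=Ovid
```

Building the pairs in a list and joining them also avoids the trailing ampersand that the original routine removes with chop.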
