John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

A v-string, such as v65.66.67, is encoded as a character string with no special type designation. So, when given a parameter that might be a v-string or might be something else, how do you tell them apart?

I think I came up with an efficient and useful way:

I'll treat it as a v-string if it contains any characters in the range \x00-\x1F. Most v-strings consist entirely of values in that range anyway, and real words (specifically including version designators spelled out as ASCII digits and identifiers) will not contain non-graphic control codes. Maybe I'll leave out \t, \n, and \r just to be safe.

Two key points: I can perform the check using the fast counting mode of tr; and, more importantly, an unusual v-string, like the one in the first sentence, can be made distinguishable by adding a trailing .0, which does not affect the meaning.
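
A minimal sketch of that heuristic, for illustration only. The name looks_like_vstring is hypothetical, and the check is only the guess described above (it cannot prove how the value was written):

```perl
use strict;
use warnings;

# Hypothetical helper: true if the string contains any control character
# in \x00-\x1F other than tab (\x09), newline (\x0A), and carriage return (\x0D).
sub looks_like_vstring {
    my ($s) = @_;
    # With an empty replacement list and no /d, tr/// only counts matches.
    my $count = ($s =~ tr/\x00-\x08\x0B\x0C\x0E-\x1F//);
    return $count > 0;
}

# v65.66.67 is the same string as "ABC", so it only passes once a
# trailing .0 (a chr(0) character) is appended.
for my $arg (v1.2.3, "1.2.3", v65.66.67, v65.66.67.0) {
    my $verdict = looks_like_vstring($arg) ? "v-string" : "other";
    print "$verdict\n";
}
```

Note that the third case is classified as "other", which is exactly the situation the trailing .0 is meant to work around.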

What do y'all think? Any better ideas?

—John

Re: Distinguishing a v-string from something else
by theorbtwo (Prior) on Dec 04, 2002 at 21:32 UTC

    'Tain't necessarily so. My IP address, for example, is 68.83.61.190. (Actually, it isn't; I fudged the numbers both so I wouldn't get cracked and so the example would work.) Also, adding a trailing .0 /will/ affect the string, and may or may not affect the meaning. There is /no/ general way to tell a v-string from something else. For example, if you get '1.0', that could really have been written '1.0', or 49.46.48. There's no way to tell, other than context.

    Sorry.


    Warning: Unless otherwise stated, code is untested. Do not use without understanding. Code is posted in the hopes it is useful, but without warranty. All copyrights are relinquished into the public domain unless otherwise stated. I am not an angel. I am capable of error, and err on a fairly regular basis. If I made a mistake, please let me know (such as by replying to this node).

      Good point. My context is that of a version number. Adding trailing zeros will not affect the greater/less ordering, but will affect exact matches using eq (I plan to strip trailing zeros to normalize before comparing for equality).
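
      A rough sketch of that normalization, assuming the values really are v-strings. The helper name strip_trailing_zeros is made up for this example:

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: drop trailing zero components (chr(0)) from a v-string.
      # Caveat: v0 normalizes to the empty string.
      sub strip_trailing_zeros {
          my ($v) = @_;
          $v =~ s/\0+\z//;
          return $v;
      }

      my ($short, $long) = (v1.2, v1.2.0.0);
      print "raw eq:        ", ($short eq $long ? "same" : "different"), "\n";
      print "normalized eq: ",
          (strip_trailing_zeros($short) eq strip_trailing_zeros($long) ? "same" : "different"), "\n";
      # Ordering against a different version comes out the same either way.
      print "both lt v1.3\n" if $short lt v1.3 and $long lt v1.3;
      ```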

      A function that takes an IP address as a buffer of 4 bytes, a v-string (which will UTF-8-encode the values above 127), or a string can tell the difference by checking whether the argument contains only digits and dots, and by checking its total length.
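
      Something along these lines, as a sketch only. The function name ip_octets is hypothetical, and no range checking of the octets is done:

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: return the four octets of an IPv4 address given
      # either as dotted-decimal text or as four characters (a packed buffer
      # or a v-string). Dotted text is at least 7 characters long, so the two
      # forms cannot be confused with each other.
      sub ip_octets {
          my ($arg) = @_;
          if ($arg =~ /\A(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\z/) {
              return ($1, $2, $3, $4);                 # text such as "10.0.0.1"
          }
          if (length($arg) == 4) {
              return map { ord($_) } split //, $arg;   # one character per octet
          }
          die "unrecognized address format\n";
      }

      print join('.', ip_octets("68.83.61.190")), "\n";
      print join('.', ip_octets(v68.83.61.190)), "\n";
      print join('.', ip_octets(pack 'C4', 68, 83, 61, 190)), "\n";
      ```

      The packed buffer and the v-string end up in the same branch, which is fine here because they mean the same thing.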

      —John

        a v-string (which will UTF-8-encode the values above 127)

        actually, they're UTF-8 encoded if they're above 255.

        > perl -MDevel::Peek -e"$a=v256;$b=v128;Dump$a;Dump$b"
        SV = PV(0x182ebf4) at 0x182383c
          REFCNT = 1
          FLAGS = (POK,pPOK,UTF8)
          PV = 0x182013c "\304\200"\0
          CUR = 2
          LEN = 3
        SV = PV(0x182ec24) at 0x1823848
          REFCNT = 1
          FLAGS = (POK,pPOK)
          PV = 0x182012c "\200"\0
          CUR = 1
          LEN = 2

        you can differentiate between a number and a v-string by examining the result of the SvPOKp(SV*) macro. a true return means a v-string, false indicates a number.

        if you can differentiate between a v-string and any other string by your above method, you should be able to determine whether or not you have a v-string, no?

        ~Particle *accelerates*

      For example, if you get '1.0', that could have really been written '1.0', or 49.46.48. There's no way to tell, other then context.

      i'm not so sure this is correct. the code below seems to illustrate that v-strings are stored differently from other text strings: v-strings are stored as octals and the periods are not included. \1 is not the same as chr(49).

      > perl -MDevel::Peek -e"$a=v1.0; $b='1.0'; Dump$a; Dump$b"
      SV = PV(0x182ebf4) at 0x182383c
        REFCNT = 1
        FLAGS = (POK,pPOK)
        PV = 0x182013c "\1\0"\0
        CUR = 2
        LEN = 3
      SV = PV(0x182ec24) at 0x1823848
        REFCNT = 1
        FLAGS = (POK,pPOK)
        PV = 0x182012c "1.0"\0
        CUR = 3
        LEN = 4

      ~Particle *accelerates*

        Ahh, but I didn't say that v1.0 eq "1.0"; I said '1.0' eq 49.46.48, and they are:

        >perl -MDevel::Peek=Dump -e "$a=49.46.48; $b='1.0'; Dump $a; Dump $b"
        SV = PV(0x225484) at 0x19245fc
          REFCNT = 1
          FLAGS = (POK,pPOK)
          PV = 0x22ce4c "1.0"\0
          CUR = 3
          LEN = 4
        SV = PV(0x2254b4) at 0x1924608
          REFCNT = 1
          FLAGS = (POK,pPOK)
          PV = 0x22b424 "1.0"\0
          CUR = 3
          LEN = 4

        Oh, two BTWs: all strings are stored as an array of chars (c-style); base doesn't enter into it, in the same way as 0x10 and 16 are both the same number. It's just that Devel::Peek prints nonprintable characters as octal escapes.



Re: Distinguishing a v-string from something else
by particle (Vicar) on Dec 04, 2002 at 20:01 UTC

    out of curiosity, why do you need to deal with v-strings?

    i believe they're considered a failed experiment, and will be removed soon (perhaps perl 5.010_000 ;-)

    ~Particle *accelerates*

      v-strings are used by the VERSION argument in a use or require. You cannot pass anything else!

      As far as I've found, the only thing wrong with them is not being able to distinguish them from a string. Giving them their own ref type, like compiled regexes have, would be great.
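
      For what it's worth, later Perl releases did move in that direction: as far as I know, from 5.10 on, ref() applied to a reference to a v-string reports 'VSTRING'. A small sketch under that assumption (the helper name is_vstring is made up here, and the marker may not survive string operations on the value):

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: true if the scalar still carries its v-string marker.
      # Assumes Perl 5.10 or later, where ref() can report 'VSTRING'.
      sub is_vstring {
          return ref(\$_[0]) eq 'VSTRING';
      }

      my $v = v1.2.3;
      my $s = "\x01\x02\x03";
      print "same characters:  ", ($v eq $s ? "yes" : "no"), "\n";
      print "v-string literal: ", (is_vstring($v) ? "yes" : "no"), "\n";
      print "plain string:     ", (is_vstring($s) ? "yes" : "no"), "\n";
      ```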

      When I first posted the Exporter::VA strawman, someone else (tye?) mentioned that v-strings might be going away, but then couldn't find any basis for the rumor.

      If perl v5.10 changes things and starts taking something else (string?) for the VERSION indirect-object parameter, I'll accommodate the new way too.

        v-strings are used by the VERSION argument in a use or require. You cannot pass anything else!

        that is not correct.

        #!/usr/bin/perl
        require 5.006_001;
        package vstring;
        our $VERSION = '1.0.1';
        package number;
        our $VERSION = 1.34_001e-5;
        package string;
        our $VERSION = '1.2beta4a';
        package main;
        print $vstring::VERSION, $/;
        print $number::VERSION, $/;
        print $string::VERSION, $/;
        ## prints:
        ## 1.0.1
        ## 1.34001e-005
        ## 1.2beta4a

        don't use v-strings, there are alternatives.

        ~Particle *accelerates*

      They're staying around in perl6, so I assume they'll be there in 5.10. I think you're thinking of pseudo-hashes.



Re: Distinguishing a v-string from something else
by converter (Priest) on Dec 04, 2002 at 23:25 UTC

    The v-string is just a convenient way for the programmer to construct a literal string composed of (possibly) non-printable characters.

    $string = v1.2.42;
    # or
    $string = 1.2.42;
    # is just an easier way of writing:
    $string = "\x01\x02\x2a";

    It doesn't matter how the programmer using your code wrote the string expression; the result is the same. If your code expects a string limited to a certain subset of characters, then you should validate the input.
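
    As a sketch of what such validation might look like, assuming the code wants versions as dotted-decimal text. The name validate_version is hypothetical; sprintf's %vd flag is the standard way to print a v-string's ordinals joined by dots:

    ```perl
    use strict;
    use warnings;

    # Hypothetical validator: accept a version given as dotted-decimal text
    # and return it unchanged, or die.
    sub validate_version {
        my ($arg) = @_;
        die "not a dotted-decimal version\n"
            unless $arg =~ /\A\d+(?:\.\d+)*\z/;
        return $arg;
    }

    # The literal forms really are the same string.
    print "same string\n" if v1.2.42 eq "\x01\x02\x2a";

    # A caller holding a v-string can convert it to text first.
    my $text = sprintf '%vd', v1.2.42;
    my $ok   = validate_version($text);
    print "$ok\n";

    # Passing the raw v-string is rejected, since it contains control characters.
    my $accepted = eval { validate_version(v1.2.42); 1 };
    my $result   = $accepted ? "accepted" : "rejected";
    print "raw v-string: $result\n";
    ```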