Not_a_Number has asked for the wisdom of the Perl Monks concerning the following question:

Please don't -- me before reading on. I know this is a FAQ, and I know the answer(s) in the FAQ. I also know it's in the Q&A section here (although some of the answers there are rather strange...).

It's just that a couple of times recently I've come across this sort of thing:

my @array = qw ( a b c d x );
my $sought = 'a';
print 'Found!' if index ("@array", $sought) > -1;

So, my question: is there anything wrong with this? (if there is, I'm quite happy to forget I ever saw it :-).

TIA
dave

Re: Is X in my array?
by dws (Chancellor) on Jun 17, 2003 at 18:20 UTC
    So, my question: is there anything wrong with this?

    If your target array is only going to contain single-character strings, then no. But if the strings get larger, you risk matching "food" when you're seeking "foo". And if the strings ever contain spaces, you risk more false positives.

    Better, I think, to use grep.
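To make the two failure modes concrete, here is a minimal sketch. The sample data and the helper names `in_joined` and `in_list` are invented for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helpers, named here only for illustration.

# Substring search over the space-joined array (the idiom in question).
sub in_joined {
    my ($sought, @list) = @_;
    return index("@list", $sought) > -1;
}

# Exact element comparison with grep in scalar context.
sub in_list {
    my ($sought, @list) = @_;
    return scalar grep { $_ eq $sought } @list;
}

my @array = qw( food bar baz );

print "index: 'foo' found\n"  if in_joined('foo',  @array);  # false positive: substring of "food"
print "index: 'd b' found\n"  if in_joined('d b',  @array);  # false positive: spans the separator
print "grep:  'foo' found\n"  if in_list('foo',    @array);  # prints nothing
print "grep:  'food' found\n" if in_list('food',   @array);  # genuine match
```

Only the grep version compares whole elements, which is why it reports 'food' but not 'foo'.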

      ...if the strings get larger, you risk matching "food" when you're seeking "foo". And if the strings ever contain spaces, you risk more false positives.

      Right, thanks. I should have thought of that myself. In other words, this 'idiom' is highly dangerous.

      dave
Re: Is X in my array?
by Zaxo (Archbishop) on Jun 17, 2003 at 18:25 UTC

    That's actually pretty sane, except for the false positives you get from various substrings: @array = qw{foo bar baz}; $sought = qq(o$"b);. If you know your data well enough to eliminate that, this ought to be pretty efficient.

    After Compline,
    Zaxo

Re: Is X in my array?
by antirice (Priest) on Jun 17, 2003 at 18:31 UTC

    It really depends. Suppose you're looking for 'a b' as a value within the array. You would get a false positive if you use this method. To correct it, I guess you can do something along these lines:

    my @array = qw( a b c d x );
    my $sought = 'a';
    $" = '|-nothing we expect-|';
    print "Found!" if index($" . "@array" . $", $" . $sought . $") > -1;

    Look at that disgusting code... Please note that the preferred way to do this is grep in a scalar context:

    my @array = qw( a b c d x );
    my $sought = 'a';
    print "Found!" if grep $_ eq $sought, @array;

    With this method, you only iterate over the array instead of all the characters within the array plus $". Since grep is used in a scalar context, it returns the number of matches within the array.
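The difference between the two contexts can be seen in a short sketch. The sample array, with a repeated 'a', is an assumption chosen so that the count is not simply 1.

```perl
use strict;
use warnings;

my @array  = qw( a b c a x );
my $sought = 'a';

# Scalar context: grep returns how many elements matched.
my $count = grep { $_ eq $sought } @array;

# List context: grep returns the matching elements themselves.
my @hits = grep { $_ eq $sought } @array;

print "count: $count\n";       # count: 2
print "hits:  @hits\n";        # hits:  a a
print "Found!\n" if $count;    # any non-zero count is true
```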

    Update: Ah geez.. walk away to grab a drink, come back to finish your answer and two people have posted. You guys are quick :P

    antirice    
    The first rule of Perl club is - use Perl
    The ith rule of Perl club is - follow rule i - 1 for i > 1

      Thanks, antirice. But I always thought it wasn't good practice to use grep for such purposes. From perlfaq4, How can I tell whether a certain element is contained in a list or array?:

      Please do not use ($is_there) = grep $_ eq $whatever, @array;

      Am I missing something?

      Thanks again

      dave

        If you are searching through the array multiple times, then right, this isn't the best option. If you are reusing the array to compare against, the best practice is this:

        my @array = qw ( a b c d x );
        my %hashcheck;
        @hashcheck{@array} = ();
        my $sought = 'a';
        print "Found!" if exists $hashcheck{$sought};

        The reason they suggest not using grep is that it is an O(n) algorithm, whereas checking with exists is O(1). When you repeatedly check @array to see whether or not items can be found, it would be much slower to use grep.
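A minimal sketch of the repeated-lookup case follows. The name `%seen` and the list of sought values are assumptions for illustration.

```perl
use strict;
use warnings;

my @array = qw( a b c d x );

# Build the lookup table once: O(n). The values are all undef;
# only the keys matter, so membership is tested with exists.
my %seen;
@seen{@array} = ();

# Each later check is a single hash lookup instead of a scan of @array.
for my $sought (qw( a q x )) {
    printf "%s: %s\n", $sought, exists($seen{$sought}) ? 'found' : 'missing';
}
```

The cost of building the hash only pays off when the same array is checked more than once.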

        As for their suggestion in the case that the array is checked only once, I could go either way. Their method is actually 17% faster on a million iterations (my benchmarks, YMMV), but I find grep easier to read.
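For anyone who wants to measure this on their own machine, here is a sketch using the core Benchmark module. The sub names `with_grep` and `with_loop` and the sample data are assumptions, and the early-exit loop is only an approximation of the FAQ's one-off approach. It does not attempt to reproduce the 17% figure, and results will vary with array size and with where the match sits.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @array  = qw( a b c d x );
my $sought = 'c';

# grep always scans the whole array and returns the match count.
sub with_grep {
    return scalar grep { $_ eq $sought } @array;
}

# Early-exit loop: stops at the first match.
sub with_loop {
    for my $elt (@array) {
        return 1 if $elt eq $sought;
    }
    return 0;
}

# A negative count asks Benchmark to run each sub for at least
# that many CPU seconds, then print a comparison table.
cmpthese(-1, {
    'grep' => \&with_grep,
    'loop' => \&with_loop,
});
```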

        antirice    
        The first rule of Perl club is - use Perl
        The ith rule of Perl club is - follow rule i - 1 for i > 1

Re: Is X in my array?
by sauoq (Abbot) on Jun 18, 2003 at 00:07 UTC
    I know this is a FAQ, and I know the answer(s) in the FAQ.
    .
    .
    .
    index ("@array", $sought) > -1;

    Please note, however, that isn't one of the suggested methods that is in the FAQ. ;-)

    I disagree with the FAQ in this case. Often, grep is exactly what you want. If grep isn't good enough, then you probably chose the wrong data structure in the first place. (The FAQ also shows a C-ish example of hopping out of a loop when an element is found. Even though that halves your number of comparisons on average, it's still O(N); given that grep is quick, for small arrays, it probably isn't worth it.)

    As far as doing it this way, well... you've already gotten some responses that outline the issues. As dws said, if it is a single character you are looking for, then no problem. (What he didn't mention was that if it is a single character you are looking for, then you might be better off using tr// instead of index().)
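A minimal sketch of the tr// idea, assuming every element is a single character. Note that tr// does not interpolate variables, so the character being sought has to be written literally (or the tr built with eval).

```perl
use strict;
use warnings;

my @array = qw( a b c d x );

# Join without a separator so the separator itself can never match.
my $joined = join '', @array;

# With an empty replacement list and no /d modifier, tr counts the
# listed characters and leaves the string unchanged.
my $count = ($joined =~ tr/a//);

print "Found!\n" if $count;   # prints Found!
```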

    You can use it for longer data if you know a particular substring will never show up in the data. For obvious reasons, it's best if that substring is short, like a single character. A good choice might be "\0". Once you choose your substring, you have to set $" appropriately and you still have to be careful... Some examples of what to watch for:

    @a = qw( food bar baz );
    $" = "\0";
    print "Found\n" if index("@a", qq/foo/) > -1;       # Wrongly succeeds.
    print "Found\n" if index("@a", qq/foo$"/) > -1;     # Rightly fails.
    print "Found\n" if index("@a", qq/$"bar$"/) > -1;   # Rightly succeeds...
    print "Found\n" if index("@a", qq/$"baz$"/) > -1;   # Wrongly fails.
    Here's how to make it work:
    print "Found\n" if index(qq($"@a$"), qq/$"foo$"/) > -1;
    In words, you have to tack the delimiter on to both the front and back of both the string you are searching through and the one you are searching for.
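The same rule can be wrapped in a small helper. This is a sketch under the stated assumption that "\0" never occurs in the data. The name `in_delimited` is hypothetical, and it uses join with an explicit separator rather than setting $", so the global variable is left alone.

```perl
use strict;
use warnings;

# Hypothetical helper: assumes "\0" never occurs in any element.
sub in_delimited {
    my ($sought, @list) = @_;
    my $sep = "\0";

    # Delimiter on both ends of the haystack and of the needle.
    my $haystack = $sep . join($sep, @list) . $sep;
    return index($haystack, $sep . $sought . $sep) > -1;
}

my @a = qw( food bar baz );

for my $sought (qw( foo food baz )) {
    printf "%s: %s\n", $sought, in_delimited($sought, @a) ? 'found' : 'not found';
}
# foo: not found
# food: found
# baz: found
```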

    Not only is it a mess, stringifying the array might use a lot of memory if the array is big... and if the array isn't big, you aren't really buying yourself much.

    Conclusion: In most cases, grep will do fine. If it won't, choose a better data structure. If that isn't an option, write a loop and bail out after finding the first match. But there's no good reason to use index on a stringified array, IMHO.

    -sauoq
    "My two cents aren't worth a dime.";
    
Re: Is X in my array?
by sgifford (Prior) on Jun 17, 2003 at 18:46 UTC
    If your data will work like this (there are no foo vs. food problems), the index approach is actually almost twice as fast as grep.
Re: Is X in my array?
by Aristotle (Chancellor) on Jun 20, 2003 at 19:13 UTC
    All caveats aside for the many cases where this will cause false positives, I'd write it with a regex for expressiveness:
    my @array = qw ( a b c d x );
    my $sought = 'a';
    print 'Found!' if "@array" =~ /\Q$sought/;
    While it is very slightly less efficient as it needs to be compiled, the regex engine will see it is dealing with a fixed string and use shortcuts rather than generic pattern matching, so that will only matter in a tight loop processing a lot of data. However, I find it reads a lot more naturally.
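The \Q in that pattern matters as soon as the sought string contains regex metacharacters. A small sketch with made-up data shows the difference; the substring false positives discussed earlier in the thread still apply either way.

```perl
use strict;
use warnings;

my @array  = qw( abc a.c );
my $sought = 'a.c';

# Without \Q the dot is a wildcard, so the pattern matches "abc" first.
my ($unquoted) = "@array" =~ /($sought)/;

# With \Q the dot is literal, so only "a.c" can match.
my ($quoted) = "@array" =~ /(\Q$sought\E)/;

print "unquoted matched: $unquoted\n";   # unquoted matched: abc
print "quoted matched:   $quoted\n";     # quoted matched:   a.c
```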

    Makeshifts last the longest.