Delusional has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I'm trying to check for the occurrence of a string/value within another string. In this case:

$thestring = "13,130,213";
$thecheck = "13";
if ($thecheck =~ /$thestring/) {
    print "$thecheck is in $thestring";
}

Currently I get no output. I could split $thestring on the comma and loop over the pieces, but I'd prefer not to. At the same time, I don't want false positives either (in this example only 13 should match, not 130 or 213). The value itself is not displayed; I only check whether it is present in the compare string. Yes, these are numbers; however, I'm intentionally handling them as strings, since either string can contain numbers or letters.

I'm most likely doing this wrong, but for the life of me, I can't figure out the right way. Can someone please enlighten me?

Replies are listed 'Best First'.
Re: Checking for occurrence in comma separated string
by phenom (Chaplain) on Oct 25, 2005 at 10:54 UTC
    I think you want that the other way around:
    if($thestring =~ /$thecheck/) { ... }
Re: Checking for occurrence in comma separated string
by prasadbabu (Prior) on Oct 25, 2005 at 10:58 UTC

    Change it as shown below.

    $thestring = "13,130,213";
    $thecheck = "13";
    if ($thestring =~ /$thecheck/) {
        print "$thecheck is in $thestring";
    }

    Prasad

Re: Checking for occurrence in comma separated string
by mulander (Monk) on Oct 25, 2005 at 10:59 UTC
    You get no output because you get no match. Instead of doing:
    if($thestring =~/13/) { print "13 is in the string"; }
    You are doing:
    if("13" =~ /13,130,213/) { print "13 is in the string"; }
    So in fact you are searching for the 10-character string inside a 2-character string. The thing on the left of =~ is the variable/string in which you want to search, and the thing on the right is the regular expression you want to search for. To make your code snippet work the way you want, do it like this:
    my $thestring = "13,130,213";
    my $thecheck = "13";
    if($thestring =~ /$thecheck/){
        print "$thecheck is in $thestring\n";
    }
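    A minimal sketch of the operand order described above. The helper name contains_substring is hypothetical and only wraps the same match so that both directions can be tried side by side.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): the string being searched
# goes on the left of =~, the pattern built from $needle on the right.
sub contains_substring {
    my ($haystack, $needle) = @_;
    return $haystack =~ /$needle/ ? 1 : 0;
}

# Correct direction: search the list for the short string.
print contains_substring("13,130,213", "13") ? "found\n" : "not found\n";

# Swapped direction, as in the original question: the long string
# cannot occur inside the short one.
print contains_substring("13", "13,130,213") ? "found\n" : "not found\n";

# A plain substring match still reports "found" here, because "13"
# occurs inside both "130" and "213".
print contains_substring("15,130,213", "13") ? "found\n" : "not found\n";
```

    The last call shows that fixing the operand order alone does not rule out partial matches.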
Re: Checking for occurrence in comma separated string
by bart (Canon) on Oct 25, 2005 at 11:03 UTC
    You could turn the string into an array, using split, and then use grep to check existence of an item. (Or use a hash as a set if you need to test this one list a lot.)
    my @list = split /,/, $thestring;
    if(grep { $_ == $thecheck } @list) {
        print "Found it!\n";
    }

    # Or, if you need this a lot:
    # preparation:
    my %set;
    $set{$_} = 1 foreach split /,/, $thestring;
    # actual test:
    if($set{$thecheck}) {
        print "Found it!\n";
    }

    If you do insist on using a regular expression, you need to be aware that a number can appear at the beginning or at the end of a string. Example check:

    if($thestring =~ /(?:^|,)$thecheck(?=,|$)/) { ... }
    but I prefer the double negative:
    if($thestring =~ /(?<![^,])$thecheck(?![^,])/) { ... }
    (If there's anything in front or right after the match, it may not be anything other than a comma.)
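    As a rough check of the double-negative pattern, the sketch below runs it against a few lists. The helper name in_list_regex is hypothetical, and the \Q...\E around the interpolated value is an addition that is not in the pattern above; it only makes regex metacharacters in the value match literally.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if $item is a complete element of the
# comma-separated $list. The lookarounds say "no non-comma character
# directly before or after the match", which also holds at the very
# start and end of the string.
sub in_list_regex {
    my ($list, $item) = @_;
    return $list =~ /(?<![^,])\Q$item\E(?![^,])/ ? 1 : 0;
}

for my $list ("13,130,213", "15,130,213", "130,13", "13") {
    printf "%-12s %s\n", $list,
        in_list_regex($list, "13") ? "match" : "no match";
}
```

    Only the list "15,130,213" should report "no match", since there 13 appears only as part of 130 and 213.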
      Thanks Bart, I think you're right. I was hoping for a simple if comparison method without using arrays etc., but it appears that the array method is the method I should be using for this.

      Thanks...
        You can also prepend and append a comma to your strings, and use index to quickly search for it. It will likely be faster than a dynamically built regexp.
        if(index(",$thestring,", ",$thecheck,") >= 0) { ... }
        That will search for ",13," inside the string ",13,130,213,".
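        A small sketch of the comma-wrapping trick, wrapped in a hypothetical helper named in_list_index so it can be tried on several inputs. It assumes the check value itself never contains a comma.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap both strings in commas so that every
# element of the list, including the first and last, is delimited on
# both sides, then do a plain substring search with index.
sub in_list_index {
    my ($list, $item) = @_;
    return index(",$list,", ",$item,") >= 0 ? 1 : 0;
}

for my $list ("13,130,213", "15,130,213", "130,213,13") {
    printf "%-12s %s\n", $list,
        in_list_index($list, "13") ? "found" : "not found";
}
```

        One thing to be aware of: a check value that does contain a comma, such as "13,130", would match a run of adjacent elements rather than a single one.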
Re: Checking for occurrence in comma separated string
by blazar (Canon) on Oct 25, 2005 at 11:38 UTC
    Seems you already received quite a lot of answers, and it all boils down to having switched the two variables you are working on. That said, it seems to me that no one has mentioned yet that if you want to interpolate a variable into a regex, then you should take \Q into account, and possibly, for more complex stuff, \E: read about them in perldoc perlre. But then you'd just be checking for the occurrence of a substring, in which case I think the regex engine would optimize the match to use (something like) index internally. So why not resort to it directly?
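    To illustrate why \Q matters, here is a minimal sketch with a check value that contains a regex metacharacter. The helper names and the sample strings are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this illustration.
sub match_raw {
    my ($str, $sub) = @_;
    return $str =~ /$sub/ ? 1 : 0;          # $sub is treated as a regex
}
sub match_quoted {
    my ($str, $sub) = @_;
    return $str =~ /\Q$sub\E/ ? 1 : 0;      # $sub is matched literally
}
sub match_index {
    my ($str, $sub) = @_;
    return index($str, $sub) >= 0 ? 1 : 0;  # plain substring search
}

my ($thestring, $thecheck) = ("123,456", "1.3");

# Unquoted, the "." in "1.3" matches any character, so "123" matches.
print "raw:    ", match_raw($thestring, $thecheck), "\n";
# Quoted or via index, the literal text "1.3" is not in the string.
print "quoted: ", match_quoted($thestring, $thecheck), "\n";
print "index:  ", match_index($thestring, $thecheck), "\n";
```

    The quoted regex and index agree with each other here, which is the point made above: for a literal substring check, index does the same job without building a pattern.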
Re: Checking for occurrence in comma separated string
by Delusional (Beadle) on Oct 25, 2005 at 11:13 UTC
    Geez, you're right. I knew something was wrong with that.... Thanks

    Now that that is working, how can I resolve the false positives? A small change to the code shows that false positives also count as 'True':
    $thestring = "15,130,213";
    $thecheck = "13";
    if ($thestring =~ /$thecheck/) {
        print "$thecheck is in $thestring";
    } else {
        print "$thecheck is not in $thestring";
    }

    What would need to be changed here? The value in $thecheck should only be true if it is in $thestring, without variants.
      You need to be sure that the whole number is surrounded by commas or by the beginning or end of the line:
      $thestring = "15,130,213";
      if ($thestring =~ /(^13,)|(,13,)|(,13$)/) {
          print "13 is in $thestring";
      } else {
          print "13 is not in $thestring";
      }
      This is not a very good regexp, but it's only an example. I suggest reading more about regular expressions, then asking questions about the difficult parts rather than the basics covered in almost every regex tutorial.
      If I knew I was specifically searching for a number in a comma-separated list of numbers, I'd incorporate word boundaries in the search pattern:
      if ( $thestring =~ /\b$thecheck\b/ ) ...
      We're building the house of the future together.
        If I knew I was specifically searching for a number in a comma-separated list of numbers, I'd incorporate word boundaries in the search pattern:

        Which would work right up until you got a number like 1.234e-05... at which point it would die horribly or, worse, quietly go on "working". :-)
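        A rough sketch of the quiet failure mode, assuming a list that contains a value in exponent notation. The helper names and the sample list are invented for the illustration; \Q...\E is added so the check value is matched literally.

```perl
use strict;
use warnings;

# Hypothetical helpers for comparison.
sub match_word_boundary {
    my ($list, $item) = @_;
    return $list =~ /\b\Q$item\E\b/ ? 1 : 0;
}
sub match_element {
    my ($list, $item) = @_;
    my $count = grep { $_ eq $item } split /,/, $list;
    return $count ? 1 : 0;
}

my $list = "1.234e-05,130";

# "." and "-" are non-word characters, so \b sees boundaries inside
# the element "1.234e-05" and matches fragments of it.
for my $item ("130", "05", "1") {
    printf "%-4s word boundary: %d  whole element: %d\n",
        $item,
        match_word_boundary($list, $item),
        match_element($list, $item);
}
```

        Here "05" and "1" match with \b even though neither is an element of the list, while the whole-element comparison rejects both.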

        -sauoq
        "My two cents aren't worth a dime.";
        
Re: Checking for occurrence in comma separated string
by murugu (Curate) on Oct 25, 2005 at 11:31 UTC

    Just exchange the variables in the match part....

    if ($thestring =~ /$thecheck/) { print "$thecheck is in $thestring"; }

    Regards,
    Murugesan Kandasamy
    use perl (;;);

Re: Checking for occurrence in comma separated string
by Delusional (Beadle) on Oct 26, 2005 at 08:00 UTC
    Thanks for all your comments. Changing if ($thecheck =~ /$thestring/) to if ($thestring =~ /$thecheck/) got the check working, but it returned the false positives, as I thought it would.

    Using the followup from bart at 11:28, I got it working as desired.

    As for numbers versus words, spaces, and even commas and decimal points in the variables: the two strings are created from database information. They *should* always be numbers, but may not be. Any spaces are removed when the information is added to the variable/database, and the commas are added when the actual data is written to the database. So all in all, a check string of 13 against a data string of, say, 13,130,213 is a very real possibility, and each one would be written exactly as shown (no spaces, unwanted commas, etc.), and then read out at a later time for processing.

    The actual content (the different numbers, letters, or words) is irrelevant (it has no human meaning); however, the numbers (in this case) are ID links to other database entries, so I compare the information against the databases (data linking). So no information is user-entered; rather, it is generated by Perl and/or, in this case, MySQL.

    Based on that, I'm sure someone will say that bart's method if(index(",$thestring,", ",$thecheck,") >= 0) { ... } shouldn't be used and that I should use something else. However, I tested the array method versus the index method, and I prefer the index method. It does exactly as desired. Perhaps there is a shortcoming one should be aware of here, considering how I'm actually using it? For example, memory limitations... In any event, the two variables will not exceed 10 and 100 chars (including commas), respectively.
Re: Checking for occurrence in comma separated string
by saskaqueer (Friar) on Oct 25, 2005 at 16:03 UTC

    Taking from bart's example, you can also make it one line if you wish. I've also replaced the == with eq, as you said you are "trying to check for the occurrence of a string/value within another string". This way you'll be able to manage strings as well as numbers (well... stringified versions of numbers, which in the world of perl is sufficient for this task).

    my $string = '13,130,213';
    my $check = '13';
    if ( grep { $_ eq $check } split(/,/, $string) ) {
        print "'${check}' is in '${string}'\n";
    } else {
        print "'${check}' is NOT in '${string}'\n";
    }

    Though if you need to check the string for multiple substrings, you're better off skipping the one-liner and working off an array containing the split entities:

    my $string = '100,two hundred,300,400,500,600,700,800';
    my @checks = ( qw/200 400 600 700 900/ );
    my @items = split(/,/, $string);
    for my $check (@checks) {
        # compare against the loop variable $check, not an undeclared $checks
        if ( grep { $_ eq $check } @items ) {
            print "'${check}' is in '${string}'\n";
        } else {
            print "'${check}' is NOT in '${string}'\n";
        }
    }
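    For many checks against the same list, the hash-as-set idea bart mentioned earlier in the thread avoids rescanning the array each time. The following is only a sketch of that idea; the helper name make_set is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: build a lookup hash (a "set") whose keys are the
# elements of the comma-separated list. Returns a hash reference.
sub make_set {
    my ($list) = @_;
    my %set = map { $_ => 1 } split /,/, $list;
    return \%set;
}

my $string = '100,two hundred,300,400,500,600,700,800';
my @checks = qw/200 400 600 700 900/;
my $set    = make_set($string);

for my $check (@checks) {
    # exists is a string comparison on the hash key, like eq above
    if ( exists $set->{$check} ) {
        print "'$check' is in '$string'\n";
    } else {
        print "'$check' is NOT in '$string'\n";
    }
}
```

    The split happens once, and each later test is a single hash lookup, which is the trade-off bart described for testing one list a lot.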