
Hi There,

In asking my question for clarification I believe I have answered it myself, but anyway:

Can I have some clarification on the wonderful line:

my %prices = $str =~ m/((?:$animal)s?)\s(?:is|are)\s(\$\d+)/g;
As I understand it, the brackets are used to return the values as $1, $2 .. $9

It looks like (?: .. ) is something special and is not returning a numbered variable so all we get out are two variables $1 and $2 used respectively in the hash as the key and value?
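To see the key/value pairing in action, here is a small sketch. The values of $animal and $str below are made up for illustration (the real ones are earlier in the thread), so treat them as assumptions:

```perl
use strict;
use warnings;

# Hypothetical inputs; the real $animal and $str come from earlier in the thread.
my $animal = 'cat|dog|cow';
my $str    = 'The cat is $5, dogs are $10 and a cow is $200';

# In list context with /g, the match returns every captured group in order:
# ($1, $2, $1, $2, ...). Only the two capturing groups contribute;
# the (?:...) groups are for grouping only and return nothing.
my %prices = $str =~ m/((?:$animal)s?)\s(?:is|are)\s(\$\d+)/g;

for my $name (sort keys %prices) {
    print "$name => $prices{$name}\n";
}
```

Assigning that flat list to a hash pairs the elements up, so each first capture becomes a key and each second capture its value.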

I have often wanted to check to see if a string contains one of multiple sub-strings.. I assume I could do it like this:

use strict;
my $something = 'This is a bang of a bing thing';
if ($something =~ m/((?:bing|bong|bang))/i) {
    print "Found '$1' in '$something'\n";
}
Woo Hoo!
Found 'bang' in 'This is a bang of a bing thing'
It returned the first one found.. which is fair.. I wonder if this could return all matches found, in this case 'bang' and 'bing'?

And could I check for a string containing anything from a list?

...
my @list = qw / bing bong bang /;
if ($something =~ m/(?:@list)/i)
...
naturally does not work :-(

thanks

___ /\__\ "What is the world coming to?" \/__/ www.wolispace.com

Re: Re: Re: Another regexp question
by Roger (Parson) on Nov 20, 2003 at 04:32 UTC
      It looks like (?: .. ) is something special and is not returning a numbered variable so all we get out are two variables $1 and $2 used respectively in the hash as the key and value?

    You bet. ;-) The ?: in the bracket tells Perl not to capture the pattern inside the bracket. The (?:pattern) construct is documented in perlre (perldoc perlre).

      And could I check for a string contain anything from a list?

    Well, yes you can. The method I use is to construct the search pattern with a join, as the following example demonstrates -
    my $something = 'This is a bang of a bing thing';
    my @list = qw /bing bong bang/;   # want to search for these
    my $list = join '|', @list;       # construct my pattern
    if ($something =~ m/($list)/i) {
        print "Found '$1' in '$something'\n";
    }
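One caveat with the join approach, offered as an aside rather than something from the reply above: if the list items can contain regex metacharacters (such as . or +), they are treated as pattern syntax rather than literal text. A minimal sketch using quotemeta, with a made-up list and string:

```perl
use strict;
use warnings;

# Hypothetical list containing characters that are special in a regex.
my @list      = ('c++', 'a.b');
my $something = 'I write c++ and axb';

# quotemeta escapes each item so it is matched literally.
my $list = join '|', map { quotemeta } @list;

# 'axb' is not reported, because the escaped pattern a\.b needs a real dot.
my @found = $something =~ m/($list)/ig;
print "Found: @found\n";
```

For plain words like bing, bong and bang the escaping changes nothing, so it is only needed when the list comes from somewhere less predictable.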
    If you want to capture all occurrences of the patterns, you could use the @array = $str =~ m/pattern/g idiom.
    my @search = $something =~ m/($list)/ig; # <- added the g modifier
    or you could do this in a while loop -
    while ($something =~ m/($list)/ig) {
        print "Found '$1' in '$something'\n";
    }
    The problem with your code is that m/(@list)/i interpolates the list items separated by spaces, so it looks for the literal pattern "bing bong bang" in the string, and of course that is not found. It only matches when the string contains that exact phrase:

    use strict;
    my $something = 'This is a bang of a bing thing bing bong bang';
    my @list = qw / bing bong bang /;
    if ($something =~ m/(@list)/i) {
        print "Found '$1' in '$something'\n";
    }
    And the output is -
    Found 'bing bong bang' in 'This is a bang of a bing thing bing bong bang'
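The space between the interpolated items comes from Perl's list separator variable $" (a single space by default). As an alternative to the join shown above, and only as a sketch, you can localize $" to '|' so that the array interpolates directly as an alternation:

```perl
use strict;
use warnings;

my $something = 'This is a bang of a bing thing';
my @list      = qw(bing bong bang);

my @found;
{
    # $" is the separator Perl puts between array elements when an
    # array is interpolated into a string or pattern.
    local $" = '|';
    @found = $something =~ m/(@list)/ig;   # pattern is now (bing|bong|bang)
}

# Outside the block $" is back to a space.
print "Found: @found\n";
```

The explicit join is probably easier to read; the local $" version mainly shows why the original attempt behaved the way it did.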
      Ah.. thank you.

      So what I am reading between the lines here is the OR ability in regex.. which I either never knew, or knew and forgot.

      So the next question (and you don't have to answer it cos I'm just pondering out loud) is: where does the OR finish and the next thing begin?

      # sample strings..
      $string = 'this is a bing';
      $string = 'bing is my name';
      $string = 'cows go bonging';
      $string = 'cows go bang99';

      $string =~ m/^bing|bong|bang\d\d/;
      Would I need to put ^ in front of each OR case if I want them to match at the beginning of the line only?
      Similarly, if I want all of them to match only if they end with \d\d, do I include it once at the end or in each case?
      How does it know where the first OR case starts and where the last OR case ends?
      ___ /\__\ "What is the world coming to?" \/__/ www.wolispace.com
        Hi wolis, you have to be more specific with what you are searching for in your regular expression.

        Your regular expression will look for ^bing (bing at the start), bong and bang\d\d (anywhere on the line). If you want to search for all these at the beginning of the line, you can add brackets - ( ... ) -
        m/^(?:bing|bong|bang\d\d)/
        Note that I also added '?:' in front of the patterns to tell Perl not to capture any matches, which makes it just a bit faster.

        The regex "or" (Alternation, it's called) has fairly low precedence. That means that the ^ binds more closely than the |. The result is that you've got this going on:

        m/
            ^bing
          | bong
          | bang\d\d
        /x;

        I used an "extended regular expression" so that I could put each subexpression (each alternate) on its own line. If you want the ^ to bind to all three, and the \d\d to bind to all three, you must use parentheses to constrain the alternation. And if you aren't trying to capture, use non-capturing parens:

        m/^b(?:i|o|a)ng\d\d/;

        (Note: I factored out everything that is common to all three alternates. That step is unnecessary. You could use (?:bing|bong|bang) too.)

        Alternation may be the best route to follow. But sometimes when you see it factored down as the previous example, you might suddenly realize, hey, I can do this with a character class too:

        m/^b[ioa]ng\d\d/;
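To make the precedence difference concrete, here is a small sketch that runs the sample strings from the question through both forms. The extra string 'bong42 is here' is made up so that the grouped pattern has something to match:

```perl
use strict;
use warnings;

my @samples = (
    'this is a bing',
    'bing is my name',
    'cows go bonging',
    'cows go bang99',
    'bong42 is here',    # hypothetical extra sample
);

my $loose   = qr/^bing|bong|bang\d\d/;       # ^ applies only to bing, \d\d only to bang
my $grouped = qr/^(?:bing|bong|bang)\d\d/;   # ^ and \d\d apply to every alternate

my (@loose_hits, @grouped_hits);
for my $s (@samples) {
    push @loose_hits,   $s if $s =~ $loose;
    push @grouped_hits, $s if $s =~ $grouped;
}

print "loose:   $_\n" for @loose_hits;
print "grouped: $_\n" for @grouped_hits;
```

The loose pattern accepts every sample except 'this is a bing', while the grouped pattern accepts only the string that starts with one of the three words followed by two digits.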


        Dave


        "If I had my life to live over again, I'd be a plumber." -- Albert Einstein