in reply to Regular Expression To Extract Multiple Matches Pattern

Use the g modifier in the while loop to iterate over the string:
$teststring = 'blah not-so-good blah not-too-shabby ';
while ($teststring =~ /([a-z]+-[a-z]+-[a-z]+)/gi) {
    print "$1\n";
}
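
If you would rather collect the matches than print them one at a time, the same pattern can be used in list context, where a /g match returns all the captures at once. A minimal sketch, assuming the same test string as above (the @matches array is just an illustrative name):

```perl
use strict;
use warnings;

my $teststring = 'blah not-so-good blah not-too-shabby ';

# In list context, a /g match returns every captured substring at once.
my @matches = $teststring =~ /([a-z]+-[a-z]+-[a-z]+)/gi;

# Prints "not-so-good" and "not-too-shabby", one per line.
print "$_\n" for @matches;
```

The while loop is the better fit when you need to act on each match as it is found; the list form is handy when you just want the results.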

Re: Re: Regular Expression To Extract Multiple Matches Pattern
by rob_au (Abbot) on Jan 07, 2002 at 16:18 UTC
    I must be missing something: why are we using the character class [a-z] in place of \w? Using \w would make the resulting code a lot more readable, e.g.:

    while ($teststring =~ /\b(\w+-\w+-\w+)\b/gi) { print "$1\n"; }

    Also, the word-boundary markers \b suggested in Kanji's reply have merit and, I think, warrant inclusion.

     

    Update

    As busunsl rightly points out, \w includes the underscore character in matching which has not been specified for inclusion ... [\w[^_]] anyone? :-)

     

    perl -e 's&&rob@cowsnet.com.au&&&split/[@.]/&&s&.com.&_&&&print'

      Perhaps because \w includes the underscore and that was not asked for.
      [\w[^_]]
      Nested character classes aren't implemented yet... That will parse something like this:
      [     # start char class
        \w  # any word char
        [   # or a literal '['
        ^   # or a literal '^'
        _   # or an underscore (redundant...)
      ]     # end char class
      ]     # followed by a literal ']'
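
      A quick way to convince yourself of that parse is to anchor the pattern and try a few short strings. This is only a sketch; the sample strings and the %result hash are made up for illustration:

```perl
use strict;
use warnings;

# [\w[^_]] is read as the class [\w[^_] followed by a literal ']',
# so a match needs one class character and then a closing bracket.
my %result;
for my $s ('a', 'a]', '[]', '^]', '_]') {
    $result{$s} = ($s =~ /^[\w[^_]]$/) ? 'match' : 'no match';
    printf "%-3s %s\n", $s, $result{$s};
}
```

      Only the bare 'a' should fail here, since it has no trailing ']'; note that '_]' still matches, so the underscore is not being excluded at all.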
      If you want a character class consisting of all the word chars except underscore, you need to use the (somewhat non-intuitive) double negative:
      [^\W_]
      Which matches a character that is not a non-word char (i.e. a word char) and not an underscore.
      % perl -le '/[^\W_]/ && print for qw(a b _ c d)'
      a
      b
      c
      d
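
      Plugged back into the pattern from earlier in the thread, it might look like the sketch below. The test string is a hypothetical one with an underscore added so the two patterns differ, and the array names are only illustrative:

```perl
use strict;
use warnings;

# Hypothetical test string: the second candidate contains an underscore.
my $teststring = 'blah not-so-good blah not_too-shabby-ish';

# \w accepts the underscore, so both hyphenated runs are captured.
my @with_w = $teststring =~ /\b(\w+-\w+-\w+)\b/g;

# [^\W_] rejects the underscore, so only the first run qualifies.
my @no_under = $teststring =~ /\b([^\W_]+-[^\W_]+-[^\W_]+)\b/g;

print "with \\w:      @with_w\n";     # not-so-good not_too-shabby-ish
print "with [^\\W_]:  @no_under\n";   # not-so-good
```

      Bear in mind that [^\W_] still lets digits through, so it is not quite the same thing as the original [a-z] with /i.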

      -Blake