Dogg has asked for the wisdom of the Perl Monks concerning the following question:

I've used the index function to find occurrences of a substring within a string, but what about finding a regex? I'd like to search a number of strings for the location of all occurrences of numerous regular expressions. Is there a way to do this (other than manually expanding each regex into the substrings it can match and then indexing each one)? Thanks for the help.

Replies are listed 'Best First'.
Re: index with regex
by btrott (Parson) on Jul 14, 2000 at 05:08 UTC
    From perldoc -f pos:
    pos SCALAR
    pos
        Returns the offset of where the last m//g search left off for the variable in question ($_ is used when the variable is not specified). May be modified to change that offset. Such modification will also influence the \G zero-width assertion in regular expressions. See perlre and perlop.
    If you want the position in the string where the match *starts* (rather than where it left off, as pos gives you), you could use
    pos() - length $&
    or something. For example:
    my $R = "foo bar quux baz";
    while ($R =~ /ba.\b/g) {
        print pos($R) - length $&, "\n";
    }
    Prints:
    4
    13
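    As an alternative to the pos() - length $& arithmetic above, Perl 5.6 and later expose the start offset of the last successful match directly as $-[0] (and the end offset as $+[0]). The following is only a minimal sketch of that idea; match_starts is a hypothetical helper name, not something from perldoc or from the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the start offset of every match of $re in $str.
sub match_starts {
    my ($str, $re) = @_;
    my @starts;
    while ($str =~ /$re/g) {
        push @starts, $-[0];    # $-[0] is the offset where the whole match begins
    }
    return @starts;
}

print join(" ", match_starts("foo bar quux baz", qr/ba.\b/)), "\n";
```

    This avoids $&, which on older perls slows down every regex in the program once it appears anywhere in the code.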
RE: index with regex
by Russ (Deacon) on Jul 14, 2000 at 04:52 UTC
    My first thought is to use a combination of $& (which holds "the string matched by the last successful pattern match") and index.

    Update: pos, as btrott points out, is the right answer. My idea won't find multiple instances of your search string per target string (and besides, why re-invent a perfectly round wheel?).

    my $R = 'Russ Ethan Jason Eric';
    $R =~ /Ethan/;
    print index($R, $&), "\n";
    $R =~ /Eric/;
    print index($R, $&), "\n";
    Prints:
    5
    17

    So, adapting my first code to the better answer, this is how I might look in multiple strings for multiple patterns:

    my @R = ('Russ Ethan Jason Eric JAPH',
             'JAPH vroom Ozymandias neshura Russ');
    for (my $i = 0; $i != @R; $i++) {
        while ($R[$i] =~ /Russ|JAPH/g) {
            print "Found $& in string $i at: ", pos($R[$i]) - length $&, "\n";
        }
    }
    Prints:
    Found Russ in string 0 at: 0
    Found JAPH in string 0 at: 22
    Found JAPH in string 1 at: 0
    Found Russ in string 1 at: 30
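
    If the patterns need to stay separate rather than being joined into one alternation, the same loop can be run once per pattern. What follows is a rough sketch under my own assumptions: find_all is a hypothetical helper, the second pattern (qr/J\w+/) is made up for illustration, and the offsets come from $-[0] and $+[0] (Perl 5.6 or later) instead of pos and $&.

```perl
use strict;
use warnings;

# Hypothetical helper: for each string and each pattern, record
# [string index, pattern index, start offset, matched text].
sub find_all {
    my ($strings, $patterns) = @_;
    my @hits;
    for my $i (0 .. $#$strings) {
        for my $j (0 .. $#$patterns) {
            my $re = $patterns->[$j];
            # A failed m//g resets pos, so each pattern scans from offset 0.
            while ($strings->[$i] =~ /$re/g) {
                my $text = substr($strings->[$i], $-[0], $+[0] - $-[0]);
                push @hits, [$i, $j, $-[0], $text];
            }
        }
    }
    return @hits;
}

my @strings  = ('Russ Ethan Jason Eric JAPH',
                'JAPH vroom Ozymandias neshura Russ');
my @patterns = (qr/Russ/, qr/J\w+/);

printf "string %d, pattern %d, offset %d: %s\n", @$_
    for find_all(\@strings, \@patterns);
```

    Returning the hits as a list of array refs, rather than printing inside the loop, leaves the caller free to sort or filter them by offset afterwards.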

    Russ
    Brainbench 'Most Valuable Professional' for Perl