in reply to Using variables in regex search

If you're looking for a literal string, use index instead of a regex. If the string is a regex itself, then what you have is fine.
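To illustrate the difference, here is a minimal sketch with made-up sample data (the `$haystack` and `$needle` values are assumptions, not taken from the original question). It shows index treating the needle literally, while the same string interpolated into a regex unescaped lets the "." match any character.

```perl
use strict;
use warnings;

# Hypothetical sample data: the needle contains a regex metacharacter.
my $haystack = 'price: 3.50 or 3x50';
my $needle   = '3.50';

# index returns the 0-based offset of the first occurrence, or -1 if absent.
my $literal_pos = index($haystack, $needle);

# Interpolated unescaped, the "." matches any character, so "3x50" matches too.
my $regex_hits = () = $haystack =~ /$needle/g;

print "literal match at offset $literal_pos\n";
print "unescaped regex matched $regex_hits time(s)\n";
```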

Re^2: Using variables in regex search
by lancer (Scribe) on Aug 14, 2011 at 13:29 UTC
    Thanks, Tanktalus!

    I'm looking for a literal string. But I would like to find all occurrences of it, not just the first one (index gives only the first match).

    A regex search could iterate over all matches. But I think I have to escape the contents of $str, because it's not a regex itself...

      Now it's sounding like an XY Problem. What do you need to do with all found occurrences? Modify them? With a literal string, I can only think of three things to do with it: check if it's there (boolean - a single match is sufficient), modify it (s/../../), or count it. And the vast majority IME is the first one. Only the modification one "needs" a regex, and even that isn't really true.

      That all said, index can find multiple matches as well:

      $ perl -lE '
          my ($haystack, $needle) = @ARGV;
          my $i = 0;
          my @found;
          while (-1 != (my $curidx = index $haystack, $needle, $i)) {
              push @found, $curidx;
              $i = $curidx + 1;
          }
          say "found at $_" for @found;
        ' abcsdfabcasegabc abc
      found at 0
      found at 6
      found at 13
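The same index loop can be wrapped in a small subroutine if it is needed in more than one place. This is only a sketch: `find_all` is a hypothetical helper name, not something built into Perl, and the empty-needle guard is an addition of mine, since index never returns -1 for an empty needle and the loop would not end.

```perl
use strict;
use warnings;

# find_all is a hypothetical helper name, not part of Perl itself.
# Returns the 0-based offsets of every occurrence of $needle in $haystack,
# including overlapping ones.
sub find_all {
    my ($haystack, $needle) = @_;
    return () if $needle eq '';    # index never returns -1 for an empty needle

    my @found;
    my $from = 0;
    while ((my $pos = index($haystack, $needle, $from)) != -1) {
        push @found, $pos;
        $from = $pos + 1;          # use $pos + length($needle) to skip overlaps
    }
    return @found;
}

print join(',', find_all('abcsdfabcasegabc', 'abc')), "\n";
print join(',', find_all('aaaa', 'aa')), "\n";
```

Advancing by one character, as the one-liner does, reports overlapping matches too; advancing by the needle's length reports only non-overlapping ones.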
      If you're doing a modification, it's even easier: just use rindex (though this will be a bit slower for longer strings with many matches).
      $ perl -lE '
          my ($haystack, $needle, $new) = @ARGV;
          while (-1 != (my $curidx = rindex $haystack, $needle)) {
              substr $haystack, $curidx, length($needle), $new;
          }
          say "new string: $haystack";
        ' abcsdfabcasegabc abc foo
      new string: foosdffooasegfoo
      The only challenge with this method is if $new contains $needle - then the loop never terminates, because every replacement creates a fresh match for rindex to find.

      If you are going with the substitution and want to use a regex (probably safer once you escape it), use "\Q" before your string:

      s/\Q$str\E/$new/g; # since \E is at the end, it's not really required.
      Hope that helps.
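For completeness, the escaped regex can also report positions rather than substitute. The sketch below uses made-up sample text; `$-[0]` (the first element of the built-in `@-` array) holds the start offset of the most recent successful match, and `quotemeta` applies the same escaping as `\Q...\E`.

```perl
use strict;
use warnings;

my $text = 'a+b and a+b, but not aab';   # hypothetical sample text
my $str  = 'a+b';                         # "+" is a regex metacharacter

my @positions;
while ($text =~ /\Q$str\E/g) {
    push @positions, $-[0];    # start offset of the match just found
}
print "literal '$str' found at: @positions\n";

# quotemeta() performs the same escaping as \Q...\E.
print quotemeta($str), "\n";
```

Unlike the index loop that advances one character at a time, a `/g` match resumes after the end of the previous match, so overlapping occurrences are not reported.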

        I would like to find the positions of all the strings in the source text. The strings mark special positions in the source that need to be processed.

        Perhaps that is the first case in your example, where finding the strings is enough.

        I hadn't realized index can be used like that, to find multiple matches. Or at least I imagined it would need more code than that. I'm going to use that solution now, instead of a regex.

        Thank you for all the help!
        On a related note to my Edgar database search, is there a way to load a list of regex search patterns from a database?

        I want to convert this long list of REGEX matches into a list being sourced from a select statement on a database

        Is that possible?

        select CONCAT(DATA,'...') FROM TABLE

        which comes out like this:

        06054E...

        063679...

        06369N...

        06374V...

        06417Y...

        06418E...

        but I want to substitute this list into this regular expression.

        print "$1\n" if $lines =~ /(06054E...|063679...|06369N...|06374V...|06417Y...|06418E...)/;
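One possible approach, sketched under assumptions: fetch the rows into an array, join them with "|", and compile the result once with `qr//`. The array below is a stand-in for the query result, and the input lines are invented for illustration. With DBI the list could presumably come from `selectcol_arrayref`, but the connection, table, and column are not shown in the post, so that part is left as a comment.

```perl
use strict;
use warnings;

# Stand-in for the rows returned by the SELECT. With DBI this could be
# something like: my @patterns = @{ $dbh->selectcol_arrayref($sql) };
# ($dbh and $sql are assumptions; they are not shown in the post).
my @patterns = ('06054E...', '063679...', '06369N...',
                '06374V...', '06417Y...', '06418E...', '');

# Drop empty rows: an empty alternative matches everywhere.
my @usable = grep { defined($_) && length($_) } @patterns;

# Build one alternation and compile it once.
my $alternation = join '|', @usable;
my $re = qr/($alternation)/;

# Hypothetical input lines.
my @lines = ('CUSIP 06369N123 listed', 'nothing here', 'id=06418EXYZ');
my @hits;
for my $line (@lines) {
    push @hits, $1 if $line =~ $re;
}
print "$_\n" for @hits;
```

The rows are used as regex fragments as-is here, since the query already appends "..." to each value. If the stored values could contain other metacharacters, selecting the raw column and applying `quotemeta` before appending "..." would be safer.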
        Another Way:
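        ($str, $substr, $newsubstr) = @ARGV;

        # rindex version, left commented out:
        #while ( -1 != ($current = rindex($str, $substr))) {
        #    substr $str, $current, length($substr), $newsubstr;
        #}

        # index version: resume searching after the inserted text, so it
        # still terminates when $newsubstr contains $substr.
        $i = 0;
        while ( -1 != ($current = index($str, $substr, $i)) ) {
            substr $str, $current, length($substr), $newsubstr;
            $i = $current + length($newsubstr);
        }

        $\ = $/;    # make print append a newline
        print $str;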