in reply to Regex Subexpressions

I'm not clear on exactly what you're trying to do.  (Especially when you change the original post, and then change it back -- don't do that!)

Is it your intent to match the strings "a", "ab" and "abc" -all- from just the input text "abc"?  I don't know if that's possible...

If you could maybe supply examples (even just short ones, such as a dozen of the search terms from which the list will be dynamically generated), with a little more verbosity on what results you're hoping for and *why*, maybe some of us can more expeditiously point you towards a viable solution.

Re^2: Regex Subexpressions
by BenjiSmith (Novice) on Sep 09, 2005 at 23:29 UTC
    Yeah, sorry for changing the post around. I thought I was replying to myself when in fact I was editing the original post.

    Anyhow, here's a more concrete example:
    @keywordList = ('john', 'john.smith', 'john.smith@mail.com');
    $combinedExpression = combine(@keywordList);

    # The combined expression looks something like this:
    # (john(?:\.smith(?:\@mail\.com)?)?)

    # The @ is escaped so the double-quoted string does not interpolate @mail.
    $searchText = "john's username is john.smith and his email address is john.smith\@mail.com";

    while ($searchText =~ /$combinedExpression/g) {
        print "$1\n";
    }
    For this example, I expect to get these results:

    john
    john
    john.smith
    john
    john.smith
    john.smith@mail.com

    Essentially, for every occurrence of every one of my keywords, I need to get a result, even if those keywords occur within other keywords in the input text.

      For one possible implementation of your combine subroutine, consider:

      use Regexp::Assemble;

      sub combine {
          my $str = Regexp::Assemble->new->add(@_)->as_string;
          qr/($str)/;
      }

      - another intruder with the mooring in the heart of the Perl

Re^2: Regex Subexpressions
by BenjiSmith (Novice) on Sep 09, 2005 at 23:47 UTC
    Great looking solutions you guys, but I actually have a few additional design constraints:

    1. No Perl code embedded in the regex. Once the prototype has been demonstrated in Perl, it will be implemented in the product using Java, so the regex must be compatible with the Java regex engine.

    2. The regex should only have to be compiled once, so no rewriting of the regex string after starting to iterate through the matches.
      Whoops -- there's your latest reply.  I didn't see it when I was answering your previous one.

      I'm going to have to throw in the towel -- what you're asking for is out of my league!  Isn't there some way you could do the equivalent thing in Java code though?   Anyway, good luck!
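
      For what it's worth, here is a rough sketch of how the "equivalent thing in Java code" might look. It is only one possible approach, not a tested production solution. It assumes that any keyword matching at a given position is a prefix of the longest keyword matching there, so a single pattern (compiled once, no embedded code) only has to find the longest keyword at each position, and plain Java code reports the shorter ones. The class name KeywordScanner and the method findAll are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reports every occurrence of every keyword,
// including keywords that occur inside longer keywords.
class KeywordScanner {
    private final List<String> byLength; // keywords, shortest first
    private final Pattern pattern;       // compiled exactly once

    KeywordScanner(List<String> keywords) {
        byLength = new ArrayList<>(keywords);
        byLength.sort(Comparator.comparingInt(String::length));

        // Build an alternation with the longest keyword first, so the
        // engine prefers the longest keyword at each position.
        StringBuilder alternation = new StringBuilder();
        for (int i = byLength.size() - 1; i >= 0; i--) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            // Pattern.quote treats '.' and '@' in keywords as literals.
            alternation.append(Pattern.quote(byLength.get(i)));
        }

        // The lookahead keeps each match zero-width, so find() visits
        // every start position instead of skipping past a long match.
        pattern = Pattern.compile("(?=(" + alternation + "))");
    }

    List<String> findAll(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            String longest = m.group(1);
            // Every keyword that is a prefix of the longest match
            // also occurs at this position; report shortest first.
            for (String keyword : byLength) {
                if (longest.startsWith(keyword)) {
                    hits.add(keyword);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        KeywordScanner scanner = new KeywordScanner(
            Arrays.asList("john", "john.smith", "john.smith@mail.com"));
        String text = "john's username is john.smith and his email "
            + "address is john.smith@mail.com";
        for (String hit : scanner.findAll(text)) {
            System.out.println(hit);
        }
    }
}
```

      Run against the example text from earlier in the thread, this should print the six results BenjiSmith listed, in the same order. The cost is a prefix check against every keyword at each match position, so a very large keyword list would probably want something smarter (a trie, say) in place of the inner loop.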