markjrouse has asked for the wisdom of the Perl Monks concerning the following question:

How can I search for a specific string pattern that has variations on the matches to that pattern? I'm trying to find two regexp patterns:

So I use (\s-\r)|(:\r). This works, but it's pattern 2 that matches different variations. As an example, in my text file, I can have different pattern 2 cases like this:

What I'm looking for is to perhaps modify my reg exp in such a way that pattern 2 only matches match2a. I want to exclude the match2b/c matches. I thought along the lines of:

(\s-\r)|(:\r)([^:\r(\d|\w]))

But of course this doesn't work. Any suggestions?

Replies are listed 'Best First'.
Re: Reg Exp to handle variations in the matched pattern
by moritz (Cardinal) on Feb 22, 2012 at 13:12 UTC

    I don't understand your question. It would be nice if you provided several pieces of text that are supposed to match, and several that are supposed not to match, and what problem you encounter.

    One thing that looks suspicious is your use of character classes. For example [^:\r(\d|\w] matches everything except the colon, \r, the vertical pipe, the opening paren, digits and word characters. That's not what you want, is it?
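    To see what that character class really does, here is a minimal sketch that tests it against a few single characters. The helper name class_matches and the list of test characters are made up for illustration; only the class itself comes from the question.

```perl
use strict;
use warnings;

# The class exactly as written in the question. Inside [...], "(" and "|"
# are literal characters, not grouping or alternation.
my $class = qr/[^:\r(\d|\w]/;

# Hypothetical helper: returns 1 if the single character is matched
# by the class, 0 otherwise.
sub class_matches {
    my ($char) = @_;
    return $char =~ /\A$class\z/ ? 1 : 0;
}

# Illustrative characters only; they are not taken from the real data.
for my $char (':', "\r", '(', '|', '7', 'a', '_', ' ', '-', ')') {
    my $shown = $char eq "\r" ? '\r' : $char;
    printf "%-3s %s\n", $shown, class_matches($char) ? 'matches' : 'excluded';
}
```

    The space, the dash and the closing paren match; the colon, \r, opening paren, pipe, digit, letter and underscore are excluded.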

    Also your last regex has an imbalanced )

    What I'm looking for is to perhaps modify my reg exp in such a way that pattern 2 only matches match2a

    The regex /^this is text:\r$/ would do that trick. Is that what you want?

      Essentially, it's match any text where there is:

        a space, followed by a dash, followed by a carriage return OR a colon, followed by a carriage return BUT NOT a colon, followed by carriage return, followed by a digit, or a letter.

      One of the text files is actually located here: http://www.treasury.gov/resource-center/sanctions/SDN-List/Documents/sdnew02.txt

      I'm not interested in the text before the colon, as I want to search and replace, but I'm having problems getting the regexp just right.

        a space, followed by a dash, followed by a carriage return OR a colon, followed by a carriage return

        So far that's simple: / -[:\r]\r/

        BUT NOT a colon, followed by carriage return

        If you're looking for two carriage returns in a row, then you'll never find something where the first carriage return is followed by a colon (because then it's not two carriage returns in a row, d'oh), so I don't see why you emphasize it like that.

        followed by carriage return, followed by a digit, or a letter.
        \r\w
        One of the text files is actually located here: http://www.treasury.gov/resource-center/sanctions/SDN-List/Documents/sdnew02.txt

        The pattern you describe matches nowhere in that file; in fact I can't find a single occurrence of a carriage return in that file.
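        A quick way to check a file for carriage returns is to count them with tr///. This is only a sketch: count_cr and count_cr_in_file are hypothetical helper names, and the sample string is invented. The file is opened with the :raw layer so that no I/O layer (for instance the CRLF translation Perl applies by default on Windows) hides the \r characters.

```perl
use strict;
use warnings;

# Count carriage returns in a string. tr/// with an empty replacement
# list and no /d modifier only counts; it does not change the text.
sub count_cr {
    my ($text) = @_;
    return ($text =~ tr/\r//);
}

# Hypothetical helper: slurp a file in raw mode and count its \r characters.
sub count_cr_in_file {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;                 # undef the record separator to read everything
    my $text = <$fh>;
    close $fh;
    return count_cr($text // '');
}

# Invented sample with one CRLF line ending and one bare LF.
print count_cr("a:\r\nb\n"), "\n";
```

        If the count is zero for the real file, no pattern containing \r can match it.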

        If you describe what information you want to extract from that file, we might be able to help you. But right now it seems that you don't have a clear mental image yourself, so it's pretty hard to help you.
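    Purely as a guess at one reading of the description quoted above (" -\r" always counts, ":\r" counts only when the next character is not a letter or digit), a negative lookahead would express that. The helper name count_wanted and the sample text are invented for illustration, and the reading itself is an assumption.

```perl
use strict;
use warnings;

# Assumed reading: " -\r" always counts; ":\r" counts only when it is
# NOT followed by a letter or a digit. The lookahead (?!...) belongs to
# the second alternative only.
my $wanted = qr/ -\r|:\r(?![A-Za-z0-9])/;

# Hypothetical helper: count how many times the pattern matches.
sub count_wanted {
    my ($text) = @_;
    my $count = () = $text =~ /$wanted/g;
    return $count;
}

# Invented sample: the first two lines should count, the third should not.
my $sample = "heading -\r\nthis is text:\r\nthis is text:\r1 more\n";
print count_wanted($sample), "\n";
```

    Because a lookahead consumes nothing, a search and replace built on this pattern would leave the character after the \r untouched.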

Re: Reg Exp to handle variations in the matched pattern
by bitingduck (Deacon) on Feb 22, 2012 at 15:46 UTC

    I agree that something's not quite clear about what you're trying to match.

    Do you want the \r to be at the end of the line? Then you can use the $ anchor at the end of the regex.
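    A small sketch of the $ anchor in that role, assuming lines that end in \r\n. The helper name ends_with_colon_cr and the sample lines are made up. Without the /m modifier, $ matches at the very end of the string or just before a final newline, so the trailing \n does not need to be removed first.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the line ends with a colon followed by
# a carriage return, optionally followed by one final newline.
sub ends_with_colon_cr {
    my ($line) = @_;
    return $line =~ /:\r$/ ? 1 : 0;
}

# Invented sample lines, shown with visible escapes when printed.
for my $line ("this is text:\r\n", "this is text:\r1234\n", "this is text:\n") {
    (my $shown = $line) =~ s/\r/\\r/g;
    $shown =~ s/\n/\\n/g;
    printf "%-24s %s\n", $shown, ends_with_colon_cr($line) ? 'match' : 'no match';
}
```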