dhackbar has asked for the wisdom of the Perl Monks concerning the following question:

I've got a little regular expression problem that is bugging me.

I've got a string with this text "(7001) - This is some text" and I want to see if it matches this pattern: "([7000|7001|7002]) - This".

Basically, I want to have something like this (which doesn't work).

    my $full_text = "(7001) - This is some text";
    my $check_str = "([7000|7001|7002|7003]) - This";
    if ($full_text =~ /$check_str/) {
        print "full_text has the check_str\n";
    }

Thanks! Derek

Re: Regular Expression Help
by GrandFather (Saint) on Nov 15, 2005 at 23:55 UTC

    Methinks you have the wrong end of the stick. You seem to want to match 7000 or 7001 or 7002 or 7003, but you have put that in a character class, so what is being matched is a single character: one of '0', '1', '2', '3', '7', or '|'.

    You also have a problem with your '(' and ')': unescaped, they are capturing parentheses rather than literal characters, which is not what you want. Probably what you really want is:

        use warnings;
        use strict;

        my $full_text = "(7001) - This is some text";
        my $check_str = '\((?:7000|7001|7002|7003)\) - This';
        print "full_text has the check_str\n" if $full_text =~ /$check_str/;

    Note the use of ' rather than " so that you don't have to double the \ (as in \\) when escaping the '(' and ')', and that a non-capturing group '(?:...)' is used for the alternation. Because the 700 is common, $check_str could become '\(700[0123]\) - This', using a character class to match the last digit.
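
A minimal sketch of the difference described above, for anyone who wants to try it. The helper name matches and the two probe strings '(7) - This' and '(7004) - This' are made up for illustration; only the first test string and the patterns come from the thread.

```perl
use strict;
use warnings;

# Three ways to write the check; only $alt_re and $short_re do what was asked.
my $class_re = qr/\([7000|7001|7002|7003]\) - This/;    # class: ONE character out of 0 1 2 3 7 |
my $alt_re   = qr/\((?:7000|7001|7002|7003)\) - This/;  # alternation: one whole number
my $short_re = qr/\(700[0123]\) - This/;                # common prefix plus a one-digit class

# Hypothetical helper (not from the thread): 1 if $text matches $re, else 0.
sub matches {
    my ($text, $re) = @_;
    return $text =~ $re ? 1 : 0;
}

# The first string is from the question; the other two are made-up probes.
for my $text ('(7001) - This is some text', '(7) - This', '(7004) - This') {
    printf "%-28s class=%d alt=%d short=%d\n", $text,
        matches($text, $class_re), matches($text, $alt_re), matches($text, $short_re);
}
```

The character-class version rejects the real input and accepts only the nonsense string '(7) - This', because the class consumes exactly one character between the parentheses. The other two patterns accept '(7001) - This is some text' and reject both probes.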

    Take a look at perlretut, especially at the 'Matching this or that' section and the 'Using character classes' section.

    Update: fix various typos and omissions


    Perl is Huffman encoded by design.
      The one problem I have is that the $check_str is being provided as an input to my routine from a different application, so I don't have control over the syntax :(

      I'll see if I can manipulate the check_str once it's in my code though with those suggestions.

      Derek

        You'll have to figure out exactly what syntax it's using, since it's different from the normal regexp syntax. Here's a quick hack based on a guess of what you want, but it may not be robust.
            my $full_text = "(7001) - This is some text";
            my $check_str = "([7000|7001|7002|7003]) - This";

            $check_str =~ s/([()])/\\$1/g;   # escape the literal parentheses
            $check_str =~ s/\[/(?:/g;        # '[' opens a non-capturing group
            $check_str =~ s/\]/)/g;          # ']' closes it

            if ($full_text =~ /$check_str/) {
                print "full_text has the check_str\n";
            }

        In that case (and assuming you have no control over the format) you have a problem of a whole different order of complexity. What you need to achieve in that case is to map the input syntax to Perl's regex syntax and that is likely not a trivial task.

        At the very least you will require a formal description of the input syntax and, from that, you will have to work out how to achieve the same results using Perl's regex facilities. Watch out along the way for security issues. For example, if the user can introduce (?{die}) into the search string and that gets passed through to the regex, unfortunate things may happen :).
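
One way to approach both the mapping and the security concern is to escape everything that is not explicitly part of the foreign syntax. The sketch below is only a guess at that syntax: it assumes "[a|b|c]" means "one of a, b, c" and that every other character is literal. The name translate_check_str is hypothetical, and without a formal description of the real input format this may well be wrong for other inputs.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread.
# ASSUMPTION: "[a|b|c]" in the incoming string means alternation and all
# other characters are to be matched literally.
sub translate_check_str {
    my ($check_str) = @_;
    my $pattern = '';

    # Split into bracketed groups and the literal text between them; the
    # capturing parentheses make split return the groups as well.
    for my $piece (split /(\[[^\]]*\])/, $check_str) {
        if ($piece =~ /^\[(.*)\]$/s) {
            my $inner = $1;
            # Escape each alternative so it can only match itself.
            my @alternatives = map { quotemeta } split /\|/, $inner, -1;
            $pattern .= '(?:' . join('|', @alternatives) . ')';
        }
        else {
            # Literal text: quotemeta neutralises every regex metacharacter.
            $pattern .= quotemeta $piece;
        }
    }
    return qr/$pattern/;
}

my $full_text = "(7001) - This is some text";
my $re = translate_check_str("([7000|7001|7002|7003]) - This");
print "full_text has the check_str\n" if $full_text =~ $re;
```

Because every piece goes through quotemeta, an input such as (?{die}) is matched as plain text rather than interpreted as a regex construct. The price is that the caller cannot use any regex features beyond the bracketed alternation assumed here.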


        DWIM is Perl's answer to Gödel
Re: Regular Expression Help
by Roy Johnson (Monsignor) on Nov 15, 2005 at 22:26 UTC
    Square brackets are for character classes, not for alternation in general.

    Caution: Contents may have been coded under pressure.
Re: Regular Expression Help
by monarch (Priest) on Nov 15, 2005 at 22:21 UTC
    You're almost there, just missing the parentheses.

        my $full_text = "(7001) - This is some text";
        my $check_re  = qr/\((7000|7001|7002|7003)\) - This/;
        if ($full_text =~ /$check_re/) {
            print "full_text has the check_str\n";
        }
Re: Regular Expression Help
by injunjoel (Priest) on Nov 15, 2005 at 22:43 UTC
    Greetings,
    I guess you could use a character class... more of a range really.
        my $full_text = "(7001) - This is some text";
        if ($full_text =~ /\(700[0-3]\) - This/) {
            print "full_text has the check_str\n";
        }


    -InjunJoel
    "I do not feel obliged to believe that the same God who endowed us with sense, reason and intellect has intended us to forego their use." -Galileo