If what you mean by "doesn't match" is "spews lots of errors", read on...

use strict;
use warnings;

$_ = 'Page 1 of 5';
if(Page\&nbsp\;1\&nbsp\;of\&nbsp\;(/d+)) {
    print "Number of Pages = ".$1;
}
__OUTPUT__
Backslash found where operator expected at test.pl line 9, near "Page\"
Backslash found where operator expected at test.pl line 9, near "&nbsp\"
        (Missing operator before \?)
Backslash found where operator expected at test.pl line 9, near "1\"
        (Missing operator before \?)
Backslash found where operator expected at test.pl line 9, near "&nbsp\"
        (Missing operator before \?)
Backslash found where operator expected at test.pl line 9, near "of\"
Backslash found where operator expected at test.pl line 9, near "&nbsp\"
        (Missing operator before \?)
syntax error at test.pl line 9, near "Page\"
Search pattern not terminated at test.pl line 9.

Ok, let's solve this one step at a time. First, the regexp operator is m//, which can be abbreviated as // most of the time. I don't see any regexp operator in your code. So we'll correct that part...

use strict;
use warnings;

$_ = 'Page&nbsp;1&nbsp;of&nbsp;5';
if(/Page\&nbsp\;1\&nbsp\;of\&nbsp\;(/d+)/) {
    print "Number of Pages = ".$1;
}
__OUTPUT__
Unmatched ( in regex; marked by <-- HERE in m/Page&nbsp;1&nbsp;of&nbsp;( <-- HERE / at test.pl line 9.

Hmmm, what's this unmatched ( in regexp business? Oh, I see. You've got (/d+)/. The regexp thinks that the '/' in /d+ is the end of the regexp. You probably really meant the \d+ metacharacter and quantifier. So we'll fix that...

use strict;
use warnings;

$_ = 'Page&nbsp;1&nbsp;of&nbsp;5';
if(/Page\&nbsp\;1\&nbsp\;of\&nbsp\;(\d+)/) {
    print "Number of Pages = ".$1;
}
__OUTPUT__
Number of Pages = 5

Voilà, it works!
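As a side note, the backslashes in front of & and ; are harmless but not required, since neither character is special inside a regexp. A minimal sketch comparing the two spellings (the variable names $text, $escaped and $unescaped are just for illustration):

```perl
use strict;
use warnings;

my $text = 'Page&nbsp;1&nbsp;of&nbsp;5';

# Same pattern twice: once with the backslashes, once without them.
# A match in list context returns the captured groups.
my ($escaped)   = $text =~ /Page\&nbsp\;1\&nbsp\;of\&nbsp\;(\d+)/;
my ($unescaped) = $text =~ /Page&nbsp;1&nbsp;of&nbsp;(\d+)/;

print "escaped:   $escaped\n";
print "unescaped: $unescaped\n";
```

Both lines should report 5, so dropping the backslashes makes the pattern easier to read without changing what it matches.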

Of course, this assumes that you're testing your regexp against a string held in $_. If you're testing against a string held in some other scalar variable, such as $string, you'll also need the binding operator, '=~', which is used like this:

$string =~ m/regexp goes here/
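Applied to the pattern above, that might look like the following sketch. It assumes the text lives in a variable called $string, and $pages is just an illustrative name for the captured value:

```perl
use strict;
use warnings;

# $string stands in for wherever your text actually comes from
my $string = 'Page&nbsp;1&nbsp;of&nbsp;5';

my $pages;
# =~ binds the match to $string instead of the default $_
if ($string =~ m/Page&nbsp;1&nbsp;of&nbsp;(\d+)/) {
    $pages = $1;    # copy $1 right away, before another match can overwrite it
    print "Number of Pages = $pages\n";
}
```

The test in the if() matters: $1 is only meaningful after a successful match, so it should not be read unconditionally.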

See perlretut and perlrequick for an introduction to Perl's regular expressions. For additional reading, you can dive into perlre and perlop.

Update: I see I've wasted my time, because your original question wasn't really the question you wanted to ask. It is foolish to retype your code when inserting it here. Cut and paste it, or boil it down to a tiny script that replicates the behavior and cut and paste that. Retyping it obviously introduced numerous other errors and led us down the wrong path toward correcting them. Your real problem, assuming you've now typed it correctly, is probably that your input text is not what you think it is.
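One way to check what the input text really contains is to print it with every character outside printable ASCII made visible. This is only a sketch: visible() is a hypothetical helper, and the two sample strings are guesses at what the data might look like (a literal &nbsp; entity versus a real non-breaking space character, 0xA0).

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the text in which every character
# outside printable ASCII is shown as a \x{..} escape.
sub visible {
    my ($text) = @_;
    $text =~ s/([^\x20-\x7E])/sprintf('\\x{%02X}', ord $1)/ge;
    return $text;
}

my $entity = 'Page&nbsp;1&nbsp;of&nbsp;5';     # literal HTML entity
my $nbsp   = "Page\x{A0}1\x{A0}of\x{A0}5";     # real non-breaking spaces

print visible($entity), "\n";
print visible($nbsp),   "\n";
```

The first string comes back unchanged, while the second shows a \x{A0} wherever a non-breaking space is hiding. If your real input looks like the second one, a pattern written against the literal text &nbsp; will never match it.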


Dave


In reply to Re: Escaping Regex Expressions by davido
in thread Escaping Regex Expressions by Anonymous Monk
