
You will have to show some runnable code where the \b fails. Both of your example lines work fine in my example code.

\b means approximately "word boundary": it matches at the position between a word character and a non-word character. After the "m" in ".htm", any whitespace character (space, \n, \t and so on) satisfies that boundary condition. End of the string also satisfies it (i.e. having no character at all following ".htm").
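
To make that concrete, here is a minimal sketch using the same regex as in this thread. The sample lines and file names are made up for illustration; they are not the OP's data.

```perl
use strict;
use warnings;

# Hypothetical sample lines: ".htm" is followed by a space, a tab,
# a newline, or nothing at all (end of string).
my @lines = (
    "<FILENAME>a.htm rest",
    "<FILENAME>b.htm\tTAB",
    "<FILENAME>c.htm\n",
    "<FILENAME>d.htm",
);

my @titles;
for my $line (@lines) {
    # \b succeeds after the "m" in each of the four cases above
    if (my ($doc_title) = $line =~ m/<FILENAME>(.*\.htm)\b/) {
        push @titles, $doc_title;
    }
}
print "$_\n" for @titles;   # prints a.htm, b.htm, c.htm, d.htm
```

All four lines match, so whitespace or end of string after ".htm" is not what makes \b fail.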

What do you mean by "so the /b didn't work all the time"?

Look carefully and make sure that there is no space before the \b in:
if (my ($doc_title) = $line =~ m/<FILENAME>(.*\.htm)\b/) {
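
A small sketch of why a stray space matters, assuming the /x modifier is not in use (with /x, whitespace in the pattern would be ignored). The variable names $good and $bad are just for this illustration.

```perl
use strict;
use warnings;

my $line = "<FILENAME>dp198076_424b2-us2342673.htm\n";

# Pattern as intended: \b directly after ".htm"
my $good = $line =~ m/<FILENAME>(.*\.htm)\b/  ? "match" : "no match";

# Pattern with an accidental space: now a literal space must follow ".htm"
my $bad  = $line =~ m/<FILENAME>(.*\.htm) \b/ ? "match" : "no match";

print "without space: $good\n";   # without space: match
print "with space:    $bad\n";    # with space:    no match
```

The second pattern fails here because the line has a newline, not a space, after ".htm".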

Re^6: pattern matching once
by justin423 (Scribe) on Aug 12, 2023 at 03:08 UTC
    An HTML space &nbsp.
      That still works. I tested with this line: "<FILENAME>dp198076_424b2-us2342673.htm&nbsp\n"
      That is because \b matches at a boundary between a word character and a non-word character, and & is not a word character. Word characters (what \w matches) are the ones that you can use in a Perl variable name, roughly [a-zA-Z0-9_] for ASCII text.
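
      A short sketch of both sides of that rule. The first line is the test line above; the other two are hypothetical lines I made up to show where \b really does fail, namely when a word character directly follows ".htm".

```perl
use strict;
use warnings;

my $re = qr/<FILENAME>(.*\.htm)\b/;

my @tests = (
    "<FILENAME>dp198076_424b2-us2342673.htm&nbsp\n",  # "&" is not a word character
    "<FILENAME>example.html\n",                       # hypothetical: "l" is a word character
    "<FILENAME>example.htm_old\n",                    # hypothetical: "_" is a word character
);

my @results;
for my $line (@tests) {
    # Record the captured file name, or "no match" if the regex fails
    push @results, $line =~ $re ? $1 : "no match";
}
print "$_\n" for @results;
# prints:
# dp198076_424b2-us2342673.htm
# no match
# no match
```

      So if some lines are being skipped, it is worth checking whether they end in something like ".html" rather than blaming the &nbsp.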

      So we are back at the same problem: you say that there is a problem, but have not shown any actual code that demonstrates it.
      If you are actually parsing an HTML document, you should run it through one of the HTML parsing/decoding modules before applying a regex. I believe that haukex has posted some links on that subject, and you would be well advised to read his post in detail.