You want to capture the catalog number, but instead you're matching the anchor text, and never even looking for what comes after it.
Try this.
/Catalog Number:\s+(\w+)/
Update: (The first part of this node was posted from a smartphone, and pecking out markup and other symbols was unpleasant enough that I avoided my usual verbosity, which will now follow):
That anchors on "Catalog Number:" followed by one or more whitespace characters, and then captures all the contiguous "word" characters that follow: letters, digits, and underscore. After a successful match, $1 holds the catalog number.
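Here is a minimal sketch of using that pattern in a script. The sample string and the variable names ($text, $catalog) are made up for illustration; your real input would come from the HTML file.

```perl
use strict;
use warnings;

# Hypothetical sample input; substitute the text read from your file.
my $text = 'Catalog Number:   AB_1234 <br>';

my $catalog;
if ( $text =~ /Catalog Number:\s+(\w+)/ ) {
    # Only trust $1 after a successful match.
    $catalog = $1;
    print "Catalog number: $catalog\n";
}
else {
    print "No catalog number found\n";
}
```

Note that \w+ stops at the first non-word character, so a catalog number containing a hyphen or a dot would be captured only up to that character. If your numbers look like that, the character class would need to be widened.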
Anyone who mentioned you ought to parse HTML with a proper parsing module is correct though. Regexp solutions are fragile. It's strange that when we take our car to the mechanic we never say, "I want you to fix it using only a 12mm socket wrench." But people think nothing of coming for advice on parsing HTML, and in the same breath suggest that we ought to adapt our solutions to use only regular expressions, avoiding the vast array of other tools, many of which are more suitable for the task.
Dave
In reply to Re: parsing hmtl file with regex
by davido
in thread parsing hmtl file with regex
by PanchoAguirre