in reply to Dealing with Word Compact HTML
Like others, I'd strongly recommend using something like HTML::Parser.
That said, if you really don't want to parse HTML for real, you can work around the problem by slurping the whole file into a single scalar and searching for the tags with a regex that uses the /s modifier, so that . also matches newlines and a tag pair split across lines is still found. But be careful: <b> tags can in fact have attributes, like this: <b style="font-size: 200%">. Your regex will not catch cases like that, though if you have sufficient control over the formatting of the original documents this may not be a problem.
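Here is a rough sketch of the slurp-and-regex approach, in case it helps. The names slurp and bold_spans and the sample text are made up for illustration, and the pattern is only my guess at what you need: it tolerates attributes on the opening tag and tags that span lines, but it will still trip over nested <b> tags, HTML comments, or a > inside an attribute value.

```perl
use strict;
use warnings;

# Hypothetical helper: read everything from a filehandle into one scalar.
sub slurp {
    my ($fh) = @_;
    local $/;    # undef the input record separator, so one read returns it all
    return scalar <$fh>;
}

# Hypothetical helper: return the text inside each <b>...</b> pair.
sub bold_spans {
    my ($html) = @_;
    my @spans;
    # <b\b[^>]*> matches <b> with or without attributes, but not <br> or <body>.
    # /s lets . match newlines, /i ignores tag case, .*? stops at the first </b>.
    while ($html =~ m{<b\b[^>]*>(.*?)</b\s*>}gis) {
        push @spans, $1;
    }
    return @spans;
}

# Sample text standing in for the real file; an in-memory filehandle keeps
# the example self-contained.
my $sample = <<'END';
<p>Some <b>bold
text</b> and <B style="font-size: 200%">big bold</B> here.<br></p>
END

open(my $fh, '<', \$sample) or die "Cannot open in-memory file: $!";
my $html = slurp($fh);
close($fh);

print "[$_]\n" for bold_spans($html);
```

For a real document you would open the file by name instead of the in-memory string. If the input turns out to be messier than this pattern can cope with, that is the point at which HTML::Parser starts paying for itself.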
$perlmonks{seattlejohn} = 'John Clyman';