Ask for what you want matched. In this case, you most likely want everything up to the next double-quote, yes? If so, something like this should work:
my ( $code ) = ( $line =~ /code="([^"]+)"/ );
The use (and abuse) of regexes to match HTML content has been beaten to death. If you want more robust results, consider using a module designed to parse HTML. This is also covered in the fantastic book Mastering Regular Expressions.
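For comparison, here is a minimal sketch of the parser-based route. It assumes the CPAN module HTML::TokeParser (from the HTML-Parser distribution) is installed. The sample markup and the variable names are made up for illustration, not taken from the original question.

```perl
use strict;
use warnings;
use HTML::TokeParser;    # CPAN module; assumed to be installed

# Hypothetical input: one single-line tag and one tag split over two lines.
my $html = <<'END_HTML';
<tag code="blah" attr="nothing">
<tag foo="bar"
     code="nothing">
END_HTML

# A scalar reference tells the parser to read from the string itself.
my $parser = HTML::TokeParser->new( \$html )
    or die "Cannot create parser";

my @codes;
while ( my $token = $parser->get_tag('tag') ) {
    # For a start tag, element 1 of the token is a hashref of attributes.
    my $attr = $token->[1];
    push @codes, $attr->{code} if defined $attr->{code};
}

print "$_\n" for @codes;
```

The parser does the tokenizing, so line breaks inside a tag and the order of attributes stop being your problem.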
Some cases to watch out for:
<!-- watch out for greedy matching -->
<tag code="blah" attr="nothing">

<!-- and for less-than characters in attribute values (which is likely illegal, but HTML in the wild is notoriously nasty this way) -->
<tag code="<bang!>">

<!-- finally, make sure you can handle multiple-line tags -->
<tag foo="bar"
     code="nothing">
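A quick way to check the regex against these cases is a small script like the one below. It is only a sketch: the helper name extract_code and the sample strings are made up for illustration, and it assumes each tag arrives as a single string.

```perl
use strict;
use warnings;

# Hypothetical sample input, one string per case listed above.
my @samples = (
    qq{<tag code="blah" attr="nothing">},     # greedy-matching trap
    qq{<tag code="<bang!>">},                 # angle brackets inside the value
    qq{<tag foo="bar"\n     code="nothing">}, # tag split across two lines
);

# Hypothetical helper: return the value of the code attribute, or undef
# if the text has no code="..." attribute.
sub extract_code {
    my ($text) = @_;
    my ($code) = ( $text =~ /code="([^"]+)"/ );
    return $code;
}

for my $sample (@samples) {
    my $code = extract_code($sample);
    print defined $code ? "code=$code\n" : "no match\n";
}
```

The negated character class stops at the first closing quote, so the first case does not run on into attr="nothing". The third case matches here only because the code attribute itself sits on one line. If you read the input line by line and a value can wrap across lines, you would need to join the lines first.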
In reply to Re: Removing certain non-word characters by tkil, in thread Removing certain non-word characters by Anonymous Monk