in reply to Removing certain non-word characters

Ask for what you want matched. In this case, you most likely want everything up to the next double-quote, yes? If so, something like this should work:

my ( $code ) = ( $line =~ /code="([^"]+)"/ );

The use (and abuse) of regexes to match HTML content has been beaten to death. If you want more robust results, consider using a module designed to parse HTML. The pitfalls of matching HTML with regexes are also covered in the fantastic book Mastering Regular Expressions.
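For what it's worth, here is a rough sketch of the parser route. It assumes HTML::TokeParser (from the HTML-Parser distribution on CPAN) is installed, and the element name "tag" and the sample markup are made up for illustration, so adjust them to your real data:

```perl
use strict;
use warnings;
use HTML::TokeParser;    # CPAN module, part of the HTML-Parser distribution

# Hypothetical sample input; the second tag spans two lines.
my $html = <<'HTML';
<tag code="blah" attr="nothing">
<tag foo="bar"
     code="nothing">
HTML

my $parser = HTML::TokeParser->new( \$html )
    or die "Could not create parser";

# get_tag returns an array ref for each matching start tag;
# element 1 of that array is a hash ref of the tag's attributes.
my @codes;
while ( my $token = $parser->get_tag('tag') ) {
    my $attr = $token->[1];
    push @codes, $attr->{code} if defined $attr->{code};
}

print "$_\n" for @codes;
```

The parser does the quoting and line-break bookkeeping for you, which is most of what a hand-rolled regex gets wrong.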

Some cases to watch out for:

<!-- watch out for greedy matching -->
<tag code="blah" attr="nothing">

<!-- and for less-than characters in attribute values (which is likely illegal, but HTML in the wild is notoriously nasty this way) -->
<tag code="<bang!>">

<!-- finally, make sure you can handle multiple-line tags -->
<tag foo="bar"
     code="nothing">
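
A quick way to check those cases is to run the pattern over them. This is only a sketch: the sample strings and the helper name extract_code are mine, and it assumes each tag arrives as a single string (if you read the file line by line, a value that itself spans lines will not match until you join the lines first).

```perl
use strict;
use warnings;

# Hypothetical test data mirroring the three cases above.
my @samples = (
    '<tag code="blah" attr="nothing">',
    '<tag code="<bang!>">',
    qq{<tag foo="bar"\n     code="nothing">},
);

# Return the value of the code attribute, or undef if there is none.
# [^"]+ stops at the next double-quote, so it cannot run into later attributes.
sub extract_code {
    my ($text) = @_;
    my ($code) = ( $text =~ /code="([^"]+)"/ );
    return $code;
}

for my $sample (@samples) {
    my $code = extract_code($sample);
    print defined $code ? "code: $code\n" : "no match\n";
}

# For contrast, a greedy .+ runs on to the last double-quote on the line.
my ($greedy) = ( $samples[0] =~ /code="(.+)"/ );
print "greedy: $greedy\n";    # prints: greedy: blah" attr="nothing
```

The negated character class is what keeps the match honest here; the greedy version happily swallows the attr attribute as well.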