Usually you'd want a parser, but a tokenizer will do the trick here.
- When you encounter an <a> start tag, set $in_link and emit the token.
- When you encounter a </a>, clear $in_link and emit the token.
- When you encounter text and $in_link is set, emit the token.
- When you encounter text and $in_link isn't set, linkify the URLs within it and emit the modified text.
- When you encounter anything else, emit the token.
It's much more robust than using regexp patterns on the raw HTML, and it's very clear and simple.
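
Here is a minimal sketch of those steps, assuming Python and its standard html.parser module as the tokenizer (the same loop carries over to whatever tokenizer your language offers). The names Linkifier, linkify_html and URL_RE are made up for illustration, and the URL pattern is deliberately naive: it only catches http/https URLs and will swallow trailing punctuation. Text comes out of the tokenizer unescaped, so it is re-escaped on the way out, and the contents of script and style elements are passed through untouched.

```python
import html
import re
from html.parser import HTMLParser

# Naive URL pattern (an assumption for this sketch, not a full URL grammar).
URL_RE = re.compile(r"https?://[^\s<>\"']+")


class Linkifier(HTMLParser):
    """Re-emits every token, linkifying URLs in text outside <a>...</a>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.in_link = False  # plays the role of $in_link
        self.in_raw = False   # inside <script>/<style>: text is not prose

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_link = True
        elif tag in ("script", "style"):
            self.in_raw = True
        # Emit the start tag exactly as it appeared in the input.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are emitted unchanged.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False
        elif tag in ("script", "style"):
            self.in_raw = False
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.in_raw:
            self.out.append(data)
        elif self.in_link:
            # Already inside a link: emit the text, re-escaped.
            self.out.append(html.escape(data, quote=False))
        else:
            self.out.append(self._linkify(data))

    def _linkify(self, text):
        parts, pos = [], 0
        for m in URL_RE.finditer(text):
            url = m.group(0)
            parts.append(html.escape(text[pos:m.start()], quote=False))
            parts.append(
                f'<a href="{html.escape(url)}">{html.escape(url, quote=False)}</a>'
            )
            pos = m.end()
        parts.append(html.escape(text[pos:], quote=False))
        return "".join(parts)

    # "Anything else": emit as-is.
    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_pi(self, data):
        self.out.append(f"<?{data}>")


def linkify_html(source):
    parser = Linkifier()
    parser.feed(source)
    parser.close()
    return "".join(parser.out)
```

Calling linkify_html on a fragment wraps bare URLs in anchors and leaves existing links alone, since the URL inside an <a> element is only ever seen while in_link is set.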