The desired solution is not well specified, though. In your example, it would seem equally valid to return:
<5b>I<5c>like <5d>yummy tacos
(corresponding to a different place to put the (<\w\w>)?). You'll have to either state that this doesn't matter or give a rule for resolving the ambiguity.
That is a good point. The tags are associated with the word immediately following them, so it would be nice if I could associate them with the first character of that word.
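For what it's worth, here is a rough sketch of that rule in Python, not a definitive solution. It assumes tags always look like <\w\w> and that a "word" starts at the first non-whitespace character after the tag; the function name attach_tags and the sample strings are made up for illustration.

```python
import re

# Assumption: a tag is exactly two word characters in angle brackets.
TAG = re.compile(r"<\w\w>")

def attach_tags(tagged):
    """Return (plain_text, anchors), where anchors is a list of
    (tag, index) pairs and index is the position in plain_text of the
    first character of the word that follows the tag."""
    plain = []     # characters of the text with tags removed
    pending = []   # tags seen but not yet attached to a word
    anchors = []
    # The capturing group makes re.split keep the tags in the result.
    for piece in re.split(r"(<\w\w>)", tagged):
        if TAG.fullmatch(piece):
            pending.append(piece)
            continue
        for ch in piece:
            if pending and not ch.isspace():
                # First non-space character after the tag(s): attach here.
                anchors.extend((t, len(plain)) for t in pending)
                pending.clear()
            plain.append(ch)
    return "".join(plain), anchors

# Two placements of the same tags around whitespace give the same anchors.
print(attach_tags("<5b>I <5c>like <5d>yummy tacos"))
print(attach_tags("<5b>I<5c> like <5d>yummy tacos"))
```

With this rule it no longer matters whether a tag sits before or after the whitespace that precedes its word, which is the ambiguity raised above. A tag with no word after it is simply left unattached in this sketch.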