My best guess would be to actually parse it and store the whole thing in some sort of data structure. The simplest would be a hash of the form tag => string; then you could iterate over just the values of the hash to ignore the tags, or over just the keys to ignore the text. How you would actually parse this string is a bit beyond me. If the tags are truly as simple as you depict here, then it should be fairly simple to just use a regex like /<5\w>\w+/ or something, but beyond that you would have to look at some of the parsers on CPAN.
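
To make the hash idea concrete, here is a rough sketch. It assumes each tag looks like <name> and is followed by plain text running up to the next tag; the sample string and the names $input and %text_for are made up for illustration, and the pattern is a guess at the format rather than the exact regex above, so adjust it to your real data.

```perl
use strict;
use warnings;

# Hypothetical input: the real tag format may differ, so treat this
# string and the pattern below as assumptions.
my $input = '<a>hello <b>world <c>again';

my %text_for;

# Capture the tag name and the text that follows it, up to the next '<'.
while ( $input =~ /<(\w+)>([^<]*)/g ) {
    my ( $tag, $text ) = ( $1, $2 );
    $text =~ s/\s+$//;    # trim trailing whitespace
    $text_for{$tag} = $text;
}

# Keys alone give the tags; values alone give the text without tags.
my @tags  = sort keys %text_for;
my @texts = map { $text_for{$_} } @tags;

print "$_ => $text_for{$_}\n" for @tags;
```

Keep in mind that a plain hash does not remember the original order, and a repeated tag overwrites the earlier entry. If either of those matters, an array of [tag, text] pairs is probably the safer structure.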