PerlMonks
Regex Question
by HamNRye (Monk) on Mar 26, 2009 at 16:49 UTC [id://753454]
HamNRye has asked for the wisdom of the Perl Monks concerning the following question:

Greetings Monks. I am working on converting some text from a legacy system to good old-fashioned XHTML. I have a problem that I can't get my brain around and am looking for some assistance.

The system in question uses tags with hex control characters, so a formatting tag might look like this:

Here is an example of the data:
Ideally I would like my output to be:
I want to remove everything from the tag up to the \x90 character, but that character will not always be there. I do not want to match a \x90 later in the file and truncate the data. In other words, I want to match from the beginning of the tag to the first word character that is NOT inside angle brackets.

Here is the regex I've been using:

s/<\/bug[^>]*>[^9D]*\x9D.*?\x90//isg

I run the match once with the \x90 and then once without it, but the first match ends up being too greedy. I have tried more complex regexes using lookahead/lookbehind assertions, but couldn't get those working.

Your help is much appreciated.
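One possible way to express the stated goal (match from the tag to the first word character outside angle brackets) is sketched below. It is only a sketch: the sample strings are invented, because the real data is not shown here, and the name strip_bug_tag is hypothetical. It also assumes that only further angle-bracket tags and non-word characters (including the \x9D and \x90 control bytes) sit between the tag and the text to keep. Note that in the pattern above, [^9D] excludes the literal characters 9 and D, not the byte \x9D; that would be written [^\x9D].

```perl
use strict;
use warnings;

# Hypothetical helper: remove a </bug ...> tag plus everything after it,
# up to (but not including) the first word character outside angle brackets.
sub strip_bug_tag {
    my ($text) = @_;
    $text =~ s{
        </bug [^>]* >        # the bug tag itself
        (?:
            < [^>]* >        # any further tag in angle brackets
          | [^\w<]           # or one non-word character (control bytes, spaces)
        )*
    }{}gix;
    return $text;
}

# Invented sample data; the real legacy format may differ.
my $with_90    = strip_bug_tag("</bug type=1>\x9D<fmt>\x90Hello world \x90 more");
my $without_90 = strip_bug_tag("</bug>\x9D Title text");

print "later \\x90 kept\n"        if $with_90    eq "Hello world \x90 more";
print "works without \\x90\n"     if $without_90 eq "Title text";
```

Because the two alternatives inside the group can never start on the same character, the match stops at the first word character and cannot run ahead to a later \x90. The trade-off is that leading whitespace and punctuation before the first word character are removed as well.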