Encoding all HTML entities except < and >

TrixieTang has asked for the wisdom of the Perl Monks concerning the following question:

How would one go about encoding all HTML entities in a string except for < and >? I've tried using HTML::Entities, but it ends up encoding < and > and breaking the HTML. I've also tried tinkering with the unsafe_characters parameter in HTML::Entities but I still can't seem to get it to allow < and >.

Re: Encoding all HTML entities except < and > by haukex (Archbishop) on Oct 15, 2019 at 19:18 UTC
That seems like a bit of a strange request to me. Could you explain some more what you need this for? Anyway, this works for me: `use HTML::Entities; my $str = "'Hello\" & <World>"; encode_entities($str, q{&"'}); print $str, "\n"; # prints &#39;Hello&quot; &amp; <World>` But what do you mean by "all HTML entities"? Do you include e.g. non-ASCII characters in that definition? (And what encoding are you using for your HTML files?) Could you show some representative example input and the expected output for that?
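If "all HTML entities" is meant to include non-ASCII characters as well, one possible approach is to pass encode_entities a negated character class that lists the characters to leave alone, so that everything else is encoded. The following is only a sketch under that assumption: the sample input and the exact set of safe characters are made up for illustration, and the string is assumed to be already-decoded character data.

```perl
use strict;
use warnings;
use utf8;                      # the source below contains a literal non-ASCII character
use HTML::Entities qw(encode_entities);

# Hypothetical sample input: markup that should survive, plus text that should be encoded.
my $input = qq{café & <b>"x"</b>};

# Negated character class listing the characters to leave alone:
# newline, space, '!', '#'..'%', and '('..'~' (a range that includes < and >).
# Everything else (", &, ', control characters, non-ASCII) is treated as unsafe.
my $unsafe = '^\n\x20\x21\x23-\x25\x28-\x7e';

# In non-void context encode_entities returns an encoded copy and leaves $input alone.
my $encoded = encode_entities($input, $unsafe);
print "$encoded\n";    # prints caf&eacute; &amp; <b>&quot;x&quot;</b>
```

Note that quotes inside tags (e.g. attribute values) would be encoded too, so this only leaves markup intact when the tags carry no quoted attributes.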
Re^2: Encoding all HTML entities except < and > by TrixieTang (Sexton) on Oct 15, 2019 at 20:19 UTC
Yeah, I'm not even sure what I was thinking. After thinking more about this, this question now seems incredibly stupid and weird even to me. I was trying to get a module called HTML::WikiConverter to work, but I keep getting either wide character errors or garbled text when trying to use it. I'm not sure what the hell I was even thinking trying to run the string through HTML::Entities first. Looking at the CPAN page for the HTML::WikiConverter module now, I realize that the module has a bunch of reports of UTF-8 issues - which is obviously causing the issues that I've been having.
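For what it's worth, "wide character" warnings and garbled text are often a sign that bytes and characters are being mixed somewhere. The sketch below shows the general decode-on-input, encode-on-output pattern using the core Encode module. It is not specific to HTML::WikiConverter, whose expectations about its input are not covered here, and the sample bytes are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: the UTF-8 bytes for "café", as they would arrive from a file or socket.
my $bytes = "caf\xC3\xA9";

# Decode once at the boundary so Perl sees 4 characters rather than 5 bytes.
my $chars = decode('UTF-8', $bytes);

# ... hand $chars to whatever processes the text ...

# Encode once on the way out. Printing a character string to a handle that has
# no encoding layer is what produces garbled output or "Wide character" warnings.
my $out = encode('UTF-8', $chars);
print $out, "\n";
```

Alternatively, an I/O layer such as `binmode(STDOUT, ':encoding(UTF-8)')` can do the output encoding, in which case the character string is printed directly.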