
# remove wide non ascii chars

Why would you want to do that?

Usually characters are in a string because they carry information - removing them by such a blind criterion as codepoint ranges almost surely implies data loss.

There are many pages on the internet where next to nothing remains if you remove all non-ASCII chars.
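To make that concrete, here is a minimal sketch of what a codepoint-range filter does to ordinary text. The helper name `strip_non_ascii` and the sample strings are made up for illustration, and the substitution is only assumed to resemble the one in the code under discussion.

```perl
use strict;
use warnings;
use utf8;                              # the string literals below are UTF-8
binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical helper: drops every character outside the ASCII range.
sub strip_non_ascii {
    my ($text) = @_;
    (my $ascii = $text) =~ s/[^\x00-\x7F]//g;
    return $ascii;
}

# Print each sample next to what survives the filter.
for my $sample ('Grüße aus Köln', 'Привет, мир', 'naïve café') {
    printf "%s => '%s'\n", $sample, strip_non_ascii($sample);
}
```

The German sample loses its umlauts and the ß, and the Russian one is reduced to a comma and a space.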

Re^3: HTML::Parser, get rid of JavaScript
by tachyon-II (Chaplain) on Jun 25, 2008 at 16:26 UTC

    Why would you want to do that?

    Fair point. In the application I cut and pasted it from, I did want only ASCII text. I've commented it out.