Yllar has asked for the wisdom of the Perl Monks concerning the following question:
I am working in a Windows environment. I want to strip all non-ASCII characters and keep only ASCII-range characters (letters, numbers and symbols). Please help.
My input was:
This is a simple text just for test purpose only ascii text 12345678910-=[];'#/.,-! " £ $ % ^ & * ( ) _ + { }~@:<>?|–
Now I am using JSON to decode my input data which decodes it as follows:
This is a simple text just for test purpose only ascii text12345678910-=[];\'#/.,\\-!\"\u00A3$%^&*()_+{}~@:<>?| \u2013
Now I am sending this decoded data to my program to replace the Unicode (UTF-8) and other non-ASCII characters with a space or some other printable character (I mean I want to print only ASCII-range characters). So I tried all of the following in Perl.
use strict;
use warnings;
use JSON;
use LWP::UserAgent;
use utf8;

#Due to some security reasons I am not mentioning the url,hope u understand
my $ResRef = sendHTTPRequest($someurlRequest);
my $string = $ResRef->decoded_content;#I used json decode to decode content
my $string = transalte_replace($string);

sub transalte_replace {
    my $string = shift;
    for($string) {
        s/\\u[0-9]+/1-/g;
        s/\\u[a-zA-Z0-9\+]*/2-/g;
        s/\\x\{[a-zA-Z0-9]*\}/3-/g;
        s/[^\p{ASCII}]/-/g;
        s/[^\u0000-\u007F]+/replace1/g;
        s/[^\x00-\x7F]+/rep/g;
        s/[^\p{ASCII}]/-/g;
        s/[^A-Za-z0-9\.,\?'""!@#\$%\^&\*\(\)-_=\+;:\<\>\/\\\|\}\{\[\]`\~]+/y/g;
        #s/[£]//g;
        s/[^\x20-\x7E]+/replace3/g;
        #s/\\u[0-9]+/2-/g;
        #s/\\x[a-z0-9]+/3-/g;
        #s/[^\x00-\x7F]/4-/g;
    }
}
The output still is:
"This is a simple text just for test purpose only ascii text12345678910-=[];'#/.,\-!\"\x{a3}\$%^&*()_+{}~\@:?|\x{2013}";
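For readers following along, here is a minimal sketch of the approach the question describes: decode the JSON first so the \uXXXX escapes become real characters, then replace every run of characters outside the ASCII range. It is not the poster's code. The JSON key `text`, the sample payload, and the sub name `replace_non_ascii` are assumptions made up for illustration, and JSON::PP is used only because it ships with Perl (the JSON module from the post exports the same `decode_json`). Note that the sub returns the modified string explicitly, which the posted sub does not do.

```perl
use strict;
use warnings;
use JSON::PP;    # core module; provides decode_json

# Replace each run of non-ASCII characters with one replacement string.
# The name and the default replacement (a space) are illustrative choices.
sub replace_non_ascii {
    my ($text, $replacement) = @_;
    $replacement = ' ' unless defined $replacement;
    $text =~ s/[^\x00-\x7F]+/$replacement/g;
    return $text;    # return the modified copy to the caller
}

# Hypothetical payload: the key "text" is an assumption, not from the post.
my $json = '{"text":"price \u00A3 5 \u2013 ok"}';

# decode_json converts the \u00A3 and \u2013 escapes into real characters,
# so the substitution sees U+00A3 and U+2013 rather than backslash sequences.
my $data  = decode_json($json);
my $clean = replace_non_ascii($data->{text}, '?');

print "$clean\n";    # prints "price ? 5 ? ok"
```

To keep only printable ASCII and drop control characters as well, the character class could be narrowed to `[^\x20-\x7E]`, as in the last active substitution of the posted code.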
Replies are listed 'Best First'.

Re: Regex to trim non Ascii characters
by trippledubs (Deacon) on Sep 26, 2015 at 11:09 UTC

Re: Regex to trim non Ascii characters
by Albannach (Monsignor) on Sep 26, 2015 at 18:02 UTC

Re: Regex to trim non Ascii characters
by Anonymous Monk on Sep 27, 2015 at 18:51 UTC

Re: Regex to trim non Ascii characters
by nikosv (Deacon) on Sep 27, 2015 at 17:45 UTC