How to find Unicode: 0x13 in File

dirtdog has asked for the wisdom of the Perl Monks concerning the following question:

Re: How to find Unicode: 0x13 in File by choroba (Cardinal) on Nov 18, 2016 at 15:03 UTC
You haven't shown the "grep commands and perl one liners", so it's hard to tell what's wrong with them. The following finds the character \x13 in a file in bash: `grep $'\023' file` The same in Perl, with either an octal or a hex escape: `perl -ne 'print if /\023/' file` or `perl -ne 'print if /\x13/' file` Finding "Unicode" in a file is not possible if you don't know the encoding of the file. The examples above work for UTF-8 (and probably for other ASCII-compatible encodings, too), where U+0013 is stored as the single byte 0x13.
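For anyone who wants more than a one-liner, here is a minimal sketch of the same byte-level search as a small script. The helper name `lines_with_byte` is made up for illustration, and the whole thing assumes the file uses an encoding in which the character is stored as the single byte 0x13 (UTF-8, ASCII, Latin-1 and the like).

```perl
use strict;
use warnings;

# Hypothetical helper: return the numbers of the lines in $path that
# contain the raw byte $byte (default 0x13). The file is read as bytes,
# so no decoding takes place.
sub lines_with_byte {
    my ($path, $byte) = @_;
    $byte = "\x13" unless defined $byte;

    open my $fh, '<:raw', $path or die "Cannot open $path: $!\n";
    my @hits;
    while (my $line = <$fh>) {
        # $. is the current line number of the handle just read
        push @hits, $. if index($line, $byte) >= 0;
    }
    close $fh;
    return @hits;
}

# Usage: perl find13.pl file   (the script name is an assumption)
if (@ARGV) {
    print "$ARGV[0]:$_\n" for lines_with_byte($ARGV[0]);
}
```

Reading with the `:raw` layer keeps Perl from applying any encoding layer, which is what the one-liners above effectively do as well.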
Re: How to find Unicode: 0x13 in File (no magic bullet!) by Discipulus (Canon) on Nov 18, 2016 at 15:21 UTC
> Finding "Unicode" in a file is not possible if you don't know the encoding of the file. Indeed! Just to add something: take a look at tchrist's answer about Perl and Unicode, "No magic bullet" (on Stack Overflow).
Re^2: How to find Unicode: 0x13 in File by james28909 (Deacon) on Nov 18, 2016 at 16:40 UTC
I may be wrong here, but if I am, I will learn something new :) Could you not read in a few MB of the file (if it is big enough), unpack it, and then test whether the character matches 0x13? Something like: `open (my $fh, '<', 'file') or die "$!\n"; binmode($fh); while(read $fh, my $char, 0x01){ my $buf = unpack('H*', $char); if ($buf =~ /13/){ print "found 0x13\n" } }` Contents of 'file': '.Eg5™eEfx`.' #'.' = 0x13; I'm not up to par on Unicode, so I could be way off.
Re^3: How to find Unicode: 0x13 in File by choroba (Cardinal) on Nov 18, 2016 at 16:51 UTC
> 0x01 Why do you specify the length in hex? Also note that if you use a length greater than 1 (which you'd want, to speed it up), you can get false positives: `read $fh, my $char, 2` reports 0x13 as present in a file containing just `a1`, because `perl -wE 'say unpack "H*", "a1"'` prints `6131`: the hex digits of the two bytes 0x61 and 0x31 run together, and the pattern matches across the byte boundary.
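To make the false positive concrete, here is a small sketch that compares the hex-string test with a test on the raw bytes. The helper names (`hex_match`, `byte_match`, `file_has_byte_13`) and the 64 KiB chunk size are assumptions chosen for illustration, not anything from the posts above.

```perl
use strict;
use warnings;

# The check from the thread: hex-encode the chunk, then look for "13".
# It can match across a byte boundary (0x61 0x31 encodes as "6131").
sub hex_match {
    my ($chunk) = @_;
    return unpack('H*', $chunk) =~ /13/ ? 1 : 0;
}

# Byte-level check: look for the byte 0x13 itself.
sub byte_match {
    my ($chunk) = @_;
    return index($chunk, "\x13") >= 0 ? 1 : 0;
}

# Hypothetical helper: scan a file in 64 KiB chunks. A single byte can
# never straddle two chunks, so chunked reading is safe here.
sub file_has_byte_13 {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!\n";
    while (read $fh, my $chunk, 65536) {
        return 1 if byte_match($chunk);
    }
    return 0;
}

printf "hex_match('a1')  = %d\n", hex_match('a1');     # 1: false positive
printf "byte_match('a1') = %d\n", byte_match('a1');    # 0: correct
```

Searching the bytes directly avoids the alignment problem altogether, so the chunk size can be as large as you like.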
Re^4: How to find Unicode: 0x13 in File by james28909 (Deacon) on Nov 18, 2016 at 17:24 UTC
Re^5: How to find Unicode: 0x13 in File by AnomalousMonk (Archbishop) on Nov 18, 2016 at 19:57 UTC
Some notes below your chosen depth have not been shown here
Re^2: How to find Unicode: 0x13 in File by dirtdog (Monk) on Nov 18, 2016 at 15:20 UTC
I was using the following, which did not work: `perl -ne 'print "$ARGV:$.\n" if /[^[:ascii:]]/;' $filename` and `grep -e "[\x{00FF}-\x{FFFF}]" $filename` The command you sent worked perfectly. Thanks!
Re^3: How to find Unicode: 0x13 in File by choroba (Cardinal) on Nov 18, 2016 at 16:14 UTC
> did not work And here's why: `[:ascii:]` matches characters in the range 0-127, so the negated class `[^[:ascii:]]` can never match 0x13 (decimal 19). Likewise, 19 doesn't lie between 255 and 65535.
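The explanation above can be checked directly in Perl. This is only a sketch: the helper name `classify` is made up, and the `[[:cntrl:]]` test is an extra suggestion that does not appear in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: report which of the patterns discussed in the
# thread match the single character $char.
sub classify {
    my ($char) = @_;
    return +{
        non_ascii  => ($char =~ /[^[:ascii:]]/        ? 1 : 0),
        high_range => ($char =~ /[\x{00FF}-\x{FFFF}]/ ? 1 : 0),
        literal    => ($char =~ /\x13/                ? 1 : 0),
        control    => ($char =~ /[[:cntrl:]]/         ? 1 : 0),
    };
}

# 0x13 is decimal 19: an ASCII control character, below 255.
my $result = classify("\x13");
printf "%-10s %d\n", $_, $result->{$_} for sort keys %$result;
```

For `"\x13"` only the `literal` and `control` entries come out as 1, which is why neither of the original two commands found anything.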