To get a picture of the byte values in the file (to see what might be causing those empty boxes), you could just do this:
    #!/usr/bin/perl
    while (<>) { chomp; $c{$_}++ for (split //); }
    printf("%02x : %s : %d\n",ord($_),$_,$c{$_}) for(sort keys %c);
If you run that script on your text file and save the output to some other file, like this:
    perl that_script < your_file.txt > char_list.txt
you can then look at the "char_list.txt" file to see which hex byte values occur in the data and show up as empty boxes in notepad.
If the file happens to be utf8 unicode, you might try this other tool, which I posted here a while back: unichist -- count/summarize characters in data
Run it like this:
    perl unichist -x < your_file.txt > char_list.txt
and look at that output with notepad. (Actually, you'll want to modify the "unichist" script so that it does print "\x{feff}\n"; before doing anything else -- this will put the "byte-order-mark" (BOM) character at the start of the output file, which will tell notepad to treat the file as utf8 data.)
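For what it's worth, here is a minimal sketch of the BOM tweak on its own, outside of unichist. It assumes the output handle should be encoded as UTF-8; the helper name emit_bom and the sample report line are made up for illustration and are not part of unichist.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of unichist): switch the handle to UTF-8
# output and write U+FEFF followed by a newline as the very first thing.
sub emit_bom {
    my ($fh) = @_;
    binmode($fh, ':encoding(UTF-8)');
    print {$fh} "\x{feff}\n";
    return;
}

emit_bom(\*STDOUT);

# Anything printed from here on follows the BOM. This is just a sample
# line in the same "hex : char : count" spirit as the script above.
printf "%04x : %s : %d\n", 0xe9, "\x{e9}", 1;
```

Encoded as UTF-8, U+FEFF comes out as the three bytes EF BB BF, which is the signature notepad looks for at the start of a file.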
Once you know what byte/character values are causing the empty boxes, you'll be able to decide how to fix or remove them.
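If the culprits turn out to be plain ASCII control characters, removal could look something like the sketch below. It assumes you want to keep tab, line feed and carriage return and drop every other byte in the ranges 0x00-0x1f plus 0x7f; adjust the list to whatever your character listing actually shows. The name strip_controls is just an illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: delete ASCII control characters from a string,
# keeping tab (0x09), line feed (0x0a) and carriage return (0x0d).
sub strip_controls {
    my ($text) = @_;
    $text =~ tr/\x00-\x08\x0b\x0c\x0e-\x1f\x7f//d;
    return $text;
}

# Small demonstration on a made-up sample string.
my $sample  = "abc\x01def\x1f\tghi\n";
my $cleaned = strip_controls($sample);
printf "before: %d chars, after: %d chars\n", length($sample), length($cleaned);
print $cleaned;
```

To use it as a filter on a real file, replace the demonstration part with print strip_controls($_) while <>; and run the script with input and output redirection, the same way as the byte-counting script.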
In reply to Re: noobie control char removal
by graff
in thread noobie control char removal
by desertman