Category: | Utilities |
Author/Contact Info | fmogavero, fmogo@mninter.net |
Description: | This script reads a file byte by byte and prints a message for every byte that is outside the ASCII text range (a value below 32 other than newline, or above 126). A client sent us a file with bad data, and it totally hosed their data in the database. |
use strict;
my $position = 0;
my $line = 1;
my $oldbyte = 0;
my $filesize = -s $ARGV[0];
my $byte;
print "File size is $filesize bytes.\n";
open(INPUT, '<', $ARGV[0]) || die "can't open $ARGV[0]: $!\n";
binmode(INPUT);   # read raw bytes; no newline translation on Windows
while ($position < $filesize){   # include the last byte of the file
read INPUT, $byte, 1, 0;
my $val = ord $byte;
if ( $val == 10 ) {
# every newline (LF) starts a new line
$line++;
}
if ( ($val < 32 && $val != 10 || $val > 126) ) {
print "Line $line byte value $val at offset $position is out of ASCII text range!\n";
}
#print ord $byte,"\n";
$position++;
}
print "$line lines in file!\n";
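
For comparison, here is a minimal sketch of the same check done with a regex over each line instead of a one-byte read loop. It is not the original author's code: the helper name `find_bad_bytes` and the in-memory demo data are made up for illustration, and it assumes the same rule as the script above (only LF and bytes 32 through 126 count as text, so tabs and carriage returns are reported too).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): scan a filehandle and
# return one [line, offset, value] triple per byte outside the text range.
sub find_bad_bytes {
    my ($fh) = @_;
    binmode $fh;          # raw bytes, no newline translation
    my @hits;
    my $offset  = 0;      # byte offset at which the current line starts
    my $line_no = 0;
    while ( defined( my $text = <$fh> ) ) {
        $line_no++;
        # Report anything that is not LF (0x0A) or printable ASCII (0x20-0x7E).
        while ( $text =~ /([^\x0A\x20-\x7E])/g ) {
            # pos() points just past the one-byte match, so subtract 1.
            push @hits, [ $line_no, $offset + pos($text) - 1, ord $1 ];
        }
        $offset += length $text;
    }
    return @hits;
}

# Demo on an in-memory file so the sketch runs without any input file.
my $data = "ok line\nbad\x00byte\ncaf\xE9\n";
open( my $fh, '<', \$data ) or die "can't open in-memory file: $!";
for my $hit ( find_bad_bytes($fh) ) {
    my ( $line, $offset, $val ) = @$hit;
    print "Line $line byte value $val at offset $offset is out of ASCII text range!\n";
}
close $fh;
```

On the demo data this reports byte value 0 at offset 11 on line 2 and byte value 233 at offset 20 on line 3. Reading a line at a time keeps memory use small for ordinary text files, but a file with no newlines at all would be read in one piece.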
•Re: Find illegal ASCII characters
by merlyn (Sage) on Mar 07, 2002 at 19:18 UTC | |
by Anonymous Monk on Mar 07, 2002 at 22:40 UTC | |
by demerphq (Chancellor) on Mar 08, 2002 at 19:09 UTC | |
by Juerd (Abbot) on Mar 08, 2002 at 20:05 UTC | |
by IlyaM (Parson) on Mar 09, 2002 at 01:06 UTC | |
| |
Re: Find illegal ASCII characters
by ww (Archbishop) on Mar 11, 2005 at 20:45 UTC | |
Re: Find illegal ASCII characters
by fmogavero (Monk) on Mar 07, 2002 at 20:14 UTC |