in reply to Rogue Null (ordinate 0) characters in text files
split /\.*/ is wrong. (So is split /.*/, which is probably what you intended.) You don't have a list of items separated by zero or more periods. You want split //.
split /\.*/ still breaks each line into single characters, because the pattern can match the empty string, but it discards every period without checking it, and it results in you doing ord('') (which returns zero) when the line starts with a period.
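To see what the two patterns actually return, here is a minimal sketch. The sample line `.ab` and the variable names are made up for illustration; they are not from the original script.

```perl
use strict;
use warnings;

# Hypothetical sample line that starts with a period.
my $line = ".ab";

# The leading '.' is a positive-width match at the start of the string,
# so split returns an empty first field.
my @by_periods = split /\.*/, $line;

# The empty pattern returns one field per character, periods included.
my @by_chars = split //, $line;

my @ords_by_periods = map { ord($_) } @by_periods;
my @ords_by_chars   = map { ord($_) } @by_chars;

print "split /\\.*/ gives ords: @ords_by_periods\n";
print "split //    gives ords: @ords_by_chars\n";
```

The first list starts with 0 because ord('') is 0, which is where a "null character" that is not in the file comes from. The second list starts with 46, the ordinal of the period itself.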
There are other issues, though.
    use strict;
    use warnings;

    @ARGV == 1
        or die("Usage: $0 FILE\n");

    my $in = shift;
    open(INFILE, '<', $in)
        or die("Can't open $in for reading: $!\n");

    while (<INFILE>) {
        chomp;
        for my $char (split //) {
            my $ord = ord($char);
            # Allow tab (9), newline (10) and printable ASCII (32-126).
            if ( $ord < 9 || ($ord > 10 && $ord < 32) || $ord > 126 ) {
                print("$in contains illegal character (ord: $ord) on line $.\n");
            }
        }
    }
It's usually better to use <> instead of opening the file yourself. This allows input to be read from STDIN if no filename is provided.
    use strict;
    use warnings;

    while (<>) {
        chomp;
        for my $char (split //) {
            my $ord = ord($char);
            # Allow tab (9), newline (10) and printable ASCII (32-126).
            if ( $ord < 9 || ($ord > 10 && $ord < 32) || $ord > 126 ) {
                print("Input contains illegal character (ord: $ord) on line $.\n");
            }
        }
    }
Furthermore, you could use a regexp instead of splitting.
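    use strict;
    use warnings;

    while (<>) {
        chomp;
        # Match anything that is not a tab or printable ASCII, and
        # capture it so its ordinal can be reported.
        while (/([^\x09\x20-\x7E])/g) {
            my $ord = ord($1);
            print("Input contains illegal character (ord: $ord) on line $.\n");
        }
    }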
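As a sanity check that the character class flags the same characters as the explicit ord() comparisons, the following sketch runs both tests over every single-byte value and lists where they disagree. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Collect the ordinals for which the two tests give different answers.
my @differ;
for my $ord (0 .. 255) {
    my $char     = chr($ord);
    my $by_ord   = $ord < 9 || ($ord > 10 && $ord < 32) || $ord > 126;
    my $by_class = $char =~ /[^\x09\x20-\x7E]/;

    # Normalize both results to booleans before comparing them.
    push @differ, $ord if !!$by_ord != !!$by_class;
}

print "Tests disagree for ord: @differ\n";
```

The only disagreement is ordinal 10 (newline), which the class rejects and the ord() test allows. Since each line is chomped before it is checked, that difference should not matter when reading with the default input record separator.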