Hello,

The following code is supposed to check for "illegal" characters in a text file that I pass to it.
I define illegal as ordinates 0-8, 11-31 or >126.

BEGIN{
    @ARGV == 1 or warn "\n\tUsage: $0 FILE\n\n" and exit 255;
    $in=shift;
    warn "Error: Can't read input file $in\n" and exit 255 if ! -s $in;
}

open (INFILE, $in) or die "Can't open $in for reading\n";
while (<INFILE>) {
    $line = $_;
    chomp $line;
    @character=split /\.*/;
    while (@character){
        my $character = shift @character;
        $ord = ord($character);
        if ($ord>126) {
            print "$in contains illegal character \(ord:\ ", "$ord\) on line $.";
        }
        elsif ($ord<32) {
            if ($ord<9) {
                print "$in contains illegal character \(ord:\ ", "$ord\) on line $.";
            }
            if ($ord>10) {
                print "$in contains illegal character \(ord:\ ", "$ord\) on line $.";
            }
        }
    }
}
close (INFILE);

That gives the expected results for most of my text files, e.g. a text file that contains ^Z on line 15 will yield:

test.txt contains illegal character (ord: 26) on line 15

However, if the input text file contains a line beginning with "." (period), I get output like this:

test.txt contains illegal character (ord: 0) on line 23
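
The same result can be reproduced outside the script with a cut-down test, which may help narrow it down. This is only a sketch: the helper name `field_ords` and the sample line `.abc` are made up for illustration. It applies the same split pattern to one line and prints the ordinate of each field that comes back.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line with the same pattern as the script
# and return the ordinate of each resulting field.
sub field_ords {
    my ($line) = @_;
    return map { ord($_) } split /\.*/, $line;
}

# Sample line (made up) that starts with a period.
# The period is matched by the pattern at the very start of the string,
# so split returns an empty leading field, and ord('') is 0.
print join(' ', field_ords('.abc')), "\n";   # prints "0 97 98 99"

# A line that does not start with a period has no empty leading field.
print join(' ', field_ords('abc')), "\n";    # prints "97 98 99"
```

So the 0 seems to come from an empty first field rather than from the period itself, which never shows up as a field at all.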

Can anyone explain (in terms a newbie can understand) why it thinks a "." at the start of a line is a Null character?

Thanks.

In reply to Rogue Null (ordinate 0) characters in text files by paulnovl
