For one thing, I trust you've learned by now (from experience) that whenever you run any shell command like this:
some_process < some.file > some.file
BEFORE some_process even starts, the shell sets up both redirections, and opening "some.file" for output truncates it (i.e. sets its content to zero bytes). So by the time some_process reads its stdin, the result is: no data read by the process, because there's no longer any data in the file. Hope you have a backup copy...
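If the intent was to filter the file in place, one common workaround is to write to a temporary file and rename it over the original only when the command succeeds. The following is a minimal sketch, not the only way to do it; the function name rewrite_in_place is hypothetical, and it assumes a mktemp that accepts a template argument (GNU and BSD versions do).

```shell
# Hypothetical helper: run a filter over a file without truncating it first.
# Usage: rewrite_in_place FILE COMMAND [ARGS...]
rewrite_in_place() {
    file=$1
    shift
    # Create the temporary file next to the original so mv stays on one filesystem.
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    if "$@" < "$file" > "$tmp"; then
        mv "$tmp" "$file"      # replace the original only on success
    else
        rm -f "$tmp"           # leave the original untouched on failure
        return 1
    fi
}
```

For example, "rewrite_in_place some.file tr -d '\026'" would delete every 0x16 byte (octal 026) from some.file. Note that the replaced file gets the temporary file's permissions, so you may need a chmod afterwards.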

For another, when you see something like \u0016, that's a hexadecimal value (the code point U+0016, a control character). You can "grep" for that using a perl one-liner like the following:

perl -CS -ne 'print if /\x{0016}/' < some.file
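Or, since U+0016 is the single byte 0x16 in UTF-8 and every other ASCII-compatible encoding (i.e. as long as the file is NOT UTF-16), you could just do: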
perl -ne 'print if /\x16/' some.file
Of course, \u0016 isn't the only character that an XML library would reject, and if your file has a bunch of different ones, it gets tiresome fixing them one codepoint at a time as they get reported in error messages.

You can look up how valid vs. invalid XML characters are enumerated (many people use regexes - e.g. on stackoverflow), and run a diagnosis on your file(s) before feeding them to your script (or just add code to your script to filter out the bad characters, if you're sure that just deleting them is The Right Thing To Do).


In reply to Re: How to remove error: Code point \u0016 is not a valid character in XML by graff
in thread [Solved]:How to remove error: Code point \u0016 is not a valid character in XML by Perl300
