in reply to Why XML not well formed?

Have you taken a close look at line 221, column 97? That's where your problem is.

--
<http://www.dave.org.uk>

"The first rule of Perl club is you do not talk about Perl club."
-- Chip Salzenberg

Re^2: Why XML not well formed?
by Fletch (Bishop) on Jun 30, 2005 at 13:23 UTC

    And since you have the byte offset in the file you can explicitly print the offending portion of the input with some context thusly:

    perl -le 'open(X,shift())or die "$!";seek(X, 12000, 0)or die "$!";read X, $b, 40;print $b, "\n", " " x 19, "^\n"' foo.xml

    Update: Oop, I see by the paths mentioned you're on Wintendo; you'll probably want to adjust the quotes on that or store it into a file and run that (or just get a real shell and/or OS . . . :)
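
    If quoting the one-liner is a nuisance, the same idea could live in a small script. This is only a sketch: the name context_at, the 40-byte window, and centring the excerpt on the offset are my own assumptions, not anything the parser gives you.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Return up to $width bytes of $file centred on byte $offset,
    # followed by a line with a caret under the byte at $offset.
    # (context_at is a hypothetical helper name, not a library function.)
    sub context_at {
        my ($file, $offset, $width) = @_;
        my $start = $offset - int($width / 2);
        $start = 0 if $start < 0;
        open my $fh, '<:raw', $file or die "Cannot open $file: $!";
        seek $fh, $start, 0 or die "Cannot seek in $file: $!";
        my $got = read $fh, my $buf, $width;
        die "Cannot read $file: $!" unless defined $got;
        close $fh;
        $buf =~ tr/\r\n\t/ /;   # keep the excerpt on one line so the caret lines up
        return $buf . "\n" . (' ' x ($offset - $start)) . "^\n";
    }

    # Usage: perl context.pl foo.xml 12000
    if (@ARGV) {
        my ($file, $offset) = @ARGV;
        print context_at($file, $offset // 0, 40);
    }
    ```

    Saved as, say, context.pl, it takes the file name and the byte offset as arguments, so there is nothing for the Windows shell to mangle.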

    --
    We're looking for people in ATL

Re^2: Why XML not well formed?
by nan (Novice) on Jun 30, 2005 at 15:46 UTC

    Hi guys,

    Thank you for your quick replies. I think I found what's wrong inside the XML document. It seems that the parser reports an error only when a link contains the character '&'.

    For example, <link r:resource="http://www.urbancinefile.com.au/home/article_view.asp?Article_ID=3801&Section=Reviews"/>

    As I need to read the <link/> elements one by one and compare each attribute value with the user's input, my new question is: how can I overcome this '&' problem? I have tried putting '\' before the '&', but it doesn't work.

    Thanks again,

    Nan

      If you are being passed data that contains a raw '&' character that hasn't been converted to '&amp;' then you aren't being passed well-formed XML and no conforming XML parser will be able to deal with it.

      You should ask your data provider to fix their processes so that they _do_ send you well-formed XML.
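
      For illustration, here is roughly what the escaping looks like on the producing side. It is a sketch under the assumption that the provider builds the file from raw strings; xml_escape_attr is a made-up helper name, not part of any module.

      ```perl
      use strict;
      use warnings;

      # Escape the characters that are special inside an XML attribute value.
      # '&' is handled first so the entities added afterwards are not re-escaped.
      # (xml_escape_attr is a hypothetical helper, not a library function.)
      sub xml_escape_attr {
          my ($text) = @_;
          $text =~ s/&/&amp;/g;
          $text =~ s/</&lt;/g;
          $text =~ s/>/&gt;/g;
          $text =~ s/"/&quot;/g;
          return $text;
      }

      my $url = 'http://www.urbancinefile.com.au/home/article_view.asp?Article_ID=3801&Section=Reviews';
      printf qq{<link r:resource="%s"/>\n}, xml_escape_attr($url);
      ```

      Once the file is well-formed, the parser hands the attribute value back with '&amp;' already turned into a plain '&', so comparing it with the user's input needs no special handling.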

        Dave and guys,

        I finally found the problem: it's not only '&' but also '<' and '"'. I don't know how many of these characters are left, and I'm still looking, as the original data is about 300MB.

        Thanks all for your tolerant help! Nan
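
        For hunting through a file that size, something like the following might help locate the stray '&' characters without loading everything into memory. It is a rough sketch, not a parser: bare_amp_columns is a made-up name, it only looks for '&', and it will also flag an '&' inside a CDATA section or a comment, where a bare '&' is legal.

        ```perl
        use strict;
        use warnings;

        # Matches an '&' that does not start an entity or character reference
        # such as &amp; or &#38; or &#x26;.
        my $BARE_AMP = qr/&(?!(?:[A-Za-z_:][\w.:-]*|#[0-9]+|#x[0-9A-Fa-f]+);)/;

        # Return the 1-based columns of every bare '&' in one line of text.
        # (bare_amp_columns is a hypothetical helper, not a library function.)
        sub bare_amp_columns {
            my ($line) = @_;
            my @cols;
            while ($line =~ /$BARE_AMP/g) {
                push @cols, $-[0] + 1;   # $-[0] is the 0-based start of the match
            }
            return @cols;
        }

        # Usage: perl scan.pl foo.xml
        # The file is read one line at a time, so memory use stays small.
        if (@ARGV) {
            open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
            while (my $line = <$fh>) {
                print "line $.: column $_\n" for bare_amp_columns($line);
            }
            close $fh;
        }
        ```

        Stray '<' and '"' characters are much harder to pick out with a regex, since they are legal in most places, so for those the parser's own line and column reports are probably the better guide.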