in reply to Re: Why XML not well formed?
in thread Why XML not well formed?

Hi guys,

Thank you for your quick replies. I think I found what's wrong inside the XML document. It seems the parser reports an error only when a link contains the character '&'.

For example, <link r:resource="http://www.urbancinefile.com.au/home/article_view.asp?Article_ID=3801&Section=Reviews"/>

As I need to read the <link/> elements one by one and compare the attribute value with the user's input, my new question is: how can I overcome this '&' problem? I tried putting '\' before the '&', but that doesn't work.

Thanks again,

# %links and $q are globals set elsewhere in the script
sub topic {
    my ($twig, $topic) = @_;
    my $count = 0;
    $links{ $_->att('r:resource') } = $_ for $topic->children('link');
    foreach my $key (keys %links) {
        if ($key =~ /$q/i) {
            print "\n  • ", $topic->att('r:id'), "\n  • \n";
            $count++;
        }
        # One matching link is enough: we only output the parent category,
        # so the other links in the same node need not be checked.
        last if $count == 1;
    }
    print "\n";
    $twig->purge;
    %links = ();    # reset the hash for the next call
}

Nan

    Re^3: Why XML not well formed?
    by davorg (Chancellor) on Jun 30, 2005 at 15:54 UTC

      If you are being passed data that contains a raw '&' character that hasn't been converted to '&amp;' then you aren't being passed valid XML and no XML parser will be able to deal with it.

      You should ask your data provider to fix their processes so that they _do_ send you valid XML.
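If the provider cannot fix the feed, one stopgap is to escape bare ampersands before the data reaches the parser. The following is only a minimal sketch, assuming the sole damage is an unescaped '&' in otherwise well-formed markup; `escape_bare_amp` is a hypothetical helper name, not part of any XML module.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every '&' that does not already start an
# entity reference (&name;) or character reference (&#38; / &#x26;)
# with '&amp;'. Existing references are left untouched.
sub escape_bare_amp {
    my ($text) = @_;
    $text =~ s/&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)/&amp;/g;
    return $text;
}

# The link from the question, with its raw '&' in the query string.
my $broken = '<link r:resource="http://www.urbancinefile.com.au/home/article_view.asp?Article_ID=3801&Section=Reviews"/>';
print escape_bare_amp($broken), "\n";
```

This is a heuristic, not a repair tool: text such as `&b;` inside a URL would be taken for an entity reference and left alone, and it does nothing about a stray '<' or '"'.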

      --
      <http://www.dave.org.uk>

      "The first rule of Perl club is you do not talk about Perl club."
      -- Chip Salzenberg

        Dave and guys,

        I finally found the problem: it's not only '&' but also '<' and '"'. I don't know how many of these characters are left, and I'm still looking, as the original data is about 300MB.
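For a file of that size, scanning line by line avoids loading everything into memory. Below is a rough sketch that only reports lines containing a bare '&'; spotting a stray '<' or '"' inside an attribute value needs more context than a single regex gives. The function name `find_bare_amp_lines` and the sample data are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read a filehandle one line at a time and return
# the line numbers that contain an '&' not followed by a complete
# entity or character reference.
sub find_bare_amp_lines {
    my ($fh) = @_;
    my @hits;
    while (my $line = <$fh>) {
        push @hits, $.
            if $line =~ /&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)/;
    }
    return @hits;
}

# Demo on an in-memory filehandle; for real data, open the file instead.
my $sample = qq{<link r:resource="a.asp?x=1&amp;y=2"/>\n}
           . qq{<link r:resource="a.asp?x=1&y=2"/>\n};
open my $fh, '<', \$sample or die "Cannot open sample: $!";
print "bare '&' on line $_\n" for find_bare_amp_lines($fh);
close $fh;
```

The list of line numbers can then be passed back to the data provider, or used to decide whether an automatic fix is safe.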

        Thanks all for your tolerant help! Nan