PerlMonks  

Re^2: Why XML not well formed?

by nan (Novice)
on Jun 30, 2005 at 15:46 UTC (#471371)


in reply to Re: Why XML not well formed?
in thread Why XML not well formed?

Hi guys,

Thank you for your quick replies. I think I have found what is wrong with the XML document: the parser reports an error only when a link contains the character '&'.

For example, <link r:resource="http://www.urbancinefile.com.au/home/article_view.asp?Article_ID=3801&Section=Reviews"/>

As I need to read the <link/> elements one by one and compare each attribute value with the user's input, my new question is: how can I overcome this '&' problem? I tried putting '\' before the '&', but that doesn't work.

Thanks again,

sub topic {
    my $count = 0;
    my ($twig, $topic) = @_;
    # Index this topic's <link/> children by their r:resource attribute.
    $links{$_->att('r:resource')} = $_ for $topic->children('link');
    foreach my $key (keys %links) {
        if ($key =~ /$q/i) {
            print "\n  • ", $topic->att('r:id'), "\n  • \n";
            $count++;
        }
        # One matching link is enough: we only output the parent category.
        last if ($count == 1);
    }
    print "\n";
    $twig->purge;
    %links = ();    # reset the hash for the next topic
}
    Nan
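
A side note on the match in the handler above: `/$q/i` treats the user's input as a regular expression, so a keyword containing '?' or '.' (both common in URLs) will not match literally. A minimal sketch, assuming `$q` is meant to be a plain keyword rather than a pattern; `link_matches` is a hypothetical helper, not part of the original code:

```perl
use strict;
use warnings;

# Case-insensitive literal substring match.
# \Q...\E quotes regex metacharacters in the user's input.
sub link_matches {
    my ($link, $query) = @_;
    return $link =~ /\Q$query\E/i ? 1 : 0;
}

my $link = 'http://www.urbancinefile.com.au/home/article_view.asp?Article_ID=3801&Section=Reviews';

# Without \Q, the '?' in this query would make the preceding 'p' optional
# and the literal '?' in the link would never be matched.
print link_matches($link, 'article_view.asp?article_id=3801')
    ? "match\n"
    : "no match\n";
```

Inside the handler this would amount to writing `$key =~ /\Q$q\E/i`.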

    Re^3: Why XML not well formed?
    by davorg (Chancellor) on Jun 30, 2005 at 15:54 UTC

      If you are being passed data that contains a raw '&' character that hasn't been converted to '&amp;', then you aren't being passed well-formed XML, and no conforming XML parser will be able to deal with it.

      You should ask your data provider to fix their processes so that they _do_ send you well-formed XML.

      --
      <http://www.dave.org.uk>

      "The first rule of Perl club is you do not talk about Perl club."
      -- Chip Salzenberg
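
      To illustrate the conversion described above, here is a minimal sketch of what the producing side could do before writing an attribute value. `escape_xml_attr` is a hypothetical helper name, and the sketch assumes the value is written inside double quotes:

```perl
use strict;
use warnings;

# Escape the characters that are special inside a double-quoted
# XML attribute value.
sub escape_xml_attr {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;     # must run first, or later entities get re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

my $url = 'http://www.urbancinefile.com.au/home/article_view.asp?Article_ID=3801&Section=Reviews';
print '<link r:resource="', escape_xml_attr($url), '"/>', "\n";
```

      A parser reading the escaped document hands back the attribute value with a plain '&' again, so the comparison against the user's input does not need to change.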

        Dave and guys,

        I finally found the problem: it's not only '&' but also '<' and '"'. I don't know how many of these characters are left, and I'm still looking, as the original data is about 300MB.

        Thanks all for your tolerant help! Nan
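
        For a file of that size, a line-by-line scan may be easier than searching by hand. The sketch below is only a rough aid: it flags a '&' that does not start one of the five predefined entities or a numeric character reference, and it assumes the document declares no custom entities. The helper names are hypothetical. Stray '<' and '"' characters are not handled, because they are also legitimate markup and cannot be told apart by a simple pattern.

```perl
use strict;
use warnings;

# '&' that is NOT followed by a predefined entity name or a
# numeric character reference.
my $bare_amp = qr/&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)/;

# True if the line contains at least one unescaped '&'.
sub has_bare_amp {
    my ($line) = @_;
    return $line =~ $bare_amp ? 1 : 0;
}

# Return a copy of the line with every unescaped '&' turned into '&amp;'.
sub fix_bare_amps {
    my ($line) = @_;
    $line =~ s/$bare_amp/&amp;/g;
    return $line;
}

# Report offending lines of the file named on the command line, if any.
my $file = shift @ARGV;
if (defined $file) {
    open my $in, '<', $file or die "Cannot open $file: $!";
    while (my $line = <$in>) {
        print "line $.: $line" if has_bare_amp($line);
    }
    close $in;
}
```

        Reading one line at a time keeps memory use flat regardless of file size. Repairing the data this way is a workaround; the underlying fix still belongs with whoever produces the file.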
