in reply to XML::Parser - Usage of &
XML::Parser shouldn't be ignoring the "Company A&amp;" part; I think what you'll find is that it treats the title as three pieces of character data: "Company A", "&", and "B Information".
And it will treat these as three separate parse events. Quick demonstration:
    use 5.010;
    use strict;
    use warnings;
    use XML::Parser;

    my $in_title;    # true while we are inside a <Title> element
    my $parser = XML::Parser->new(
        Handlers => {
            Start => sub { $in_title++ if $_[1] eq 'Title' },
            End   => sub { $in_title-- if $_[1] eq 'Title' },
            # Called once per chunk of character data, not once per element
            Char  => sub { say "CHAR: $_[1]" if $in_title },
        },
    );

    $parser->parse(<<'XML');
    <Document>
    <Title>Company A&amp;B Information</Title>
    <Abstract>Foo</Abstract>
    </Document>
    XML
XML::Parser is very bare-bones, and leaves translating those parse events into a useful data structure very much up to you.
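One way to do that translation, as a rough sketch: buffer the character data while inside the element and only use it once the End event arrives. The `extract_titles` helper and the `$buffer` variable are names made up for this illustration, and the sketch assumes `Title` elements are never nested.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper: returns the text of every <Title> element,
# joining the separate Char events back into one string per element.
sub extract_titles {
    my ($xml) = @_;
    my @titles;
    my $buffer;    # defined only while we are inside a <Title>

    my $parser = XML::Parser->new(
        Handlers => {
            Start => sub { $buffer = '' if $_[1] eq 'Title' },
            Char  => sub { $buffer .= $_[1] if defined $buffer },
            End   => sub {
                return unless $_[1] eq 'Title';
                push @titles, $buffer;    # element is complete
                undef $buffer;
            },
        },
    );
    $parser->parse($xml);
    return @titles;
}

my @titles = extract_titles(<<'XML');
<Document>
<Title>Company A&amp;B Information</Title>
<Abstract>Foo</Abstract>
</Document>
XML

print "TITLE: $_\n" for @titles;
```

With the sample document this should print the title as a single string, with the entity already decoded to a plain ampersand.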
Personally I prefer DOM-based XML parsers, such as XML::LibXML, which parse the entire file into a tree and allow you to manipulate and navigate that tree using the same DOM interface which web browsers expose to JavaScript.
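For comparison, here is a minimal sketch of the same lookup with XML::LibXML, assuming the same sample document as above. The `title_text` helper is a name invented for this example; `load_xml`, `getElementsByTagName` and `textContent` are the module's own methods.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: parse the whole document into a tree and return
# the text content of the first <Title> element (undef if there is none).
sub title_text {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);
    my ($title) = $doc->getElementsByTagName('Title');
    return defined $title ? $title->textContent : undef;
}

my $text = title_text(<<'XML');
<Document>
<Title>Company A&amp;B Information</Title>
<Abstract>Foo</Abstract>
</Document>
XML

print "TITLE: $text\n";
```

Because the tree is built before you query it, the title comes back as one string and there is no need to stitch character events together yourself.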