DanielSpaniel has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I've been playing with the new Bing API, for web results, which returns XML in the following format (apologies for the length, but it won't make sense otherwise):
<feed>
  <title type="text">xbox</title>
  <subtitle type="text">Bing Web Search</subtitle>
  <id>https://api.datamarket.azure.com/Data.ashx/Bing/SearchWeb/v1/Web?Query='xbox'&Market='en-GB'&$top=3</id>
  <rights type="text"/>
  <updated>2013-03-13T19:01:31Z</updated>
  <link rel="next" href="https://api.datamarket.azure.com/Data.ashx/Bing/SearchWeb/v1/Web?Query='xbox'&Market='en-GB'&$skip=3&$top=3"/>
  <entry>
    <id>https://api.datamarket.azure.com/Data.ashx/Bing/SearchWeb/v1/Web?Query='xbox'&Market='en-GB'&$skip=0&$top=1</id>
    <title type="text">WebResult</title>
    <updated>2013-03-13T19:01:31Z</updated>
    <content type="application/xml">
      <m:properties>
        <d:ID m:type="Edm.Guid">13bfd262-8460-4487-827f-465643cb7</d:ID>
        <d:Title m:type="Edm.String">Xbox 360 - Xbox.com</d:Title>
        <d:Description m:type="Edm.String">Your ultimate Xbox 360 ....</d:Description>
        <d:DisplayUrl m:type="Edm.String">www.xbox.com</d:DisplayUrl>
        <d:Url m:type="Edm.String">http://www.xbox.com/</d:Url>
      </m:properties>
    </content>
  </entry>
  <entry>
    <id>https://api.datamarket.azure.com/Data.ashx/Bing/SearchWeb/v1/Web?Query='xbox'&Market='en-GB'&$skip=1&$top=1</id>
    <title type="text">WebResult</title>
    <updated>2013-03-13T19:01:31Z</updated>
    <content type="application/xml">
      <m:properties>
        <d:ID m:type="Edm.Guid">daf94bdf-e59b-4e17-8c06-62a8b4ff8</d:ID>
        <d:Title m:type="Edm.String">Xbox UK Home</d:Title>
        <d:Description m:type="Edm.String">For UK Xbox gamers ...</d:Description>
        <d:DisplayUrl m:type="Edm.String">www.xbox.com/GB</d:DisplayUrl>
        <d:Url m:type="Edm.String">http://www.xbox.com/GB</d:Url>
      </m:properties>
    </content>
  </entry>
  <entry>
    ... etc, etc, etc .....
  </entry>
</feed>

So, I'm trying to get the values of d:Title, d:Description, and d:Url, but seem to be having problems. I'm using XML::Simple, which I'm slightly more familiar with than anything else - but far from proficient.

So, I've got the data, like this:

$data = $xml->XMLin($bingdata, ForceArray => 1);

... and then try to process each <entry>, as such:

foreach my $record (@{$data->{'feed:entry'}->[0]->{'content:m:properties'}})

... but it just skips the loop, because presumably I don't have the loop set up correctly.

I was hoping I could just loop through and do something like:

my $title=$record->{'d:Title'}->[0];, etc.

Very confused ... Any help would be much appreciated!

Thanks.

Re: Parsing XML
by runrig (Abbot) on Mar 13, 2013 at 20:20 UTC
    First, that's not valid XML: the &'s are not entity-encoded and the namespaces are not declared. I assume you really do have valid XML, though, otherwise XML::Simple could not parse it. Whatever you get back from XML::Simple, you can use Data::Dumper to see the structure of the data it returns. Or you can process only the things you want, the way you want, with something like XML::Rules:
    use XML::Rules;

    my @rules = (
        'm:properties' => sub {
            my $p = $_[1];
            print "Title : $p->{'d:Title'}\n";
            print "Description: $p->{'d:Description'}\n";
        },
        _default => 'content',
    );

    my $xr = XML::Rules->new(
        rules => \@rules,
    );
    $xr->parse($xml);
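
    If you would rather stay with XML::Simple, dumping the structure with Data::Dumper points to a loop along the following lines. This is only a sketch under assumptions: the sample is a trimmed, well-formed copy of the posted feed, the namespace URIs are the usual Atom and OData ones rather than anything shown in the post, and KeyAttr => [] is added here to switch off array folding. XMLin drops the root <feed> element by default, so the entries should sit under $data->{entry} rather than 'feed:entry', and an element that has both an attribute and text keeps its text under the 'content' key.

```perl
use strict;
use warnings;
use XML::Simple;
use Data::Dumper;

# Trimmed sample of the feed; the namespace URIs are assumed, not taken from the post.
my $bingdata = <<'XML';
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <title type="text">xbox</title>
  <entry>
    <title type="text">WebResult</title>
    <content type="application/xml">
      <m:properties>
        <d:Title m:type="Edm.String">Xbox 360 - Xbox.com</d:Title>
        <d:Description m:type="Edm.String">Your ultimate Xbox 360 ....</d:Description>
        <d:Url m:type="Edm.String">http://www.xbox.com/</d:Url>
      </m:properties>
    </content>
  </entry>
  <entry>
    <title type="text">WebResult</title>
    <content type="application/xml">
      <m:properties>
        <d:Title m:type="Edm.String">Xbox UK Home</d:Title>
        <d:Description m:type="Edm.String">For UK Xbox gamers ...</d:Description>
        <d:Url m:type="Edm.String">http://www.xbox.com/GB</d:Url>
      </m:properties>
    </content>
  </entry>
</feed>
XML

my $xs   = XML::Simple->new;
my $data = $xs->XMLin($bingdata, ForceArray => 1, KeyAttr => []);

# Inspect one entry to see how the nesting really looks.
print Dumper($data->{entry}[0]{content});

my @results;
for my $entry (@{ $data->{entry} || [] }) {
    # ForceArray => 1 wraps every child element in an array reference.
    my $props = $entry->{content}[0]{'m:properties'}[0];
    push @results, {
        title       => $props->{'d:Title'}[0]{content},
        description => $props->{'d:Description'}[0]{content},
        url         => $props->{'d:Url'}[0]{content},
    };
}

printf "%s (%s)\n", $_->{title}, $_->{url} for @results;
```

    The long chain of [0] subscripts is the price of ForceArray => 1; restricting ForceArray to ['entry'] would shorten it, at the cost of a structure that changes shape when an element repeats.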
      Thank you runrig!
Re: Parsing XML
by tobyink (Canon) on Mar 13, 2013 at 20:42 UTC

    This seems to be an Atom feed plus a few extensions. XML::Atom might bring you joy.

    Update: OK, here's a quick example that pulls out the search results. (I needed first to fix the unescaped ampersands and add the missing namespace declarations.)
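
    A minimal sketch of that kind of extraction follows. It is not necessarily the code originally posted: it assumes XML::LibXML with XPath rather than XML::Atom, uses a trimmed sample with the ampersands escaped, and assumes the usual Atom and OData namespace URIs, which the post does not show. The prefixes registered on the XPath context (atom, m, d) are local names chosen for this example.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Trimmed sample: ampersands escaped, namespaces declared (URIs are assumed).
my $xml = <<'XML';
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <title type="text">xbox</title>
  <entry>
    <id>https://api.datamarket.azure.com/Data.ashx/Bing/SearchWeb/v1/Web?Query='xbox'&amp;Market='en-GB'&amp;$skip=0&amp;$top=1</id>
    <content type="application/xml">
      <m:properties>
        <d:Title m:type="Edm.String">Xbox 360 - Xbox.com</d:Title>
        <d:Description m:type="Edm.String">Your ultimate Xbox 360 ....</d:Description>
        <d:Url m:type="Edm.String">http://www.xbox.com/</d:Url>
      </m:properties>
    </content>
  </entry>
  <entry>
    <content type="application/xml">
      <m:properties>
        <d:Title m:type="Edm.String">Xbox UK Home</d:Title>
        <d:Description m:type="Edm.String">For UK Xbox gamers ...</d:Description>
        <d:Url m:type="Edm.String">http://www.xbox.com/GB</d:Url>
      </m:properties>
    </content>
  </entry>
</feed>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# XPath needs an explicit prefix even for the default (Atom) namespace.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(atom => 'http://www.w3.org/2005/Atom');
$xpc->registerNs(m    => 'http://schemas.microsoft.com/ado/2007/08/dataservices/metadata');
$xpc->registerNs(d    => 'http://schemas.microsoft.com/ado/2007/08/dataservices');

my @hits;
for my $props ($xpc->findnodes('/atom:feed/atom:entry/atom:content/m:properties')) {
    # Each findvalue is evaluated relative to the current m:properties node.
    push @hits, {
        title       => $xpc->findvalue('d:Title',       $props),
        description => $xpc->findvalue('d:Description', $props),
        url         => $xpc->findvalue('d:Url',         $props),
    };
}

printf "%s\n  %s\n  %s\n", @{$_}{qw(title url description)} for @hits;
```

    Because the lookup goes by namespace URI rather than by prefix, this keeps working if the feed ever declares the same namespaces under different prefixes.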


      Hey, thanks very much tobyink.

      I realize now that the data I posted was simply what was being displayed in the browser when viewing the file; when I viewed the source I could see that the namespace declarations were there, and ampersands were also in their correct form.

      I'm sure I can figure out what to do now.

      Thank you again!