Atom Feed for Chatterbox 60

by tobyink (Canon)
on Oct 14, 2012 at 10:04 UTC ( [id://998947]=CUFP )

#!/usr/bin/env perl

use HTML::HTML5::Parser;
use Object::Tap -package => 'XML::Atom::Base';
use XML::Atom::Feed;
use XML::Atom::Entry;
use XML::Atom::Person;
use XML::LibXML 2.00;

my $list = HTML::HTML5::Parser
    -> new
    -> parse_file('http://mini-cb60.datenzoo.de/', { ignore_http_response_code => 1 })
    -> getElementsByTagName('dl')
    -> get_node(1);

my $feed = XML::Atom::Feed->new;
$feed->title('PerlMonks Chatterbox');
$feed->id('tag:buzzword.org.uk,2012:perlmonks:chatterbox');

my $dt;
foreach my $node ($list->getChildrenByTagName('*'))
{
    # Remember the most recent <dt>; the element that follows it is the message.
    ($dt = $node) && next if $node->tagName eq 'dt';

    $feed->add_entry(
        XML::Atom::Entry->new->tap(
            title     => [ $node->textContent ],
            id        => [ join ':', $feed->id, $dt->{id} ],
            author    => [
                XML::Atom::Person->new->tap(
                    name => [ $dt->getElementsByTagName('a')->get_node(1)->textContent ],
                    uri  => [
                        'http://www.perlmonks.org/'.
                        $dt->getElementsByTagName('a')->get_node(1)->{href}
                    ],
                )
            ],
            published => [
                $dt->getElementsByTagName('small')->get_node(1)->textContent,
            ],
        )
    );
}

print $feed->as_xml;

Fixing up the datetimes to meet the Atom 1.0 spec is left as an exercise for the reader.
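
One possible starting point for that exercise, as a rough sketch only: Atom 1.0 date constructs must follow RFC 3339 (for example 2012-10-14T10:04:00Z). The timestamp format that the chatterbox page actually emits is not shown in this thread, so the code below simply assumes 'YYYY-MM-DD HH:MM:SS' in UTC; the helper name atom_datetime is made up for illustration, and the strptime pattern would need adjusting to the real markup.

```perl
use strict;
use warnings;
use Time::Piece;    # core module since Perl 5.10

# Hypothetical helper, not part of the original script.
# ASSUMPTION: the input looks like 'YYYY-MM-DD HH:MM:SS' and is already UTC.
sub atom_datetime {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;    # trim whitespace around the element text

    # strptime dies on input it cannot parse, so trap that and return undef.
    my $t = eval { Time::Piece->strptime($text, '%Y-%m-%d %H:%M:%S') };
    return undef unless $t;

    # datetime() yields YYYY-MM-DDTHH:MM:SS; append 'Z' to mark UTC.
    return $t->datetime . 'Z';
}

print atom_datetime('2012-10-14 10:04:00'), "\n";    # prints 2012-10-14T10:04:00Z
```

In the script above, the textContent of the small element would be passed through such a helper before being handed to published. If the page reports local time rather than UTC, a fixed 'Z' suffix would be wrong and the offset would have to be worked out first.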

perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'

Re: Atom Feed for Chatterbox 60
by mje (Curate) on Oct 15, 2012 at 13:16 UTC

    Thanks tobyink. I had to make a slight change to get it working behind a proxy server:

    use LWP::UserAgent;

    # Pick up proxy settings from the environment (http_proxy etc.)
    my $ua = LWP::UserAgent->new;
    $ua->env_proxy;

    my $list = HTML::HTML5::Parser
        -> new
        -> parse_file('http://mini-cb60.datenzoo.de/',
                      { ignore_http_response_code => 1, user_agent => $ua })
        -> getElementsByTagName('dl')
        -> get_node(1);

    Also, HTML::HTML5::Parser failed its tests for me. When I looked at the test results I saw that other people and smokers had hit this too, so I reported it ("07ua.t fails").
