in reply to Re: WWW::Mechanize::Firefox CSS Selectors
in thread WWW::Mechanize::Firefox CSS Selectors

use WWW::Mechanize::Firefox;

my $firefox = WWW::Mechanize::Firefox->new(
    tab => "current",
);

$firefox->get("http://feeds.cbsnews.com/podcast_eveningnews_video_1?tag=contentMain%3bcontentBody");
$firefox->click({ selector => 'a' });
<>;

This is the entire script, and every time I run it I get the following error:

No elements found for CSS selector 'a' at test.pl line 8

The script doesn't stall, which is the problem you mentioned and the one covered in the troubleshooting section; it just ends after this message. Somehow I can't match any CSS selectors on that page.

Re^3: WWW::Mechanize::Firefox CSS Selectors
by Corion (Patriarch) on Jun 25, 2012 at 11:34 UTC

    Have you printed $firefox->content? The page you gave is RSS (or Atom, or whatever other feed format), and I assume that Firefox only supports CSS selectors for HTML and does not want to run CSS queries against RSS. Replacing your call to ->click with a simple call that returns the elements shows that no elements are found by either ->selector or ->xpath:

    use WWW::Mechanize::Firefox;

    my $firefox = WWW::Mechanize::Firefox->new();

    $firefox->get("http://feeds.cbsnews.com/podcast_eveningnews_video_1?tag=contentMain%3bcontentBody");

    # Show the content type Firefox reports for the feed
    print $firefox->ct;

    # Neither query returns any elements for this document
    print $_->{innerHTML} for $firefox->selector('a');
    print $_->{innerHTML} for $firefox->xpath('//a');

    I'm not sure what goes wrong here, but I assume that Firefox does not really support CSS or XPath queries for documents other than HTML documents. Even if it did, I highly doubt that the surrounding event model would support the ->click event as HTML pages do.

    You could look at Mojolicious, Web::Magic, Web::Scraper or App::scrape, none of which munge the RSS through Firefox and all of which support CSS and XPath queries. Maybe you will have more luck extracting the URLs with those.
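
    As a rough sketch of that route: assuming Mojolicious is installed, and assuming the feed lists its media files as <enclosure url="..."> elements inside each <item> (the usual podcast layout, not verified against this particular feed), Mojo::DOM can run a CSS selector directly over the RSS once it is parsed with XML semantics. The sample feed and the enclosure_urls helper are made up for illustration.

```perl
use strict;
use warnings;
use Mojo::DOM;

# Hypothetical helper: return the url attribute of every <enclosure>
# that is a direct child of an <item> in the given RSS string.
sub enclosure_urls {
    my ($rss) = @_;

    # xml(1) forces XML semantics (case-sensitive, no HTML tag rules)
    my $dom = Mojo::DOM->new->xml(1)->parse($rss);

    # compact drops items whose <enclosure> has no url attribute
    return $dom->find('item > enclosure')
               ->map(attr => 'url')
               ->compact
               ->each;
}

# Made-up sample feed standing in for the real download
my $sample_rss = <<'RSS';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example podcast</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://example.com/ep1.mp4" type="video/mp4"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="http://example.com/ep2.mp4" type="video/mp4"/>
    </item>
  </channel>
</rss>
RSS

print "$_\n" for enclosure_urls($sample_rss);
```

    For the real feed you would fetch the document yourself (for example with Mojo::UserAgent or LWP::UserAgent) and pass the response body to the helper instead of the sample string.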

      OK, thanks for the answer. I'll have a look at the given links. Kind regards

        At least part of the problem is that the RSS contains entity-encoded HTML, which is inaccessible to CSS and XPath queries. I would look at one of the dedicated RSS parsers for extracting the relevant elements.
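
        To illustrate the entity-encoding point, here is a minimal sketch, assuming Mojolicious is installed and assuming the links of interest sit inside each item's <description>. A query against the feed itself sees only a text node there, so the text has to be pulled out (which decodes the entities) and parsed a second time as HTML. The sample feed and the description_links helper are invented for the example.

```perl
use strict;
use warnings;
use Mojo::DOM;

# Hypothetical helper: return the href of every <a> found in the
# entity-encoded HTML carried by each <item>'s <description>.
sub description_links {
    my ($rss) = @_;

    # First pass: parse the feed itself with XML semantics
    my $feed = Mojo::DOM->new->xml(1)->parse($rss);

    my @links;
    for my $desc ($feed->find('item > description')->each) {
        # text returns the decoded string, i.e. the HTML markup;
        # second pass: parse that markup as an HTML fragment
        my $html = Mojo::DOM->new($desc->text);
        push @links, $html->find('a[href]')->map(attr => 'href')->each;
    }
    return @links;
}

# Made-up sample feed with entity-encoded HTML in the description
my $encoded_rss = <<'RSS';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example podcast</title>
    <item>
      <title>Episode 1</title>
      <description>&lt;p&gt;Watch &lt;a href="http://example.com/ep1"&gt;episode 1&lt;/a&gt;&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
RSS

print "$_\n" for description_links($encoded_rss);
```

        Selecting 'description a' on the feed document alone finds nothing, because to the XML parser the encoded markup is plain text. Feeds that wrap the HTML in CDATA sections instead of entities may need slightly different handling.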

        At least the commonly used HTML::TreeBuilder "messes up" the document tree for RSS documents.