Lobatto has asked for the wisdom of the Perl Monks concerning the following question:

Hey there,

I'm currently working on a script that should click on a specified link on a website. For this I am using Perl with the WWW::Mechanize::Firefox module.

I'm trying to use this to click on the link:

use WWW::Mechanize::Firefox;
my $firefox = WWW::Mechanize::Firefox -> new (
    tab => "current",
);
$firefox -> click ({ selector => 'a[href*="2012-06-20"]'});

When I run the script, it does nothing. I then tried to match any CSS selector at all on this page, but no elements are found.

Does anyone know a way to get the CSS selectors working? Kind regards

Update: This is the part of the HTML in which I want to access the link:
<a href="http://feeds.cbsnews.com/~r/podcast_eveningnews_video_1/~5/RLDkKWpoJSo/cbsnews_2012-06-24-202400.default.flv">cbsnews_2012-06-24-202400.default.flv</a>

Replies are listed 'Best First'.
Re: WWW::Mechanize::Firefox CSS Selectors
by Corion (Patriarch) on Jun 25, 2012 at 11:01 UTC

    I can't replicate the problem:

    Q:\repos\WWW-Mechanize-FireFox>perl -Ilib -MWWW::Mechanize::Firefox::DSL -wle "update_html(q{<html><head></head><body><a href="http://example.com">Test</a><a href="http://foobar.example.com">no Test</a></body></html>}); print content; print $_->{innerHTML} for selector q{*[href*='/example.co']}"
    <html><head></head><body><a href="http://example.com">Test</a><a href="http://foobar.example.com">no Test</a></body></html>
    Test

    A somewhat more verbose version than this oneliner would be:

    #!perl -w
    use WWW::Mechanize::Firefox::DSL;
    update_html(q{<html><head></head><body><a href="http://example.com">Test</a><a href="http://foobar.example.com">no Test</a></body></html>});
    highlight_node( selector(q{*[href*='/example.com']}));

    Most likely the HTML you posted is not what your script sees, or your selector is subtly wrong, or something else is going on. It's hard to tell without seeing the relevant Perl code. Also see WWW::Mechanize::Firefox::Troubleshooting for a possible reason why calling ->click on an element may cause your script to stall.

    If that does not address the problem, consider also showing the relevant error message(s) you get.

      use WWW::Mechanize::Firefox;

      my $firefox = WWW::Mechanize::Firefox -> new (
          tab=>"current",
      );

      $firefox -> get("http://feeds.cbsnews.com/podcast_eveningnews_video_1?tag=contentMain%3bcontentBody");
      $firefox -> click ({selector => 'a'});
      <>;

      This is the entire script, and every time I run it I get the following error:

      No elements found for CSS selector 'a' at test.pl line 8

      The script doesn't stall, which is the problem you mentioned and which the troubleshooting document covers; it just ends after this message. Somehow I can't match any CSS selectors on that page.

        Have you printed $mech->content? I assume that Firefox only supports CSS selectors for HTML. The page you gave is RSS (or Atom, or whatever other feed format), and I assume that Firefox does not want to run CSS queries against RSS. Replacing your call to ->click with a simple call that returns the elements shows that no elements are found by either ->selector or ->xpath:

        use WWW::Mechanize::Firefox;
        my $firefox = WWW::Mechanize::Firefox -> new (
        );
        $firefox -> get("http://feeds.cbsnews.com/podcast_eveningnews_video_1?tag=contentMain%3bcontentBody");
        print $firefox->ct;
        print $_->{innerHTML} for $firefox->selector('a');
        print $_->{innerHTML} for $firefox->xpath('//a');

        I'm not sure what goes wrong here, but I assume that Firefox does not really support CSS or XPath queries for documents other than HTML documents. Even if it did, I highly doubt that the surrounding event model would support the ->click event as HTML pages do.

        You could consider looking at Mojolicious, Web::Magic, Web::Scraper or App::scrape. None of them munge the RSS through Firefox, and all of them support CSS and XPath queries, so maybe you will have more luck extracting the URLs with those.
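
As a rough illustration of the Mojolicious route, here is a minimal sketch that runs a substring attribute selector against feed XML using Mojo::DOM, without going through Firefox. The sample feed, its example.com URLs and the assumption that the media links live in the url attribute of enclosure elements are all made up for illustration; the real feed may be structured differently, so print it first and adjust the selector.

```perl
#!perl -w
use strict;
use Mojo::DOM;

# Hypothetical stand-in for the downloaded feed; element and attribute
# names in the real feed may differ.
my $xml = q{<rss version="2.0">
  <channel>
    <item>
      <title>Sample A</title>
      <enclosure url="http://example.com/media/sample_2012-06-24.flv" type="video/x-flv"/>
    </item>
    <item>
      <title>Sample B</title>
      <enclosure url="http://example.com/media/sample_2012-06-20.flv" type="video/x-flv"/>
    </item>
  </channel>
</rss>};

# Parse in XML mode so the feed is not treated as HTML
my $dom = Mojo::DOM->new->xml(1)->parse($xml);

# Same kind of "attribute contains" selector as in the original post,
# applied to the (assumed) enclosure elements instead of <a> links
my @urls = $dom->find('enclosure[url*="2012-06-20"]')->map(attr => 'url')->each;

print "$_\n" for @urls;
```

For the real feed, the XML string would have to be fetched first, for example with Mojo::UserAgent or LWP::UserAgent, instead of being hard-coded.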

Re: WWW::Mechanize::Firefox CSS Selectors
by Anonymous Monk on Jun 25, 2012 at 09:52 UTC