in reply to Re^3: Completely Confused with Mechanize::Firefox Forms
in thread Completely Confused with Mechanize::Firefox Forms

Ah, OK! Fair enough.

What I'm trying to do is make a simplified 'web scraper'. Within our organisation I frequently need to collect data from various internal systems (say, checking the names on the internal colleague register).

But we have many systems, and some of them use JavaScript. Because of this I was leaning towards WWW::Mechanize::Firefox, since it handles the JavaScript for you.

I envisaged entering the internal web address and getting back a set of links, forms, etc. from which the user could then select. I would save the choices in a config file, so the user can avoid repeating the same checks by hand.

So I'm a bit confused by the answer, as it suggests using HTML::Form, but the documentation states that Mechanize::Firefox doesn't return this type of object. I'd like to understand it properly, as I intend to expand the code substantially.

Re^5: Completely Confused with Mechanize::Firefox Forms
by PerlSufi (Friar) on Jul 12, 2013 at 19:43 UTC
    Hi help_3452, there seem to be several ways of going about what you need, which makes it hard for me to tell you what to do. So I will make a few suggestions based on what I understand:
    - I would still consider trying plain WWW::Mechanize for 'collecting the data'.
    - I have bypassed JavaScript before with that module. I would also suggest getting the 'Live HTTP Headers' add-on for Firefox. It helps you bypass some of the JavaScript by showing the HTTP GET/POST requests that occur as you navigate the site.
    - I think you're making this harder than it needs to be by allowing the user to select the particulars of the forms. If this feature is a 'must have' for you, I would get the form names, let the user select which forms they intend to 'submit', have them enter the required input, and pass that to the submit_form method that Mechanize has. Each input could be a 'field' within the 'form', maybe.
    Here is an example of what I used to log in with Mechanize::Firefox:
    $mech->form_name('loginform');
    $mech->field('email' => 'me@awesome.com');
    $mech->field('password' => 'l337');
    $mech->click_button(name => 'login');
    The WWW::Mechanize module is basically the same in this regard. For each of those method calls I just used Firebug to inspect the relevant HTML element and then coded its name into the script.
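
    As a rough illustration of the 'get the form names, let the user pick, then submit' idea, here is a minimal sketch that uses HTML::Form directly, which is the class plain WWW::Mechanize returns from its forms() method. The HTML snippet, the base URL and the %choice hash (standing in for a saved config entry) are all made up for the example. It builds the request but does not send it, so it runs without any network access.

```perl
use strict;
use warnings;
use HTML::Form;

# Stand-in for a page fetched from an internal system (hypothetical markup).
my $html = <<'HTML';
<html><body>
<form name="loginform" method="post" action="/login">
  <input type="text" name="email">
  <input type="password" name="password">
  <input type="submit" name="login" value="Log in">
</form>
<form name="search" method="get" action="/search">
  <input type="text" name="q">
</form>
</body></html>
HTML

# The base URL is used to resolve each form's action; this one is invented.
my @forms = HTML::Form->parse($html, 'http://intranet.example/');

# Step 1: show the user which forms and inputs exist on the page.
for my $form (@forms) {
    my $name = $form->attr('name') // '(unnamed)';
    print "form: $name\n";
    for my $input ($form->inputs) {
        printf "  %-8s %s\n", $input->type, $input->name // '(no name)';
    }
}

# Step 2: apply a saved choice (hypothetical structure, e.g. read from a config file).
my %choice = (
    form   => 'loginform',
    fields => { email => 'me@awesome.com', password => 'l337' },
    button => 'login',
);

my ($picked) = grep { ($_->attr('name') // '') eq $choice{form} } @forms;
die "no form named '$choice{form}' on this page\n" unless $picked;

# Fill in each chosen field on the selected form.
$picked->value($_ => $choice{fields}{$_}) for sort keys %{ $choice{fields} };

# click() returns the HTTP::Request that pressing the button would produce;
# nothing is sent here.
my $request = $picked->click($choice{button});
print $request->method, ' ', $request->uri, "\n";
print $request->content, "\n";
```

    With a live page, the same saved choice could be handed to plain WWW::Mechanize as $mech->submit_form(form_name => $choice{form}, fields => $choice{fields}, button => $choice{button}) after $mech->get($url).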