in reply to Re^2: Completely Confused with Mechanize::Firefox Forms
in thread Completely Confused with Mechanize::Firefox Forms

Hi there. I'm not clear on exactly what you want. Do you just want the names of the forms? WWW::Mechanize has a dump_forms method, so you may not need Mechanize::Firefox. Also, if you use the Firefox 'Firebug' extension, you can inspect any HTML element and then use its name or value in the script to automate what you want to do.
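A minimal sketch of what that form listing can look like, assuming HTML::Form is installed (WWW::Mechanize returns HTML::Form objects from its forms method, and dump_forms prints each one's dump). The page markup and the intranet.example address are made up for illustration; against a live site you would call $mech->get($url) and then $mech->forms or $mech->dump_forms rather than parsing a string.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical page standing in for an internal site.
my $html = q{
<form name="loginform" method="post" action="/login">
  <input type="text" name="email">
  <input type="password" name="password">
  <input type="submit" name="login" value="Log in">
</form>
<form name="search" action="/search">
  <input type="text" name="q">
</form>
};

# With WWW::Mechanize, $mech->forms after $mech->get($url) gives
# the same kind of HTML::Form objects as this parse call.
my @forms = HTML::Form->parse( $html, 'http://intranet.example/' );

# One-line summary of a form: name, method, action and field names.
sub describe_form {
    my ($form) = @_;
    my $name   = $form->attr('name') // '(unnamed)';
    my @fields = grep { defined $_ } map { $_->name } $form->inputs;
    return sprintf '%s [%s %s] fields: %s',
        $name, $form->method, $form->action, join( ', ', @fields );
}

print describe_form($_), "\n" for @forms;

# Full debugging dump of each form, which is what dump_forms prints.
print $_->dump for @forms;
```

The summaries could be shown to the user as a pick list, which avoids opening Firebug for simple pages.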

Re^4: Completely Confused with Mechanize::Firefox Forms
by help_3452 (Initiate) on Jul 12, 2013 at 23:45 UTC

    OK, cool, I'll give that a go. My whole idea, though, is to avoid having to use Firebug. I've been making scripts by hand for a little while and I wanted to take the time to make something a little more intelligent. I do understand I can only cover relatively simple scenarios, but I want to take more of the legwork out of setting up a scraper. Cheers all the same.

      I would also consider using the Web::Scraper module. I haven't used it much, but it looks pretty good.
Re^4: Completely Confused with Mechanize::Firefox Forms
by help_3452 (Initiate) on Jul 12, 2013 at 18:26 UTC

    Ah, OK! Fair enough.

    What I'm trying to do is make a simplified 'web scraper'. Frequently within our org I need to collect data from various systems (say, checking the names on the internal colleague register).

    But we have many systems and some use JavaScript. Because of this I was leaning towards Mech::Firefox, since it handles that for you.

    I envisaged putting in the internal web address, then receiving back a set of links, forms, etc. from which the user could then select. I would save the choices in a config file and so allow the user to avoid repetitive checks.

    So I'm a bit confused by the answer, as it suggests using HTML::Form but the documentation states it doesn't return this type of object. I'd like to understand it properly as I intend to expand the code substantially.

      Hi help_3452, there seem to be several ways of going about what you need, which makes it hard for me to tell you what to do. So I will make a few suggestions based on what I understand:
      - I would still consider trying to use just the plain WWW::Mechanize for 'collecting the data'.
      - I have bypassed JavaScript before with that module. I would also suggest getting the 'Live HTTP Headers' add-on for Firefox. This will help you bypass some of the JavaScript by showing the HTTP GET/POST requests that occur as you navigate the site.
      - I think you're making this harder than it may need to be by allowing the user to select the particulars of the forms. If this feature is a 'must have' for you, I would get the form names, let the user select which forms they intend to submit, have them enter the required input, and pass that to the submit_form method that Mechanize has. Each input could be a 'field' within the 'form', maybe. Here is an example of what I used to log in with Mechanize::Firefox:
      $mech->form_name('loginform');
      $mech->field('email' => 'me@awesome.com');
      $mech->field('password' => 'l337');
      $mech->click_button(name => 'login');
      The WWW::Mechanize module is basically the same in this regard. For each of those methods I just used Firebug to inspect the relevant HTML element and then coded its name into the script.
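
The 'save the choice, then submit' idea could be sketched roughly as below. This is only an illustration under assumptions: the page markup, the intranet.example address, the %choice layout and the build_request helper are all hypothetical, and it uses HTML::Form offline to build the request instead of sending anything. The field names mirror the login snippet above.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical login page; field names mirror the snippet above.
my $html = q{
<form name="loginform" method="post" action="/login">
  <input type="text" name="email">
  <input type="password" name="password">
  <input type="submit" name="login" value="Log in">
</form>
};

# Hypothetical saved choice, e.g. loaded from the user's config file.
my %choice = (
    form_name => 'loginform',
    fields    => { email => 'me@awesome.com', password => 'l337' },
    button    => 'login',
);

# Fill the chosen form and return the HTTP::Request a click would send.
sub build_request {
    my ( $forms, $choice ) = @_;
    my ($form) =
        grep { ( $_->attr('name') // '' ) eq $choice->{form_name} } @$forms;
    die "no form named $choice->{form_name}\n" unless $form;
    $form->value( $_ => $choice->{fields}{$_} )
        for keys %{ $choice->{fields} };
    return $form->click( $choice->{button} );
}

my @forms   = HTML::Form->parse( $html, 'http://intranet.example/' );
my $request = build_request( \@forms, \%choice );
print $request->method, ' ', $request->uri, "\n", $request->content, "\n";

# Against a live page, the equivalent WWW::Mechanize call would be:
#   $mech->submit_form(
#       form_name => $choice{form_name},
#       fields    => $choice{fields},
#       button    => $choice{button},
#   );
```

Printing the request is also a cheap way to compare against what Live HTTP Headers shows for the same action in the browser.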