in reply to Re: Help getting text from website using www mechanize
in thread Help getting text from website using www mechanize

I was only guessing, since the message is visible for just a split second. Thanks for the hint; stderr.txt now contains:

Can't locate HTML/TreeBuilder.pm in @INC (@INC contains: C:/strawberry/perl/site/lib C:/strawberry/perl/vendor/lib C:/strawberry/perl/lib .) at C:/strawberry/perl/site/lib/WWW/Mechanize.pm line 662, <STDIN> line 1.

I'm not sure what to do.

Oh wait, I installed HTML::TreeBuilder and the script now runs, but all I'm getting back is a single return character. So much for that.

Does anyone know how to get the text of a web page using WWW::Mechanize?

Re^3: Help getting text from website using www mechanize
by Marshall (Canon) on Jan 28, 2011 at 06:26 UTC
    When I write these web automation things, the first step is to be able to get the HTML of the page I want. You can save the HTML returned by LWP/Mechanize to a file and then open that file in Firefox to check that it matches what you see when you visit the page in the browser. Have you passed this hurdle yet?

    Then the question becomes: how do I get what I want out of this HTML? That is application specific. If it is really easy, I just write a regex. For anything more complicated, HTML::Parser is one option.
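
The two steps above can be sketched roughly as follows. This is only an illustration under assumptions: `fetch_and_save` and `text_of_div` are hypothetical helper names, the sample HTML and the `result` id are made up, and the regex only copes with a simple `div` that contains no nested `div` elements.

```perl
use strict;
use warnings;

# Step 1: fetch a page and keep a copy on disk, so the copy can be opened
# in a browser and compared with what the site normally shows.
# (fetch_and_save is a hypothetical helper name.)
sub fetch_and_save {
    my ($url, $file) = @_;
    require WWW::Mechanize;    # loaded lazily; only the network step needs it
    my $mech = WWW::Mechanize->new( autocheck => 1 );
    $mech->get($url);
    $mech->save_content($file);
    return $mech->content;
}

# Step 2: for a really easy page, a regex can be enough.
# Returns the text inside <div id="$id">, or undef if there is no such div.
# Limitation: stops at the first </div>, so nested divs are not handled.
sub text_of_div {
    my ($html, $id) = @_;
    my ($inner) = $html =~ m{<div\b[^>]*\bid\s*=\s*["']\Q$id\E["'][^>]*>(.*?)</div>}is;
    return undef unless defined $inner;
    $inner =~ s/<[^>]+>//g;       # drop any inline tags
    $inner =~ s/^\s+|\s+$//g;     # trim surrounding whitespace
    return $inner;
}

# Offline demonstration on a made-up snippet of HTML.
my $sample = '<html><body><div id="result"> <b>42</b> EUR </div></body></html>';
print text_of_div($sample, 'result'), "\n";
```

Against a real site the two helpers would be combined, for example `text_of_div( fetch_and_save($url, 'page.html'), 'result' )`. Once the page structure gets more involved, a real parser is the safer choice.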

Re^3: Help getting text from website using www mechanize
by Anonymous Monk on Jan 28, 2011 at 07:55 UTC
    - it works but, all I'm getting is a return character... so much for that.

    $mech->get( $uri )
    NOTE: Because :content_file causes the page contents to be stored in a file instead of the response object, some Mech functions that expect it to be there won't work as expected. Use with caution.
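
A minimal sketch of what that note implies, assuming the script called `get()` with `:content_file`: fetch with a plain `get()` so the body stays in the response object, save a copy afterwards if one is wanted, and read the text from `content()`. `page_text` is a hypothetical helper name; `format => 'text'` relies on HTML::TreeBuilder being installed.

```perl
use strict;
use warnings;

# page_text is a hypothetical helper name, not part of WWW::Mechanize.
sub page_text {
    my ($url, $copy) = @_;
    require WWW::Mechanize;
    my $mech = WWW::Mechanize->new( autocheck => 1 );

    # Plain get(): the body stays in the response object.
    # With get( $url, ':content_file' => $file ) the body would be written
    # to $file instead, and content() would come back empty.
    $mech->get($url);

    # Optional on-disk copy, written after the fetch.
    $mech->save_content($copy) if defined $copy;

    # format => 'text' strips the markup; it needs HTML::TreeBuilder.
    return $mech->content( format => 'text' );
}

# Usage: perl page_text.pl <url>
print page_text( $ARGV[0] ), "\n" if @ARGV;
```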

      Wow, you saw right through that code. Thank you very much, that was it.

      I've got one more question, though: it seems Mechanize can't see text generated by JavaScript. Is there any way I can get it?

      For example, here's a bit of code:

      function calculate() {
          if (date_completate(0)) {
              $('#PolitaAddForm').ajaxSubmit(formoptions);
          } else {
              $('#prima').html("Insufficient data.");
          }
      }

      var formoptions = {
          target: '#prima',
          url: '/rca/ajax/calcul'
      };

      When you first open the form page, the div with the id "prima" shows "Insufficient data". Once you fill in enough of the form, that changes into a number, and that number is what I want to get. There are other ways to get it, but this would be the most straightforward one.
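
WWW::Mechanize does not execute JavaScript, so the div is never filled in on its side. One possible workaround, sketched here under several assumptions, is to send the same request the script sends: submit the fields of `PolitaAddForm` to `/rca/ajax/calcul` and read the response, which is what the page would have placed in `#prima`. `fetch_calcul` is a hypothetical helper name, the field names passed in are whatever the real form uses, and the `X-Requested-With` header is a guess at what the server might expect from an AJAX call.

```perl
use strict;
use warnings;

# fetch_calcul is a hypothetical helper name. It loads the form page,
# fills in the given fields, and submits the form to the AJAX URL taken
# from the page's formoptions instead of the form's own action.
sub fetch_calcul {
    my ($form_url, %fields) = @_;
    require WWW::Mechanize;
    require URI;

    my $mech = WWW::Mechanize->new( autocheck => 1 );
    $mech->get($form_url);

    # Select the form by the id used in the page's JavaScript.
    $mech->form_id('PolitaAddForm')
        or die "Form PolitaAddForm not found on $form_url\n";
    $mech->set_fields(%fields);

    # Point the form at the AJAX endpoint, resolved against the page URL.
    my $form = $mech->current_form;
    $form->action( URI->new_abs( '/rca/ajax/calcul', $mech->uri ) );

    # Assumption: mimic the header browsers' AJAX libraries usually send.
    $mech->add_header( 'X-Requested-With' => 'XMLHttpRequest' );

    # Send the request built from the form; the reply is the fragment
    # that the browser would have inserted into #prima.
    $mech->request( $form->make_request );
    return $mech->content;
}
```

A call would look like `fetch_calcul( $form_page_url, some_field => 'value' )`, with the real field names read from the saved HTML of the form page. If the server checks more than the form fields, this sketch would need adjusting.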

Re^3: Help getting text from website using www mechanize
by Anonymous Monk on Jan 28, 2011 at 07:45 UTC