scrape a webpage

backyardbill has asked for the wisdom of the Perl Monks concerning the following question:


# ============ this is the HTML snippet from Firebug ============

<td class="fex_standardblack_font_small" bgcolor="white">
<input id="chkEquipment" name="chkEquipment" value="72" type="checkbox">
  4WD/AWD
</td>


# ====== this is the Perl code snippet trying to access/extract the data

    print "\n  $node\n";
    my $B =  $tree->findvalue( $node );
    $B = trim($B);
    print "\n  $B \n";

# ============ this is what prints ================

  /html/body/div/div[4]/div[2]/div[2]/form/div[3]/div/table[2]/tbody/tr/td[3]/div/table/tbody/tr[1]

  4WD/AWD

# ============ this is my question ================
#
#    how do I access/extract the 'value="72"' data?

Re: scrape a webpage by marto (Cardinal) on May 22, 2017 at 04:49 UTC
You haven't specified what you're using to parse the HTML. Here is an example using Mojo::DOM:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Mojo::DOM;

    my $html = '<td class="fex_standardblack_font_small" bgcolor="white">
    <input id="chkEquipment" name="chkEquipment" value="72" type="checkbox">
      4WD/AWD
    </td>';

    my $dom = Mojo::DOM->new( $html );
    print $dom->at('#chkEquipment')->val;

The module is well documented. If this isn't what you're looking for, you may need to ask a better question; see How do I post a question effectively?
Re^2: scrape a webpage by Anonymous Monk on May 22, 2017 at 13:00 UTC
I'm using:

    use WWW::Mechanize::Firefox qw();
    use HTML::TreeBuilder::LibXML qw();
Re^3: scrape a webpage by marto (Cardinal) on May 22, 2017 at 13:29 UTC
Look at `$mech->value();` from the WWW::Mechanize::Firefox documentation. You could have saved time by specifying these modules in your question.
Re^4: scrape a webpage by backyardbill (Initiate) on May 27, 2017 at 21:50 UTC
Re^5: scrape a webpage by Corion (Patriarch) on May 28, 2017 at 06:55 UTC
Re: scrape a webpage by choroba (Cardinal) on May 22, 2017 at 07:45 UTC
Probably just append `/td/input[@id="chkEquipment"]/@value` to the XPath expression, if what you used is an XPath expression.
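
A minimal sketch of that suggestion, assuming the page is parsed with HTML::TreeBuilder::LibXML (the module named earlier in the thread). The helper name `checkbox_value` and the wrapping `<table><tr>` markup are made up for illustration, and the shorter `//input[...]` path stands in for the full Firebug path, which is only valid against the live page. The point is that a trailing `/@value` step selects the attribute rather than the element's text:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder::LibXML;

# Hypothetical helper (not from the thread): return the "value" attribute
# of the <input> with the given id, or an empty string if nothing matches.
sub checkbox_value {
    my ($html, $id) = @_;

    my $tree = HTML::TreeBuilder::LibXML->new;
    $tree->parse($html);
    $tree->eof;

    # The final /@value step selects the attribute node;
    # findvalue() returns its string value.
    return $tree->findvalue(qq{//input[\@id="${id}"]/\@value});
}

# The snippet from the question, wrapped in a table so the fragment
# parses as ordinary HTML (an assumption about the surrounding page).
my $html = <<'HTML';
<table><tr>
<td class="fex_standardblack_font_small" bgcolor="white">
<input id="chkEquipment" name="chkEquipment" value="72" type="checkbox">
  4WD/AWD
</td>
</tr></table>
HTML

print checkbox_value($html, 'chkEquipment'), "\n";
```

In the original code, the same idea would mean calling `$tree->findvalue()` with the printed `$node` path extended by `/td/input[@id="chkEquipment"]/@value`, provided that path really does match the row containing the checkbox.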