
<html><head>
<title>Data</title>
</head>
<form name="action">
<table cellspacing="1" cellpadding="1">
<tr>
  <td>File name</td>
  <td>From To</td>
  <td>Svc</td>
  <td>Period</td>
  <td>Seq#</td>
  <td>Reason for Rejection</td>
  <td>Next Action</td>
</tr>
<tr>
  <td>Sample.XML</td>
  <td><a href='dch_redirect?refer=10293377&ref=10293377'>USA-IND</a></td>
  <td>Voice</td>
  <td><input type="hidden" name="recid1" value="10293377">2009/08</td>
  <td>03386</td>
  <td>data already exists</td>
  <td><input type="checkbox" name="c_1"></td>
</tr>
</body></html>

Above is the sample HTML file. I want to extract the value of the hidden input named recid1. Alternatively, is there a way to get the 'href' values from the HTML content?

Re^3: Extract hidden values from HTML
by wfsp (Abbot) on Dec 15, 2009 at 11:40 UTC
    One way:

    #!/usr/bin/perl
    use warnings;
    use strict;

    use HTML::TreeBuilder;

    my $t = HTML::TreeBuilder->new_from_file(*DATA);

    my @hidden_inputs = $t->look_down(
        _tag => q{input},
        type => q{hidden},
        name => q{recid1},
    );

    for my $hidden_input (@hidden_inputs){
        printf qq{*%s*\n}, $hidden_input->attr(q{value});
    }

    __DATA__
    <html><head>
    <title>Data</title>
    </head>
    <form name="action">
    <table cellspacing="1" cellpadding="1">
    <tr>
      <td>File name</td>
      <td>From To</td>
      <td>Svc</td>
      <td>Period</td>
      <td>Seq#</td>
      <td>Reason for Rejection</td>
      <td>Next Action</td>
    </tr>
    <tr>
      <td>Sample.XML</td>
      <td><a href='dch_redirect?refer=10293377&ref=10293377'>USA-IND</a></td>
      <td>Voice</td>
      <td><input type="hidden" name="recid1" value="10293377">2009/08</td>
      <td>03386</td>
      <td>data already exists</td>
      <td><input type="checkbox" name="c_1"></td>
    </tr>
    </body></html>

    Output:
    *10293377*
    Update: a similar look_down call, matching on the a tag and reading the href attribute, would also work for the hrefs.
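
    For the hrefs, here is a minimal sketch along the same lines. It assumes HTML::TreeBuilder is installed and that you have the page in a string; the sub name extract_hrefs and the trimmed-down sample markup are made up for illustration and are not part of the original reply.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use HTML::TreeBuilder;

# Hypothetical helper: return the href of every <a> element in an HTML string.
sub extract_hrefs {
    my ($html) = @_;

    my $tree = HTML::TreeBuilder->new_from_content($html);

    # look_down accepts attribute/value pairs plus code refs;
    # the code ref here skips anchors that have no href at all.
    my @links = $tree->look_down(
        _tag => q{a},
        sub { defined $_[0]->attr(q{href}) },
    );

    my @hrefs = map { $_->attr(q{href}) } @links;

    $tree->delete;    # free the parse tree explicitly
    return @hrefs;
}

# Cut-down version of the sample row from the question.
my $html = <<'HTML';
<table><tr>
<td><a href='dch_redirect?refer=10293377&ref=10293377'>USA-IND</a></td>
</tr></table>
HTML

printf qq{*%s*\n}, $_ for extract_hrefs($html);
```

    If you only care about links, HTML::LinkExtor or HTML::TokeParser might be lighter options, but sticking with look_down keeps both extractions in one pass over the same tree.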