http://qs1969.pair.com?node_id=996419

hesco has asked for the wisdom of the Perl Monks concerning the following question:

This script is intended to parse the Membership Management page in the Mailman administrative interface in order to harvest the name and email address of each subscriber.

Using HTML::TableExtract in text mode (with the tree import commented out) gives me access to the email address of each subscriber. But as the sample at the bottom of the script shows, I am unable to extract the name from the HTML form input tag, where it exists as the default value for the text box. I have found no documentation on how to extract the raw HTML so I can parse it myself. Uncommenting the import on line 4 gives me access to objects which presumably include that data, but which so far seem impenetrable.

Can anyone please advise how I can get unstuck on this project?

#!/usr/bin/env perl
use strict;
use warnings;
use HTML::TableExtract; # qw(tree);
use HTML::ElementTable;
use Data::Dumper;
use FindBin;
use File::Util;

# This script is intended to parse the Membership Management page
# in the mailman administrative interface in order to harvest
# the name and email address of each subscriber.

my($f) = File::Util->new();
my (@html_files) = $f->list_dir("$FindBin::Bin",'--files-only','--pattern=05\.html');
foreach my $html_file ( @html_files ){
    my $html;
    open( 'HTML', '<', $html_file ) or die "Unable to open $html_file\n";
    while(<HTML>){
        $html .= $_;
    }
    close(HTML);
    parse_subscriber_list( $html );
}

sub parse_subscriber_list {
    my $html = shift;
    my $te = HTML::TableExtract->new(
        headers => [ 'unsub', 'member', 'mod', 'hide', 'nomail', 'ack',
                     'not metoo', 'nodupes', 'digest', 'plain', 'language' ]
    );
    my $row_count;
    $te->parse($html);
    foreach my $ts ($te->tables){
        foreach my $row ($ts->rows){
            $row_count++;
            # chomp( @{$row} );
            print "name: email: $row->[1] \n";
        }
    }
}

exit;

__DATA__
<td><a href="http://lists.example.net/options.cgi/updates-example.net/hesco--at--example.net">hesco@example.net</a><br><input name="hesco%40example.net_realname" type="TEXT" value="Hugh Esco" size="33"><input name="user" type="HIDDEN" value="hesco%40example.net"></td>
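One possible way to get at the raw HTML while staying in text mode is the keep_html option of HTML::TableExtract, which, as the module's documentation describes it, returns each cell's markup rather than only its visible text. The sketch below is a minimal illustration under stated assumptions, not necessarily the final solution referred to in this thread. The helper name extract_subscribers, the cut-down two-column sample table, and the regular expressions are all made up for the example, and the name pattern assumes that the name attribute comes before value, as it does in the sample cell above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use HTML::TableExtract;
use HTML::Entities qw(decode_entities);

# Hypothetical sample: a cut-down table built around the cell from the question.
my $sample_html = <<'HTML';
<table>
<tr><th>unsub</th><th>member address</th></tr>
<tr>
<td><input name="hesco%40example.net_unsub" type="CHECKBOX" value="off"></td>
<td><a href="http://lists.example.net/options.cgi/updates-example.net/hesco--at--example.net">hesco@example.net</a><br><input name="hesco%40example.net_realname" type="TEXT" value="Hugh Esco" size="33"><input name="user" type="HIDDEN" value="hesco%40example.net"></td>
</tr>
</table>
HTML

# Hypothetical helper: returns a list of { name => ..., email => ... } hashes.
sub extract_subscribers {
    my ($html) = @_;

    my $te = HTML::TableExtract->new(
        headers   => [ 'unsub', 'member' ],
        keep_html => 1,    # keep each cell's raw HTML instead of its text
    );
    $te->parse($html);

    my @subscribers;
    foreach my $ts ( $te->tables ) {
        foreach my $row ( $ts->rows ) {
            my $cell = $row->[1];    # the 'member' column
            next unless defined $cell;

            # The email address is the link text.
            my ($email) = $cell =~ m{<a\b[^>]*>([^<]*)</a>}i;

            # The name is the value of the "..._realname" text input.
            # Assumes name="..." appears before value="..." in the tag.
            my ($name) = $cell
                =~ m{<input\b[^>]*\bname="[^"]*_realname"[^>]*\bvalue="([^"]*)"}i;

            next unless defined $email;
            push @subscribers, {
                email => decode_entities($email),
                name  => defined $name ? decode_entities($name) : '',
            };
        }
    }
    return @subscribers;
}

foreach my $s ( extract_subscribers($sample_html) ) {
    print "name: $s->{name} email: $s->{email}\n";
}
```

Matching HTML with regular expressions is brittle, so this only suits markup as regular as the Mailman page appears to be. If the attribute order or quoting varies, the tree mode objects or a separate HTML parser would be the safer route.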

Please see the comment below for the final solution.

Thanks,

-- Hugh Esco

if( $lal && $lol ) { $life++; }
if( $insurance->rationing() ) { $people->die(); }
Vote Jill Stein on November 6th!