I have found some of the most interesting links, both internal and external, from visiting homenodes. The problem is that by visiting random homenodes, you are extremely likely to end up on one with little to no content.

I asked around in the CB and tried Super Search, but didn't find anything that did exactly what I wanted. The closest was Random NonHome Nodes by blakem, which, contrary to the title, would let you surf to random homenodes with a minimum XP. The trouble is that it is b0rk ATM (at least for me). Even if it were working, XP is no guarantee that a monk has put something on their homenode. In the CB, atcroft mentioned parsing the PM Stats.

Here were my criteria for including a homenode:

  • Homenode length > 500 and XP > 200
    XP is a measure of participation. That participation comes in many forms (logging in every day, voting, posting, etc.), which tells me I am more likely to find the content I am looking for.
  • Homenode length > 500 and account created > 1 yr ago and last here < 45 days ago
    While participation (XP) is a good indicator of quality content, not all monks are as obsessed with The Monastery as I am. Some monks have been around for a while, chosen to take the 17th seriously, but only visit occasionally.
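
The two criteria boil down to a single yes/no test per monk. Here is a minimal sketch of that test on its own, away from the scraping. The hash keys (`days_old`, `days_absent`) and the sample monks are made up for illustration, and the comparisons are strict as written in the list, whereas the posted script uses inclusive ones.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Thresholds taken from the criteria list above.
my %limit = ( length => 500, exp => 200, create => 365, last => 45 );

# $monk is a hypothetical hash ref with these keys:
#   length      => homenode length
#   exp         => XP
#   days_old    => days since the account was created
#   days_absent => days since the monk was last here
sub qualifies {
    my ($monk) = @_;

    # Both criteria require a reasonably long homenode.
    return 0 if $monk->{length} <= $limit{length};

    # Criterion 1: enough XP.
    return 1 if $monk->{exp} > $limit{exp};

    # Criterion 2: an old account that was seen recently.
    return ( $monk->{days_old} > $limit{create}
          && $monk->{days_absent} < $limit{last} ) ? 1 : 0;
}

# Made-up sample data, one monk per interesting case.
my @monks = (
    { name => 'active',  length => 2_000, exp => 5_000, days_old => 30,  days_absent => 1 },
    { name => 'veteran', length => 800,   exp => 50,    days_old => 900, days_absent => 20 },
    { name => 'sparse',  length => 100,   exp => 9_000, days_old => 900, days_absent => 1 },
    { name => 'lapsed',  length => 800,   exp => 50,    days_old => 900, days_absent => 300 },
);

print "$_->{name}\n" for grep { qualifies($_) } @monks;
```

With this sample data only `active` and `veteran` are printed: `sparse` fails the length test and `lapsed` has been away too long.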

Here is the code:

#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableContentParser;
use Time::Local;
use WWW::Mechanize;

# Index of the stats table on the page, then the columns used in each row
use constant STATS  => 4;
use constant ID     => 1;
use constant CREATE => 3;
use constant LAST   => 4;
use constant EXP    => 5;
use constant LENGTH => 11;

# Include when: length && ( exp || create && last )
my %opt = (
    length => 500,
    exp    => 200,
    create => 365,
    last   => 45,
    # The beginning of this URL is missing from the post
    url    => ' +pt=15&sortlist=15,1,3&',
    pos    => 0,
);

my $finished;
my $mech = WWW::Mechanize->new( autocheck => 1 );
my @homenodes;

# Walk the stats pages 50 rows at a time
while ( ! $finished ) {
    $mech->get( $opt{url} . '&start=' . $opt{pos} );
    my $table = HTML::TableContentParser->new()->parse( $mech->content() );
    for my $row ( @{ $table->[ STATS ]{rows} } ) {
        my $length = Get_Length( $row );
        next if ! defined $length;
        # Rows are assumed to be sorted by homenode length, longest first,
        # so the first short homenode ends the search
        if ( $length < $opt{length} ) {
            $finished = 1;
            last;
        }
        my $id = Get_ID( $row );
        push @homenodes, $id if defined $id;
    }
    $opt{pos} += 50;
}

# Homenode length for a row, or undef if the cell holds no number
sub Get_Length {
    my $row = shift;
    my $data = ${ $row->{cells} }[ LENGTH ]{data};
    ($data) = $data =~ /(\d+)/ if defined $data;
    return $data;
}

# Node id if the monk meets the XP or the created/last-here criteria
sub Get_ID {
    my $row = shift;
    my ($id)  = ${ $row->{cells} }[ ID ]{data}  =~ /(\d+)/;
    my ($exp) = ${ $row->{cells} }[ EXP ]{data} =~ /(\d+)/;
    return $id if $exp >= $opt{exp};
    my $create = Get_Days( ${ $row->{cells} }[ CREATE ]{data} );
    my $last   = Get_Days( ${ $row->{cells} }[ LAST ]{data} );
    return $create >= $opt{create} && $last <= $opt{last} ? $id : undef;
}

# Whole days elapsed since a 'YYYY-MM-DD HH:MM:SS' timestamp
sub Get_Days {
    my $then = shift;
    ($then) = $then =~ m|<NOBR>(.*)</NOBR>|;
    my ($yr, $mon, $day, $hr, $min, $sec) = split /[ :-]/, $then;
    my $stamp = timelocal( $sec, $min, $hr, $day, --$mon, $yr );
    return int( (time - $stamp) / 86_400 );
}

print "<ul>\n";
print "<li>[id://$_]</li>\n" for @homenodes;
print "</ul>\n";
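
The date arithmetic in Get_Days can be tried without fetching anything. The sketch below is a standalone variant, not the code from the post: it takes the "now" timestamp as an argument and uses `timegm` instead of `timelocal`, so the result does not depend on the clock or the local time zone. The helper names `to_epoch` and `days_between` and the sample dates are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Convert 'YYYY-MM-DD HH:MM:SS' to epoch seconds, treating it as UTC.
sub to_epoch {
    my ($stamp) = @_;
    my ( $yr, $mon, $day, $hr, $min, $sec ) = split /[ :-]/, $stamp;
    # timegm expects a zero-based month
    return timegm( $sec, $min, $hr, $day, $mon - 1, $yr );
}

# Whole days elapsed between two timestamps (partial days are dropped).
sub days_between {
    my ( $then, $now ) = @_;
    return int( ( to_epoch($now) - to_epoch($then) ) / 86_400 );
}

# 31 days of January plus 14 days of February, and half a day that int() drops
print days_between( '2004-01-01 00:00:00', '2004-02-15 12:00:00' ), "\n";
```

This prints 45, which would sit right on the "last here" threshold used above.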

As of this posting, there were 871 homenodes that fit these criteria. The list may be on my scratch pad, depending on how long it takes me to get through them all.

Cheers - L~R

In reply to Homenode Surfing by Limbic~Region
