So, some tips:
I've included an example of what I mean. First, here's what your script might look like after it's been broken up into subroutines. I've put in two: one to find the links (getAllLinks(...)) and one to retrieve the byte count for each link (getByteCount(...)). I've done it this way because the techniques for testing those two parts of your script are very different. Please forgive typos: this is only a reorganization for demonstration purposes. It hasn't been run through a compiler.
    use strict;
    use warnings;
    use WWW::Mechanize;

    my $start = "http://www.domain.com";
    my $mech  = WWW::Mechanize->new( autocheck => 1 );
    my $regex = qr/\d+.+\.pdf$/;

    my @links = getAllLinks($mech, $start, $regex);

    for my $link ( @links ) {
        my $url = $link->url_abs;
        print "Fetching $url";
        my $bytecount = getByteCount($mech, $url);
        print " $bytecount bytes\n";
    }

    # fetch the start page and return every link whose URL matches $regex
    sub getAllLinks {
        my ( $mech, $start, $regex ) = @_;
        $mech->get( $start );
        return $mech->find_all_links( url_regex => $regex );
    }

    # download $url into the current directory and return the file's size in bytes
    sub getByteCount {
        my ( $mech, $url ) = @_;
        my $filename = $url;
        $filename =~ s[^.+/][];    # strip everything up to the last slash
        $mech->get( $url, ':content_file' => $filename );
        return -s $filename;
    }
Now, here's an example of a test script. A test script is just a plain old script whose name ends, by convention, with .t rather than .pl. What this test script does is pass various combinations of inputs to the subroutines getAllLinks(...) and getByteCount(...). To compare the actual outputs of those functions with the expected outputs, we wrap each subroutine call with one of two special testing functions: is(...) for simple scalars or is_deeply(...) for data structures.
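To see the two testing functions on their own before they get mixed up with network calls, here is a minimal sketch. The helper filename_from_url is hypothetical (it isn't in your script); it just wraps the same substitution getByteCount uses to turn a URL into a filename, and the URLs are made up for the demonstration.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical helper, not part of the original script: strips everything
# up to the last slash, the same substitution getByteCount applies.
sub filename_from_url {
    my ($url) = @_;
    ( my $filename = $url ) =~ s[^.+/][];
    return $filename;
}

# is() compares two simple scalars
is( filename_from_url('http://www.domain.com/docs/2009report.pdf'),
    '2009report.pdf',
    'filename_from_url: strips the path' );

# is_deeply() compares data structures element by element
my @urls = (
    'http://www.domain.com/a/1st.pdf',
    'http://www.domain.com/b/c/2nd.pdf',
);
my @got = map { filename_from_url($_) } @urls;
is_deeply( \@got, [ '1st.pdf', '2nd.pdf' ],
    'filename_from_url: works on a list of urls' );

done_testing();
```

Save it with a .t extension and run it with perl or prove; each test prints an "ok" or "not ok" line along with its description.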
Your test script might look something like this. Again, this code hasn't been run through a compiler - consider it more as a demonstration of how to use Test::More:
    use strict;
    use warnings;
    use Test::More qw(no_plan);   # imports testing tools
    use WWW::Mechanize;
    use MyModule;                 # that's your code; it needs to export getAllLinks and getByteCount

    my $mech = WWW::Mechanize->new( autocheck => 1 );

    # call repeatedly with various values of $start, $regex
    # is_deeply compares data structures element by element
    # is_deeply($got, $expected, $description_of_test)
    my $start = "http://www.domain.com";
    my $regex = qr/\d+.+\.pdf$/;
    my $aExpected = [ 'foo.pdf', 'baz.pdf' ];   # placeholders: list the URLs you expect to match
    # getAllLinks returns link objects, so compare their URLs
    my @aGot = map { $_->url } getAllLinks($mech, $start, $regex);
    is_deeply(\@aGot, $aExpected, "getAllLinks: start=$start, regex=$regex");

    # call repeatedly with various urls
    # is compares simple scalars
    # is($got, $expected, $description_of_test)
    my $url       = "$start/foo.pdf";   # placeholder: a file whose size you know
    my $iExpected = 1024;               # placeholder: that file's size in bytes
    is(getByteCount($mech, $url), $iExpected, "getByteCount: url=$url");
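One way to "call repeatedly with various values" is to put the inputs and expected outputs in a list and loop over it. The sketch below does that for the link regex alone, with no network access. The matching_urls function is a hypothetical stand-in that filters plain URL strings the way I'd expect the url_regex option to filter links; the sample URLs are invented.

```perl
use strict;
use warnings;
use Test::More;

my $regex = qr/\d+.+\.pdf$/;

# Hypothetical stand-in for the url_regex filtering: takes plain URL
# strings instead of fetching a page, so it runs without a network.
sub matching_urls {
    my ( $regex, @urls ) = @_;
    return [ grep { $_ =~ $regex } @urls ];
}

# one entry per test: inputs, expected output, description
my @cases = (
    {   name     => 'digits followed by text and .pdf',
        urls     => [ '2009report.pdf', 'index.html' ],
        expected => ['2009report.pdf'],
    },
    {   name     => 'no digits in the name',
        urls     => ['report.pdf'],
        expected => [],
    },
    {   name     => '.pdf not at the end',
        urls     => ['2009report.pdf.bak'],
        expected => [],
    },
    {   name     => 'nothing between the digit and .pdf',
        urls     => ['1.pdf'],
        expected => [],
    },
);

for my $case (@cases) {
    is_deeply(
        matching_urls( $regex, @{ $case->{urls} } ),
        $case->{expected},
        "matching_urls: $case->{name}"
    );
}

done_testing();
```

The last case is the kind of thing a test like this turns up: because the regex has .+ between the digits and .pdf, a file named 1.pdf is skipped. Whether that's what you want is up to you, but now you know about it.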
Best, beth
In reply to Re: www::mechanize file download script
by ELISHEVA
in thread www::mechanize file download script
by jaytan