I'm looking for testers to test a quick script I threw together that I'm not sure how to improve upon anymore - it's a basic downloading tool I've created in the last two days. It's nothing fancy, it's useful to automate large amounts of downloads if you need a quick-and-dirty way to download tons of files without needing to babysit them, though.

I built it out of necessity for my job, since I either needed to build a script that automated the download of over 2000 files, find one that works well for the task, or do it myself... I couldn't find one that worked for me, so I learned more about LWP and WWW::Mechanize, asked a couple questions from you great Monks, and have it working! It's currently being used to download said 2000+ files onto one of our computers, so I thought I'd ask for criticism on how to improve the meager tool, and release it into the wilderness of the internet.

It's available for viewing/download on my dropbox here, feel free to download it and tell me what you think. Are there glaring issues with it? Are there features you'd want added if you were going to use this tool? Be nice please, but feel free to critique.

Edit: Here's the code, upon advisement that I should post the code directly here for people to view.
#!/usr/bin/perl -w use strict; use warnings; use LWP::UserAgent; use LWP::Simple; use WWW::Mechanize; use Digest::MD5 qw( md5_hex ); # Coded by Brendan Galvin from June 3rd 2013, to June 5th 2013. # This product is open-source freeware, and credit to the original sou +rce must be given to Brendan Galvin upon re-distribution of the origi +nal source or any program, script or application made using the origi +nal source. # http://www.linkedin.com/pub/brendan-galvin/26/267/94b my $flag=0; print"\n\nURL for mass-downloading (only download links using the <a h +ref> tag, not images or other embedded elements): "; chomp(my $url = <STDIN>); print"\nExtensions to download (seperated by comma's): "; chomp(my $extensions = <STDIN>); $extensions =~ s/[.]//g; print"\nLocation to store downloaded files: "; chomp(my $location = <STDIN>); print"\nHow many downloads would you like to skip starting from the fi +rst (in case you started this download earlier and have already downl +oaded some of the files)? "; chomp(my $skips = <STDIN>); print"\nAre you going to want to skip any files while the program is r +unning (y/n)?"; chomp(my $skiporno = <STDIN>); my $error = ""; my @extension = split(',', $extensions); my %extens = map{$_ => 1} @extension; sub GetFileSize{ my $url=shift; my $ua = new LWP::UserAgent; $ua->agent("Mozilla/5.0"); my $req = new HTTP::Request 'HEAD' => $url; $req->header('Accept' => 'text/html'); my $res = $ua->request($req); if ($res->is_success) { my $headers = $res->headers; return $headers; }else{ $flag = 1; $error .= "Error retrieving file information at $url "; } return 0; } my $mech = WWW::Mechanize->new(); $mech->get($url); my $base = $mech->base; my @links = $mech->links(); for my $link ( @links ) { my $skip = 'n'; if($link->url() =~ m/([^.]+)$/){ my $ext = ($link->url() =~ m/([^.]+)$/)[0]; if(exists($extens{$ext})){ my $newurl = $link->url(); if($newurl !~ /http::\/\/$/ig){ my $baseurl = URI->new_abs($newurl, $base); $newurl = $baseurl; } my $filename = $newurl; $filename =~ m/.*\/(.*)$/; $filename = $1; if($skips > 0){ $skips -= 1; print "\n\nSkipped $filename at " . $link->url(); next; }else{ my $header = GetFileSize($newurl); my $urlmech = WWW::Mechanize->new(); $urlmech->show_progress("true"); print"\n\n\n$filename at $newurl\n"; print "File size: ".$header->content_length." bytes\n" + unless $flag==1; print "Last modified: ".localtime($header->last_modifi +ed)."\n" unless $flag==1; if($skiporno eq 'y'){ print"Skip file (y/n)?"; chomp($skip = <STDIN>); } if($skip ne 'y'){ print " downloading...\n"; my $response = $urlmech->get($newurl, ':content_fi +le' => "$filename", )->decoded_content; }else{ print"\nSkipping...\n\n"; next or print"Error skipping.\n"; } } } } } print"\n\n\nTasks completed.\n"; if($error ne ""){ print"\nErrors: $error"; }else{ print"No errors.\n"; }

In reply to RFC: Code testers/reviewers needed by AI Cowboy

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.