Dear monks, this is the thread for Perl Mechanize optimization: making a script run faster with less overhead.
Problem: I have a list of 2500 websites and need to grab a thumbnail screenshot of each of them. How do I do that?
I could fetch the sites with Perl; WWW::Mechanize looks like a good fit. Note: I only need the results as thumbnails that are at most 240 pixels in the long dimension.
Prerequisites:
the MozRepl add-on for Firefox: https://addons.mozilla.org/en-US/firefox/addon/mozrepl/
the module WWW::Mechanize::Firefox
the module Imager: http://search.cpan.org/~tonyc/Imager-0.87/Imager.pm
First Approach: Here is a first Perl solution:
use WWW::Mechanize::Firefox;
my $mech = WWW::Mechanize::Firefox->new();
$mech->get('http://google.com');
my $png = $mech->content_as_png();
Outline: According to the documentation, content_as_png returns the given tab or the current page rendered as a PNG image. All parameters are optional. $tab defaults to the current tab. If coordinates are given, that rectangle is cut out. The coordinates should be a hash with the four usual entries: left, top, width, height. This method is specific to WWW::Mechanize::Firefox.
As I understand the perldoc, the coordinates option does not resize the whole page; it only cuts a rectangle out of it. WWW::Mechanize::Firefox takes care of capturing the screenshot, but it does not concern itself with resizing images. Since I only need small thumbnails, there is no point in keeping very large files. I looked on CPAN for a module that can scale down the $png and found Imager:
http://search.cpan.org/~tonyc/Imager-0.87/Imager.pm
Imager - Perl extension for Generating 24 bit Images: Imager is a module for creating and altering images. It can read and write various image formats, draw primitive shapes like lines and polygons, blend multiple images together in various ways, scale, crop, render text and more. I have installed the module, but I have not yet extended my basic approach to use it.
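For the scaling step, here is a minimal sketch of how Imager could shrink PNG data so that the longer side is at most 240 pixels. The helper name png_to_thumbnail is made up for illustration, and the 1280x720 blank image only stands in for the data that content_as_png would return, so the snippet runs without Firefox. It assumes Imager was built with PNG support.

```perl
use strict;
use warnings;
use Imager;

# Hypothetical helper (not part of any module): shrink PNG bytes so that
# the longer side is at most $max pixels, keeping the aspect ratio.
sub png_to_thumbnail {
    my ($png_bytes, $max) = @_;

    my $img = Imager->new;
    $img->read(data => $png_bytes, type => 'png')
        or die "Cannot read PNG data: " . $img->errstr;

    # Do not enlarge images that are already small enough.
    return $img if $img->getwidth <= $max && $img->getheight <= $max;

    # type => 'min' uses the smaller of the two scale factors, so the
    # result fits inside a $max x $max box.
    my $thumb = $img->scale(xpixels => $max, ypixels => $max, type => 'min')
        or die "Cannot scale image: " . $img->errstr;
    return $thumb;
}

# Stand-in for a page screenshot: a blank 1280x720 image written to memory.
my $page = Imager->new(xsize => 1280, ysize => 720);
my $png;
$page->write(data => \$png, type => 'png')
    or die "Cannot create test PNG: " . $page->errstr;

my $thumb = png_to_thumbnail($png, 240);
printf "thumbnail is %d x %d\n", $thumb->getwidth, $thumb->getheight;

# To save it: $thumb->write(file => 'thumb.png') or die $thumb->errstr;
```

In the real script, the bytes returned by $mech->content_as_png() would be passed in instead of the generated test image.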
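What I have tried already:
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize::Firefox;

my $mech = WWW::Mechanize::Firefox->new();

# urls.txt holds one host name per line, e.g. www.google.com
open(my $in, '<', 'urls.txt') or die "Cannot open urls.txt: $!";
while (my $url = <$in>) {
    chomp $url;
    print "$url\n";
    $mech->get($url);
    my $png = $mech->content_as_png();

    # File name: host name without a leading "www.", plus ".png"
    (my $name = $url) =~ s/^www\.//;
    $name .= '.png';

    open(my $out, '>', $name) or die "Cannot write $name: $!";
    binmode $out;    # PNG data is binary
    print {$out} $png;
    close($out) or die "Cannot close $name: $!";
    sleep 5;         # pause between pages
}
close($in);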
This script does not take the image size into account, and it stops on a timeout. Here is the command-line output:
linux-vi17:/home/martin/perl # perl mecha_test_1.pl
www.google.com
www.cnn.com
www.msnbc.com
command timed-out at /usr/lib/perl5/site_perl/5.12.3/MozRepl/Client.pm line 186
linux-vi17:/home/martin/perl #
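The run above ends right after the "command timed-out" message, which suggests the message is a fatal exception raised inside MozRepl::Client. If that assumption holds, wrapping the fetch in eval would let the loop skip a slow site instead of stopping. The helper name try_with_retries is made up for illustration, and the stand-in action below simply dies once so the snippet runs without a browser.

```perl
use strict;
use warnings;

# Hypothetical helper: run $action, trap any exception, and retry up to
# $attempts times. Returns (1, $result) on success, (0, undef) otherwise.
sub try_with_retries {
    my ($action, $attempts) = @_;
    for my $try (1 .. $attempts) {
        my $result;
        my $ok = eval { $result = $action->(); 1 };
        return (1, $result) if $ok;
        warn "attempt $try failed: $@";
    }
    return (0, undef);
}

# Stand-in for the real fetch: fails on the first call, succeeds afterwards.
my $calls = 0;
my ($ok, $value) = try_with_retries(sub {
    die "command timed-out\n" if ++$calls < 2;
    return "png bytes";
}, 3);
print "ok=$ok calls=$calls value=$value\n";

# In the loop of the script, the fetch could then look something like:
#   my ($ok, $png) = try_with_retries(
#       sub { $mech->get($url); $mech->content_as_png() }, 2);
#   next unless $ok;    # skip this site and carry on with the next one
```

Whether a retry helps or the site should simply be skipped depends on why the page timed out; the sketch only shows how to keep the loop alive.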
This is my input file, urls.txt:
www.google.com
www.cnn.com
www.msnbc.com
news.bbc.co.uk
www.bing.com
www.yahoo.com
Question: how can I extend this solution so that (1) it does not stop when a page times out, and (2) it stores only small thumbnails?
To repeat: I only need the results as thumbnails that are at most 240 pixels in the long dimension. Imager is already installed, as listed in the prerequisites.
I'd love to hear from you!