Perl Mechanize issues - how to make a script running faster with less overhead

Perlbeginner1 has asked for the wisdom of the Perl Monks concerning the following question:

Hi, good evening dear Perl folks. I want to grab screenshots from sites like the following:

www.google.com
www.msnbc.com
news.bbc.co.uk
www.yahoo.com

For the Mechanize scraper I am using this script:


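#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize::Firefox;

my $mech = WWW::Mechanize::Firefox->new();

# url.txt holds one URL per line
open(my $in, '<', 'url.txt') or die "Cannot open url.txt: $!";

while (my $url = <$in>) {
        chomp $url;
        next unless length $url;        # skip blank lines
        print "$url\n";
        $mech->get($url);
        my $png = $mech->content_as_png();

        # File name: the URL without a leading "www.", plus ".png"
        my $name = $url;
        $name =~ s/^www\.//;
        $name .= ".png";

        open(my $out, '>', $name) or die "Cannot write $name: $!";
        binmode $out;                   # PNG data is binary
        print {$out} $png;
        close($out) or die "Cannot close $name: $!";
        sleep(5);
}
close($in);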
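
The script above uses each line of url.txt directly as a file name, which only works while the lines are bare host names. As a hedged sketch, a small helper could derive a safer name when a line contains a scheme or a path. The name png_name_for and the choice of "_" as the replacement character are assumptions for illustration, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a URL or bare host name into a safe PNG file name.
sub png_name_for {
    my ($url) = @_;
    my $name = $url;
    $name =~ s{^[a-z][a-z0-9+.-]*://}{}i;   # drop a scheme such as http://
    $name =~ s{^www\.}{}i;                   # drop a leading "www."
    $name =~ s{[^A-Za-z0-9._-]+}{_}g;        # replace characters unsafe in file names
    $name =~ s{^_+|_+$}{}g;                  # trim leading/trailing underscores
    return "$name.png";
}

print png_name_for($_), "\n" for qw(www.google.com http://news.bbc.co.uk/sport/);
```

With the two sample inputs this prints google.com.png and news.bbc.co.uk_sport.png.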

I forgot to mention that I only need the images as small thumbnails, so the files do not have to be very large. How do I grab just a thumbnail screenshot of each page?

my $png = $mech->content_as_png();

This line returns the given tab or the current page rendered as a PNG image. As the documentation states:

All parameters are optional. $tab defaults to the current tab. If the coordinates are given, that rectangle will be cut out. The coordinates should be a hash with the four usual entries, left,top,width,height. This is specific to WWW::Mechanize::Firefox.
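
Going by the documentation quoted above, a rectangle can be passed as the second argument to content_as_png to cut out part of the page. The following is only a sketch: the helper name screenshot_region and the choice of the top-left corner are assumptions, and note that the rectangle crops the page rather than scaling it down.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch $url and return only the top-left
# $width x $height pixels of the rendered page as PNG data.
sub screenshot_region {
    my ($mech, $url, $width, $height) = @_;
    $mech->get($url);
    # First argument is the tab (undef means the current tab),
    # second is the rectangle with left, top, width and height.
    return $mech->content_as_png(undef, {
        left   => 0,
        top    => 0,
        width  => $width,
        height => $height,
    });
}
```

It would be called with an existing browser object, for example screenshot_region($mech, $url, 1024, 768), where $mech comes from WWW::Mechanize::Firefox->new().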

Does that have any impact on our code? Can we make the code more specific to our needs? Note: the files I store would be a bit smaller if we limit them after fetching from the web, wouldn't they?

BTW: currently the data transfer between Firefox and Perl is done Base64-encoded, isn't it? It would be beneficial to find out what is necessary to make JSON handle binary data more gracefully. What do you think? I would love to hear from you.
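
To put a rough number on the Base64 overhead mentioned above, here is a small sketch using the core MIME::Base64 module. The 30,000 bytes of generated data are only a stand-in for a real screenshot, and this measures the encoding itself, not the actual Firefox-to-Perl transfer.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Stand-in for PNG bytes (an assumption, not a real screenshot):
# 30,000 bytes of arbitrary binary data.
my $binary  = join '', map { chr($_ % 256) } 1 .. 30_000;

my $encoded = encode_base64($binary, '');   # '' means no line breaks
my $decoded = decode_base64($encoded);      # round trip back to bytes

printf "raw: %d bytes, base64: %d bytes (%.0f%% larger)\n",
    length($binary), length($encoded),
    100 * (length($encoded) / length($binary) - 1);
```

Base64 turns every 3 bytes into 4 characters, so the encoded form is about a third larger than the raw data.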

Now I am trying to install mozrepl. If anybody has tips on that: should I

a. install it with zypper in, or

b. get it as a Firefox plugin?

Greetings


Replies are listed 'Best First'.
Re: Perl Mechanize issues - how to make a script running faster with less overhead by Corion (Patriarch) on Feb 11, 2012 at 13:36 UTC
You might want to make more clear which parts of your text are copied from the WWW::Mechanize::Firefox documentation and which parts are your own. Especially when the documentation answers your questions. See the examples included with WWW::Mechanize::Firefox for how to save screenshots. The module does not concern itself with resizing images. See the various image modules on CPAN, like Imager.
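
As a rough illustration of the Imager suggestion, the following sketch scales PNG data down to a thumbnail. It assumes Imager is installed with PNG support; the helper name png_thumbnail and the width limit are made up for this example.

```perl
use strict;
use warnings;
use Imager;

# Hypothetical helper: return PNG data scaled down so it is at most
# $max_width pixels wide, keeping the aspect ratio.
sub png_thumbnail {
    my ($png_data, $max_width) = @_;

    my $img = Imager->new;
    $img->read(data => $png_data, type => 'png')
        or die "Cannot read PNG data: ", $img->errstr;

    # Never scale a small image up
    return $png_data if $img->getwidth <= $max_width;

    my $thumb = $img->scale(xpixels => $max_width)
        or die "Cannot scale image: ", $img->errstr;

    my $out;
    $thumb->write(data => \$out, type => 'png')
        or die "Cannot write PNG data: ", $thumb->errstr;
    return $out;
}
```

The result of $mech->content_as_png() could be passed through png_thumbnail($png, 200) before it is written to disk.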
Re^2: Perl Mechanize issues - how to make a script running faster with less overhead by Perlbeginner1 (Scribe) on Feb 12, 2012 at 01:49 UTC
Hello, thanks for all the hints. Greetings, pb1
Re: Perl Mechanize issues - how to make a script running faster with less overhead by Anonymous Monk on Feb 11, 2012 at 11:18 UTC
UTSL

sub content_as_png {
    my ($self, $tab, $rect) = @_;
    $tab ||= $self->tab;
    $rect ||= {};

    # Mostly taken from
    # http://wiki.github.com/bard/mozrepl/interactor-screenshot-server
    my $screenshot = $self->repl->declare(<<'JS');
    function (tab,rect) {
        var browser = tab.linkedBrowser;
        var browserWindow = Components.classes['@mozilla.org/appshell/window-mediator;1']
            .getService(Components.interfaces.nsIWindowMediator)
            .getMostRecentWindow('navigator:browser');
        var win = browser.contentWindow;
        var body = win.document.body;
        if(!body) {
            return;
        };
        var canvas = browserWindow
            .document
            .createElementNS('http://www.w3.org/1999/xhtml', 'canvas');
        var left = rect.left || 0;
        var top = rect.top || 0;
        var width = rect.width || body.clientWidth;
        var height = rect.height || body.clientHeight;
        canvas.width = width;
        canvas.height = height;
        var ctx = canvas.getContext('2d');
        ctx.clearRect(0, 0, width, height);
        ctx.save();
        ctx.scale(1.0, 1.0);
        ctx.drawWindow(win, left, top, width, height, 'rgb(255,255,255)');
        ctx.restore();

        //return atob(
        return canvas
            .toDataURL('image/png', '')
            .split(',')[1]
        // );
    }
JS
    my $scr = $screenshot->($tab, $rect);
    return $scr ? decode_base64($scr) : undef
};

Now i try to install mozrepl - if you or anybody has got some tipps on that. should i run

Hire a sysadmin, seriously