Perlbeginner1 has asked for the wisdom of the Perl Monks concerning the following question:

Hi, good evening dear Perl folks. I want to grab screenshots of sites like the following:

www.google.com www.msnbc.com news.bbc.co.uk www.yahoo.com



For the Mechanize scraper I am using this script:

#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize::Firefox;

my $mech = WWW::Mechanize::Firefox->new();

open(my $in, '<', 'url.txt') or die "Cannot open url.txt: $!";
while (my $url = <$in>) {
    chomp $url;
    print "$url\n";
    $mech->get($url);
    my $png = $mech->content_as_png();

    # Build the output file name from the URL
    (my $name = $url) =~ s/^www\.//;
    $name .= ".png";

    open(my $out, '>', $name) or die "Cannot open $name: $!";
    binmode $out;    # PNG data is binary
    print {$out} $png;
    close $out;
    sleep 5;
}
close $in;



I forgot to mention that I only need the images as small thumbnails, so the files do not have to be very large.

How do I grab just a thumbnail screenshot of each site?

my $png = $mech->content_as_png();

This line returns the given tab or the current page rendered as a PNG image. As the documentation states:

All parameters are optional. $tab defaults to the current tab. If the coordinates are given, that rectangle will be cut out. The coordinates should be a hash with the four usual entries, left,top,width,height. This is specific to WWW::Mechanize::Firefox.

Does that have any impact on our code? Can we make the code more specific to our needs? Note: the files I store would be somewhat smaller if we limit them after fetching from the web, wouldn't they?

BTW: currently, the data transfer between Firefox and Perl is Base64-encoded, isn't it? It would be useful to find out what is necessary to make JSON handle the binary data more gracefully. What do you think? I would love to hear from you.

Now I am trying to install mozrepl. If anybody has tips on that, I would appreciate them. Should I

a. run zypper in, or

b. get it as a Firefox plugin?

Greetings

Replies are listed 'Best First'.
Re: Perl Mechanize issues - how to make a script running faster with less overhead
by Corion (Patriarch) on Feb 11, 2012 at 13:36 UTC
    You might want to make more clear which parts of your text are copied from the WWW::Mechanize::Firefox documentation and which parts are your own. Especially when the documentation answers your questions. See the examples included with WWW::Mechanize::Firefox for how to save screenshots. The module does not concern itself with resizing images. See the various image modules on CPAN, like Imager.
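For the resizing step, a minimal sketch using Imager might look like the following. It assumes Imager is installed with PNG support; the helper name make_thumbnail and the idea of passing a maximum width are illustrative choices, not part of WWW::Mechanize::Firefox.

```perl
use strict;
use warnings;
use Imager;

# Hypothetical helper: takes raw PNG bytes (for example the return value
# of $mech->content_as_png()) and returns a smaller PNG as bytes.
sub make_thumbnail {
    my ($png_bytes, $width) = @_;

    # Read the image straight from memory
    my $img = Imager->new(data => $png_bytes)
        or die "Cannot read image: " . Imager->errstr;

    # With only xpixels given, scale() keeps the aspect ratio
    my $thumb = $img->scale(xpixels => $width)
        or die "Cannot scale image: " . $img->errstr;

    # Write the result back into a scalar instead of a file
    my $out;
    $thumb->write(data => \$out, type => 'png')
        or die "Cannot write image: " . $thumb->errstr;
    return $out;
}
```

The returned bytes can be written to the .png file in place of the full-size screenshot, for example with make_thumbnail($png, 160).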
      Hello, thanks for all the hints.

      Greetings, pb1
Re: Perl Mechanize issues - how to make a script running faster with less overhead
by Anonymous Monk on Feb 11, 2012 at 11:18 UTC

    UTSL

    sub content_as_png {
        my ($self, $tab, $rect) = @_;
        $tab ||= $self->tab;
        $rect ||= {};

        # Mostly taken from
        # http://wiki.github.com/bard/mozrepl/interactor-screenshot-server
        my $screenshot = $self->repl->declare(<<'JS');
        function (tab,rect) {
            var browser = tab.linkedBrowser;
            var browserWindow = Components.classes['@mozilla.org/appshell/window-mediator;1']
                .getService(Components.interfaces.nsIWindowMediator)
                .getMostRecentWindow('navigator:browser');
            var win = browser.contentWindow;
            var body = win.document.body;
            if(!body) {
                return;
            };
            var canvas = browserWindow
                .document
                .createElementNS('http://www.w3.org/1999/xhtml', 'canvas');
            var left = rect.left || 0;
            var top = rect.top || 0;
            var width = rect.width || body.clientWidth;
            var height = rect.height || body.clientHeight;
            canvas.width = width;
            canvas.height = height;
            var ctx = canvas.getContext('2d');
            ctx.clearRect(0, 0, width, height);
            ctx.save();
            ctx.scale(1.0, 1.0);
            ctx.drawWindow(win, left, top, width, height, 'rgb(255,255,255)');
            ctx.restore();

            //return atob(
            return canvas
                .toDataURL('image/png', '')
                .split(',')[1]
            // );
        }
    JS
        my $scr = $screenshot->($tab, $rect);
        return $scr ? decode_base64($scr) : undef
    };
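The end of content_as_png is where the Base64 transfer happens: the JavaScript side returns only the part of the data URL after the comma, and Perl decodes it with decode_base64. A minimal sketch of that decoding step on its own, using the core MIME::Base64 module, could look like this. The helper name png_from_data_url is hypothetical, and the sample data URL is built locally as a stand-in for what Firefox would send.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Hypothetical helper: drops the "data:image/png;base64," prefix from a
# data URL and returns the decoded binary payload, or undef if the
# string has no payload after the comma.
sub png_from_data_url {
    my ($data_url) = @_;
    my (undef, $b64) = split /,/, $data_url, 2;
    return undef unless defined $b64 && length $b64;
    return decode_base64($b64);
}

# Stand-in for the string canvas.toDataURL('image/png') would produce:
# only the 8-byte PNG signature, encoded without line breaks.
my $signature = "\x89PNG\r\n\x1a\n";
my $data_url  = 'data:image/png;base64,' . encode_base64($signature, '');

my $bytes = png_from_data_url($data_url);
printf "decoded %d bytes\n", length $bytes;
```

Base64 turns every 3 bytes into 4 characters, so the transferred string is roughly a third larger than the raw PNG. That is the overhead the question about Base64 refers to.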

    Now I am trying to install mozrepl. If anybody has tips on that, I would appreciate them. Should I

    Hire a sysadmin, seriously