PerlMonks  

Re^20: Need help with WWW::Mechanize and Chrome cookies

by Corion (Patriarch)
on Jul 27, 2021 at 19:20 UTC


in reply to Re^19: Need help with WWW::Mechanize and Chrome cookies
in thread Need help with WWW::Mechanize and Chrome cookies

The following gets at the content of images displayed on a page:

    #!perl
    use strict;
    use warnings;
    use 5.012;
    use WWW::Mechanize::Chrome;
    use Log::Log4perl ':easy';
    Log::Log4perl->easy_init($TRACE);
    use File::Temp 'tempdir';
    use Cwd;

    my $tempdir = tempdir();
    my $mech = WWW::Mechanize::Chrome->new(
        headless           => 1,
        data_directory     => $tempdir,
        download_directory => cwd(),
    );

    use Data::Dumper;
    my $res = $mech->get('https://egp.rutgers.edu/cgi/wmc.pl');
    say Dumper $mech->getResourceTree_future()->get;

    # Click the link that makes the page display the image
    my $link = $mech->xpath( '//a[text()="MY IMAGE"]', single => 1 );
    $mech->click($link);
    $mech->sleep(1);

    # Fetch the resource list again and keep only the images
    my $resources = $mech->getResourceTree_future()->get;
    my @images = grep { $_->{type} eq 'Image' } @{$resources->{resources}};

    # Take the content of the first image and save it
    my $image = $mech->getResourceContent_future( $images[0]->{url} )->get->{content};
    open my $fh, '>:raw', 'test.jpg'
        or die "Cannot write test.jpg: $!";
    print $fh $image;
    close $fh
        or die "Cannot close test.jpg: $!";

Note that this simply saves the first image in the resource list; you will need a way to find which image is the one you want.
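
One way to pick out a particular image is to filter the resource list by URL as well as by type. The following is only a sketch under assumptions: `find_images` is a hypothetical helper, not part of WWW::Mechanize::Chrome, and the sample data and the `/photos/NNN.jpg` URL pattern are made up to mimic the `{resources}` structure used above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Mechanize::Chrome):
# return the image resources whose URL matches the given pattern.
sub find_images {
    my ($resources, $url_re) = @_;
    return grep { $_->{type} eq 'Image' && $_->{url} =~ $url_re }
           @{ $resources->{resources} };
}

# Made-up stand-in for what getResourceTree_future()->get returns;
# only the {type} and {url} fields used above are assumed here.
my $resources = {
    resources => [
        { type => 'Stylesheet', url => 'https://example.com/site.css' },
        { type => 'Image',      url => 'https://example.com/logo.png' },
        { type => 'Image',      url => 'https://example.com/photos/101.jpg' },
    ],
};

# The URL pattern is an assumption; adapt it to the site in question.
my @wanted = find_images( $resources, qr{/photos/\d+\.jpg\z} );
print "$_->{url}\n" for @wanted;
```

With real data, the URL of the first match could then be passed to `getResourceContent_future` in place of `$images[0]->{url}`.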

Replies are listed 'Best First'.
Re^21: Need help with WWW::Mechanize and Chrome cookies
by bakiperl (Beadle) on Jul 28, 2021 at 15:24 UTC
    Corion,
    Many thanks. The code worked for a single image download. What about looping over multiple images, like this?
        my @ids = qw(101 102 103 104 105);
        foreach my $id ( @ids ) {
            my $link = $mech->xpath( "//a[text()='MY IMAGE $id']", single => 1 );
            $mech->click($link);
            $mech->sleep(1);
            my $resources = $mech->getResourceTree_future()->get;
            my @images = grep { $_->{type} eq 'Image' } @{$resources->{resources}};
            print @images, "\n"; # this shows that the information in the array is not resetting
            my $image = $mech->getResourceContent_future( $images[0]->{url} )->get->{content};
            open my $fh, '>:raw', $id.'.jpg';
            print $fh $image;
        }
    This code does not necessarily save the correct images, because the resource list does not appear to reset between clicks. Thank you again.

      If the resources are not resetting, that is a bug in Chrome. I suggest that you enable trace logging and look at the information that goes over the wire between Perl and Chrome.
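
Independently of where the bug lies, a possible workaround is to compare the resource list before and after each click and keep only the images that are new. This is a sketch that assumes the list only accumulates entries; `new_images` is a hypothetical helper and the sample data is made up. It will not detect an image whose URL was already loaded earlier.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Mechanize::Chrome):
# return the image resources in $after whose URL is not in $before.
sub new_images {
    my ($before, $after) = @_;
    my %seen = map { $_->{url} => 1 } @{ $before->{resources} };
    return grep { $_->{type} eq 'Image' && !$seen{ $_->{url} } }
           @{ $after->{resources} };
}

# Made-up snapshots standing in for two getResourceTree_future()->get
# results, one taken before a click and one after it.
my $before = {
    resources => [
        { type => 'Image', url => 'https://example.com/img/101.jpg' },
    ],
};
my $after = {
    resources => [
        { type => 'Image',  url => 'https://example.com/img/101.jpg' },
        { type => 'Script', url => 'https://example.com/app.js' },
        { type => 'Image',  url => 'https://example.com/img/102.jpg' },
    ],
};

my @fresh = new_images( $before, $after );
print "$_->{url}\n" for @fresh;
```

In the loop, the snapshot taken after one click would serve as the "before" snapshot for the next click.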
