Re^7: Need help with WWW::Mechanize and Chrome cookies

The links for these files

<a class="txt" href="file.txt"> Text File </a>

can be obtained using the WMC instance by doing something like this

my @links = $mech->find_all_links( text_contains => 'some description etc... ' );
my @urls = map { $_->[0] } @links;

In the case of WWW::Mechanize (WM) you can simply download the files using this code

for my $foo (@urls)
{
   my $filename = '/path/'.$foo;
   $mech->get($foo, ':content_file'=>$filename);
}
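
One detail worth spelling out: '/path/'.$foo only names a usable file when $foo is a bare file name such as file.txt. If the link is a longer path or a full URL, the concatenated name points into directories that may not exist. A minimal sketch of one way to derive the local name, assuming only the last path segment is wanted (local_name_for is a hypothetical helper, not part of WWW::Mechanize):

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Spec;

# Hypothetical helper: map a link URL to a file inside $dir,
# keeping only the last path segment of the URL.
sub local_name_for {
    my ($dir, $url) = @_;
    (my $path = $url) =~ s/[?#].*//s;    # drop query string and fragment
    my $name = basename($path);          # e.g. "csv_File.csv"
    return File::Spec->catfile($dir, $name);
}

# Example call with the placeholder directory used above.
print local_name_for('/path', '/my_Files/csv_File.csv'), "\n";
```

The result can then be passed as the ':content_file' value in the loop above.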

Unfortunately, this function does not work with WWW::Mechanize::Chrome (WMC). I hope the Author of WMC can shed some light on this or provide a patch. Thank you.


Replies are listed 'Best First'.
Re^8: Need help with WWW::Mechanize and Chrome cookies by Corion (Patriarch) on Jul 09, 2021 at 22:17 UTC
By "does not work", what do you mean exactly? If by that, you mean, "it's not documented, and not implemented", maybe you want to help implement it? Alternatively, you can maybe use

...
my $filename = '/path/'.$foo;
$mech->get($foo);
my $img = $mech->content();
# save the image to disk
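
The "save the image to disk" step is left as a comment in the snippet. A minimal sketch of that step in plain Perl, assuming the data to be written is already held in a scalar (save_content is a hypothetical helper name, not a WWW::Mechanize::Chrome method):

```perl
use strict;
use warnings;

# Hypothetical helper: write $content to $filename byte for byte.
# The :raw layer prevents newline translation on Windows.
sub save_content {
    my ($filename, $content) = @_;
    open my $fh, '>:raw', $filename
        or die "Cannot open '$filename' for writing: $!";
    print {$fh} $content
        or die "Cannot write to '$filename': $!";
    close $fh
        or die "Cannot close '$filename': $!";
    return length $content;    # number of characters written
}

# Intended use after the get() call above (not executed here):
#   save_content($filename, $mech->content());
```

Whether content() returns the file itself or the page Chrome rendered for it depends on how the browser handles the URL, so this sketch only covers the writing part.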
Re^9: Need help with WWW::Mechanize and Chrome cookies by bakiperl (Beadle) on Jul 11, 2021 at 12:20 UTC
Corion,

Here is the issue: First, let's start with the html file that I have used to test the script (WMC.html)

<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
<title>Testing hyperlink file Downloads</title>
</head>
<body>
<h2>Testing Download of hyperlinked files using WWW::Mechanize::Chrome</h2>
<p></p>
Let's try downloading this <a href="/my_Files/csv_File.csv">CSV File</a>
<br/><br/>
</body>
</html>

Now here is the Perl script.

#!/usr/bin/perl -w
use Log::Log4perl qw(:easy);
use WWW::Mechanize;
use WWW::Mechanize::Chrome;
use strict;

my $cookie_dir = 'C:/Users/some_user/AppData/Local/Google/Chrome/User Data/Default/'; #chrome cookies path
#my $mech = WWW::Mechanize::Chrome->new( data_directory => $cookie_dir );
my $mech = WWW::Mechanize->new();

my $uri = URI->new( "https://www.your_site.com/WMC.html" );
$mech->get( $uri );
unless ($mech->success) {
    my $mesg = $mech->response->status_line;
    print $mesg;
    goto FINISH;
}

my $path = "/path";
my @links = $mech->find_all_links( url_regex => qr/\.csv/i );
my @urls = map { $_->[0] } @links;
for my $foo (@urls)
{
    my $filename = $path.$foo;
    $mech->get($foo, ':content_file'=>$filename);
    my $file_content = $mech->get($foo);
    print $file_content->content();
}
print "Success\n";
FINISH :

When I use the WWW::Mechanize instance, the script runs fine. It prints and saves the file content to disk. However, when the WWW::Mechanize::Chrome instance is used I get the following error message:

Cannot navigate to invalid URL -32000 at C:/Perl/perl/site/lib/Chrome/DevToolsProtocol/Target.pm line 490
Re^10: Need help with WWW::Mechanize and Chrome cookies by Corion (Patriarch) on Jul 11, 2021 at 15:20 UTC
my @urls = map { $_->[0] } @links;

That's not the documented way to get the URL from a WWW::Mechanize::Link object. You should use:

my @urls = map { $_->url_abs } @links;
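
To illustrate the difference: the first form yields the href as it is written in the page, which in the test file is a site-relative path, while url_abs resolves it against the page URL. A small sketch of that resolution done directly with the URI module, using the placeholder URLs from the test script (that a relative path is what triggers the "invalid URL" error is an assumption here, not something confirmed in this thread):

```perl
use strict;
use warnings;
use URI;

my $base = 'https://www.your_site.com/WMC.html';   # page URL from the test script
my $href = '/my_Files/csv_File.csv';               # href as written in WMC.html

# Resolve the relative reference against the page it was found on.
my $abs = URI->new_abs($href, $base);

print "$abs\n";   # https://www.your_site.com/my_Files/csv_File.csv
```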
Re^10: Need help with WWW::Mechanize and Chrome cookies by Corion (Patriarch) on Jul 11, 2021 at 14:09 UTC
This is really weird - what is the value of the URL you're trying to navigate to when that error occurs? Where does that error occur?
Re^11: Need help with WWW::Mechanize and Chrome cookies by bakiperl (Beadle) on Jul 11, 2021 at 14:42 UTC
Re^12: Need help with WWW::Mechanize and Chrome cookies by Corion (Patriarch) on Jul 11, 2021 at 15:10 UTC
Some notes below your chosen depth have not been shown here
Re^9: Need help with WWW::Mechanize and Chrome cookies by bakiperl (Beadle) on Jul 09, 2021 at 22:24 UTC
The hyperlinked files are not saved to disk, unless they are going somewhere other than the declared directory. The code that you have suggested returns the HTML document.

