I appreciate the lead. If you or someone else could give me a clue as to where to add this delay and how to write the callback, it would be much appreciated. I'm unfamiliar with the methods used by LWP::UserAgent. When I tried the special ':content_cb' => \&callback mechanism, with "callback" being a subroutine that calls usleep, all I got for my effort was a downloaded file named /content_cb, hardly the expected behavior. I'm obviously clueless, even after reading the documentation. Given a hint as to how to start, I'd like to see whether anything useful can be done to rate-limit the connection. I've already spent a few hours playing with it but couldn't get past even the first step.

Update: I found a question asked some years ago on SO with a code snippet that uses the ':content_cb' callback, and I tried it. It changes absolutely nothing except that the downloaded file is misnamed (it becomes "/content_cb" instead of the .html file it should be). There is no change whatsoever in speed, regardless of the number passed to "usleep". Adding a print line produced nothing either (no output to the screen), so I don't think the subroutine is actually running, and it seems LWP::UserAgent is confusing the ':content_cb' token with the ':content_file' one.

Blessings,

~Polyglot~

Re^3: Bandwidth limiting for file downloads: What can Perl do?
by Corion (Patriarch) on Apr 30, 2022 at 22:34 UTC

    The following program downloads a file very slowly. Maybe this gets you started.

    #!/usr/bin/perl -w
    use strict;
    use WWW::Mechanize;
    use 5.020;
    use feature 'signatures';
    no warnings 'experimental::signatures';

    my $mech = WWW::Mechanize->new();
    my $large_url = 'http://ftp.acc.umu.se/mirror/wikimedia.org/dumps/dewiki/20220420/dewiki-20220420-abstract.xml.gz';

    $| = 1;
    my $read_size = 0;
    $mech->get(
        $large_url,
        ':read_size_hint' => 4096,
        ':content_cb' => sub {
            $read_size += length( $_[0] );
            my $len = length($_[0]);
            print "\r$len - $read_size bytes";
            # discard the content
            sleep 1;
        },
    );
    say 'done';
      So why might that work with WWW::Mechanize and not with LWP::UserAgent, which is supposed to accept the same callbacks? Is this a bug in the latter? With WWW::Mechanize, it did reduce the bandwidth considerably, but at the end of 10+ minutes I had no file. How does one keep the file, too?

      Blessings,

      ~Polyglot~

        The same code should work with LWP::UserAgent, because WWW::Mechanize inherits from it.

        If you want to keep the file, write the data to a file in the callback instead of just printing the length of the data you received.
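
        As a rough illustration of the advice above, here is a minimal sketch that uses plain LWP::UserAgent and writes each chunk to disk before pausing. The helper names make_throttled_writer and download_throttled, the chunk size, and the delay are made up for this example and are not from the thread; ':read_size_hint' is only a hint, so the resulting rate is approximate.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(sleep);    # allows fractional-second sleeps

# Build a ':content_cb' callback that appends each chunk to $fh and then
# pauses for $delay seconds. LWP passes the chunk as the first argument.
# (Hypothetical helper name, chosen for this sketch.)
sub make_throttled_writer {
    my ($fh, $delay) = @_;
    my $total = 0;
    return sub {
        my ($chunk) = @_;
        print {$fh} $chunk or die "write failed: $!";
        $total += length $chunk;
        sleep $delay if $delay > 0;
        return $total;    # LWP ignores this; handy when calling it directly
    };
}

# Hypothetical helper: fetch $url into $file, pausing $delay seconds after
# each chunk of (at most roughly) $chunk_size bytes.
sub download_throttled {
    my ($url, $file, $chunk_size, $delay) = @_;
    require LWP::UserAgent;    # loaded here so the writer can be used alone
    open my $fh, '>:raw', $file or die "Cannot open $file: $!";
    my $ua  = LWP::UserAgent->new;
    my $res = $ua->get(
        $url,
        ':read_size_hint' => $chunk_size,
        ':content_cb'     => make_throttled_writer($fh, $delay),
    );
    close $fh or die "Cannot close $file: $!";
    die 'Download failed: ' . $res->status_line . "\n"
        unless $res->is_success;
    # LWP traps a die inside the callback and records it in this header.
    if (my $err = $res->header('X-Died')) {
        die "Download aborted: $err\n";
    }
    return $res;
}

# Usage: perl throttle.pl URL OUTFILE
download_throttled($ARGV[0], $ARGV[1], 4096, 0.25) if @ARGV == 2;
```

        With these assumed values the transfer should stay at or below roughly 16 KiB per second (4096 bytes every quarter second, plus whatever time the reads themselves take). Since WWW::Mechanize inherits from LWP::UserAgent, the same callback should be usable with $mech->get as well.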