Polyglot has asked for the wisdom of the Perl Monks concerning the following question:

I've done the "Super Search"; I've searched online; and apart from the "common sense" suggestions such as compressing the files before transfer, the "best" suggestions all led away from a pure-Perl solution; e.g. use wget or rsync or cURL, etc. A few stray suggestions involved hacking Net::FTP and adding a "sleep" call to the code.

Links to some of those prior discussions:

But most of those suggestions are many years old. Even shorewall-perl never made it simple or easy, as its documentation explains here: Traffic Shaping/Control -- and, of course, shorewall worked by configuring Linux's iptables.

What can one do with Perl now? Is it possible to rate-limit one's downloads via a simple Perl command or module so as to play nicely with the server's resources?

Blessings,

~Polyglot~

Replies are listed 'Best First'.
Re: Bandwidth limiting for file downloads: What can Perl do?
by Corion (Patriarch) on Apr 29, 2022 at 09:21 UTC

    I've looked at such limiters previously, and most of the things I find are request-rate limiters (those are more central to my interests, too). There is HTTP::Tiny::Bandwidth, which claims to do bandwidth limiting, but I haven't used it. Maybe it would work for you to slow down downloads.

    For hacking this into LWP::UserAgent, simply doing a short sleep (say, half a second via Time::HiRes) in the ':content_cb' callback would lower the bandwidth used, but the transfer will still be quite bursty. I'm not aware of any convenient solution there, or for the async toolkits, so if you find anything relevant, a post about it would be welcome!
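    A minimal sketch of that approach (my own code, not a tested solution): it uses LWP::UserAgent's documented ':content_cb' and ':read_size_hint' options and pauses after each chunk so the average rate stays near a target. The helper name, the output filename, and the 50 KB/s target are my own choices for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use Time::HiRes qw(time sleep);    # fractional seconds for both

# Seconds to pause so that $bytes transferred in $elapsed seconds
# averages out to at most $bps bytes per second (0 if we are
# already at or below the target rate).
sub throttle_delay {
    my ( $bytes, $elapsed, $bps ) = @_;
    my $target = $bytes / $bps;    # time the transfer *should* have taken
    my $delay  = $target - $elapsed;
    return $delay > 0 ? $delay : 0;
}

# Download a URL given on the command line at roughly 50 KB/s.
if ( my $url = shift @ARGV ) {
    my $ua    = LWP::UserAgent->new;
    my $bps   = 50 * 1024;
    my $start = time;
    my $total = 0;
    open my $out, '>', 'download.dat' or die "open: $!";
    binmode $out;
    my $res = $ua->get(
        $url,
        ':read_size_hint' => 8192,    # ask for smallish chunks
        ':content_cb'     => sub {
            my ($chunk) = @_;
            print {$out} $chunk;
            $total += length $chunk;
            sleep throttle_delay( $total, time - $start, $bps );
        },
    );
    close $out;
    die $res->status_line unless $res->is_success;
}
```

    The callback keys really are passed as ':content_cb' style arguments to get(); the throttling itself is just arithmetic plus Time::HiRes::sleep, so the transfer still arrives in bursts of one chunk at a time.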

      Conceivably, Mojo::UserAgent might have features that could help with this.
      I appreciate the lead. If you or someone else could give me a clue as to where to add this delay and how to set up the callback, it would be much appreciated. I'm unfamiliar with the methods used by LWP::UserAgent. When I tried the special ':content_cb' => \&callback mechanism, with "callback" being a subroutine that called usleep, all I got for my effort was a downloaded file named /content_cb -- hardly the expected behavior. I'm obviously clueless, even after reading the documentation. Given a hint as to how to start, I'd like to see whether anything useful can be done to rate-limit the connection. I've already spent a few hours playing with it, but couldn't get past even the first step.

      Update: I found a question asked some years ago on SO that posted a code snippet invoking the ':content_cb' callback, and I tried it. It changes absolutely nothing, except that the downloaded file is misnamed (it becomes "/content_cb" instead of the .html file it should be). There is no change in speed whatsoever, regardless of the number passed to "usleep". Adding a print line produced nothing as well -- no output to the screen -- so I don't think the subroutine is actually running; it seems LWP::UserAgent is conflating the ':content_cb' token with the ':content_file' one.

      Blessings,

      ~Polyglot~

        The following program downloads a file very slowly. Maybe this gets you started.

        #!/usr/bin/perl -w
        use strict;
        use 5.020;
        use feature 'signatures';
        no warnings 'experimental::signatures';
        use WWW::Mechanize;

        my $mech = WWW::Mechanize->new();
        my $large_url = 'http://ftp.acc.umu.se/mirror/wikimedia.org/dumps/dewiki/20220420/dewiki-20220420-abstract.xml.gz';
        $| = 1;

        my $read_size = 0;
        $mech->get(
            $large_url,
            ':read_size_hint' => 4096,
            ':content_cb'     => sub {
                my $len = length( $_[0] );
                $read_size += $len;
                print "\r$len - $read_size bytes";
                # discard the content
                sleep 1;
            },
        );
        say 'done';