rendler has asked for the wisdom of the Perl Monks concerning the following question:

Currently I'm unable to capture wget's progress bar (the one with the arrow). It doesn't appear to be using STDOUT or STDERR, so this is not working:
open WGET, "wget foo.moo 2>&1 |" or die "Couldn't run wget: $!\n";
while(<WGET>){

Replies are listed 'Best First'.
Re: Capture wget progress bar
by tachyon (Chancellor) on Mar 15, 2003 at 13:13 UTC

    Presumably the REASON you want to do this is so you can report how the download is going. If so, you will find it easier to do it with a socket. Say you want something via HTTP:

    #!/usr/bin/perl
    use IO::Socket::INET;

    my $debug = 1;
    my $get   = 'http://www.mysite.com/index.html';
    my ( $domain, $stuff ) = $get =~ m!http://([^/]+)(.*)$!;

    # First request: HEAD, to learn the Content-Length
    $sock = IO::Socket::INET->new(
        PeerAddr => $domain,
        PeerPort => 80,
        Proto    => 'tcp'
    );
    die "No socket" unless $sock;
    print "Got sock\n";

    my ( $content, $buffer );
    print $sock "HEAD $get HTTP/1.0\015\012\015\012";
    $content .= $buffer while ( read ( $sock, $buffer, 1024 ) );
    print $content if $debug;
    die $content unless $content =~ m/200 OK/;
    my ($content_length) = $content =~ m/^Content-Length:\s*(\d+)/m;
    print "Content-Length: $content_length\n" if $debug;

    # Second request: GET, reporting progress as each chunk arrives
    $content = '';
    $sock = IO::Socket::INET->new(
        PeerAddr => $domain,
        PeerPort => 80,
        Proto    => 'tcp'
    );
    binmode $sock;
    print $sock "GET $get HTTP/1.0\015\012\015\012";
    my $found_crlf = 0;
    while ( read ( $sock, $buffer, 1024 ) ) {
        $content .= $buffer;
        # we need to chop off the header at the first CRLFCRLF
        if ( ! $found_crlf and index($content, "\015\012\015\012") > 0 ) {
            $content = substr $content, (index $content, "\015\012\015\012") + 4;
            $found_crlf = 1;
        }
        # %% prints a literal percent sign
        printf "Got %.2f%%\n", ( 100 * length($content)/$content_length ) if $debug;
    }
    print $content if $debug;

    __DATA__
    Got sock
    HTTP/1.1 200 OK
    Date: Sat, 15 Mar 2003 13:07:35 GMT
    Server: Apache/1.3.26 (Unix) PHP/4.1.2
    Last-Modified: Fri, 17 Jan 2003 01:08:19 GMT
    ETag: "1d0f53-a36-3e275783"
    Accept-Ranges: bytes
    Content-Length: 2614
    Connection: close
    Content-Type: text/html
    X-Pad: avoid browser bug

    Content-Length: 2614
    Got 28.27%
    Got 67.44%
    Got 100.00%
    <html>
    <head>
    [blah]

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

      That's one of the reasons; the others are resuming, cookies, and proxy support. All are possible with native Perl code, but not as easily, IMO. The only thing I can't get to work properly is the progress bar, which looks VERY nice in wget.

        FWIW you can easily make a progress bar by printing backspace chars or a carriage return so you can overprint.

        cheers

        tachyon

        s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
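
        The overprinting idea can be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name progress_bar, the 40-column default width, and the simulated 1024-byte chunks are all assumptions (the 2614-byte total is borrowed from the sample output earlier in the thread).

```perl
use strict;
use warnings;

# Build one frame of a text progress bar, e.g. "[====>     ]  50%".
# $done and $total are byte counts; $width is the bar's inner width.
# (progress_bar is a hypothetical helper, not part of any module.)
sub progress_bar {
    my ( $done, $total, $width ) = @_;
    $width ||= 40;
    my $fraction = $total ? $done / $total : 0;
    $fraction = 1 if $fraction > 1;
    my $filled = int( $fraction * $width );
    my $bar    = '=' x $filled;

    # Turn the last '=' into an arrow head while the download is in progress.
    substr( $bar, -1, 1 ) = '>' if $filled > 0 and $filled < $width;

    return sprintf "[%-*s] %3d%%", $width, $bar, int( $fraction * 100 );
}

# Demo: pretend a 2614-byte download arrives in 1024-byte chunks.
$| = 1;    # flush after every print so each frame shows up immediately
my $total = 2614;
for ( my $got = 0; $got < $total; $got += 1024 ) {
    # "\r" moves the cursor back to column 0, so the next frame overprints
    print "\r", progress_bar( $got, $total );
}
print "\r", progress_bar( $total, $total ), "\n";
```

        In the socket loop above, the same call could replace the printf line, passing length($content) and $content_length.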

Re: Capture wget progress bar
by lacertus (Monk) on Mar 15, 2003 at 09:18 UTC
    I am unfamiliar with wget; however, I think your problem might lie in the fact that the output from wget is going into the buffer and not directly to output. Say you want a simple loading sequence:
    print "Loading"; for(1..10) {print '.'; sleep(1);}
    That is meant to print "Loading.........." with the periods appearing one at a time, at one-second intervals. It won't actually happen, because the output goes into the buffer first. Instead, set Perl's special variable $| (that's a "pipe") to 1, i.e. "$|=1", and output will go straight to stdout. Afterwards, if so inclined, set it back to 0 for regular buffering.

    Hope this helps,
    Lacertus
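
    A minimal sketch of the $| advice above, under some assumptions: the sub name loading_dots and its delay parameter are made up for illustration, and the four-argument select is used only to get a sub-second pause. Note that $| controls Perl's own output buffer for the currently selected handle; it does not change how an external program such as wget buffers its output.

```perl
use strict;
use warnings;

# Print "Loading" followed by $count dots, one dot every $delay seconds.
# (loading_dots is a hypothetical name used only for this sketch.)
sub loading_dots {
    my ( $count, $delay ) = @_;

    # Autoflush STDOUT while this sub runs; 'local' restores the old
    # value of $| automatically when the sub returns.
    local $| = 1;

    print "Loading";
    for ( 1 .. $count ) {
        print '.';
        select( undef, undef, undef, $delay );    # sleep for a fraction of a second
    }
    print "\n";
    return $count;
}

loading_dots( 10, 0.2 );
```

    Without the $| line, the dots would typically appear all at once when the buffer is flushed, at least when STDOUT is not a terminal.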
Re: Capture wget progress bar
by crenz (Priest) on Mar 15, 2003 at 11:45 UTC

    Not sure what you mean by "progress bar". When doing

    #!/usr/bin/perl
    $| = 1;
    open OUT, ">out.txt";
    open WGET, "wget http://www.perlmonks.com 2>&1 |" or die "$!\n";
    print OUT "$_" while (<WGET>);
    close WGET;
    close OUT;

    I am getting the same in out.txt as I would via the command line:

    $ cat out.txt
    --12:39:46--  http://www.perlmonks.com/
               => `index.html'
    Verbindungsaufbau zu www.perlmonks.com:80... verbunden!
    HTTP Anforderung gesendet, auf Antwort wird gewartet... 200 OK
    Länge: nicht spezifiziert [text/html]

        0K .......... .......... .......... .......... .......... @ 480.77 KB/s
       50K .......... .......... ...                              @  11.39 MB/s

    12:39:48 (691.69 KB/s) - »index.html« gespeichert [75079]

      You appear to have an old version of wget. From the manpage of 1.8.1:
      --progress=type

      Select the type of the progress indicator you wish to use. Legal indicators are dot and bar.

      ...

      Specifying --progress=bar will draw a nice ASCII progress bar graphics (a.k.a ``thermometer'' display) to indicate retrieval. If the output is not a TTY, this option will be ignored, and Wget will revert to the dot indicator. If you want to force the bar indicator, use --progress=bar:force.

      Here's what I get:
      $ wget --progress=bar:force www.perlmonks.org 2>&1 | less
      --13:18:17--  http://www.perlmonks.org/
                 => `index.html'
      Resolving www.perlmonks.org... done.
      Connecting to www.perlmonks.org[66.39.54.27]:80... connected.
      HTTP request sent, awaiting response... 200 OK
      Length: unspecified [text/html]

      ^M [<=> ] 0 --.--K/s
      ^M [ <=> ] 12,889 42.81K/s
      ^M [ <=> ] 63,569 125.67K/s
      ^M [ <=> ] 75,054 148.07K/s

      13:18:19 (148.07 KB/s) - `index.html' saved [75054]

      Makeshifts last the longest.

Re: Capture wget progress bar
by deadkarma (Monk) on Mar 15, 2003 at 17:09 UTC
    It looks like wget's progress bar prints carriage returns (\r) instead of newlines. Maybe try changing the default input record separator ($/) to "\r".
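
    A rough sketch of the $/ suggestion, assuming the bar updates carry a leading percentage (as the /^\d+%/ match elsewhere in this thread implies). The sub name read_progress and the sample text are invented for illustration; the sample is not real wget output.

```perl
use strict;
use warnings;

# Read bar-style progress updates from $fh, splitting records on "\r"
# instead of "\n", and return the percentages seen, in order.
# (read_progress is a hypothetical helper for this sketch.)
sub read_progress {
    my ($fh) = @_;
    local $/ = "\r";    # a record now ends at a carriage return
    my @percent;
    while ( my $chunk = <$fh> ) {
        chomp $chunk;    # chomp strips the current $/, i.e. the "\r"
        push @percent, $1 if $chunk =~ /(\d+)%/;
    }
    return @percent;
}

# Made-up sample that imitates "\r"-separated bar updates.
my $sample = "Length: 2,614 [text/html]\n"
           . "\r 28% [==>   ] 739"
           . "\r 67% [====> ] 1,763"
           . "\r100% [======] 2,614\n";

open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
print join( ', ', read_progress($fh) ), "\n";    # prints "28, 67, 100"
close $fh;
```

    To try it against wget itself, the same sub could be handed a pipe opened as in the original question, with --progress=bar:force added so the bar is drawn even though the output is not a TTY.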
      Someone on Perl beginners suggested that it might be writing directly to the console and therefore not using an IO stream.
      Someone else also suggested using the logging facility of wget, and I came up with this.
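      sub wget {
          my $file = shift;

          # NOTE: $log_f (the wget log path) is not defined in this snippet;
          # it is assumed to be set elsewhere in the script.
          $SIG{INT} = sub { unlink $log_f; exit };

          if (my $pid = fork) {
              # parent: run wget, writing its progress bar to the log file
              system "wget -o $log_f --progress=bar:force -c $file";
          }
          else {
              die "cannot fork: $!" unless defined $pid;
              # child: falls through and follows the log below
          }

          sleep 1 until -f $log_f;
          open LOG, $log_f or die "Couldn't open '$log_f': $!\n";

          my ($pos, $length, $status);
          TAIL: while (1) {
              for ($pos = tell LOG; $_ = <LOG>; $pos = tell LOG) {
                  s/^\s+//;
                  if (/^Length: ([\d,]+)/) {
                      print "Downloading: $file [$1] bytes.\n";
                  }
                  elsif (/^\d+%/) {
                      print "$_\r";
                      $status = 'downloading';
                  }
                  # 'defined' binds tighter than 'eq', so test both explicitly
                  elsif (defined $status and $status eq 'downloading' and !/^\d+%/) {
                      unlink $log_f;
                      print "\n";
                      last TAIL;    # leave the outer loop, not just the for
                  }
              }
              sleep 1;
              seek LOG, $pos, 0;    # reset EOF so newly appended lines can be read
          }
      }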
      Of course, the error handling in a version like that is not too great.