in reply to Re: Using a Fetchrow with LWP
in thread Using a Fetchrow with LWP

So this works:

use File::Fetch;
use LWP::UserAgent ();
use DBI;
use parent 'HTTP::Message';

$mess = HTTP::Message->new();
$mess->encode(gzip,deflate);
$filename='/temp/edgar/workfile.txt';
unlink ($filename);
$url='https://www.sec.gov/Archives/edgar/data/1869467/0000919574-23-004048.txt';
my $ua = LWP::UserAgent->new(timeout => 10);
$ua->default_header('Accept-Encoding' =>$mess = HTTP::Message->new());
$ua->default_header( USER_AGENT =>'COMPANY admin@example.com' );
print "Now downloading the file...\n";
my $res = $ua->mirror( $url, $filename );

but this doesn't...

use LWP::UserAgent ();
use DBI;
use parent 'HTTP::Message';

$mess = HTTP::Message->new();
$mess->encode(gzip,deflate);
my $ua = LWP::UserAgent->new(timeout=>10);
$mess = HTTP::Message->new();
$mess->encode(gzip,deflate);
$ua->default_header('Accept Encoding'=>$mess=HTTP::Message->new());
$ua->default_header( USER_AGENT =>'COMPANY admin@example.com' );

my $SQL = "select url,filename from linktable";
my $sth = $dbh->prepare($SQL) or die "Prepare".$dbh->errstr;
$sth-> execute() or die "".$dbh->errstr;
while (my $row = $sth->fetchrow_arrayref) {
    my ($url,$filename)= ($row->[0],$row->[1]);
    print "\n$row[0] $row[1]\n";
    my $resp = $ua->mirror( $url,$filename);
    if ( $resp->{success} ) {
        print "OK\n";
    }
    else {
        print "Failure: $resp->{status}, $resp->{reason}\n";
    }
}

Re^3: Using a Fetchrow with LWP
by marto (Cardinal) on Jul 19, 2023 at 07:00 UTC

    "but this doesn't..."

    use LWP::UserAgent ();
    use DBI;
    use parent 'HTTP::Message';

    $mess = HTTP::Message->new();
    $mess->encode(gzip,deflate);
    my $ua = LWP::UserAgent->new(timeout=>10);
    $mess = HTTP::Message->new();
    $mess->encode(gzip,deflate);
    $ua->default_header('Accept Encoding'=>$mess=HTTP::Message->new());
    $ua->default_header( USER_AGENT =>'COMPANY admin@example.com' );

    my $SQL = "select url,filename from linktable";
    my $sth = $dbh->prepare($SQL) or die "Prepare".$dbh->errstr;
    $sth-> execute() or die "".$dbh->errstr;
    while (my $row = $sth->fetchrow_arrayref) {
        my ($url,$filename)= ($row->[0],$row->[1]);
        print "\n$row[0] $row[1]\n";
        my $resp = $ua->mirror( $url,$filename);
        if ( $resp->{success} ) {
            print "OK\n";
        }
        else {
            print "Failure: $resp->{status}, $resp->{reason}\n";
        }
    }

    No strict, no warnings, no creation of a database handle object. It's almost as though you've ignored everything 1nickt has provided in this thread...

      The error I am getting is 403 Forbidden.

      So the issue is not with the database-related code; it is with the HTML::Tiny or LWP settings.

      Here is the output with use warnings on:

      Unquoted string "gzip" may clash with future reserved word at testdown.pl line 9.
      Unquoted string "deflate" may clash with future reserved word at testdown.pl line 9.
      Unquoted string "gzip" may clash with future reserved word at testdown.pl line 30.
      Unquoted string "deflate" may clash with future reserved word at testdown.pl line 30.
      Unquoted string "gzip" may clash with future reserved word at testdown.pl line 33.
      Unquoted string "deflate" may clash with future reserved word at testdown.pl line 33.
      Failure: 403, Forbidden
      Failure: 403, Forbidden
      Failure: 403, Forbidden
      Failure: 403, Forbidden
      Failure: 403, Forbidden
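
For reference, a minimal sketch of how "Failure: 403, Forbidden" style lines can be produced from an LWP result. LWP::UserAgent's get() and mirror() return an HTTP::Response object, so the status is read through methods such as is_success, code and message; the hash-style $resp->{success} access is the HTTP::Tiny convention. The describe() helper is a hypothetical name, and the responses are built by hand so that the example runs offline.

```perl
use strict;
use warnings;
use HTTP::Response ();

# Hypothetical helper: summarise an HTTP::Response in one line.
sub describe {
    my ($res) = @_;
    return 'OK' if $res->is_success;    # true for 2xx status codes
    return 'Failure: ' . $res->code . ', ' . $res->message;
}

# Hand-built responses stand in for what $ua->mirror(...) would return.
my $ok        = HTTP::Response->new( 200, 'OK' );
my $forbidden = HTTP::Response->new( 403, 'Forbidden' );

print describe($ok),        "\n";
print describe($forbidden), "\n";
```

Note that mirror() answers 304 Not Modified when the local file is already current, and is_success is false for that code, so a real loop may want to treat 304 separately.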

      This is what the Securities and Exchange Commission says about downloading from them:

      Fair access
      Current max request rate: 10 requests/second.
      To ensure everyone has equitable access to SEC EDGAR content, please use efficient scripting. Download only what you need and please moderate requests to minimize server load.
      SEC reserves the right to limit request rates to preserve fair access for all users. See our Internet Security Policy for our current rate request limit.
      The SEC does not allow botnets or automated tools to crawl the site. Any request that has been identified as part of a botnet or an automated tool outside of the acceptable policy will be managed to ensure fair access for all users.
      Please declare your user agent in request headers:
      Sample Declared Bot Request Headers:
      User-Agent: Sample Company Name AdminContact@<sample company domain>.com
      Accept-Encoding: gzip, deflate
      Host: www.sec.gov
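
The sample headers above could be declared with LWP::UserAgent roughly as follows. This is a sketch under assumptions, not a confirmed fix for the 403: the agent string is a placeholder, save_url() is a hypothetical helper, and no request is sent unless save_url() is called. The points it illustrates are that the header name is hyphenated ('Accept-Encoding', not 'Accept Encoding') and that the value is a plain string rather than an HTTP::Message object.

```perl
use strict;
use warnings;
use LWP::UserAgent ();

# Assumption: placeholder identity; substitute your own company and contact.
my $agent = 'Sample Company Name admin@example.com';

my $ua = LWP::UserAgent->new(
    agent   => $agent,    # sent as the User-Agent header
    timeout => 10,
);

# The header name is hyphenated and the value is a plain string.
$ua->default_header( 'Accept-Encoding' => 'gzip, deflate' );

# Hypothetical helper: fetch $url and save the decompressed body to $filename.
sub save_url {
    my ( $ua, $url, $filename ) = @_;
    my $res = $ua->get($url);
    return $res unless $res->is_success;

    # charset => 'none' undoes gzip/deflate but leaves the bytes otherwise untouched.
    my $body = $res->decoded_content( charset => 'none' );
    defined $body or die "Cannot decode response from $url";
    open my $fh, '>:raw', $filename or die "Cannot write $filename: $!";
    print {$fh} $body;
    close $fh or die "Cannot close $filename: $!";
    return $res;
}

# Show what will be sent, without touching the network.
print 'User-Agent: ',      $ua->agent, "\n";
print 'Accept-Encoding: ', $ua->default_header('Accept-Encoding'), "\n";
```

The helper uses get() plus decoded_content() rather than mirror() because, as far as I understand it, mirror() stores the body as received, so a gzip-encoded response would end up compressed on disk.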

        The line numbers in the warnings you've posted don't exist in the code you've posted. The sensible approach is to post an SSCCE when asking for help.

        $mess = HTTP::Message->new();
        $mess->encode(gzip,deflate);
        my $ua = LWP::UserAgent->new(timeout=>10);
        $mess = HTTP::Message->new();# Again
        $mess->encode(gzip,deflate); # ?

        Why are you doing this twice? Why not take steps to fix problems before posting code here? With strict/warnings:

        marto@Marto-Desktop:~/code/perlmonks$ perl -c junk.pl
        Global symbol "$mess" requires explicit package name (did you forget to declare "my $mess"?) at junk.pl line 6.
        Global symbol "$mess" requires explicit package name (did you forget to declare "my $mess"?) at junk.pl line 7.
        Global symbol "$mess" requires explicit package name (did you forget to declare "my $mess"?) at junk.pl line 9.
        Global symbol "$mess" requires explicit package name (did you forget to declare "my $mess"?) at junk.pl line 10.
        Global symbol "$mess" requires explicit package name (did you forget to declare "my $mess"?) at junk.pl line 11.
        Global symbol "$dbh" requires explicit package name (did you forget to declare "my $dbh"?) at junk.pl line 14.
        Global symbol "$dbh" requires explicit package name (did you forget to declare "my $dbh"?) at junk.pl line 14.
        Global symbol "$dbh" requires explicit package name (did you forget to declare "my $dbh"?) at junk.pl line 15.
        Global symbol "@row" requires explicit package name (did you forget to declare "my @row"?) at junk.pl line 18.
        Global symbol "@row" requires explicit package name (did you forget to declare "my @row"?) at junk.pl line 18.
        junk.pl had compilation errors.
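
A minimal sketch of the database side that compiles cleanly under strict and warnings: the handle is created and declared with my, and the row is dereferenced as $row->[0] (or @$row) instead of $row[0], which refers to an undeclared array @row. The in-memory SQLite database, which needs DBD::SQLite, and the sample row are assumptions for illustration only; the real DSN is never shown in this thread.

```perl
use strict;
use warnings;
use DBI;

# Assumption: an in-memory SQLite database stands in for the real one.
my $dbh = DBI->connect( 'dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, PrintError => 0 } );

# Hypothetical sample data so that the loop has something to read.
$dbh->do('CREATE TABLE linktable (url TEXT, filename TEXT)');
$dbh->do( 'INSERT INTO linktable (url, filename) VALUES (?, ?)',
    undef, 'https://www.example.com/a.txt', 'a.txt' );

my $sth = $dbh->prepare('SELECT url, filename FROM linktable');
$sth->execute;    # RaiseError makes failures die, so no "or die" is needed

my @jobs;
while ( my $row = $sth->fetchrow_arrayref ) {
    # $row is an array reference: copy its elements out before the next fetch.
    my ( $url, $filename ) = @$row;
    print "$url $filename\n";
    push @jobs, [ $url, $filename ];
}
$dbh->disconnect;
```

Collecting the pairs in @jobs first keeps the database loop separate from the downloads, which makes it easier to see which half is failing.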

        Update: also, it's good form to mark significant updates to a post.

        "These is what Securities and Exchange Commission says about downloading from them"

        I know.