in reply to Re: HTTP response: 400 Bad Request
in thread HTTP response: 400 Bad Request

Thanks for your input. The odd thing was that it started all of a sudden. Maybe there was a change in the server configuration. As long as I get the files, I guess that's all that matters.


Best,

Joe

Re^3: HTTP response: 400 Bad Request
by justin423 (Scribe) on Jul 14, 2023 at 01:56 UTC
    Sorry for reviving a 5-year-old thread, but I am now getting a 400 Bad Request from the SEC site when I test downloading this file with File::Fetch and LWP: https://www.sec.gov/Archives/edgar/daily-index/2023/QTR3/form.20230712.idx

    The SEC does not allow botnets or automated tools to crawl the site. Any request that has been identified as part of a botnet or an automated tool outside of the acceptable policy will be managed to ensure fair access for all users. Please declare your user agent in request headers:

    Sample Declared Bot Request Headers:

        User-Agent: Sample Company Name AdminContact@<sample company domain>.com
        Accept-Encoding: gzip, deflate
        Host: www.sec.gov

      I believe you'll need to set a custom user agent header (with a contact email, as it shows) rather than using LWP's default. If you do that, I think it'll let you through (I want to say I had to do something similar once for something from the Treasury).
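For the LWP side, a minimal sketch of that suggestion might look like the following. The identity string, the helper names `make_ua` and `fetch_body`, and the script name in the usage comment are all made up for illustration; substitute your own details. It assumes LWP::Protocol::https is installed so that LWP can speak HTTPS. The Host header from the SEC's sample is not set here because LWP normally fills it in from the URL.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical identity string: replace with your own organisation and contact.
my $IDENTITY = 'Sample Company Name admin@example.com';

# Build a user agent that declares itself on every request it sends.
sub make_ua {
    my ($identity) = @_;
    my $ua = LWP::UserAgent->new(agent => $identity, timeout => 30);
    # Mirror the Accept-Encoding line from the SEC's sample headers.
    $ua->default_header('Accept-Encoding' => 'gzip, deflate');
    return $ua;
}

# Fetch a URL and return the decoded body, or die with the status line.
sub fetch_body {
    my ($ua, $url) = @_;
    my $res = $ua->get($url);
    die 'Fetch failed: ', $res->status_line, "\n" unless $res->is_success;
    return $res->decoded_content;   # undoes gzip/deflate if the server used it
}

# Usage: perl fetch_idx.pl <url>   (with no argument, no request is made)
if (my $url = shift @ARGV) {
    my $body = fetch_body(make_ua($IDENTITY), $url);
    printf "%d characters received\n", length $body;
}
```

Whether the server accepts a given identity string is up to the SEC; this only shows where the string goes.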

      The cake is a lie.

      but try first to set the agent string as they suggest, modify this to reflect yours: User-Agent: Sample Company Name AdminContact@<sample company domain>.com . They say "does not allow" and "managed to ensure..." so you may have a chance to do it by the book.

      Either way, this is how you set the agent string with File::Fetch:

      use File::Fetch;
      $File::Fetch::USER_AGENT = 'abc';
      my $ff = File::Fetch->new(uri => 'https://dnschecker.org/user-agent-info.php');
      $ff->fetch(to => './abc');

      bw, bliako

        well, this is interesting.

        this:

        $File::Fetch::USER_AGENT = 'COMPANYNAME validemail@validemail.com';

        came back as this

        Fetch failed! HTTP response: 400 Bad Request 400 Bad Request at useragent.pl line 8.

        then changing it to this:

        $File::Fetch::USER_AGENT = 'User-Agent: COMPANY validemail@validemail.com';

        Fetch failed! HTTP response: 403 Forbidden 403 Forbidden at useragent.pl line 8.
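One thing worth checking about the two attempts above is what header line each string actually produces. The sketch below assumes File::Fetch hands `$File::Fetch::USER_AGENT` to LWP's `agent` unchanged, which may not hold for every File::Fetch backend. The helper `ua_header_line` is a made-up name. It uses `prepare_request`, which only fills in headers, so nothing is sent over the network.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# Return the User-Agent header line LWP would send for a given agent string.
# prepare_request only copies the agent's default headers onto the request.
sub ua_header_line {
    my ($agent) = @_;
    my $ua  = LWP::UserAgent->new(agent => $agent);
    my $req = HTTP::Request->new(GET => 'https://www.sec.gov/');
    $ua->prepare_request($req);
    return 'User-Agent: ' . $req->header('User-Agent');
}

for my $agent (
    'COMPANY validemail@validemail.com',
    'User-Agent: COMPANY validemail@validemail.com',
) {
    print ua_header_line($agent), "\n";
}
# prints:
# User-Agent: COMPANY validemail@validemail.com
# User-Agent: User-Agent: COMPANY validemail@validemail.com
```

Under that assumption, the second string repeats the header name inside the value, so the first form is the one that matches the SEC's sample. This does not by itself explain why one attempt returned 400 and the other 403.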