in reply to Re^2: HTTP response: 400 Bad Request
in thread HTTP response: 400 Bad Request

Sorry for reviving a 5-year-old thread, but I am now getting a 400 Bad Request from the SEC site. I am testing a download of this file with both File::Fetch and LWP: https://www.sec.gov/Archives/edgar/daily-index/2023/QTR3/form.20230712.idx

The SEC does not allow botnets or automated tools to crawl the site. Any request that has been identified as part of a botnet or an automated tool outside of the acceptable policy will be managed to ensure fair access for all users. Please declare your user agent in request headers:

Sample Declared Bot Request Headers:

    User-Agent: Sample Company Name AdminContact@<sample company domain>.com
    Accept-Encoding: gzip, deflate
    Host: www.sec.gov

Re^4: HTTP response: 400 Bad Request
by Fletch (Bishop) on Jul 14, 2023 at 02:16 UTC

    I believe you'll need to set a custom user agent header (with a contact email, as the message shows) rather than using LWP's default. If you do that, I think it'll let you through (I want to say I had to do something similar once for something from the Treasury).
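To make the difference between LWP's default agent and a declared one concrete, here is a minimal sketch. The company name and email address are placeholders (assumptions, not values from this thread), and nothing here contacts the network; it only shows what each client would send as its User-Agent.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical contact details; replace with your own company and address.
my $declared_agent = 'Sample Company Name admincontact@example.com';

my $default_ua = LWP::UserAgent->new;
my $custom_ua  = LWP::UserAgent->new(agent => $declared_agent);

# The default client identifies itself as libwww-perl/<version>.
print "default: ", $default_ua->agent, "\n";
print "custom:  ", $custom_ua->agent,  "\n";
```

Note that the string passed as `agent` is only the header value, so it should not itself begin with "User-Agent:".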

    The cake is a lie.

Re^4: HTTP response: 400 Bad Request
by bliako (Abbot) on Jul 14, 2023 at 07:29 UTC

    But first try setting the agent string as they suggest, modifying it to reflect your own details: User-Agent: Sample Company Name AdminContact@<sample company domain>.com . They say "does not allow" and "managed to ensure...", so you may have a chance to do it by the book.

    Either way, this is how you set the agent string with File::Fetch:

    use File::Fetch;
    $File::Fetch::USER_AGENT = 'abc';
    my $ff = File::Fetch->new(uri => 'https://dnschecker.org/user-agent-info.php');
    $ff->fetch(to => './abc');

    bw, bliako

      Well, this is interesting.

      This:

      $File::Fetch::USER_AGENT = 'COMPANYNAME validemail@validemail.com';

      came back as this:

      Fetch failed! HTTP response: 400 Bad Request 400 Bad Request at useragent.pl line 8.

      Then changing it to this:

      $File::Fetch::USER_AGENT = 'User-Agent: COMPANY validemail@validemail.com';

      came back as this:

      Fetch failed! HTTP response: 403 Forbidden 403 Forbidden at useragent.pl line 8.

        OK, almost there. This worked under LWP:

        $ua->default_header('Accept-Encoding' => scalar HTTP::Message::decodable());

        $ua->default_header( USER_AGENT =>

        So now all I have to figure out is how to carry those settings over to File::Fetch.
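File::Fetch documents `$File::Fetch::USER_AGENT`, but I am not aware of a documented way to pass extra headers such as Accept-Encoding through it. One alternative, sketched below, is to stay with LWP and write the response to a file yourself. The helper names `build_ua` and `fetch_to_file`, the agent string, and the default output name `form.idx` are all assumptions made for illustration; this is not File::Fetch's API.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Message;

# Build a client carrying the two headers named in the SEC notice.
sub build_ua {
    my ($agent) = @_;
    my $ua = LWP::UserAgent->new(agent => $agent);
    $ua->default_header('Accept-Encoding' => scalar HTTP::Message::decodable());
    return $ua;
}

# Fetch $url and write the decoded body to $file; dies on any failure.
sub fetch_to_file {
    my ($ua, $url, $file) = @_;
    my $res = $ua->get($url);
    die "Fetch failed: ", $res->status_line, "\n" unless $res->is_success;

    # charset => 'none' undoes gzip/deflate but leaves the bytes otherwise as-is.
    my $body = $res->decoded_content(charset => 'none');
    die "Could not decode response body\n" unless defined $body;

    open my $fh, '>:raw', $file or die "Cannot write $file: $!\n";
    print {$fh} $body;
    close $fh or die "Cannot close $file: $!\n";
    return $file;
}

# Hypothetical agent string; replace with your own company and contact address.
my $ua = build_ua('Sample Company Name admincontact@example.com');

# Only touches the network when a URL is given on the command line.
if (my $url = shift @ARGV) {
    my $out = shift(@ARGV) // 'form.idx';
    print "Saved ", fetch_to_file($ua, $url, $out), "\n";
}
```

Saved as, say, `secfetch.pl` (a made-up name), it would be run as `perl secfetch.pl <url> [output-file]`. Writing `decoded_content` rather than the raw content matters here because, with Accept-Encoding set, the server may send the body compressed.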