in reply to Re^6: Testing with Test::Mock::HTTP::Tiny
in thread Testing with Test::Mock::HTTP::Tiny

I don't know what goes on in WWW::Crawl. You have the added complication that you change the domain name. I would start from the Mock module and add debugging messages to see which URLs/domains it has in store. What I understand you did is: 1) get data from domain A. 2) save it to Mock module with different domain B (how???). 3) try to retrieve the mock data using domain B OR fetch it fresh. 4) croak. Well, it looks like step 3 fails: the URL is neither in the Mock module nor out there on the web to be fetched fresh. The latter is certain because the domain is 'fake'. So either the data is not in store or the domain change failed.
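As a rough sketch of the debugging step suggested above, the following lists what a mock store holds and checks whether a given request is in it. The store here is a hand-made stand-in: its shape (a list of records with `method` and `url` keys), the helper names `stored_requests` and `is_mocked`, and the `domain-a.test` / `domain-b.test` hosts are all assumptions for illustration, not something taken from Test::Mock::HTTP::Tiny itself.

```perl
use strict;
use warnings;

# Hypothetical mock store. ASSUMPTION: the real store is a list of records,
# each carrying at least the request method and the full URL.
my $mocked = [
    { method => 'GET', url => 'https://domain-b.test/' },
    { method => 'GET', url => 'https://domain-b.test/about' },
];

# Return "METHOD URL" strings for everything in the store, sorted.
sub stored_requests {
    my ($data) = @_;
    return sort map { "$_->{method} $_->{url}" } @$data;
}

# Count the records that match a request exactly (string comparison).
sub is_mocked {
    my ($data, $method, $url) = @_;
    return scalar grep { $_->{method} eq $method && $_->{url} eq $url } @$data;
}

# Debugging output goes to STDERR so it does not disturb test output.
print STDERR "in store: $_\n" for stored_requests($mocked);
print STDERR "domain A still requested? ",
    ( is_mocked($mocked, 'GET', 'https://domain-a.test/') ? 'yes' : 'no' ), "\n";
```

Because the comparison in this sketch is an exact string match, a request for a URL that differs only by a trailing slash would count as "not in store".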

Oh, in all my tests above I never changed the domain.

1min edit: could it be that the HTTP::Tiny inside WWW::Crawl was not affected by Mock (which adds a callback for 'request')? In that case, if WWW::Crawl is yours, add an option to pass an $http object in rather than creating a fresh one inside the module. Long shot but I have no idea how this kind of inter-package interaction works.
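A minimal sketch of that "pass an $http in" idea, under stated assumptions: `My::Crawler` is a hypothetical stand-in (not the real WWW::Crawl), the `http` constructor option and the `fetch` method are invented for illustration, and `My::FakeHTTP` is a hand-rolled object that only imitates the response hash HTTP::Tiny returns from get().

```perl
use strict;
use warnings;

package My::Crawler;    # hypothetical stand-in, not the real WWW::Crawl

sub new {
    my ($class, %args) = @_;
    # Use the injected object if given, otherwise build a real HTTP::Tiny.
    my $http = $args{http} // do { require HTTP::Tiny; HTTP::Tiny->new };
    return bless { http => $http }, $class;
}

sub fetch {
    my ($self, $url) = @_;
    my $res = $self->{http}->get($url);
    return $res->{success} ? $res->{content} : undef;
}

package My::FakeHTTP;   # hypothetical test double with a get() method

sub new {
    my ($class, %pages) = @_;
    return bless { pages => \%pages }, $class;
}

sub get {
    my ($self, $url) = @_;
    my $body = $self->{pages}{$url};
    return defined $body
        ? { success => 1, status => 200, reason => 'OK', content => $body, url => $url }
        : { success => 0, status => 599, reason => 'Internal Exception',
            content => "no mock for $url", url => $url };
}

package main;

my $fake    = My::FakeHTTP->new('https://www.testing.crawl/' => '<html>mock</html>');
my $crawler = My::Crawler->new(http => $fake);

my $hit  = $crawler->fetch('https://www.testing.crawl/');         # served by the fake
my $miss = $crawler->fetch('https://www.testing.crawl/missing');  # undef, nothing stored
```

With this shape the test never depends on whether a mocking module managed to reach the HTTP::Tiny object created inside another package.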

Re^8: Testing with Test::Mock::HTTP::Tiny
by Bod (Parson) on Sep 29, 2023 at 20:05 UTC
    I don't know what goes on in WWW::Crawl

    This is now on GitHub

    2) save it to Mock module with different domain B (how???)

    I took the data from the domain and replaced the domain it came from, "https://www.onradar.uk" [1], with the test domain "https://www.testing.crawl" using a simple search and replace. The replacement also changed the email domain in the page source, but that's probably not relevant.

    [1] I've changed from using "www.way-finder.uk" in the original question to using "www.onradar.uk" because the amount of data returned in the HTML source is smaller.
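For illustration, a search and replace of this kind might look like the sketch below. The record layout (`url` and `content` keys), the sample HTML, and the helper name `swap_domain` are assumptions; only the two domain strings come from the post above.

```perl
use strict;
use warnings;

# Replace every literal occurrence of $from with $to.
# \Q...\E quotes the dots so they are not treated as regex wildcards.
sub swap_domain {
    my ($text, $from, $to) = @_;
    (my $out = $text) =~ s/\Q$from\E/$to/g;
    return $out;
}

my $old = 'https://www.onradar.uk';
my $new = 'https://www.testing.crawl';

# Hypothetical captured record: the URL it was fetched from plus the page body.
my %record = (
    url     => "$old/",
    content => qq{<a href="$old/contact">Contact</a>},
);

# Apply the same replacement to every field, so the stored URL and the
# links inside the body end up on the same domain.
my %mock = map { $_ => swap_domain($record{$_}, $old, $new) } keys %record;
```

If only the body were rewritten and the stored URL kept the old domain (or the other way round), a lookup by the new domain would find nothing, which would be consistent with the failure described at step 3.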

    Oh, in all my tests above I never changed the domain.

    I've changed the domain to one that doesn't exist on t'interweb so I can be sure the test code is using the mocked data and not going off and fetching a fresh page.

      I've changed the domain to one that doesn't exist on t'interweb so I can be sure the test code is using the mocked data and not going off and fetching a fresh page.

      Sooner or later, a troll will register the domain. Better either use a domain that you control, or use one of the domains reserved for purposes like this (see RFC2606):

      Reserved Top-Level Domains (Quoting RFC2606):

      • ".test" is recommended for use in testing of current or new DNS related code.
      • ".example" is recommended for use in documentation or as examples.
      • ".invalid" is intended for use in online construction of domain names that are sure to be invalid and which it is obvious at a glance are invalid.
      • The ".localhost" TLD has traditionally been statically defined in host DNS implementations as having an A record pointing to the loop back IP address and is reserved for such use. Any other use would conflict with widely deployed code which assumes this use.

      Reserved Second-Level Domains (again quoting RFC2606):

      • example.com
      • example.net
      • example.org

      RFC6761 updates RFC2606 with best practices.

      If you fake domain data, you should probably use the .test or .invalid TLDs. If you have no better idea, use something like bod-example-1.test, bod-example-2.test, and so on. You may want to name your domains to match your tests, e.g. working-host.invalid, wrong-cert.invalid, and so on.
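A test suite could guard against accidentally using a registrable domain with a small check like this sketch. The function name `is_reserved_host` is hypothetical, and the lists cover only the names quoted from RFC2606 above.

```perl
use strict;
use warnings;

# True if $host sits under one of the reserved names quoted above.
sub is_reserved_host {
    my ($host) = @_;
    $host = lc $host;
    $host =~ s/\.\z//;    # tolerate a trailing root dot
    # Reserved top-level domains.
    return 1 if $host =~ /(?:\A|\.)(?:test|example|invalid|localhost)\z/;
    # Reserved second-level domains.
    return 1 if $host =~ /(?:\A|\.)example\.(?:com|net|org)\z/;
    return 0;
}

for my $host (qw(bod-example-1.test wrong-cert.invalid www.example.com www.testing.crawl)) {
    printf "%-22s %s\n", $host, is_reserved_host($host) ? 'reserved' : 'could be registered';
}
```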

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
        Sooner or later, a troll will register the domain

        Ah yes...I'd overlooked that strange TLDs like crawl can now be registered...

        Fortunately I read this (and acted on it) just before updating the code on GitHub and uploading WWW::Crawl to CPAN

Re^8: Testing with Test::Mock::HTTP::Tiny
by Bod (Parson) on Sep 29, 2023 at 21:32 UTC
    Long shot but I have no idea how this kind of inter-package interaction works

    Looking at the source code of HTTP::Tiny, the get($url) method calls request('GET', $url) which, in turn, calls _request('GET', $url).
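Since get() goes through request(), replacing request() at the package level should affect every HTTP::Tiny object, including one created inside another module. The sketch below demonstrates only that call chain; it is not a description of how Test::Mock::HTTP::Tiny is implemented, and the canned response and the `@seen` log are made up for illustration.

```perl
use strict;
use warnings;
use HTTP::Tiny;

my @seen;    # log of requests that reached the replacement
my $res;

{
    no warnings 'redefine';
    # Temporarily swap HTTP::Tiny::request for a stub; the original is
    # restored automatically when this block ends.
    local *HTTP::Tiny::request = sub {
        my ($self, $method, $url, $args) = @_;
        push @seen, "$method $url";
        return {
            success => 1,
            status  => 200,
            reason  => 'OK',
            content => 'mocked',
            headers => {},
            url     => $url,
        };
    };

    # get() is called on a fresh object, yet no network access happens,
    # because get() dispatches to the replaced request().
    $res = HTTP::Tiny->new->get('https://www.testing.crawl/');
}

print STDERR "saw: $_\n" for @seen;
```

If a log like `@seen` stays empty while the module under test is running, that would suggest its requests are not going through the overridden request() at all.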