ronstudio has asked for the wisdom of the Perl Monks concerning the following question:

Dear All,

I have been learning how to write non-blocking HTTP requests.
While following the example from Mojo::UserAgent:
http://mojolicious.org/perldoc/Mojolicious/Guides/Cookbook#Non-blocking

I do not understand the meaning of the following:
$fetch->() for 1 .. 2;
The following is the complete example:
use Mojo::UserAgent;
use Mojo::IOLoop;

my @urls = (
  'mojolicious.org/perldoc/Mojo/DOM',
  'mojolicious.org/perldoc/Mojo',
  'mojolicious.org/perldoc/Mojo/File',
  'mojolicious.org/perldoc/Mojo/URL'
);

my $ua = Mojo::UserAgent->new(max_redirects => 5);
$ua->transactor->name('MyParallelCrawler 1.0');

my $delay = Mojo::IOLoop->delay;
my $fetch;
$fetch = sub {
  return unless my $url = shift @urls;
  my $end = $delay->begin;
  $ua->get($url => sub {
    my ($ua, $tx) = @_;
    say "$url: ", $tx->result->dom->at('title')->text;
    $end->();
    $fetch->();
  });
};

#Process two requests at a time
$fetch->() for 1 .. 2;
$delay->wait;

I have tried the following:
1) Removing the "for 1 .. 2" part,
i.e. calling just $fetch->();
The result is that only the first link is ever fetched.

2) Increasing the end point, say "for 1 .. 4".
Based on the comment in the provided example, I believe this will result in processing 4 requests at a time.
Could I ask your help in explaining the meaning of the above syntax?
Do the values 1 and 2 become arguments passed to the function? (But $fetch is not reading @_, so this shouldn't be the case.)
What exactly does that for loop mean in this context?


Many thanks for your help in sharing your Perl knowledge,
Ronald

Replies are listed 'Best First'.
Re: coderef for 1 .. 2?
by kcott (Archbishop) on Nov 01, 2017 at 07:59 UTC

    G'day Ronald,

    That's the range operator. See "perlop: Range Operators" for details.

    The for loop will set $_ to each of the values, e.g.

    $ perl -E 'say for 1 ..2'
    1
    2

    That's equivalent to this:

    $ perl -E 'for (1 .. 2) { say }'
    1
    2

    In the context you show, it's not passing $_ to your coderef (that would be $fetch->($_)). Here's a longer example, showing your posted statement exactly:

    $ perl -E 'my $fetch = sub { say "fetching ..." }; $fetch->() for 1 .. 2'
    fetching ...
    fetching ...
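To make the difference between $fetch->() and $fetch->($_) visible, here is a minimal sketch. The coderef $show and the array @report are hypothetical names used only for illustration; they are not part of the cookbook example. The sketch reports how many arguments the sub received and what the global $_ held at the time of the call.

```perl
use strict;
use warnings;
use feature 'say';

# $show is a hypothetical stand-in for $fetch: it reports the number of
# arguments it was given and the current value of the global $_.
my $show = sub { return 'args=' . scalar(@_) . ' $_=' . $_ };

my @report;
push @report, $show->()   for 1 .. 2;   # nothing passed; "for" still sets $_
push @report, $show->($_) for 1 .. 2;   # $_ passed explicitly as an argument

say for @report;
# args=0 $_=1
# args=0 $_=2
# args=1 $_=1
# args=1 $_=2
```

In both loops the statement simply runs twice; the range only supplies the loop count unless the sub is explicitly handed $_.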

    — Ken

      Hi Ken,

      Many thanks for your reply and also the reference link as well.

      May I ask a bit further regarding the example code?

      my $fetch;
      $fetch = sub {

        # Stop if there are no more URLs
        return unless my $url = shift @urls;

        # Fetch the next title
        my $end = $delay->begin;
        $ua->get($url => sub {
          my ($ua, $tx) = @_;
          say "$url: ", $tx->result->dom->at('title')->text;
          $end->();

          # Next request
          $fetch->();
        });
      };

      # Process two requests at a time
      $fetch->() for 1 .. 2;
      $delay->wait;
      From the above code example, I believe this is an example of recursion.
      There is a condition to exit (return unless my $url = shift @urls;) and $fetch is calling itself within the sub $fetch.

      I understand that the range operator determines how many times $fetch is called.
      But what if the range operator has been removed, i.e.
      $fetch->();
      $delay->wait;
      Shouldn't the code process 1 request at a time until all the @urls have been processed?
      In my testing, though, it processes only the first URL and then stops. Could you tell me more about this part?

      Thanks,
      Ronald
        "From the above code example, I believe this is an example of recursion. There is a condition to exit (return unless my $url = shift @urls;) and $fetch is calling itself within the sub $fetch."

        Yes, that's recursion. An exit strategy is a standard feature; all recursive functions should have this: without it, they will recurse infinitely (or, at least, to the extent system resources or configuration allow).

        "... if the range operator has been removed ... Shouldn't the code process 1 request at a time until all the @urls have been processed?"

        Yes, that's what I would have expected. I ran a few tests and all URLs are processed with any of these:

        $fetch->() for 1 .. 2;
        # or
        $fetch->() for 1 .. 3;
        # or
        $fetch->() for 1 .. 4;
        # or
        $fetch->(); $fetch->();

        But only one was processed with these:

        $fetch->() for 1 .. 1;
        # or
        $fetch->();

        I did some investigating. I will point out that I'm in no way a Mojo* expert; in fact, I wrote a very tiny application using Mojolicious::Lite about a month or so ago (and, until today, that was my only exposure to this family of modules).

        I added lots of additional code to that cookbook example; mostly printing variable values or indications of where the code had got to. Eventually, I tracked this down to "delay" related code. After commenting out these two lines, a single "$fetch->()" processed all URLs:

        ### my $end = $delay->begin;
        ...
        ### $end->();

        I was using Perl 5.26.0 and Mojolicious 7.46. Although the latter had only been installed about a month ago, I noticed that there had been five updates since then and 7.51 is the latest; the Changes file showed a number of delay-related comments, so I attempted to install the latest via cpan (I only got 7.50, but 7.51 is only a little over 24 hours old, so probably hasn't reached my CPAN mirror sites yet). I retested, with and without those two lines commented, and got the same results as before.

        I checked the bug reports; none really seemed to address this issue; "Remove finish and error events from Mojo::IOLoop::Delay" looked like it was the only one that was related to delays.

        I can't really spend any more time on this. The next place to look would probably be Mojo::IOLoop::delay(). There are others here with far more experience with the Mojo* modules; they may be able to provide much better answers than I have.

        In the spoiler below, I've added my code with all the debug statements (and some additional code). You may find it interesting to track the process or for further investigations.

        — Ken

        By the way, that code has a memory leak (cyclic reference ($fetch contains a reference to a closure which captures $fetch)).

        Replace

        my $fetch;
        $fetch = sub { ... $fetch->(); ... };
        ...
        $fetch->();
        ...

        with

        use feature qw( current_sub );
        my $fetch = sub { ... __SUB__->(); ... };
        ...
        $fetch->();
        ...
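One detail worth spelling out when applying this to the cookbook code: __SUB__ refers to the innermost sub that is currently running, and in the cookbook the recursive call sits inside the nested callback given to $ua->get. A minimal sketch of one way to handle that, assuming Perl 5.16 or later, is to capture __SUB__ in the outer sub first. The names $later, $again and @seen are hypothetical; $later is only a stand-in for an API that invokes a callback and is not a Mojolicious call.

```perl
use strict;
use warnings;
use feature qw( say current_sub );

my @urls = qw( a b c );   # placeholder values, not real URLs
my @seen;

# Hypothetical stand-in for an asynchronous API: it just runs the callback.
my $later = sub { my ($cb) = @_; $cb->() };

my $fetch = sub {
  return unless my $url = shift @urls;

  # Capture the outer sub here; inside the nested callback below,
  # __SUB__ would refer to that callback instead.
  my $again = __SUB__;

  $later->(sub {
    push @seen, $url;
    $again->();
  });
};

$fetch->();
say "@seen";   # a b c
```

Because the outer sub no longer closes over a variable that holds a reference to itself, no reference cycle is created.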
Re: coderef for 1 .. 2?
by Anonymous Monk on Nov 02, 2017 at 01:55 UTC
    I am fairly certain that the Perldoc is in error ... I find no good reason for what is basically an event-driven routine to call itself upon completion of its work. I would suggest raising that specific point with the module's author. The documentation may by now be out of date and not yet corrected.
      Thanks kcott and everyone for the time and energy involved with this. I feel I should go further and ask the module author whether he can explain the example in more detail. The two lines mentioned,
      my $end = $delay->begin;
      ...
      $end->();
      are related to IOLoop. According to the documentation, each $delay->begin increments the event counter and returns a code reference that can be used as a callback. That callback needs to be executed when the event has completed, to decrement the event counter again. However, I still cannot understand why the recursion in this case fails to keep calling itself without the for loop.

      The following is the question that I raised in the issue report:
      https://github.com/kraih/mojo/issues/1147#issuecomment-341304178
        Hi Ken,

        Many thanks for your help again.
        I think I have finally got it!

        The code example is basically the same example as this AnyEvent example from Perl Maven:
        https://perlmaven.com/fetching-several-web-pages-in-parallel-using-anyevent

        The "condvar" in AnyEvent is basically the same as the "delay" in Mojo. I find it easier to understand in the context of AnyEvent, where a condvar ("condition variable") represents a condition that must become true.

        You'll notice that the sequence and logic of the code in the Perl Maven article and the Mojo cookbook are the same.
        I am wondering if it is possible for the event loop to finish (and exit) before it can start calling the inner function.

        So, as you found, the problem is related to the loop-control code:
        ### my $end = $delay->begin;
        ...
        ### $end->();
        I noticed that if I change the position of $end->(); as follows:
        $fetch->();
        $end->();
        instead of
        $end->();
        $fetch->();
        the code now processes all the URLs one by one when only a single $fetch->() is called.
        And if I run it with a for loop, like $fetch->() for 1 .. 2, it appears to process 2 URLs at a time.
        The code now behaves much more like what was originally intended.

        My guess is that the problem with the original code is this: if you call it only once, the condition may already be satisfied by the time the event loop finishes an iteration, so the program exits before the function can call itself again.
        (I hope the above phrases describe the issue correctly. Event loops are very new to me.)
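The guess above can be explored without Mojolicious using a toy model. This is only a sketch under stated assumptions: it assumes the delay stops the event loop as soon as its counter drops back to zero, which is the behaviour guessed at in this thread rather than something confirmed from the module's source. All names here (simulate, @queue, $pending, $running) are hypothetical.

```perl
use strict;
use warnings;
use feature 'say';

# Toy model, NOT Mojolicious: @queue stands in for responses waiting to be
# delivered, $pending for the delay's event counter, and $running for the
# event loop. Assumption: the loop stops once $pending returns to zero.
sub simulate {
  my ($end_first, $parallel) = @_;
  my @urls    = qw( a b c d );   # placeholder values, not real URLs
  my @queue;
  my $pending = 0;
  my $running = 1;
  my @done;

  # Like $delay->begin: increment now, return a callback that decrements.
  my $begin = sub {
    $pending++;
    return sub { $running = 0 if --$pending == 0 };
  };

  my $fetch;
  $fetch = sub {
    return unless my $url = shift @urls;
    my $end = $begin->();
    push @queue, sub {
      push @done, $url;
      if ($end_first) { $end->(); $fetch->() }   # order used in the cookbook
      else            { $fetch->(); $end->() }   # swapped order
    };
  };

  $fetch->() for 1 .. $parallel;
  while ($running && @queue) { (shift @queue)->() }
  undef $fetch;   # break the reference cycle noted earlier in the thread
  return "@done";
}

say simulate(1, 1);   # a
say simulate(1, 2);   # a b c d
say simulate(0, 1);   # a b c d
```

In this model, with a single request in flight and $end->() called first, the counter reaches zero before the next request is registered, so the loop stops after the first URL. With two requests in flight, or with the calls swapped, the counter stays above zero until the list is exhausted.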