unmatched has asked for the wisdom of the Perl Monks concerning the following question:

Hello, Monks!

I come seeking your wisdom once more. Today, the question that I have is a simple matter of loops and how they work when they happen within functions such as grep.

This is my code:

for my $component ( split(/,/, $components_csv) ) {
    if (grep { $_ eq $component } @url_components) {
        push( @selected_components, $component);
        next;
    }
    # ...
}

In this example, would the next keyword take me to the next iteration of grep, or to the next iteration of the outer for loop? I could use a label to handle this, but I'm not sure if I need to.

Thank you, cheers.

Replies are listed 'Best First'.
Re: Question about loops and breaking out of them properly (updated)
by haukex (Archbishop) on Apr 18, 2025 at 11:18 UTC

    Your next is not inside the grep's block (which is "{ $_ eq $component }"), so it doesn't apply to it, only to the for - and in any case I wouldn't try to affect a grep or map with those control keywords. I would suggest thinking of map and grep as always applying to the whole list, and it's also best practice for their blocks not to have any side effects.

    If you wanted to apply them to only part of the list, I would suggest filtering the list with grep first, or e.g. a slice. For anything more complex, use a for loop.

    Another common use case, which might apply in your case, is wanting to know from grep whether there is any match in the list, and stopping the search after that first match, for efficiency. This can be done with e.g. first or any from the core module List::Util. (Update: And apparently Perl is getting builtin any and all.)
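As a minimal sketch of the short-circuiting approach described above, the following uses any and first from List::Util. The sample data is hypothetical; only the variable names come from the original question.

```perl
use strict;
use warnings;
use List::Util qw(any first);

# Hypothetical sample data; the variable names mirror the question.
my $components_csv = 'domain,route,bogus';
my @url_components = qw(proto credentials domain port route);

my @selected_components;
for my $component ( split /,/, $components_csv ) {
    # any() stops scanning as soon as one element matches.
    if ( any { $_ eq $component } @url_components ) {
        push @selected_components, $component;
        next;    # applies to the for loop, not to any()
    }
    # ... handling for unknown components would go here
}

# first() returns the first matching element itself, or undef if none matches.
my $first_known = first { $_ eq 'route' } @url_components;

print "@selected_components\n";    # domain route
```

Unlike grep, which always walks the whole list, any returns as soon as the answer is known, which is the "stop at the first match" behaviour being asked about.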

    In general, you should use Text::CSV / Text::CSV_XS for parsing CSV instead of split, and if I am understanding your code correctly, my node Building Regex Alternations Dynamically might be a technique that's useful to you as well, as regexes could perhaps be faster than a linear search in an array. Update 2: Also Fletch has an excellent point about using a hash in the reply below.
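One possible reading of the regex suggestion above, sketched under assumptions: the list of valid names is joined into a single anchored alternation and each requested name is tested against it. The sample data and the names $pattern and $valid_re are illustrative and do not come from the linked node.

```perl
use strict;
use warnings;

# Hypothetical sample data.
my @url_components = qw(proto credentials domain port route);
my $components_csv = 'domain,route,bogus';

# Escape each name and sort longest first, so that a longer alternative
# is tried before a shorter one that happens to be its prefix.
my $pattern = join '|',
    map  { quotemeta($_) }
    sort { length $b <=> length $a } @url_components;

# Anchor the alternation so only whole names match.
my $valid_re = qr/\A(?:$pattern)\z/;

my @selected_components = grep { $_ =~ $valid_re } split /,/, $components_csv;

print "@selected_components\n";    # domain route
```

Here grep is used as a plain filter over the whole list, with no control keywords in its block.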

    Minor edits for clarity.

      Another efficiency possibility would be to use a hash and exists (or treat it as a set where keys for members get set to 1). Precaffeine at the moment (so I believe this was the intent, but I'm not sure), but for what the OP was trying to do: set up a %valid_selection hash, then loop over the values passed, setting $selected{$item} = 1 if exists $valid_selection{$item}, and finally take @selection = keys %selected.
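The hash approach described above could be sketched roughly as follows. The sample data is hypothetical, and the final sort is an addition made only so that the output order is predictable.

```perl
use strict;
use warnings;

# Hypothetical sample data.
my @url_components = qw(proto credentials domain port route);
my $components_csv = 'route,domain,route,bogus';

# Treat the hash as a set: every valid name becomes a key with value 1.
my %valid_selection = map { $_ => 1 } @url_components;

my %selected;
for my $item ( split /,/, $components_csv ) {
    # Constant-time membership test instead of a linear grep.
    $selected{$item} = 1 if exists $valid_selection{$item};
}

# keys returns the names in no particular order, hence the sort.
my @selection = sort keys %selected;

print "@selection\n";    # domain route
```

Because hash keys are unique, duplicates in the input collapse automatically, at the cost of losing the order in which the names were given.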

      The cake is a lie.
      The cake is a lie.
      The cake is a lie.

        Hi, and thank you for your feedback

        You're correct, I'm basically trying to recreate a list of some sort that I can iterate over later. I considered using a hash for this but in this case it's such a short list that it's probably faster to do this with a list.

        I realized later that I want to check the final list for duplicates, which I've fixed with the List::Util::uniq function (using this module is already paying off!). It adds an extra iteration over the final list but again, quite small in size, otherwise I would've also preferred to use a hash.
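For reference, a small sketch of the deduplication step mentioned above, assuming a List::Util recent enough to provide uniq (version 1.45 or later). The sample list is made up for illustration.

```perl
use strict;
use warnings;
use List::Util qw(uniq);

# Hypothetical list containing duplicates.
my @selected_components = qw(domain route domain proto route);

# uniq keeps the first occurrence of each value and preserves order.
my @unique_components = uniq @selected_components;

print "@unique_components\n";    # domain route proto
```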

      Thank you very much for your detailed input!

      In my case, there's not much more going on in the code and efficiency is not really a concern, as I'm working with very short lists of no more than maybe 5 elements or so. But I'd like to do things as "correctly" as possible, which is why I was indeed looking to exit the lookup on the first match. Using first really comes in handy here. I'm glad that I asked, as I obviously have a lot to learn about Perl. I'll be sure to read more about all these functions (thanks for the links!).

      I'm calling split since it's such a short string anyway that it just seems more than enough. Plus, I'm intentionally restricting myself to the core modules to learn Perl more properly before I go out using other solutions, if that makes sense. At least for now, of course, as I'm not really making anything significant or complex, just small scripts for my own use.

      Just to clarify, this question refers to a simple script that can extract URL parameters out of a text file, with an option to provide the components to select from each match, like a specialized type of grep on the command line just for URLs:

      xurl --select domain,route file1 file2 ...
      xurl --select proto,credentials file1 file2 ...

      I'm taking my time to refine this script with better language features and to clean it up, and these sorts of questions come up from time to time.

      Thanks again!

        I'm intentionally restricting myself to the core modules to learn Perl more properly before I go out using other solutions, if that makes sense.

        Sure, I understand. For other tasks, like say HTML parsing, there are several different good modules to choose from, but for CSV, Text::CSV_XS falls into the (small) group of modules that can always be recommended. It's pretty much the standard CSV parsing module, so it's worth learning in addition to core Perl.

Re: Question about loops and breaking out of them properly
by ikegami (Patriarch) on Apr 18, 2025 at 23:25 UTC

    Without a label, next/last/redo affects the closest (innermost containing) loop.

    for my $y ( ... ) { for my $x ( ... ) { next; } }
    is equivalent to
    Y: for my $y ( ... ) { X: for my $x ( ... ) { next X; } }

    To go to the next pass of the outer loop, you can use the following:

    Y: for my $y ( ... ) { X: for my $x ( ... ) { next Y; } }

    In your example, the next is inside an if statement's body (not a loop), which is inside a foreach loop's body. It will go to the next pass of the only/foreach loop.
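To make the difference concrete, here is a small runnable sketch of the unlabeled and labeled forms. The loop bounds and the arrays @inner_only and @outer are illustrative choices and are not taken from the original question.

```perl
use strict;
use warnings;

# Unlabeled next: affects only the innermost enclosing loop.
my @inner_only;
for my $y ( 1 .. 2 ) {
    for my $x ( 1 .. 3 ) {
        next if $x == 2;              # skips just this $x
        push @inner_only, "$y.$x";
    }
}
print "@inner_only\n";                # 1.1 1.3 2.1 2.3

# Labeled next: jumps straight to the next pass of the outer loop.
my @outer;
Y: for my $y ( 1 .. 3 ) {
    X: for my $x ( 1 .. 3 ) {
        next Y if $x == 2;            # abandons the rest of the inner loop
        push @outer, "$y.$x";
    }
    push @outer, "end-of-$y";         # never reached here: next Y always fires first
}
print "@outer\n";                     # 1.1 2.1 3.1
```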


    Notes

    In your example, the next isn't in the grep "loop". It's in the if statement's body which is executed once the grep is complete.

    You should not use next/last/redo in the callback of the grep operator.

    Be careful. A bare block { ... } used as a statement is a loop (one that executes once), so next/last/redo apply to it. This means that

    for ( @a ) { { last; } say; }

    is equivalent to

    for ( @a ) { say; }
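The following sketch illustrates that bare-block behaviour, and how a label reaches the outer loop instead. The array contents and the label ITEM are made up for the example.

```perl
use strict;
use warnings;

my @a = qw(a b c);

# last inside a bare block leaves only that block, not the for loop.
my @seen;
for my $item (@a) {
    {
        last;                          # exits the bare block only
        push @seen, 'unreachable';     # never runs
    }
    push @seen, $item;                 # still runs for every element
}
print "@seen\n";                       # a b c

# With a label, last exits the for loop itself.
my @stopped;
ITEM: for my $item (@a) {
    {
        last ITEM if $item eq 'b';
    }
    push @stopped, $item;
}
print "@stopped\n";                    # a
```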

Re: Question about loops and breaking out of them properly
by LanX (Saint) on Apr 18, 2025 at 21:57 UTC

    > and in any case I wouldn't try to affect a grep or map with those control keywords.

    It's documented in next

    > "It should not be used to exit a grep or map operation."

    Tho it's a bit inconsistent, because return clarifies that those grep/map blocks are special, but doesn't discourage the use.

    > Returns from a subroutine, eval, do FILE, sort block or regex eval block (but not a grep, map, or do BLOCK block) with the value given in EXPR.

    On a side note: I've never thought of using return inside a sort block and can't remember ever seeing it.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery

    Update

    So I tried it out in the debugger, and next inside a grep block does exactly what I expected: it ignores grep and jumps to the next iteration of the surrounding loop. IOW

    • grep is not considered a loop
    • grep's block is breakable

    perl -de0
    DB<12> for (1..9) {say grep {next if $_<5;$_%2} $_..9}
    579
    79
    79
    9
    9

    PS: yes it's a convoluted example.