in reply to Converting to sub signatures

My first concern would be with regexps like:

        if($line =~ /[^\#]*\ *sub\ (.*)\ \{/) {

It looks like the intent is to match only outside of comments, but because the pattern is not anchored, the initial /[^\#]*/ is effectively a no-op. And because both elements of [^\#]*\ * can be zero-length, this would also match e.g. "for my $sub (@callbacks) {". I'd definitely want at least a mandatory space (or beginning-of-line) before "sub".
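
A quick way to see both problems is to run the pattern, exactly as quoted above, over a few lines. The three sample lines here are made up for illustration and are not taken from the original code base:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# The pattern as quoted above, unchanged.
my $re = qr/[^\#]*\ *sub\ (.*)\ \{/;

# Hypothetical sample lines (assumptions for this sketch).
my @lines = (
    'sub hello {',                   # a real declaration
    '# sub hello {',                 # commented out
    'for my $sub (@callbacks) {',    # no declaration at all
);

for my $line (@lines) {
    if ($line =~ $re) {
        # $1 is whatever (.*) captured between "sub " and " {"
        printf "matched, captured [%s]: %s\n", $1, $line;
    }
    else {
        printf "no match: %s\n", $line;
    }
}
```

All three lines match. The commented-out line matches because the regex engine is free to start the match after the "#", and the loop line matches because "sub " also occurs inside "$sub (", leaving "(@callbacks)" as the captured "name".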

For the rest, I would find the regexps easier to read if they were expanded with //x (at the cost of having to write \s+ frequently). But it's hard to comment on code that is "quite tied to (your) own coding style" without details of what that is. For example I'll tend to break long lines in particular ways that would not be hard to parse in a line-by-line manner, but in the general case that would be hard to do without a complex state machine.
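
As a rough sketch of what I mean by expanding with //x, here is the quoted pattern next to a spread-out copy of itself. This is only a mechanical translation: it keeps the escaped literal spaces so that the two stay equivalent, and it does not fix the anchoring problem. The sample lines and variable names are assumptions for the demonstration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# The one-line pattern as quoted at the top of this post.
my $compact = qr/[^\#]*\ *sub\ (.*)\ \{/;

# The same pattern spread out under the x flag. Unescaped whitespace is
# ignored there, so each literal space has to stay escaped (or become \s).
my $expanded = qr/
    [^\#]*      # any run of characters other than a hash sign (may be empty)
    \ *         # zero or more spaces
    sub \       # the word "sub" and one literal space
    (.*)        # capture everything up to the last ...
    \ \{        # ... space followed by an opening brace
/x;

# Hypothetical sample lines (assumptions for this sketch).
my @lines = (
    'sub hello {',
    '    sub hello {',
    '# sub hello {',
    'my $x = 1;',
);

for my $line (@lines) {
    my @c = $line =~ $compact;     # captures from the compact form
    my @e = $line =~ $expanded;    # captures from the expanded form
    my $verdict = "@c" eq "@e" ? 'same' : 'DIFFERENT';
    print "$verdict: $line\n";
}
```

Replacing the escaped spaces with \s+ or \s* would be the more usual spelling, but that also accepts tabs, so it is a small change in behaviour rather than a pure reformatting.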

Re^2: Converting to sub signatures
by cavac (Prior) on Jun 23, 2022 at 18:33 UTC

    You are absolutely right about the regular expressions not working all that well. It did in fact fumble some commented out functions into non-working code.

    I very seldom use negative matches ("must not contain" stuff); usually I check first whether the line is a comment, and THEN decide if I want to process it further. How would I correctly anchor this?

    Example code lines:

        should match these:
        sub hello {
        sub hello {
        should not match these:
        #sub hello {
        # sub hello {
        # sub hello {
        # sub hello {


      G'day cavac,

      This does/doesn't match as per your example:

      qr{^\s*[^#]*\s*sub}

      Quick test:

          #!/usr/bin/env perl
          use v5.26;

          my $re = qr{^\s*[^#]*\s*sub};
          my (@match, @no_match);

          while (<DATA>) {
              if (/$re/) {
                  push @match, $_;
              }
              else {
                  push @no_match, $_;
              }
          }

          say '*** Matched:';
          print for @match;
          say '*** Not Matched:';
          print for @no_match;

          __DATA__
          should match these:
          sub hello {
          sub hello {
          should not match these:
          #sub hello {
          # sub hello {
          # sub hello {
          # sub hello {

      Output:

          *** Matched:
          sub hello {
          sub hello {
          *** Not Matched:
          should match these:
          should not match these:
          #sub hello {
          # sub hello {
          # sub hello {
          # sub hello {

      — Ken

      I guess the next question is: are there any examples you want to match that do not start m{ ^ \s* sub \b }x? If not, then you have your answer; if there are, those are the tricky cases we'd need to see.

      I just took a quick look at one of my codebases and found 558 matches for that pattern; 503 of them were normal sub declarations all matching m{ ^ \s* sub \s+ (\w+) \s* \{ }x, the rest were anonymous sub references not matching that pattern. Of matches against m{ \b sub \b }x elsewhere in any line, almost all were anonymous sub references (the remaining 2 or 3 were comments or $sub variables).

      So that pattern would work for me; would it work for you?
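
      For what it's worth, here is a minimal sketch of that anchored pattern used as a line-by-line filter. The sample lines (including their indentation) and the variable names are assumptions for illustration only; the pattern is the one given above.

      ```perl
      #!/usr/bin/env perl
      use strict;
      use warnings;

      # Named sub declaration at the start of a line, optionally indented.
      my $named_sub = qr{ ^ \s* sub \s+ (\w+) \s* \{ }x;

      # Hypothetical sample lines (assumptions for this sketch).
      my @lines = (
          'sub hello {',
          '    sub hello {',
          '#sub hello {',
          '    # sub hello {',
          'my $cb = sub {',
          'for my $sub (@callbacks) {',
      );

      for my $line (@lines) {
          # In list context a failed match returns an empty list,
          # so $name stays undefined for lines that are not declarations.
          my ($name) = $line =~ $named_sub;
          printf "%-10s %s\n", defined $name ? "sub:$name" : 'skip', $line;
      }
      ```

      Only the first two lines are reported as declarations of "hello"; the commented-out lines, the anonymous sub and the loop over $sub are all skipped. Anything between the name and the opening brace (a prototype, attributes, or an existing signature) would also be skipped, since the pattern allows only whitespace there.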