rockyb has asked for the wisdom of the Perl Monks concerning the following question:

I'd like to get the start and end lines of subroutines in a file, much the same way that Perl does for the %DB::sub hash when debugging is turned on.

I can approximate this using B::Utils::all_starts by looking for minimum and maximum COP lines, but that really isn't the same thing, since the first executable line might not be on the same line as the sub declaration. For example:

  sub five {
     5;
  }
  sub six { 6; }

Note that the start and end lines of five() are one less and one greater than the line of the single returned expression, 5. For six(), though, everything is on the same line.
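
To make the mismatch concrete, here is a minimal sketch using only the core B module rather than B::Utils. The helper name first_cop_line is made up for illustration, and the sketch assumes the sub's first op is a COP (a nextstate), which is the usual case for ordinary subs.

```perl
use strict;
use warnings;
use B ();

my $five_decl = __LINE__ + 1;   # line of the "sub five {" declaration
sub five {
    5;
}
my $six_decl = __LINE__ + 1;    # line of the "sub six" declaration
sub six { 6; }

# Hypothetical helper: line number of the first COP in a sub,
# or undef if the sub does not start with one.
sub first_cop_line {
    my ($code) = @_;
    my $start = B::svref_2object($code)->START;
    return $start->isa('B::COP') ? $start->line : undef;
}

printf "five: declared at %d, first COP at %d\n", $five_decl, first_cop_line(\&five);
printf "six:  declared at %d, first COP at %d\n", $six_decl,  first_cop_line(\&six);
```

For five() the first COP line comes out later than the declaration line, which is exactly the gap that COP-based approximations cannot close on their own.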

Replies are listed 'Best First'.
Re: How to get line ranges of subroutines from Perl source code
by Your Mother (Archbishop) on Apr 16, 2014 at 19:50 UTC

    It’s a little clumsy (might be a better way?) but this PPI based version seems to do a decent job–

    use PPI;

    my $doc = PPI::Document->new(shift || die "Give a doc!\n");

    my $subs = $doc->find("PPI::Statement::Sub");

    for my $sub ( @$subs )
    {
        my $start = $sub->line_number;
        my $lines = $sub =~ y/\n//;
        my $end = $start + $lines;
        print "Found a subroutine starting at line ",
            $start,
            " and ending at line ",
            $end,
            $/;
    }

      Many thanks!

      I think this is pretty much what I need. I also need the fully qualified subroutine name e.g. main::foo or File::Basename::dirname, but that is easily added. To wit:

      #!/usr/bin/perl -w
      use PPI;
      use strict;
      use Cwd 'abs_path';

      my $filename = shift || die "Give me a doc!";
      my $doc = PPI::Document->new($filename);

      my $subs = $doc->find("PPI::Statement::Sub");
      my $packages = $doc->find('PPI::Statement::Package');

      my @pkg_info = (['main', 0]);
      if ($packages ne "") {
          for my $pkg ( @$packages ) {
              my $start = $pkg->line_number;
              my $pkg_name = $pkg->{children}[2];
              push @pkg_info, [$pkg_name, $start];
          }
      }

      sub enclosing_pkg($$) {
          my ($start, $end) = @_;
          my $pkg_name = 'main';
          foreach my $pkg_info (@pkg_info) {
              if ($pkg_info->[1] > $start) {
                  # fn start and end can't span a "package" statement.
                  die "Bolixed package parsing" if $pkg_info->[1] < $end;
                  last;
              }
              $pkg_name = $pkg_info->[0];
          }
          $pkg_name;
      }

      print abs_path($filename), ":", $/;
      for my $fn ( @$subs ) {
          my $start = $fn->line_number;
          my $lines = $fn =~ y/\n//;
          my $end = $start + $lines;
          my $pkg_name = enclosing_pkg($start, $end);
          printf "\t%s::%s: %d-%d\n", $pkg_name, $fn->name, $start, $end;
      }

      The only remaining piece is getting a list of files that need to get loaded, but I think I can get that via %INC.
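
A minimal sketch of the %INC idea, assuming that only entries with a plain file path are of interest (values can be undef after a failed require, or references when an @INC hook did the loading). The helper name loaded_files is made up for illustration.

```perl
use strict;
use warnings;
use File::Basename ();   # loaded only so that %INC has a known entry

# %INC maps the name given to use/require (e.g. "File/Basename.pm")
# to the path the file was actually loaded from.
sub loaded_files {
    return map  { $INC{$_} }
           grep { defined $INC{$_} && !ref $INC{$_} }
           sort keys %INC;
}

print "$_\n" for loaded_files();
```

Each path returned could then be handed to the PPI-based scan above; the main program file itself is not in %INC and would have to be added separately (for example from $0).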

      Finally, to the question of whether this is perfect, or whether there's a better way: possibly.

      But I'd like to start out with something that is pretty good, as this is, and then improve or even rewrite it as we come to understand better ways. In my experience, if you wait for the perfect solution, you'll never get anywhere.

      Old thread, I know, but this is *EXACTLY* the baseline I've needed to find for a project I'm currently working on.

      If you add:

      my $name = $sub->name; print "Found $name starting at line "...

      One gets the actual name of the sub too.

      Thanks!

      -stevieb

Re: How to get line ranges of subroutines from Perl source code
by LanX (Saint) on Apr 16, 2014 at 06:34 UTC
    The only way I'm aware of is scanning for curlies in the source around the known limits.

    You will need to ignore comment lines and POD in between.
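
A rough sketch of such a curly scan, under loud assumptions: it skips POD blocks and whole-line comments as described, but it does not understand braces inside strings, regexes, heredocs, or trailing comments, so it is only an approximation. The helper name sub_end_line and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: given an array ref of source lines and the 1-based
# line of a "sub" declaration, return the 1-based line where its braces
# balance again, or nothing if they never do.
sub sub_end_line {
    my ($lines, $start) = @_;
    my ($depth, $seen_open, $in_pod) = (0, 0, 0);
    for my $n ($start .. @$lines) {
        my $line = $lines->[$n - 1];
        if ($line =~ /^=cut\b/) { $in_pod = 0; next }            # POD ends
        if ($in_pod || $line =~ /^=[a-zA-Z]/) { $in_pod = 1; next }  # inside POD
        next if $line =~ /^\s*#/;                                # whole-line comment
        for my $ch ($line =~ /[{}]/g) {
            if ($ch eq '{') { $depth++; $seen_open = 1 }
            else            { $depth-- }
            return $n if $seen_open && $depth == 0;
        }
    }
    return;
}

my @src = (
    'sub five {',
    '    # a stray } in a comment line',
    '    5;',
    '}',
    'sub six { 6; }',
);
printf "five: %d-%d\n", 1, sub_end_line(\@src, 1);
printf "six:  %d-%d\n", 5, sub_end_line(\@src, 5);
```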

    Cheers Rolf

    ( addicted to the Perl Programming Language)

Re: How to get line ranges of subroutines from Perl source code
by karlgoethebier (Abbot) on Apr 16, 2014 at 18:58 UTC

    I'm not sure about your specs. But please take a look at Text::Balanced.

    I used it to parse Nagios status.dat files. It performed badly (or I wrote bad code), but it worked.
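
A minimal sketch of how Text::Balanced might be applied to this problem, assuming named subs whose opening brace follows the name directly (no prototypes, signatures, or attributes). With the '{}' delimiter only curlies are tracked, so a brace inside a string or regex would still throw it off. The helper name sub_ranges is made up for illustration.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical helper: return [name, start_line, end_line] for each named sub.
sub sub_ranges {
    my ($source) = @_;
    my @ranges;
    while ($source =~ /^[ \t]*sub\s+(\w+)\s*(?=\{)/mg) {
        my ($name, $decl_pos, $body_pos) = ($1, $-[0], $+[0]);

        # Work on copies so pos($source) is left alone for the next match.
        my $rest = substr($source, $body_pos);
        my ($body) = extract_bracketed($rest, '{}');
        next unless defined $body && length $body;

        my $before = substr($source, 0, $decl_pos);
        my $whole  = substr($source, $decl_pos, $body_pos - $decl_pos) . $body;
        my $start  = 1 + ($before =~ tr/\n//);       # newlines before the sub
        my $end    = $start + ($whole =~ tr/\n//);   # newlines inside the sub
        push @ranges, [$name, $start, $end];
    }
    return @ranges;
}

my $code = "sub five {\n   5;\n}\nsub six { 6; }\n";
printf "%s: %d-%d\n", @$_ for sub_ranges($code);
```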

    Regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

      I'm not sure about your specs.

      rockyb is working on the Perl debugger Devel::Trepan and has also released B::CodeLines; it's not always easy or desirable to simply do it the way perl5db.pl does it :)

        I apologize for not giving more background. I was afraid of boring and burying people in too much detail.

        But yes, you are correct. Although I think of this as something interesting and useful in and of itself, my particular interest is in adding to the %DB::sub hash the information that it would normally get if debugging were set up initially.

        When one uses Enbugger or manages to call a debugger inside a running program that wasn't set for debugging at the outset, it is desirable to reconstruct the structures, possibly on demand, that would have been gathered if Perl were set to debug at the outset.
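
As a sketch of what reconstructing that information could look like: perldebguts documents %DB::sub values as having the form filename:startline-endline, so ranges gathered by any of the approaches in this thread could be formatted that way. The helper name db_sub_entries and the sample file name are made up, and the sketch fills an ordinary lexical hash rather than touching %DB::sub itself.

```perl
use strict;
use warnings;

# Hypothetical helper: turn [qualified_name, file, start, end] tuples into
# a hash shaped like %DB::sub ("filename:startline-endline" values).
sub db_sub_entries {
    my (@ranges) = @_;
    my %entries;
    for my $r (@ranges) {
        my ($name, $file, $start, $end) = @$r;
        $entries{$name} = "$file:$start-$end";
    }
    return %entries;
}

my %entries = db_sub_entries(
    [ 'main::five', 'example.pl', 1, 3 ],
    [ 'main::six',  'example.pl', 4, 4 ],
);
print "$_ => $entries{$_}\n" for sort keys %entries;
```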

        It's possible jjore will beef up Enbugger to add this information. And if he does, I have no doubt that he'll do a better or more thorough job than the approaches suggested here so far. But on the other hand, it would be good to explore and have other possibilities to create the information on demand too.

        By the way, I followed that perl5db.pl link and was both in awe of it, and frankly a little repulsed by it.

        There is way more stuff in perl5db than when I first learned it decades ago. I think there is possibly more stuff in there than when I last looked at an Apress book devoted to perl5db about a decade ago.

        However, I find it a mish-mash of details, many of which don't make sense to me. So much so that I find myself getting lost. Is it just me?

        The document starts out with an explanation or apology under "General Notes" of why the thing is so ugly, which is probably not the best way to lead off for most people who are looking for information. This is followed by coding tricks that I doubt most people should be using normally, even if they are interested in such stuff. And if they are, they probably know this already.

        Somewhere among this are features and capabilities I really didn't know much about, like debugger startup, its options, what can go in the debugger initialization file, terminal and socket handling, and how to influence what happens on restart.

        Then finally, after all of this other stuff, comes the first mention of the actual commands that can be used, followed by, or rather intermixed with, API information.

        I think it both amusing and fitting that the thing ends with a POD error in red.

        I would offer to reorganize or rewrite it, if I weren't of the frame of mind that one should really move away from the debugger altogether.

        If you don't like Devel::Trepan, then please consider Devel::Hdb. Too many people have died trying to fix up perl5db.