Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

SmallTalk-like Message browser

by diotalevi (Canon)
on Jul 03, 2007 at 21:12 UTC ( [id://624789]=sourcecode: print w/replies, xml ) Need Help??
Category:
Author/Contact Info Josh.
Description: Examines a code tree and reports on which things are called by which other things.
#!/usr/local/bin/perl
use strict;
use warnings all => 'FATAL';
use constant EMPTY_ARRAY => [];
use File::Find 'find';
use PPI         ();
use PPI::Dumper ();
use YAML 'Dump';
use Cwd 'abs_path';

my @to_search = map { abs_path($_) } @ARGV ? @ARGV : '.';

# Find all "classes" and the messages they directly accept.
my %subs;
find(
    sub {
        return unless -f and /pm$/;
        my $src = read_file($_);

        $subs{$File::Find::name} = extract_subs($src);
    },
    @to_search,
);

# Remove redundant information from the keys.
my $common_prefix = find_common_prefix( [ keys %subs ] );
if ( $common_prefix ) {
        for my $key ( keys %subs ) {
                my $new_key = $key;
                $new_key =~ s/\A\Q$common_prefix//
                        or next;
                $subs{$new_key} = delete $subs{$key};
        }
}

# Now figure out what each thing is potentially used by.
my %usage;
for my $file ( keys %subs ) {
        my $subs_href = $subs{$file};

        for my $sub ( keys %$subs_href ) {
                my $messages_aref = $subs_href->{$sub};
        my $messages_id = 0 + $messages_aref;
        $usage{$file}{$sub} = find_others( $sub, $messages_id );
    }
}

# Report it.
print Dump( \%usage );
exit;

sub find_others {
    my ( $name, $id ) = @_;

    my %o;
  FILE:
    for my $file ( keys %subs ) {
      SUB:
        for my $sub ( keys %{ $subs{$file} } ) {
            next SUB if $id == $subs{$file}{$sub};

          WORD:
            for my $word ( @{ $subs{$file}{$sub} } ) {
                next WORD unless $name eq $word;

                ++$o{"${file}:${sub}"};
            }
        }
    }

    return \%o;
}

sub extract_subs {

    # Accepts perl source and returns a hash reference of subroutines
    # and the messages they might be sending.

    my $doc = PPI::Document->new( \shift @_, readonly => 1 );

    my @uses =
      map {
        my $name = $_->schild(0)->snext_sibling->content;
        my @words =
                        map {
                                ( $_->content =~ /(\w+)/g )[-1]
                        }
                        @{
                                $_->find('PPI::Token::Word')
                                || EMPTY_ARRAY };
        if ( $name =~ /^\w+$/ ) {
            @words = grep { $_ ne $name } @words;
            [ $name => \@words ];
        }
        else {
            [ '???' => \@words ];
        }
      } @{ $doc->find('PPI::Statement::Sub') || EMPTY_ARRAY };

    # It is "possible" that a subroutine might be mentioned more than
    # once so I merge them here. Maybe that is only the ??? sub.
    my %x;
    for (@uses) {
        push @{ $x{ $_->[0] } }, @{ $_->[1] };
    }

    return \%x;
}

sub read_file {

    # Slurps a file.

    my $file = shift @_;
    open my $fh, '<', $file or die "Can't open file $file: $!";
    local $/ = undef;
    return <$fh>;
}

sub find_common_prefix {
    my $everything = join '', map { "$_\n" } sort @{ shift @_ };

    my $parts = 1;
    my %prefixes;
    my @lines;
    my $continue = 1;
    while ( $continue ) {
        $continue = 0;

        my $re = qr{^(/(?:[^/\n]+/){$parts,$parts})}m;
        pos( $everything ) = 0;
        while ( $everything =~ /$re/g ) {
            my $pos = pos $everything;
            $continue = 1;
            ++ $lines[$parts]{$1};
            pos( $everything ) = $pos;
        }

        $prefixes{$_} = $lines[$parts]{$_} * $parts for keys %{ $lines
+[$parts] };
        ++ $parts;
    };

    my ($max) = sort { $prefixes{$b} <=> $prefixes{$a} } keys %prefixe
+s;
    return $max;
}
Replies are listed 'Best First'.
Re: SmallTalk-like Message browser
by Ovid (Cardinal) on Jul 05, 2007 at 11:05 UTC

    This looks really interesting, but I confess that some of the output I'm receiving doesn't seem to make sense. I see modules which appear to be completely unrelated, even when I dig through the code, but the output suggests otherwise. Could you possibly explain how to interpret the output?

    Cheers,
    Ovid

    New address of my CGI Course.

      The output roughly means the following:
      --- <filename>: <subname>: called from: <filename>:<subname>: <this many times>
      The reason you're seeing odd results is a limitation of static analysis. If you have two packages with identically named methods, then their usage gets lumped together. That is, One->new and Two->new are grouped, and count as two calls to new.

      I don't think you can get meaningful results unless you have an interactive environment -- like Squeak -- that can track the class of the invocant. Still, this is a nice approximation. As long as you keep the limitations in mind, it can be quite useful.

        I haven't seen how I can get Squeak's browser to decide which senders of "new" are the relevant senders. I've been of the opinion that it also punts on the subject of allomorphism. I figure that if ST punts, I'm ok to do so also.

        ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: sourcecode [id://624789]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others having a coffee break in the Monastery: (4)
As of 2024-03-28 15:50 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found