Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
What tools exist for analyzing a Perl-based system? I've been handed a CD with a Perl/CGI system on it and I'm supposed to report back on its quality/maintainability. I've run it through perltidy so it now conforms to the "one true Perl style" and is legible. It's 37 .pl files, 10K+ lines, and a couple of comments.
What other tools exist for analyzing a Perl program? In other languages I have used programs that produce:
- System summaries (lines of code, file statistics...)
- Variable cross-reference
- Tree structure of the system
I've checked Google, Google Groups, CPAN, perl.com and here with no luck. Am I just using the wrong search terms?
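For the raw numbers I can obviously script something quick myself. A rough sketch like this gets total lines, code lines and sub counts per file (no POD or heredoc handling, so the counts are approximate):

#!/usr/bin/perl
use strict;
use warnings;

# Rough per-file statistics: total lines, non-blank/non-comment lines,
# and a naive count of sub definitions. POD and heredocs are not
# handled, so treat the numbers as approximations.
for my $file (@ARGV) {
    open my $fh, '<', $file or do { warn "$file: $!\n"; next };
    my ( $total, $code, $subs ) = ( 0, 0, 0 );
    while (<$fh>) {
        $total++;
        $code++ unless /^\s*(?:#|$)/;
        $subs++ if /^\s*sub\s+\w+/;
    }
    printf "%-30s %5d lines %5d code %3d subs\n", $file, $total, $code, $subs;
}

But I was hoping for something less home-grown than that.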
Re: Programs/Methods for analyzing an existing Perl based system
by Ovid (Cardinal) on May 29, 2002 at 23:26 UTC
There is no real substitute for getting into the system and understanding how it works. Further, you have to know Perl fairly well, including good coding standards. I was recently helping a gentleman in the Netherlands with some Perl issues, when he emailed me a program and asked for feedback. Here's one of the subroutines:
sub stats {
unless ($aantalja) { &pak_getallen; }
$totaal_stemmen = ($aantalja + $aantalnee);
$eenstem = (100 / $totaal_stemmen);
$procentja = ($aantalja * $eenstem);
$procentnee = ($aantalnee * $eenstem);
$procentja = int($procentja);
$procentnee = int($procentnee);
if (($procentja + $procentnee) < 100) {
if ($procentja > $procentnee) { $procentja+=1; }
elsif ($procentja < $procentnee) { $procentnee+=1; }
}
}
Right off the bat, I can point to several problems. First, the subroutine refers to variables declared outside of itself, so it's going to have side effects that will be difficult to maintain. The indentation is poor, so it's tough to determine scope. Further, there's no sanity check to avoid a divide-by-zero error in this line:
$eenstem = (100 / $totaal_stemmen);
But does it work? Who knows? I don't speak Dutch. While there is plenty of information available in that little snippet, there is no meaning. Ultimately, this means that whatever metrics you want to produce, there is no substitute for understanding the code. Further, whatever metrics someone wants to put down as a standard, I guarantee that I can write code that will hit whatever target they are looking for, but still be an unmaintainable mess. Trust me, you should see some of my production code :)
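For what it's worth, a version that avoids all three problems might look something like this. It's only a sketch (the English names are guesses at the Dutch, and the initialisation is left to the caller): pass the counts in, keep everything lexical, and bail out before dividing by zero:

sub stats {
    # take the counts as arguments instead of reaching for globals
    my ( $yes, $no ) = @_;
    my $total = $yes + $no;
    return ( 0, 0 ) unless $total;    # sanity check: no votes, no percentages

    my $pct_yes = int( 100 * $yes / $total );
    my $pct_no  = int( 100 * $no  / $total );

    # hand the rounding shortfall to whichever side is ahead (ties stay
    # put), preserving the original's behaviour
    if ( $pct_yes + $pct_no < 100 ) {
        if    ( $pct_yes > $pct_no ) { $pct_yes++ }
        elsif ( $pct_yes < $pct_no ) { $pct_no++ }
    }
    return ( $pct_yes, $pct_no );
}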
Cheers,
Ovid
Join the Perlmonks Setiathome Group or just click on the link and check out our stats.
Here's the same subroutine with the Dutch identifiers translated into English:
sub stats {
    unless ($numberyes) { &grab_numbers; }
    $total_votes = ($numberyes + $numberno);
    $onevote     = (100 / $total_votes);
    $percentyes  = ($numberyes * $onevote);
    $percentno   = ($numberno * $onevote);
    $percentyes  = int($percentyes);
    $percentno   = int($percentno);
    if (($percentyes + $percentno) < 100) {
        if    ($percentyes > $percentno) { $percentyes += 1; }
        elsif ($percentyes < $percentno) { $percentno  += 1; }
    }
}
Greetz
Beatnik
... Quidquid perl dictum sit, altum viditur.
Re: Programs/Methods for analyzing an existing Perl based system
by derby (Abbot) on May 30, 2002 at 02:14 UTC
Well, there are lots of non-free tools to do what you want, as well as loads of "theories" (and some of them are loads). One of the more reasonable approaches is the McCabe Complexity Metric.
Basically, looking at each function (method, procedure, etc.):
- Start with 1 for the "straight" path through the function
- Add 1 for each for, if, while, and, or.
- Add 1 for each case in a case statement
The number you come up with is the "complexity" of the function. Average the complexity of all the functions and you have the complexity of the codebase. The idea here is the lower the number, the less complex the code. The less complex the code, the better the chances for higher quality. There's a lot more there (like if the function has a complexity of 1, does it really need to be its own function).
I don't know of a tool to do this for perl, but there's one on freshmeat for other languages.
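A crude approximation is easy to sketch in Perl, though. This just counts keyword hits with regexes, so anything lurking in strings, comments or heredocs will throw it off (a rough sketch, not a real parser):

#!/usr/bin/perl
use strict;
use warnings;

# Very rough cyclomatic complexity: find each sub with a regex, then
# count branch keywords and short-circuit operators inside its body.
my $src = do { local $/; <> };

while ( $src =~ /^sub\s+(\w+)[^{]*\{(.*?)^\}/msg ) {
    my ( $name, $body ) = ( $1, $2 );
    my $complexity = 1;    # the "straight" path through the function
    $complexity++ while $body =~ /\b(?:if|elsif|unless|while|until|for|foreach)\b/g;
    $complexity++ while $body =~ /&&|\|\||\band\b|\bor\b/g;
    printf "%-20s %2d\n", $name, $complexity;
}

Run it as perl complexity.pl somefile.pl and eyeball anything that scores high.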
While strict adherence to complexity metrics can drive you crazy, they actually fit nicely into the programming mantra: high cohesion, low coupling. There's pretty strong evidence that if a function is doing a lot of conditionals, it probably has low cohesion.
-derby
There's something fundamentally wrong with measuring complexity based on low-level analyses of code and using the outcome to judge the quality of code.
Most people will agree the grammar of the musings of Shakespeare is much more complex than that of Dr. Seuss books. Does that mean the children's books have a higher quality than the plays?
There are other problems as well. Such analyses can only focus on a particular implementation. They don't cast any judgement on the choice of algorithm. They will favour a linear search of a sorted array over a binary search, because the linear search requires fewer conditions to implement.
It doesn't mean you shouldn't use such a tool. It just means that you have to be very careful with what you do with its results.
Abigail
A2,
There's something fundamentally wrong with measuring complexity based on low level analyses of code and using the outcome to judge the quality of code
Didn't think I was.
Everything else++. Except the part about Shakespeare and Seuss - that's just silly.
-derby
Re: Programs/Methods for analyzing an existing Perl based system
by Molt (Chaplain) on May 30, 2002 at 09:35 UTC
I really don't think you're going to find what you're looking for in Perl. The grammatical nature of Perl makes parsing it exceptionally difficult ("Only perl can parse Perl") and so automated programs of this nature are nightmarish to write and even then won't catch everything. Nice things like eval, symbolic references, XS linkage, the ability to tweak so deeply into the engine, and other such things hammer us.
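There's a well-known teaser that shows how deep this goes: whether the slash below is division or the start of a regex depends on what perl has already learned about whatever, so a static tool can't even tokenize it reliably:

sub whatever;    # forward declaration: whatever() takes arguments

# With the declaration above, the slash starts a regex passed to
# whatever(), the apparent comment is part of the pattern, and the
# die executes. If whatever() instead took no arguments, the slash
# would be division and everything after the # would be a comment.
whatever  / 25 ; # / ; die "this dies!";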
Yes, it has also been said that 'C doesn't have a grammar, C coders write their own with #define', but C seems a lot more regular than Perl does, and there are a lot more people willing to pay big money to those who can produce tools like this for it. Java obeys a nice simple grammar, which is why you see so many of these kinds of tools for it.
All this being said, there does seem to be good progress with the Perl Refactoring Browser, so if someone was determined enough it may be possible to stand on its shoulders and produce code metrics from that. The Browser itself may already do so (I've not looked that deeply into it, to be honest), since that would help it detect the 'code smells' that refactoring is meant to solve.
Re: Programs/Methods for analyzing an existing Perl based system
by graff (Chancellor) on May 30, 2002 at 07:13 UTC
Your situation made me think back to my early days, facing the same sort of problem with masses of C code; back then (before internet connectivity was common), I actually wrote my own code indexer in C to read a bunch of C source files and print a simple table that lists the functions defined in each file, and the subroutines called from within each function. It really helped.
So, how hard could it be to do that in Perl? Well, not that hard, if you're willing to be content with less than (but usually close to) 100% accuracy in the "precision and recall" measures of subroutine detection.
It's a real bare-bones, quick-and-dirty, not-too-subtle attempt, but it's here in case you want to try it out.
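Something in the same spirit might start out like this. It's a quick sketch built on plain regexes, so regexes, strings and comments in the scanned code will produce some false hits:

#!/usr/bin/perl
use strict;
use warnings;

# Quick-and-dirty sub indexer: for each file on the command line, list
# the subs it defines and the functions each one appears to call.
my %keyword = map { $_ => 1 }
    qw(if elsif unless while until for foreach my our local return print defined);

for my $file (@ARGV) {
    open my $fh, '<', $file or do { warn "$file: $!\n"; next };
    my $src = do { local $/; <$fh> };
    print "$file:\n";
    while ( $src =~ /^sub\s+(\w+)[^{]*\{(.*?)^\}/msg ) {
        my ( $name, $body ) = ( $1, $2 );
        my %called;
        while ( $body =~ /\b(\w+)\s*\(/g ) {
            $called{$1}++ unless $keyword{$1};    # skip obvious keywords
        }
        print "  $name\n";
        print "    calls: ", join( ', ', sort keys %called ), "\n" if %called;
    }
}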
Re: Programs/Methods for analyzing an existing Perl based system
by chicks (Scribe) on May 30, 2002 at 12:02 UTC
Perl really needs more tools to support software engineering. First and foremost among those is allowing metrics to be gathered. (Yes, there are bad metrics and there are metrics that can be easily weaseled, but I'm not telling you which metrics to use!) Obviously parsing the raw perl isn't going to be easy enough in perl5. But once perl parses it couldn't we navigate the op tree? That should make it easy to see how many non-local variables are affected and what-not.
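For anyone who wants to experiment, the stock B module will already let you walk the op tree of a compiled program. A minimal sketch, assuming I remember the interface right (check perldoc B for the details; the output is low-level and takes some deciphering):

use strict;
use warnings;
use B qw(main_root walkoptree);

# walkoptree() calls the named method on every op it visits, so we
# provide one in B::OP, the base class of all the op classes.
sub B::OP::note {
    my $op = shift;
    print $op->name, "\n";
}

my $x = 2 + 2;    # give the compiler something to chew on

walkoptree( main_root, 'note' );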
If somebody has enough free time to put something into a project like this, let me know. Particularly if you're familiar with the way the B:: modules work.
Re: Programs/Methods for analyzing an existing Perl based system
by rinceWind (Monsignor) on May 30, 2002 at 11:57 UTC
A few months ago, I did post a code counter which gives a measure of density in terms of tokens per line. Although not designed for analysing perl, it can do so. The limitations with perl relate to token counting and the likes of regexes, heredocs and obscure quoting.
Hopefully this can be of some use to you as a rough measure of code density and scale of effort required.
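For a flavour of the idea, a naive tokens-per-line figure takes only a few lines of Perl. This sketch just splits on word and symbol boundaries, with all the quoting caveats above:

#!/usr/bin/perl
use strict;
use warnings;

# Naive code density: rough token count divided by line count.
# Quoting, heredocs and regexes are ignored, so the figure is only
# a ballpark indicator.
my ( $lines, $tokens ) = ( 0, 0 );
while (<>) {
    next if /^\s*(?:#|$)/;    # skip blank and comment-only lines
    $lines++;
    $tokens += () = /\w+|[^\w\s]/g;
}
printf "%d lines, %d tokens, %.1f tokens/line\n",
    $lines, $tokens, $lines ? $tokens / $lines : 0;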
Re: Programs/Methods for analyzing an existing Perl based system
by samtregar (Abbot) on May 30, 2002 at 16:57 UTC
You can use a profiler, like Devel::DProf or my module Devel::Profiler, to extract a call-tree for a given request. If there are a small enough number of request types you might consider producing a call-tree for each request. This might give you an idea of how complex the application is.
Here's a quick example to get you started. First, run the code under a profiler:
$ perl -MDevel::Profiler -e 'sub foo { bar(); } sub bar { 1 }; print foo();'
1
Then use the appropriate tool to generate a call-tree. In this case, dprofpp:
$ dprofpp -T
main::foo
main::bar
-sam
Re: Programs/Methods for analyzing an existing Perl based system
by dada (Chaplain) on May 31, 2002 at 10:11 UTC
perl -MO=Xref yourscript.pl
Update:
System summaries (lines of code, file statistics...)
I haven't tried it, but Perl Metrics seems pretty good at statistics, although unfortunately it doesn't appear to be actively maintained.
cheers,
Aldo
__END__
$_=q,just perl,,s, , another ,,s,$, hacker,,print;