http://qs1969.pair.com?node_id=413087

ryanus has asked for the wisdom of the Perl Monks concerning the following question:

I would like to be able to calculate the Cyclomatic Complexity of some Perl code. After searching Google, I found nothing that calculates it for Perl source code. For those who didn't click the link, Cyclomatic Complexity is a source-code measurement that produces a number. If this number is high, the code is complex and therefore hard to maintain, prone to errors, and hard to test.

Questions:

  1. Does anyone know of an implementation that will calculate Cyclomatic Complexity for Perl code?
  2. If this does not exist, does anyone know of any language-agnostic engines for which a Perl parser could be written?

Replies are listed 'Best First'.
Re: Cyclomatic Complexity of Perl code
by BrowserUk (Patriarch) on Dec 08, 2004 at 08:57 UTC

    This particular metric, along with most other metrics of this type--i.e. static code metrics, including Lines-Of-Code (LOC), Function Points (FP), Interface Complexity (IF), etc.--would be almost impossible to calculate to any degree of accuracy in any language that supports dynamic programming.

    That is to say, any language that supports eval & bless; runtime creation of functions (my $func = sub { ... };); runtime modification of the symbol table (*{"${module}::${func}"} = ...;); introspection (if( ref $arg ){...} and if( $arg->can( ... ) ){...}); and probably others, is almost impossible to evaluate in these static terms.
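
    For anyone who hasn't run into these, here is a small, contrived sketch of the sort of thing I mean--the package and sub names are invented purely for illustration:

        #!/usr/bin/perl
        use strict;
        use warnings;

        # A function that only exists once this line has run.
        my $square = sub { $_[0] ** 2 };

        # Installing a named sub into the symbol table at runtime;
        # no static analyser can know that main::cube() will exist.
        my ( $module, $func ) = ( 'main', 'cube' );
        {
            no strict 'refs';
            *{"${module}::${func}"} = sub { $_[0] ** 3 };
        }

        # Introspection: which path runs depends on what $arg turns out to be.
        sub describe {
            my $arg = shift;
            if    ( ref $arg eq 'CODE' )          { return $arg->( 3 ) }
            elsif ( eval { $arg->can( 'new' ) } ) { return 'looks like a class' }
            else                                  { return "plain value: $arg" }
        }

        print describe( $square ), "\n";    # 9
        print cube( 3 ), "\n";              # 27 -- a sub the source never declares

    A static tool reading this source sees none of the call paths that actually exist once it runs.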

    If you try to do it simply by analysing the source code, you will completely miss all of the complexity that only comes into being at runtime. Even in pretty static languages, like C, that support function pointers and dispatch tables, source code analysis is almost meaningless.
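
    A dispatch table makes the point nicely: which paths exist, and which get taken, depends entirely on the data that arrives at runtime. Another contrived sketch (the handlers are invented for illustration):

        use strict;
        use warnings;

        # Handlers chosen--and possibly added to--at runtime.
        my %dispatch = (
            add => sub { $_[0] + $_[1] },
            mul => sub { $_[0] * $_[1] },
        );

        # Which branch runs depends entirely on the incoming data.
        sub handle {
            my ( $op, @args ) = @_;
            my $handler = $dispatch{$op}
                or die "no handler for '$op'";
            return $handler->( @args );
        }

        # A static decision count sees a single 'or die' here, not the N
        # different handlers the running program actually flows through.
        print handle( add => 2, 3 ), "\n";    # 5
        print handle( mul => 2, 3 ), "\n";    # 6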

    The only way to render these types of statistics vaguely meaningful and comparable is to totally hamstring the programmers by banning any language feature or construct that would defeat static source code analysis.

    I once went through the exercise of taking a moderately complex, working system that used function pointers and dispatch tables and re-writing it to remove these and a few other fairly everyday techniques that the 'Metrics guys' decried--mostly because their (very expensive!) source code analysis software couldn't properly handle them.

    The result was a system that was almost

    1. Twice as large;

      Every previously dynamic decision had to have all its possible code paths laid out as static decisions. 60% of those would probably never be exercised, but they had to be there for 'completeness'.

    2. Much slower.

      Even simple constructs like (some) switch/case statements, functions with multiple (data-conditional) returns, and exception handling (of the setjmp/longjmp variety) were outlawed.

      Because, 'they' said, "the data-driven aspect of the control flow meant the complexity and correctness of the running system were dependent upon the make-up and correctness of the data supplied". (That's a from-memory quote.)

      In other words, their software couldn't predict the path(s) of flow, so their metrics were useless.

    3. Involved a much higher degree of cut&paste "code reuse".

      Nearly identical functions 'xeroxed' with a different name--because it reduced the complexity number!

    4. Took nearly twice as long to refactor as the original did to code.

      And much functionality had to be 'deferred' along the way.

    As with all such simple-minded metrics systems, the original idea--providing some means of measuring and comparing code quality, complexity and maintainability--is, in principle, a sound one.

    The problem is that the result is a metric that becomes self-serving.

    1. Programmers' work starts to be assessed on the basis of 'the numbers'--rather than their inventiveness, productivity and thoroughness.
    2. User requirements start being rejected on the basis that they 'introduce too much complexity'--rather than on the usefulness and productivity gains that the feature would provide the users.
    3. Employers start seeing the measurements as a way of reducing their maintenance and training bills--by employing cheaper maintenance programmers and providing less training.

      This is the most pernicious and debilitating effect.

      • Development team turnover starts to rise because they become sick of writing 'baby code'.
      • Maintenance programmer turnover rises because they are paid so little that the promise of a few pence more will take them away.

        Or the provision of some training that may lift them out of the maintenance role into the developer role.

        Or simply because when you pay peanuts.... Even with the best will in the world, not every person that adopts the job-title "programmer" is going to be any good at it.

        When you pay bottom dollar, the least-skilled practitioners will tend to rotate amongst the employers within their area until such time as the pattern becomes obvious and they move on to the next area or re-invent themselves in some variation of the role.

    I well remember the reactions of a few of the 'independent programmer assessments' that were made of the "reduced complexity system" that resulted from that exercise. Chief among them were phrases like:

    • Who the **** coded this?
    • Why on earth was this done this way?
    • My eleven year old writes better code than this!
    • This is a joke right? Right??

    Don't misread me. I want code metrics. It has always seemed ridiculous to me that a carpenter or metalworker can delegate a piece of work to an apprentice by drawing a sketch on the back of an envelope and writing a few numbers on it. The sketch doesn't need to be accurate if the numbers (including tolerances) are.

    But the equivalent task of delegation in software requires two hours, an overhead projector and several whiteboards, and still what you get back may bear as much relation to what you envisaged as a Mack truck does to a tricycle.

    But imagine trying to measure the accuracy & productivity of a carpenter by measuring the length of his plane shavings or the volume of his sawdust; and then trying to improve this by restricting him to overlap joints and nails because dovetails and screws are 'too complicated'.


    Examine what is said, not who speaks.
    "But you should never overestimate the ingenuity of the sceptics to come up with a counter-argument." -Myles Allen
    "Think for yourself!" - Abigail        "Time is a poor substitute for thought"--theorbtwo         "Efficiency is intelligent laziness." -David Dunham
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

      I think you have hit upon an important point (which is only slightly related to the topic at hand).

      It has always seemed ridiculous to me that a carpenter or metalworker can delegate a piece of work to an apprentice by drawing a sketch on the back of an envelope and writing a few numbers on it. The sketch doesn't need to be accurate if the numbers (including tolerances) are.

      The key here is that they are drawing a representation of the "real world". And those numbers can be easily measured in the "real world". A carpenter or metalworker works directly with measurable physical reality.

      There is also a knowledge base of carpentry and metalworking that goes back to pre-historic times, and a common language for all the aspects of those crafts which goes back many thousands of years. I know what you mean when you say "dovetail joint", and anyone with even basic carpentry skills will too. And even more importantly, the concept is understood well enough that it can be mentally visualized easily, and that mental image can be easily transposed into many unique situations without too much trouble. After all, it's only a dovetail, right?

      But the equivalent task of delegation in software requires, two hours, an overhead projector, several white boards and still what you get back may be as much related to what you envisaged as a Mac truck is to a tricycle.

      This is because it is incredibly hard to describe something which cannot be seen, cannot be heard, cannot be smelled, cannot be touched and cannot be tasted. As a matter of fact, software is really nothing at all but a collection of ideas and thoughts for which the end product is imperceptible patterns of magnetism on a round piece of metal (or metal-coated plastic). Software works purely in the abstract.

      It should also be noted that we as software craftspeople do not have a common knowledge base to draw off of. We instead have a whole lot of common knowledge bases, which we pick and choose from and argue about endlessly ;)

      There is also no common language that we all share. Terminology is a disease, not a cure.

      And lastly, this silliness that we all enjoy so much has really only been happening for about 50 years. Sure, its foundations are a little older than that, but the act of magnetizing small disks and using coordinated electrical pulses to perform something akin to thought is still a pretty new thing.

      Okay, enough of that silliness, I need more coffee!

      -stvn
        This is because it is incredibly hard to describe something which cannot be seen, cannot be heard, cannot be smelled, cannot be touched and cannot be tasted.

        The electronics industry, by definition, has been around just a few years longer than the computer software industry. Sit and observe a CPU, memory chips, or even the venerable 741 IC in operation and it is equally hard to measure or describe what is going on. However, you can buy memory from any one of a dozen manufacturers and stick it in your PC, or mp3 player, or camera, and it will work just fine.

        By contrast, the software industry continues to re-invent every building block used by every program on every system over and over. It reminds me very much of the early days of the car & motorcycle industries where each component was individually crafted for each vehicle.

        I said a bit more on this in Re^2: What do you call yourself? & Re^4: What do you call yourself?, and a lot more in Re: How do you view programming, and even more in Re: (OT) Programming as a craft, Re: Re: Re: (OT) Programming as a craft & Re+: (OT) Programming as a craft (loong post. Even by my standards!).

        As a coder, I enjoy trying to invent rounder wheels (along with hovercraft, monorails, helicopters and rocket ships; not to mention those designs for an 8-legged vehicle that can walk on ceilings and the snake-car that can traverse land and water without roads using a fuel-economic motion. :). As a one-time project leader/manager/architect, I know that until we in the software industry manage to agree on a few basic, re-usable, interchangeable, replaceable software component standards--and agree to use them, universally--software will continue to be one-off works of art! Replicated as in reproductions, but each individually created for its purpose, platform etc.

        As I said, I want software metrics. I want to be able to describe a requirement, or a subcomponent of it, in terms of smaller, standardised subcomponents. I want to be able to mix'n'match those subcomponents from multiple sources. I want to be able to assess and compare those components via some consistent and verifiable, industry-standard metrics.

        I should be able to choose a sort implementation based on my criteria: for application A, I choose an implementation that trades RAM for speed, because speed is paramount. For application B, I'll choose a low-memory implementation, because that's more important than absolute performance.

        I shouldn't have to write these, just download them (with or without fee) as binary components from any standard sort supplier. I could even ship my program without a sort. The customer's machine would interrogate the local system and either use the one currently available--if it meets the specification I embed in the application--or go out to the intranet, or the internet, and pull one that does from a preferred supplier, or the first compliant one that Google throws up.
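
        Purely as a sketch of the idea--the module names and the numbers here are all invented--the selection step need be no more complicated than this:

            use strict;
            use warnings;

            # Hypothetical catalogue of interchangeable sort components, each
            # published with standardised, comparable metrics.
            my %catalogue = (
                'Sort::FastButHungry' => { speed => 9, memory => 2 },  # trades RAM for speed
                'Sort::SmallAndSlow'  => { speed => 3, memory => 9 },  # low-memory version
            );

            sub pick_sort {
                my ( $criterion ) = @_;   # 'speed' for application A, 'memory' for application B
                my ( $best ) = sort {
                    $catalogue{$b}{$criterion} <=> $catalogue{$a}{$criterion}
                } keys %catalogue;
                return $best;
            }

            print pick_sort( 'speed' ), "\n";    # Sort::FastButHungry

        The hard part, of course, is not the selection code; it is agreeing on the metrics that would make the numbers in that catalogue mean anything.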

        This utopian vision (IMO) has to happen. Useful, comparable, standardised metrics are the first step to achieving it. I just don't see source code complexity as a useful measure. At least not until you also have a way of measuring the complexity of the problem the same software solves. If you had both, then you could divide the latter by the former to produce a "complexity of implementation" ratio that might be useful.

        Until then you may as well compare the goodness of art by the size of the canvas divided by the number of brush-strokes used. Rolf Harris would do well. :)


        Examine what is said, not who speaks.
        "But you should never overestimate the ingenuity of the sceptics to come up with a counter-argument." -Myles Allen
        "Think for yourself!" - Abigail        "Time is a poor substitute for thought"--theorbtwo         "Efficiency is intelligent laziness." -David Dunham
        "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
      As stvn pointed out, the most effective way to analyze Perl software is not through static source code analysis, but by taking advantage of its dynamic nature to gather information at run-time.

      Of course, that does mean that the expensive analysis software your company bought wasn't going to be any help, but it doesn't mean the task is impossible.

      Looking at the output of the various profiling and coverage tools, it seems like you could perform a similar code analysis function with reasonable results by combining the source-code and run-time information.
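
      The run-time half of that is cheap to collect. Per the Devel::Cover documentation, it is essentially just a matter of running the program under the instrumenting module and then reporting on the accumulated database (substitute your own program and arguments):

          perl -MDevel::Cover yourprog args
          cover

      The cover_db that accumulates holds the statement, branch and condition information that a combined static/run-time analysis could draw on.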

        I was commenting directly on the static method the OP asked about, and on static methods in general.

        The source code analysis software we were evaluating was for C, back in the early '90s; it was rejected completely, and probably died a death. The best tool I saw for code complexity analysis was an ICE-based tool that operated on the compiled (with probes) executable and instrumented the entire program, not individual source files.

        The problem with instrumenting individual files is that it is pretty easy to ensure that each file has a very low complexity value. One easy step is to make sure that each file has almost nothing in it. But that is self-defeating if the number of source files trebles. Even if this simple step is avoided, the other sanctioned ways of reducing complexity result in much higher interface complexities, which is again self-defeating.

        As a measure, complexity is a very bad indicator of anything useful. Some (most) systems are complex by their very nature.

        A 747 is hugely complex, but would it be "better" if you removed all the dual and triple redundancy systems to reduce complexity?

        The most reliable car is one without an engine. It's less complex, but is it "better"?

        It's not just the case that it would be difficult to produce a measure of complexity for Perl source code, even by attempting to combine both static and dynamic measurements; you also have to consider what that measure would mean if it were produced.

        Measuring the complexity of the source code, or how hard the runtime code has to branch to do what it does, has much more to do with the complexity of the problem being solved than with the 'goodness' of the way the solution was written.


        Examine what is said, not who speaks.
        "But you should never overestimate the ingenuity of the sceptics to come up with a counter-argument." -Myles Allen
        "Think for yourself!" - Abigail        "Time is a poor substitute for thought"--theorbtwo         "Efficiency is intelligent laziness." -David Dunham
        "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
      Reminds me of a quote I ran across earlier today:
      If you torture data sufficiently, it will confess to almost anything. -Fred Menger, chemistry professor (1937- )

      -QM
      --
      Quantum Mechanics: The dreams stuff is made of

      I empathise with your tales of woe, but I think you go too far in dismissing the value of metrics as source code instrumentation.

      If one considers a given metric to be part of the whole picture, but at the same time does not confuse the single metric for the whole of the picture, I think metrics can help a great deal, particularly when dealing with large volumes of code, and large numbers of coders, and particularly in terms of informing decisions about where to spend code review and unit test effort.

      It sounds like you have suffered from idiotic applications of misguided rules more than you have suffered from the metrics themselves, but perhaps that is the price of their existence; that they will be abused like so much else.

      Out of interest, I generated these metrics for a part of my hobby chess engine -

      Method            Cyc. Compl.   LOC   Blank lines   Comment lines   Comment %   Statements
      GenMoves                   50   437            23             114       26.09          236
      FENToPosition              41   117            18              27       23.08           99
      IsAttacked                 26   193            17              74       38.34          102
      PieceToString              16    21             0               0        0.00           32
      ColourOfPiece              14    16             4               0        0.00            4
      PieceToString              10    15             0               0        0.00           20
      HitTheEdge                 10    31             9               0        0.00           10
      ASCToFEN                    9    53             6              19       35.85           26
      FirstBit                    8    39            10               0        0.00           21
      MoveToString                8    28             1               0        0.00           17
      Slide                       5    25             6               4       16.00           15
      GetMoveCount                4    21             6               0        0.00           12
      PrintMoveList               2     5             0               0        0.00            2
      LeftEdge                    1     4             0               0        0.00            2
      RightEdge                   1     3             0               0        0.00            1
      TopEdge                     1     3             0               0        0.00            1
      BottomEdge                  1     3             0               0        0.00            1
      SquareToString              1     3             0               0        0.00            1
      SquareToString              1     3             0               0        0.00            1
      SquareToRank                1     3             0               0        0.00            1
      SquareToFile                1     3             0               0        0.00            1
      ASCToPosition               1     3             0               0        0.00            1
      Warn                        1     3             0               0        0.00            1

      As a profile of this part of the engine, I found this very interesting. For instance,

      • Why do four of the complex methods have no comment lines at all?
      • Why is FENToPosition so complex? (this was a surprise)
      • Is Slide() under-commented at 16%?
      • What (if any) is the correlation between lines of code (LOC) and Complexity?
      • Is this consistent with other code I've written?
      • How do all these stats compare to other projects?
      • Does this align with my intuitive sense of their complexity?
      • Is there a code density metric I can derive from LOC and statements?

      Now some of these thoughts I could have had by simply browsing the code, but I hope I'm illustrating the usefulness of these metrics as a high-level view. I find them provocative in a highly beneficial way.

       

        I empathise ... It sounds like you have suffered from idiotic applications of misguided rules

        No need for empathy. It was a long time ago, and we didn't suffer :) The result of the study was that we rejected both the tool and the idea.

        With respect to your statistics: one set thereof does not a case make.

        I would derive two things from my reading of the numbers.

      • Three modules are heavily over-commented.
      • Large modules are more complicated than small ones.

        What I cannot say from those numbers is whether the complexity arises as a result of

        • the needs of the algorithm required to perform the function of those methods;
        • a bad algorithm having been used;
        • a good algorithm having been badly implemented; or
        • the "complex" (read: big) methods doing too much.

        Indeed, without inspecting the source code, I cannot even tell the accuracy of those metrics.

        It could be that "HitTheEdge" contains a recursive algorithm that is extremely complicated to follow and modify, but simple in its code representation.

        Or that "GenMoves" is a huge if/then/else structure that would be better implemented as a dispatch table

        Or that "Slide" uses a string eval to replace itself with an extremely complicated subroutine that the source code analyser sees simply as a big string constant.

        And that's my point. You are already looking to derive further metrics from the generated metrics, but there is no way to validate the efficacy of those metrics you have, beyond inspecting the code and making a value judgement.

        So you are already falling into the trap of allowing the metrics to become self-serving, but the metrics themselves are not reproducible, scientific measurements; they are simply "indicator values".

        When you measure the length, mass, hardness, reflectivity, temperature, elasticity, expansion coefficient etc. of a piece of steel, you are collecting a metric which can be reproduced by anyone, anywhere, any time. Even if the tools used to make the measurement are calibrated to a different scale, it is a matter of a simple piece of math, or a lookup table to convert from that scale to whichever scale is needed or preferred. This is not the case for any of the metrics in your table.

        You don't say what language your program is coded in, but I could (probably) take all of your methods and reduce them to a single line. It would be a very long line, but in most languages it would still run perfectly well. What effect does that have on your metrics?

        Equally, we could get half a dozen monks (assuming Perl) to refactor your methods according to their own formatting and coding preferences and skill levels. And even if they all do a bang-up job of reproducing the function of your originals--bugs and all--and we then use the same program to measure their code as you used, they will all produce different sets of numbers.

        And that is the crux of my distaste for such numbers. They are not metrics. They do not measure anything! They generate a number, according to some heuristic.

        They do not measure anything about the correctness, efficiency or maintainability of the code that gets run; they only make some guesses based upon the way the coder formatted his source code.

        1. They do not conform to any standards.
        2. They are not transferable between programmers.
        3. They are not transferable between algorithms.
        4. They are not transferable between programs.
        5. They are not transferable between languages.
        6. They are not transferable between sites.
        7. They are not transferable between design or coding methods (OO, procedural, functional etc.).
        8. They are not transferable between assessment methods or tools.

        In short, they are not comparable, and you cannot perform math with them.

        As proof of this, take a look at your "PieceToString" and "HitTheEdge" methods. They have an equal 'complexity' when measured by the same tool. Is this obvious, or even definable, from looking at the source code? If I am given two pieces of steel 10 cm long, even without measuring them with a rule, I can easily tell they are the same length. No such comparison is possible for source code.

        The tool has become the only way of comparing source code, and as it does not (and could not) adhere to any standard, all measurements are relative, not absolute. So, unless everyone agrees on which tool/language/coding standards etc. etc. to use, there is no way to compare two versions of the same thing.

        That means that in order to make comparisons, you have to implement every possible (or interesting) version of the source code, before you can make any inference about whether any one is good or bad.

        And even if you could code every possible implementation of a given algorithm, and could prove that they all produced exactly the same results, and you generated your numbers: What would it tell you?

        Should you pick the version with the lowest complexity rating? The shortest? The longest? The one with the highest ratio of comments?

        Would you make any choice based on the numbers alone? Or would you have to look at the source code?

        If you admit that you would have to look at the source code, then you have just thrown your "metrics" in the bin in favour of your own value judgement.

        And if you didn't, then you should publish the formulae by which you are going to juggle all those numbers in order to make your decision. It should make for interesting reading.


        Examine what is said, not who speaks.
        Silence betokens consent.
        Love the truth but pardon error.
Re: Cyclomatic Complexity of Perl code
by stvn (Monsignor) on Dec 08, 2004 at 04:10 UTC

    I can't help you on the Cyclomatic Complexity tool thing. But if you want to build your own, and need to parse Perl code (and analyze it), then you might want to give PPI a look.
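
    For instance, a very naive decision-point counter built on PPI might start out something like this. (This is only a sketch of the approach, not a faithful McCabe implementation, and the list of things that count as decision points is my own guess.)

        use strict;
        use warnings;
        use PPI;

        # Tokens that open an extra path through the code.
        my %branching = map { $_ => 1 }
            qw( if elsif unless while until for foreach and or && || ? );

        sub naive_complexity {
            my ( $file ) = @_;
            my $doc = PPI::Document->new( $file )
                or die "Could not parse '$file': " . PPI::Document->errstr;

            my $score  = 1;    # the straight-line path
            my $tokens = $doc->find( sub {
                my ( undef, $elem ) = @_;
                return 0 unless $elem->isa( 'PPI::Token::Word' )
                             or $elem->isa( 'PPI::Token::Operator' );
                return $branching{ $elem->content } ? 1 : 0;
            } );
            $score += scalar @$tokens if $tokens;
            return $score;
        }

        print naive_complexity( $ARGV[0] ), "\n";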

    From a quick scan of the Cyclomatic Complexity page you linked to, and from this sentence fragment/description, "it measures the number of linearly-independent paths through a program module", it makes me think of Devel::Cover, which uses the "runops" (I think that is the name) routine to gather coverage information, and which could possibly be used to calculate Cyclomatic Complexity.

    -stvn
Re: Cyclomatic Complexity of Perl code
by pernod (Chaplain) on Dec 08, 2004 at 09:39 UTC
    I would like to be able to calculate the Cyclomatic Complexity of some Perl code.

    Why?

    Don't get me wrong. Software metrics, properly used and configured, are great tools. The question is: is cyclomatic complexity what you really need?

    BrowserUk has given lots of reasons for not using this metric on dynamic languages, and I agree wholeheartedly with him. I would suggest looking at other metrics before this one, starting with test coverage (as mentioned already in this thread). A super search on Devel::Cover will give you lots of useful information.

    A look at Tom Gilb's work on evolutionary programming and measurable requirements might also be a good idea. Gilb is a loudmouth, but many of his ideas are sound.

    I know this does not answer your question, but I hope it will give some food for thought. Good luck measuring!

    pernod
    --
    Mischief. Mayhem. Soap.

      Which makes this a good place to bring up what I heard someone term "Gilb's Law":

      (paraphrasing) "For anything that you wish to measure there exists some way of doing so that is better than not measuring at all"
Re: Cyclomatic Complexity of Perl code
by dragonchild (Archbishop) on Dec 08, 2004 at 13:26 UTC
    On the Software Design mailing list, we've been discussing this exact topic[1]. One of our members has written PPI, which does a good (but not perfect) job of parsing Perl code. From that, we're starting to work on a way of determining complexity. Of course, we're still arguing about what constitutes "complexity", but that's the fun part! :-)

    We'd really love to have you, or anyone else, join us.

    [1] Without the nifty-keen name for it!

    Being right, does not endow the right to be rude; politeness costs nothing.
    Being unknowing, is not the same as being stupid.
    Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
    Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

Re: Cyclomatic Complexity of Perl code
by stvn (Monsignor) on Dec 08, 2004 at 20:02 UTC

    NOTE: This has been cross posted on the sw-design mailing list as well.

    So I was thinking about this Cyclomatic Complexity thing, and it occurred to me that B::Concise actually gives up some of the necessary information (at least I think it does).

    Now of course there are some issues with this. To start with, B::Concise's sequence numbers are in base 36, and values greater than 62 are not supported, according to the docs. However, if you use the B::Concise functional interface, it seems that it might be possible to get around this restriction by using the perl opcode hex addresses instead of the sequence numbers. However, now I am getting into needing to really write some code, and I am currently out of time and need to get back to "real" work. But anyway, I am interested in comments from the group.

    -stvn

      You don't co-opt B::Concise for this. You use B::Utils::walkoptree() and everywhere you find a B::LOGOP, you increment your complexity by one. You shouldn't actually use this, however, for points better raised by BrowserUK, simonm, and stvn in other parts of this thread.

      use B ();
      use B::Utils ();

      # Example usage, assuming some sub function() is defined elsewhere:
      print "function() == " . sub_simplistic_complexity( \&function ) . "\n";
      print "overall: "      . program_simplistic_complexity()         . "\n";

      sub program_simplistic_complexity {
          my $simplistic_complexity = 0;

          # all_roots() returns a hash of sub name => root op, so take the values.
          my %roots = B::Utils::all_roots();
          for my $op ( values %roots ) {
              $simplistic_complexity += op_simplistic_complexity( $op );
          }
          return $simplistic_complexity;
      }

      sub sub_simplistic_complexity {
          op_simplistic_complexity( B::svref_2object( shift() )->ROOT );
      }

      sub op_simplistic_complexity {
          my $op = shift;
          my $simplistic_complexity = 0;

          # Only B::LOGOPs can( 'other' ): they are the ops with a second
          # branch to follow (and, or, ternaries, etc.), i.e. decision points.
          B::Utils::walkoptree_filtered(
              $op,
              sub { shift()->can( 'other' ) },
              sub { ++$simplistic_complexity },
          );
          return $simplistic_complexity;
      }

        Could you elaborate on the significance of B::LOGOP please?

        -stvn