|Just another Perl shrine
The problem of documenting complex modules.by BrowserUk (Patriarch)
|on Jul 05, 2015 at 08:41 UTC
This is meditation; but I also hope that it might start a discussion that will come up with (an) answers to what I see as an ongoing and prevalent problem.
This has been triggered at this time by my experience of trying to wrap my brain around a particular complex module; but I don't want to get into discussion particular to that module, so I won't be naming it.
Suffice to say that CPAN is replete with modules that are technically brilliant and very powerful solutions to the problems they address; and that deserve far wider usage than they get.
In many cases the problem is not that they lack documentation -- often quite the opposite -- but more that they don't have a simple in; a clearly defined and obvious starting point that gives a universal starting point on which the new user can build.
And example of (IMO) good documentation is Parallel::ForkManager. It's synopsis (I've tweaked it slightly to remove a piece of unnecessary fluff):
is sufficient to allow almost anyone needing to use it, for almost any purpose, to put together a reasonable working prototype in a dozen lines of code without reading further into the documentation.
It allows the programmer to get started and move forward almost immediately on solving his problem -- which isn't "How to use P::FM" -- and only refer back to and utilise the more sophisticated elements of P::FM, as and when he encounters the limitations of that simple starting point.
As such, the module is successful in hiding the nitty-gritty details of using fork correctly; whilst imposing the minimum of either up-front learning curve or infrastructural boiler-plate upon the programmer; who has other more important (to him) things on his mind.
Contrast that with something like POE which requires a month of reading through the synopsis of the 800+ modules in that namespace POE::*, and then another month of planning, before the new user could put together his first line of code. As powerful as that
And before anyone says that it is unfair to compare those two modules -- which maybe true -- the purpose was to pick extremes to make a point; not to promote or denigrate either.
Another module that I know I should have made much more use of in the type of code I frequently find myself writing, is PDL. I've tried at least a dozen times to use PDL as a part of one of my programs; and (almost) every time I've abandoned the attempt before ever writing a single line of PDL, because I get frustrated by the total lack of a clear entry point in to the surfeit of documentation.
There's the FAQ, and the Core; and the Index; and the QuickStart; and the Doc; and the Basic; and the Lite; and the Course; and the Philosophy; and the pdldoc; and the Tips; and ... I'm outta here. I'm trying to write my program, which does a little math on some biggish datasets that would benefit from being vectored, but life's too short...
Again; the underlying code is brilliant (I am assured), and it isn't a case of a lack of documentation; just a mindset that says: "this is PDL in all its glory, power and nuance. bathe yourself in its wonderfulness and wallow in its depth". Oh, and then when you've immersed yourself in its glory, understood its philosophy, and acclimated its nuance, then you can get back to working out how to use it to solve your problem.
And that's a real shame; and a waste.
I'm not sure what the solution is. I do know that the modules I use most List::Util, Data::Dump, threads etc. I have rarely ever had to look at the documentation; their functionality has (for me) become an almost invisible extension of Perl itself, and only the occasional (perhaps you forgot to load "sum"?) reminds me that they aren't.
Of course, what they do is essentially pretty simple; but that in itself is a perhaps a clue.
I do know that (for me) the single most important thing in encouraging my use of a module is being able to C&P the synopsis into my existing program, tweak the variable names, and have it do something useful immediately.
In part, that comes down to a well designed API; in part, to well-chosen defaults; and part having a well-chosen, well-written synopsis that addresses the common case; with variable names and structure that make it obvious how to adapt that synopsis for the common case. Once I have something that compiles and runs -- even if it doesn't do exactly what I need it to do; or even what I thought it would do from first reading -- it gives me a starting point and something to build on. And that encourages me to persist. To read the documentation on a as-I-need-to basis to solve particular problems as I encounter them.
Over a decade ago, I posted My number 1 tip for developers.; and this is the other side of that same philosophy. Start simple and build.
And that I think has to be the correct approach to documenting complex modules. They need to:
What it must not do:
If you want programmers to use your modules, you need to tell them what (the minimum) they *NEED TO KNOW* to get started. And then give a clear index to the variations, configurations and extensions to that basic starting point.
Achieve that, give them their 'in', with the minimum of words, fuss or choice, and they'll come back for all the rest as they need it.
This is ill-thought through and incomplete, so what (beyond risking offending half the authors on CPAN) am I trying to achieve with this meditation?
I'd like to hear if you agree with me? Or how you differ. What you look for in module documentation. Examples that you find particularly good; or bad.
It'd be nice to be able to derive from the thread, a set of consensus guidelines to documenting moderate to complex modules -- that almost certainly won't happen -- but if we managed to get a good cross section of opinions on what makes for good and bad documentation; and a variety of opinions of the right way to go about it; it might provide a starting point for people needing to do this in the future.
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.I'm with torvalds on this Agile (and TDD) debunked I told'em LLVM was the way to go. But did they listen!