Splitting program into modules

lis128 has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Splitting program into modules by eyepopslikeamosquito (Archbishop) on Nov 11, 2018 at 05:11 UTC
> my main goal is to document code, understand its flow and based on that create another functionality

I've kept a list of PM nodes over the years related to this topic.

Legacy Code
- Swallowing an elephant in 10 easy steps by ELISHEVA (describes how she tackles big problems to keep moving forward rather than going around in circles)
- Strategies for maintenance of horrible code?
- Perl archeology: Need help in refactoring of old Perl code that does not use strict
- Moving from scripting to programming
- Code Structure Changes
- Object-oriented Reengineering Patterns book now available as a free download
- Nobody Expects the Agile Imposition (Part VI): Architecture (discusses refactoring vs rewriting of very large code bases)
- The Boy Scout Rule (apply the Boy Scout rule of "leave the campground cleaner than when you found it" to your code)

Adding Tests to Legacy Code
- Looking for help for unit tests and code coverage on an existing perl script
- Needing help on testing
- How do you test end-user scripts?
Re^2: Splitting program into modules by lis128 (Acolyte) on Nov 11, 2018 at 17:16 UTC
I must say that the feedback exceeded my expectations. Thank you all for the humongous repository of things to read, I really appreciate it. I just wanted to say that I have not abandoned the topic and will try to dig through your advice and links.

In the meantime, I've managed to isolate similar functionality without switching namespaces: I just got rid of the package statements and the use of the Exporter module. But this leads to another problem. As I wrote earlier, I am using my own simple debugging routines (yes, I know it can be done better, but I am developing these modules to give them just the functionality I need). Let the code speak for itself:

    my $debug = $ENV{'dbg'};
    sub debugInfo {
        my $iWasAt          = ( caller(1) )[3] || "main";
        my $lineWhereCalled = ( caller(0) )[2] || ( caller(1) )[2];
        print STDERR ("\033[1;31m\t$iWasAt\033[0m\@\033[1;32m$lineWhereCalled:\033[0m \t\t@_\n") if ($debug);
    }

Until now everything went fine: I called `debugInfo("entry: @_");` and received the package name with the corresponding line where the call was made, like

    main@139: wchodze w loopa, iteracja:4
    Database::sql_connect@144: entry:
    API::base@13: entry: config

Now that the files I `use` are no longer packages, `$iWasAt` always reports `main`, and the line numbers are relative to the module file. So I am looking for another solution, but I feel that with you hackers, nothing's impossible :)

Going to read thoroughly through your posts, thanks again.
Re^3: Splitting program into modules by LanX (Saint) on Nov 11, 2018 at 18:16 UTC
As I already said, you should start by splitting your 14000 lines into multiple files and `require` or `do` them; there is no need to switch packages in the first step. (Careful with file-scoped private variables.)

Since `caller` will also tell you the filename, your debug routine can then be more explicit.

Btw: using the trace option of the debugger might be another option.

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery
FootballPerl is like chess, only without the dice
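To make the `caller` remark concrete, here is a minimal sketch of a debug routine that reports the file as well as the sub and line. The helper name `where_called` and the sample sub `sql_connect` are made up for illustration; only `caller` and `File::Basename` are standard Perl.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

my $debug = $ENV{dbg};    # same environment switch as in the post above

# Describe a call site. $depth = 0 means "whoever called where_called()".
# (hypothetical helper, not part of any module)
sub where_called {
    my ($depth) = @_;
    my (undef, $file, $line) = caller($depth);           # file and line of the call
    my $sub = ( caller($depth + 1) )[3] || 'main';       # enclosing sub, if any
    return sprintf '%s (%s:%d)', $sub, basename($file), $line;
}

sub debugInfo {
    return unless $debug;
    # depth 1: skip debugInfo's own frame and report its caller
    print STDERR where_called(1), ": @_\n";
}

# hypothetical sub standing in for one of the split-out routines
sub sql_connect {
    debugInfo("entry: @_");
    return where_called(0);
}

print sql_connect('config'), "\n";
```

Because the file name comes from `caller`, the report stays meaningful even when every part lives in package `main` and line numbers are relative to each file.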
Re^3: Splitting program into modules by talexb (Chancellor) on Nov 11, 2018 at 18:14 UTC
I'm a big fan of the Log::Log4perl module for logging. For me, the terrific feature of this module is that you can adjust the level of messages you get in your log file -- dial it up to DEBUG to get everything, or back down to WARN for just warnings. In between the two is INFO, containing useful messages about what my scripts are doing.

If you add log messages to the various modules that you are developing, you'll be able to track in what order things are happening. It's really illuminating to see this stuff scroll by -- I have status screens that watch the tail end of various log files during production hours so I can stay on top of how my system is behaving.

Good luck -- let us know how it all turns out.

Alex / talexb / Toronto
Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.
Re: Splitting program into modules by LanX (Saint) on Nov 10, 2018 at 23:38 UTC
Well ... some general thoughts on splitting up unknown code

Incremental strategy
- create your test suite first
- do little steps, and always test the result
- use a revision control system like git
- commit every change
- use branches for experiments
- once you have found a previously untested bug, expand your test suite and roll back

Prerequisites
Study and understand

==== Perl essentials
- strict and warnings
- my vs our variables, namespaces, scoping, blocks
- exporter
- constants
- warn and die

==== Tools and techniques
- the debugger
- trace options to log function calls
- log files
- Data::Dumper, Data::Dump

==== your application
- your data model (database, configs)
- input and output
- available doc
- stakeholders (users, contributors, maintainers) to ask

Analyze
- analyze the dependencies of your subs (x calls y calls z)
- analyze the shared variables
- try to visualize the dependencies in a graph
- the hierarchy should help identifying logical units (aka modules)
- use tools to help you analyze, like B::Xref
- other tools like those mentioned here: Searching for duplication in legacy code
- identify dead code (never-called subs, commented-out trash)

Documentation
- once you have understood a mechanism, write it down
- use Pod headers when possible
- normalize (beautify) your code with Perl::Tidy
- review your pod2html from time to time and fill gaps

Modularisation / Namespaces ?
- bundle subs into modules by functionality, not technology (not all sql in one module, e.g. look at the TABLE names)
- `require` into the same namespace might be an easier intermediate step before learning to use `exporter` ²
- modules normally require namespaces (packages)
- package variables in other modules need to be fully qualified when used outside: `$Pkg::var` ¹
- same for `Pkg::subs()` ¹
- modules allow to export vars and subs when `use`d
- modules allow to pass and init shared variables when using `import` (like a database handle)

Object Oriented Programming
- logical modules are sometimes better OO classes
- check guides on "when OOP is better"
- indicator: if you have to pass around the same arguments
- indicator: a group of subs always accesses the same global vars
- indicator: init() routines for globals are sometimes better ->new()
- easier to construct an object with encapsulated instance vars and class vars
- have a look at `Moo` before doing old style OOP

Improvement
- make your code more fault tolerant
- add argument checking to your subs
- rewrite many positional args into named args
- condense duplicated code into new functions
- limit the scope of vars and subs if possible
- give identifiers like variables meaningful names
- document your strategy for future maintainers

See Also
- Perl split file into modules

Cheers Rolf

update
Expanded the OOP part

¹) that's an intermediate step, exporting and importing is much cleaner

²) As a rule of thumb, from easier to better:
- require files, same namespace (NB: my-lexicals can't be shared)
- try to group connected subs into namespaces with package
- use module, different namespaces => fully qualified identifiers for shared data
- use module, different namespaces => exporting and importing shared data
- use oo-class: shared data in instance/class vars and methods, transport via constructor and setters/getters
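The two middle steps of that progression (fully qualified names first, then exporting and importing) can be sketched in one runnable file. `My::Orders`, `order_total` and `$Currency` are invented names for illustration; the `BEGIN` block and the `%INC` line are only there so the sketch runs without a separate `lib/My/Orders.pm`.

```perl
use strict;
use warnings;

# Normally this package would live in its own file (lib/My/Orders.pm).
BEGIN {
    package My::Orders;                 # hypothetical module, grouped by functionality
    use Exporter 'import';              # borrow Exporter's import() method
    our @EXPORT_OK = qw(order_total);   # exported only on request
    our $Currency  = 'EUR';             # package variable: $My::Orders::Currency

    sub order_total {
        my (@prices) = @_;
        my $sum = 0;
        $sum += $_ for @prices;
        return "$sum $Currency";
    }

    $INC{'My/Orders.pm'} = __FILE__;    # tell "use" the module is already loaded
}

# Step 1: different namespace, fully qualified identifiers
my $qualified = My::Orders::order_total(10, 20);

# Step 2: different namespace, import what you need
use My::Orders qw(order_total);
my $imported = order_total(10, 20);

print "$qualified\n$imported\n";
print "currency: $My::Orders::Currency\n";
```

Listing names in `@EXPORT_OK` rather than `@EXPORT` keeps the caller's namespace clean, since nothing is imported unless the caller asks for it by name.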
Re^2: Splitting program into modules by eyepopslikeamosquito (Archbishop) on Nov 12, 2018 at 06:05 UTC
> Object Oriented Programming: logical modules are sometimes better OO classes ...

As for whether and when to use OO, my simple rule of thumb is to ask "do I need more than one?": if the answer is yes, an object is indicated; if the answer is no, a module.

A (non Perl-specific) design checklist (derived from On Coding Standards and Code Reviews):
- Coupling and Cohesion. Systems should be designed as a set of cohesive modules as loosely coupled as is reasonably feasible.
- Testability. Systems should be designed so that components can be easily tested in isolation.
- Data hiding. Minimize the exposure of implementation details. Minimize global data.
- Interfaces matter. Once an interface becomes widely used, changing it becomes practically impossible (just about anything else can be fixed in a later release).
- Design the module's interface first.
- Design interfaces that are: consistent; easy to use correctly; hard to use incorrectly; easy to read, maintain and extend; clearly documented; appropriate to your audience.
- Be sufficient, not complete; it is easier to add a new feature than to remove a mis-feature.
- Use descriptive, explanatory, consistent and regular names.
- Correctness, simplicity and clarity come first. Avoid unnecessary cleverness. If you must rely on cleverness, encapsulate and comment it.
- DRY (Don't repeat yourself).
- Establish a rational error handling policy and follow it strictly.
Re^3: Splitting program into modules by LanX (Saint) on Nov 12, 2018 at 10:58 UTC
I once had to maintain code which had many subs accessing a bunch of global states, which were switched by calling an "init()" routine or by passing flags.

After long analysis (Freudian, yes) I realized that these routines were effectively methods, the states were instance vars and the so-called init() routine switched the instances.

Well, actually that was only a simplified description of what happened; I don't wanna give you nightmares. :)

Cheers Rolf
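A small sketch of that observation, using plain `bless`-based OO from core Perl: what used to be global state switched by init() becomes per-instance data, so two "states" can coexist. The `Report` class and its methods are invented for illustration and are not taken from the code being discussed.

```perl
use strict;
use warnings;

package Report;    # hypothetical class replacing "globals + init()"

# new() plays the role of the old init(): it sets up the state,
# but each call gets its own copy instead of overwriting globals.
sub new {
    my ($class, %args) = @_;
    my $self = {
        title => $args{title} // 'untitled',
        lines => [],
    };
    return bless $self, $class;
}

# Former subs that read the globals become methods reading $self.
sub add_line {
    my ($self, $text) = @_;
    push @{ $self->{lines} }, $text;
    return $self;
}

sub render {
    my ($self) = @_;
    return join "\n", "== $self->{title} ==", @{ $self->{lines} };
}

package main;

my $daily  = Report->new(title => 'daily');
my $weekly = Report->new(title => 'weekly');

$daily->add_line('3 orders');
$weekly->add_line('17 orders');

print $daily->render,  "\n";
print $weekly->render, "\n";
```

This also matches the "do I need more than one?" rule of thumb from the previous reply: as soon as two reports exist at once, an object is the natural fit.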
Re: Splitting program into modules by karlgoethebier (Abbot) on Nov 11, 2018 at 19:33 UTC
I don't know how you count. Perhaps it's not so much code if you skip the shebangs, pragmas and all the blanks? And you could consider using Class::Tiny and Role::Tiny to organize your code. Please use SuperSearch to find some examples I've provided in the past. And sorry, I'm on my iPad and copying the links is a pain in the ass 😕 on this device.

Regards, Karl

«The Crux of the Biscuit is the Apostrophe»

`perl -MCrypt::CBC -E 'say Crypt::CBC->new(-key=>'kgb',-cipher=>"Blowfish")->decrypt_hex($ENV{KARL});'`
Re^2: Splitting program into modules by LanX (Saint) on Nov 11, 2018 at 19:46 UTC
I'm mostly on Android and I just copy the URLs, and my nodelet hack does the rest for me! :p

EDIT: I.e. this `https://perlmonks.org/?node_id=1153804` becomes this: Good Intentions: Wikisyntax for the Monastery, after preview/posting.

Cheers Rolf
Re: Splitting program into modules by harangzsolt33 (Chaplain) on Nov 12, 2018 at 23:52 UTC
Is there a program that reads Perl source code and prints out the names of all the subs declared in that file? Perhaps an even more useful program might also show the dependencies, so at a quick glance you could see all the subs and which one depends on which. If they are called `sub a3e {}` then, of course, that won't reveal much. But if they are called "calc_offset" or "getTimeZone" or something that is self-explanatory, then such a program would help a lot in breaking down this huge code into comprehensible chunks. Is there such a program?
Re^2: Splitting program into modules by stevieb (Canon) on Nov 13, 2018 at 00:45 UTC
My Devel::Examine::Subs can list subs within files. Single file example:

    use warnings;
    use strict;

    use Devel::Examine::Subs;

    my $des = Devel::Examine::Subs->new(file => 'lib/Devel/Examine/Subs.pm');
    my $subs = $des->all;

    print "$_\n" for @$subs;

Output (one name per line, joined here for brevity):

    BEGIN new all has missing lines module objects search_replace replace inject_after inject remove order backup add_functionality engines pre_procs post_procs run valid_params _cache _cache_enabled _cache_safe _clean_config _clean_core_config _config _file _params _read_file _run_directory _run_end _write_file _core _pre_proc _proc _post_proc _engine _pod

You can also do entire directory structures:

    use warnings;
    use strict;

    use Devel::Examine::Subs;

    my $des = Devel::Examine::Subs->new(file => '.');
    my $data = $des->all;

    for my $file (keys %$data){
        print "$file:\n";
        for my $sub (@{ $data->{$file} }){
            print "\t$sub\n";
        }
    }

Snipped example output (sub names joined onto one line per file here):

    t/test/files/sample.pm:
        one one_inner one_inner_two two three four function five six seven eight
    examples/write_new_engine.pl:
        dumps
    lib/Devel/Examine/Subs/Sub.pm:
        BEGIN new name start end line_count lines code
    lib/Devel/Examine/Subs/Preprocessor.pm:
        BEGIN new _dt exists module inject replace remove _vim_placeholder

The software does a ton of useful things, but these are examples of the most basic functionality. It does not know how to see sub dependencies of other subs.

I do have another piece of software that does; however, it is intrusive (it actually writes into the Perl files), and you have to run the software to get usable trace information (i.e. if you don't exercise all scenarios, it may not find all flows). I don't have the time at the moment to write a proper scenario for that, but have a look at Devel::Trace::Subs if you're interested. If you don't come up with anything else by morning, I'll create a good example.
Re^3: Splitting program into modules by stevieb (Canon) on Nov 14, 2018 at 03:23 UTC
So I've put together a very basic display of how Devel::Trace::Subs works. Again, it's intrusive; it actually writes into the files you want to capture tracing info from (I wrote this software, which another piece of software required, primarily out of sheer curiosity).

Here's the original Perl file we're working with (`./test.pl`):

    use warnings;
    use strict;

    three(5);

    sub three {
        return two(shift);
    }
    sub two {
        return one(_helper(shift));
    }
    sub one {
        my $num = calc(shift);
        display($num);
    }
    sub calc {
        my $num = shift;
        return $num ** 3;
    }
    sub display {
        my $num = shift;
        print "$num\n";
    }
    sub _helper {
        my $num = shift;
        return ++$num;
    }

When run, it produces this output: `216`

Very basic. Now, install `Devel::Trace::Subs`, and from the command line, tell it to become traceable:

`perl -MDevel::Trace::Subs=install_trace -e 'install_trace(file => "test.pl")'`

...now the `test.pl` file looks like this:

    use warnings;
    use Devel::Trace::Subs qw(trace trace_dump); # injected by Devel::Trace::Subs
    use strict;

    three(5);

    sub three {
        trace() if $ENV{DTS_ENABLE}; # injected by Devel::Trace::Subs
        return two(shift);
    }
    sub two {
        trace() if $ENV{DTS_ENABLE}; # injected by Devel::Trace::Subs
        return one(_helper(shift));
    }
    sub one {
        trace() if $ENV{DTS_ENABLE}; # injected by Devel::Trace::Subs
        my $num = calc(shift);
        display($num);
    }
    sub calc {
        trace() if $ENV{DTS_ENABLE}; # injected by Devel::Trace::Subs
        my $num = shift;
        return $num ** 3;
    }
    sub display {
        trace() if $ENV{DTS_ENABLE}; # injected by Devel::Trace::Subs
        my $num = shift;
        print "$num\n";
    }
    sub _helper {
        trace() if $ENV{DTS_ENABLE}; # injected by Devel::Trace::Subs
        my $num = shift;
        return ++$num;
    }

I'd like to point out that this software was designed to be used within modules, not normal scripts, but I digress. In order to get the output from the tracing, you have to add a couple of things to your calling script (in this case, it's the original script itself). We'll pretend we're calling modules infected with the trace software here. Add the trace enabling flag, then, after all of the calls you want trace info from have been made, call the `trace_dump()` function:

    $ENV{DTS_ENABLE} = 1;
    three(5); # this is the original call stack you're running
    trace_dump();

Now, you get the original output, but you also get the code flow and stack trace information:

    216

    Code flow:
        1: main::three
        2: main::two
        3: main::_helper
        4: main::one
        5: main::calc
        6: main::display

    Stack trace:
        in: main::three    sub: -            file: test.pl  line: 7   package: main
        in: main::two      sub: main::three  file: test.pl  line: 13  package: main
        in: main::_helper  sub: main::two    file: test.pl  line: 17  package: main
        in: main::one      sub: main::two    file: test.pl  line: 17  package: main
        in: main::calc     sub: main::one    file: test.pl  line: 21  package: main
        in: main::display  sub: main::one    file: test.pl  line: 22  package: main

You can opt via parameters to `trace_dump` to display just the code flow or the stack trace or both (as is the default as shown above), in text or HTML output formats.

This is a very basic example of how I've used this software. Again, we're using it in a single file here. Normally I'd have a test script using external modules, so the command to return your original code is this:

`perl -MDevel::Trace::Subs=remove_trace -e 'remove_trace(file => "test.pl")'`

...which returns the script back to default, except for the manual lines (which wouldn't normally be in an original `.pl` file). Delete these lines manually:

    $ENV{DTS_ENABLE} = 1;
    trace_dump();

I'll try to put together a much better example of how I really use it in the coming days.
Re^4: Splitting program into modules by LanX (Saint) on Nov 14, 2018 at 13:44 UTC
Re^5: Splitting program into modules by stevieb (Canon) on Nov 14, 2018 at 14:21 UTC
Re^3: Splitting program into modules by LanX (Saint) on Nov 14, 2018 at 17:07 UTC
> It does not know how to see sub dependencies of other subs.

From what I can see it also doesn't show dependencies from "outer" variables (globals or closure), right?

update: which is relevant in this thread

Cheers Rolf
Re^2: Splitting program into modules by Anonymous Monk on Nov 14, 2018 at 06:24 UTC
> Is there a program that reads a Perl source code and prints out the names of all the subs declared in that file?

Devel::NYTProf generates an HTML report with sortable lists of subs that you can click to view the code with full timing information:

`perl -d:NYTProf script.or.module.pl; nytprofhtml --open`
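For a quick first look without installing anything, the question can also be approximated with a crude static scan in core Perl. This is only a regex-based sketch, not how any of the tools above work: it assumes each `sub name` starts a line, treats everything up to the next declaration as that sub's body, and can be fooled by strings, comments, heredocs and method calls. The function name `sub_dependencies` is made up, and the sample input is a condensed version of the `test.pl` shown earlier in the thread.

```perl
use strict;
use warnings;

# Return a hash ref: sub name => sorted list of other declared subs
# that appear to be called (name followed by an opening paren) in its body.
sub sub_dependencies {
    my ($source) = @_;

    # Pass 1: collect the text belonging to each declared sub.
    my (%body, $current);
    for my $line (split /\n/, $source) {
        if ($line =~ /^\s*sub\s+(\w+)/) {
            $current = $1;
            $body{$current} = '';
        }
        $body{$current} .= "$line\n" if defined $current;
    }

    # Pass 2: look for calls to the other declared subs in each body.
    my %deps;
    for my $name (keys %body) {
        my @called = grep {
            $_ ne $name && $body{$name} =~ /\b\Q$_\E\s*\(/
        } keys %body;
        $deps{$name} = [ sort @called ];
    }
    return \%deps;
}

# Sample input, condensed from the test.pl earlier in this thread.
my $sample = <<'PERL';
three(5);
sub three { return two(shift); }
sub two { return one(_helper(shift)); }
sub one {
    my $num = calc(shift);
    display($num);
}
sub calc { my $num = shift; return $num ** 3; }
sub display { my $num = shift; print "$num\n"; }
sub _helper { my $num = shift; return ++$num; }
PERL

my $deps = sub_dependencies($sample);
for my $name (sort keys %$deps) {
    my @called = @{ $deps->{$name} };
    print "$name -> ", (@called ? join(', ', @called) : '(nothing)'), "\n";
}
```

For real code, a compiler-backed tool such as B::Xref (mentioned in the analysis list earlier) is the more reliable route; the sketch is only meant to show how little is needed for a first overview.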
Re: Splitting program into modules by Anonymous Monk on Nov 12, 2018 at 07:05 UTC
A contrarian view: 14000 lines of working code is no joke. Why bother refactoring? If you don't understand it, keep trying. To add new functionality simply write new subroutines that follow the conventions of the original code. Fragmenting the code may make it even harder to understand and maintain. Perl makes delicious and potent spaghetti, just add more sauce, and enjoy. K.I.S.S.
Re^2: Splitting program into modules by eyepopslikeamosquito (Archbishop) on Nov 12, 2018 at 10:42 UTC
> A contrarian view: 14000 lines of working code is no joke. Why bother refactoring?

Successful software tends to live a long time: bugs are fixed; new features added; new platforms supported; the software adapted to new markets. That is, successful software development is a long term activity. Planning for success means planning for your code to be maintained by a succession of many different programmers over a period of many years. Not planning for that is planning to fail.

This is the primary reason for refactoring and continuously keeping the code clean, to make long term code maintenance sustainable.

Put another way, it's the difference between Programming "Hey, I got it to work!" and Engineering "What happens when code lives a long time?". A quick one-off hack is fine if the code only needs to run a couple of times ... but not if it becomes a long-lived critical feature.

Programming is easy, Engineering hard. You need to hire programmers with sound technical skills and domain knowledge, enthusiastic, motivated, get things done, keep the code clean, resilient, innovative, team players ... and then motivate them, train them, keep them happy so they don't want to leave, yet have effective handovers when they do ... a hard problem. Yet to be successful that's what you need to do.

See also: Why Create Coding Standards and Perform Code Reviews?
Re^2: Splitting program into modules (Divide And Conquer) by LanX (Saint) on Nov 12, 2018 at 14:01 UTC
> 14000 lines of working code is no joke. Why bother refactoring?

Such monsters are mostly full of bugs, because maintenance becomes impossible once you've lost the overview.

Let's be generous and assume 100 lines of code and clutter per function on average. That'll mean 140 functions ...

... divide this by 5 or 10 or 15 ...

> K.I.S.S.

D.A.C.D. ¹

Splitting up into smaller units, included with `do` or `require`, is pretty safe ² ... and will already give a far better overview:
- easier POD documentation
- better control over global vars
- granulated revision control by changing single files instead of a whole bundle
- easier deployment
- more efficient testing

And I haven't even talked yet about the possibilities to improve this code further, as described in my first post.

Cheers Rolf

¹) Divide and conquer, Dumbo!
²) file-scoped lexicals must be in the same file as the functions that access them
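The caveat about file-scoped lexicals is easy to see in a small experiment. The sketch below writes a throwaway part file into a temp directory purely so that it is self-contained; in a real split the part would simply be a file next to the main script. The file name `counter_part.pl` and the subs `bump` and `current` are invented for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Create the "split out" part on the fly (hypothetical file name).
my $dir  = tempdir(CLEANUP => 1);
my $part = File::Spec->catfile($dir, 'counter_part.pl');

open my $fh, '>', $part or die "Cannot write $part: $!";
print {$fh} <<'PART';
use strict;
use warnings;

my  $count = 0;         # file-scoped lexical: only code in THIS file sees it
our $Label = 'hits';    # package variable: lands in main, visible everywhere

sub bump    { return ++$count }
sub current { return $count }

1;  # require insists on a true return value
PART
close $fh or die "Cannot close $part: $!";

# No package statement in the part, so its subs are defined in main::
require $part;

bump() for 1 .. 3;

our $Label;             # declare the shared package variable under strict
print "$Label: ", current(), "\n";
```

The main script can reach `$Label` and the subs, but it has no way to touch `$count` directly, which is why subs and the `my` variables they share have to move into the same file together.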
Re: Splitting program into modules by harangzsolt33 (Chaplain) on Nov 11, 2018 at 03:50 UTC
I have written a little sub that includes other files in your Perl code, and I think it's exactly what you need. Just try it and see if it works:

    include('database.pl');

OR

    my $whatever = include('getTime.pl');

    ...

    sub include{open my$H,'<:raw',$_[0];read($H,my$E,999999)or die"Error: Can't include \"$_[0]\"";close$H;eval$E;}
Re^2: Splitting program into modules by Corion (Patriarch) on Nov 11, 2018 at 07:29 UTC
Can you tell us how your code improves over `do` and `require`? Also, please note the limitations of your code, like that it doesn't handle files larger than a megabyte.
Re^3: Splitting program into modules by harangzsolt33 (Chaplain) on Nov 12, 2018 at 03:06 UTC
Oh, yes, there seems to be no difference between include() and require. I haven't thought of that! :/
Re^4: Splitting program into modules by LanX (Saint) on Nov 12, 2018 at 03:19 UTC
Re^5: Splitting program into modules by afoken (Chancellor) on Nov 12, 2018 at 15:17 UTC
Re: Splitting program into modules by Anonymous Monk on Nov 12, 2018 at 11:53 UTC
> what am i missing here? How do properly split this code into logical chunks of separate files, but keeping namespace "main"?

There's nothing particularly "proper" about splitting the code into separate files. You keep things in main by writing subroutines, not by fragmenting the codebase and then struggling to unify it. One program should be one file, unless the "parts" are truly going to be reused by other programs (which they usually are not).
Re^2: Splitting program into modules by eyepopslikeamosquito (Archbishop) on Nov 12, 2018 at 20:51 UTC
> One program should be one file, unless the "parts" are truly going to be reused by other programs (which they usually are not)

I hope you're not recommending 14,000 lines of main program in a single file!

On the contrary, I recommend keeping the main program file short, with most of the work done in (highly cohesive, loosely coupled) modules -- with documentation and a test suite around each module. You can find many examples of this approach on the CPAN. For example, in Perl::Tidy and Perl::Critic, the `perltidy` and `perlcritic` main programs are not much more than one-liners, essentially just:

`use Perl::Tidy; Perl::Tidy::perltidy();`

and:

`use Perl::Critic::Command qw< run >; run();`

with all the work being done in (well-documented) modules with test suites around each module.
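The same shape can be sketched for one's own program: a module with a `run` entry point that returns an exit status, and a main script that does little more than call it. `My::App` and its behaviour are invented for illustration; the `BEGIN` block and the `%INC` line only exist so the sketch runs as a single file instead of `lib/My/App.pm` plus a script.

```perl
use strict;
use warnings;

# Would normally live in lib/My/App.pm, with its own tests and POD.
BEGIN {
    package My::App;    # hypothetical application module

    # Entry point: takes the command line arguments, returns an exit status.
    sub run {
        my (@args) = @_;
        return usage() unless @args;
        print join(' ', map { uc $_ } @args), "\n";
        return 0;
    }

    sub usage {
        print STDERR "usage: myapp WORD...\n";
        return 2;
    }

    $INC{'My/App.pm'} = __FILE__;    # mark the module as already loaded
}

# Everything below is the entire "main program".
use My::App;

my $status = My::App::run(@ARGV);
# A real script would finish with: exit $status;
```

Because `run` takes its arguments as parameters and returns a status instead of calling `exit` itself, a test file can call it directly with different argument lists.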
Re^3: Splitting program into modules by Anonymous Monk on Nov 13, 2018 at 05:13 UTC
> I hope you're not recommending 14,000 lines of main program in a single file!

I prefer writing, and hacking on, single file programs. It's much easier than remembering which module contains what code that's performing some action from a distance. I like to keep as much code as possible in the main program file. That being said, I also use plenty of modules, impose sane order on the source to ease navigation, and document everything.

