PerlMonks  

What are (popular) modules that access/modify @ARGV?

by frozenwithjoy (Priest)
on Apr 14, 2014 at 08:55 UTC (id://1082204)

frozenwithjoy has asked for the wisdom of the Perl Monks concerning the following question:

I'm working on a project that needs to access @ARGV before anything else does. I figured I'd add a subroutine that checks for potential conflicts and issues a warning if any of the more popular modules that access/modify @ARGV are loaded at the time my module is imported.

I know that it isn't feasible to list/check all potentially conflicting modules (see @conflicts below). Nevertheless, I'd like to come up with a decent list of popular/commonly used modules that touch @ARGV. What are your favorites?

This code identifies modules loaded both directly and indirectly (i.e., as dependencies). It works fine and I'm not asking for help with it (but I certainly won't turn away any comments/suggestions):

    use Module::Loaded;

    sub _check_for_conflicts {
        my @conflicts = qw(AppConfig Getopt::Args Getopt::Long Getopt::Simple Getopt::Std);
        my @loaded;
        for (@conflicts) {
            # is_loaded() returns the module's path, or nothing if it is not loaded
            push @loaded, $_ if defined is_loaded($_);
        }
        if ( scalar @loaded > 0 ) {
            print STDERR <<EOF;

    WARNING:
    A module that accesses '\@ARGV' has been loaded before Log::Reproducible.
    To avoid potential conflicts, we recommend changing your script such that
    Log::Reproducible is imported before the following module(s):

    EOF
            print STDERR "    $_\n" for sort @loaded;
            print STDERR "\n";
        }
    }

Thanks all!

Replies are listed 'Best First'.
Re: What are (popular) modules that access/modify @ARGV?
by Corion (Patriarch) on Apr 14, 2014 at 09:35 UTC

    If your goal is to make failed jobs easy to restart, logging @ARGV will not always help you. I have cases where the passage of time makes reusing the exact command-line switches infeasible.

    I have lots of "convenience" command line switches, mostly used for crontab entries which select reporting dates:

    ./my_job --date prev-workday foo bar

    Of course, if such a job has failed, restarting it with that command line is not useful as the prev-workday now may well be a different date than what it was when the job failed.

    To solve this problem, I have come to the pattern of logging the parameters of the "main subroutine" for the job for restarting. The main subroutine must be called with what I call "effective parameters", that is, no more relative dates but only absolute information:

        GetOptions(
            'date:s' => \my $reporting_date,
        );
        $reporting_date ||= $today;
        $reporting_date = Corion::Calendar->dwim( $reporting_date )->ymd; # convert symbolic names to YYYYMMDD
        generate_report(
            reporting_date => $reporting_date,
            ...
        );

    generate_report then logs its parameters for easy restarting, and in the case of an error usually outputs a command line that can be pasted verbatim into a console session to restart the job.
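
The "effective parameters" pattern described above might be sketched roughly as follows. This is only an illustration, not Corion's actual code: the helpers restart_command and shell_quote, the %switch_for mapping, and the sample date are all assumptions made for the example.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical mapping from effective parameter names to command-line switches.
my %switch_for = ( reporting_date => 'date' );

# Minimal POSIX-shell quoting (an assumption for this sketch): leave simple
# words alone, otherwise wrap in single quotes and escape embedded quotes.
sub shell_quote {
    my ($value) = @_;
    return $value if $value =~ /\A[\w.\/:=-]+\z/;
    ( my $quoted = $value ) =~ s/'/'\\''/g;
    return "'$quoted'";
}

# Build a command line from absolute parameters only, so that it means
# the same thing tomorrow as it does today.
sub restart_command {
    my ( $script, %params ) = @_;
    my @parts = ($script);
    for my $name ( sort keys %params ) {
        my $switch = $switch_for{$name} // $name;
        push @parts, "--$switch", shell_quote( $params{$name} );
    }
    return join ' ', @parts;
}

sub generate_report {
    my (%params) = @_;
    my $cmd = restart_command( './my_job', %params );
    warn "To restart this job, run: $cmd\n";
    # ... the actual reporting work would go here ...
    return $cmd;
}

generate_report( reporting_date => '20140411' );
```

The point of the design is that the logged command carries the resolved date rather than a symbolic name such as prev-workday, so it stays valid however much later it is pasted.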

Re: What are (popular) modules that access/modify @ARGV?
by moritz (Cardinal) on Apr 14, 2014 at 09:33 UTC
      Wow, that's a really cool tool! Thanks.
Re: What are (popular) modules that access/modify @ARGV?
by Bloodnok (Vicar) on Apr 14, 2014 at 09:48 UTC
    Hmmm,

    My first thought was whether, or not, you could put B::Xref to use ... and then you wouldn't need to pre-construct a list of likely candidates (which will vary over time), because (with any luck and a following wind) you should be able to ascertain the information on-the-fly...

    Just a thought...

    A user level that continues to overstate my experience :-))

      So, I took your suggestion and wrote the following subroutine. Much of it is just the process of comparing the module's path to the object being called to determine the module name. It works well if I put it in a script; however, when I put it in a module (such that it runs on import), running a script that calls the module essentially fork bombs my machine. Any suggestions on possible workarounds? When I get a chance tonight, I'll look at the B::Xref source and see if I can figure something out. Thanks.

          sub check_conflicts {
              my @xref_out = `perl -MO=Xref,-r $0 2> /dev/null`;
              @xref_out = grep { /ARGV/ } @xref_out;
              my %conflicts;
              for (@xref_out) {
                  my ($module_path, $object ) = split /\s/;
                  $module_path =~ s|/|::|g;
                  my @object_path = split /::/, $object;
                  for (0 .. $#object_path) {
                      my $module_name = join "::", @object_path[0..$_];
                      if ($module_path =~ /$module_name\.pm/) {
                          $conflicts{$module_name} = 1;
                          last;
                      }
                  };
              }
              say $_ for keys %conflicts;
          }

        I suspect (but don't know for sure) that your "bombs my machine" problem is due to the cyclic nature of: run script ... load module ... run B::Xref ... run script ... load module ... run B::Xref ... ad infinitum.
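
One possible way to break such a cycle, different from the temporary-file approach, is to mark the child process through its environment so that the nested load does nothing. The following is a minimal sketch under that assumption; the variable name LOG_REPRO_XREF_CHILD and the helper run_unless_nested are made up for illustration, and a tiny child perl stands in for the B::Xref run.

```perl
use strict;
use warnings;

# Hypothetical marker variable; any name unlikely to clash would do.
my $GUARD = 'LOG_REPRO_XREF_CHILD';

# Run $code unless we are already inside a process started by it.
sub run_unless_nested {
    my ($code) = @_;
    return if $ENV{$GUARD};     # we are the child: skip the check
    local $ENV{$GUARD} = 1;     # inherited by any process $code starts
    return $code->();
}

# In a module this would be called from import(), with the sub
# running "$^X -MO=Xref,-r $0". Here the child only reports whether
# it can see the marker.
my @out = run_unless_nested( sub {
    open my $fh, '-|', $^X, '-e',
        'print $ENV{LOG_REPRO_XREF_CHILD} ? "guard seen" : "no guard"'
        or die "Cannot start child perl: $!";
    my @lines = <$fh>;
    close $fh;
    return @lines;
} );
print "@out\n";
```

Because the marker is set with local, it disappears from the parent's environment once the check has finished.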

        There's some other (potential) issues, which include: you're currently finding all modules using @ARGV, not just the ones loaded before your module; splitting the output from B::Xref on whitespace when pathnames may contain whitespace; portability of /dev/null; assuming modules have a .pm extension.

        I created module PM::1082204::Log::Reproducible in ~/tmp/PM/1082204/Log/. This is intended to mirror your Log::Reproducible. Here's some notes:

        • I've used a BEGIN block, instead of a subroutine called from import(), so that there's no reliance on import() being called and the @ARGV checks are done first.
        • I've taken the code from the beginning of the calling script (up to, but not including, the use Log::Reproducible) and saved to a File::Temp temporary file. B::Xref operates on this temporary file which fixes the cyclic/infinite loop issue mentioned above.
        • I've used IPC::Open3 to deal with the /dev/null issue: STDERR output is accessible from <CERR> but that filehandle isn't read.
        • B::Xref discards a leading './' in the filename path but retains instances of embedded '/./'. I've handled this (e.g. /^(?:\.[\\\/]?)?(.*)$/) but there may be better ways to do this.
        • There's quite a few tweaks I could envisage you making to this; the code here should at least provide a framework as a starting point.
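
To see what the leading-'./' pattern from the notes above does to individual paths, here is a small standalone sketch. The function name strip_leading_dot and the sample paths are only for illustration; the regex itself is the one quoted in the list.

```perl
use strict;
use warnings;

# Optionally consume one leading '.' and one following path separator,
# then capture whatever remains.
sub strip_leading_dot {
    my ($path) = @_;
    my ($rest) = $path =~ /^(?:\.[\\\/]?)?(.*)$/;
    return $rest;
}

for my $path ( './lib', './../tmp/./PM/1082204/Log/X X', '/usr/lib/perl5', '../lib' ) {
    printf "%-32s -> %s\n", $path, strip_leading_dot($path);
}
```

An absolute path passes through unchanged and embedded '/./' segments are kept. A path beginning with '../' loses only its first dot, which may be one of the places where a better approach is worth looking for.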

        Here's the PM::1082204::Log::Reproducible code:

            package PM::1082204::Log::Reproducible;

            use 5.010;
            use strict;
            use warnings;
            use autodie;

            use File::Temp ();
            use IPC::Open3;

            BEGIN {
                my $code = do { open my $fh, '<', $0; local $/; <$fh> };
                my ($code_to_test) = $code =~ /(\A .*?) use \s+ @{[__PACKAGE__]}/sx;

                my ($temp_fh, $temp_filename) = File::Temp::tempfile();
                print $temp_fh $code_to_test;

                local(*CIN, *COUT, *CERR);
                my $cmd = "$^X -MO=Xref,-r $temp_filename";
                my $pid = open3(\*CIN, \*COUT, \*CERR, $cmd);

                my $re = '(?:' . join('|' => map { /^(?:\.[\\\/]?)?(.*)$/; "\Q$1" } @INC)
                    . ')[\\\/]?(\S+?)(?:\.\S+)?\s';

                my %argv_modules;

                for (<COUT>) {
                    next unless /\@\s+ARGV/;
                    (my $module) = /$re/;
                    $module =~ s{[\\\/]}{::}g;
                    ++$argv_modules{$module};
                }

                waitpid $pid, 0;

                my @warn_modules = sort keys %argv_modules;

                if (@warn_modules) {
                    warn "WARNING:\n",
                        "Modules using'\@ARGV' before " . __PACKAGE__ . " loaded:\n";
                    warn "\t$_\n" for @warn_modules;
                }
            }

            1;

        In addition, the Qwerty module, in ~/tmp/PM/1082204/Log/X X/, tests for pathnames with whitespace:

            package Qwerty;

            use 5.010;
            use strict;
            use warnings;

            sub xxx {
                say "ARGV not empty." if @ARGV;
            }

            1;

        And, the ModuleWithoutPmExtension, in ~/tmp/, tests pretty much what it says:

            package ModuleWithoutPmExtension;

            sub test {
                print "I'm ", __PACKAGE__, "\n";
                print "\@ARGV has elements.\n" if @ARGV;
            }

            1;

        I ran the tests from ~/tmp. Here's pm_1082204_b_xref.pl:

            #!/usr/bin/env perl

            use strict;
            use warnings;

            BEGIN {
                require 'ModuleWithoutPmExtension';
                ModuleWithoutPmExtension::test();
            }

            use lib './../tmp/./PM/1082204/Log/X X';
            use Qwerty;

            {
                use Getopt::Long;
                use PM::1082204::Log::Reproducible;
            }

            use Getopt::Std;

        Output:

            I'm ModuleWithoutPmExtension
            WARNING:
            Modules using'@ARGV' before PM::1082204::Log::Reproducible loaded:
                    Getopt::Long
                    ModuleWithoutPmExtension
                    Qwerty

        Getopt::Std isn't in the list. If you load it earlier in the code, it does appear in the list.

        -- Ken

      I'm really intrigued by this approach! I've got it figured out and partially written, but need to sleep. I'll post my attempt tomorrow. Thanks.
Re: What are (popular) modules that access/modify @ARGV?
by tobyink (Canon) on Apr 14, 2014 at 09:34 UTC

    App::Cmd's run method and utf8::all's import spring to mind.
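
Folding these two suggestions into the original list could look something like the sketch below. It checks %INC directly instead of going through Module::Loaded, which is a substitution made here for the sake of a dependency-free example; the function name loaded_conflicts is likewise an assumption.

```perl
use strict;
use warnings;

# Original candidates plus the two modules suggested above.
my @conflicts = qw(
    AppConfig Getopt::Args Getopt::Long Getopt::Simple Getopt::Std
    App::Cmd utf8::all
);

# Return the candidate modules that have already been loaded.
# %INC is keyed by relative file name, e.g. 'Getopt/Long.pm'.
sub loaded_conflicts {
    my @loaded;
    for my $module (@conflicts) {
        ( my $file = "$module.pm" ) =~ s{::}{/}g;
        push @loaded, $module if exists $INC{$file};
    }
    my @sorted = sort @loaded;
    return @sorted;
}

require Getopt::Long;    # core module, loaded here only to demonstrate
print "Already loaded: $_\n" for loaded_conflicts();
```

Since App::Cmd touches @ARGV in its run method rather than at load time, a loaded-module check like this can only flag it as a possible conflict, not a certain one.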

    use Moops; class Cow :rw { has name => (default => 'Ermintrude') }; say Cow->new->name
