Re^2: problem retrieving return code of an execution with back ticks (updated)

First of all, thank you very much for your reply.

The most frustrating thing that can happen to a developer did happen: I was trying to debug the error following your advice (checking the value of the signal to verify whether the script completed, and printing the contents of the output @flush) when the error disappeared. The machine is the same and I have only introduced debugging lines, so now I have the unsettling feeling that my code hides a spurious bug.

But at least it is safer now, and with the lines you suggested it will be easier to find out what is going on the next time it fails :)

I also realised, thanks to your observations, that it is useless and even redundant to execute start before the script, since start is precisely the executable file that sets the value of $UNXEXSCRIPT, its own path.

The script iax_0tasksimp.pl is simply a launcher that runs a script with a timeout and checks its return code and output.

The script iax_0taskchklog.pl is a launcher that runs a script with a timeout and checks its return code, its output and a line in a log.

These launchers do verify the whole $?, and I implemented the timeout with a countdown child that interrupts the parent, which then kills the child performing the main task (but not the grandchildren; that is tricky at my level). Surely perl has a built-in library to do the same (launching a script with a timeout/kill and return code verification).

The script that encapsulates them just launches these "launchers" sequentially. They are generic mini tools that I reuse for multiple purposes.

Just in case you are curious, in this case I am using these little tools to redefine a Tomcat instance. For that I have a perl script that sequentially launches:

- 1) iax_0taskchklog.pl launching a certain script tbtom.sh stop_all /users/iax00/exploit/data/tom85/admin_portal.xml, with a timeout of 15 seconds, expecting it to return a code 0 and looking for the message "Command successful. Status: 0." to appear in its log. If it succeeds, the sequence goes on; if not, it aborts.

- 2) iax_0taskchklog.pl launching tbtom.sh delete_all /users/iax00/exploit/data/tom85/admin_portal.xml, with a timeout of 15 seconds, expecting it to return a code 0 and looking for the message "Command successful. Status: 0." to appear in its log. If it succeeds, the sequence goes on; if not, it aborts.

- 3) iax_0taskchklog.pl launching tbtom.sh define_all /users/iax00/exploit/data/tom85/admin_portal.xml, with a timeout of 15 seconds, expecting it to return a code 0 and looking for the message "Command successful. Status: 0." to appear in its log. If it succeeds, the sequence goes on; if not, it aborts.

- 4) iax_0taskchklog.pl launching tbtom.sh start_all /users/iax00/exploit/data/tom85/admin_portal.xml, with a timeout of 15 seconds, expecting it to return a code 0 and looking for the message "Command successful. Status: 0." to appear in its log. If it succeeds, the sequence goes on; if not, it aborts.

- Finally, iax_0tasksimp.pl launching a verifTomcatAdminPortal.sh, which returns 0 if there is one and only one Tomcat instance called admin_portal running (I do this with a simple ps | grep ), and 5 if it fails. The problem was that no Tomcat was running, yet the verification still reported OK: run manually, my iax_0tasksimp.pl launching verifTomcatAdminPortal.sh returned 5, but within the sequence of tasks it said the return code was 0.

The previous tbtom.sh steps all report OK, and they should not: they should return an error at some point, as the Tomcat instance has not started. But I have checked that they do indeed return 0, and the log ends with a happy final line "Command successful. Status: 0." at each execution. tbtom.sh is not my script though, and I am not allowed to touch it, only to report its malfunction (the same way I expect my colleagues to report my bugs if they use my launchers). And I don't mind at all if they touch them, although generally they do not have the time :)


Replies are listed 'Best First'.

Re^3: problem retrieving return code of an execution with back ticks
by haukex (Archbishop) on May 09, 2017 at 19:22 UTC

now I have the unsettling feeling that my code hides a spurious bug

Well, that's a good argument for more defensive coding - you said that your scripts are mainly task launchers, so I'll repeat my suggestion to check all possible error conditions, or use a module to do it for you (e.g. IPC::System::Simple). One thing I'd also check is that the processes' STDOUT and STDERR are properly monitored - there are some cases where tools might exit with a code of zero, but have printed some serious warnings to STDERR. Of course it's also a good argument for writing tests for your code :-)

For example, until I wrote my own module to simplify this, I'd sometimes write code something like this:

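use strict;
use warnings;
use IPC::Run3 'run3';

# Run a command (array ref) and return its STDOUT; die on any sign of trouble.
sub myrun {
    my ($cmd) = @_;
    run3 $cmd, undef, \my $out, \my $err
        or die "run3 failed";
    my $rv = $?;    # save the status before anything else can overwrite $?
    chomp($err);
    # length() so that STDERR output such as "0" is not silently ignored
    die "command '$cmd->[0]' wrote to STDERR: '$err'" if length $err;
    die "command '$cmd->[0]' exit value indicates error: \$?=$rv"
        unless $rv == 0;
    return $out;
}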
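
A nonzero $? can mean several different things, so when reporting it, it may help to take it apart first. Here is a rough sketch using only core Perl; describe_status is a name made up for this illustration, and the bit layout is the one documented for $? in perlvar:

```perl
use strict;
use warnings;

# Hypothetical helper: turn a raw wait status (the value of $?) into a
# short human-readable description.
sub describe_status {
    my ($status) = @_;
    return "failed to execute" if $status == -1;

    my $signal = $status & 127;              # low 7 bits: terminating signal
    if ($signal) {
        my $core = ($status & 128) ? " (core dumped)" : "";
        return "killed by signal $signal$core";
    }
    return "exited with code " . ($status >> 8);   # high byte: exit code
}

# Usage: run a child that exits with code 5, then describe its status.
system($^X, '-e', 'exit 5');
print describe_status($?), "\n";    # prints "exited with code 5"
```

Comparing only the high byte against 0 would treat a child that was killed by a signal as a success, which is why checking the whole value matters.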

Surely perl has a built-in library to do the same (launching a script with a timeout/kill and return code verification).

For timeout support, you could take a look at IPC::Run.
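
If installing a module is not an option, the general idea can also be sketched with core Perl only. This is not IPC::Run's approach, just a minimal fork/alarm illustration for Unix-like systems; run_with_timeout is a hypothetical name, and the sketch assumes the command dies on SIGTERM. Putting the child into its own process group is one way to reach the grandchildren as well:

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run @$cmd with a time limit in seconds.
# Returns ($status, $timed_out), where $status is the child's raw $?.
sub run_with_timeout {
    my ($cmd, $timeout) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: lead a new process group so that the parent can signal
        # the command together with anything it spawns.
        setpgrp(0, 0);
        exec { $cmd->[0] } @$cmd or do {
            warn "exec '$cmd->[0]' failed: $!\n";
            POSIX::_exit(127);
        };
    }

    my $timed_out = 0;
    eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $timeout;
        waitpid($pid, 0);
        alarm 0;
        1;
    } or do {
        die $@ unless $@ eq "timeout\n";    # rethrow unrelated errors
        $timed_out = 1;
        kill '-TERM', $pid;   # negative signal: whole process group
        waitpid($pid, 0);     # reap the child; a stubborn one would need KILL
    };
    return ($?, $timed_out);
}
```

Something like `my ($status, $timed_out) = run_with_timeout(['sleep', '30'], 1);` should come back after about a second with $timed_out set. A production version would also want to escalate to SIGKILL and capture the output, which is where a module earns its keep.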

launching a verifTomcatAdminPortal.sh ... a simple ps | grep

Note that unless you do set -e or equivalent in the shell script, you might be missing errors there, which goes against my aforementioned advice to code defensively. It's something you can do in Perl - for example, you could use IPC::System::Simple's capturex to run ps and use Perl's grep. And of course there are modules too, e.g. Proc::ProcessTable.
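
To illustrate the idea without any non-core module, here is a hedged sketch that uses a list-form pipe open instead of capturex, so no shell is involved and a failing ps is noticed. The helper names are invented for this example, and the 0/5 return values simply mirror the ones described for verifTomcatAdminPortal.sh:

```perl
use strict;
use warnings;

# Hypothetical helper: run a command without a shell and return its output
# lines; die if it cannot be started or exits with a nonzero status.
sub capture_lines {
    my @cmd = @_;
    open(my $fh, '-|', @cmd) or die "cannot run '$cmd[0]': $!";
    my @lines = <$fh>;
    close($fh)
        or die "'$cmd[0]' failed: " . ($! ? "close error: $!" : "wait status $?");
    chomp @lines;
    return @lines;
}

# Hypothetical check: 0 if exactly one line matches the pattern, 5 otherwise.
sub check_single_instance {
    my ($pattern, @lines) = @_;
    my $count = grep { /$pattern/ } @lines;
    return $count == 1 ? 0 : 5;
}

# Possible usage (the ps options are an assumption; adjust for your platform):
#   my @procs = capture_lines('ps', '-e', '-o', 'args');
#   exit check_single_instance(qr/admin_portal/, @procs);
```

The pattern would probably need to be stricter in practice, so that it does not match unrelated commands that merely mention admin_portal in their arguments.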
