Re: <rant>CPAN modules failing to check for write errors</rant>
by jepri (Parson) on May 29, 2004 at 11:34 UTC
|
I think you have the right of it, but there is no central authority that can force coders to do anything. That's the downside to the whole community coding thing.
I think a lot of authors ignore print failures because they program on Unix, and on Unix if you have a valid file handle you can write to it. But checking your opens is pretty poor form, I'll agree.
OTOH, I find it pretty annoying when a module throws a die. It means I have to wrap it in an eval rather than just check the return code.
If you choose to write an article, why not submit it directly to perl.com or even post it here? I hear a few CPAN authors are members of PerlMonks
The CPAN authors I've talked with have been fairly sympathetic to adding additional error checking. I suspect it's a bit of the omniscent author problem - the author knows hir code so well that error messages aren't really needed. It's actually quite hard to write good error messages for people who don't know what your code does.
Perhaps it is time to start a new perl meme for CPAN authors :)
____________________
Jeremy
I didn't believe in evil until I dated it.
| [reply] |
|
|
Thanks for the reply.
I appreciate throwing exceptions can be a pain in the **** - Java seems to have taken this to the extreme and it's hard to say if that's good or bad.
Actually I don't mind if a method/function returns true/false as long as I can easily find out what the error was and where it occurred - I think exceptions are the way to go when building larger applications, but can be overkill for small scripts. BUT I do need to know if an error occurred :)
The only reason why I'm loathed to write an article directly to perl.com is because I'm not a known name in the community, but yes I might knock something up and post it here for review. Anything to get authors just thinking about things anyway!
| [reply] |
|
|
I appreciate throwing exceptions can be a pain in the **** - Java seems to have taken this to the extreme
As a sysadmin (I seem to be saying that a lot thesedays), I've seen uncaught exceptions on some genuinely insignificant errors. Plus, since they are written for programmers, the error messages are mostly useless for sysadmins. If error messages aren't going to be useful for users or sysadmins, I question their usefulness in production code.
I agree with the 'know if error occurred', but I'd argue that showing it to the user should be the last ditch desperation response, not the default fallback. I found this with smalltalk, which would throw me a popup window for any uncaught exception - as if anyone is going to use a system like that!
Please do write your article up. The highest quality articles I've seen on Perlmonks have been by authors who have been annoyed beyond endurance by a 'feature' of perl, or a community shortfall. Well thought out critiscism of perl is a staple here.
____________________
Jeremy
I didn't believe in evil until I dated it.
| [reply] |
|
|
Re: <rant>CPAN modules failing to check for write errors</rant>
by borisz (Canon) on May 29, 2004 at 11:53 UTC
|
Can monks suggest anything that can be done to get CPAN modules reviewed and fixed for such problems? I'm prepared to review the odd module or two, but not when I know half the time I'll get ignored by the module author (I'm not suggesting this is the case with XML::Smart BTW, I took this at random as an example!)
You have several options, here are some:
- Contact the author and ask to fix that.
- Provide a patch to the author.
- Fill a bug report on http://rt.cpan.org/.
- Ignore the module.
- Ignore the author.
| [reply] |
|
|
| [reply] |
Re: <rant>CPAN modules failing to check for write errors</rant>
by saskaqueer (Friar) on May 29, 2004 at 11:38 UTC
|
If your hard drive is full, you have more to worry about than a perl script failing to die 'gracefully'. I'm quite the opposite of you, if I saw a script that had every single print statement checked for an error, I'd probably not take a second glance at the code. Yes, error checking is nice, but there is a point at which it becomes ridiculous. So for me, I'd hope that the CPAN isn't cluttered with unnecessary error checking everywhere.
The last thing you need to worry about when a hard drive fills up is your perl script. Chances are your system won't manage itself, let alone your scripts. If you do error checking, it's probably better to simply check disk space before you begin and when you finish your operation. And that's only if your script outputs enough data to the hard disk to worry about. A script that throws a few kilobytes of data to disk shouldn't be worrying about disk space; it's the scripts and programs that will use a lot of disk space that should beware of this. And as previously noted, the problem should be seen ahead of time, not during the execution of a print statement.
Added minutes later... Oh yeah, about the open() and close() calls, of course those should be checked :)
| [reply] |
|
|
Sorry, that's absurd - if your salary depends on writing code, you can never ignore failures due to system errors! Check Camel page 606.
How does checking the amount of free disk space before and after execution help?
This script isn't the only process running - lots of other stuff will be using up disk space as well in an unpredictable manner. And if the partition's full at run end, that doesn't tell me if my output succeeded or not. Finally if I'm querying variable length fields out of a database, I dont know how much data I'm going to write before run time.
Checking disk space can *never* work.
>If your hard drive is full, you have more to worry about than a
>perl script failing to die 'gracefully'
Some of these scripts drive the back ends for multi-million pound companies, I DO need to worry about how they handle errors. If a partition fills up it doesn't mean the system will fail - like most companies, our system directories are mounted on different drives from our data directories.
If a write fails during a run I'd better try damn hard to log the failure into a database, get an email to an administrator, then try to switch output to a different mounted drive. Yes there are sysadmin checks for disk space, but that's no excuse for a script failing to act sensibly when the worst happens.
Unfortunately, this is the kind of reply I expected - "they're only a few scripts, why does it matter if you lose some data?". Well it does matter for people doing this stuff commercially, and I wish (some) authors would start to realise this!
When I'm writing a 20 liner at home, yes i don't care too much, but these are CPAN modules, and by implication others are going to use them, and they therefore need to be robust and behave sensibly when errors occur.
Thanks for the reply - It's good to get an alternative viewpoint even if I have to politely disagree :) | [reply] |
|
|
Some of these scripts drive the back ends for multi-million pound companies [...]
Right, but the guy who wrote the CPAN module probably did so to help him out with a 20 liner at home; conversely, if the module had been written for a multi-million pound company in the first place, chances are it would never have been okayed for release to CPAN.
Now I don't run a multi-million pound company, but I do see it as my responsibility to audit the CPAN code I employ, more or less carefully according to the criticality of the operations I'm using it for. If I'm not happy with the code I don't use it - either I roll my own, or I adapt it for my needs.
While it would be nice if all the code offered for free on CPAN were perfect, we're unlikely to reach that happy state any time soon.
One perfectly valid option is to offer the author some money to harden their code, either direct or through the convenience of a TPF grant.
Hugo
| [reply] |
|
|
|
|
|
|
|
|
Some of these scripts drive the back ends for multi-million pound companies, I DO need to worry about how they handle errors.
What scripts? The ones you write? Do you follow your own advice?
Well it does matter for people doing this stuff commercially, and I wish (some) authors would start to realise this!
What are you paying these authors?
| [reply] |
|
|
|
|
|
Re: <rant>CPAN modules failing to check for write errors</rant>
by graff (Chancellor) on May 29, 2004 at 16:03 UTC
|
I would expect (hope) that authors of CPAN modules would habituate themselves to using "carp" and "croak" -- always -- instead of "warn" or "die" within their module code. This way, when the module hits a bad condition, the module user gets immediate feedback saying where to start looking in the calling code for the problem. The Carp module was created for just this sort of consideration, and module authors who don't use it are, quite frankly, being inconsiderate.
In fact, one possible way to assess a module's suitability would be to grep for "die|warn" and "carp|croak" in the module's perl source file(s). If you only see the latter and not the former, the module is likely to be easy to use and good enough for production code; if only the former, you might be in for some extra effort to make proper use of it (or might change your mind about using it). If you see neither (or both), use extreme caution, or look for another way to do what it's supposed to do.
I agree with jepri about the ugliness and hassle of having to wrap module calls in eval blocks to trap errors -- some modules require this (like some operations using Encode in 5.8.x), and there's probably a good reason for it in some cases, but it's still a bit ugly.
I really would expect (hope) that when a module call implements some common, familiar function (like "open(); print; close;"), it would mimic the behavior of common Perl functions that do basically the same thing: return "true" when it succeeds and "false" otherwise (with $! set to something meaningful in the latter case).
In the particular example you cited, the unchecked open is in a method called "save", whose purpose is to take the in-memory XML data object and write it as an XML stream to a named file (with file name given as an arg to the call). So as a module user, I'd like to write my code in the "normal" way:
$xml->save($filename) or die "Couldn't save my xml data to $filena
+me: $!";
This means that the module code should have been written like this (I'll include the full source for XML::Smart->save(), because it turns out the author did include some checks for problems):
# ... near the top:
use Carp; # THIS NEEDS TO BE ADDED
# ...
sub save {
my $this = shift ;
my $file = shift ;
if (-d $file || (-e $file && !-w $file)) {
# NEED TO ADD THE "carp" CALL:
carp "Unusable file name passed to XML::Smart->save ($file is
+a directory or read-only file)\n";
return ;
}
my ($data,$unicode) = $this->data(@_) ;
my $fh;
open( $fh, ">$file" ) or return;
binmode( $fh ) if $unicode;
print $fh $data or return;
close $fh; # save() will return true if this succeeds, false oth
+erwise
}
In this case, the author probably thought he covered all the possible failure conditions, but in fact he missed a few -- apart from the "disk-full" issue, if the module user provides a file name like "nonexistent/path/file.xml", this passes the tests for "is not a directory", and "does not exist with read-only permission". | [reply] [d/l] [select] |
|
|
Yes, you're right about Carp - I do use carp in modules, but was giving a quick example (as you can tell by the fact I managed to "dir" instead of "die"). No excuses though - if I complain and give a duff example, I should expect to get corrected :)
The one issue with returning true or false in this example is that I don't know what caused the error - was it a failure on the file open or the print - ok I can report what's in $! but that's it.
A small issue though - at least I know the write failed, and that's the point. I'm not suggesting I can write stuff better than many CPAN authors, simply pointing out that failing to check stuff like write failures is going to effect everyone who uses the module, and that *should* be a cause for complaint.
| [reply] |
Re: <rant>CPAN modules failing to check for write errors</rant>
by jdhawke (Acolyte) on May 29, 2004 at 14:48 UTC
|
Well, I can understand your frustration, but I believe most if not all items on CPAN are licensed under the GPL or the Artistic license. In either case there is language to the effect of:
"This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE."
Should you be running such a mission critical application you should have programmers on staff knowledgeable enough to make modifications to code, or write custom code, that meets your current requirements.
--JDHawke | [reply] |
|
|
Of course authors releasing code they've spent their own time and money on should not have to take any kind of legal liability - I'm not suggesting they should.
I'm trying to say that they shouldn't release the stuff on CPAN until there's robust error checking - surely CPAN modules are there for reuse, and for at least a number of developers it's not possible to reuse modules that can fail in fairly catastrophic ways. I'd still argue it's in everyone's interest to have robust modules on CPAN, even though some seem to be arguing the opposite! I'm not so worried if a module throws exceptions or returns error status values from functions, but it *has* to do something!
Should you be running such a mission critical application
you should have programmers on staff knowledgeable enough to
make modifications to code, or write custom code, that meets
your current requirements.
Yes but what's desired and what happens in reality are often slightly (occasionally vastly) different - you have to factor in deadlines, staff turnover, development costs etc. You may be supplying code to cash rich companies, but they're usually none too generous on the amount they're willing to pay for that code - tends to be the one of the reasons why they're cash rich! You're talking about an ideal which sadly doesn't exist in reality.
What may happen (and I've had this once so far) is that companies increasingly refuse to use non-core CPAN modules. That's bad all round - it increases development costs, increases the chance of bugs creeping in, and hinders the development of existing CPAN modules because authors lose potentially useful feedback/bug reports/code improvements. | [reply] |
Re: CPAN modules failing to check for write errors (fix it)
by tye (Sage) on May 29, 2004 at 20:19 UTC
|
Wow, you sure picked the wrong way to try to fix this problem. This 'fix as many places where writing is done in as many CPAN modules as we can' idea is just plain untenable.
Meanwhile, you can rather simply fix the problem. Extend Fatal to support universally overriding print / syswrite / printf / etc. such that failures are reported. For best results, support treating failures in a void context (which some might wish to be fatal) different from other failures (these might be better to log / warn about but let the calling code deal with the failure).
If something like that is available, then I'd use it in my modules to override my module's write operations such that failures are reported. Without such, I'll continue to not check the results of write operations but to warn of close failures.
Just to be clear, if you write this, then you can fix your production scripts with it without having to modify any of the modules that get used. I would use it in my modules so that people who don't bother to do this would still benefit from write failures being detected.
(updated)
| [reply] |
|
|
print Replace it with $1 || carp( "$1 has failed catastrophically: $!" ) or
print
...
entersub
...Is that what you were proposing? Can you think of any way to perform a net-equivalent operation without requiring that the optree be altered? | [reply] |
|
|
Wow, you sure picked the wrong way to try to fix this problem.
This 'fix as many places where writing is done in as many CPAN
modules as we can' idea is just plain untenable.
Hold on, I havn't picked any way to fix this - I gave an example of the kind of thing I would *expect* to see in modules on CPAN, and I'm simply seeing what others think. IIRC there's already a pragma to force system errors to die instead of return false (though I can't remember it off hand), but that really isnt the point.
I'm not suggesting every/anyone goes through every module on CPAN fixing stuff in the manner of the *example* I gave - I'm saying authors should be thinking about these things *before* releasing modules to CPAN or maybe there should be be some basic quality checks on modules before they get released to CPAN.
Without such, I'll continue to not check the results of write
operations but to warn of close failures.
Warn or die on close failures? If you just warn, that's fine - carry on doing what you do, but at some point you'll lose data and probably never realise it unless you've got a sig warn handler installed.
You or I or anyone else could use, say, XML::Smart and write a script that appeared to run fine but actually failed to output data. Anyone seriously suggesting that's fine needs their head examining IMHO, and I'm amazed that a number of replies have said exactly that.
Edit by tye, remove PRE tags, replace with BLOCKQUOTE
| [reply] |
Re: <rant>CPAN modules failing to check for write errors</rant>
by diotalevi (Canon) on May 29, 2004 at 17:31 UTC
|
This is easy to detect with a module that uses B::Utils. I don't have time to create Devel::DetectUncheckedPrint this morning but am now sharing the contents of my emacs window with you so you can fill in the remaining administrivia. Huh. I can't figure out how to copy/paste between emacs and mozilla. I guess not. FooSo now that I have a moment longer... this should actually be called Devel::DetectUncheckedOp. The two things to check for are void context and NEXT pointing to leavesub.
| [reply] [d/l] |
|
|
OT: get the mozex plugin for emacs - let's you use emacs to fill in all mozilla textareas and view source on mozilla pages.
| [reply] |
|
|
Sounds interesting - if you could post something here when you get time, I would be very interested - I confess to knowing nothing about B:: modules in general.
Thanks
| [reply] |
Re: <rant>CPAN modules failing to check for write errors</rant>
by PodMaster (Abbot) on May 30, 2004 at 09:26 UTC
|
I've even written to a couple of authors politely explaining that its useful to know when print is failing because your hard drive is full, but with no success.
By "No success", do you mean
they didn't respond or they just refused to oblige?
Whenever I write patches, I first install them locally (if I had the module installed of course)
and then I post them on rt.cpan.org for all the world to see.
I don't get ignored that often,
but when I do I simply keep patching on my own (not very often) or
just abandon the module (happened once or twice).
If the author is ignoring you, keep your own copy, simple as that.
| MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!" | | I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README). | | ** The third rule of perl club is a statement of fact: pod is sexy. |
| [reply] |
Re: <rant>CPAN modules failing to check for write errors</rant>
by toma (Vicar) on May 30, 2004 at 18:48 UTC
|
One of the difficulties with lots of error checking
is that it is difficult to test. To test code that
fails when the filesystem is full, you need a full
filesystem.
You can create a small filesystem to overfill
for test purposes, but this probably introduces
platform dependencies.
Would you want a lot of error checking that is
mostly untested?
Does anyone have suggestions for testing code that
uses lots of error handling? Or is it all supposed
to work perfectly the first time?
It should work perfectly the first time! - toma
| [reply] |
|
|
print FILEHANDLE LIST
print LIST
print Prints a string or a list of strings. Returns true if successful.
If print returns false, that's good enough for me - the important thing is I know the data wasn't written - I can log $! and try to save the data somewhere else etc.
Camel pg 606 rightly bangs on about always checking failed system calls but it sometimes seems to fall on deaf ears :( | [reply] |
|
|
The nuclear power plant software looks like it
is written in VB, but it might be MFC.
That, along with Flash and Crystal Reports.
I hope you feel confident about that!
For the necessity of testing,
I was thinking of the earlier
example of the failover of a full nfs filesystem
to a new one. That seems like a fair amount of
code, and if I wrote it, I would need to test it.
For example, I was thinking
about writing a little test system to see how a
module such as File::Slurp handles
a full filesystem. Perhaps I could write something
like File::Slurp::Failover, which would do the clever
nfs failover trick.
By the way, File::Slurp has some
options for error handling. Perhaps it could
be used as an example of a good way to handle
I/O errors. Or is there a better example of a
module that has good I/O error handling?
It should work perfectly the first time! - toma
| [reply] |
|
|
Re: <rant>CPAN modules failing to check for write errors</rant>
by itub (Priest) on May 31, 2004 at 15:28 UTC
|
It's very nice to wish that CPAN modules were audited or reviewed. The problem is that someone has to do it. If you want a CPAN module to be reviewed, you will either have to do it yourself or pay someone to do it. The tooth fairy won't do it for you. That's what you can expect for free.
| [reply] |
A reply falls below the community's threshold of quality. You may see it by logging in. |
Re: <rant>CPAN modules failing to check for write errors</rant>
by jepri (Parson) on May 30, 2004 at 02:31 UTC
|
One thing that nobody seems to have brought up in the above discussion is that you really shouldn't $xml->save($data) or die and quit the program during a save data routine. If you quit the program, you immediately loose all the data.
What really needs to be done is to return an error message so the rest of the program can alert the user, or save an emergency file or something.
____________________
Jeremy
I didn't believe in evil until I dated it.
| [reply] [d/l] |
|
|
Well ok in that particular case, you'd:
eval { $xml->save($data) };
do_whatever($@) if $@;
| [reply] |
|
|
And this is exactly why I don't like exceptions. Rather than handling errors within the module, the code just sproings them up to the next level.
Except then everyone has to check all their code all the time for exceptions. If one person doesn't, then a trivial error rips it's way up the call stack until it is caught by a piece of code that may not even be aware of the sub-sub-sub-modules existance.
Just like threading, some people immediately try to do everything through exceptions even when it isn't necessary. I rather see people writing decent code that tries to recover and do the right thing.
If that's going to be the standard answer in Perl6, then I can see I'll be in for a lot of debate.
____________________
Jeremy
I didn't believe in evil until I dated it.
| [reply] |
|
|
Re: <rant>CPAN modules failing to check for write errors</rant>
by hardburn (Abbot) on Jun 01, 2004 at 14:57 UTC
|
Printing to a file handle with insufficient disk space does not produce a failure, at least not on my system. Rather, the failure happens when you call close(). A lot of people don't check the return value of close() (including me), but maybe they should.
To test this, I created a 1 MB loopback filesystem and wrote a bit of Perl to write a lot of data to it:
$ dd if=/dev/zero of=./test_small_lo count=1 bs=1M
1+0 records in
1+0 records out
$ sudo /sbin/losetup /dev/loop0 ./test_small_lo
$ sudo /sbin/mkfs.ext2 /dev/loop0
[ snip output ]
$ sudo mount /dev/loop0 /mnt/data
$ df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/loop0 1003 13 939 2% /mnt/data
[ snip other mounted file systems ]
$ cd /mnt/data
$ sudo touch test_file
$ sudo chmod 666 test_file # So I don't have to run code below as root
$ perl -e '
> open( FH, ">", "test_file" ) or die $!;
> for( 0 .. 10_000_000 ) { print FH 1 or die $! }
> close(FH) or die $!
'
No space left on device at -e line 3.
I honestly don't see the point in checking print()'s return value. It puts in more code that will only happen for such extremely rare occasions that you can simply discount them.
Update: See my reply to adrianh below.
----
send money to your kernel via the boot loader.. This and more wisdom available from Markov Hardburn.
| [reply] [d/l] [select] |
|
|
Printing to a file handle with insufficient disk space does not produce a failure, at least not on my system.
It does on some systems. I think it might on yours too if you have $| set or add some "\n"s to the output (I suspect the close is doing the final flush that's putting the file over the limit.)
I honestly don't see the point in checking print()'s return value. It puts in more code that will only happen for such extremely rare occasions that you can simply discount them.
This, of course, depends on how you feel about those rare occasions and the impact of data being lost. Some of my nastiest debugging sessions in the past have been dealing with exactly this sort of rare occasion and I would prefer people to take the care to die or whatever when they occurred.
(not that I think any sort of approval system for CPAN is the right way to go about fixing these sorts of problem)
| [reply] |
|
|
$ perl -e '
> die '
Died at -e line 2.
I mis-counted the lines in my example of printing to a file handle without enough space. The die was occuring at the print after all. I'm not sure if you can rely on this feature to be cross-platform (my particular system is Linux 2.6.5, glibc 2.3.2, Perl 5.8.2).
----
send money to your kernel via the boot loader.. This and more wisdom available from Markov Hardburn.
| [reply] [d/l] [select] |