<rant>CPAN modules failing to check for write errors</rant>

Hi monks

I greatly appreciate all the work developers do in preparing and releasing modules to CPAN - they've enabled me and everyone else here to get stuff done quickly and usually reliably.

But there's one thing in a lot of modules on CPAN that really worries me - too often I see stuff like this (XML::Smart in this case):

  my $fh ;
  open ($fh,">$file") ; binmode($fh) if $unicode ;
  print $fh $data ;
  close ($fh) ;
[download]

Why do so many authors fail to check for print failures (well ok, open and close failures in this case as well) to files?

I've even written to a couple of authors politely explaining that its useful to know when print is failing because your hard drive is full, but with no success.

I know you can force perl to die on such errors, but I would hope code going into CPAN modules would be doing stuff like:

  my $fh ;
  open ($fh,">$file")  or dir "Failed to open: '$file' for write: $!";
  binmode($fh) if $unicode ;
  print $fh $data      or die "Failed to write to: '$file': $!";
  close ($fh)          or warn "Failed to close: '$file': $!";
[download]

I work on a lot of data processing scripts, and ignoring a write failure would probably get me the sack - but surely there are relatively few situations where data loss is acceptable?

Can monks suggest anything that can be done to get CPAN modules reviewed and fixed for such problems? I'm prepared to review the odd module or two, but not when I know half the time I'll get ignored by the module author (I'm not suggesting this is the case with XML::Smart BTW, I took this at random as an example!)

If what I'm saying is valid, is there some well known and respected name in the perl community who could write a small article (or I'll write it for them!) for perl.com discussing this - then I can point authors to such an article and they might take notice!

Thanks

Comment on <rant>CPAN modules failing to check for write errors</rant> Select or Download Code

Replies are listed 'Best First'.

Re: <rant>CPAN modules failing to check for write errors</rant>
by jepri (Parson) on May 29, 2004 at 11:34 UTC

I think a lot of authors ignore print failures because they program on Unix, and on Unix if you have a valid file handle you can write to it. But checking your opens is pretty poor form, I'll agree.

OTOH, I find it pretty annoying when a module throws a die. It means I have to wrap it in an eval rather than just check the return code.

If you choose to write an article, why not submit it directly to perl.com or even post it here? I hear a few CPAN authors are members of PerlMonks

The CPAN authors I've talked with have been fairly sympathetic to adding additional error checking. I suspect it's a bit of the omniscent author problem - the author knows hir code so well that error messages aren't really needed. It's actually quite hard to write good error messages for people who don't know what your code does.

Perhaps it is time to start a new perl meme for CPAN authors :)

____________________
Jeremy
I didn't believe in evil until I dated it.

Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by Anonymous Monk on May 29, 2004 at 12:45 UTC

I appreciate throwing exceptions can be a pain in the **** - Java seems to have taken this to the extreme and it's hard to say if that's good or bad.

Actually I don't mind if a method/function returns true/false as long as I can easily find out what the error was and where it occurred - I think exceptions are the way to go when building larger applications, but can be overkill for small scripts. BUT I do need to know if an error occurred :)

The only reason why I'm loathed to write an article directly to perl.com is because I'm not a known name in the community, but yes I might knock something up and post it here for review. Anything to get authors just thinking about things anyway!

Re: Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by jepri (Parson) on May 29, 2004 at 13:02 UTC

I appreciate throwing exceptions can be a pain in the **** - Java seems to have taken this to the extreme

As a sysadmin (I seem to be saying that a lot thesedays), I've seen uncaught exceptions on some genuinely insignificant errors. Plus, since they are written for programmers, the error messages are mostly useless for sysadmins. If error messages aren't going to be useful for users or sysadmins, I question their usefulness in production code.

I agree with the 'know if error occurred', but I'd argue that showing it to the user should be the last ditch desperation response, not the default fallback. I found this with smalltalk, which would throw me a popup window for any uncaught exception - as if anyone is going to use a system like that!

Please do write your article up. The highest quality articles I've seen on Perlmonks have been by authors who have been annoyed beyond endurance by a 'feature' of perl, or a community shortfall. Well thought out critiscism of perl is a staple here.

____________________
Jeremy
I didn't believe in evil until I dated it.

Re: Re: Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by Joost (Canon) on May 30, 2004 at 00:12 UTC

Re: <rant>CPAN modules failing to check for write errors</rant>
by borisz (Canon) on May 29, 2004 at 11:53 UTC

Can monks suggest anything that can be done to get CPAN modules reviewed and fixed for such problems? I'm prepared to review the odd module or two, but not when I know half the time I'll get ignored by the module author (I'm not suggesting this is the case with XML::Smart BTW, I took this at random as an example!)

Contact the author and ask to fix that.
Provide a patch to the author.
Fill a bug report on http://rt.cpan.org/.
Ignore the module.
Ignore the author.

Boris

Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by valdez (Monsignor) on May 29, 2004 at 13:59 UTC

The author in this case is gmpassos.

Ciao, Valerio

Re: <rant>CPAN modules failing to check for write errors</rant>
by saskaqueer (Friar) on May 29, 2004 at 11:38 UTC

If your hard drive is full, you have more to worry about than a perl script failing to die 'gracefully'. I'm quite the opposite of you, if I saw a script that had every single print statement checked for an error, I'd probably not take a second glance at the code. Yes, error checking is nice, but there is a point at which it becomes ridiculous. So for me, I'd hope that the CPAN isn't cluttered with unnecessary error checking everywhere.

The last thing you need to worry about when a hard drive fills up is your perl script. Chances are your system won't manage itself, let alone your scripts. If you do error checking, it's probably better to simply check disk space before you begin and when you finish your operation. And that's only if your script outputs enough data to the hard disk to worry about. A script that throws a few kilobytes of data to disk shouldn't be worrying about disk space; it's the scripts and programs that will use a lot of disk space that should beware of this. And as previously noted, the problem should be seen ahead of time, not during the execution of a print statement.

Added minutes later... Oh yeah, about the open() and close() calls, of course those should be checked :)

Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by Anonymous Monk on May 29, 2004 at 12:33 UTC

How does checking the amount of free disk space before and after execution help?

This script isn't the only process running - lots of other stuff will be using up disk space as well in an unpredictable manner. And if the partition's full at run end, that doesn't tell me if my output succeeded or not. Finally if I'm querying variable length fields out of a database, I dont know how much data I'm going to write before run time.

Checking disk space can *never* work.

>If your hard drive is full, you have more to worry about than a 
>perl script failing to die 'gracefully'

If a write fails during a run I'd better try damn hard to log the failure into a database, get an email to an administrator, then try to switch output to a different mounted drive. Yes there are sysadmin checks for disk space, but that's no excuse for a script failing to act sensibly when the worst happens.

Unfortunately, this is the kind of reply I expected - "they're only a few scripts, why does it matter if you lose some data?". Well it does matter for people doing this stuff commercially, and I wish (some) authors would start to realise this!

When I'm writing a 20 liner at home, yes i don't care too much, but these are CPAN modules, and by implication others are going to use them, and they therefore need to be robust and behave sensibly when errors occur.

Thanks for the reply - It's good to get an alternative viewpoint even if I have to politely disagree :)

Re: Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by hv (Prior) on May 29, 2004 at 12:56 UTC

Some of these scripts drive the back ends for multi-million pound companies [...]

Right, but the guy who wrote the CPAN module probably did so to help him out with a 20 liner at home; conversely, if the module had been written for a multi-million pound company in the first place, chances are it would never have been okayed for release to CPAN.

Now I don't run a multi-million pound company, but I do see it as my responsibility to audit the CPAN code I employ, more or less carefully according to the criticality of the operations I'm using it for. If I'm not happy with the code I don't use it - either I roll my own, or I adapt it for my needs.

While it would be nice if all the code offered for free on CPAN were perfect, we're unlikely to reach that happy state any time soon.

One perfectly valid option is to offer the author some money to harden their code, either direct or through the convenience of a TPF grant.

Hugo

Re: Re: Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by Ninthwave (Chaplain) on May 29, 2004 at 13:36 UTC

Re: Re: Re: Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by Anonymous Monk on May 29, 2004 at 18:57 UTC

Re: Re: Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by Anonymous Monk on May 29, 2004 at 18:47 UTC

Re: Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by Anonymous Monk on May 30, 2004 at 08:00 UTC

Some of these scripts drive the back ends for multi-million pound companies, I DO need to worry about how they handle errors.

Well it does matter for people doing this stuff commercially, and I wish (some) authors would start to realise this!

Re: Re: Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by Anonymous Monk on May 30, 2004 at 10:47 UTC

Re: Re: Re: Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by Anonymous Monk on May 31, 2004 at 03:07 UTC

Some notes below your chosen depth have not been shown here

Re: <rant>CPAN modules failing to check for write errors</rant>
by graff (Chancellor) on May 29, 2004 at 16:03 UTC

always

In fact, one possible way to assess a module's suitability would be to grep for "die|warn" and "carp|croak" in the module's perl source file(s). If you only see the latter and not the former, the module is likely to be easy to use and good enough for production code; if only the former, you might be in for some extra effort to make proper use of it (or might change your mind about using it). If you see neither (or both), use extreme caution, or look for another way to do what it's supposed to do.

I agree with jepri about the ugliness and hassle of having to wrap module calls in eval blocks to trap errors -- some modules require this (like some operations using Encode in 5.8.x), and there's probably a good reason for it in some cases, but it's still a bit ugly.

I really would expect (hope) that when a module call implements some common, familiar function (like "open(); print; close;"), it would mimic the behavior of common Perl functions that do basically the same thing: return "true" when it succeeds and "false" otherwise (with $! set to something meaningful in the latter case).

In the particular example you cited, the unchecked open is in a method called "save", whose purpose is to take the in-memory XML data object and write it as an XML stream to a named file (with file name given as an arg to the call). So as a module user, I'd like to write my code in the "normal" way:

    $xml->save($filename) or die "Couldn't save my xml data to $filena
+me: $!";
[download]

# ... near the top:
use Carp;  # THIS NEEDS TO BE ADDED
# ...

sub save {
    my $this = shift ;
    my $file = shift ;

    if (-d $file || (-e $file && !-w $file)) {
# NEED TO ADD THE "carp" CALL:
        carp "Unusable file name passed to XML::Smart->save ($file is 
+a directory or read-only file)\n";
        return ;
    }

    my ($data,$unicode) = $this->data(@_) ;
 
    my $fh;
    open( $fh, ">$file" ) or return;
    binmode( $fh ) if $unicode;
    print $fh $data or return;
    close $fh;   # save() will return true if this succeeds, false oth
+erwise
}
[download]

[reply]
[d/l]
[select]

Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by Anonymous Monk on May 29, 2004 at 18:24 UTC

The one issue with returning true or false in this example is that I don't know what caused the error - was it a failure on the file open or the print - ok I can report what's in $! but that's it.

A small issue though - at least I know the write failed, and that's the point. I'm not suggesting I can write stuff better than many CPAN authors, simply pointing out that failing to check stuff like write failures is going to effect everyone who uses the module, and that *should* be a cause for complaint.

Re: <rant>CPAN modules failing to check for write errors</rant>
by jdhawke (Acolyte) on May 29, 2004 at 14:48 UTC

Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by Anonymous Monk on May 30, 2004 at 11:18 UTC

I'm trying to say that they shouldn't release the stuff on CPAN until there's robust error checking - surely CPAN modules are there for reuse, and for at least a number of developers it's not possible to reuse modules that can fail in fairly catastrophic ways. I'd still argue it's in everyone's interest to have robust modules on CPAN, even though some seem to be arguing the opposite! I'm not so worried if a module throws exceptions or returns error status values from functions, but it *has* to do something!

Should you be running such a mission critical application
you should have programmers on staff knowledgeable enough to
make modifications to code, or write custom code, that meets
your current requirements.

What may happen (and I've had this once so far) is that companies increasingly refuse to use non-core CPAN modules. That's bad all round - it increases development costs, increases the chance of bugs creeping in, and hinders the development of existing CPAN modules because authors lose potentially useful feedback/bug reports/code improvements.

Re: CPAN modules failing to check for write errors (fix it)
by tye (Sage) on May 29, 2004 at 20:19 UTC

Wow, you sure picked the wrong way to try to fix this problem. This 'fix as many places where writing is done in as many CPAN modules as we can' idea is just plain untenable.

Meanwhile, you can rather simply fix the problem. Extend Fatal to support universally overriding print / syswrite / printf / etc. such that failures are reported. For best results, support treating failures in a void context (which some might wish to be fatal) different from other failures (these might be better to log / warn about but let the calling code deal with the failure).

If something like that is available, then I'd use it in my modules to override my module's write operations such that failures are reported. Without such, I'll continue to not check the results of write operations but to warn of close failures.

Just to be clear, if you write this, then you can fix your production scripts with it without having to modify any of the modules that get used. I would use it in my modules so that people who don't bother to do this would still benefit from write failures being detected.

(updated)

Re: Re: CPAN modules failing to check for write errors (fix it)

by diotalevi (Canon) on May 30, 2004 at 15:31 UTC

Extend Fatal to support universally overriding print / syswrite / printf etc. such that failures are reported.

This is an interesting idea. Did you have any ideas about how this might be done other than doing something s///-like to the optree?

Match (?:print|syswrite|printf|etc).

print

Replace it with $1 || carp( "$1 has failed catastrophically: $!" )

or
   print
      ...
   entersub
      ...

Is that what you were proposing? Can you think of any way to perform a net-equivalent operation without requiring that the optree be altered?

Re: Re: CPAN modules failing to check for write errors (fix it)

by Anonymous Monk on May 29, 2004 at 23:39 UTC

Wow, you sure picked the wrong way to try to fix this problem. This 'fix as many places where writing is done in as many CPAN modules as we can' idea is just plain untenable.

I'm not suggesting every/anyone goes through every module on CPAN fixing stuff in the manner of the *example* I gave - I'm saying authors should be thinking about these things *before* releasing modules to CPAN or maybe there should be be some basic quality checks on modules before they get released to CPAN.

Without such, I'll continue to not check the results of write operations but to warn of close failures.

You or I or anyone else could use, say, XML::Smart and write a script that appeared to run fine but actually failed to output data. Anyone seriously suggesting that's fine needs their head examining IMHO, and I'm amazed that a number of replies have said exactly that.

Edit by tye, remove PRE tags, replace with BLOCKQUOTE

Re: <rant>CPAN modules failing to check for write errors</rant>
by diotalevi (Canon) on May 29, 2004 at 17:31 UTC

This is easy to detect with a module that uses B::Utils. I don't have time to create Devel::DetectUncheckedPrint this morning but am now sharing the contents of my emacs window with you so you can fill in the remaining administrivia.

Huh. I can't figure out how to copy/paste between emacs and mozilla. I guess not. Foo

So now that I have a moment longer... this should actually be called Devel::DetectUncheckedOp. The two things to check for are void context and NEXT pointing to leavesub.

Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by jZed (Prior) on May 29, 2004 at 17:40 UTC

OT: get the mozex plugin for emacs - let's you use emacs to fill in all mozilla textareas and view source on mozilla pages.

Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by Anonymous Monk on May 29, 2004 at 19:01 UTC

Thanks

Re: <rant>CPAN modules failing to check for write errors</rant>
by PodMaster (Abbot) on May 30, 2004 at 09:26 UTC

I've even written to a couple of authors politely explaining that its useful to know when print is failing because your hard drive is full, but with no success.

If the author is ignoring you, keep your own copy, simple as that.

MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
** The third rule of perl club is a statement of fact: pod is sexy.

Re: <rant>CPAN modules failing to check for write errors</rant>
by toma (Vicar) on May 30, 2004 at 18:48 UTC

You can create a small filesystem to overfill for test purposes, but this probably introduces platform dependencies.

Would you want a lot of error checking that is mostly untested?

Does anyone have suggestions for testing code that uses lots of error handling? Or is it all supposed to work perfectly the first time?

It should work perfectly the first time! - toma

Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by Anonymous Monk on May 30, 2004 at 19:25 UTC

Any, back to the topic in hand - say perldoc -f print:

print FILEHANDLE LIST
print LIST
print Prints a string or a list of strings. Returns true if successful.

Camel pg 606 rightly bangs on about always checking failed system calls but it sometimes seems to fall on deaf ears :(

Re: Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by toma (Vicar) on May 31, 2004 at 01:29 UTC

For the necessity of testing, I was thinking of the earlier example of the failover of a full nfs filesystem to a new one. That seems like a fair amount of code, and if I wrote it, I would need to test it.

For example, I was thinking about writing a little test system to see how a module such as File::Slurp handles a full filesystem. Perhaps I could write something like File::Slurp::Failover, which would do the clever nfs failover trick.

By the way, File::Slurp has some options for error handling. Perhaps it could be used as an example of a good way to handle I/O errors. Or is there a better example of a module that has good I/O error handling?

It should work perfectly the first time! - toma

Re: Re: Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by Anonymous Monk on May 31, 2004 at 10:44 UTC

Re: <rant>CPAN modules failing to check for write errors</rant>
by itub (Priest) on May 31, 2004 at 15:28 UTC

It's very nice to wish that CPAN modules were audited or reviewed. The problem is that someone has to do it. If you want a CPAN module to be reviewed, you will either have to do it yourself or pay someone to do it. The tooth fairy won't do it for you. That's what you can expect for free.

A reply falls below the community's threshold of quality. You may see it by logging in.

Re: <rant>CPAN modules failing to check for write errors</rant>
by jepri (Parson) on May 30, 2004 at 02:31 UTC

$xml->save($data) or die

What really needs to be done is to return an error message so the rest of the program can alert the user, or save an emergency file or something.

____________________
Jeremy
I didn't believe in evil until I dated it.

Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by Anonymous Monk on May 30, 2004 at 09:49 UTC

eval { $xml->save($data) };

Re: Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by jepri (Parson) on May 30, 2004 at 10:52 UTC

Except then everyone has to check all their code all the time for exceptions. If one person doesn't, then a trivial error rips it's way up the call stack until it is caught by a piece of code that may not even be aware of the sub-sub-sub-modules existance.

Just like threading, some people immediately try to do everything through exceptions even when it isn't necessary. I rather see people writing decent code that tries to recover and do the right thing.

If that's going to be the standard answer in Perl6, then I can see I'll be in for a lot of debate.

____________________
Jeremy
I didn't believe in evil until I dated it.

Re: Re: Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by Anonymous Monk on May 30, 2004 at 16:12 UTC

Re: <rant>CPAN modules failing to check for write errors</rant>
by hardburn (Abbot) on Jun 01, 2004 at 14:57 UTC

Printing to a file handle with insufficient disk space does not produce a failure, at least not on my system. Rather, the failure happens when you call close(). A lot of people don't check the return value of close() (including me), but maybe they should.

To test this, I created a 1 MB loopback filesystem and wrote a bit of Perl to write a lot of data to it:

$ dd if=/dev/zero of=./test_small_lo count=1 bs=1M
1+0 records in
1+0 records out
$ sudo /sbin/losetup /dev/loop0 ./test_small_lo
$ sudo /sbin/mkfs.ext2 /dev/loop0
 [ snip output ]
$ sudo mount /dev/loop0 /mnt/data
$ df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/loop0                1003        13       939   2% /mnt/data
 [ snip other mounted file systems ]
$ cd /mnt/data
$ sudo touch test_file
$ sudo chmod 666 test_file # So I don't have to run code below as root

$ perl -e '
> open( FH, ">", "test_file" ) or die $!;
> for( 0 .. 10_000_000 ) { print FH 1 or die $! }
> close(FH) or die $!
'
No space left on device at -e line 3.
[download]

I honestly don't see the point in checking print()'s return value. It puts in more code that will only happen for such extremely rare occasions that you can simply discount them.

Update: See my reply to adrianh below.

----
send money to your kernel via the boot loader.. This and more wisdom available from Markov Hardburn.

[reply]
[d/l]
[select]

Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by adrianh (Chancellor) on Jun 01, 2004 at 16:59 UTC

Printing to a file handle with insufficient disk space does not produce a failure, at least not on my system.

It does on some systems. I think it might on yours too if you have $| set or add some "\n"s to the output (I suspect the close is doing the final flush that's putting the file over the limit.)

I honestly don't see the point in checking print()'s return value. It puts in more code that will only happen for such extremely rare occasions that you can simply discount them.

This, of course, depends on how you feel about those rare occasions and the impact of data being lost. Some of my nastiest debugging sessions in the past have been dealing with exactly this sort of rare occasion and I would prefer people to take the care to die or whatever when they occurred.

(not that I think any sort of approval system for CPAN is the right way to go about fixing these sorts of problem)

Re: Re: Re: <rant>CPAN modules failing to check for write errors</rant>

by hardburn (Abbot) on Jun 01, 2004 at 17:40 UTC

Actually, I just realized something:

$ perl -e '
> die '
Died at -e line 2.
[download]

I mis-counted the lines in my example of printing to a file handle without enough space. The die was occuring at the print after all. I'm not sure if you can rely on this feature to be cross-platform (my particular system is Linux 2.6.5, glibc 2.3.2, Perl 5.8.2).

----
send money to your kernel via the boot loader.. This and more wisdom available from Markov Hardburn.

[reply]
[d/l]
[select]

Back to Meditations