leriksen has asked for the wisdom of the Perl Monks concerning the following question:

I have been using the module Filter::Handle with Perl 5.6.1 for over 12 months, but have recently upgraded to Perl 5.8.1. Reinstalling this module fails in the 'make test' step with what appears to be a recursive loop that eventually segfaults.

Filter::Handle is a neat little module I use to cache large blocks of dynamically generated output that may be needed again with just a few small changes. Specifically, I have a program that generates markup (not HTML, but specific to the print industry) for several hundred thousand statements to customers. These documents are generated via some very expensive DBI calls, so if a document is 'almost right' I use some Perl code to rejig it on the fly rather than spend time doing another DBI call. Filter::Handle intercepts our prints of these documents to the output file handle and places them in an array. Once the document is generated, multiple copies and variations are created from the captured array, and then all the output happens in one big, seemingly quite efficient, burst. We did this because our first test with multiple DB calls took 12 days to run, which is not good when you have a weekly print cycle.

But to the issue with Filter::Handle - we use version 0.03, which hasn't changed in over 3 years. Tests 3 and 4 tie \*STDOUT to Filter::Handle, with an anonymous sub sprintf'ing the print args to a string. This string has its contents tested after untie'ing STDOUT. For some reason, the print seems to go recursive. I have verified this on both Linux and Solaris, both running Perl 5.8.1.

Filter::Handle does a lot of work with typeglobs and packages, and it is a really good study for getting these under your belt.

I have narrowed the problem down to the Filter::Handle::PRINT sub. Previously, this called 'the real print', but now it recurses and calls Filter::Handle::PRINT again - and I don't understand why the behaviour has changed. I tried things like &main::print to force the 'right' behaviour, so that we call the 'right' print, but no joy - do the functions listed in chapter 3 of the Camel book live in a namespace?

Does anyone have a clue about why this is happening - what could have changed in 5.8.1 to break this so badly?

+++++++++++++++++
/#!/usr/bin/perl
use warnings;use strict;use brain;

Re: Filter::Handle segfaults 5.8.1
by ysth (Canon) on Nov 17, 2003 at 08:48 UTC
    Starting with perl5.8.0, there was an internal change to how tied filehandles are implemented. This may be the cause. Can you reduce it to a test case that doesn't involve Filter::Handle, but just a small class tied to STDOUT?

    You can force a call to the real builtin instead of an overriding sub by saying CORE::print, but that will probably still recurse for you.

      It does indeed continue to recurse.

      The code I think is responsible is in two parts:

      *print  = *PRINT;
      *printf = *PRINTF;
      *new    = *TIEHANDLE;

      Those assignments happen outside of any sub in the Filter::Handle package, so they cause usage of those symbols in F::H to resolve to those subs in the Filter::Handle package.
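As a minimal sketch of what glob assignments like these do, the following uses a hypothetical package AliasDemo (not Filter::Handle itself) and shows that after the aliasing, method calls named print and new end up in PRINT and TIEHANDLE. The @calls array exists only so the effect can be observed.

```perl
use strict;
use warnings;

package AliasDemo;    # hypothetical package for illustration only

my @calls;            # records which sub actually ran

sub TIEHANDLE { my $class = shift; return bless {}, $class }
sub PRINT     { my $self  = shift; push @calls, "PRINT(@_)"; return 1 }

{
    # Each alias name appears only once, which would otherwise warn.
    no warnings 'once';
    *print = *PRINT;        # AliasDemo::print is now the same glob as PRINT
    *new   = *TIEHANDLE;    # AliasDemo::new is now the same glob as TIEHANDLE
}

package main;

my $obj = AliasDemo->new;      # dispatches to AliasDemo::TIEHANDLE
$obj->print('via method');     # dispatches to AliasDemo::PRINT

print STDERR "$_\n" for @calls;
```

The sketch only covers method-style calls through the aliased names; it makes no claim about how the real module uses them.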

      sub PRINT {
          my $self = shift;
          my $fh = *{$self->{fh}};            # the file handle we tied in TIEHANDLE
          print $fh $self->{output}->(@_);    # the sub we passed in to TIEHANDLE
      }

      That print never used to call PRINT again.

      I'll try to reduce it to a simple case, but it will have to wait until I get home - it is 8:30pm here.

        Never mind reducing it, the change I suspected is in fact the culprit. That *{$self->{fh}} wraps a new glob around the file handle to try to keep from recursing. This worked pre-5.8.0 when the tie magic was placed on the glob itself. As of 5.8.0, the tie magic attaches to just the handle portion of the glob, so the new wrapper glob is still using a tied handle, and the print recurses. (If you add a use warnings to Filter::Handle you will see a Deep recursion warning, just before it segfaults :)
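The behaviour described above can be sketched without Filter::Handle. The class name CountingHandle and the handle FOO are made up for this illustration; the only thing borrowed from the module is the idiom of copying the glob before printing to it. On 5.8.0 or later, the copy shares the tied handle, so both prints should land in PRINT.

```perl
use strict;
use warnings;

package CountingHandle;    # hypothetical tie class for illustration

sub TIEHANDLE { my $class = shift; return bless { calls => 0 }, $class }
sub PRINT     { my $self  = shift; $self->{calls}++; return 1 }

package main;

tie *FOO, 'CountingHandle';
my $obj = tied *FOO;

# Same idiom as Filter::Handle::PRINT: wrap a new glob around the handle.
my $wrapper = *{ \*FOO };

print FOO "direct\n";             # goes through the tie
print {$wrapper} "via copy\n";    # the copy shares the tied handle

print STDERR "PRINT was called $obj->{calls} time(s)\n";
```

In Filter::Handle the second kind of print happens inside PRINT itself, which is why it never terminates.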

        I think a workaround could be done by saving a dup'ed handle in Filter::Handle::TIEHANDLE. Out of time for now.
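A rough sketch of that workaround idea, assuming the dup is taken inside TIEHANDLE before the tie takes effect. DupFilter is a hypothetical stand-in, not a patch to Filter::Handle, and it ties a temporary file rather than STDOUT so the result can be read back and checked.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Handle;

package DupFilter;    # hypothetical stand-in, not Filter::Handle itself

sub TIEHANDLE {
    my ($class, $fh, $filter) = @_;
    # The handle is not tied yet while TIEHANDLE runs, so this
    # duplicates the real, untied file descriptor.
    open(my $real, '>&', $fh) or die "Cannot dup handle: $!";
    $real->autoflush(1);
    return bless { real => $real, filter => $filter }, $class;
}

sub PRINT {
    my $self = shift;
    # The dup has its own untied handle, so this print does not recurse.
    return print { $self->{real} } $self->{filter}->(@_);
}

package main;

my ($out, $path) = tempfile(UNLINK => 1);

tie *$out, 'DupFilter', $out, sub { uc join '', @_ };
print {$out} "hello\n";    # intercepted, filtered, written via the dup
untie *$out;
close $out;

open(my $in, '<', $path) or die "Cannot read $path: $!";
my $written = do { local $/; <$in> };
close $in;
```

The filter here just upper-cases its arguments; the capture-into-an-array use described in the original question would slot into the same place.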

Re: Filter::Handle segfaults 5.8.1
by ysth (Canon) on Nov 17, 2003 at 17:01 UTC
    Just wanted to mention that you should perlbug this. There are two bugs here. First, recursion like
    $ perl -we'sub TIEHANDLE {bless []} sub PRINT { &PRINT(@_) } tie *FOO, "main"; print FOO "Bar"'
    shouldn't segfault. Second, if possible, what Filter::Handle is doing shouldn't recurse.
      OK. I'll do that today - I have also emailed the author, hopefully he still works for pair.com, who host perlmonks.



Re: Filter::Handle segfaults 5.8.1
by sth (Priest) on Nov 17, 2003 at 14:33 UTC

    This has nothing to do with your question, and it sounds like ysth has pointed you in the right direction. I was just curious to know why your run with DBI calls took 12 days. I realize your time is probably short; I was mainly wondering what DBMS you were using.

    STH

      OK, well, the full run took 12 days with about 200,000 customer documents, so it wasn't one DBI call for 12 days, and it was a bug in the client data that triggered it. We use Oracle, and as I said it was a first run, so while we fixed various issues and optimised our calls and code, we just let the test run continue for a benchmark. Some of the data sets are huge - before going live, we loaded >2Gb of historical data first. We now run our end-of-month prints in about 6 hours.