ironchicken has asked for the wisdom of the Perl Monks concerning the following question:
My application includes a SAX filter that parses a simple markup language into XML elements. The filter runs as part of an XML::SAX::Machines pipeline within a mod_perl2 handler.
I'm finding that the characters method of my filter emits characters for only two HTTP requests after Apache is started and, for any subsequent HTTP requests, it does not emit characters, although it does continue to emit elements correctly.
I checked the logic of my filter quite carefully and when everything seemed fine, I tried checking to make sure that XML::SAX::Base was receiving the character data my filter was emitting. I did this by altering XML::SAX::Base's characters method thus:
    sub characters {
        my $self = shift;
        print "\nXML::SAX::Base::characters Received DATA: |" . $_[0]->{Data} . "|\n";
        if (defined $self->{Methods}->{'characters'}) {
            $self->{Methods}->{'characters'}->(@_);
        }
        else {
            my $method;
            my $callbacks;
            ...
i.e., I inserted that print statement.
I've found that, when characters are successfully emitted, the print statement gets executed twice. But when the filter stops emitting characters, that statement gets executed only once.
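One possible reading of the "twice versus once" observation is that each stage in the chain that inherits the instrumented characters method prints once per event, so a single print would mean the event never reached the next stage. The following is only a toy model of that idea, not XML::SAX::Base itself: ToyBase, its Handler slot and the $calls counter are all hypothetical names, and whether the real pipeline behaves this way is an assumption.

```perl
use strict;
use warnings;

# Toy stand-in for a SAX base class: counts calls instead of printing.
package ToyBase;

our $calls = 0;    # how many times ToyBase::characters ran

sub new {
    my ($class, %args) = @_;
    return bless { Handler => $args{Handler} }, $class;
}

sub characters {
    my ($self, $data) = @_;
    $calls++;      # plays the role of the inserted print statement
    my $handler = $self->{Handler};
    # Forward only when a handler that accepts the event is attached.
    $handler->characters($data) if $handler && $handler->can('characters');
    return;
}

package main;

# Two chained stages: the event passes through ToyBase::characters twice.
my $sink   = ToyBase->new;
my $filter = ToyBase->new(Handler => $sink);
$filter->characters({ Data => 'hello' });
my $with_handler = $ToyBase::calls;

# A stage with no downstream handler: the method runs only once.
$ToyBase::calls = 0;
my $orphan = ToyBase->new;
$orphan->characters({ Data => 'hello' });
my $without_handler = $ToyBase::calls;

print "with handler: $with_handler, without: $without_handler\n";
```

If the real chain behaves like this model, counting the prints per event would show how far along the pipeline each characters event travels.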
The filter's characters implementation parses the supplied character data, looks for instances of the simple markup language elements and emits chunks of the original character data along with newly generated XML elements.
The filter overrides XML::SAX::Base's start_element method like this:
    sub start_element {
        my ($self, $element) = @_;
        $self->{parsing_markup} = allow_markup($element->{Name});
        $self->SUPER::start_element($element);
    }

Here, allow_markup is a function that determines whether the simple markup language should be applied to the content of a particular element in the source XML.
There is an implementation of characters like this:
    sub characters {
        my ($self, $chars) = @_;
        if ($self->{parsing_markup}) {
            $self->parse_markup($chars->{Data});
        }
        else {
            $self->SUPER::characters({Data => $chars->{Data}});
        }
    }

This either sends the character data to parse_markup or hands it straight on to XML::SAX::Base's characters method.
The parse_markup method is quite complicated, but its functioning boils down to a mixture of $self->SUPER::start_element, $self->SUPER::end_element, and $self->SUPER::characters calls. The start_element and end_element calls are very likely to be correct as I always get the appropriate tags in the output. But there could be something going awry with the characters calls as this is where the data is going missing.
The call to $self->SUPER::characters looks like this:
    my $c = {Data => substr $chars, $from, $upto - $from};
    unless ($upto - $from <= 0) {
        print "\n=> calling SUPER::characters with " . Dumper($c) . "\n";
    }
    $self->SUPER::characters($c) unless ($upto - $from <= 0);

The conditional print call is additional debugging output. What it prints is always what I would expect.
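To make the chunking described above concrete, here is a minimal, self-contained sketch of the same $from/$upto pattern. It is not the actual parse_markup: the markup syntax (*text* becoming an em element) and the function name markup_events are invented for illustration, and the events are collected in a list rather than sent through SUPER calls.

```perl
use strict;
use warnings;

# Hypothetical markup: *text* becomes an <em> element. Returns a list of
# [event_name, payload] pairs instead of calling a real SAX handler.
sub markup_events {
    my ($chars) = @_;
    my @events;
    my $from = 0;

    while ($chars =~ /\*([^*]+)\*/g) {
        my $upto = $-[0];    # offset where the markup starts
        # Plain text before the markup; skipped when the chunk is empty.
        push @events, [ characters => substr($chars, $from, $upto - $from) ]
            if $upto - $from > 0;
        push @events,
            [ start_element => 'em' ],
            [ characters    => $1 ],
            [ end_element   => 'em' ];
        $from = $+[0];       # offset just past the closing delimiter
    }

    # Trailing plain text after the last markup instance.
    push @events, [ characters => substr($chars, $from) ]
        if $from < length $chars;

    return @events;
}

my @events = markup_events('a *b* c');
print join(' ', map { "$_->[0]($_->[1])" } @events), "\n";
```

Collecting events in a list like this makes it possible to check the chunk boundaries outside Apache, independently of whatever the downstream handler does with them.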
I'm fairly sure that this must have something to do with Apache or mod_perl. But I'm now at a loss as to how to debug further. Any suggestions?
Perl: v5.14.2; mod_perl: 2.0.5; Apache: 2.2.22; XML::SAX::Base: 1.07; all installed from Debian packages from the unstable archive.
Re: SAX filter in mod_perl
by Anonymous Monk on May 04, 2012 at 19:45 UTC
by ironchicken (Novice) on May 05, 2012 at 01:10 UTC
by roboticus (Chancellor) on May 05, 2012 at 13:07 UTC

Re: SAX filter in mod_perl
by Anonymous Monk on May 06, 2012 at 18:41 UTC