Ovid has asked for the wisdom of the Perl Monks concerning the following question:

This has been X-Posted to the "Higher Order Perl" discussion mailing list.

We're using Parser.pm at my company (with the proper copyright notice, I might add) and I've started writing tests for it. Things went well until I hit a case with &concatenate that doesn't match my expectations. Are my expectations wrong or is this a bug?

Below is a minimal test case demonstrating the problem. This assumes Stream.pm and Parser.pm are in the same directory as this script:

#!/usr/bin/perl

use strict;
use warnings;
use lib '.';

use Stream qw/node list_to_stream/;
use Parser ':all';
use Test::More 'no_plan';

my @tokens = (
    node( OP  => '+' ),
    node( VAR => 'x' ),
    node( VAL => 3 )
);
my $stream = list_to_stream(@tokens);

my $parser = concatenate(
    lookfor('OP'),
    lookfor('VAR'),
);
my ( $parsed, $remainder ) = $parser->($stream);
is_deeply $parsed, [qw/+ x/], 'concatenate should return the parsed values';
is_deeply $remainder, [ VAL => 3 ], '... and the rest of the stream';

# the next test fails ...
$parser = concatenate(
    lookfor('OP'),
    lookfor('VAR'),
    lookfor('VAL'),
);
( $parsed, $remainder ) = $parser->($stream);
is_deeply $parsed, [qw/+ x 3/], 'concatenate should return the parsed values';
ok !defined $remainder, '... and the remaining stream should be empty';

When I try to concatenate just the first two tokens, everything is fine. However, if I try to concatenate all of the tokens together, the parser fails to find a match.

I'm sure I can fix this, but I want to make sure this is really a bug and not me misunderstanding how this code is supposed to work. I searched the errata but couldn't find an example of this problem.

Cheers,
Ovid

New address of my CGI Course.

Replies are listed 'Best First'.
Re: Testing the HOP Parser
by Ovid (Cardinal) on Aug 24, 2005 at 02:16 UTC

    Also X-Posted to the HOP-Discuss mailing list.

    OK, I confess I was lazy. I posted my query in hopes of someone supplying me with a quick answer and I went on and did other things. Given that I was Warnocked, I decided to dig into this.

    I think the problem lies in how the parser returned by lookfor() hands back the last item of the stream. Running the following code gives us a clue:

    use Stream qw/node drop list_to_stream/;

    my @tokens = (
        node( OP  => '+' ),
        node( VAR => 'x' ),
        node( VAL => 3 )
    );
    my $stream = list_to_stream(@tokens);

    use Data::Dumper::Simple;
    while ( my $node = drop($stream) ) {
        print Dumper( $node, $stream );
    }

    That prints out the following (tightened up a bit for clarity):

    $node = [ 'OP', '+' ];
    $stream = [    # AoA
        [ 'VAR', 'x' ],
        [ 'VAL', 3 ]
    ];

    $node = [ 'VAR', 'x' ];
    $stream = [ 'VAL', 3 ];    # not AoA

    $node = 'VAL';
    $stream = 3;    # scalar

    That doesn't look like what the parser is expecting. Prior to the tail, it was always returning an AoA. However, when it got to the tail, it merely returned an array reference. I fixed this by adjusting the parser that lookfor() returns.

    The parser looked like this:

    my $parser = parser {
        my $input = shift;
        return unless defined $input;
        my $next = head($input);
        for my $i ( 0 .. $#$wanted ) {
            next unless defined $wanted->[$i];
            return unless $wanted->[$i] eq $next->[$i];
        }
        my $wanted_value = $value->( $next, $u );
        return ( $wanted_value, tail($input) );
    };

    By testing the tail to see if it matches expectations, I get this:

    my $parser = parser {
        my $input = shift;
        return unless defined $input;
        my $next = head($input);
        for my $i ( 0 .. $#$wanted ) {
            next unless defined $wanted->[$i];
            return unless $wanted->[$i] eq $next->[$i];
        }
        my $wanted_value = $value->( $next, $u );
        my $tail = tail($input);

        # is this the last token from the stream?
        if ( $tail && 'ARRAY' eq ref $tail && 'ARRAY' ne ref $tail->[0] ) {
            $tail = [ $tail ];
        }
        return ( $wanted_value, $tail );
    };
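
    To see what that extra check does in isolation, here is a minimal sketch that pulls the condition out into a standalone function. The name normalize_tail() is hypothetical and exists only for this illustration; the condition itself is copied from the patched parser above.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: the same test the patched parser applies to its tail.
    sub normalize_tail {
        my $tail = shift;

        # A bare token such as [ VAL => 3 ] is an array ref whose first
        # element is NOT an array ref; wrap it so it looks like an AoA.
        if ( $tail && 'ARRAY' eq ref $tail && 'ARRAY' ne ref $tail->[0] ) {
            $tail = [$tail];
        }
        return $tail;
    }

    my $aoa   = [ [ VAR => 'x' ], [ VAL => 3 ] ];
    my $token = [ VAL => 3 ];

    my $kept    = normalize_tail($aoa);
    my $wrapped = normalize_tail($token);
    my $end     = normalize_tail(undef);

    print scalar(@$kept), "\n";                  # 2: already an AoA, left alone
    print ref( $wrapped->[0] ), "\n";            # ARRAY: the lone token is wrapped
    print defined $end ? "defined" : "undef", "\n";    # undef: nothing to wrap
    ```

    Note that this only patches up the shape after the fact; it does not explain why the last token arrives unwrapped in the first place.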

    I also have to adjust my first test of the remaining stream:

    my $parser = concatenate(
        lookfor('OP'),
        lookfor('VAR'),
    );
    my ( $parsed, $remainder ) = $parser->($stream);
    is_deeply $parsed, [qw/+ x/], 'concatenate should return the parsed values';
    is_deeply $remainder, [ [ VAL => 3 ] ],    # AoA, not just an aref
        '... and the rest of the stream';

    This allows the final concatenate to match the entire stream:

    $parser = concatenate(
        lookfor('OP'),
        lookfor('VAR'),
        lookfor('VAL'),
    );

    Have I analyzed this incorrectly or did I set my test up incorrectly with the list_to_stream() function? Should drop() be modified instead?
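
    One way to explore the test-setup question is sketched below. It does not use the real Stream.pm; node(), head(), tail() and list_to_stream() here are small stand-ins written for this example, and the key assumption (suggested by the dump above, but not confirmed here) is that list_to_stream() builds the stream from the right and treats its LAST argument as the final tail. Under that assumption, a token list with no terminator ends up with its last token in the tail slot, while appending an explicit undef keeps every token in a head.

    ```perl
    use strict;
    use warnings;

    # Stand-ins for the Stream.pm helpers (assumed behaviour, not the book's code).
    sub node { my ( $head, $tail ) = @_; return [ $head, $tail ] }
    sub head { $_[0][0] }
    sub tail { $_[0][1] }

    # Assumption: the last argument becomes the final tail of the stream.
    sub list_to_stream {
        my $stream = pop;
        $stream = node( pop, $stream ) while @_;
        return $stream;
    }

    # Tokens as plain array refs rather than node() calls.
    my @tokens = ( [ OP => '+' ], [ VAR => 'x' ], [ VAL => 3 ] );

    # No terminator: the last token itself sits in the tail position, so
    # walking two cells in and asking for the head yields a bare string.
    my $unterminated = list_to_stream(@tokens);
    print head( tail( tail($unterminated) ) ), "\n";    # VAL

    # Explicit undef terminator: each token is the head of its own cell.
    my $terminated = list_to_stream( @tokens, undef );
    my @seen;
    for ( my $s = $terminated; defined $s; $s = tail($s) ) {
        push @seen, head($s)->[1];
    }
    print "@seen\n";    # + x 3
    ```

    If the real list_to_stream() behaves like this stand-in, the original test would need only the trailing undef, and neither lookfor() nor drop() would have to change.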

    Cheers,
    Ovid
