Just in case it helps people trying to solve a similar problem:

yagg is probably the right tool for that.

Although Parse::Eyapp was conceived for parsing, versions 1.137 and later support building a phrase generator from a grammar specification. To learn more, read the tutorial Parse::Eyapp::datagenerationtut. The example it uses produces sequences of assignment statements:

Parse-Eyapp/examples/generator$ ./Generator.pm

  # result: -710.2
  I=(3*-8+7/5);
  R=2+8*I*4+5*2+I/I

To specify the language we write a yacc-like grammar, but instead of writing the classical lexer, i.e. one that scans the input to produce the next token, we write a token generator: each time the lexical analyzer is called, it checks the list of expected tokens (available via the method YYExpect) and produces one of them, following some probability distribution. This is the grammar for the calculator:

  Parse-Eyapp/examples/generator$ cat Generator.eyp
  # file: Generator.eyp
  # compile with: eyapp -b '' Generator.eyp
  # then run: ./Generator.pm
  %strict
  %token NUM VARDEF VAR

  %right  '='
  %left   '-' '+'
  %left   '*' '/'
  %left   NEG
  %right  '^'

  %defaultaction {
    my $parser = shift;

    return join '', @_;
  }

  %{
  use base q{Parse::Eyapp::TokenGen};
  use base q{GenSupport};
  %}

  %%

  stmts:
      stmt
        { # At least one variable is defined now
          $_[0]->deltaweight(VAR => +1);
          $_[1];
        }
    | stmts ';' { "\n" } stmt
  ;

  stmt:
      VARDEF '=' exp
        {
          my $parser = shift;
          $parser->defined_variable($_[0]);
          "$_[0]=$_[2]";
        }
  ;
  exp:
      NUM
    | VAR
    | exp '+' exp
    | exp '-' exp
    | exp '*' exp
    | exp '/' exp
    | '-' { $_[0]->pushdeltaweight('-' => -1) }
          exp %prec NEG  {
          $_[0]->popweight();
          "-$_[3]"
        }
    | exp '^' exp
    | '(' {
            $_[0]->pushdeltaweight(
              '(' => -1, ')' => +1, '+' => +1, );
          }
        exp
      ')'
        {
           $_[0]->popweight;
           "($_[3])"
        }
  ;

  %%

  unless (caller) {
    __PACKAGE__->main(@ARGV);
  }
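To make the idea of a token generator more concrete, here is a minimal sketch in plain Perl of choosing one token from a list of expected tokens with probability proportional to a weight. It is not the code used by Parse::Eyapp::TokenGen or GenSupport: the helper name pick_token and the weights are made up for illustration, and in the real generator the list of candidates would come from YYExpect.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Parse::Eyapp: choose one token from
# @expected with probability proportional to its weight in %$weight.
# Tokens with no entry or a non-positive weight are never chosen.
sub pick_token {
    my ($weight, @expected) = @_;

    my @candidates = grep { ($weight->{$_} // 0) > 0 } @expected;
    return unless @candidates;

    my $total = 0;
    $total += $weight->{$_} for @candidates;

    my $r = rand($total);          # uniform in [0, $total)
    for my $token (@candidates) {
        $r -= $weight->{$token};
        return $token if $r < 0;
    }
    return $candidates[-1];        # guard against floating-point rounding
}

# Illustrative weights only: numbers are more likely than parentheses,
# and VAR is disabled until some variable has been defined.
my %weight = (NUM => 4, VAR => 0, '-' => 1, '(' => 1);

# In the real generator this list would come from $parser->YYExpect.
my @expected = ('NUM', 'VAR', '-', '(');

print pick_token(\%weight, @expected), "\n" for 1 .. 5;
```

With these weights VAR can never be produced, which mirrors the grammar above: the weight of VAR is only raised once the first statement has defined a variable.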

The difficult part is managing the probability distribution so that the generator produces reasonable phrases and avoids very long statements. Tokens and their attributes are generated with Test::LectroTest::Generator. The support subroutines have been isolated in the module GenSupport.pm (see http://cpansearch.perl.org/src/CASIANO/Parse-Eyapp-1.137/examples/generator/GenSupport.pm ).
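The grammar adjusts the distribution with deltaweight, pushdeltaweight and popweight. The following sketch shows one plausible reading of those names: a table of weights plus a stack of saved copies. It is an assumption based on how the calls are used in the grammar, not the actual source of Parse::Eyapp::TokenGen. The class WeightStack, its weight accessor, and the clamping of weights at zero are all invented here.

```perl
use strict;
use warnings;

package WeightStack;   # hypothetical class, for illustration only

sub new {
    my ($class, %weight) = @_;
    return bless { weight => {%weight}, saved => [] }, $class;
}

# Add each delta to the current weight (assumption: never below zero).
sub deltaweight {
    my ($self, %delta) = @_;
    for my $token (keys %delta) {
        my $w = ($self->{weight}{$token} // 0) + $delta{$token};
        $self->{weight}{$token} = $w < 0 ? 0 : $w;
    }
    return $self;
}

# Save a copy of the current weights, then apply the deltas.
sub pushdeltaweight {
    my ($self, %delta) = @_;
    push @{ $self->{saved} }, { %{ $self->{weight} } };
    return $self->deltaweight(%delta);
}

# Restore the weights saved by the matching pushdeltaweight.
sub popweight {
    my $self = shift;
    $self->{weight} = pop @{ $self->{saved} } if @{ $self->{saved} };
    return $self;
}

# Current weight of a token (0 if unknown).
sub weight { $_[0]{weight}{ $_[1] } // 0 }

package main;

my $ws = WeightStack->new('(' => 1, ')' => 0, '+' => 1);

$ws->pushdeltaweight('(' => -1, ')' => +1, '+' => +1);   # entering "( exp )"
printf "inside:  ( = %d, ) = %d, + = %d\n", map { $ws->weight($_) } '(', ')', '+';

$ws->popweight;                                          # leaving it
printf "outside: ( = %d, ) = %d, + = %d\n", map { $ws->weight($_) } '(', ')', '+';
```

Under these assumptions, making '(' less likely while inside a parenthesized expression is what keeps the nesting depth, and therefore the statement length, under control.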

In reply to Re: Natural Language Sentence Production by casiano
in thread Natural Language Sentence Production by japhy
