Ella has asked for the wisdom of the Perl Monks concerning the following question:

So I'm being asked to generate a fairly large number of small text files (about 1 sentence long). The words can be the same every time, but each sentence has to contain a different bunch of numbers. It would look a bit like this:

"The lucky numbers of the day are _1, _2, _3, _4, _5 and don't forget to play again next week."
The aim of this is to test the performance of a recently developed text to speech synthesizer.

Anyway, this is what I've come up with so far (with some help from a friend). It kind of works a bit anyway (well the 2 number generator works, I'm still 'debugging' a.k.a. 'fixing my bad code' on the 5 number generator). I'd appreciate it if the monks have any suggestions for me.

use strict;

my $sentence_file = shift (@ARGV);
my $numbers_file = shift (@ARGV);

open SENTENCE, "$sentence_file" || die "cannot open $sentence_file for reading: $!";
open NUMBERS, "$numbers_file" || die "cannot open $numbers_file for reading: $!";

my $sentence = (<SENTENCE>)[0];
print "sentence: $sentence\n";

my @numbers = <NUMBERS>;

foreach my $number (@numbers) {
    my $working_sentence = $sentence;
    chomp $number;
    $number =~ s/\.$//g;
    my $output_file;
    if ($working_sentence =~ /_5/) {
        my $second_number = shift @numbers;
        my $third_number = shift @numbers;
        my $fourth_number = shift @numbers;
        my $fifth_number = shift @numbers;
        $working_sentence =~ s/_5/$number $second_number $third_number $fourth_number $fifth_number/;
        $output_file = "number.test5.".$number."_".$fifth_number.".txt";
    }
    elsif ($working_sentence =~ /_2/) {
        my $second_number = shift @numbers;
        $working_sentence =~ s/_2/$number $second_number/;
        $output_file = "number.test2.".$number."_".$second_number.".txt";
    }
    open OUTPUT, ">$output_file" || die "cannot open $output_file for writing: $!";
    print OUTPUT $working_sentence."\n";
    close OUTPUT;
}

I'm sorry, this could be formatted better (and be a less stupid question); I'm still feeling my way a bit... ok, a lot! :p

'share and enjoy'

Replies are listed 'Best First'.
Re: sentence generation
by blokhead (Monsignor) on Jul 22, 2003 at 23:32 UTC
    This is a perfect use of sprintf. Just use a %d in the sentence template whenever you want a number substituted:
    my $template = "The lucky numbers of the day are %d, %d, %d, %d, %d "
                 . "and don't forget to play again next week.";
    my @numbers = qw/ 10 435 239 42 999 /;

    ## substitute numbers for %d in the template
    my $output = sprintf($template, @numbers);
    print "$output\n";
    Your code can vary its behavior depending on how many numbers are in the sentence template: simply count the number of occurrences of %d in the template, and splice or slice off that many elements from @numbers to use with that template. I'll leave the rest to you. ;)
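For anyone who wants to see the counting step spelled out, here is a minimal sketch of one way it might look. The number pool is made up for illustration, and collecting the results in a @sentences array is an assumption of this sketch rather than part of the suggestion above; writing each sentence to its own file is left out.

```perl
use strict;
use warnings;

# Hypothetical template and number pool, for illustration only.
my $template = "The lucky numbers of the day are %d, %d, %d, %d, %d "
             . "and don't forget to play again next week.";
my @numbers  = (10, 435, 239, 42, 999, 7, 8, 9, 10, 11, 12);

# Count the %d placeholders; assigning to the empty list forces
# list context, so the scalar assignment receives the match count.
my $count = () = $template =~ /%d/g;
die "template contains no %d placeholders\n" unless $count;

# Take $count numbers per sentence until too few remain.
my @sentences;
while (@numbers >= $count) {
    my @batch = splice(@numbers, 0, $count);
    push @sentences, sprintf($template, @batch);
}

print "$_\n" for @sentences;
```

With eleven numbers and five placeholders this builds two sentences and leaves one number unused; whether leftovers should be dropped or reported is a choice the sketch does not try to settle.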

    blokhead

Re: sentence generation
by LAI (Hermit) on Jul 23, 2003 at 13:27 UTC

    I'm guessing the sentence templates in $sentence_file include some sort of marker to indicate where numbers should be inserted, right? Right now as near as I can tell you're using '_5' as a placeholder; you might want to change that to something like #5#. This makes the format more forward-compatible; you can regexp for something like /#\d+#/ and use any number of numbers.

    Also: any time you see variables like $second_number, there's bound to be a better way. In this case, you probably want to make an array of your numbers, and join them. Something like:

    # If I were doing this as part of a production piece, and
    # I figured it would be reused and extended and so forth,
    # I'd probably write a little sub to spit out a comma-
    # separated list of integers, and regex it into the
    # original sentence. But I don't think you need that
    # complexity.
    if ($working_sentence =~ /#(\d+)#/) {
        # now we take an Array Slice* from the array of numbers
        # we got from file. We use $1 from the preceding regex
        # because... well, because we can.
        my @working_numbers = @numbers[0..$1-1];
        # remove that slice from the numbers array
        splice(@numbers, 0, $1);
        # join the numbers into a comma-separated list
        my $number_list = join ', ', @working_numbers;
        # and insert it into the original sentence
        $working_sentence =~ s/#\d+#/$number_list/;
    }

    The nice thing about doing it this way is that you can vary the number of numbers you want, and even have several lists in one sentence if you convert this to a while loop. For added points convert this to use int rand and eliminate the need for a numbers file.
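As a rough sketch of the while-loop and int rand idea, something like the following might do. The random_numbers helper, the two-marker template, and the upper bound of 49 are all assumptions made for illustration, not anything from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: returns $count random integers in 1..$max.
sub random_numbers {
    my ($count, $max) = @_;
    return map { 1 + int(rand($max)) } 1 .. $count;
}

# Hypothetical template with two #N# markers.
my $sentence = "Today's numbers are #3# and the bonus numbers are #2#.";

# Replace each #N# marker in turn with N comma-separated random numbers.
# The loop ends because the replacement text never contains a '#'.
while ($sentence =~ /#(\d+)#/) {
    my $number_list = join ', ', random_numbers($1, 49);
    $sentence =~ s/#\d+#/$number_list/;
}

print "$sentence\n";
```

Since the numbers are random, the output differs on every run; call srand with a fixed seed first if the test sentences need to be reproducible.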

    * Array slice: you can specify a range within an array, and return it as another array. For more information on slices try perldoc.com.
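To make the difference between a slice and splice concrete, here is a small sketch with made-up numbers. A slice only copies elements, while splice returns them and removes them from the array.

```perl
use strict;
use warnings;

# Made-up numbers, for illustration only.
my @numbers = (10, 435, 239, 42, 999);

# A slice copies the selected elements; @numbers itself is unchanged.
my @first_two = @numbers[0 .. 1];

# splice returns the elements and also removes them from @numbers.
my @removed = splice(@numbers, 0, 2);

print "slice:  @first_two\n";
print "splice: @removed\n";
print "left:   @numbers\n";
```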

    Note: This code is all untested and stuff, so there may be glaring mistakes. Be warned.

    LAI

    __END__