Chad C has asked for the wisdom of the Perl Monks concerning the following question:

I am hoping you guys can help me out. For school I have to create a Perl script that takes the text from one file (a transcript), tries to find each word in a dictionary file that contains the word and its pronunciation spelling, and then writes the word with its pronunciation spelling into a new file. Below is an example.

YES NOW YOU KNOW IF IF EVERYBODY LIKE IN AUGUST WHEN EVERYBODY'S ON VACATION OR SOMETHING WE CAN DRESS A LITTLE MORE CASUAL OR

It would have to find each word in this text in a dictionary file such as this

YERKEY Y ER1 K IY0

YERMAN Y ER1 M AH0 N

YERXA Y ER1 K S AH0

YES Y EH1 S

YESES Y EH1 S IH0 Z

and report back all the terms with the pronunciation.

This is a continuation of last semester's work. I found a script one person created, but when I run it, it's not outputting anything to the new file; it's only printing the first section on my terminal screen. I have never done Perl scripting before and am trying to learn on the fly, as the teacher won't help us out. Anything you guys could do would be immensely appreciated. Thanks

#!/usr/bin/perl

if( $#ARGV != 2 ) {
    print "Compares the list of words in a file to the words in a dictionary and outputs the words available with pronunciations\n";
    print "perl GenerateDictionary WordFile DictionaryFile OutputFile\n";
    exit;
}

open( WORD_FILE, "$ARGV[0]" );
open( DICT_FILE, "$ARGV[1]" );
open( OUTP_FILE, ">$ARGV[2]" );

@theDICT = <DICT_FILE>;
close( DICT_FILE );

while( <WORD_FILE> ) {
    my($line) = $_;
    chomp($line);
    foreach( @theDICT ) {
        $tmpLine = $_;
        @items = split( / /, $tmpLine );
        if( @items[0] eq $line ) {
            print $line."\t".$tmpLine;
            print OUTP_FILE $tmpLine;
        }
    }
}

close( WORD_FILE );
close( OUTP_FILE );
exit;

Replies are listed 'Best First'.
Re: Matching Text
by NetWallah (Canon) on Apr 03, 2012 at 04:23 UTC
    Shorter, more idiomatic, and more efficient code attached. Feel free to modify. Output goes to STDOUT, which can be directed to a file.
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Both file names are required.
    @ARGV >= 2 or die "Insufficient arguments: Need word file and dict file names";
    my ($wordfile, $dictfile) = @ARGV;

    open my $d, "<", $dictfile or die "Cannot open $dictfile: $!";
    open my $w, "<", $wordfile or die "Cannot open $wordfile: $!";

    # Map each dictionary word to the rest of its line (the pronunciation).
    my %dict = map {split /\s/,$_,2} <$d>;
    close $d;

    while (defined (my $line=<$w>)){
        for (split /\s/,$line){
            if (my $meaning = $dict{$_}){
                print "$_: $meaning\n";
                next;
            }
            print "$_ is not in the dictionary\n";
        }
    }
    close $w;

                 All great truths begin as blasphemies.
                       ― George Bernard Shaw, writer, Nobel laureate (1856-1950)

      my %dict = map {split /\s/,$_,2} <$d>;
      close $d;

      while (defined (my $line=<$w>)){
          for (split /\s/,$line){
              if (my $meaning = $dict{$_}){
                  print "$_: $meaning\n";
                  next;
              }
              print "$_ is not in the dictionary\n";
          }
      }

      I would write that as:

      my %dict = map split( ' ', $_, 2 ), <$d>;
      close $d;

      while ( <$w> ) {
          for my $word ( split ) {
              if ( exists $dict{ $word } ) {
                  print "$word: $dict{ $word }";
                  next;
              }
              print "$word is not in the dictionary\n";
          }
      }

      The use of /\s/ with split will screw up the dictionary if there is any leading whitespace. (Anything that can go wrong, will go wrong. -- Murphy)
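
A minimal sketch of the difference, assuming a dictionary line that happens to start with whitespace (the sample line below is made up for illustration and is not from the actual dictionary file):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical dictionary line with leading whitespace.
my $entry = "  YES Y EH1 S\n";

# /\s/ splits at the first single whitespace character, so the leading
# space produces an empty first field and the rest becomes the value.
my @by_regex = split /\s/, $entry, 2;

# ' ' is split's special case: leading whitespace is skipped and a run
# of whitespace counts as one separator.
my @by_space = split ' ', $entry, 2;

chomp( my $regex_value = $by_regex[1] );
chomp( my $space_value = $by_space[1] );

print "regex: key='$by_regex[0]' value='$regex_value'\n";
print "space: key='$by_space[0]' value='$space_value'\n";
```

With /\s/ the hash key ends up as the empty string, so a later lookup of YES would not find this entry.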

        agreed. (++). my split-fu is not the greatest.

        Hey guys, thanks, this seems to be working really well for printing to the screen. I just need to figure out how to get it to output to an external file.

      I can't seem to figure out how to get it to go from STDOUT to a file. The file would need to contain just the dictionary items, not the words that aren't in the dictionary, instead of everything printing to the terminal screen. Thanks again, I appreciate all the help!!

        Never mind, I realized I had to redirect the output on the Unix command line and not change the script. Thanks again guys, you saved me big time!!
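
For readers who would rather have the script write the file itself, here is one possible sketch. It is not code from the thread: the sub name write_matches, the three-argument usage, and the "WORD PRONUNCIATION" output format are all assumptions chosen to mirror the dictionary layout shown in the question. Only words found in the dictionary are written; unknown words are skipped silently.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copy every transcript word that has a dictionary
# entry to $out_fh, one "WORD PRONUNCIATION" line per match.
sub write_matches {
    my ($words_fh, $dict_fh, $out_fh) = @_;

    # Build WORD => PRONUNCIATION, skipping blank or one-field lines.
    my %dict;
    while (my $entry = <$dict_fh>) {
        chomp $entry;
        my ($word, $pron) = split ' ', $entry, 2;
        $dict{$word} = $pron if defined $pron;
    }

    # Look up each whitespace-separated word of the transcript.
    while (my $line = <$words_fh>) {
        for my $word (split ' ', $line) {
            print {$out_fh} "$word $dict{$word}\n" if exists $dict{$word};
        }
    }
}

# Assumed usage: perl script.pl WordFile DictionaryFile OutputFile
if (@ARGV == 3) {
    my ($wordfile, $dictfile, $outfile) = @ARGV;
    open my $w, '<', $wordfile or die "Cannot open $wordfile: $!";
    open my $d, '<', $dictfile or die "Cannot open $dictfile: $!";
    open my $o, '>', $outfile  or die "Cannot open $outfile: $!";
    write_matches($w, $d, $o);
    close $o or die "Cannot close $outfile: $!";
}
```

Taking filehandles rather than file names keeps the lookup logic separate from the file opening, so the same sub could also print to STDOUT if that is preferred.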