When I was in grad school, I had to refer to concordances a lot, especially for Shakespeare and Chaucer. The other day I was trying to find a particular line in a Shakespeare play (Othello, to be exact), and I thought it would be an entertaining programming exercise to write a concordance generator. Pass the code a text file and it generates a full concordance, listing the number of times each word appears in the text along with the line numbers where it appears. Alternatively, pass it a specific word (via the -s flag) and a text file, and it returns the line(s) that contain that word.

Now before everyone starts asking why I didn't use strict, my answer is that I did up until the moment I tried to use Getopt::Std. Obviously I'm missing something, but in order to pass strict, I had to declare my $opt variables. But when I did that, it ignored my command line flags. Any help in that regard would be greatly appreciated.
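One likely explanation, assuming the $opt variables were declared with my: getopts() sets package variables ($main::opt_h, $main::opt_s), so a lexical declared with my shadows them and never gets a value. Declaring them with use vars (as the posted code now does) or with our keeps strict happy. Another option is to hand getopts() a hash reference so no package variables are involved at all. The sketch below shows that second approach; parse_args is a hypothetical helper name, not something from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical helper: parse a list of arguments into a lexical hash.
# getopts() works on @ARGV, so we localize @ARGV to the list we were given.
sub parse_args {
    local @ARGV = @_;
    my %opts;
    getopts( 'hs:', \%opts ) or return;    # false on an unknown flag
    return ( \%opts, [@ARGV] );            # options, then remaining arguments
}

# Example arguments (made up for illustration).
my ( $opts, $rest ) = parse_args( '-s', 'pearl', 'othello.txt' );
print "search word: $opts->{s}\n" if defined $opts->{s};
print "file: $rest->[0]\n";
```

With the hash form, the flags are read as $opts->{h} and $opts->{s} instead of $opt_h and $opt_s, and nothing needs to be predeclared beyond the hash itself.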

Update: Modified code. Still tweaking... (BTW, the line in Othello I was looking for was the one about throwing away a pearl worth more than the whole tribe. I don't remember now why I was looking it up, but it seemed important at the time.)
#!/usr/bin/perl
#--------------------------------------------------------------------#
# Concordance Generator
#       Date Written:   13-Aug-2001 04:02:11 PM
#       Last Modified:  14-Aug-2001 04:14:00 PM
#       Author:         Kurt Kincaid
#
# This is free software and may be distributed under the
# same terms as Perl itself.
#
# A simple concordance generator, particularly useful for linguistic
# analysis.
#--------------------------------------------------------------------#

use strict;
use vars qw($opt_h $opt_s);
use Getopt::Std;

my @theseWords;
my @theseLines;
my @found;
my %Count;
my %Line;
my ( $line, $word, $count, $LineNum );
my $VERSION = "1.0";

getopts( "hs:" );

if ( $opt_h ) {
    Usage();
}

my $file = shift || Usage();

open ( IN, $file ) || die "$file not found\n";
@theseLines = <IN>;
close (IN);
chomp @theseLines;

if ( $opt_s ) {
    Word($opt_s);
}

foreach $line ( @theseLines ) {
    $count++;
    $line = lc $line;
    $line =~ s/[.,:;?!]//g;
    while ( $line =~ /\b\w+\b/g ) {
        $word = $&;
        if ( $word =~ /\s/ || $word eq "" ) { next }
        $Count{$word}++;
        if ( defined $Line{$word} ) {
            $Line{$word} =~ m/(\d*?)$/;
            if ( $1 == $count ) {
                next;
            } else {
                $Line{$word} .= ", $count";
            }
        } else {
            $Line{$word} = $count;
        }
        # push @{$Line{$word}}, $count unless exists $Line{$word} && $Line{$word}[-1] == $count;
    }
}

@theseWords = keys %Count;
@theseWords = sort @theseWords;

foreach $word ( @theseWords ) {
    # print ( "$word ($Count{$word}): ", join ', ', @{$Line{$word}}, "\n\n" );
    print ("$word ($Count{$word}): $Line{$word}\n\n");
}

sub Word {
    my $word = shift;
    foreach $line ( @theseLines ) {
        $LineNum++;
        $Line{$line} = $LineNum;
    }
    @found = grep { /$word/i } @theseLines;
    foreach $line ( @found ) {
        print ("$Line{$line}: $line\n");
    }
    exit;
}

sub Usage {
    print <<END;
Concordance Generator v$VERSION
$0 [-h] [-s word] filename

    -h      Print this screen.
    -s      Perform a search for a specific word with immediate context.

END
    exit;
}
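The commented-out push line in the code above hints at storing line numbers in an array instead of a comma-joined string. Here is a minimal sketch of that idea, not a drop-in replacement for the script. The names build_concordance and find_word are hypothetical, and the sample lines are made up. Two deliberate differences from the posted code: find_word matches whole words only (using \b and \Q...\E) rather than any substring, and it tracks line numbers by index, so two identical lines in the text each keep their own number.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build word => { count => N, lines => [line numbers] } from a list of lines.
sub build_concordance {
    my @lines = @_;
    my %conc;
    my $line_num = 0;
    for my $line (@lines) {
        $line_num++;
        for my $word ( lc($line) =~ /\w+/g ) {
            my $entry = $conc{$word} ||= { count => 0, lines => [] };
            $entry->{count}++;
            # Record each line number once, even if the word repeats on the line.
            push @{ $entry->{lines} }, $line_num
                unless @{ $entry->{lines} } && $entry->{lines}[-1] == $line_num;
        }
    }
    return \%conc;
}

# Return [line number, text] pairs for the lines containing $word as a whole word.
sub find_word {
    my ( $word, @lines ) = @_;
    my @hits;
    for my $i ( 0 .. $#lines ) {
        push @hits, [ $i + 1, $lines[$i] ]
            if $lines[$i] =~ /\b\Q$word\E\b/i;
    }
    return @hits;
}

# Made-up sample text, for illustration only.
my @text = ( 'A pearl, a pearl', 'no gems here', 'The pearl again' );

my $conc = build_concordance(@text);
for my $word ( sort keys %$conc ) {
    my $entry = $conc->{$word};
    print "$word ($entry->{count}): ", join( ', ', @{ $entry->{lines} } ), "\n";
}

print "$_->[0]: $_->[1]\n" for find_word( 'pearl', @text );
```

Keeping the line numbers in an array avoids the regex match against the end of the string on every word, and leaves the formatting to a single join at print time.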

In reply to Concordance Generator by sifukurt
