When I was in grad school, I had to refer to concordances a lot, especially for Shakespeare and Chaucer. The other day I was trying to find a particular line in a Shakespeare play (Othello, to be exact), and I thought it would be an entertaining programming exercise to write a concordance generator. Pass the script a text file and it generates a full concordance, listing how many times each word appears in the text and the line numbers on which it appears. Alternatively, pass it a specific word (with -s) and a text file, and it returns the line(s) that contain that word.

Now, before everyone starts asking why I didn't use strict, my answer is that I did, right up until the moment I tried to use Getopt::Std. Obviously I'm missing something: in order to pass strict, I had to declare my $opt variables, but once I did that, the script ignored my command-line flags. Any help in that regard would be greatly appreciated.
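A likely explanation for the symptom described above: getopts() sets package variables such as $main::opt_s, so declaring the flags with my creates separate lexical variables that getopts() never touches. One way around it, sketched below, is to pass getopts() a hash reference so no $opt_* globals are needed at all. The simulated @ARGV values (the word "pearl" and the file name othello.txt) are made up for the demonstration.

```perl
use strict;
use warnings;
use Getopt::Std;

# Simulated command line, for demonstration only (hypothetical values).
@ARGV = ( '-s', 'pearl', 'othello.txt' );

# With a hash reference as the second argument, getopts() stores each
# flag in the hash instead of in $opt_* package variables, so there is
# nothing extra to declare under strict.
my %opts;
getopts( 'hs:', \%opts ) or die "Unrecognized option\n";

# Whatever getopts() did not consume is left in @ARGV.
my $file = shift @ARGV;

print "search word: $opts{s}\n" if defined $opts{s};
print "file: $file\n";
```

Declaring the flags with our ($opt_h, $opt_s), or with use vars qw($opt_h $opt_s), should also satisfy strict while still referring to the package variables that getopts() fills in.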

Update: Modified code. Still tweaking... (BTW, the line in Othello I was looking for was the one about throwing away a pearl worth more than the whole tribe. I don't remember now why I was looking it up, but it seemed important at the time.)
#!/usr/bin/perl
#--------------------------------------------------------------------#
# Concordance Generator
#       Date Written:   13-Aug-2001 04:02:11 PM
#       Last Modified:  14-Aug-2001 04:14:00 PM
#       Author:         Kurt Kincaid
#
# This is free software and may be distributed under the
# same terms as Perl itself.
#
# A simple concordance generator, particularly useful for linguistic
# analysis.
#--------------------------------------------------------------------#

use strict;
use vars qw($opt_h $opt_s);
use Getopt::Std;

my @theseWords;
my @theseLines;
my @found;
my %Count;
my %Line;
my ( $line, $word, $count, $LineNum );
my $VERSION = "1.0";

getopts( "hs:" );

if ( $opt_h ) {
    Usage();
}

my $file = shift || Usage();

open ( IN, $file ) || die "$file not found\n";
@theseLines = <IN>;
close (IN);
chomp @theseLines;

if ( $opt_s ) {
    Word($opt_s);
}

foreach $line ( @theseLines ) {
    $count++;
    $line = lc $line;
    $line =~ s/[.,:;?!]//g;
    while ( $line =~ /\b\w+\b/g ) {
        $word = $&;
        if ( $word =~ /\s/ || $word eq "" ) { next }
        $Count{$word}++;
        if ( defined $Line{$word} ) {
            $Line{$word} =~ m/(\d*?)$/;
            if ( $1 == $count ) {
                next;
            } else {
                $Line{$word} .= ", $count";
            }
        } else {
            $Line{$word} = $count;
        }
        # push @{$Line{$word}}, $count unless exists $Line{$word} && $Line{$word}[-1] == $count;
    }
}

@theseWords = keys %Count;
@theseWords = sort @theseWords;
foreach $word ( @theseWords ) {
    # print ( "$word ($Count{$word}): ", join ', ', @{$Line{$word}}, "\n\n" );
    print ("$word ($Count{$word}): $Line{$word}\n\n");
}

sub Word {
    my $word = shift;
    foreach $line ( @theseLines ) {
        $LineNum++;
        $Line{$line} = $LineNum;
    }
    @found = grep { /$word/i } @theseLines;
    foreach $line ( @found ) {
        print ("$Line{$line}: $line\n");
    }
    exit;
}

sub Usage {
    print <<END;
Concordance Generator v$VERSION
$0 [-h] [-s word] filename

    -h    Print this screen.
    -s    Perform a search for a specific word with immediate context.
END
    exit;
}
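The commented-out push line in the script hints at storing line numbers in an array per word instead of a comma-separated string. Below is a minimal sketch of that variant, not the author's final code: the sub names build_concordance and find_word and the sample text are invented for illustration. It also tracks line numbers by index in the search, because the posted Word routine keys %Line by line text, so two identical lines would report the same number. The search here uses word boundaries and quotes the word, which is a design assumption; the posted version matches substrings.

```perl
use strict;
use warnings;

# Build a concordance from a list of lines.
# Returns two hash refs: word => occurrence count, and
# word => array of the distinct line numbers it appears on.
sub build_concordance {
    my @lines = @_;
    my ( %count, %lines );
    my $n = 0;
    for my $line (@lines) {
        $n++;
        # In list context, a /g match returns every matched word.
        for my $word ( lc($line) =~ /\b\w+\b/g ) {
            $count{$word}++;
            # Record each line number once per word.
            push @{ $lines{$word} }, $n
                unless $lines{$word} && $lines{$word}[-1] == $n;
        }
    }
    return ( \%count, \%lines );
}

# Return [line number, text] pairs for lines containing $word as a
# whole word, case-insensitively. \Q...\E keeps regex metacharacters
# in the search word from being interpreted.
sub find_word {
    my ( $word, @lines ) = @_;
    my @hits;
    for my $i ( 0 .. $#lines ) {
        push @hits, [ $i + 1, $lines[$i] ]
            if $lines[$i] =~ /\b\Q$word\E\b/i;
    }
    return @hits;
}

# Made-up sample text, for demonstration only.
my @text = (
    'The pearl was thrown away.',
    'A pearl, a pearl!',
    'Nothing here.',
);

my ( $count, $lines ) = build_concordance(@text);
for my $word ( sort keys %$count ) {
    printf "%s (%d): %s\n", $word, $count->{$word},
        join( ', ', @{ $lines->{$word} } );
}

my @hits = find_word( 'pearl', @text );
print "$_->[0]: $_->[1]\n" for @hits;
```

Because \w+ never matches punctuation, this sketch skips the separate step that strips punctuation. That differs slightly from the posted script for input such as "a.b", which the original would treat as the single word "ab".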

In reply to Concordance Generator by sifukurt
