Re: filehandles and such

It's tough to figure out what you are trying to do, but here is my stab at a rewrite:

$infile = shift || die "Need a filename!\n";
open(INFILE, "$infile") || die "Could not open $infile: $!\n";
LINE: while(<INFILE>) {
  chomp;
  tr/A-Z/a-z/;  ## lowercase all characters
  s/\s+/ /g;    ## collapse runs of whitespace into one space
  for $x (split(/\d*\./, $_)) {

    ## Set each to "0" if not found in the hash:
    $site      = $site{$x}      || "0";
    $specimen  = $specimen{$x}  || "0";
    $procedure = $procedure{$x} || "0";

    ## Do not go on if any were not found:
    next unless ($site && $specimen && $procedure);

    ## Grab the results of assigncode:
    $code=&assigncode($site,$specimen,$procedure);

    ## Exit the while loop if a good $code is found:
    last LINE if $code;
  }
}
close(INFILE);

print "$code " if $code;
print "$line\n";
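To see what the split in that loop hands back, here is a small sketch run on a made-up line. The sample text is an assumption, since the real report format is not shown in the thread. Because the pattern is /\d*\./, the line is cut at every period, along with any digits directly in front of it:

```perl
use strict;
use warnings;

# Hypothetical report line; the real data is not shown in the thread.
my $line = "1. skin biopsy left arm 2. bone marrow aspirate";

$line =~ tr/A-Z/a-z/;   # lowercase everything
$line =~ s/\s+/ /g;     # collapse runs of whitespace into one space

# Cut at each period, swallowing any digits right before it.
my @pieces = split(/\d*\./, $line);

# The first piece is empty because the line starts with a delimiter.
print "[$_]\n" for @pieces;
```

Note that each piece keeps its surrounding spaces, so an exact lookup such as $site{$x} only succeeds if the hash keys are stored in that same form, or if the piece is trimmed first.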

Notes:

You don't need all the subs. Looks like you are coming from a C background?
The whitespace substitution needs a 'g' on the end
All the things that btrott noted above
The subroutines were missing closing brackets
Parentheses on 'my' must be to the left of the equal sign:
my(@sample)=@_;
You don't need to sort the keys if you are just looping through to look for a specific value, unless the matches are more likely to be found at the start of the alphabet.
Adding a chop (or, more safely, a chomp, which only removes a trailing newline) is probably a good idea, or you get a newline on the final value returned from the split.
The above code assumes that you have &assigncode, $line, and the %site, %specimen, and %procedure hashes defined elsewhere.
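On the point about parentheses with 'my': what they change is whether the assignment happens in list or scalar context. A minimal sketch, using hypothetical subs that are not part of the original script:

```perl
use strict;
use warnings;

# Hypothetical subs, only to show the effect of the parentheses.
sub first_arg {
    my ($first) = @_;    # list assignment: $first gets the first argument
    return $first;
}

sub arg_count {
    my $count = @_;      # scalar assignment: $count gets the number of arguments
    return $count;
}

sub all_args {
    my (@sample) = @_;   # copies every argument, same as "my @sample = @_;"
    return @sample;
}

print first_arg("a", "b", "c"), "\n";   # prints "a"
print arg_count("a", "b", "c"), "\n";   # prints "3"
```

For an array the parentheses are optional, but for a scalar they decide whether you get the first argument or the argument count, which is an easy thing to trip over.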

thanks for the help
by g man (Initiate) on Apr 26, 2000 at 06:26 UTC

Clearer Explanation (I hope):

Given:
  a hash table with %procedure, %specimen, %site
  a text file containing medical reports (one report per line) which contain multiple procedures or specimens

Write a perl program that does the following:
  read in one line at a time, find all procedures and associated site and specimen
  if found, give it a code (for now just the words)
  if nothing found on the line, print the line to a different place to keep track of what lines are not being processed
  go to next line

there may be more than one procedure or specimen per line
there is a problem in interpreting some data because these words can appear in different order, usage, context (this is an aside)
a simple way to look for multiple entries at this time is matching for 1) or a) or a: or 1: , can be any number or letter really

weekly reports are generated, and i took an educated guess about the types of words found and how they occur in these reports
i want to put these three simple concepts together to create codes, these codes are commonly occurring groups of words which represent some concept
if this program works correctly, it should identify the type of sample, and disregard "junk"
if someone comes in and asks do you have thus and such, i want to say thus and such (pun intended)

sorry about the long winded explanation, i am having trouble even writing a basic program which works
additional complexity will come in trying to match for terms and looking for effective strategies to identify in what order things could appear
at present, i would just be happy with something that spits out at least the first occurrence of the sample with the basic info

thanks again for your assistance, your code seemed a little easier to follow than what i copied from someone else and did not fully understand
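The steps described above could be sketched roughly as follows. This is only one possible reading of the requirements. The hash contents, the sample lines, and the helper names first_match and codes_for_line are all made up for illustration, and the "code" is just the three matched words joined together:

```perl
use strict;
use warnings;

# Hypothetical lookup tables; the real %site, %specimen and %procedure
# contents are not shown in the thread.
my %site      = (arm => 1, colon => 1);
my %specimen  = (skin => 1, polyp => 1);
my %procedure = (biopsy => 1, excision => 1);

# Return the first word of $entry that is a key of the given hash, or "".
sub first_match {
    my ($entry, $table) = @_;
    for my $word (split ' ', $entry) {
        return $word if exists $table->{$word};
    }
    return "";
}

# Return one "site/specimen/procedure" string for each entry on the line
# in which all three kinds of term were found.
sub codes_for_line {
    my ($line) = @_;
    $line = lc $line;
    my @codes;
    # Split on single-character markers such as "1)", "a)", "a:" or "1:".
    for my $entry (split /\b[a-z0-9][):]/, $line) {
        my $site      = first_match($entry, \%site);
        my $specimen  = first_match($entry, \%specimen);
        my $procedure = first_match($entry, \%procedure);
        next unless $site && $specimen && $procedure;
        push @codes, "$site/$specimen/$procedure";
    }
    return @codes;
}

# Made-up sample lines: coded lines go to STDOUT, the rest to STDERR.
for my $line ("1) skin biopsy of left arm 2) polyp excision from colon",
              "patient seen in clinic") {
    my @codes = codes_for_line($line);
    if (@codes) { print "@codes\n" }
    else        { print STDERR "UNMATCHED: $line\n" }
}
```

Matching whole words this way ignores word order within an entry, which sidesteps the ordering problem for now, but it will miss words with punctuation attached (for example "arm,") and multi-word terms.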
