Thank you very much for your input, and sorry for the typo; one parenthesis was missing from the code. My text files are operative notes, and each note consists of sections. A section starts with a title at the beginning of a line, all in upper case and ending in a colon. Sections are usually separated by an empty line, although this is not always the case. The input directory contains 1000 files, and my intention is to write each file back to an output directory with only the designated matched sections (title + content). As recommended, adding a while loop around my matching regex seems to have fixed the issue, but please do advise me if you find other issues in the code. I seldom write code, but since I am working with text files, regular expressions are a very powerful help for occasional data extraction. I am sure there are much easier ways to code what I coded below. This is a sample input file:
PREOPERATIVE DIAGNOSIS: Left invasive cancer, positive margins.
TITLE OF OPERATION:
1. Left needle-localized segmental mastectomy.
2. intraoperative axillary lymphatic mapping.
3. lymphadenectomy.
ANESTHESIA: General.
INDICATIONS FOR SURGERY: Invasive carcinoma with positive margins and residual calcifications.
COMPLICATIONS : None.
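This is the code:
#!/usr/bin/perl
use strict;
use warnings;

my $indir  = 'C:/input';
my $outdir = 'C:/output';

opendir(my $dh, $indir) or die "can't open $indir: $!";
while (defined(my $file = readdir($dh))) {
    my $fullpath = "$indir/$file";
    next unless -f $fullpath;    # skip '.', '..' and subdirectories

    # Slurp the whole note so that a section can span several lines.
    open(my $in, '<', $fullpath) or die "can't read $fullpath: $!";
    my $string = do { local $/; <$in> } // '';
    close $in;

    print "processing $file\n";
    open(my $out, '>', "$outdir/$file") or die "can't write $outdir/$file: $!";
    # A section runs from its title to the next upper-case title or end of file.
    while ($string =~ m/^(FINDINGS|COMPLICATIONS)\s*(:)(.*?)(?=^[A-Z][A-Z ]*:|\z)/sgm) {
        print $out "$1$2\t$3";
    }
    close $out;
}
closedir($dh);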
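Since each section starts with an upper-case title ending in a colon, another way to look at the problem is to split a note into sections first and then keep only the wanted titles. This is only a minimal sketch under that assumption (titles made of upper-case letters and spaces, optionally with a space before the colon). The helper name `extract_sections` and the titles passed to it are made up for illustration, and the note is the sample file from above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the wanted sections (title + content) of one note.
sub extract_sections {
    my ($text, @wanted) = @_;
    my %want = map { $_ => 1 } @wanted;
    my @kept;
    # Split before every line that starts with an upper-case title and a colon.
    for my $section (split /^(?=[A-Z][A-Z ]*:)/m, $text) {
        # Capture the title, tolerating a stray space before the colon.
        my ($title) = $section =~ /^([A-Z][A-Z ]*?)\s*:/ or next;
        push @kept, $section if $want{$title};
    }
    return @kept;
}

# The sample note from this post, used as test data.
my $note = <<'END';
PREOPERATIVE DIAGNOSIS: Left invasive cancer, positive margins.
TITLE OF OPERATION:
1. Left needle-localized segmental mastectomy.
2. intraoperative axillary lymphatic mapping.
3. lymphadenectomy.
ANESTHESIA: General.
INDICATIONS FOR SURGERY: Invasive carcinoma with positive margins and residual calcifications.
COMPLICATIONS : None.
END

my @found = extract_sections($note, 'TITLE OF OPERATION', 'COMPLICATIONS');
print @found;
```

Splitting first keeps the list of wanted titles out of the regex, so adding another title only means adding another argument. A content line that happens to be all upper case and ends in a colon would still be taken for a title, so real notes may need a stricter pattern.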
In reply to Re^2: multiple OR match fails
by zzgulu
in thread multiple OR match fails
by zzgulu