It's best to enclose code in <CODE>..</CODE> tags, so that the square brackets aren't turned into links and it looks like this:
    $dir = 'C:/texts/';
    opendir(directory, $dir) or die "cant";
    while ($file = readdir directory) {
        next if $file =~ /^\./;    # skip dot files
        $rfname = $dir . $file;
        # print "Found file: '$rfname'\n";
        open(CONT, $rfname);
        while (<CONT>) {
            # day month year, e.g. "3rd March 2001"
            if ($_ =~ m/[0-3]?[0-9](th)?(st)?(nd)?(rd)?\s+(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+[0-9]?[0-9]?[0-9][0-9]/ig) {
                print "$file\t $_\n";
            }
            # month day year, e.g. "March 3rd 2001"
            elsif ($_ =~ m/(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+[1-3]?[0-9](th)?(nd)?(st)?(rd)?\s+[0-9]?[0-9]?[0-9][0-9]/ig) {
                print "$file\t $_\n";
            }
        }
    }
Looking at your code, it prints out the name of the file and the complete line when the line matches the regular expression. If that's not what you want then you'll need to capture part of the match using brackets and print the value of $1, not $_.
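As a rough sketch of the capturing approach (not a drop-in replacement for your script), something like the following wraps the whole date pattern in one pair of brackets and returns $1. The sub name extract_date and the sample lines are made up for illustration, and I've assumed that every inner group can be non-capturing, (?:...), so that $1 always holds the complete date.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the original post: returns the first
# date-like substring found in a line, or undef if there is none.
sub extract_date {
    my ($line) = @_;

    # Inner groups are non-capturing so that $1 is always the whole date.
    my $month = qr/(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)/i;
    my $day   = qr/[0-3]?[0-9](?:st|nd|rd|th)?/i;
    my $year  = qr/[0-9]{2,4}/;

    # The outer brackets capture either "day month year" or "month day year".
    if ($line =~ /\b($day\s+$month\s+$year|$month\s+$day\s+$year)\b/) {
        return $1;
    }
    return;
}

# Sample lines invented for this example.
for my $line ("Meeting on 3rd March 2001 at noon",
              "Due Jan 15th 1999.",
              "no date here") {
    my $date = extract_date($line);
    print defined $date ? "$date\n" : "(no match)\n";
}
```

In your loop you would then print "$file\t$date\n" instead of "$file\t $_\n" whenever a date is found.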
In reply to Re: Problem extracting date with regex
by davorg
in thread Problem extracting date with regex
by Anonymous Monk
| For: | Use: |
| & | &amp; |
| < | &lt; |
| > | &gt; |
| [ | &#91; |
| ] | &#93; |