in reply to Problem extracting date with regex

Best to enclose this stuff in <CODE>..</CODE> tags so it looks like this...

$dir='C:/texts/';
opendir(directory,$dir) or die "cant";
while($file=readdir directory){
    next if $file=~/^\./;
    $rfname=$dir.$file;
    # print "Found file: '$rfname'\n";
    open (CONT, $rfname);
    while (<CONT>){
        if($_=~m/[0-3]?[0-9(th)?(st)?(nd)?(rd)?]\s+(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+[0-9]?[0-9]?[0-9][0-9]/ig){
            print "$file\t $_\n";
        }
        elsif($_=~m/(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+[1-3]?[0-9](th)?(nd)?(st)?(rd)?\s+[0-9]?[0-9]?[0-9][0-9]/ig){
            print "$file\t $_\n";
        }
    }
}

Your code prints the name of the file and the complete line whenever the line matches the regular expression. If that's not what you want, you'll need to capture the part of the match you're after using parentheses and print the value of $1, not $_.
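Something along these lines might do it. This is only a sketch: extract_date, $month and $suffix are names I've made up for illustration, and I'm assuming you want the whole date (day, month and year) back as one string. The outer parentheses are what put the date into $1; the inner groups use (?:...) so they don't capture anything.

```perl
use strict;
use warnings;

# Month alternatives are built from a list so no single line gets too long.
my $month = join '|', qw(
    Jan(?:uary)?   Feb(?:ruary)? Mar(?:ch)?    Apr(?:il)?
    May            June?         July?         Aug(?:ust)?
    Sep(?:tember)? Oct(?:ober)?  Nov(?:ember)? Dec(?:ember)?
);

# Optional ordinal suffix, kept outside any character class.
my $suffix = '(?:st|nd|rd|th)';

# Hypothetical helper: returns the date found in $line, or nothing at all.
sub extract_date {
    my ($line) = @_;

    # Day first, e.g. "22nd June 2000".
    return $1
        if $line =~ /\b([0-3]?[0-9]$suffix?\s+(?:$month)\s+[0-9]{2,4})\b/i;

    # Month first, e.g. "June 22nd 2000".
    return $1
        if $line =~ /\b((?:$month)\s+[0-3]?[0-9]$suffix?\s+[0-9]{2,4})\b/i;

    return;
}

print extract_date("Posted on 22nd June 2000 at noon"), "\n";
```

In your loop you would then print "$file\t$date\n" for whatever extract_date($_) hands back, instead of printing $_.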


--
http://www.dave.org.uk

European Perl Conference - Sept 22/24 2000
http://www.yapc.org/Europe/

RE: Re: Problem extracting date with regex
by Adam (Vicar) on Jun 22, 2000 at 05:21 UTC
    Please DON'T do that.
    When using the <code> tags, or <pre> tags for that matter, please avoid long lines. They mess up the whole page for everyone. Thank you.
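One way to keep the lines short is the /x modifier, which lets a regex be spread over several lines with whitespace and comments. A rough sketch, not a drop-in replacement: $date_re and split_date are made-up names, and the month is matched loosely as any run of letters just to keep the example small.

```perl
use strict;
use warnings;

# With x, whitespace and "#" comments inside the pattern are ignored.
my $date_re = qr/
    \b
    ( [0-3]? [0-9] )            # day of the month
    (?: st | nd | rd | th )?    # optional ordinal suffix
    \s+
    ( [A-Za-z]+ )               # month name, deliberately loose here
    \s+
    ( [0-9]{2,4} )              # two- to four-digit year
    \b
/x;

# Hypothetical helper: returns (day, month, year), or an empty list
# when the line contains nothing that looks like a date.
sub split_date {
    my ($line) = @_;
    return $line =~ $date_re;
}

my @parts = split_date("on 22nd June 2000");
print join(' ', @parts), "\n";
```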