in reply to Re^5: REGEX for date
in thread REGEX for date

A few more lines down from the line you mention is what I'm after. The following is working pretty well, but missing a few hits here and there (i.e., no result):
if($line=~m/\s+((January|February|March|April|May|June|July|August|September|October|November|December)\s+\d+,\s+\d+)/i) {$res->{fiscal_year_ended} =$1;}
But I'm getting there. Grateful for your help!!!!
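
A minimal sketch of the same match with the month alternation built from a list and compiled once with qr//, which makes the names easier to proofread. The helper name fiscal_year_ended() and the sample lines are hypothetical; only $res->{fiscal_year_ended} comes from the post above.

```perl
use strict;
use warnings;

# Month names kept in a list so a misspelling is easy to spot.
my @months = qw(January February March April May June July
                August September October November December);
my $month_re = join '|', @months;

# Same shape as the pattern above: whitespace, month, day, comma, year.
# (?:...) keeps the month group non-capturing, so $1 is the whole date.
my $date_re = qr/\s+((?:$month_re)\s+\d+,\s+\d+)/i;

# Hypothetical helper: returns the matched date string, or undef.
sub fiscal_year_ended {
    my ($line) = @_;
    return $line =~ $date_re ? $1 : undef;
}

# Hypothetical sample line, not taken from any real filing.
my $res  = {};
my $line = 'For the fiscal year ended December 31, 2015';
$res->{fiscal_year_ended} = fiscal_year_ended($line);
print "$res->{fiscal_year_ended}\n";    # prints "December 31, 2015"
```

Note that the leading \s+ requires whitespace before the month name, so a date at the very start of a line or directly after a tag such as <B> will not match. Whether that accounts for any of the missed hits is only a guess without seeing the failing lines.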

Re^7: REGEX for date
by huck (Prior) on Mar 06, 2017 at 21:41 UTC

    It may help if you paste some example lines from @aonly that show what you are after; my test group was just pulled from a simple EDGAR search. This page https://www.sec.gov/Archives/edgar/data/1540531/0000905718-16-001254.txt does not contain lines of that type.

    My regex will capture anything that looks like a date, whether or not it has anything to do with the fiscal year end.

    Edit: for instance, https://www.sec.gov/Archives/edgar/data/1084869/0001437749-16-024828.txt shows a date match on the line:

    <P id=PARA12 style="MARGIN-BOTTOM: 0px; TEXT-ALIGN: center; MARGIN-TOP: 0px; LINE-HEIGHT: 1.25"><FONT style="FONT-SIZE: 10pt; FONT-FAMILY: Times New Roman, Times, serif"><B>For the quarterly period ended December 27, 2015</B></FONT></P>
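
To illustrate the point, here is a rough sketch comparing a generic date pattern with one that also requires a leading phrase. The phrase "fiscal year ended" is an assumption about how the wanted lines are worded, not something confirmed for every filing, and the second sample line is made up.

```perl
use strict;
use warnings;

my $months = join '|', qw(January February March April May June July
                          August September October November December);

# Matches any "Month D, YYYY" preceded by whitespace.
my $any_date    = qr/\s+((?:$months)\s+\d+,\s+\d+)/i;
# Assumed variant: only dates that follow "fiscal year ended".
my $fiscal_date = qr/fiscal\s+year\s+ended\s+((?:$months)\s+\d+,\s+\d+)/i;

my $quarterly = '<B>For the quarterly period ended December 27, 2015</B>';
my $annual    = 'For the fiscal year ended September 30, 2016';  # hypothetical

for my $line ($quarterly, $annual) {
    # In list context a failed match returns an empty list, leaving undef.
    my ($any)    = $line =~ $any_date;
    my ($fiscal) = $line =~ $fiscal_date;
    printf "any=%s fiscal=%s\n", $any // 'none', $fiscal // 'none';
}
```

The generic pattern picks up the quarterly date as well, while the restricted one only fires on the second line.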

    And I repeat my concern that qr/\'\n'/ should be qr/\n/. Without that change the lines can be very long, and my regex has no ^ for the /m to anchor on.
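
A small sketch of the difference, assuming the pattern is used with split to break the file into lines (the surrounding code is not shown in this part of the thread). The three-line $text is a made-up stand-in for a filing.

```perl
use strict;
use warnings;

# Hypothetical three-line document standing in for a filing.
my $text = "line one\nended December 27, 2015\nline three\n";

# qr/\n/ splits on every newline.
my @by_newline = split qr/\n/, $text;

# qr/\'\n'/ only matches a quote, a newline, then another quote,
# which never occurs here, so the text comes back as one long piece.
my @by_quoted = split qr/\'\n'/, $text;

printf "qr/\\n/     gives %d pieces\n", scalar @by_newline;
printf "qr/\\'\\n'/ gives %d pieces\n", scalar @by_quoted;
```

With the quoted form, each "line" handed to the date regex may in effect be the whole document.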