in reply to Re^4: REGEX for date
in thread REGEX for date

Are you after this line?

FISCAL YEAR END: 1231
That looks nothing like your example, but it is the only "fiscal" line I find in the test list of @aonly that I used.

And remember when I said I was concerned about qr/\'\n'/? I'm even more concerned now. I suggest replacing

for my $line (split qr/\'\n'/, $partial) {
with
for my $line (split qr/\n/, $partial) {
That splits on pure line endings and doesn't require a quote before and after the \n. It makes adding a line like
if($line=~m/\s*fiscal/mi ) {print $line."\n";}
much more manageable. As far as I can see it doesn't affect the rest of your code, except that it may make the m modifier in /mi unnecessary.
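To show the difference between the two splits, here is a minimal sketch. The contents of $partial are made-up sample text, not taken from your data, so treat the counts as illustrative only.

```perl
use strict;
use warnings;

# Hypothetical sample text; a real filing will look different.
my $partial = "CONFORMED PERIOD OF REPORT: 20151231\n"
            . "FISCAL YEAR END: 1231\n"
            . "FILED AS OF DATE: 20160301\n";

# Splits only where a newline sits between two single quotes ('\n'),
# which never happens in this sample, so everything stays in one chunk.
my @quoted = split qr/\'\n'/, $partial;

# Splits on every newline, giving one element per line.
my @lines = split qr/\n/, $partial;

printf "quote-newline-quote split: %d chunk(s)\n", scalar @quoted;
printf "plain newline split: %d line(s)\n", scalar @lines;

# With one line per element, a simple per-line test is enough.
my @fiscal = grep { /\s*fiscal/i } @lines;
print "$_\n" for @fiscal;
```

With this sample the first split returns a single chunk and the second returns three lines, of which only the FISCAL YEAR END line passes the test.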

Re^6: REGEX for date
by wrkrbeee (Scribe) on Mar 06, 2017 at 21:31 UTC
    A few more lines down from the line you mention is what I'm after. The following is working pretty well, but missing a few hits here and there (i.e., no result):
    if($line=~m/\s+((January|February|March|April|May|June|July|August|September|October|November|December)\s+\d+,\s+\d+)/i) {$res->{fiscal_year_ended} =$1;}
    But I'm getting there. Grateful for your help!!!!

      It may help if you paste some example lines from @aonly that show what you are after. My test group was just pulled from a simple EDGAR search. This page, https://www.sec.gov/Archives/edgar/data/1540531/0000905718-16-001254.txt, does not contain lines of that type.

      My regex will capture anything that looks like a date; the match may have nothing to do with the fiscal year end.

      Edit: for instance, https://www.sec.gov/Archives/edgar/data/1084869/0001437749-16-024828.txt shows a date match on the line

      <P id=PARA12 style="MARGIN-BOTTOM: 0px; TEXT-ALIGN: center; MARGIN-TOP: 0px; LINE-HEIGHT: 1.25"><FONT style="FONT-SIZE: 10pt; FONT-FAMILY: Times New Roman, Times, serif"><B>For the quarterly period ended December 27, 2015</B></FONT></P>

      And I repeat my concern: qr/\'\n'/ should be qr/\n/. Without that change the lines can be very long, and my regex has no ^ for the /m to anchor on.
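One way to keep unrelated dates out of the result is to require some context in front of the date. The sketch below is only an illustration: it assumes the filings you want phrase it as "fiscal year ended <Month> <day>, <year>", which may not hold for every document. The sample lines and the %res hash are hypothetical stand-ins for your own data.

```perl
use strict;
use warnings;

# Month names spelled out in full, matched case-insensitively.
my $months = qr/January|February|March|April|May|June|July|August|September|October|November|December/i;
my $date   = qr/(?:$months)\s+\d{1,2},\s+\d{4}/;

# Hypothetical sample lines; the first is modelled on the quarterly line above.
my @lines = (
    '<P><B>For the quarterly period ended December 27, 2015</B></P>',
    '<P>For the fiscal year ended December 31, 2015</P>',
);

my %res;
for my $line (@lines) {
    # Crude tag strip, good enough for a sketch but not a real HTML parser.
    (my $text = $line) =~ s/<[^>]+>//g;

    # Only accept a date that directly follows "fiscal year ended".
    if ($text =~ /fiscal\s+year\s+ended\s+($date)/i) {
        $res{fiscal_year_ended} = $1;
    }
}

print "$res{fiscal_year_ended}\n" if defined $res{fiscal_year_ended};
```

Here the quarterly line is skipped and only the date on the "fiscal year ended" line is stored. If your filings word it differently, the phrase in front of ($date) is the part to adjust.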

Re^6: REGEX for date
by wrkrbeee (Scribe) on Mar 06, 2017 at 21:35 UTC
    I can work it from here huck!! Thank you so much!!!!