krish28 has asked for the wisdom of the Perl Monks concerning the following question:

Hey folks
I am trying to read a text file with recurring occurrences of a pattern using a regex, and then I try to catch just the last occurrence of this pattern in the text file.
My question is whether there is any particular regex option that would match an occurrence of a recurring pattern at the very end of the input file?
Thanks
Krish

Replies are listed 'Best First'.
Re: Matching Regular expression at EOF
by jwkrahn (Abbot) on Feb 20, 2010 at 20:36 UTC

    Use File::ReadBackwards to read the lines of the file and then stop at the first match.

    use File::ReadBackwards;

    tie *BW, 'File::ReadBackwards', 'file'
        or die "can't read 'file' $!";

    while ( <BW> ) {
        if ( /pattern/ ) {
            print;
            last;
        }
    }
Re: Matching Regular expression at EOF
by ikegami (Patriarch) on Feb 20, 2010 at 19:01 UTC
    From the sounds of it, the entire file is in one variable? If so,
    /^.*pat/s
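    A minimal sketch of how the greedy prefix picks out the last occurrence, assuming the whole file really is in one string. The sample content, the `id=\d+` pattern and the variable names are made up for illustration. The `.*` first runs to the end of the string and then backs off until the capture group can match, and the `/s` flag is what lets it cross newlines.

```perl
use strict;
use warnings;

# Hypothetical file content, slurped into a single string.
my $content = "id=1 first\nid=22 second\nid=333 third\ntrailer\n";

# With /s, .* can cross newlines, so it backs off from the very end
# of the string and $1 is the last occurrence in the whole content.
my ($last) = $content =~ /^.*(id=\d+)/s;

# Without /s, .* stops at the first newline, so only the first line
# is searched.
my ($first_line_only) = $content =~ /^.*(id=\d+)/;

print "with /s:    $last\n";             # with /s:    id=333
print "without /s: $first_line_only\n";  # without /s: id=1
```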

      Wouldn't anchoring the pattern at the end of the string with  \z or one of its ilk (e.g.,  m{ pat \z }xms) tend to be faster since the  .* doesn't need to 'consume' virtually the whole file? Or is this the kind of thing that just gets optimized away?

      Update: Contrary to krish28's clarification in Re^2: Matching Regular expression at EOF, this question assumes the entire file is held in a single string. But the question still stands.

        He wants the last instance of the pattern in the file. It might not be at the very end of the file.

        As for speed, (?s:.*) jumps right to the end of the file. (?s:.) matches anything, so no check needs to be made (and none is done).

        Anchoring at the end of the string only matches if there's an actual match at the end of the string. But the last match doesn't imply it's at the end of the string. For instance, in the string below, there are two matches for /a.b/, one of them the last one, but none for /a.b\z/.
        "123 a!b 456 a?b 789"
      No... it's not in a single variable... I am opening the file and reading it line by line...
        Then I assume the match doesn't span more than one line?
        my @match;
        while (<>) {
            my @caps = /pat/ or next;
            @match = @caps;
        }
        if (@match) {
            print("Captured @match\n");
        } else {
            die("No match\n");
        }
        or
        my $match;
        while (<>) {
            $match = $_ if /pat/;
        }
        if (defined($match)) {
            print("Matched $match");
        } else {
            die("No match\n");
        }

        All this guessing is leading to suboptimal solutions and wasted work. If this is still not good, please provide more info about your problem.
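        For anyone who wants to try the line-by-line approach without a real file, here is a self-contained sketch. The helper name `last_matching_line`, the sample data and the `foo` pattern are hypothetical, and an in-memory filehandle stands in for the file. It still assumes a match never spans more than one line.

```perl
use strict;
use warnings;

# Hypothetical helper: return the last line read from $fh that matches
# $re, or undef if no line matches.
sub last_matching_line {
    my ($fh, $re) = @_;
    my $match;
    while ( my $line = <$fh> ) {
        # Overwrite on every hit; the final value is the last match.
        $match = $line if $line =~ $re;
    }
    return $match;
}

# An in-memory filehandle stands in for the real file here.
my $data = "foo 1\nbar 2\nfoo 3\nbaz 4\n";
open my $fh, '<', \$data or die "can't open in-memory file: $!";

my $last = last_matching_line($fh, qr/foo/);
print defined $last ? "Matched $last" : "No match\n";  # Matched foo 3
```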

        No... it's not in a single variable... I am opening the file and reading it line by line...

        Try:

        my $last_line = undef;
        while( <$fh> ){
          $last_line = $_;
          # process file contents, if needed
        }
        if( defined $last_line ){
          chomp $last_line;  # a line read from the file normally ends in "\n", which \z does not skip over
          $last_line =~ m/$pat\z/;
        }else{
          warn "file is empty\n";
        }
        
Re: Matching Regular expression at EOF
by JavaFan (Canon) on Feb 22, 2010 at 16:48 UTC
    If the file isn't huge, simplicity is the way to go:
    my $last = (map {/^.*(pat)/} <>)[-1];
    Or maybe the file is huge, but the pattern is simple. Then there's still a simple solution:
    my ($last) = (`grep pat file`)[-1] =~ /^.*(pat)/;
    Or
    use autodie;
    open my $fh, "<", "file";
    my $last;
    while (<$fh>) {
        $last = $1 if /^.*(pat)/;
    }
Re: Matching Regular expression at EOF
by JavaFan (Canon) on Feb 22, 2010 at 16:54 UTC
    BTW, what do you consider the "last occurrence"? Given the pattern /(.).*\g{1}/ and the string "{abcdcba}", do you consider "abcdcba" to be the last, or "cdc"? The former has the last finish of the pattern, the latter the last start of the pattern.
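    To make the two readings concrete, here is a small sketch using that pattern and string. The pattern is wrapped in an extra capture group so the whole match can be returned, which renumbers the backreference to `\g{2}`. The variable names are only illustrative. A plain match returns the leftmost match, which in this particular string happens to be the one that finishes last, while a greedy `^.*` prefix returns the match that starts last.

```perl
use strict;
use warnings;

my $str = "{abcdcba}";

# Plain match: the engine returns the leftmost match, which in this
# string is also the one that finishes last.
my ($leftmost) = $str =~ /((.).*\g{2})/;

# Greedy prefix: ^.* pushes the start of the captured part as far to
# the right as possible, so this picks the match with the last start.
my ($last_start) = $str =~ /^.*((.).*\g{2})/;

print "leftmost:   $leftmost\n";    # leftmost:   abcdcba
print "last start: $last_start\n";  # last start: cdc
```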