
If I am reading hipowls's regex correctly, it checks the negative look-ahead at every character between 'start' and 'end'. Doing the look-ahead just once, immediately after 'start', is enough to locate the last 'start' in a group; the .+? can then run without re-checking after every character.

use strict;
use warnings;

my $string = <<'EOT';
start
start
start
go
one
end
start
start
start
go
two
end
EOT

my $rxGroup = qr
    {(?isx)
        (
            start
            (?!\nstart)
            .+?
            end
        )
    };

print qq{$1\n\n} while $string =~ m{$rxGroup}g;

The output.

start
go
one
end

start
go
two
end

I hope I am correct and this slight change will speed up your code.
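
One way to check the speed claim is with the core Benchmark module. The sketch below is only an illustration under stated assumptions: the "tempered" pattern is my guess at a regex that tests the look-ahead before every character (it is not copied from hipowls's post), and the data is the sample above repeated 200 times, which is an arbitrary choice. Results will vary with the data and the Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Sample data: the twelve lines from the example, repeated (200 is arbitrary).
my $block  = join( "\n", qw(start start start go one end start start start go two end) ) . "\n";
my $string = $block x 200;

# Assumed stand-in for a regex that checks the look-ahead before every character.
my $rxTempered = qr{(start(?:(?!start).)+?end)}is;

# Look-ahead checked once, immediately after 'start'.
my $rxOnce = qr{(start(?!\nstart).+?end)}is;

# Collect every captured group for a given pattern.
sub groups {
    my ( $rx, $text ) = @_;
    my @found;
    push @found, $1 while $text =~ m{$rx}g;
    return @found;
}

# Sanity check: both patterns should capture the same groups on this data.
my @tempered = groups( $rxTempered, $string );
my @once     = groups( $rxOnce,     $string );
print scalar(@tempered), ' groups vs ', scalar(@once), " groups\n";

# Run each pattern for at least one CPU second and compare the rates.
cmpthese( -1, {
    tempered => sub { my @g = groups( $rxTempered, $string ) },
    once     => sub { my @g = groups( $rxOnce,     $string ) },
} );
```

The sanity check matters as much as the timing: a faster regex is only useful if it still captures the same groups on your real data.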

Cheers,

JohnGG

Re^4: Multiple regex matches in single string
by hipowls (Curate) on Apr 26, 2008 at 23:33 UTC

    That assumes that the starts are on consecutive lines, which may be a perfectly valid assumption. It pays to know your data.
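
    To illustrate that assumption, here is a small sketch. The sample input, with a 'data' line between two starts, is made up for the demonstration. Because the look-ahead only rejects a 'start' that is immediately followed by a newline and another 'start', the second 'start' below is accepted and the captured group still contains an inner 'start'.

    ```perl
    use strict;
    use warnings;

    # Hypothetical input: the starts are not all on consecutive lines.
    my $string = join( "\n", qw(start start data start go two end) ) . "\n";

    # The look-ahead-once regex from the parent post, written without /x.
    my $rxGroup = qr{(start(?!\nstart).+?end)}is;

    my @groups;
    push @groups, $1 while $string =~ m{$rxGroup}g;

    # Prints a single group: start, data, start, go, two, end (one word per line),
    # so an extra 'start' survives inside the captured text.
    print qq{$_\n\n} for @groups;
    ```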

    Another approach is to use the original regex, whose match may contain multiple starts, and then trim the match using s/^.*start/start/is.

    The loop then looks something like

    while ( $string =~ /(start.+?end)/gis ) {
        my $data = $1;
        $data =~ s/^.*start/start/is;
        print $data, "\n\n";
    }
    If the intent is to strip off multiple starts only on consecutive lines, then the regex would be s/^(?:start\s*)+start/start/is, which, used on the input

    start
    start
    start
    go
    one
    end
    start
    start
    data
    start
    go
    two
    end

    would produce

    start
    go
    one
    end

    start
    data
    start
    go
    two
    end

    But, as I said, you really need to know your data, along with other factors such as whether you need to validate the input.
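
    For comparison, here is a sketch that applies both substitutions to the same captured text. The helper names trim_to_last_start and trim_leading_starts are mine, chosen for illustration, and do not come from any module. The greedy .* in the first substitution backtracks from the end of the string, so it keeps only the final 'start' and discards the 'data' line as well; the second removes only the run of starts at the very beginning.

    ```perl
    use strict;
    use warnings;

    # Keep only the text from the last 'start' onwards.
    sub trim_to_last_start {
        my ($data) = @_;
        $data =~ s/^.*start/start/is;
        return $data;
    }

    # Collapse only a leading run of whitespace-separated starts into one.
    sub trim_leading_starts {
        my ($data) = @_;
        $data =~ s/^(?:start\s*)+start/start/is;
        return $data;
    }

    my $string = join( "\n",
        qw(start start start go one end start start data start go two end) ) . "\n";

    while ( $string =~ /(start.+?end)/gis ) {
        my $group = $1;    # copy $1 before running further regexes
        print "last start only:\n",     trim_to_last_start($group),  "\n\n";
        print "leading starts only:\n", trim_leading_starts($group), "\n\n";
    }
    ```

    Which of the two is right depends on whether a line such as 'data' between starts is meaningful in your input or just noise.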