in reply to How to Extract Nested IF/THEN elements

Arbitrarily nested structures can't be matched by proper regular expressions. Luckily Perl's regular expression engine isn't all that regular, so it probably is possible to do this. (japhy knows how, I'm sure.)
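
For the curious, here is a minimal sketch of what the regex-only route might look like. It assumes Perl 5.10 or later (for named recursion with (?&name)) and assumes that IF and ENDIF only ever appear as whole words; neither assumption comes from the original question, and the variable names are just for illustration.

```perl
use strict;
use warnings;
use 5.010;    # assumption: recursive patterns need Perl 5.10 or later

# One IF ... ENDIF block whose body may contain further blocks.
my $block = qr{
    (?<block>
        \bIF\b
        (?:
            (?&block)                  # a nested IF ... ENDIF block
          |
            (?! \b(?:IF|ENDIF)\b ) .   # any character that does not start a keyword
        )*
        \bENDIF\b
    )
}xs;

my $text = "IF(A) anytext IF(B) IF(C) anytext ENDIF IF(D) anytext ENDIF ENDIF ENDIF";

# The lookahead makes every match zero-width, so the regex is retried at
# each position and nested (overlapping) blocks are captured as well.
my @all = $text =~ /(?=$block)/g;

say for @all;    # one block per line, in order of starting position
```

Dropping the lookahead (matching /$block/g directly) would return only the outermost blocks.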

However, it would be so much easier to do it with Text::Balanced or Parse::RecDescent that it's unlikely anyone would try. Here is a solution using Text::Balanced:

use Text::Balanced 'extract_tagged';
use Data::Dumper;

# Collect every IF ... ENDIF block in $str, nested ones included.
sub get_ifs {
    my $str  = shift;
    my $list = shift || [];

    # Skip any text before the first IF, then extract one balanced block.
    my ($extracted, $remainder, $prefix, $open, $inside, $close)
        = extract_tagged($str, "IF", "ENDIF", "(?s).*?(?=IF)");

    if ($extracted) {
        push @$list, $extracted;
        get_ifs($inside, $list);                     # blocks nested inside this one
        get_ifs($remainder, $list) if $remainder;    # blocks following this one
    }
    return $list;
}

print Dumper(get_ifs(<<ENDOFIFS));
IF(A)
  anytext
  IF(B)
    IF(C)
      anytext
    ENDIF
    IF(D)
      anytext
    ENDIF
  ENDIF
ENDIF
ENDOFIFS

Replies are listed 'Best First'.
(Clarification of) How to Extract Nested IF/THEN elements
by demerphq (Chancellor) on Jan 24, 2002 at 22:01 UTC
    What my first sentence meant to say was:

    A formal regular expression can only match a structure nested to a fixed, predetermined depth. Thus if we wanted to match to a depth of, say, 6 we could do it, but if we want to match to any depth a formal regex won't work. And as I said above, Perl's regexes aren't formal regexes. They are decidedly irregular, as they have support for backrefs.
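
    To illustrate the fixed-depth case, here is a rough sketch that builds such a pattern by wrapping one level around the previous one. The helper name if_block_regex is made up for this example, and it assumes IF and ENDIF only appear as whole words.

    ```perl
    use strict;
    use warnings;

    # Build a pattern for IF ... ENDIF blocks nested at most $depth levels deep.
    # (if_block_regex is a hypothetical helper, not part of any module.)
    sub if_block_regex {
        my ($depth) = @_;

        # One character that does not start an IF or ENDIF keyword.
        my $plain = qr/(?:(?!\b(?:IF|ENDIF)\b).)/s;

        # Depth 1: a block with no nested blocks in its body.
        my $re = qr/\bIF\b$plain*\bENDIF\b/;

        # Each pass allows one more level of nesting around the previous pattern.
        for (2 .. $depth) {
            $re = qr/\bIF\b(?:$re|$plain)*\bENDIF\b/;
        }
        return $re;
    }

    my $text = "IF(A) anytext IF(B) IF(C) anytext ENDIF IF(D) anytext ENDIF ENDIF ENDIF";

    for my $depth (1 .. 4) {
        my $re = if_block_regex($depth);
        my $ok = $text =~ /\A$re\z/ ? "matches" : "does not match";
        print "depth $depth: $ok\n";
    }
    ```

    The sample text is nested three levels deep, so only patterns built for depth 3 or more can match it as a whole. The pattern has to grow with every extra level, which is why no single formal regex covers arbitrary depth.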

    Yves / DeMerphq
    --
    When to use Prototypes?

Re: Re: How to Extract Nested IF/THEN elements
by Al Shiferaw (Initiate) on Jan 25, 2002 at 02:54 UTC
    Thanks!