in reply to Template Parsing - Finding tag pairs.

My question is do you see any problem with the code?

Yes. For one, half of it is commented out, and there's waaay too much whitespace. But seriously, maintainability is key (and that is not hidden). It took me a little time to get this running, and hopefully I'll be the only one who has to (you should really provide a runnable code example in the future; it makes pitfalls easier to spot ;D). The only thing I'd do differently is take this logic out of the while condition (and use a few ifs and elses here and there).

#!/usr/bin/perl -wl
use strict;

# have to search a little differently for nested tags
# making sure that an ending tag belonging to
# a nested opening tag is not processed as the ending
# tag for the current opening tag.
# In case i sounded awkward, here's a little diagram:
#
# <cfif>            <--- this tag
#     <cfif>        <--- nested open tag
#     </cfif>       <--- end tag for the nested open tag
# </cfif>           <--- end tag for this tag (the one that has
#                        to be picked up)
#
# Actual example:
my @chunks = ("<cfif bool eq 1>\cJ\cI ",
              "<cfif foo = bar>\cJ\cI\cI ",
              "<cfif bar = foo>\cJ\cI\cI ",
              "</cfif>\cJ\cI ",
              "</cfif>\cJ\cI\cI \cJ\cI BOOL is true!\cJ",
              "<cfelse>\cJ\cIBOOL is false!\cJ",
              '</cfif>');

my $opening_tag = qr/\<cfif/;
my $closing_tag = qr/\<\/cfif/;
my $found_i = 0;
my $nested = 0; # count of nested open tags found.

while (
    (
        ($chunks[++$found_i] =~ m/^$closing_tag/)
        ? ( ($nested > 0) ? ($nested--) : (0) )
        : ( ($chunks[$found_i] =~ m/^$opening_tag/) ? (++$nested) : (1) )
    )
    && ( $found_i < @chunks )
) {
    print "F: $found_i ",
          "N: $nested ",
          "C: $chunks[$found_i]",
          ;
}

__END__
F:\dev\vladb>perl nestag.pl
F: 1 N: 1 C: <cfif foo = bar>
F: 2 N: 2 C: <cfif bar = foo>
F: 3 N: 1 C: </cfif>
F: 4 N: 0 C: </cfif> BOOL is true!
F: 5 N: 0 C: <cfelse> BOOL is false!
Update: I suspect you'll be building some kind of data structure, and $nested seems like a prime index.
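To make the "prime index" remark concrete, here is a minimal sketch of one way $nested could index a data structure. It is only a guess at what such a structure might look like: the sub name chunk_depths and the @by_depth layout are made up for illustration and do not come from the original code. It records, for each nesting depth, the indexes of the chunks found at that depth, so an opening tag and its closing tag end up in the same bucket.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name and return layout are assumptions):
# returns an array ref where element $depth holds the indexes of
# all chunks that sit at that nesting depth.
sub chunk_depths {
    my ($chunks, $opening_tag, $closing_tag) = @_;
    my $nested = 0;    # current nesting depth
    my @by_depth;      # $by_depth[$depth] = [ chunk indexes ]

    for my $i (0 .. $#$chunks) {
        # a closing tag belongs to the depth of the tag it closes
        $nested-- if $chunks->[$i] =~ m/^$closing_tag/ && $nested > 0;
        push @{ $by_depth[$nested] }, $i;
        # everything after an opening tag is one level deeper
        $nested++ if $chunks->[$i] =~ m/^$opening_tag/;
    }
    return \@by_depth;
}

my @chunks = ("<cfif bool eq 1>\cJ\cI ",
              "<cfif foo = bar>\cJ\cI\cI ",
              "<cfif bar = foo>\cJ\cI\cI ",
              "</cfif>\cJ\cI ",
              "</cfif>\cJ\cI\cI \cJ\cI BOOL is true!\cJ",
              "<cfelse>\cJ\cIBOOL is false!\cJ",
              '</cfif>');

my $by_depth = chunk_depths(\@chunks, qr/<cfif/, qr/<\/cfif/);
for my $depth (0 .. $#$by_depth) {
    print "depth $depth: chunks @{ $by_depth->[$depth] || [] }\n";
}
```

For the sample @chunks this puts chunks 0 and 6 at depth 0, chunks 1, 4 and 5 at depth 1, and chunks 2 and 3 at depth 2. Whether a flat per-depth list or a real tree is the better fit depends on what the template engine does with it later.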

 
___crazyinsomniac_______________________________________
Disclaimer: Don't blame. It came from inside the void

perl -e "$q=$_;map({chr unpack qq;H*;,$_}split(q;;,q*H*));print;$q/$q;"

Re: (crazyinsomniac) Re: Template Parsing - Finding tag pairs.
by vladb (Vicar) on Dec 25, 2001 at 11:52 UTC
    Thanks for your comments. They are very much to the point and will be well taken ;-). I should definitely avoid posting snippets of code that don't run on their own, hehe. I promise to improve on that next time.

    Regarding the while loop: I had started with a rather simple one-liner while loop for non-nested tags, which worked pretty well. When I thought of adding the nested capability, I simply took that original and added a few boolean clauses to take any nested tags into account (and skip them).

    Here's the code as I have it now (slightly modified to run as a stand-alone):
    #!/usr/local/bin/perl -w
    use strict;

    my $is_nested = 0;
    my @chunks = ("<cfif bool eq 1>\cJ\cI ",
                  "<cfif foo = bar>\cJ\cI\cI ",
                  "<cfif bar = foo>\cJ\cI\cI ",
                  "</cfif>\cJ\cI ",
                  "</cfif>\cJ\cI\cI \cJ\cI BOOL is true!\cJ",
                  "<cfelse>\cJ\cIBOOL is false!\cJ",
                  '</cfif>');

    my $opening_tag = qr/\<cfif/;
    my $closing_tag = qr/\<\/cfif/;
    my $found_i = 0;

    unless ($is_nested) {
        # search for the closing pair (starting at the place the
        # first tag was found + 1)
        # note: this search is good for non-nested tags...
        # FIRST WHILE
        while (($chunks[++$found_i] !~ m/^$closing_tag/) && $found_i < @chunks) {}
    } else {
        # have to search a little differently for nested tags
        # making sure that an ending tag belonging to
        # a nested opening tag is not processed as the ending
        # tag for the current opening tag.
        # In case i sounded awkward, here's a little diagram:
        #
        # <cfif>            <--- this tag
        #     <cfif>        <--- nested open tag
        #     </cfif>       <--- end tag for the nested open tag
        # </cfif>           <--- end tag for this tag (the one that has
        #                        to be picked up)
        #
        my $nested = 0; # count of nested open tags found.
        # SECOND WHILE
        while ((($chunks[++$found_i] =~ m/^$closing_tag/)
                ? ($nested > 0 ? $nested-- : 0)
                : ($chunks[$found_i] =~ m/^$opening_tag/ ? ++$nested : 1))
               && $found_i < @chunks) {
            print "F: $found_i ",
                  "N: $nested ",
                  "C: $chunks[$found_i]",
                  ;
        }
    }
    Basically, I expanded this clause
    ($chunks[++$found_i] !~ m/^$closing_tag/)
    found in the first while loop a little bit by adding this
    ? ($nested > 0 ? $nested-- : 0) : ($chunks[$found_i] =~ m/^$opening_tag/ ? ++$nested : 1).

    I certainly agree that, in terms of maintainability, this may not be a perfect solution. I might therefore have to move that 'logic' inside the while body using a few conditionals (ifs/elses).
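    For what it's worth, moving the logic into the loop body could look roughly like the sketch below. This is one possible shape under stated assumptions, not the code either poster ended up with: the sub name find_closing_index and its argument order are made up for illustration. It also checks the array bounds before touching a chunk, which the ternary version only does after the match.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name and signature are assumptions):
# returns the index of the chunk that closes the tag opened at
# $start, or undef if no matching closing tag is found.
sub find_closing_index {
    my ($chunks, $start, $opening_tag, $closing_tag) = @_;
    my $nested = 0;    # count of nested open tags found so far

    for my $i ($start + 1 .. $#$chunks) {
        if ($chunks->[$i] =~ m/^$closing_tag/) {
            return $i if $nested == 0;    # closes the tag we started from
            $nested--;                    # closes a nested open tag
        }
        elsif ($chunks->[$i] =~ m/^$opening_tag/) {
            $nested++;                    # a nested open tag
        }
    }
    return;    # ran off the end without finding the closing tag
}

my @chunks = ("<cfif bool eq 1>\cJ\cI ",
              "<cfif foo = bar>\cJ\cI\cI ",
              "<cfif bar = foo>\cJ\cI\cI ",
              "</cfif>\cJ\cI ",
              "</cfif>\cJ\cI\cI \cJ\cI BOOL is true!\cJ",
              "<cfelse>\cJ\cIBOOL is false!\cJ",
              '</cfif>');

my $found_i = find_closing_index(\@chunks, 0, qr/<cfif/, qr/<\/cfif/);
print defined $found_i
    ? "closing tag at index $found_i\n"
    : "no closing tag found\n";
```

    With the sample @chunks and a start index of 0, this reports index 6, the same place the ternary loop stops. Returning undef for an unbalanced template leaves it to the caller to decide whether that is an error.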


    "There is no system but GNU, and Linux is one of its kernels." -- Confession of Faith