in reply to Re^4: while(){}continue{}; Useful?

I have now used this structure, and I'm curious what your take on the approach is - either if it changes your opinion of the structure or if you have a more aesthetically pleasing alternative. I'm writing what is essentially a preprocessor for some code autotranslation, and so I have a volatile content string that I process through a series of if (){next} blocks. I then check for the error condition in the continue block, which prevents the possibility of infinite loops and cleans up error handling. I've just started the project, but expect the number of conditional clauses to grow significantly and this strikes me as the most extensible framework.

#!/usr/bin/perl
use strict;
use warnings;

local $/;    # slurp
my $content = <DATA>;
my $error;

# Preprocess
# Note: \Q...\E quotes the captured text so characters such as '.' match literally
while (local ($_) = $content =~ /^#include\s(.*?)$/m) {
    if (/<math/) { # Math library
        $content =~ s/#include\s\Q$_\E/import java.lang.Math;/;
        next;
    }
    if (/"stdafx.h"/) { # Autogen MS IDE header for project/system includes
        $content =~ s/#include\s\Q$_\E//;
        next;
    }
    if (/"/) { # A yet unconsidered header file
        (my $filename) = /"(.*)"/;
        open my $inc_handle, '<', $filename
            or warn "File open fail $filename: $!\n" and next;
        local $/;    # slurp
        my $include = <$inc_handle>;
        $content =~ s/#include\s\Q$_\E/$include/;
        next;
    }
} continue {
    if ($content =~ s/#include\s\Q$_\E//) {
        warn "Unhandled include $_\n";
        $error++;
    }
}

die "$error errors encountered" if $error;

__DATA__
#include "stdafx.h"
#include <math.h>
#include "AlyLee.h"
#include "Common.h"

Note that while lexical variables are defined at the block level (as AnomalousMonk pointed out), the localization in the while conditional holds through the continue block.
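
For comparison, here is a minimal sketch of the same scoping rule using a my variable declared in the conditional, which is the case perlsub documents. The sub name items_seen_in_continue and the variable names are made up for illustration; this is not part of the preprocessor above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every item the continue block saw.
sub items_seen_in_continue {
    my @queue = @_;
    my @seen;
    while ( defined( my $item = shift @queue ) ) {
        my $body_only = "visible only inside the loop body";
        next if $item =~ /skip/;    # next still runs the continue block
    }
    continue {
        # $item (declared in the conditional) is in scope here;
        # $body_only (declared in the body) is not.
        push @seen, $item;
    }
    return @seen;
}

print join( ', ', items_seen_in_continue(qw(a skip b)) ), "\n";
# prints "a, skip, b"
```

Referring to $body_only inside the continue block would fail to compile under strict, which is the block-level behaviour mentioned above.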

Re^6: while(){}continue{}; Useful?
by BrowserUk (Patriarch) on Apr 05, 2010 at 19:23 UTC

    That seems a very clumsy way of parsing to me. You're having to re-parse the same information multiple times. And the number of times will only grow as you add more cases.

    A given/when or if/elsif/.../else cascade seems far more appropriate:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $content;
    my $error;

    while( <DATA> ) {
        unless( s[^#include (.+)$][] ) {
            chomp;
            warn "Non-include line '$_' untouched\n";
            $content .= $_ . "\n";
        }
        else{
            local $_ = $1;
            if( m[<math] ) {
                $content .= qq[import java.lang.Math;\n];
            }
            elsif( m["stdafx.h"] ) {
                $content .= qq[#include "stdafx"\n];
            }
            elsif( m["(.+)"] ) {
                open my $inc_handle, '<', $1
                    or warn "$1: $^E\n" and ++$error and next;
                local $/;    # slurp
                $content .= <$inc_handle> . "\n";
            }
            else {
                warn "Unhandled include $_\n";
                $error++;
            }
        }
    }

    print "\nContent:\n'$content'\n";
    die "$error errors encountered\n" if $error;

    __DATA__
    #include "stdafx.h"
    #include <math.h>
    #include <stdio.h>
    // A comment
    #include "AlyLee.h"
    #include "Common.h"

    Produces:

    C:\test>junk49
    Unhandled include <stdio.h>
    Non-include line '// A comment' untouched
    AlyLee.h: The system cannot find the file specified
    Common.h: The system cannot find the file specified

    Content:
    '#include "stdafx"
    import java.lang.Math;
    // A comment
    '
    3 errors encountered

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      The problem is that a simple single-pass processor (my first thought, too) doesn't handle nested includes (header files in-lining other header files). As best as I can tell, you either need a recursive parser to handle the new material or need to add new material to the existing work queue. Thinking about it, I've improved the basic structure by basing the while conditional on a work queue. This way I can unshift new material onto the front of the queue as I hit includes rather than re-running the same regular expression, which was actually producing some false failure reports when multiple headers included the same libraries. I also added some handling for #pragma once and discovered I'll have to handle some nested #ifdefs. I still think the continue seems to be the cleanest way to handle parsing failures (quick exit to a standard error message). I do see it's a bit contrived with all the undef $_, next; blocks, but I see elsif constructs as messier.

      #!/usr/bin/perl
      use strict;
      use warnings;

      my @lines = <DATA>;
      my ($error, $content);
      my %pragma_once;

      while (defined(local $_ = shift @lines)) {
          unless (/^#/) {
              $content .= $_;
              undef $_, next;
          }
          if (/^#include\s*/) {
              if (/<math/) {
                  $content .= "import java.lang.Math;\n";
                  undef $_, next;
              }
              if( /"stdafx.h"/ ) { # Drop it
                  undef $_, next;
              }
              if( m["(.+)"] ) {
                  my $file = $1;    # save it: the s/// below would reset $1
                  open my $inc_handle, '<', $file
                      or warn "$file: $^E\n" and next;
                  my $include = do {local $/; <$inc_handle>;};    # slurp
                  $include = "" if defined $pragma_once{$file};
                  $pragma_once{$file}++ if ($include =~ s/^#pragma once//m);
                  # Split after each newline, keeping the line endings
                  unshift @lines, split /(?<=\n)/, $include;
                  undef $_, next;
              }
              warn "Unhandled include\n";
          }
          if (/^#define/) {
              #...
          }
          if (/^#ifdef/) {
              #...
          }
          if (/^#else/) {
              #...
          }
          if (/^#endif/) {
              #...
          }
      } continue {
          if (defined) {
              chomp;
              warn qq{Unhandled line "$_"\n};
              $error++;
          }
      }

      print "\nContent:\n'$content'\n";
      die "$error errors encountered\n" if $error;

      __DATA__
      #include "stdafx.h"
      #include <math.h>
      #include <stdio.h>
      // A comment
      #include "AlyLee.h"
      #include "Common.h"

        As best as I can tell, you either need a recursive parser to handle the new material or need to add new material to the existing work queue.

        Turning what I had into a recursive parser is trivial and probably more robust:

        #!/usr/bin/perl
        use strict;
        use warnings;

        my $error;

        sub parse {
            my $fh = shift;
            my $content = '';
            while( <$fh> ) {
                unless( s[^#include (.+)$][] ) {
                    chomp;
                    warn "Non-include line '$_' untouched\n";
                    $content .= $_ . "\n";
                }
                else{
                    local $_ = $1;
                    if( m[<math] ) {
                        $content .= qq[import java.lang.Math;\n];
                    }
                    elsif( m["stdafx.h"] ) {
                        $content .= qq[#include "stdafx"\n];
                    }
                    elsif( m["(.+)"] ) {
                        open my $inc_handle, '<', $1
                            or warn "$1: $^E\n" and ++$error and next;
                        $content .= parse( $inc_handle );
                    }
                    else {
                        warn "Unhandled include $_\n";
                        $error++;
                    }
                }
            }
            return $content;
        }

        my $content = parse( \*DATA );

        print "\nContent:\n'$content'\n";
        die "$error errors encountered\n" if $error;

        __DATA__
        #include "stdafx.h"
        #include <math.h>
        #include <stdio.h>
        // A comment
        #include "AlyLee.h"
        #include "Common.h"

        I also added some handling for #pragma once and discovered I'll have to handle some nested #ifdefs.

        And I think a recursive parser is the only way to go if you're going to start handling conditionals.

        And you're going to have to get a lot more sophisticated. You'll need to start storing state--the current values of #defines etc.--in order that you can decide which branch of #ifdef #else to process, which may determine which includes you need to process. And at that point, logging an error and trying to continue for missing files doesn't work at all. The only thing you can do is die.
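
        To give a rough idea of the state involved, here is a minimal sketch, not a complete preprocessor. The sub name filter_conditionals, the frame keys and the sample input are all made up for illustration. It handles only #define, #ifdef, #ifndef, #else and #endif; #if, #elif and function-like macros pass through untouched.

```perl
use strict;
use warnings;

# Hypothetical sketch: returns the lines that survive the conditionals,
# recording plain #define lines in %$defines as they are met.
# Dies on unbalanced directives, since continuing makes no sense there.
sub filter_conditionals {
    my ( $lines, $defines ) = @_;
    my @stack;         # one frame per open #ifdef/#ifndef
    my $active = 1;    # true while lines are being kept
    my @out;

    for my $line (@$lines) {
        if ( $line =~ /^\s*#\s*(ifdef|ifndef)\s+(\w+)/ ) {
            my ( $kind, $name ) = ( $1, $2 );
            my $cond = exists $defines->{$name};
            $cond = !$cond if $kind eq 'ifndef';
            push @stack, { parent => $active, cond => $cond, seen_else => 0 };
            $active = $active && $cond;
        }
        elsif ( $line =~ /^\s*#\s*else\b/ ) {
            die "#else without #ifdef\n" unless @stack;
            my $frame = $stack[-1];
            die "duplicate #else\n" if $frame->{seen_else}++;
            $active = $frame->{parent} && !$frame->{cond};
        }
        elsif ( $line =~ /^\s*#\s*endif\b/ ) {
            die "#endif without #ifdef\n" unless @stack;
            $active = pop(@stack)->{parent};
        }
        elsif ( $active && $line =~ /^\s*#\s*define\s+(\w+)(?:\s+(.*))?$/ ) {
            $defines->{$1} = defined $2 ? $2 : '';
        }
        elsif ($active) {
            push @out, $line;
        }
    }
    die "unterminated #ifdef\n" if @stack;
    return @out;
}

my %defines = ( WIN32 => 1 );    # assumed starting state
print filter_conditionals(
    [
        "#ifdef WIN32\n", "#include <windows.h>\n",
        "#else\n",        "#include <unistd.h>\n",
        "#endif\n",       "int x;\n",
    ],
    \%defines,
);
```

        With WIN32 defined, only the windows.h include and the int x; line are kept, so which includes get processed depends on the state accumulated so far.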

        As an example of the use of continue, it doesn't really hold up for me. If every if block has to next, and also arrange for the error check in the continue block to fail, you might as well just put the error handling at the bottom of the while body. But as your code above shows, the idea that you can handle all the possible errors in one place doesn't hold up either.
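
        A minimal sketch of that alternative, assuming a cut-down set of clauses: the sub name translate_lines and the sample lines are made up for illustration. Each clause that claims a line ends with next, so only unclaimed lines fall through to the error handling at the bottom of the loop body, and nothing has to be undefined first.

```perl
use strict;
use warnings;

# Hypothetical sketch: returns ($content, $errors) for the given lines.
sub translate_lines {
    my @lines = @_;
    my ( $content, $errors ) = ( '', 0 );

    while ( defined( my $line = shift @lines ) ) {
        if ( $line !~ /^#/ ) {                         # ordinary source line
            $content .= $line;
            next;
        }
        if ( $line =~ /^#include\s*<math/ ) {          # math library
            $content .= "import java.lang.Math;\n";
            next;
        }
        if ( $line =~ /^#include\s*"stdafx\.h"/ ) {    # drop it
            next;
        }

        # Only lines that no clause claimed reach this point.
        chomp $line;
        warn qq{Unhandled line "$line"\n};
        $errors++;
    }
    return ( $content, $errors );
}

my ( $content, $errors ) = translate_lines(
    qq{#include "stdafx.h"\n}, "#include <math.h>\n",
    "#include <stdio.h>\n",    "// A comment\n",
);
print $content;
warn "$errors errors encountered\n" if $errors;
```

        Here the stdio.h line is the only one reported, and the content is the Java import followed by the comment line.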

        If this is a serious project, then you'd almost certainly be better off using an existing pre-processor like m4. Or cl/gcc -E.

