in reply to Re^5: while(){}continue{}; Useful?
in thread while(){}continue{}; Useful?

That seems a very clumsy way of parsing to me. You're having to re-parse the same information multiple times. And the number of times will only grow as you add more cases.

A given/when or if/elsif/.../else cascade seems far more appropriate:

#!/usr/bin/perl
use strict;
use warnings;

my $content;
my $error;

while( <DATA> ) {
    unless( s[^#include (.+)$][] ) {
        chomp;
        warn "Non-include line '$_' untouched\n";
        $content .= $_ . "\n";
    }
    else{
        local $_ = $1;
        if( m[<math] ) {
            $content .= qq[import java.lang.Math;\n];
        }
        elsif( m["stdafx.h"] ) {
            $content .= qq[#include "stdafx"\n];
        }
        elsif( m["(.+)"] ) {
            open my $inc_handle, '<', $1
                or warn "$1: $^E\n" and ++$error and next;
            local $/; # slurp
            $content .= <$inc_handle> . "\n";
        }
        else {
            warn "Unhandled include $_\n";
            $error++;
        }
    }
}

print "\nContent:\n'$content'\n";
die "$error errors encountered\n" if $error;

__DATA__
#include "stdafx.h"
#include <math.h>
#include <stdio.h>
// A comment
#include "AlyLee.h"
#include "Common.h"

Produces:

C:\test>junk49
Unhandled include <stdio.h>
Non-include line '// A comment' untouched
AlyLee.h: The system cannot find the file specified
Common.h: The system cannot find the file specified

Content:
'#include "stdafx"
import java.lang.Math;
// A comment
'
3 errors encountered

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"I'd rather go naked than blow up my ass"

Re^7: while(){}continue{}; Useful?
by kennethk (Abbot) on Apr 05, 2010 at 22:54 UTC
    The problem is that a simple single-pass processor (my first thought, too) doesn't handle nested includes (header files in-lining other header files). As best as I can tell, you either need a recursive parser to handle the new material or need to add new material to the existing work queue. Thinking about it, I've improved the basic structure by basing the while conditional on a work queue. This way I can unshift new material onto the front of the queue as I hit includes rather than re-running the same regular expression, which was actually reporting some false failures when multiple headers included the same libraries. I also added some handling for #pragma once and discovered I'll have to handle some nested #ifdefs.

    I still think the continue is the cleanest way to handle parsing failures (quick exit to a standard error message). I do see it's a bit contrived with all the undef $_, next; statements, but I see elsif constructs as messier.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my @lines = <DATA>;
    my ($error, $content);
    my %pragma_once;

    while (defined(local $_ = shift @lines)) {
        unless (/^#/) {
            $content .= $_;
            undef $_, next;
        }
        if (/^#include\s*/) {
            if (/<math/) {
                $content .= "import java.lang.Math;\n";
                undef $_, next;
            }
            if( /"stdafx.h"/ ) { # Drop it
                undef $_, next;
            }
            if( m["(.+)"] ) {
                my $file = $1; # save the capture: the s/// below resets $1 when it succeeds
                open my $inc_handle, '<', $file or warn "$file: $^E\n" and next;
                my $include = do {local $/; <$inc_handle>;}; # slurp
                $include = "" if defined $pragma_once{$file};
                $pragma_once{$file}++ if ($include =~ s/^#pragma once//m);
                unshift @lines, split /(?<=$\/)/, $include;
                undef $_, next;
            }
            warn "Unhandled include\n";
        }
        if (/^#defined/) {
            #...
        }
        if (/^#ifdef/) {
            #...
        }
        if (/^#else/) {
            #...
        }
        if (/^#endif/) {
            #...
        }
    }
    continue {
        if (defined) {
            chomp;
            warn qq{Unhandled line "$_"\n};
            $error++;
        }
    }

    print "\nContent:\n'$content'\n";
    die "$error errors encountered\n" if $error;

    __DATA__
    #include "stdafx.h"
    #include <math.h>
    #include <stdio.h>
    // A comment
    #include "AlyLee.h"
    #include "Common.h"
      As best as I can tell, you either need a recursive parser to handle the new material or need to add new material to the existing work queue.

      Turning what I had into a recursive parser is trivial and probably more robust:

      #!/usr/bin/perl
      use strict;
      use warnings;

      my $error;

      sub parse {
          my $fh = shift;
          my $content = '';
          while( <$fh> ) {
              unless( s[^#include (.+)$][] ) {
                  chomp;
                  warn "Non-include line '$_' untouched\n";
                  $content .= $_ . "\n";
              }
              else{
                  local $_ = $1;
                  if( m[<math] ) {
                      $content .= qq[import java.lang.Math;\n];
                  }
                  elsif( m["stdafx.h"] ) {
                      $content .= qq[#include "stdafx"\n];
                  }
                  elsif( m["(.+)"] ) {
                      open my $inc_handle, '<', $1
                          or warn "$1: $^E\n" and ++$error and next;
                      $content .= parse( $inc_handle );
                  }
                  else {
                      warn "Unhandled include $_\n";
                      $error++;
                  }
              }
          }
          return $content;
      }

      my $content = parse( \*DATA );
      print "\nContent:\n'$content'\n";
      die "$error errors encountered\n" if $error;

      __DATA__
      #include "stdafx.h"
      #include <math.h>
      #include <stdio.h>
      // A comment
      #include "AlyLee.h"
      #include "Common.h"
      I also added some handling for #pragma once and discovered I'll have to handle some nested #ifdefs.

      And I think a recursive parser is the only way to go if you're going to start handling conditionals.

      And you're going to have to get a lot more sophisticated. You'll need to start storing state--the current values of #defines etc.--so that you can decide which branch of an #ifdef/#else to process, which in turn may determine which includes you need to process. At that point, logging an error and trying to continue past a missing file doesn't work at all. The only thing you can do is die.
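
      As a rough illustration of the kind of state that implies, here is a minimal sketch. The input lines and the names %defined and @active are made up for the example, and it ignores #elif, #ifndef, #undef and macro values, so treat it as a starting point rather than a working pre-processor. It keeps one flag per open #ifdef and dies on unbalanced directives:

      ```perl
      use strict;
      use warnings;

      # Hypothetical input; a real run would read these from the source files.
      my @lines = (
          "#define WIN32\n",
          "#ifdef WIN32\n",
          "win();\n",
          "#else\n",
          "posix();\n",
          "#endif\n",
          "common();\n",
      );

      my %defined;          # names seen in #define so far
      my @active = (1);     # one flag per open #ifdef; the base entry means "emit"
      my $content = '';

      for my $line (@lines) {
          if ( $line =~ /^#define\s+(\w+)/ ) {
              # Only record defines that sit in a live branch.
              $defined{$1} = 1 if $active[-1];
          }
          elsif ( $line =~ /^#ifdef\s+(\w+)/ ) {
              # A nested block is live only if its parent is live too.
              push @active, ( $active[-1] && exists $defined{$1} ) ? 1 : 0;
          }
          elsif ( $line =~ /^#else/ ) {
              die "#else without #ifdef\n" if @active < 2;
              # Flip the current branch, but only if the parent is live.
              $active[-1] = ( $active[-2] && !$active[-1] ) ? 1 : 0;
          }
          elsif ( $line =~ /^#endif/ ) {
              die "#endif without #ifdef\n" if @active < 2;
              pop @active;
          }
          elsif ( $active[-1] ) {
              $content .= $line;    # ordinary line in a live branch
          }
      }
      die "Unterminated #ifdef\n" if @active > 1;

      print $content;
      ```

      With the sample input above, only win(); and common(); are emitted, because WIN32 is defined before the #ifdef is reached.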

      As an example of the use of continue, it doesn't really hold up for me. If all your if blocks have to next to avoid entering the continue anyway, you might as well just stick the error handling at the bottom of the while. But as your code above shows, the idea that you can handle all the possible errors in one place doesn't hold up either.
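
      A minimal sketch of that alternative, using made-up sample lines: each handled case ends with next, so only a line that falls through every test reaches the error handling at the bottom of the loop body.

      ```perl
      use strict;
      use warnings;

      # Hypothetical input, for illustration only.
      my @lines = ( "int x;\n", "#include <math.h>\n", "#bogus directive\n" );
      my ( $content, $error ) = ( '', 0 );

      while ( defined( my $line = shift @lines ) ) {
          if ( $line !~ /^#/ ) {                   # ordinary code: copy it through
              $content .= $line;
              next;
          }
          if ( $line =~ /^#include\s*<math/ ) {
              $content .= "import java.lang.Math;\n";
              next;
          }

          # Nothing above claimed the line, so report it here.
          chomp $line;
          warn qq{Unhandled line "$line"\n};
          $error++;
      }

      print "$error error(s)\n";
      ```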

      If this is a serious project, then you'd almost certainly be better off using an existing pre-processor like m4. Or cl/gcc -E.


        Right now I'm just spec-ing the project and have a few days to muck about - we have an end goal and my boss wants an order of magnitude estimate for planning. It doesn't have to be a general processor, and some exploration of the way this source code is constructed has suggested that an easier solution is manual translation of the (evolving) code base which I will codify into a script.

        If all your if blocks have to next to avoid entering the continue anyway

        I don't know if you're following my program flow - the nexts don't skip the continue block, they're effectively gotos for the continue block. I personally prefer an if(){next} to if(){}elsif structure - to me it reads easier at the expense of a keystroke (wholly subjective) - but that is irrelevant to the overall while-continue structure. By using a continue, it means unless I hit a line where I explicitly say I've successfully parsed (undef $_, next;) the error code is executed.
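
        A stripped-down sketch of that flow, with made-up sample lines and a lexical $line standing in for $_ (the @unhandled array is also just for illustration). The continue block runs after every iteration, including those ended by next, so clearing the variable before next is what marks a line as handled:

        ```perl
        use strict;
        use warnings;

        # Hypothetical input, for illustration only.
        my @lines = ( "int x;\n", "#include <math.h>\n", "#bogus directive\n" );
        my ( $content, @unhandled ) = ('');

        while ( defined( my $line = shift @lines ) ) {
            unless ( $line =~ /^#/ ) {
                $content .= $line;
                undef $line, next;    # handled: clear the line, then jump to continue
            }
            if ( $line =~ /^#include\s*<math/ ) {
                $content .= "import java.lang.Math;\n";
                undef $line, next;
            }
            # Falling off the end also reaches continue, with $line still set.
        }
        continue {
            # Runs after every iteration, including those ended by next.
            if ( defined $line ) {
                chomp $line;
                push @unhandled, $line;
            }
        }

        print "Unhandled: $_\n" for @unhandled;
        ```

        Only the #bogus directive line ends up in @unhandled; the other two were cleared before their next.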