in reply to C parsing questions

Your current HoA (hash of arrays) data structure looks appropriate. However, for the benefit of others you might like to put the structure definitions in a __DATA__ section and build the internal representation at run time. That way new structures can pretty much just be copied and pasted onto the end of the script, with the defaults supplied or not as required.

__DATA__
typedef struct {
    float one;   /* 0.0 */
    float two;   /* 0.0 */
    int three;   /* 0 */
    bool potato; /* false */
} struct_name;

DWIM is Perl's answer to Gödel

Re^2: C parsing questions
by Nkuvu (Priest) on Nov 29, 2005 at 00:22 UTC

    I really like this approach, thanks for suggesting it. I've had to go in and modify some Python scripts to match changes in the source code, and it's usually quite a pain to find out where things are being set/called/parsed/whatever. Anything I can do to make it easier will definitely be appreciated by non-Perl programmers.

    Of course since it's Monday (that's my excuse and I'm sticking to it) I still feel like I'm writing very messy code. Any suggestions on the added sub would be appreciated.

    Also note that the header files are auto-generated by a tool we're using, so the format of the struct definitions is always the same. And my subroutine takes this into account -- it fails horribly with comments in the code, but works just fine for the "live" code.

    use strict;
    use warnings;
    use Data::Dumper;

    my %default_values = (
        'float' => 0,
        'int'   => 3,    # Unique value for visibility during testing
        'bool'  => 'false'
    );

    my %structs;

    parse_struct_definitions();
    print Dumper(%structs);

    sub parse_struct_definitions {
        # Reads the typedef struct lines in the __DATA__ section to populate the
        # %structs hash. Created for simple updates to the defined structures
        # (simply copy and paste from the header files into the DATA section
        # below)
        local $/ = 'typedef struct {';
        while (my $line = <DATA>) {
            chomp $line;
            next if $line !~ /\w/;
            # For me to parse out the data more easily:
            $line =~ s/\n/ /g;
            $line =~ s/\s+/ /g;
            # Break the line into members and the struct name
            my ($member_string, $name) = $line =~ /([^\}]+)\s*\}\s*(.+)/;
            my @members = split ';', $member_string;
            $name =~ tr/; //d;
            foreach my $member (@members) {
                next if $member !~ /\w/;
                my ($type, $member_name) = split " ", $member;
                push @{$structs{$name}}, [ $type, $member_name, $default_values{$type} ];
            }
        }
    } # end of parse_struct_definitions

    __DATA__
    typedef struct {
        float one;
        float two;
        int three;
        bool potato;
    } struct_name;

    typedef struct {
        float one;   /* 0.0 */
        float two;   /* 0.0 */
        int three;   /* 0 */
        bool potato; /* false */
    } struct_name_with_comments;

      If you change your data to:

      typedef struct {
          float one;
          float two;
          int three;
          bool potato;
      } struct_name;

      typedef struct {
          float one;   /* 1.1 */
          float two;   /* 2.2 */
          int three;   /* 3 */
          bool potato; /* true */
      } struct_name_with_comments;

      You get the following (wrapped to compress):

      $VAR1 = 'struct_name_with_comments';
      $VAR2 = [ ['float', 'one', 0], ['/*', '1.1', undef], ['/*', '2.2', undef], ['/*', '3', undef], ['/*', 'true', undef] ];
      $VAR3 = 'struct_name';
      $VAR4 = [ ['float', 'one', 0], ['float', 'two', 0], ['int', 'three', 3], ['bool', 'potato', 'false'] ];

      There appear to be bugs :).
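
      One way the commented form might be handled, offered as a rough sketch and not as the fix for the original script: match each member with a regex that also captures an optional trailing /* ... */ comment, and treat the comment text as that member's default. The names parse_structs and %fallback are hypothetical, and the sketch assumes at most one comment per member, placed after the semicolon.

```perl
use strict;
use warnings;

# Hypothetical fallback defaults, used when a member has no comment.
my %fallback = (float => 0, int => 3, bool => 'false');

# Parse header text and return a reference to a hash of arrays:
# struct name => [ [type, member, default], ... ]
sub parse_structs {
    my ($text) = @_;
    my %structs;

    # One match per "typedef struct { body } name;" block.
    while ($text =~ /typedef\s+struct\s*\{(.*?)\}\s*(\w+)\s*;/gs) {
        my ($body, $name) = ($1, $2);

        # One match per "type member;" with an optional trailing comment.
        while ($body =~ m{(\w+)\s+(\w+)\s*;\s*(?:/\*\s*(.*?)\s*\*/)?}gs) {
            my ($type, $member, $comment) = ($1, $2, $3);
            my $default = defined $comment ? $comment : $fallback{$type};
            push @{ $structs{$name} }, [ $type, $member, $default ];
        }
    }
    return \%structs;
}

my $header = <<'END';
typedef struct
{
    float one;   /* 1.1 */
    int three;
    bool potato; /* true */
} demo;
END

my $parsed = parse_structs($header);
print join(' ', @$_), "\n" for @{ $parsed->{demo} };
```

      Because the comment is consumed as part of the member match, it can no longer be mistaken for a member of its own, and a member without a comment falls back to the per-type default.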

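      Given that you run both $line =~ s/\n/ /g; and $line =~ s/\s+/ /g; you could combine them into $line =~ s/[\n\s]+/ /g; (or simply keep $line =~ s/\s+/ /g; on its own, since \s already matches newlines).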

      The use of $/ is nice. Good to see someone remembering it's there.
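
      For readers who have not used $/ this way, here is a small self-contained sketch of the idea. It reads from an in-memory filehandle instead of DATA, and the sample text and the @records array are made up for illustration.

```perl
use strict;
use warnings;

# Made-up sample input; the parser in the thread reads from <DATA> instead.
my $text = "typedef struct { int a; } first;\n"
         . "typedef struct { int b; } second;\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @records;
{
    # Each read now returns everything up to and including the next
    # 'typedef struct {' instead of the next newline.
    local $/ = 'typedef struct {';
    while (my $record = <$fh>) {
        chomp $record;              # chomp strips the current $/, not "\n"
        next if $record !~ /\w/;    # the first record is empty
        $record =~ s/\s+/ /g;       # \s also matches the embedded newline
        push @records, $record;
    }
}

print "[$_]\n" for @records;
```

      Each element of @records ends up holding the body and name of one struct, which is why the rest of the parser can treat one read as one definition.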


      DWIM is Perl's answer to Gödel
        "There appear to be bugs :)."

        Um, yeah, I mentioned as much. Last (non-code) paragraph from my previous post:

        "...the format of the struct definitions is always the same. And my subroutine takes this into account -- it fails horribly with comments in the code, but works just fine for the "live" code."

        I originally had only one of the s/// operations, then was playing around with another -- now that you point out the combination of the two it's very obvious. Doh.