Um, I suppose your plan could work, if the C code you're handling has been formatted in strict accordance with a specific coding style, and doesn't contain any traps like multi-line quoted strings containing lines that resemble function prototypes.

But if there are portions of code that have been commented out by bracketing a region between "/*" and "*/", and the region happens to contain (strings resembling) function declarations, then adding your extra comment lines is likely to bollix things up. (Standard C does not nest "/* ... */" comments: the first "*/" closes the comment, so a comment block inserted inside a commented-out region ends that region early. I haven't checked on this in a while, but I also recall not being able to rely on every C compiler treating such embedded comments the same way.)

To take this sort of task seriously, regex matching won't really do it -- you have to parse the text character by character, so that at any given point, you know what sort of content you're dealing with (quoted string, comment string, function body, "#define" directive, etc), and you know how to interpret each character as you get to it (e.g. whether it was preceded by "\").
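As a rough illustration of that character-by-character idea, here is a minimal sketch (not a full C lexer) that tracks whether it is in code, a block comment, a line comment, a string, or a character literal, and blanks out everything that is not code. The name mask_c_noise and the sample input are made up for this example; it ignores preprocessor directives, trigraphs, and backslash-continued "//" comments. Because it keeps newlines and the overall length, a regex could then be run against the masked copy while offsets still point into the original text.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a copy of the C source in which the contents
# of comments, string literals and character literals are replaced by spaces.
# Newlines are kept, so offsets and line numbers still match the original.
sub mask_c_noise {
    my ($src) = @_;
    my $out   = '';
    my $state = 'code';    # code | block | line | str | chr
    my $len   = length $src;
    my $i     = 0;
    while ( $i < $len ) {
        my $c    = substr( $src, $i, 1 );
        my $next = $i + 1 < $len ? substr( $src, $i + 1, 1 ) : '';
        if ( $state eq 'code' ) {
            if    ( $c eq '/' && $next eq '*' ) { $state = 'block'; $out .= '  '; $i += 2; next; }
            elsif ( $c eq '/' && $next eq '/' ) { $state = 'line';  $out .= '  '; $i += 2; next; }
            elsif ( $c eq '"' ) { $state = 'str'; }
            elsif ( $c eq "'" ) { $state = 'chr'; }
            $out .= $c;    # code (and opening quotes) pass through unchanged
        }
        elsif ( $state eq 'block' ) {
            if ( $c eq '*' && $next eq '/' ) { $state = 'code'; $out .= '  '; $i += 2; next; }
            $out .= $c eq "\n" ? "\n" : ' ';
        }
        elsif ( $state eq 'line' ) {
            if ( $c eq "\n" ) { $state = 'code'; $out .= "\n"; }
            else              { $out .= ' '; }
        }
        else {             # inside a string or character literal
            my $quote = $state eq 'str' ? '"' : "'";
            if ( $c eq '\\' && $next ne '' ) {
                # escaped character: blank the backslash and whatever follows it
                $out .= ' ' . ( $next eq "\n" ? "\n" : ' ' );
                $i += 2;
                next;
            }
            if ( $c eq $quote ) { $state = 'code'; $out .= $c; }
            else                { $out .= $c eq "\n" ? "\n" : ' '; }
        }
        $i++;
    }
    return $out;
}

# Made-up sample: a commented-out definition and a string that looks like one.
my $sample = <<'END_C';
/* int old_func(int a) { */
int main(void) {
    puts("void fake(int x) {");
    return 0;
}
END_C
print mask_c_noise($sample);
```

Only the real "int main(void) {" line survives with its text intact, so a pattern looking for function definitions has nothing to trip over in the comment or the string.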

Still, if I could assume that some C source code really has a well-behaved format, and "false-alarm" matches of function declarations won't happen, then a regex like this might do:

    use strict;
    use warnings;

    my $csrc;
    {
        local $/;       # slurp the source code
        $csrc = <>;     # from stdin or $ARGV[0]
    }

    # Return type(s), then the name, then the argument list, then "{".
    # The leading \n means a definition on the very first line is not matched.
    while ( $csrc =~ /\n((?:\w+\*?\s+)+) (\w+) \s* \( (.*?) \) \s* \{ /gsx ) {
        my ( $functype, $funcname, $funcarg ) = ( $1, $2, $3 );
        my @funcargs = split /,\s*/, $funcarg;
        print "found function def:\n type=$functype\n name=$funcname\n args=\n ";
        print join( "\n ", @funcargs ), "\n";
        # do other stuff with these strings...
    }
(update: changed the $funcarg split to allow 0-or-more whitespace)

Slurping the file like that makes it easier to handle the multi-line function declarations, but then makes it just a little harder to insert the extra comment strings correctly (not impossible, certainly).
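One way to do the insertion on slurped text, sketched under the same "well-behaved format" assumption: remember the offset of each match (the built-in @- array gives the start of each capture group), then insert the comment stubs from the last match back to the first, so that earlier offsets are not shifted by text added after them. The names comment_stub and add_stubs, the stub layout, and the sample source are invented for this example. The pattern is the one from the snippet above, except that the name capture excludes trailing whitespace, and it has all the same false-alarm caveats.

```perl
use strict;
use warnings;

# Hypothetical helper: builds an empty comment stub for one function.
sub comment_stub {
    my ( $name, @args ) = @_;
    my $stub = "/*\n * $name\n";
    $stub .= " *   $_ -- TODO\n" for @args;
    return $stub . " */\n";
}

# Returns a copy of $csrc with a stub inserted above each matched definition.
sub add_stubs {
    my ($csrc) = @_;
    my @found;    # [ offset of the definition's first character, name, args ]
    while ( $csrc =~ /\n((?:\w+\*?\s+)+) (\w+) \s* \( (.*?) \) \s* \{ /gsx ) {
        push @found, [ $-[1], $2, [ split /,\s*/, $3 ] ];
    }
    # Work backwards: inserting at a later offset leaves earlier offsets valid.
    for my $f ( reverse @found ) {
        my ( $offset, $name, $args ) = @$f;
        substr( $csrc, $offset, 0, comment_stub( $name, @$args ) );
    }
    return $csrc;
}

# Made-up sample source with one multi-line and one single-line definition.
my $sample = <<'END_C';
#include <stdio.h>

int add(int a, int b)
{
    return a + b;
}

void hello(void) {
    puts("hi");
}
END_C
print add_stubs($sample);
```

The stubs still contain only placeholders; somebody has to fill in the actual descriptions by hand.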

Maybe what you really want is to enhance whatever editor you normally use for writing C code, by adding a macro or function of some sort that will take a highlighted region, copy/paste it, and reformat the upper copy as a comment block. (I'm sure folks have done this numerous times with emacs/elisp.)

What you are doing is going to require manual editing anyway -- someone is supposed to type in explanations for the parameters, etc., or else the whole exercise is pointless, right? -- so the right tool for this job is a macro in a text editor, not Perl (unless your editor lets you declare macros with embedded Perl scripting).

"Later Im gonna add for parsing Perl scripts aswell, but that is alot easier."

Heheh, yeah right... NOT (unless your Perl source code holds to even more stringent style constraints than your C code). But good luck with that anyway.


In reply to Re: Parsing C Source File Functions. by graff
in thread Parsing C Source File Functions. by Ace128
