Item Description: Scans C source code for functions, typedefs, macros, variables, etc.
Review Synopsis: A useful module for extracting information from C source files, with a lot of cool Perl inside.
Description
The C::Scan module performs fast,
accurate scanning of C source code.
It provides an object interface
for accessing information about
a particular C source file.
The main interface
(after creating the initial object)
is to use a get() method
that will fetch information using
a set of pre-defined keywords
which specify the type of information
you want:
- function declarations
- (in-line) function definitions
- macro definitions, with or without macro arguments
- typedefs
- extern variables
- included files
A lot of the information is available
either raw or parsed, depending
on the specific keyword used
(for example, 'fdecls' vs. 'parsed_fdecls').
Why should you use it?
You want to use Perl
to extract information about
C source code,
including the functions declared or defined,
arguments to functions,
typedefs,
macros defined, etc.
Why should you NOT use it?
- You need a full-blown C parser.
C::Scan is not that.
- You need to scan C++.
Any bad points?
The documentation is lacking.
This is really annoying
because almost all of the keyword fetches
that try to parse the text
use complex and arbitrary structures for return values:
an array ref of refs to arrays
that each hold five defined values,
an array ref of,
a hash ref where the hash values are array refs
to two-element arrays,
etc.
Don't be surprised if you have to dive
into the code to really figure out
what's being returned.
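Short of reading the source, dumping a return value with Data::Dumper
is usually the quickest way to see its shape. The sketch below uses a
hand-built placeholder shaped like the 'defines_args' result described
in this review (macro name mapped to a two-element array of argument
list and macro text); the MAX macro is my own invention, not real
C::Scan output. In real use you would dump whatever $c->get() hands
back.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Indent   = 1;   # compact, readable nesting
$Data::Dumper::Sortkeys = 1;   # stable hash key order

# Hand-built placeholder, NOT real C::Scan output. It mimics the
# 'defines_args' shape: macro name => [ [ args ], macro text ].
my $result = {
    MAX => [ [ 'a', 'b' ], '((a) > (b) ? (a) : (b))' ],
};

# In real use: my $result = $c->get('defines_args');
my $dump = Data::Dumper->Dump([$result], ['defines_args']);
print $dump;
```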
Related Modules
C::Scan is an example of an extremely
powerful use of the
Data::Flow module
(not surprising,
as both were originally
written by Ilya).
The keywords you use to fetch information
are the underlying Data::Flow recipe keywords.
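To give a feel for what "recipe keywords" means, here is a toy resolver
of my own. It is not Data::Flow's actual interface, just a sketch of the
general idea: each keyword names a recipe, a recipe lists the keywords
it needs, and a value is built (and cached) only when someone asks for
it. The 'text' and 'fnames' keywords, the 'needs'/'build' recipe fields,
and the sample C text are all made up for illustration.

```perl
use strict;
use warnings;

# Toy recipe table (hypothetical): keyword => { needs => [...], build => sub }.
my %recipes = (
    text => {
        needs => [],
        build => sub { "int add(int a, int b);\nvoid reset(void);\n" },
    },
    fdecls => {
        needs => ['text'],
        # One entry per line that ends in a semicolon.
        build => sub { my ($text) = @_; [ $text =~ /^([^\n;]+;)/mg ] },
    },
    fnames => {
        needs => ['fdecls'],
        # The word just before the opening parenthesis.
        build => sub { my ($decls) = @_; [ map { /(\w+)\s*\(/ ? $1 : () } @$decls ] },
    },
);

my %cache;

# Resolve a keyword: build its prerequisites first, then the value itself.
sub get {
    my ($key) = @_;
    return $cache{$key} if exists $cache{$key};
    my $recipe = $recipes{$key} or die "No recipe for '$key'\n";
    my @inputs = map { get($_) } @{ $recipe->{needs} };
    return $cache{$key} = $recipe->{build}->(@inputs);
}

print join(', ', @{ get('fnames') }), "\n";   # prints "add, reset"
```

Asking for 'fnames' pulls in 'fdecls' and 'text' behind the scenes,
which is roughly why a single get() keyword is all the caller needs.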
Personal notes
I used C::Scan
to create a code pre-processor
that would scan our C source
and dump various information
into structures
for use by an administrative interface.
This ended up eliminating several
steps in our process that would
always break when someone
added a new command function
but didn't update the right help-text table,
etc.
I learned a lot
from threading my way through
the C::Scan source code.
It makes liberal use of \G in regexes
to loop through text looking
for pieces it can identify
as a function, typedef, etc.,
and the pos builtin to fetch
and set the offset for the searches.
This allows the module to use
multiple copies of the text side-by-side,
one with the comments and strings whited out
and the other with full text.
This way, it can scan a "sanitized" version
to identify C syntax by position,
but then return full text from the other string.
This is an extremely effective
and astonishingly efficient technique.
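Here is a stripped-down sketch of that two-string trick as I understand
it. It is my own toy code, not lifted from C::Scan: the sanitize() and
scan_decls() helpers, the regexes, and the sample C text are all
assumptions made for the demo, and the C handling is deliberately naive
(no character literals, no // comments).

```perl
use strict;
use warnings;

# Hypothetical C input for the demonstration.
my $full = <<'END_C';
/* int fake(void); is only mentioned in a comment */
int add(int a, int b);
const char *label = "void hidden(void);";
void reset(void /* no args */);
END_C

# Return a copy with comments and string literals overwritten by spaces.
# Newlines are kept and the length is unchanged, so every offset in the
# copy is also valid in the original.
sub sanitize {
    my ($text) = @_;
    $text =~ s{ ( /\* .*? \*/ | " (?: \\. | [^"\\] )* " ) }
              { (my $blank = $1) =~ s/[^\n]/ /g; $blank }gsex;
    return $text;
}

# Walk the sanitized copy with \G and pos(), but slice results out of
# the original text.
sub scan_decls {
    my ($full) = @_;
    my $clean = sanitize($full);
    my @decls;
    pos($clean) = 0;
    while (pos($clean) < length($clean)) {
        next if $clean =~ /\G\s+/gc;               # skip whitespace
        my $start = pos($clean);
        if ($clean =~ /\G[^;{}()]*?\b(\w+)\s*\([^()]*\)\s*;/gc) {
            push @decls, {
                name => $1,
                text => substr($full, $start, pos($clean) - $start),
            };
        }
        else {
            $clean =~ /\G[^;]*;?/gc or last;       # not a prototype: skip it
        }
    }
    return @decls;
}

my @decls = scan_decls($full);
print "$_->{name}: $_->{text}\n" for @decls;
# add: int add(int a, int b);
# reset: void reset(void /* no args */);
```

The /c flag keeps pos() where it was when a match fails, so the next
pattern can be tried at the same offset. Note that neither fake() in the
comment nor hidden() in the string is reported, yet the text returned
for reset() still carries its original comment.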
Example
Examples of a few ways
to pull information from C::Scan:
use strict;
use warnings;
use C::Scan;

my $c = C::Scan->new(
    filename        => 'foo.c',
    filename_filter => 'foo.c',
    add_cppflags    => '-DFOOBAR',
    includeDirs     => [ 'dir1', 'dir2' ],
);

#
# Fetch and iterate through information about function declarations.
#
my $array_ref = $c->get('parsed_fdecls');
foreach my $func (@$array_ref) {
    my ($type, $name, $args, $full_text, undef) = @$func;
    foreach my $arg (@$args) {
        # Each argument record has the same five-slot layout.
        my ($atype, $aname, $aargs, $afull_text, $array_modifiers) = @$arg;
    }
}

#
# Fetch and iterate through information about #define values w/out args.
#
my $hash_ref = $c->get('defines_no_args');
foreach my $macro_name (keys %$hash_ref) {
    my $macro_text = $hash_ref->{$macro_name};
}

#
# Fetch and iterate through information about #define macros w/args.
#
my $macros_hash = $c->get('defines_args');
foreach my $macro_name (keys %$macros_hash) {
    my $array_ref = $macros_hash->{$macro_name};
    my ($arg_ref, $macro_text) = @$array_ref;
    my @macro_args = @$arg_ref;
}