Pursuant to the thread started in Embedding pod in C, with suggestions from John M. Dlugosz and JavaFan (including starting a new thread to attract fresh eyes and request a serious review), enclosed is a first cut at a preprocessor. It extracts pod (see perlpod) from languages other than Perl, where commenting conventions or stylistic preferences prevent starting everything in column 0. This version shows how easy it is to extend the preprocessing to languages other than C, or even to Perl itself, to slightly relax the column 0 restrictions. The example is neither comprehensive nor documented yet, pending comments.

Finding the right set of options will be important. I favor language-wide behavior to encourage "standards", but the model allows fine-tuning of how the start and stop of the pod are recognized, and of how each line is trimmed to produce genuine pod. Because trimming moves processed lines back to column 0, verbatim lines need some sort of special identification (currently =v followed by whitespace) so that they can still begin in some column other than 0. As requested, a newline is added between blocks of pod (as needed).
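To make the trim and verbatim options concrete, here is a minimal sketch using the C defaults. The helper name to_pod is hypothetical and exists only for this illustration; it checks the verbatim marker first so that a trim pattern cannot consume it, which for the C patterns gives the same result as trimming first.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# C defaults: strip leading whitespace, and treat "=v " as the verbatim marker.
my $trim     = qr{^\s*};
my $verbatim = qr{^\s*=v\s};

# Hypothetical helper, for illustration only: turn one embedded line into pod.
sub to_pod {
    my ($line) = @_;
    chomp $line;
    # A verbatim line keeps a single leading space, so pod renders it verbatim;
    # every other line is shifted back to column 0.
    $line =~ s/$verbatim/ / or $line =~ s/$trim//;
    return $line;
}

print '[', to_pod("    =head2 title\n"), "]\n";    # prints [=head2 title]
print '[', to_pod("    =v indent 1\n"),  "]\n";    # prints [ indent 1]
```

The brackets are only there to make the leading space on the verbatim line visible.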

The code is mostly initialization, to give a sense of how to control things. Processing the input is quite straightforward, and could be even simpler if we dropped control over whether the start and stop sequences are themselves included in the output. Comments, please.

#!/usr/bin/perl -w
use strict;
use Getopt::Long;

# Per-language defaults, in this order: start pattern, show the start line?,
# stop pattern, show the stop line?, trim pattern, verbatim marker pattern.
my %languages = (
    'c' => [
        '^\s*#\s*ifdef\s+pod\b',           0,
        '^\s*#\s*endif\s*/\*\s*pod\s*\*/', 0,
        '^\s*', '^\s*=v\s',
    ],
    'awk' => [
        '^\s*#\s*=pod\b', 0,
        '^\s*#\s*=cut\b', 0,
        '^\s*#\s*', '^\s*#\s*=v\s',
    ],
    'perl' => [
        '^\s*=pod$', 0,
        '^\s*=cut$', 0,
        '^\s*', '^\s*=v\s',
    ],
);

# Aliases for the C defaults (the qw list needs its own parentheses).
for my $l (qw( C c++ C++ )) {
    $languages{$l} = $languages{c};
}

my $language = 'c';
my ( $start, $showstart, $stop, $showstop, $trim, $verbatim ) =
  @{ $languages{c} };

my $result = GetOptions(
    "language=s" => \$language,
    "start=s"    => \$start,
    "stop=s"     => \$stop,
    "trim=s"     => \$trim,
    "verbatim=s" => \$verbatim,
    "showstart"  => \$showstart,
    "showstop"   => \$showstop,
);
exit(1) unless ($result);

if ( $language ne 'c' ) {
    unless ( exists( $languages{$language} ) ) {
        die("Language '$language' not recognized\n");
    }
    ( $start, $showstart, $stop, $showstop, $trim, $verbatim ) =
      @{ $languages{$language} };
}

$start    = qr{$start};
$stop     = qr{$stop};
$trim     = qr{$trim};
$verbatim = qr{$verbatim};

my $show      = 0;    # true while inside a pod block
my $lastempty = 1;    # true if the last line printed was empty

while ( my $line = <DATA> ) {
    my $print = $show;
    if ( $line =~ $start ) {
        # Separate consecutive pod blocks with an empty line.
        unless ($lastempty) {
            $lastempty = 1;
            print "\n";
        }
        $show  = 1;
        $print = $showstart;
    }
    elsif ( $line =~ $stop ) {
        $show  = 0;
        $print = $showstop;
    }
    next unless ($print);

    chomp($line);
    # Check the verbatim marker first, so that a trim pattern that strips the
    # comment leader (as for awk) cannot remove what the marker must match.
    $line =~ s/$verbatim/ / or $line =~ s/$trim//;
    $lastempty = ( $line eq '' );
    print $line, "\n";
}

__DATA__
This could be anything
#ifdef pod
    =head2 title
    blah, blah, blah, blah, blah
    =v indent 1
#endif /* pod */
This could be anything, too
#ifdef pod
    =head2 another title
    yo ho ho
#endif /* pod */
more anything

Updated: changed ^.* to ^\s* for c patterns, which was my original intent. Thanks for spotting the error, John M. Dlugosz!
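A small sketch of why the prefix matters. It assumes the earlier patterns differed from the current ones only in starting with ^.* instead of ^\s*; the sample lines are made up for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed earlier form versus the current form of the C start pattern.
my $loose_start  = qr{^.*#\s*ifdef\s+pod\b};
my $strict_start = qr{^\s*#\s*ifdef\s+pod\b};

# An ordinary code line that merely mentions the directive in a comment.
my $code = 'x = 1; /* see #ifdef pod below */';
my $loose_hit  = ( $code =~ $loose_start )  ? 1 : 0;
my $strict_hit = ( $code =~ $strict_start ) ? 1 : 0;
print "loose: $loose_hit, strict: $strict_hit\n";    # prints loose: 1, strict: 0

# As a trim pattern, ^.* removes the whole line rather than just the indent.
( my $loose_trim  = '    =head2 title' ) =~ s/^.*//;
( my $strict_trim = '    =head2 title' ) =~ s/^\s*//;
print "[$loose_trim] [$strict_trim]\n";              # prints [] [=head2 title]
```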

In reply to Embedding pod in other languages by jpl
