http://qs1969.pair.com?node_id=520150

abcde has asked for the wisdom of the Perl Monks concerning the following question:

Hello
I am making a help file viewer that reads POD files. Because POD is often used for Perl help, it highlights any code blocks and C< > spans with Perl syntax highlighting.

However, some of the files (especially the man pages, it seems) use code blocks for things that are not code - so my program tries to highlight URLs, program output, and the command-line arguments synopsis as if they were Perl code. I want to leave these as they are, not highlight them.

So, is there any way to determine whether a block of code is Perl code or not-code, or should I try something else?

Re: Distinguishing code from not code
by adrianh (Chancellor) on Dec 31, 2005 at 15:58 UTC
    So, is there any way to determine whether a block of code is Perl code or not-code, or should I try something else?

    POD doesn't define C<> and verbatim paragraphs as having to contain Perl code, so using them for program output, etc. is perfectly acceptable.

    This means you're going to have to come up with some heuristics to guess whether something is or isn't Perl, but that's not going to be 100% accurate whatever you do, I'm afraid.

Re: Distinguishing code from not code
by turo (Friar) on Dec 31, 2005 at 16:09 UTC

    If your POD files are edited by you, you can create a pseudo-tag or something similar to tell your highlighting engine not to render a given chunk of text.
    Or use the '=begin' perlpod tag:

    =begin code

    command --this --that something

    =end code

    But, maybe, that is not the problem ... right?...

    perl -Te 'print map { chr((ord)-((10,20,2,7)[$i++])) } split //,"turo"'
      It's mainly edited by me, but it'd be nice to be compatible with all the other docs out there.

      Currently I am making special cases for URLs and for lines with no punctuation in them, and hopefully people will be clever enough to ignore it when it goes wrong (fingers crossed).
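
A minimal sketch of the two special cases mentioned above (URL-only lines and lines with no punctuation), using core Perl only. The name `looks_like_plain_text` and the rule that every non-blank line must match one of the two cases are assumptions made for illustration; they are not taken from the poster's actual viewer.

```perl
use strict;
use warnings;

# Hypothetical helper: returns true when a verbatim block should be left
# unhighlighted, i.e. every non-blank line is either a bare URL or
# contains no punctuation at all.
sub looks_like_plain_text {
    my ($block) = @_;

    for my $line (split /\n/, $block) {
        $line =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        next unless length $line;   # ignore blank lines

        # Special case 1: the whole line is a URL.
        next if $line =~ m{^(?:https?|ftp)://\S+$}i;

        # Special case 2: the line has no punctuation characters.
        next if $line !~ /[[:punct:]]/;

        return 0;                   # anything else may be Perl
    }
    return 1;
}

# Usage: decide per block whether to run the highlighter.
for my $sample ("http://www.example.com/", 'my $x = 1;') {
    my $verdict = looks_like_plain_text($sample) ? 'leave as is' : 'highlight';
    print "$sample => $verdict\n";
}
```

As noted above, this will still get some blocks wrong: a command-line synopsis such as `command --this --that something` contains hyphens, so it would still be sent to the highlighter.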
Re: Distinguishing code from not code
by adamk (Chaplain) on Jan 01, 2006 at 10:07 UTC
    This might be a good project for someone to get their feet wet with PPI... Since it can now parse anything, even line noise, you could parse everything in each block, then take the resulting document and apply some heuristics to see whether or not it "looks like" Perl. That would certainly be a little more thorough than regex-based approaches.
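
One way the PPI idea could be sketched, assuming PPI is installed from CPAN. The scoring rule (the share of significant tokens that are variables or statement-ending semicolons) and the 0.2 cut-off are arbitrary assumptions chosen for illustration, and `perl_likeness` and `looks_like_perl` are hypothetical names, not part of PPI.

```perl
use strict;
use warnings;
use PPI;    # CPAN module, not part of core Perl

# Hypothetical scorer: fraction of significant tokens that are variables
# (PPI::Token::Symbol) or statement-ending semicolons.
sub perl_likeness {
    my ($text) = @_;

    # PPI parses from a reference to the source string.
    my $doc = PPI::Document->new(\$text) or return 0;

    # Drop whitespace, comments and other insignificant tokens.
    my @tokens = grep { $_->significant } $doc->tokens;
    return 0 unless @tokens;

    my $hits = grep {
        $_->isa('PPI::Token::Symbol')
            || ( $_->isa('PPI::Token::Structure') && $_->content eq ';' )
    } @tokens;

    return $hits / @tokens;
}

# Hypothetical wrapper with an arbitrary default threshold.
sub looks_like_perl {
    my ($text, $threshold) = @_;
    $threshold = 0.2 unless defined $threshold;
    return perl_likeness($text) >= $threshold ? 1 : 0;
}

# Usage: compare a snippet of Perl with a line of plain words.
for my $sample ('my $x = 1; print $x;', 'Usage foo options file') {
    printf "%-24s %s\n", $sample, looks_like_perl($sample) ? 'Perl' : 'not Perl';
}
```

Other token classes (operators, quotes, `sub` and `use` statements) could be weighed in the same way; which mix works best would have to be tuned against real POD files.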

        You want to look on the CPAN instead of the general web.

        Makeshifts last the longest.