Good questions. I phrased it as a "brainstorm" as that's where I'm currently at. I've got some ideas of what I want to do and I'm trying to figure out which of the many ways to do it I want to start with. Some of my goal is educational -- so expediency of solution isn't a top priority. (A rare luxury.)

The genesis of my question came from the "documentation that doesn't suck" thread, which reminded me of a halfway-started, never-completed project of mine from a year or two ago. What I'd like to do is replace my module's POD with some sort of wikitext, wrapped in =begin wiki/=end wiki blocks, and then pre-process those blocks during the module build process to create separate .pod files containing the equivalent POD. It's similar to what ingy has been thinking about for Perldoc and Kwid, only I'm not sure I'm willing to wait until that's done and documented.
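To make the build-time step concrete, here is a minimal sketch of what I have in mind. The names extract_wiki_blocks and wiki_to_pod are hypothetical, and the wiki syntax (a line of the form "= Title" becomes a heading, everything else is a plain paragraph) is invented purely for illustration; it is not Kwid or any real wiki dialect.

```perl
use strict;
use warnings;

# Hypothetical helper: return the body of every "=begin wiki" / "=end wiki"
# block found in a string holding a module's source.
sub extract_wiki_blocks {
    my ($source) = @_;
    my @blocks;
    while ( $source =~ /^=begin\s+wiki[ \t]*\n(.*?)^=end\s+wiki[ \t]*$/msg ) {
        push @blocks, $1;
    }
    return @blocks;
}

# Toy translator for an invented syntax: a paragraph of the form "= Title"
# becomes a level-1 heading; any other paragraph is passed through as-is.
sub wiki_to_pod {
    my ($wiki) = @_;
    my @out;
    for my $para ( grep { /\S/ } split /\n{2,}/, $wiki ) {
        $para =~ s/\A\s+|\s+\z//g;    # trim surrounding whitespace
        if ( $para =~ /^=\s+(.+)$/ ) { push @out, "=head1 $1" }
        else                         { push @out, $para }
    }
    return join( "\n\n", @out, "=cut" ) . "\n";
}
```

A build step would then read each .pm file, call extract_wiki_blocks on its contents, and write the result of wiki_to_pod for each block into a sibling .pod file. A real translator would need lists, verbatim blocks, and inline markup, which this sketch ignores.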

In evaluating CPAN, I can find modules for wiki-to-HTML (though often tightly coupled), for pod-to-wiki, for HTML-to-pod, and many others that are less well documented and harder to sort through. The "easy" approach is to string together a wiki-to-HTML processor and an HTML-to-pod processor, but that makes the output dependent on the chain of tools and their idiosyncrasies.

CPAN is great for getting something done and working, but it doesn't always get it done exactly the way you want. That got me thinking about whether I should write my own narrowly focused wiki-to-pod translator, which in turn got me thinking about whether I should instead write the tool I had really been hoping to find on CPAN: a generic wiki parser that could have various wiki grammars plugged into it and that spits out a document model which could subsequently be manipulated or turned into output.
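As a rough illustration of that idea (a sketch under my own assumptions, not an existing CPAN API), a "grammar" could be nothing more than an ordered list of pattern/builder pairs, the document model a list of plain hashrefs, and each output format a separate function that walks the model. The grammar names equals and bang and every function name below are hypothetical.

```perl
use strict;
use warnings;

# Pluggable grammars: each is an ordered list of [ pattern, node-builder ].
# Both rule sets are invented for illustration, not real wiki dialects.
my %grammar_for = (
    equals => [ [ qr/^(=+)\s+(.+)$/ => \&heading ] ],
    bang   => [ [ qr/^(!+)\s*(.+)$/ => \&heading ] ],
);

sub heading {
    my ( $marks, $text ) = @_;
    return { type => 'heading', level => length($marks), text => $text };
}

# Parse text line by line into a format-neutral list of nodes.
sub parse {
    my ( $grammar_name, $text ) = @_;
    my $rules = $grammar_for{$grammar_name}
        or die "unknown grammar '$grammar_name'";
    my @doc;
    LINE: for my $line ( grep { /\S/ } split /\n/, $text ) {
        for my $rule (@$rules) {
            my ( $pattern, $build ) = @$rule;
            if ( my @captures = $line =~ $pattern ) {
                push @doc, $build->(@captures);
                next LINE;
            }
        }
        push @doc, { type => 'para', text => $line };    # fallback rule
    }
    return \@doc;
}

# Two independent emitters that know nothing about the input grammar.
sub to_pod {
    my ($doc) = @_;
    my @out = map {
        $_->{type} eq 'heading' ? "=head$_->{level} $_->{text}" : $_->{text}
    } @$doc;
    return join( "\n\n", @out ) . "\n";
}

sub to_html {
    my ($doc) = @_;
    my $html = '';
    for my $node (@$doc) {
        ( my $text = $node->{text} ) =~ s/&/&amp;/g;
        $text =~ s/</&lt;/g;
        $text =~ s/>/&gt;/g;
        my $tag = $node->{type} eq 'heading' ? "h$node->{level}" : 'p';
        $html .= "<$tag>$text</$tag>\n";
    }
    return $html;
}
```

The point of the split is that adding a new wiki dialect touches only %grammar_for, and adding a new output format touches only a new emitter. A line-at-a-time matcher like this is far too weak for nested or inline markup, so a real version would need a proper parser behind the same interface.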

If I get around to tackling it, I'd probably start with simple, existing tools that got the job done even if it wasn't exactly what I wanted (code development being so darned personal) and work out from there, but I was hoping to get more general insights into whether I was even thinking about the longer-term approach to this kind of parsing problem in the right way.

Does that clarify? I asked more vaguely first because I'm more interested in the general insights than the solution to the narrow problem, for which I'm confident I can cobble a solution. Probably I should have explained this in the first place.

-xdg

Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.


In reply to Re^2: Parsing to a format-neutral document model? by xdg
in thread Parsing to a format-neutral document model? by xdg
