My 'specification' was incomplete, and I apologise for that. I will give some context to make clear what I am trying to figure out.

Ultimate goal: extracting information from a wide variety of text files, representing published articles.

With a largish set of text files, most of them needing their own set of regexes (to split them into usable parts), it would seem to make sense to store each regex file alongside its text file. I foresee that many of the strings thus extracted will subsequently need unique code to get that final info-nugget for my database. It would make sense to keep such code associated with the text file as well. For this, regex codeblocks are one candidate, modules are another, and snippet files for later eval yet another.

(A concrete example of the text base: all published articles by one author, 600 text files ranging from 1 page to 100+ pages, published over a period of 40 years in several journals.)

So that is the context in which I am trying to find out how heavily those regex codeblocks can be loaded with code, and what communication/steering/instrumentation can be dreamed up. I initially hoped that DBI searches would be possible from inside a codeblock. That turns out to be almost certainly impossible or unwise, because no regex at all can be used inside a codeblock. But I believe subroutines and closures can be defined and called.
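To make that last point concrete, here is a minimal sketch of a closure being called from a regex codeblock. It assumes a reasonably recent Perl (5.18 or later, where lexical closures inside codeblocks are reliable); `$collect`, `@found` and the sample string are made up for illustration and not taken from my actual files.

```perl
use strict;
use warnings;

# Hypothetical collector: a closure that records whatever the codeblock hands it.
my @found;
my $collect = sub { push @found, $_[0] };

# Made-up sample line standing in for a fragment of one article.
my $text = "Vol. 12 (1987), pp. 34-56";

# The codeblock runs once the parenthesised year has matched and
# passes the captured text to the closure; it contains no regex itself.
$text =~ / \( (\d{4}) \) (?{ $collect->($1) }) /x;

print "year: $found[0]\n" if @found;
```

The codeblock here only hands data out to the closure; anything heavier (lookups, further parsing) would then happen after the match has finished rather than inside it.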

So what I'm trying to do is logically split very 'specific' code over a text-centric system with many disparate text files. It may be a Bad Thing. I am not advocating it, but I hope to gain from the experience of others with similar exploits.


