Well, if you've got a particularly well-constrained data set, then you might be able to achieve what you want without *too* much trouble. However, your example of a string containing a semi-colon only touches on the complexity of the problem. Here's another:

my $x = do{ print "here's an embedded statement\n"; 42; };
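To make the trouble with that line concrete, here is a minimal sketch of the naive approach: treat everything up to the first semi-colon as the first statement. The helper name `naive_first_statement` is made up for illustration, and this is not a recommended technique, just a demonstration of how it goes wrong.

```perl
use strict;
use warnings;

# Hypothetical naive helper: everything up to the first semi-colon
# is assumed to be the first statement.
sub naive_first_statement {
    my ($source) = @_;
    return $source =~ /\A(.*?;)/s ? $1 : undef;
}

# The do-block example, held as source text (q() does not interpolate).
my $code = q(my $x = do{ print "here's an embedded statement\n"; 42; };);

my $first = naive_first_statement($code);
print "$first\n";
```

The match stops at the semi-colon after the print, inside the do block, so what comes back is a fragment with an unclosed brace rather than a statement.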

And we can imagine all sorts of complications using other quote-like mechanisms like s//statement;statement;/e or s;;;ge ... But even if you can grok all the combinations of quote-like operators and embedded multi-statement terms, you still have a problem, as merlyn so aptly points out in this node. I'll extract (and modify) just one little tidbit from it:

$x = sin / 25 ; # /; die "Bang! I'm dead!";
$y = time / 25 ; # /; die "I'm only pretending!";

Where does the first statement end in each of these two lines? sin takes an argument, so the / after it begins a match and the die really runs; time takes none, so the / after it is division and the rest of the line is a comment. It isn't just that you have to ignore semi-colons inside of a match operation, you have to know whether you are even in a match operation at all. Thus, while the question as you pose it may *seem* like something far less complex than actually "parsing Perl" (I just want to recognize one leading, semi-colon terminated, arbitrary statement), it really isn't at all.
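If you want to see the difference rather than take it on faith, a small sketch like the following hands each line to perl itself via string eval and reports whether the die was reached. The helper name `dies_when_run` and the variable names are assumptions made for this illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: compile and run a snippet with string eval,
# then report whether it died.
sub dies_when_run {
    my ($source) = @_;
    my ($x, $y);       # the variables the snippets assign to
    local $_ = '';     # give any match operation a defined target
    eval $source;
    return $@ ? 1 : 0;
}

# The two lines, held as source text (q{} does not interpolate).
my $sin_line  = q{$x = sin / 25 ; # /; die "Bang! I'm dead!";};
my $time_line = q{$y = time / 25 ; # /; die "I'm only pretending!";};

printf "sin line dies:  %s\n", dies_when_run($sin_line)  ? 'yes' : 'no';
printf "time line dies: %s\n", dies_when_run($time_line) ? 'yes' : 'no';
```

The sin line should report yes and the time line no, even though the two differ only in the name of the builtin. Nothing short of knowing what each builtin expects will tell a matcher which semi-colon ends the statement.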

And we've completely ignored other things, such as the many statements that need no terminating semi-colon (in the same vein as Beatnik mentions):

while(<>){
    if ($. == 1){
        print "Now processing $ARGV ...\n"
    }
    print if /something/ .. /something else/
} continue {
    close ARGV if eof
}

Which of course is rather contrived ... but still something to consider depending on what you are *really* trying to accomplish (although you did explicitly mention terminating semi-colons, so perhaps this isn't an issue for you).

On the other hand, I seem to recall that Simon Cozens (I think) was working on a Perl parser in Perl, but I have no idea how far that went or what became of it.


In reply to Re: Matching first Perl statement. by danger
in thread Matching first Perl statement. by vladb
