
We all know (I hope) the Perl slogans. The most famous, of course, is TIMTOWTDI ("Tim Toady"). The next most famous is, I think, "Perl makes easy things easy and hard things possible". Well, given these two slogans, I'd like to add another one: "Only perl can parse Perl".

Great! So now I want to write a Perl script (interpreted by perl so hopefully able to parse Perl, because hard things should be possible, in more than one way) to extract subroutines from another Perl script.

Roughly, this would be:

search for sub NAME [(PROTOTYPE)] [: ATTRIBUTES]

search for the opening curly, then take everything up to the matching closing curly.

Well, this shouldn't be too hard. But mind you! What if there are closing curlies within strings? Of course it is not too hard to just ignore everything between quotes. But what if something like qq() or qw() is used? What if the fancy => operator is used? What if heredocs are used? And so on...
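To show what I mean, here is a minimal sketch of the naive approach: count curlies from the opening brace and stop when the depth drops back to zero. The helper name naive_extract_body is made up for this example. It goes wrong as soon as a closing curly shows up inside a string.

```perl
use strict;
use warnings;

# Naive approach (hypothetical helper): count curlies from the opening
# brace of the sub body and stop when the depth returns to zero.
# It knows nothing about strings, comments, or heredocs.
sub naive_extract_body {
    my ($source, $start) = @_;    # $start: offset of the opening '{'
    my $depth = 0;
    for my $pos ($start .. length($source) - 1) {
        my $char = substr($source, $pos, 1);
        $depth++ if $char eq '{';
        $depth-- if $char eq '}';
        return substr($source, $start, $pos - $start + 1) if $depth == 0;
    }
    return;    # braces never balanced
}

my $source = <<'END_SOURCE';
sub tricky {
    my $closing = "}";
    return $closing;
}
END_SOURCE

my $body = naive_extract_body($source, index($source, '{'));

# The body is cut off at the curly inside the string literal:
# it ends with  my $closing = "}  and never reaches the return.
print $body, "\n";
```

So the curly inside "}" is taken for the end of the sub, and the rest of the body is lost.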

So... the main question is: how can I extract a subroutine from a Perl file, beginning with the sub keyword, then the name, prototype and attribute specifications, then the opening curly and from there, everything until the matching closing curly?
I would be glad if this could be done using regexes, but I don't think they're up to the job (unless they become really, really complex). Another possibility is to scan byte by byte, keeping track of opened and closed curly brackets and of opened and closed strings (this isn't easy, for there are many types of strings, as mentioned above).
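The closest I have come so far is a mix of both ideas: a regex for the sub header, then the core module Text::Balanced to find the matching curly, since its extract_codeblock function tries to skip over Perl quotes and quote-like operators. This is only a rough sketch under my own assumptions: the helper name extract_subs and the header pattern are mine, the pattern only covers simple prototypes and attributes, and Text::Balanced works by heuristics, so I would not trust it on every piece of Perl out there.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_codeblock);

# Header pattern (my own simplification): sub NAME [(PROTOTYPE)] [: ATTRIBUTES]
my $header_re = qr/
    ^ [ \t]*
    (                               # capture 1: the whole header
        sub \s+ (\w+) \s*           # capture 2: the sub name
        (?: \( [^)]* \) \s* )?      # optional prototype
        (?: : \s* \w+ \s* )*        # optional simple attributes
    )
    (?= \{ )                        # stop right before the opening curly
/mx;

# Hypothetical helper: returns a list of { name => ..., code => ... } hashes.
sub extract_subs {
    my ($source) = @_;
    my @subs;
    while ($source =~ /$header_re/g) {
        my ($header, $name) = ($1, $2);
        # Work on a copy that starts at the opening curly, so that
        # extract_codeblock begins matching at offset zero.
        my $rest = substr($source, pos($source));
        my ($block) = extract_codeblock($rest, '{');
        next unless defined $block && length $block;
        push @subs, { name => $name, code => $header . $block };
    }
    return @subs;
}

my $source = <<'END_SOURCE';
use strict;

sub tricky {
    my $closing = "}";
    return $closing;
}

print tricky(), "\n";

sub add($$) {
    my ($x, $y) = @_;
    if ($x) { return $x + $y }
    return $y;
}
END_SOURCE

my @subs = extract_subs($source);
print "$_->{name}:\n$_->{code}\n\n" for @subs;
```

On this small input the curly inside the string no longer ends the sub early, but I have not tried it on heredocs, regexes containing curlies, or attributes with arguments.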

To make a long story short, is there an easy way (a module or whatever) to extract subroutines from a Perl script using a Perl script?

"2b"||!"2b";$$_="the question"
Besides that, my code is untested unless stated otherwise.
One more: please review the article about regular expressions (do's and don'ts) I'm working on.

In reply to Extract subroutines from a Perl script. OR: "Only perl can parse Perl." But can it help me to do so? by muba
