Hi,

I've been programming Perl for a while now and I still suck at regexes.

My file looks like:

STARTP
TITLE
some gibberish
some more gibberish
ENDTITLE
TITLE
some gibberish
some more gibberish
ENDTITLE
TITLE
some gibberish
some more gibberish
ENDTITLE
ENDP
STARTP
TITLE
some gibberish
some more gibberish
ENDTITLE
TITLE
some gibberish
some more gibberish
ENDTITLE
TITLE
some gibberish
some more gibberish
ENDTITLE
ENDP

I read the following link: "How do I extract all text between two keywords like start and end?", but it didn't make me any wiser.

I want to get everything from STARTP until ENDP, and then cut up the stuff between TITLE and ENDTITLE. But if I do it like the suggested link, I get everything from the first STARTP until the last ENDP. I want to match from the first STARTP until the first ENDP in the file, then from the next STARTP until the next ENDP, and so on. The same goes for TITLE and ENDTITLE.

And no, there is no recursion in these tags.
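
To make the wanted behaviour concrete, here is a minimal sketch of first-to-first matching. It assumes the whole file fits in one string, that every marker sits alone on its own line, and that a non-greedy `.*?` is an acceptable way to stop at the nearest end marker. The sub name `parse_blocks` and the sample data are made up for illustration and are not from the linked node.

```perl
use strict;
use warnings;

# Hypothetical sample data in the same shape as the file above.
my $text = <<'END';
STARTP
TITLE
first a
first b
ENDTITLE
TITLE
second a
ENDTITLE
ENDP
STARTP
TITLE
third a
ENDTITLE
ENDP
END

# Returns one array ref per STARTP..ENDP block; each array holds
# the text found between TITLE and ENDTITLE inside that block.
sub parse_blocks {
    my ($input) = @_;
    my @blocks;
    # .*? is non-greedy, so each match stops at the nearest ENDP.
    # /s lets . match newlines; /m makes ^ and $ match at line
    # boundaries, so ^TITLE cannot match inside ENDTITLE.
    while ($input =~ /^STARTP\n(.*?)^ENDP$/msg) {
        my $body = $1;
        # In list context, /g returns every captured group.
        my @titles = $body =~ /^TITLE\n(.*?)^ENDTITLE$/msg;
        push @blocks, \@titles;
    }
    return @blocks;
}

my @blocks = parse_blocks($text);
for my $i (0 .. $#blocks) {
    printf "block %d has %d title section(s)\n",
        $i + 1, scalar @{ $blocks[$i] };
}
```

With the sample data above this should report two title sections for the first block and one for the second. A greedy `.*` in the outer pattern would instead swallow everything from the first STARTP to the last ENDP, which is the problem described above.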

thanx

--
My opinions may have changed,
but not the fact that I am right

Janitored by ybiC: Retitle from one-word "regex" nodetitle to avoid hindering site searching. Also converted node link from <a href...> to Monastery style [id://nnnn] to avoid logging out monks with cookie set from different PM domain (perlmonks.(org|net), sans leading "www"...)

In reply to Regex for simple parsing job by toadi
