Purists are cringing at your apparent belief that <p> marks the end of a paragraph. It marks the beginning of a paragraph, which is then terminated by </p>. Your confusion is widespread and pardonable, because the terminal </p> is optional, and your orphan line at the beginning will usually be rendered exactly like a paragraph.
So here's how to do what you are trying to do:
s/(<pre>\n(?:[^\n]*<p>\n)*)([^>\n]*)\n(.*?<\/pre>)/$1$2<p>\n$3/ms
This assumes, as you do, that the opening <pre> is on a line of its own. I further assume that you start with no markup of any kind in your <pre> block. Each application of the substitution puts <p> at the end of the first line that doesn't yet contain markup, so repeat it (for example with 1 while) until it stops matching.
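For illustration, here is a minimal sketch of that substitution run to completion. The sample text and the 1 while wrapper are my own assumptions, not something taken from your post; the regex itself is the one above.

```perl
use strict;
use warnings;

# Hypothetical sample input: <pre> on a line of its own, no markup inside the block.
my $text = "<pre>\nfirst line\nsecond line\n</pre>\n";

# Without /g, each pass tags one more line. The loop stops once the only
# untagged line left is </pre>, which contains '>' and so cannot match $2.
1 while $text =~ s/(<pre>\n(?:[^\n]*<p>\n)*)([^>\n]*)\n(.*?<\/pre>)/$1$2<p>\n$3/ms;

print $text;
```

With this input, both content lines should come out ending in <p>, and the <pre> and </pre> lines should be left alone.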
I think my attempt may be the kind of thing you're looking for, but you may find further problems with this approach. Before you spend too much more time on this regex, I'd advise you to either process the file line-by-line (as you're already thinking of doing), or better yet, drop regexes altogether and learn about parsers.
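The line-by-line route might look roughly like the sketch below. The helper name tag_pre_lines and the sample text are hypothetical, and it relies on the same assumption as before: <pre> and </pre> each sit on a line of their own.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): append <p> to every line
# that sits between a <pre> line and the following </pre> line.
sub tag_pre_lines {
    my ($text) = @_;
    my $in_pre = 0;    # true while we are inside a <pre> block
    my @out;
    # The -1 limit keeps trailing empty fields, so a final newline survives.
    for my $line (split /\n/, $text, -1) {
        if ($line eq '<pre>') {
            $in_pre = 1;
            push @out, $line;
        }
        elsif ($line eq '</pre>') {
            $in_pre = 0;
            push @out, $line;
        }
        else {
            push @out, $in_pre ? $line . '<p>' : $line;
        }
    }
    return join "\n", @out;
}

print tag_pre_lines("intro\n<pre>\nfirst line\nsecond line\n</pre>\noutro\n");
```

A flag variable like this is easier to extend than the one-line regex (say, to skip lines that already carry markup), though it is still no substitute for a real HTML parser if the input gets messier.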
In reply to Re: Substitution inside tags, as 1 line
by Narveson
in thread Substitution inside tags, as 1 line
by tel2