
Thanks!

I modified yours to

$html_str =~ s/(.*)\n{2,}(.*)/$1\n<p>$2<\/p>/g;
Seems to work... Does it look right to you?

Re: Re: Re: Removing extra carriage returns...
by graff (Chancellor) on Apr 02, 2004 at 05:07 UTC
    Does it look right to you?

    Um, well, no, not really -- the AM's (Anonymous Monk's) approach is better, because it handles each occurrence of multiple line-breaks in a sensible, consistent way without affecting anything in between. Removing all the "\r" characters first is a nice touch, I think, and then replacing each run of two or more "\n" with a "br" tag does what is needed. (Basically all browsers would manage okay if you just used "<P>" instead of "<BR />", and if you like the results better with "p" tags, why not?)
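
For reference, here is a minimal sketch of that two-step idea. The AM's exact code is not quoted in this node, so the sub name `mark_paragraph_breaks` and the precise replacement string are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: the name and the "\n<br />\n" replacement are
# assumptions, not the AM's literal code.
sub mark_paragraph_breaks {
    my ($text) = @_;
    $text =~ s/\r//g;                 # drop carriage returns so only "\n" remains
    $text =~ s/\n{2,}/\n<br \/>\n/g;  # each run of 2+ newlines becomes one tag
    return $text;
}

my $sample = "First line\r\nsecond line\r\n\r\n\r\nNext paragraph\n\nLast one\n";
print mark_paragraph_breaks($sample);
```

Nothing between the line-break runs is captured or copied, so the text in between cannot be disturbed. Swapping the replacement for "\n<p>\n" would give the "p" tag variant.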

    By contrast, your approach goes to the trouble of trying to capture stuff before and after the 2-or-more line-breaks, just so it can copy it all back with "p" tags around some of it. It's not hard to come up with some examples that would cause close-P tags to show up in the wrong places. Try running that on the following text (with single and double line-breaks present just as shown), and see what happens:

    First paragraph is easy, no matter whether it's one line or several. But a second paragraph with single line-breaks in it will mess you up, because "." in regexes won't match line breaks, and adding the "s" modifier on the regex to "fix" that will just break things worse.
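
A small script along these lines shows the problem with the line breaks written out explicitly. The sub name `wrap_after_breaks` and the sample strings are made up for illustration; only the substitution itself is taken from the post above.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution quoted earlier in the thread.
sub wrap_after_breaks {
    my ($html_str) = @_;
    $html_str =~ s/(.*)\n{2,}(.*)/$1\n<p>$2<\/p>/g;
    return $html_str;
}

# Two paragraphs of two lines each, separated by a blank line.
my $sample = join "\n",
    "Para one, line 1",
    "Para one, line 2",
    "",
    "Para two, line 1",
    "Para two, line 2",
    "";

print wrap_after_breaks($sample);
```

Because "." stops at a line break, the second capture only grabs the first line of paragraph two, so the closing tag lands after "Para two, line 1" and the rest of that paragraph is left outside the "p" element.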