in reply to Re: HTML String Parsing
in thread HTML String Parsing
Personally, I think you should use said module with the 'paras' arg instead of 'lines'. The reason is that the browser does an excellent job with text placement. If you are worried about width, just embed the resulting writeup in a <table>. Besides, 'paras' _does_ eliminate that unwanted white space.
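A minimal sketch of that suggestion, assuming the module in question is HTML::FromText and that its exported text2html() function is called with the 'paras' option. The sample text and the table width are made up for illustration:

```perl
use strict;
use warnings;
use HTML::FromText;    # CPAN module; exports text2html()

# Sample input (made up for illustration): two paragraphs
# separated by extra blank lines.
my $comment = "First paragraph,\nstill the first.\n\n\nSecond paragraph.\n";

# paras => 1 asks for paragraph markup rather than per-line breaks.
my $html = text2html($comment, paras => 1);

# Assumption: a fixed-width table (500 is an arbitrary value)
# to constrain the width of the writeup.
print qq{<table width="500"><tr><td>\n$html\n</td></tr></table>\n};
```

The exact markup that comes back depends on the installed version of the module, so check the output before relying on it.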
thraxil's solution is nice, by the way. Note how both \n and \r are accounted for. thraxil++
If you are still hell-bent on using <br> tags, then here is a hack I came up with, borrowing a little from thraxil and accounting for extra whitespace:
    $comment =~ s/(?:\s*[\n\r]\s*){2,}/\n<p>/g;
    $comment =~ s/[\n\r](?!<p>)/<br>\n/g;
    $comment =~ s/<p>/<p>\n/g;
The first regex replaces two or more newlines, surrounded by possible other whitespace, with a <p> on its own line (and if you think that the two \s* thingies are unnecessary, try this without 'em). I left the trailing newline out of the substitution because I just couldn't get a negative lookahead to work in the next regex. Hence the third regex. I am sure that there is a way to use a negative lookahead and avoid resorting to the third regex, but I would just use HTML::FromText anyway!
The second regex replaces all newlines that are not followed by a <p> tag with a <br> tag and newline. I would rather have had this work:
    $comment =~ s/(?:\s*[\n\r]\s*){2,}/\n<p>\n/g;
    $comment =~ s/(?!<p>)[\n\r](?!<p>)/<br>\n/g;
but as I said, this just didn't work. :( .o0(?)
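For what it's worth, a negative lookbehind may be what the two-regex version needs. A leading (?!<p>) is tested at the position of the newline itself, so it always succeeds, whereas (?<!<p>) looks at the text just before the newline. A minimal sketch under that assumption; the sub name br_paras and the line-ending normalization step are additions of this sketch, not part of the original hack:

```perl
use strict;
use warnings;

# Hypothetical helper (the name is illustrative): turn text with
# blank-line-separated paragraphs into <p>/<br> markup using two
# substitutions instead of three.
sub br_paras {
    my ($comment) = @_;

    # Assumption: normalize CRLF and bare CR to LF first, so that a
    # single "\r\n" is not counted as two line breaks.
    $comment =~ s/\r\n?/\n/g;

    # Two or more newlines, with optional surrounding whitespace,
    # become a <p> on its own line.
    $comment =~ s/(?:\s*\n\s*){2,}/\n<p>\n/g;

    # Any remaining newline that is neither preceded nor followed
    # by <p> gets a <br> in front of it.
    $comment =~ s/(?<!<p>)\n(?!<p>)/<br>\n/g;

    return $comment;
}

print br_paras("one\ntwo\n\n\nthree\n");
```

For the sample input this prints "one<br>", "two", "<p>" and "three<br>" on successive lines.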
UPDATE:
Looks like you have your solution, but consider how much time it takes (barring educational purposes, of course) to figure out these little details instead of finding a CPAN module, especially when putting together a site. Granted, this one didn't do exactly what you need, but do you really need 'exactly' what you need? (Ask that question of the great film makers.)
jeffa
Re: HTML String Parsing
by Ionizor (Pilgrim) on Dec 13, 2001 at 11:33 UTC