A few comments on all these comments!

First, this is really all you need in *most* circumstances:

$textarea =~ s/\n/<BR>\n/g;

We substitute <BR>\n so that we get the effect:

Was:

blah
blah

Now:

blah<BR>
blah

If we sub just <BR> instead of <BR>\n we will get

blah<BR>blah

If you prefer to get

blah
<BR>
blah

then use \n<BR>\n as the replacement.
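
To compare the three replacement styles side by side, here is a minimal sketch. The helper name show_breaks and the sample text are made up for illustration and are not part of anyone's posted code.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every \n in a copy of the text
# with the given replacement string.
sub show_breaks {
    my ($text, $replacement) = @_;
    (my $copy = $text) =~ s/\n/$replacement/g;
    return $copy;
}

my $sample = "blah\nblah";

print show_breaks($sample, "<BR>\n"),   "\n---\n";  # tag, then a newline
print show_breaks($sample, "<BR>"),     "\n---\n";  # tag only, one long line
print show_breaks($sample, "\n<BR>\n"), "\n---\n";  # tag on a line of its own
```

Only the HTML source layout differs; a browser renders all three the same way.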

Depending on platform, the \n sequence is converted by perl to:

Unix: octal \012       hex 0x0A        dec 10      LF      may be \n
DOS:  octal \015\012   hex 0x0D 0x0A   dec 13 10   CRLF    may be \r\n
Mac:  octal \015       hex 0x0D        dec 13      CR      may be \r

Perl tries to let you just use \n as your newline delimiter and
sorts out the platform dependent details for you. Unfortunately,
many common *internet protocols* specify the \015\012 sequence,
and the values of Perl's \n and \r are not reliable since they
can and do vary from system to system.

I suspect that $textarea is named from its HTML source so you
will probably want to use a truly portable solution like this:

$textarea =~ s/\015\012|\015|\012/<br>\n/g;

If you prefer hex to octal :-)

$textarea =~ s/\xD\xA|\xD|\xA/<br>\n/g;

If you are confused by the \012 or \xA notation, all it says to
perl is: match the ASCII char with the value
decimal 10 == octal 12 == hex A == binary 1010
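
If you want to check the equivalence yourself, this small sketch prints the same character code in each base. It assumes an ASCII platform; the variable names are only for illustration.

```perl
use strict;
use warnings;

# \012 (octal), \x0A (hex) and chr(10) all name the same character, LF.
my $lf = "\012";
printf "LF: dec %d, oct %o, hex %X, bin %b\n", (ord $lf) x 4;

# Likewise for CR: \015 == \x0D == chr(13).
my $cr = "\015";
printf "CR: dec %d, oct %o, hex %X, bin %b\n", (ord $cr) x 4;

# prints:
# LF: dec 10, oct 12, hex A, bin 1010
# CR: dec 13, oct 15, hex D, bin 1101
```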

In expanded commented /x form:

$textarea =~ s/          # substitute
              \015\012   # a CRLF sequence (DOS, MIME...)
              |          # or
              \015       # a lone CR (mac)
              |          # or   
              \012       # a lone LF (unix)
              /<br>\n    # with literal '<br>' plus newline
              /xg;       # /x allow comments, /g do globally
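
As a quick check that the portable pattern treats all three line ending styles alike, here is a hedged sketch. The helper name nl2br and the sample strings are assumptions made for this example only.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the portable substitution.
sub nl2br {
    my ($text) = @_;
    $text =~ s/\015\012|\015|\012/<br>\n/g;
    return $text;
}

# The same two-line text with each platform's line ending.
my %samples = (
    unix => "one\012two",
    dos  => "one\015\012two",
    mac  => "one\015two",
);

for my $name (sort keys %samples) {
    my $html = nl2br($samples{$name});
    print "$name: ", ($html eq "one<br>\ntwo" ? "ok" : "MISMATCH"), "\n";
}
```

The CRLF alternative has to come first in the pattern, otherwise the lone \015 branch would match first and a CRLF would produce two tags.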


There are flaws, both major and minor, with *all* solutions posted: 

s|\n|<br />|g  
# with | as the delimiter the space and the / are plain literals, so
# this subs '<br />' (the XHTML form of the tag) for \n. Plain HTML
# only needs '<br>'. It also relies on \n (see above) and, with no
# \n in the replacement, leaves the HTML source on one long line.

tr/\n/<BR>/s   
# tr maps single characters, so this turns each \n into just '<'.
# Even allowing for using s instead of tr, you need /g, not /s.

s/\n/\<br\>/g;
# the escapes are harmless but both unnecessary. This is the first
# suggestion that will actually work (most of the time)

s|[\r\n]|<br />|g;
# this is wrong. Leaving aside the problems with using \r and \n
# and the XHTML style '<br />', the problem is this:
# if we have \r\n we will get two tags, <br><br>, but with a lone
# \r or \n we will get one <br>, so we get a different and platform
# dependent result. This is partially fixed by changing to:
s|[\r\n]+|<br>|g;
# however if we have \r\n\r\n or \n\n or \r\r we get just one <br>
# replacing a series of line breaks, probably not what we want

s,\r\n?|\n\r?,<br />\n,g;
# this suffers from \r \n problems, matches \n\r which is not a
# desired result and subs in the XHTML form '<br />' again
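
To make the difference between these patterns concrete, the sketch below counts how many tags each one would produce for a DOS style text containing a blank line. It assumes an ASCII platform where \r is \015 and \n is \012; the helper count_breaks and the labels are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: count how many times a pattern matches,
# i.e. how many <br> tags a s///g with that pattern would insert.
sub count_breaks {
    my ($pattern, $text) = @_;
    my $n = () = $text =~ /$pattern/g;
    return $n;
}

my %patterns = (
    'char class'      => qr/[\r\n]/,
    'char class plus' => qr/[\r\n]+/,
    'alternation'     => qr/\015\012|\015|\012/,
);

# Two CRLF line breaks in a row: the end of a line plus a blank line.
my $dos_blank_line = "para one\015\012\015\012para two";

for my $name (sort keys %patterns) {
    printf "%-16s %d\n", $name, count_breaks($patterns{$name}, $dos_blank_line);
}
```

The text has two line breaks. The plain character class reports 4, the + version reports 1, and only the alternation reports 2.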

Phew, I feel better now I've got that off my chest.

Finally, for those not familiar with the concept: you may use
*almost* any non-alphanumeric, non-whitespace char as a regex
delimiter. Thus we could use paired brackets:

$textarea =~ s(\015\012|\015|\012)(<br>\n)g;

Two different kinds of brackets (bracket delimiters nest, so the <br> inside <...> is fine):

$textarea =~ s{\015\012|\015|\012}<<br>\n>g;

Brackets then a pair of something else, even # chars

$textarea =~ s[\015\012|\015|\012]#<br>\n#g;

With brackets we can split the statement over several lines:

$textarea =~ s    
(\015\012|\015|\012)
[<br>\n]g;

With non-bracket delimiters we can even use ; if you are into obfuscation:
$textarea =~ s;\015\012|\015|\012;<br>\n;g;
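
A short sketch to confirm that the choice of delimiter does not change the result. The input string and variable names are made up for the test.

```perl
use strict;
use warnings;

my $input = "a\015\012b\015c\012d";
my $want  = "a<br>\nb<br>\nc<br>\nd";

# The same substitution written with four different delimiter styles.
(my $slashes = $input) =~ s/\015\012|\015|\012/<br>\n/g;
(my $parens  = $input) =~ s(\015\012|\015|\012)(<br>\n)g;
(my $mixed   = $input) =~ s{\015\012|\015|\012}<<br>\n>g;
(my $semis   = $input) =~ s;\015\012|\015|\012;<br>\n;g;

for my $got ($slashes, $parens, $mixed, $semis) {
    print $got eq $want ? "ok\n" : "differs\n";
}
```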

If our delimiter is included as a literal in the pattern we need to
backslash it (escape it) so that it takes on a literal meaning and
matches itself within the pattern rather than being taken by perl as
one of the regex delimiters.
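
For example, here is a minimal sketch of the same substitution with and without the escape. The path string and the :: replacement are arbitrary choices for illustration.

```perl
use strict;
use warnings;

my $path = "usr/local/bin";

# With / as the delimiter, a literal / in the pattern must be escaped.
(my $escaped = $path) =~ s/\//::/g;

# Picking a delimiter that does not occur in the pattern avoids the escape.
(my $braces = $path) =~ s{/}{::}g;

print "$escaped\n";   # usr::local::bin
print "$braces\n";    # usr::local::bin
```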

In a regex only these 12 characters need escaping (outside a
character class), although when in doubt it *generally* does no
harm to escape a punctuation character.

\ | ( ) [ { ^ $ * + ? .

All these chars have special meaning in a regex, and if you do much
with regexes you will soon get to know them by heart.
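
A small sketch of what the escaping buys you, using . as the example metacharacter. The sample strings are invented; quotemeta is Perl's builtin for escaping a whole string, shown here as one option rather than the only way.

```perl
use strict;
use warnings;

my @names = ("a.b", "axb");

# Unescaped, . matches any character (except newline), so both names match.
my @loose = grep { /^a.b$/ } @names;

# Escaped, \. matches only a literal dot.
my @exact = grep { /^a\.b$/ } @names;

# quotemeta (or \Q...\E inside the regex) escapes the metacharacters for you.
my $literal = quotemeta("1+1=2?");
my $found   = "is 1+1=2? yes" =~ /$literal/ ? 1 : 0;

print "loose: @loose\n";   # loose: a.b axb
print "exact: @exact\n";   # exact: a.b
print "found: $found\n";   # found: 1
```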


Cheers

tachyon

In reply to Re: converting carriage returns to br tags (was: Simple Question for you guys) by tachyon
in thread converting carriage returns to <br> tags (was: Simple Question for you guys) by Anonymous Monk