If I were in your shoes I would read perlport and then experiment. I don't have your data to test my theories on, sorry!

I am assuming that each end of line was created with the Enter key and doesn't come from the editor's word wrap.

Here goes....

From your code:

$text =~ s/\r/\n/gs;

If there is a \r, replace it with a \n. So if you have a \r\n you now have \n\n, which will trigger an HTML para break in the next line. If you have a DOS text file on Unix you will get a para break inserted every time there is an end of line. About the modifiers: /g means global (replace every match, not just the first one), so you do need that. /s only changes what . matches, and there is no . in the pattern, so it does nothing here. I am not a regexp pro, so check with others, but why not:

$text =~ s/\r\n/\n/g;

When reading a DOS text file on UNIX it will replace \r\n with a single \n, which is what I believe you want. UNIX files shouldn't be touched. This is what I did in my project.
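To make that concrete, here is a minimal sketch of the line ending fix on its own. The sub name normalize_newlines and the sample string are made up for illustration, they are not from your code, and I have only reasoned about strings already in memory (a text-mode filehandle on Windows may have translated \r\n before you ever see it).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original code): convert DOS line
# endings to Unix ones and leave everything else untouched.
sub normalize_newlines {
    my ($text) = @_;
    $text =~ s/\r\n/\n/g;    # /g so every \r\n pair is replaced, not just the first
    return $text;
}

# Made-up sample data standing in for a DOS text file read on Unix.
my $dos  = "line one\r\nline two\r\n\r\nnew paragraph\r\n";
my $unix = normalize_newlines($dos);

print $unix eq "line one\nline two\n\nnew paragraph\n" ? "ok\n" : "not ok\n";
```

Text that already uses plain \n passes through unchanged, since the pattern only matches a \r that is directly followed by a \n.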

The second one is supposed to just replace two \n's with a </p>\n<p>? I am not a regexp pro so check with the others. I would use

$text =~ s/(\n\s*){2}/<\/p>\n<p>/g;
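Putting the two substitutions together, something like this sketch is what I have in mind. The sub name mark_paragraphs is hypothetical, and wrapping the whole text in <p> ... </p> is my assumption about what the rest of your code does with the result.

```perl
use strict;
use warnings;

# Hypothetical helper: turn each blank-line gap into a paragraph break.
sub mark_paragraphs {
    my ($text) = @_;
    $text =~ s/\r\n/\n/g;                    # DOS line endings to Unix first
    $text =~ s/(?:\n\s*){2}/<\/p>\n<p>/g;    # a gap of two or more newlines, whitespace allowed between them
    return "<p>$text</p>";                   # assumption: the caller wants the outer tags added here
}

# Made-up sample: two lines, a blank line, then a second paragraph.
print mark_paragraphs("first\r\nstill first\r\n\r\nsecond"), "\n";
```

A single \n inside a paragraph is left alone, because the pattern needs two newlines to match. Because \s also matches \n, a gap of three or more newlines still collapses into one paragraph break. Blank lines at the very end of the text would leave an empty <p></p> behind, so you may want to trim those first.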

I can't comment on reading both Unix and DOS text files under Windows as I haven't done this.


In reply to Re: line endings by blm
in thread line endings by thpfft
