in reply to line endings

If I were in your shoes I would read perlport and then experiment. Sorry, I don't have your data to test my theories on!

I am assuming that each end of line was created with the enter key and doesn't come from the editor's word wrap.

Here goes....

From your code:

$text =~ s/\r/\n/gs;

If there is a \r, this replaces it with a \n, so a \r\n becomes \n\n, which will trigger an HTML para break in the next line. If you have a DOS text file on Unix you will get a para break inserted at every end of line. As for the modifiers: /g means "replace every occurrence, not just the first", so you do want that one. /s only changes what . matches (it lets . match a newline), and since there is no . in the pattern it does nothing here. I am not a regexp pro, so check with others, but why not:

$text =~ s/\r\n/\n/g;

When reading a DOS text file on UNIX, it will replace \r\n with a single \n, which is what I believe you want. UNIX files shouldn't be touched, since they contain no \r\n. This is what I did in my project.
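
To see the difference for yourself, here is a small sketch. The sample string and variable names are made up for illustration (I don't have your data), and both substitutions use /g so that every line ending is handled:

```perl
use strict;
use warnings;

# Hypothetical sample: two lines with DOS (CRLF) line endings.
my $dos = "first line\r\nsecond line\r\n";

# Original approach: every \r becomes \n, so each CRLF turns into \n\n.
(my $doubled = $dos) =~ s/\r/\n/g;

# Suggested approach: each CRLF pair collapses to a single \n.
(my $fixed = $dos) =~ s/\r\n/\n/g;

# Count the newlines in each result.
my $doubled_count = () = $doubled =~ /\n/g;
my $fixed_count   = () = $fixed   =~ /\n/g;

print "s/\\r/\\n/g    gives $doubled_count newlines\n";   # 4
print "s/\\r\\n/\\n/g gives $fixed_count newlines\n";     # 2
```

The first version doubles the newlines, which is what would produce a para break on every line.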

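The second one is supposed to just replace two \n's with a </p>\n<p>? I am not a regexp pro, so check with the others, but I would use

$text =~ s/(\n\s*){2}/<\/p>\n<p>/g;

The \s* allows for stray spaces or tabs on the "blank" line, and the /g makes it apply to every para break rather than just the first one.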
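
Here is a quick sketch of how that behaves, assuming the line endings have already been normalised to \n. The sample text and the wrapping in an outer <p>...</p> are my own assumptions for illustration, not something from your code:

```perl
use strict;
use warnings;

# Hypothetical sample: three paragraphs; the second gap has stray spaces in it.
my $text = "one\n\ntwo\n  \nthree";

# Replace each run of blank lines with a paragraph break.
$text =~ s/(\n\s*){2}/<\/p>\n<p>/g;

# Wrap the whole thing so the opening and closing tags pair up.
my $html = "<p>$text</p>";
print "$html\n";
```

Note that \s* will also swallow any indentation at the start of the following paragraph, which is usually harmless in HTML.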

I can't comment on reading both Unix and DOS text files under Windows, as I haven't done this.