in reply to line endings
If I were in your shoes I would read perlport and then experiment. I don't have your data to test my theories on, sorry!
I am assuming that each end of line was created with the Enter key and doesn't come from the editor's word wrap.
Here goes....
From your code:
$text =~ s/\r/\n/gs;
If there is a \r, replace it with a \n. So if you have a \r\n you now have \n\n, which will trigger an HTML paragraph break in the next line. If you read a DOS text file on Unix you will get a paragraph break inserted at every end of line. Also, do you need the /s modifier? It only changes what . matches (single-line mode), and there is no . in this pattern. You do want to keep /g (global), though, otherwise only the first match gets replaced. I am not a regexp pro, so check with others, but why not:
$text =~ s/\r\n/\n/g;
When reading a DOS text file on Unix, it will replace each \r\n with a single \n, which is what I believe you want. Unix files shouldn't be touched. This is what I did in my project.
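To see the difference between the two substitutions, here is a rough sketch. The sample string is made up (I don't have your data), and /g is used on both so that every line ending is handled:

```perl
use strict;
use warnings;

# Hypothetical sample: two DOS-style lines, as they look when read on Unix.
my $dos = "first line\r\nsecond line\r\n";

# Original approach: every \r becomes \n, so each \r\n turns into \n\n.
(my $doubled = $dos) =~ s/\r/\n/g;

# Suggested approach: collapse each \r\n pair into a single \n.
(my $clean = $dos) =~ s/\r\n/\n/g;

# Count the newlines in each result.
my $doubled_count = () = $doubled =~ /\n/g;
my $clean_count   = () = $clean   =~ /\n/g;

print "s/\\r/\\n/g    gives $doubled_count newlines\n";
print "s/\\r\\n/\\n/g gives $clean_count newlines\n";
```

The first version doubles the newlines, which is what would feed the paragraph-break substitution a blank line after every line.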
The second one is supposed to just replace two \n's with </p>\n<p>, right? I am not a regexp pro, so check with the others, but I would use
$text =~ s/(\n\s*){2}/<\/p>\n<p>/g;
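A small sketch of that paragraph substitution, again on a made-up string rather than your data, and with /g added so that every blank line is converted and not just the first one:

```perl
use strict;
use warnings;

# Hypothetical sample: two paragraphs separated by a blank line.
my $text = "First paragraph.\n\nSecond paragraph.\n";

# A newline, optional whitespace, then another newline (plus any trailing
# whitespace) becomes a paragraph break. /g applies it to every blank line.
$text =~ s/(\n\s*){2}/<\/p>\n<p>/g;

print "<p>$text</p>\n";
```

The single \n at the end of the text is left alone because the pattern needs two newlines to match.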
I can't comment on reading both Unix and DOS text files under Windows, as I haven't done this.