in reply to Perl regex txt file new line (not recognised?)

I can confirm that your code works fine for me on Linux when the input file has LF line endings, but not when it has CRLF line endings, so I am guessing that is the problem. To confirm, see the Basic debugging checklist and dump your data with e.g. Data::Dumper with $Data::Dumper::Useqq=1; turned on, which prints any \r characters as visible escapes.

If CRLF is indeed the issue, there are a couple of ways to fix this:

You could add the :crlf PerlIO layer when reading the files, as in open (IN,'<:raw:encoding(UTF-8):crlf', $file). Note that I added the :raw because, technically, the CRLF conversion should happen after the decoding, although I believe that shouldn't really be a problem with UTF-8.

You could convert the files before processing, using e.g. fromdos from Tofrodos.

Or you could change the regex to adapt, as in s/^[a-z].*[a-z]\r?$//gmi (this wouldn't be my preferred solution, but TIMTOWTDI).
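To make the difference visible, here is a minimal sketch. The sample text, the temporary file, and the slurp helper are made up for illustration, and the substitution without \r? is only my assumption about what the original code looked like. It writes a small CRLF file, then compares the unmodified regex, the :crlf layer, and the \r? variant:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Data::Dumper;
$Data::Dumper::Useqq = 1;    # print \r and \n as visible escapes

# Hypothetical sample data, written with CRLF line endings.
my ($tmp_fh, $file) = tempfile(UNLINK => 1);
binmode $tmp_fh;             # write the bytes exactly as given
print {$tmp_fh} "Hello world\r\n123\r\nfoo bar\r\n";
close $tmp_fh or die "close: $!";

# Hypothetical helper: read a whole file through the given PerlIO layers.
sub slurp {
    my ($mode, $path) = @_;
    open my $in, $mode, $path or die "open $path: $!";
    local $/;                # no record separator: read everything at once
    my $text = <$in>;
    close $in;
    return $text;
}

my $raw       = slurp('<:raw:encoding(UTF-8)',      $file);  # keeps the \r
my $converted = slurp('<:raw:encoding(UTF-8):crlf', $file);  # CRLF becomes LF

# Assumed original regex on CRLF data: the \r sits between the last
# letter and the line end, so nothing matches.
(my $unchanged = $raw) =~ s/^[a-z].*[a-z]$//gmi;

# Same regex after the :crlf layer has converted the line endings.
(my $via_layer = $converted) =~ s/^[a-z].*[a-z]$//gmi;

# Adapted regex on the CRLF data: the optional \r is removed with the line.
(my $via_regex = $raw) =~ s/^[a-z].*[a-z]\r?$//gmi;

print Dumper($unchanged, $via_layer, $via_regex);
```

In the dump, the first value should still contain every line together with its \r\n, while the other two should have the alphabetic lines emptied. Note that the \r? variant leaves the CRLF endings of the untouched lines in place.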

Re^2: Perl regex txt file new line (not recognised?)
by LanX (Saint) on Jan 21, 2020 at 19:18 UTC
Re^2: Perl regex txt file new line (not recognised?)
by BillKSmith (Monsignor) on Jan 22, 2020 at 15:45 UTC
    haukex,

    I suspected the record-separator issue, so I wrote my test case (above) to allow me to experiment with it. I thought I had duplicated the original problem while simulating newline separators with the original regex, and so I concluded that the problem was with the regex. Your comments and the OP's conclusions strongly suggest that I was wrong. Can you explain how my memory file fails to simulate the test case you used to confirm the original code?

    OOPS, Simulation is correct. The problem is not duplicated. "Deleted" lines are replaced with a newline. This may be intentional.

    Bill
      Can you explain how my memory file fails to simulate the test case you used to confirm the original code?

      If you change the \n's (LF) in your $disk_file to \r\n (CRLF), you should see the problem that looks like what the OP was describing.
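As a rough sketch of that experiment: the contents of $disk_file and the loop below are made up for illustration and are not your actual test case, and the pattern is my assumption of the original regex. It reads an in-memory file with CRLF endings line by line:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the in-memory "disk file", using CRLF endings.
my $disk_file = "Hello world\r\n123\r\nfoo bar\r\n";

open my $in, '<', \$disk_file or die "open: $!";
my @kept;
while (my $line = <$in>) {
    chomp $line;    # removes only the "\n"; the "\r" stays on the line
    # Assumed original pattern: the line must start and end with a letter.
    push @kept, $line unless $line =~ /^[a-z].*[a-z]$/i;
}
close $in;

printf "%d of 3 lines kept\n", scalar @kept;
```

With the \r still attached, no line ends in a letter, so none of them is filtered out. Changing the \r\n in $disk_file back to \n should leave only the 123 line.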