Perl Regex to Fix line feed issue

by mmittiga17 (Scribe)
on May 09, 2008 at 18:59 UTC ( [id://685743]=perlquestion )

mmittiga17 has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,

I am trying to figure out a way to fix a text file that comes in with DOS line feeds. For the most part the lines are OK. However, there are a few lines that start with ^M and end in ^M. Ending in ^M is OK.

    some text line here^M
    ^Msome for text line here

What I need is a way to say: when a line ends in ^M and the next line begins with ^M, join the two lines and move the ^M to the end of the joined line.

Any suggestions or ideas will be greatly appreciated.

Thanks,
MM

Replies are listed 'Best First'.
Re: Perl Regex to Fix line feed issue
by shmem (Chancellor) on May 09, 2008 at 20:51 UTC

    What kind of file is it, actually? Smells like a csv file with multiline fields, in which those lines are separated by ^M. In any case, that smells like an XY problem.

    There's no such thing as a DOS line feed: a line feed is "\n" or ASCII 10. Carriage return is "\r" or ^M or ASCII 13. DOS line endings are "\r\n" (or CRLF). Establish what your proper line ending is (possibly "\r\n"), read the lines with $/ set to that line ending (see perlvar), then convert any (multiple) "\r" occurrences as per the specs of the task (what are those?). A few sample lines would be helpful for further advice.
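To illustrate the point about $/ (a minimal sketch, not code from this thread): the sample bytes and the helper name `read_crlf_records` are assumptions made up for the example, since no real input was posted. It reads CRLF-terminated records from an in-memory filehandle and shows that a stray "\r" inside a record survives, ready to be handled separately.

```perl
use strict;
use warnings;

# LF is ASCII 10, CR is ASCII 13.
printf "LF=%d CR=%d\n", ord("\n"), ord("\r");

# Hypothetical helper: split a string into CRLF-terminated records.
sub read_crlf_records {
    my ($bytes) = @_;
    open my $fh, '<', \$bytes or die "cannot open in-memory handle: $!";
    binmode $fh;
    local $/ = "\r\n";          # records end at CRLF, not at a bare LF
    my @records;
    while (my $rec = <$fh>) {
        chomp $rec;             # chomp removes the current value of $/
        push @records, $rec;
    }
    close $fh;
    return @records;
}

# Assumed sample data; the real file's bytes are unknown.
my @records = read_crlf_records("first line\r\nsecond\rline\r\n");
print scalar(@records), " records\n";
```

The second record still contains its embedded "\r", so whatever the task requires for those can be applied per record.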

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: Perl Regex to Fix line feed issue
by Your Mother (Archbishop) on May 09, 2008 at 20:21 UTC
    moo@cow[1]~>which rnfix
    rnfix: aliased to perl -pi.bk -e 's/\r\n?/\n/g'

    I find it very handy, hence the alias, but use with caution! It's fine for text files. It will break binary files. I added a .bk for the example. Take it out if you're sure you know what you're doing. (update, took out the /g, pretty sure that was just a stupid reflex; update, update: put it back, I need to lie down.)
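For reference, here is the same substitution applied to a string rather than to a file in place. This is only a sketch: the function name `normalize_eol` and the sample bytes are assumptions, based on how the problem lines in the question would look if "^M at the end" means CRLF and "^M at the start" means a stray CR.

```perl
use strict;
use warnings;

# Same substitution as the rnfix alias, applied to a string.
sub normalize_eol {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;    # CRLF or a lone CR becomes a single LF
    return $text;
}

# Assumed bytes for the problem lines in the question.
my $sample = "some text line here\r\n\rsome for text line here\r\n";
print normalize_eol($sample);
```

Under that assumption the stray CR becomes a line break of its own, so the two halves end up separated by an empty line instead of being joined. That may be why this kind of fix does not give the result the question asks for.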

      Thanks to all for the replies. Nothing seems to work, so I am trying a different approach: if a line ends in ^M and the next line starts with ^M, join the two lines, then remove the ^M^M from the middle of the joined line. Any thoughts?

        untested:

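        $/ = "\r\n";                 # read CRLF-terminated records
        my $l;                       # pending output line
        while (<>) {
            if (s/^\r(?!\n)//) {     # stray leading CR (but not an empty CRLF line)
                $l =~ s/\r\n\z// if defined $l;   # drop the previous CRLF so the lines join
                $l .= $_;
            }
            else {
                print $l if defined $l;
                $l = $_;
            }
        }
        print $l if defined $l;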

        although I don't know what the heck you are wanting to do with what files to what end.

        Could you show what you've tried? and perhaps some sample input?
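In the absence of real sample input, the join can also be sketched on a whole string. The bytes below are an assumption (CRLF at the end of the first line, a stray CR at the start of the next), and `join_split_lines` is a hypothetical name used only for illustration.

```perl
use strict;
use warnings;

# Join a CRLF-terminated line with a following line that starts with a stray CR.
sub join_split_lines {
    my ($text) = @_;
    # Remove "CRLF + stray CR"; the lookahead leaves genuine empty lines
    # (CRLF followed directly by CRLF) untouched.
    $text =~ s/\r\n\r(?!\n)//g;
    return $text;
}

# Assumed bytes for the problem lines, plus one ordinary line.
my $sample = "some text line here\r\n\rsome for text line here\r\nnext line\r\n";
my $fixed  = join_split_lines($sample);
print $fixed;
```

Note that the two halves are joined with nothing between them; if a space is wanted at the join, the replacement side of the substitution would need to supply it.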

        --shmem

Re: Perl Regex to Fix line feed issue
by pc88mxer (Vicar) on May 09, 2008 at 19:08 UTC
    If the carriage returns are right next to each other, something like this should work:

    use File::Slurp;
    my $text = read_file('filename');
    $text =~ s/\r\r/\r/g;
    print $text;

    Or you might use:

    $text =~ s/\r+/\r/g;

    if there could be multiple adjacent blank lines.

    Knowing exactly the structure of the file would help. What does od -c filename print out near those blank lines?
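If od is not at hand, a rough Perl stand-in can make the line-ending bytes visible. This is a sketch only: `show_eol` is a hypothetical helper, and the sample string is an assumption about what the problem lines contain.

```perl
use strict;
use warnings;

# Make CR and LF visible as \r and \n, similar to what one looks for in od -c output.
sub show_eol {
    my ($text) = @_;
    $text =~ s/\r/\\r/g;      # CR becomes the two characters backslash, r
    $text =~ s/\n/\\n\n/g;    # LF becomes backslash, n, plus a real newline for readability
    return $text;
}

print show_eol("some text line here\r\n\rsome for text line here\r\n");
```

For a real file, read it with binmode set on the filehandle so that no line-ending translation happens before the bytes are inspected.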

Re: Perl Regex to Fix line feed issue
by planetscape (Chancellor) on May 10, 2008 at 19:40 UTC

Node Type: perlquestion [id://685743]
Approved by Corion