How to remove a carriage return (\r\n)

monkfan has asked for the wisdom of the Perl Monks concerning the following question:

Re: How to remove a carriage return (\r\n) by borisz (Canon) on Nov 01, 2005 at 16:16 UTC
Look at `$/` (`$INPUT_RECORD_SEPARATOR`): `perl -MData::Dumper -e '$key = "test text\r\n"; local $/ = "\r\n"; chomp $key; print Dumper $key;'` Boris
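The one-liner above can also be wrapped in a small sub so that the change to `$/` stays local. This is only a sketch; the name `chomp_crlf` is made up for illustration and is not part of Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: remove one trailing CRLF by temporarily
# changing the record separator that chomp consults.
sub chomp_crlf {
    my ($text) = @_;
    local $/ = "\r\n";    # restored automatically when the sub returns
    chomp $text;
    return $text;
}

print "[", chomp_crlf("test text\r\n"), "]\n";    # prints [test text]
```

Note that with `$/` set to `"\r\n"`, a string ending in a bare `\n` is left untouched, because chomp only removes an exact match of `$/`.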
Re: How to remove a carriage return (\r\n) by radiantmatrix (Parson) on Nov 01, 2005 at 17:05 UTC
The chomp function removes the current 'input record separator' (stored in `$/`, see perlvar) from the end of a given string of text. You have two options to make it behave in the given circumstance. I will assume, from here, that you are reading lines from a file with Windows-style line endings (`\r\n`).

First, you could simply adjust your `$/`:

    local $/ = "\r\n";
    while (<DATA>) {
        chomp $_;
        print STDERR "'$_'\n";
    }

Of course, if you don't know what kind of line endings you have, you can simply convert all line endings to newlines first:

    while (<DATA>) {
        s/\r\n?/\n/g;    # now, an \r (Mac) or \r\n (Win) becomes \n (UNIX)
        chomp $_;
        print STDERR "'$_'\n";
    }

Seems a waste to run a regex *and* chomp, but you could remove chomp from the code above, and replace the regex with: `s/\r\n?|\n//g;`

Of course, if your input separator isn't set to the line ending the file actually uses, you'll read the whole file on the first pass through the while loop. You might consider an alternate strategy:

    open my $fh, '<', $file or die "Can't read '$file': $!";

    # find out what kind of line endings we have
    my $buffer;
    local $/ = undef;
    while ( read( $fh, $buffer, 1024 ) ) {
        if ( $buffer =~ m/(\r\n?|\n)/ ) {
            $/ = $1;    # set the input separator to what we found
            last;       # stop trying to find the separator
        }
    }
    close $fh;

    # now reopen the file and read line by line
    open $fh, '<', $file or die "Can't read '$file': $!";
    while (<$fh>) {
        chomp;
        print STDERR "'$_'\n";
    }
    close $fh;

There are cases I haven't dealt with (a `\r\n` pair split across two 1024-byte reads, for example), for purposes of simplicity.

<-radiant.matrix->
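As a sketch of the "convert everything to newlines first" idea from the post above, assuming the goal is to map Windows (`\r\n`) and old Mac (`\r`) endings to `\n` in a string that is already in memory. The sub name `normalize_eol` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: map \r\n and bare \r to \n.
sub normalize_eol {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;    # the optional \n is greedy, so a CRLF pair becomes one \n
    return $text;
}

my @lines = split /\n/, normalize_eol("one\r\ntwo\rthree\n");
print "'$_'\n" for @lines;    # 'one', 'two', 'three', each on its own line
```

After normalizing, a plain `split /\n/` or the default `chomp` works regardless of where the text came from.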
Re: How to remove a carriage return (\r\n) by pg (Canon) on Nov 01, 2005 at 16:30 UTC
To be a little bit cross-platform, just do:

    use strict;
    use warnings;

    {
        my $str = "abcd\r\n";
        $str =~ s/\r|\n//g;
        print "[$str]";
    }

    {
        my $str = "abcd\n";
        $str =~ s/\r|\n//g;
        print "[$str]";
    }

    {
        my $str = "abcd\r";
        $str =~ s/\r|\n//g;
        print "[$str]";
    }
Re^2: How to remove a carriage return (\r\n) by jgallagher (Pilgrim) on Nov 01, 2005 at 18:44 UTC
But what happens when `$str = "ab\ncd\r\n"`? Or can we assume there are no line breaks except at the end of lines?
Re^3: How to remove a carriage return (\r\n) by 3dbc (Monk) on Nov 13, 2012 at 16:13 UTC
`$line =~ s/\R//g;`
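To make the difference raised in this sub-thread concrete, here is a small sketch using the `"ab\ncd\r\n"` string from the question. `\R` (generic line break) needs Perl 5.10 or later; the two sub names are invented for the example.

```perl
use strict;
use warnings;
use 5.010;    # \R is available from Perl 5.10 on

# Hypothetical helper: remove every line break, embedded ones included.
sub strip_all_breaks {
    my ($text) = @_;
    $text =~ s/\R//g;
    return $text;
}

# Hypothetical helper: remove only a line break at the very end.
sub strip_final_break {
    my ($text) = @_;
    $text =~ s/\R\z//;
    return $text;
}

my $str = "ab\ncd\r\n";
print length( strip_all_breaks($str) ), " ", length( strip_final_break($str) ), "\n";    # prints 4 5
```

With `/g` and no anchor the embedded `\n` disappears as well (`abcd`); anchoring with `\z` keeps it (`ab\ncd`), which is usually what you want when cleaning up a single record.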
Re: How to remove a carriage return (\r\n) by philcrow (Priest) on Nov 01, 2005 at 16:16 UTC
Chomp removes the line ending native to your platform, because `$/` defaults to it. On Unix this is simply a line feed, which we often write `\n`. If chomp won't do it, try: `$key =~ s/\r\n//;`

Phil

Update: Explained why chomp usually removes your platform default line ending.
Re^2: How to remove a carriage return (\r\n) by VSarkiss (Monsignor) on Nov 01, 2005 at 16:29 UTC
"Chomp removes the line ending native to your platform."

Actually, chomp knows nothing about your platform. It removes `$/` at the end of the string, as borisz has correctly indicated above.
Re^2: How to remove a carriage return (\r\n) by tilly (Archbishop) on Nov 01, 2005 at 16:30 UTC
I prefer `$key =~ s/\r?\n//;` so it will handle either style of linefeed, whether your code is running on Windows or Unix. (On Windows, if binmode is off, you get what look like Unix linefeeds out of Windows linefeeds.)
Re: How to remove a carriage return (\r\n) by monarch (Priest) on Nov 02, 2005 at 01:06 UTC
Being a paranoid programmer myself, I always use: `sub remove_trailing_newline { $_[0] =~ s/[\r\n]+\Z//; }` no matter which platform I am on.

A real trick is trying to find blank lines in a slab of (multiline) text:

    if ( m/(\r\n|\n\r|\r|\n)\1/ ) {
        # two newlines in a row!
    }

Update: Thanks to rev_1318 for pointing out that the `$1` I originally had there needed to be replaced with `\1` in the regexp.
Re^2: How to remove a carriage return (\r\n) by rev_1318 (Chaplain) on Nov 02, 2005 at 12:45 UTC
`if ( m/(\r\n|\n\r|\r|\n)$1/ ) {`

You mean `if ( m/(\r\n|\n\r|\r|\n)\1/ ) {`

Backreferences inside the RE are written as `\1`, `\2`, etc.

Paul
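A short sketch of the corrected blank-line test in use. The sub name `has_blank_line` and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the text contains the same line ending
# twice in a row, i.e. an empty line.
sub has_blank_line {
    my ($text) = @_;
    return $text =~ /(\r\n|\n\r|\r|\n)\1/ ? 1 : 0;
}

print has_blank_line("para one\r\n\r\npara two\r\n"), "\n";    # prints 1
print has_blank_line("line one\r\nline two\r\n"),     "\n";    # prints 0
```

The backreference `\1` demands an exact repeat of whatever the group captured, so a single `\r\n` is not mistaken for two line endings.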
Re: How to remove a carriage return (\r\n) by jira0004 (Monk) on Nov 01, 2005 at 18:17 UTC
Hi,

It looks like you've gotten plenty of responses to your question, but as already mentioned, chomp removes whatever `$/` holds, which defaults to `\n` (0x0a); on Windows the 0x0d 0x0a pair is normally translated to `\n` when a file is read in text mode. If `$line` contains a line from your file and you want to remove either a UNIX line terminator or a Windows line terminator from the end of `$line`, you could do the following: `$line =~ s/\x0d{0,1}\x0a\Z//s;`

The Perl syntax `=~ s/<regular expression>/<replacement>/<qualifiers>` causes occurrence(s) of <regular expression> to be replaced by <replacement>, and the <qualifiers> indicate how that replacement should be performed.

`\x` followed by two hexadecimal digits matches the character in `$line` with that value: 0d is the hexadecimal value for carriage return and 0a is the hexadecimal value for newline (line feed).

An open curly brace '`{`', digit, comma, digit, close curly brace '`}`' gives the minimum and maximum number of times to match the preceding character, so `\x0d{0,1}` will match carriage return zero times or one time. Quantifiers are greedy by default, so the pattern matches as many times as it can: if it can match `\x0d`, then it will, but if there is no `\x0d`, that's okay (the `0` in `{0,1}` makes the match optional). `\x0a` matches the newline (line feed) character.

`\Z` matches at the end of the string, or just before a newline at the end of the string. Without the `m` qualifier, `$` behaves the same way; with the `m` qualifier, `$` also matches before every newline inside the string, while `\Z` still matches only at the end. The `s` qualifier only makes `.` match newline characters too; this pattern contains no `.`, so `s` is harmless here but not required.

Thus, `$line =~ s/\x0d{0,1}\x0a\Z//s;` will remove one line terminator from the end of `$line`, and it won't matter if it is a UNIX line terminator or a Windows line terminator.

Note that on old (pre-OS X) Macintosh the line terminator is `\x0d`. So you would need something like this: `$line =~ s/\x0d{0,1}\x0a{0,1}\Z//s;` This substitution would strip off the line terminators in a UNIX file, a Windows file or an old Macintosh file.

Note that in a substitution you can use `\s`, which matches the space character, the tab character, carriage return or line feed (and form feed). Thus, I usually use the following: `$line =~ s/\A\s+//s; $line =~ s/\s+\Z//s;` This strips all of the whitespace characters from the beginning and the end of `$line`, which may or may not be what you want. I usually ignore whitespace at the start or end of a line because it usually isn't useful.

Regards,
Peter Jirak
jira0004@yahoo.com
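The two substitutions from this post, wrapped in subs as a rough sketch. The names `strip_eol` and `trim` are hypothetical, `?` is used as shorthand for `{0,1}`, and `\z` (strict end of string) is swapped in for `\Z` so that exactly one terminator is removed.

```perl
use strict;
use warnings;

# Hypothetical helper: remove one trailing \r\n, \n or \r.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\x0d?\x0a?\z//;
    return $line;
}

# Hypothetical helper: remove all leading and trailing whitespace.
sub trim {
    my ($line) = @_;
    $line =~ s/\A\s+//;
    $line =~ s/\s+\z//;
    return $line;
}

print "[", strip_eol("unix\n"),         "]";    # [unix]
print "[", strip_eol("windows\r\n"),    "]";    # [windows]
print "[", strip_eol("oldmac\r"),       "]";    # [oldmac]
print "[", trim("  padded text \t\r\n"), "]\n";  # [padded text]
```

`strip_eol` leaves interior and leading whitespace alone, while `trim` is the more aggressive choice the post describes.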