in reply to How to remove a carriage return (\r\n)
Hi,
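It looks like you've gotten plenty of responses to your question, but as already mentioned, chomp removes a trailing copy of the input record separator $/, which defaults to "\n". On UNIX that is 0x0a. On Windows, a file read in text mode has each 0x0d 0x0a translated to "\n" first, so chomp effectively removes the platform native line delimiter there too. It does not remove the 0x0d when a Windows file is read on UNIX.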
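As a small sketch of why chomp alone may not be enough, the snippet below builds a Windows-style line by hand and chomps it. It assumes $/ still has its default value of "\n"; the variable names are only for illustration.

```perl
use strict;
use warnings;

# A line as it would look if a Windows file were read without CRLF translation.
my $line = "text\x0d\x0a";

# With the default $/ ("\n"), chomp removes only the trailing 0x0a.
chomp $line;

my $status = $line =~ /\x0d\z/ ? "carriage return still present" : "carriage return gone";
print "$status\n";
```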
If $line contains a line from your file and you want to remove either a UNIX line terminator or a Windows line terminator from the end of $line, you could do the following:
$line =~ s/\x0d{0,1}\x0a\Z//s;
The Perl syntax =~ s/<regular expression>/<replacement>/<qualifiers> replaces occurrence(s) of <regular expression> with <replacement>, and the <qualifiers> control how the match and replacement are performed.

\x followed by two hexadecimal digits matches the character in $line whose value is given by those digits: 0d is the hexadecimal value for carriage return and 0a is the hexadecimal value for newline (line feed).

An open curly brace '{', a digit, a comma, a digit and a close curly brace '}' give the minimum and maximum number of times to match the preceding character, so \x0d{0,1} matches carriage return zero times or one time (the same as \x0d?). Quantifiers are greedy (maximal) by default, so the pattern matches as many times as it can: if it can match \x0d it will, but if there is no \x0d, that's okay (the 0 in {0,1} makes the match optional). \x0a matches the newline (line feed) character.

\Z matches at the end of the string, or just before a newline that is the last character of the string; \z matches only at the absolute end. Without the m qualifier, $ behaves like \Z. With the m qualifier, $ also matches before every newline inside the string, while \Z and \z are unaffected. The s qualifier only changes what . matches (it lets . match a newline, so the contents of $line are treated as one single line). This pattern contains no ., so the s is harmless here but not required.
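To see the end-of-string anchors in action, here is a minimal sketch on a two-line string. The string and variable names are made up for the example.

```perl
use strict;
use warnings;

my $text = "one\ntwo\n";

# With /m, $ matches before every newline, so both words are found.
my @with_m = ($text =~ /(\w+)$/mg);

# Without /m, $ matches only at the end (or before a final newline).
my @without_m = ($text =~ /(\w+)$/g);

# \Z tolerates one trailing newline; \z insists on the absolute end.
my $big_z   = ($text =~ /two\Z/) ? 1 : 0;
my $small_z = ($text =~ /two\z/) ? 1 : 0;

print "with /m:    @with_m\n";
print "without /m: @without_m\n";
print "\\Z matched: $big_z, \\z matched: $small_z\n";
```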
Thus, $line =~ s/\x0d{0,1}\x0a\Z//s; will remove one line terminator from the end of $line, and it won't matter if it is a UNIX line terminator or a Windows line terminator. Note that on the classic (pre-OS X) Macintosh the line terminator is \x0d on its own. To handle that as well, you would need something like this:
$line =~ s/\x0d{0,1}\x0a{0,1}\Z//s;
This substitution would strip off the line terminators in a UNIX file, a Windows file or an old Macintosh file.
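Here is a minimal sketch that wraps the substitution above in a helper and runs it over all three kinds of line ending. The name strip_eol and the sample strings are only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: removes one trailing LF, CRLF or bare CR, if present.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\x0d{0,1}\x0a{0,1}\Z//s;
    return $line;
}

for my $raw ("unix\x0a", "windows\x0d\x0a", "oldmac\x0d", "none") {
    my $clean = strip_eol($raw);
    printf "%-8s (%d characters)\n", $clean, length $clean;
}
```

A line with no terminator at all passes through unchanged, because both parts of the pattern are optional.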
Note that in a substitution you can also use \s, which matches any whitespace character: space, tab, carriage return, line feed or form feed.
Thus, I usually use the following:
$line =~ s/\A\s+//s;
$line =~ s/\s+\Z//s;
These strip all of the whitespace characters from the beginning and the ending of $line, which may or may not be what you want. I usually ignore whitespace at the start or end of a line because it usually isn't useful.
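As a rough sketch, stripping whitespace from both ends can be packaged as a small function. The name trim is my own choice here, not a Perl builtin on older versions.

```perl
use strict;
use warnings;

# Hypothetical helper: strips leading and trailing whitespace,
# which also takes care of any trailing CR and/or LF.
sub trim {
    my ($line) = @_;
    $line =~ s/\A\s+//;   # leading whitespace
    $line =~ s/\s+\z//;   # trailing whitespace, including line terminators
    return $line;
}

my $cleaned = trim("  \thello world \x0d\x0a");
print "[$cleaned]\n";
```

Whitespace inside the line is left alone; only the two ends are touched.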
Regards,
Peter Jirak