rgatliff has asked for the wisdom of the Perl Monks concerning the following question:

First Post:
I have a text file that contains a |-delimited list. Some of the text within the fields contains "\r" (carriage return), shown as ^M in XEmacs. Using "\n" as my end-of-line marker does not work, because it also treats the "\r" inside a field as a "\n".
For example:
$/="\f";
while (<MEGADATA>) {
print $_."^^^^";
}
outputs:
field1|field2
^^^^

^^^^|field3 etc

If I use the default $/="\n", it outputs:
field1|field2^^^^|field3
^^^^^
I have tried to pull up the specific characters in the text file in a binary editor to determine the difference in the characters without success. Any ideas from the Perl monks?

Replies are listed 'Best First'.
Re: Parsing a text file
by eg (Friar) on Feb 10, 2001 at 15:02 UTC

    What are you trying to do? I don't understand why you're messing with the input record separator ($/).

    With a |-delimited file, something like this works:

    while(<>) { chomp; my @array = split(/\|/); }

    and who cares if any of the elements in @array have \r?

    If you're looking to strip \r out of your text, it's just what you'd think it would be:

    s/\r//g;
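
A minimal sketch of combining the two suggestions above: split a record on the pipe, then strip carriage returns from each field. The sample line and the helper name `clean_fields` are hypothetical, made up for illustration, and not taken from the original data.

```perl
use strict;
use warnings;

# Hypothetical sample line: a pipe-delimited record with a stray carriage return.
my $line = "field1|fie\rld2|field3\r\n";

# Split one record on '|' and remove every carriage return from each field.
# clean_fields is an illustrative helper name, not part of any library.
sub clean_fields {
    my ($record) = @_;
    $record =~ s/\r?\n\z//;                 # drop the trailing line ending (LF or CRLF)
    my @fields = split /\|/, $record, -1;   # -1 keeps trailing empty fields
    s/\r//g for @fields;                    # strip embedded carriage returns
    return @fields;
}

my @fields = clean_fields($line);
print join(',', @fields), "\n";
```

Note that this only helps when the stray characters are bare carriage returns. If a field contains a full line break, the record has already been cut in two by the time the loop sees it.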

    (Welcome to perlmonks!)

    Update: Ah, I see. Ack! I hate dealing with sloppy data :)

      eg- The reason I was messing with the input record separator was that the text in the fields of the file seemed to contain the default record separator. When I loaded the data file in XEmacs, it showed the problem text fields containing ^M, whereas the records themselves were separated with a return. Both the record separator and the problem text areas showed as "0D 0A" in a hex editor, so I guess they are identical. I am sure I am missing a simple thing here, but using $/ as "|\n" (every record finishes with a |) is working. Thanks for the reply. I hope I have not made too much of an idiot of myself on my first post....
Re: Parsing a text file
by rgatliff (Sexton) on Feb 10, 2001 at 15:55 UTC
    Found my answer. Upon further investigation of the file in a hex editor, I found that the characters were identical (0D 0A) in both places. Because of this, I guess I either need to run a regex to determine 'end of record' or just set $/="|\n", which actually solved my problem. Thanks for the time. As a novice/beginning Perl guy, I have used the site often for code snippets.
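
The $/="|\n" approach described above can be sketched as follows. This is an illustration under assumptions, not the poster's actual script. The sample data, the in-memory filehandle, and the helper name `read_records` are all hypothetical. It also assumes the line endings reach Perl as "\n"; where 0D 0A is not translated on input, the separator would need to be "|\r\n" instead.

```perl
use strict;
use warnings;

# Hypothetical sample data: every record ends with "|\n", and the second
# field of the first record contains an embedded line break.
my $data = "field1|line one\nline two|field3|\n"
         . "a|b|c|\n";

# Read records from a filehandle using "|\n" as the input record separator.
# read_records is an illustrative helper name, not part of any library.
sub read_records {
    my ($fh) = @_;
    local $/ = "|\n";        # a record ends at a pipe followed by a newline
    my @records;
    while (my $record = <$fh>) {
        chomp $record;       # chomp removes the current value of $/
        push @records, [ split /\|/, $record, -1 ];
    }
    return @records;
}

# An in-memory filehandle stands in for the real data file.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @records = read_records($fh);
close $fh;

printf "%d records, first has %d fields\n", scalar @records, scalar @{ $records[0] };
```

Using `local` keeps the changed separator confined to the subroutine. The approach still breaks if a field itself begins with a line break, since that also produces a pipe followed by a newline.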