in reply to Invisible characters

This is sort of tricky because it just looks like it is working. The first problem is that the \n is being kept; that's why ramlight's printout has an extra blank line in it. One way to remove the \n is with chomp(). But there is another problem: because you split only on the "," character, the space before the 2nd token is being kept with it. I modified your print statement to show this:
print "\"$field1\"\t\"$field2\"\n";

Output (I took the "" out of the input data because it made the printout too confusing):

"-0.500"	" 4.502e-6"
"-0.499"	" 4.474e-6"
"-0.498"	" 4.458e-6"
"-0.497"	" 4.445e-6"
"-0.496"	" 4.433e-6"
"-0.495"	" 4.421e-6"

One thing that could be done is split on any sequence of whitespace chars or the ",", like this:
my @fields = split /[\s,]+/;
That would also handle other "blank type" chars like \t. It also appears that you have "'s in the input that you don't want. One way to get rid of them would be s/"//g; which just deletes them all.
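
To make the split-and-strip idea concrete, here is a minimal sketch. The helper name split_fields and the sample lines are made up for illustration; only the s/"//g and split /[\s,]+/ pieces come from the suggestions above.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one raw input line into a list of clean fields.
sub split_fields {
    my ($line) = @_;
    chomp $line;           # drop the trailing newline
    $line =~ s/"//g;       # delete every double quote
    $line =~ s/^\s+//;     # leading whitespace would otherwise give an empty first field
    return split /[\s,]+/, $line;
}

# Made-up sample lines in the same shape as the data in this thread.
my @sample = ( qq{"-0.500, 4.502e-6"\n}, qq{"-0.499,\t4.474e-6"\n} );
for my $raw (@sample) {
    my ($field1, $field2) = split_fields($raw);
    print "\"$field1\"\t\"$field2\"\n";
}
```

With the quotes and the leading space gone, both fields should print without any stray blanks inside the "" marks.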

I suspect that you will find some unprintable character, as toolic suggests. One way to deal with this is to extend the regex that gets rid of the " char so that it deletes any character we don't want to see: s/[^\w\.\,\-]//g; gets rid of anything not contained within this set, which includes the " and \n chars (and the spaces, too). It is also possible to use "tr" for the same purpose. tr is a simple-minded thing and as such it is faster than the s/// type of operation. But fixing the input file is better, as this "weirdness" can fester and propagate.
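
A small sketch of both ideas, finding the invisible characters and then stripping them. The helper names (show_hidden, strip_with_tr, strip_with_regex) and the sample string are assumptions for illustration, not code from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: show anything outside printable ASCII as a \xNN escape.
sub show_hidden {
    my ($text) = @_;
    (my $visible = $text) =~ s/([^\x20-\x7E])/sprintf('\\x%02X', ord $1)/ge;
    return $visible;
}

# Keep only digits, '.', 'e', space, ',' and '-'; delete everything else.
# The '-' is placed last so that tr treats it as a literal, not a range.
sub strip_with_tr {
    my ($text) = @_;
    (my $clean = $text) =~ tr/0-9.e ,-//cd;
    return $clean;
}

# Same idea with a regex; \w keeps letters, digits and '_', and spaces are deleted.
sub strip_with_regex {
    my ($text) = @_;
    (my $clean = $text) =~ s/[^\w.,-]//g;
    return $clean;
}

my $raw = qq{"-0.500, 4.502e-6"\r\n};    # made-up line with a hidden \r
print show_hidden($raw), "\n";           # prints "-0.500, 4.502e-6"\x0D\x0A
print strip_with_tr($raw), "\n";         # prints -0.500, 4.502e-6
print strip_with_regex($raw), "\n";      # prints -0.500,4.502e-6
```

Running show_hidden on a few lines of the real file is a quick way to see which character is actually causing the trouble before deciding what to delete.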

Finally, you can use a list slice and get rid of the @fields intermediate variable. So here is some code with various possibilities... Hope some combination of these ideas works for you.

while (<DATA>) {
    chomp;    # optional: the tr below would delete the \n too
    # s/[^\w\.\,\-]//g;    # one possibility (also deletes the spaces)
    tr/0-9.e ,-//dc;       # another possibility: keep only these chars
                           # ('-' goes last so tr takes it literally)
    my ($field1, $field2) = (split /[\s,]+/)[0,1];
    print "\"$field1\"\t\"$field2\"\n";
}
__DATA__
"-0.500, 4.502e-6"
"-0.499, 4.474e-6"
"-0.498, 4.458e-6"
"-0.497, 4.445e-6"
"-0.496, 4.433e-6"
"-0.495, 4.421e-6"

Re^2: Invisible characters
by lomSpace (Scribe) on Feb 24, 2010 at 19:26 UTC
    Hi Marshall,
    The input file is an excel file saved as a text document. I am using a Mac to run the code.
    Any suggestions concerning the input file formatting?
    Lom Space
      Hi Lom Space!
      I take it that Excel is running on your Mac and that the Perl code is also running on that Mac?

      There can be problems with transferring text files between Mac, Unix and Windows because they use different "line termination" sequences: classic Mac OS uses \r, Unix (including Mac OS X) uses \n, and Windows uses \r\n, so there can be some "weirdness".
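
If the exported file does turn out to use bare \r terminators (some Mac applications still write them), a line-by-line read with the default \n separator would see the whole file as a single line. One way around that, sketched here with a hypothetical helper split_any_eol and made-up sample text, is to normalise all three terminators before splitting. It assumes the text is small enough to hold in memory.

```perl
use strict;
use warnings;

# Hypothetical helper: split text into lines whatever the terminator is.
sub split_any_eol {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;    # \r\n and a lone \r both become \n
    return split /\n/, $text;
}

# Made-up sample using old Mac style (\r only) terminators.
my $mac_style = "-0.500, 4.502e-6\r-0.499, 4.474e-6\r";
my @lines = split_any_eol($mac_style);
print scalar(@lines), " lines\n";    # prints 2 lines
print "[$_]\n" for @lines;
```

The brackets in the last print make it easy to spot a terminator that survived.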

      Put the chomp(); in the code. I've found Perl to be pretty smart about dealing with line termination issues. I haven't seen Excel export a "weirdo character" when doing a text export. If you can "cat" the file, then Perl can read it. Put the chomp() in and then just print without any splits or whatever. That should work.

      Now you may be exporting this spreadsheet as a .CSV file, which means "Comma Separated Value". Parsing this type of format is one of those things that appears easy, but is not so easy. There are a number of Perl modules that deal with CSV, but that doesn't appear to be your main problem. A CSV file is a text file.
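
As a rough illustration of why a plain split on "," is not enough for CSV, here is a sketch using the core Text::ParseWords module, which keeps a quoted field together even when it contains a comma. The helper name csv_fields and the sample line are made up, and this is not a full CSV parser (a stray apostrophe in the data can confuse it), so a dedicated module such as Text::CSV is the safer choice for real files.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

# Hypothetical helper: split one CSV-style line, honouring "quoted, fields".
sub csv_fields {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                # drop a trailing \n or \r\n
    return parse_line(',', 0, $line);    # 0 = strip the quotes from the fields
}

# Made-up sample line: the first field contains a comma inside quotes.
my @fields = csv_fields(qq{"Smith, John",42,"4.502e-6"\n});
print join('|', @fields), "\n";          # prints Smith, John|42|4.502e-6
```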

      Put the chomp() in and then just print the data lines and see if that works.