dottornomade has asked for the wisdom of the Perl Monks concerning the following question:
Hi there,
this is the first time something like this has happened to me.
I have this flat file and, simply put, the lines get all messed up when I print them on screen. I can't even parse them properly! This is very frustrating. Sadly, I cannot attach the file here. When I open and view it with gedit, everything looks fine. Here is a snippet:
    missense,0.40851449275362317,1.0,-100
    2.853,2.853,5.706,2.853,2.853,8.559,8.559,...
    missense,0.40851449275362317,1.0,0
    2.827,2.827,5.655,2.827,2.827,5.655,8.482,...
    frameshift,0.056074766355140186,0.7697841726618705,-64
    5.290,1.763,0.000,8.817,1.763,3.527,1.763,...
    missense,0.44542772861356933,1.0,0
Basically it alternates two kinds of lines: a title line and a data line.
Now, say I want to print the first line:
    open FILE, "<", "weird" or die $!;
    my @data = <FILE>;
    for (my $line=0; $line<1; $line++) { print $data[$line]; }
And it then prints ALL the odd lines! One after another, ignoring all the lines starting with >. I need to parse this file. When I split the lines with the split command, it treats the whole following odd line as the last element of an even line, and it ignores the actual last element. E.g., if I use:
    my @tmp = split(',', $data[0]);
    print $tmp[$#tmp]."\n".$tmp[$#tmp-1];
The actual output is:
    missense,0.40851449275362317,1.0, 2.853,2.853,5.706,2.853,2.853,8.559,8.559,... 0
Note that there is a \n in the middle of the first line, and the -100 value is missing.
What's happening? This file was made by a bot interacting with a server... I guess it may be something related to the file encoding, but I have no idea how to fix it.
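One way to test the encoding guess is to print a line with its control characters made visible. The sketch below is only a diagnostic under an unconfirmed assumption: that some lines end in a bare carriage return (\r) instead of \n, which would make the terminal overwrite one line with the next. The helper names show_hidden and split_records are hypothetical, and the sample string is made up for illustration, not taken from the real file.

```perl
use strict;
use warnings;

# Hypothetical helper: replace control and non-printable characters with
# visible escapes, so a hidden \r (or any other odd byte) shows up on screen.
sub show_hidden {
    my ($text) = @_;
    $text =~ s/\r/\\r/g;
    $text =~ s/\n/\\n/g;
    $text =~ s/\t/\\t/g;
    $text =~ s/([^\x20-\x7e])/sprintf('\\x%02x', ord $1)/ge;
    return $text;
}

# Hypothetical helper: split a chunk of text into records on any of the
# common line endings (\r\n, bare \r, or \n), dropping empty records.
sub split_records {
    my ($text) = @_;
    return grep { length($_) } split /\r\n|\r|\n/, $text;
}

# Made-up sample: ASSUMES a title line terminated by a bare \r,
# followed by a data line terminated by \n.
my $sample = "title,1.0,-100\r2.853,2.853\n";

print show_hidden($sample), "\n";
print "$_\n" for split_records($sample);
```

Running show_hidden on $data[0] from the real file would show whether a \r sits between the title and the data. If it does, splitting with split_records (or normalizing the line endings before parsing) should give one record per line.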