iKnowNothing has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I have run across a problem where I am losing some precision when writing numbers to a file, and then re-reading them for future use. I have captured an example of my problem in the following script:
#sample data set
$Day          = 0;
$Seconds      = 35247;
$MicroSeconds = 755605;

#combine times into one variable
$Time = $Day + ($Seconds + $MicroSeconds/1000000)/86400;

#write this time to a file
open(FILE, "> test.txt");
printf FILE ("%1.20f", $Time);
close(FILE);

#read that time back in from file
open(FILE, "< test.txt");
@FileContents = <FILE>;
close(FILE);
$FileTime = $FileContents[0];

#calculate difference between the time in memory and the time read from file
$Difference = $Time - $FileTime;

#print results
print "\nFrom Memory: ";
printf("%1.20f", $Time);
print "\nFromFile : ";
printf("%1.20f", $FileTime);
print "\nDifference : ";
printf("%1.20f", $Difference);
When I ran this on my Windows XP machine, it produced the output:
From Memory: 0.40796013431712963000
FromFile : 0.40796013431712957000
Difference : 0.00000000000000005551
However, when I looked in the "test.txt" file, the number 0.40796013431712963000 was sitting in there. Why is this different from the "FromFile" number? Is there a way to make sure that the number is read exactly as it is from the file? Thanks for the help.

Replies are listed 'Best First'.
Re: losing precision reading numbers from file
by ikegami (Patriarch) on Nov 11, 2005 at 23:52 UTC

    Perl uses doubles for floating point numbers. Doubles have 52 bits of mantissa precision, which translates to roughly 15 decimal digits of precision (log10(2^52) ≈ 15.7). Therefore, an error in the 16th significant digit shouldn't be unexpected when converting the string to a number. Remember, to convert 0.4079... from a string to a number, the computer has to do something like 4×10^-1 + 0×10^-2 + 7×10^-3 + 9×10^-4 + 6×10^-5 + ..., and some numbers (such as 0.4, IIRC) have periodic (repeating) representations in binary.
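    You can see this for yourself with a minimal sketch (the digits shown are what a typical IEEE 754 double produces; the exact output may vary by platform and C runtime):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # 0.4 repeats forever in binary (0.0110011001100...), so the nearest
    # double is not exactly 0.4; asking for 20 digits exposes the gap.
    printf("%.20f\n", 0.4);   # 0.40000000000000002220 on many platforms

    # The OP's value behaves the same way: what prints depends on how the
    # decimal literal rounds to the nearest representable double.
    printf("%.20f\n", 0.40796013431712963);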

Re: losing precision reading numbers from file
by Ovid (Cardinal) on Nov 11, 2005 at 23:59 UTC

    ikegami has explained the problem, but I'd like to make it clear that this doesn't have anything to do with files. The following demonstrates the problem:

    my $Time = 0.40796013431713;
    print $Time, sprintf("\n%1.20f", $Time);

    Cheers,
    Ovid


Re: losing precision reading numbers from file
by hossman (Prior) on Nov 12, 2005 at 00:18 UTC

    I believe that if you switch to using Math::BigFloat you will see this discrepancy go away -- at the expense of performance.

    Another option is to store all of your numbers as an integer count of the smallest unit you use -- in this case, microseconds. That way you gain precision (over floats) and performance (over Math::BigFloat) at the cost of range (i.e., very large values risk integer overflow). A sketch follows below.
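    Here's a minimal sketch of that idea, using the OP's sample values (the variable names are just for illustration):

    #!/usr/bin/perl
    use strict;
    use warnings;

    my ($Day, $Seconds, $MicroSeconds) = (0, 35247, 755605);

    # One integer: total microseconds since the start of day 0.
    my $usec = (($Day * 86_400) + $Seconds) * 1_000_000 + $MicroSeconds;

    open(my $out, '>', 'test.txt') or die "open: $!";
    print $out "$usec\n";
    close($out);

    open(my $in, '<', 'test.txt') or die "open: $!";
    chomp(my $back = <$in>);
    close($in);

    print "exact match\n" if $back == $usec;   # integers round-trip exactly

    # Convert to a fractional day only at the last moment, for display:
    printf("%1.20f\n", $back / 1_000_000 / 86_400);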

Re: losing precision reading numbers from file
by pg (Canon) on Nov 12, 2005 at 03:09 UTC

    Here is a table for reference:

    Type     Approximate range                    Precision
    float    ±1.5 × 10^-45  to ±3.4 × 10^38       7 digits
    double   ±5.0 × 10^-324 to ±1.7 × 10^308      15-16 digits
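
    You don't have to take the table on faith: Perl's core POSIX module exposes the float.h limits the interpreter was built against. A quick sketch (the values in the comments are typical, not guaranteed):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use POSIX qw(FLT_DIG DBL_DIG DBL_MANT_DIG DBL_EPSILON);

    printf("float  decimal digits: %d\n", FLT_DIG);       # typically 6
    printf("double decimal digits: %d\n", DBL_DIG);       # typically 15
    printf("double mantissa bits : %d\n", DBL_MANT_DIG);  # typically 53
    printf("double epsilon       : %g\n", DBL_EPSILON);   # ~2.22e-16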
Re: losing precision reading numbers from file
by BrowserUk (Patriarch) on Nov 12, 2005 at 00:22 UTC

    See Re: machine accuracy, Re: Bug? 1+1 != 2 and Re: Re: Re: Bug? 1+1 != 2.


Re: losing precision reading numbers from file
by iKnowNothing (Scribe) on Nov 11, 2005 at 23:30 UTC
    I've done some more investigation and modified the script to also print the in-memory number after passing it through sprintf. Here's the code:
    #sample data set
    $Day          = 0;
    $Seconds      = 35247;
    $MicroSeconds = 755605;

    #combine times into one variable
    $Time  = $Day + ($Seconds + $MicroSeconds/1000000)/86400;
    $sTime = sprintf("%1.20f", $Time);

    #write this time to a file
    open(FILE, "> test.txt");
    printf FILE ("%1.20f", $Time);
    close(FILE);

    #read that time back in from file
    open(FILE, "< test.txt");
    @FileContents = <FILE>;
    close(FILE);
    $FileTime = $FileContents[0];

    #calculate difference between the time in memory and the time read from file
    $Difference = $Time - $FileTime;

    #print results
    print "\nFrom Memory: ";
    printf("%1.20f", $Time);
    print "\nFromsprintf: ";
    printf("%1.20f", $sTime);
    print "\nFromFile : ";
    printf("%1.20f", $FileTime);
    print "\nDifference : ";
    printf("%1.20f", $Difference);
    This produced the output:
    From Memory: 0.40796013431712963000
    Fromsprintf: 0.40796013431712957000
    FromFile : 0.40796013431712957000
    Difference : 0.00000000000000005551
    Notice that when I print the variable directly using printf I get the number ending in 63. However, if I use sprintf on that number first, it gets changed to the number ending in 57. Anybody know what is going on?
      Thanks for the replies. It looks like I can keep the precision if I use binary files (see the sketch below); I was hoping I wouldn't have to go there.
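      For the record, a minimal sketch of the binary-file route (assuming the file is written and read on the same machine, since pack's native "d" format is platform-dependent):

      #!/usr/bin/perl
      use strict;
      use warnings;

      my $Time = (35247 + 755605 / 1_000_000) / 86_400;

      # Write the raw bytes of the native double; no decimal conversion
      # ever happens, so no precision is lost.
      open(my $out, '>', 'test.bin') or die "open: $!";
      binmode($out);
      print $out pack('d', $Time);
      close($out);

      open(my $in, '<', 'test.bin') or die "open: $!";
      binmode($in);
      read($in, my $bytes, length pack('d', 0));
      close($in);
      my $FileTime = unpack('d', $bytes);

      printf("Difference : %1.20f\n", $Time - $FileTime);   # exactly zero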
Re: losing precision reading numbers from file
by spiritway (Vicar) on Nov 12, 2005 at 01:34 UTC

    Instead of using floats, you might consider using rationals (or bigrats). This preserves precision because you retain the number as-is: 1/3 stays 1/3, not 0.333..., so you don't get that rounding problem. Converting to a floating-point representation when needed is not difficult, and often it isn't needed at all, since you're interested in integer values for times. A sketch follows below.
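    A minimal sketch with the Math::BigRat module, using the OP's sample values (the fraction in the comment is what the reduced form works out to):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Math::BigRat;   # exact rational arithmetic, slower than native floats

    my ($Day, $Seconds, $MicroSeconds) = (0, 35247, 755605);

    # Keep the time as an exact fraction of a day; nothing is ever rounded.
    my $Time = Math::BigRat->new($Day)
             + (Math::BigRat->new($Seconds)
                + Math::BigRat->new($MicroSeconds) / 1_000_000) / 86_400;

    print "$Time\n";   # 7049551121/17280000000

    # The "numerator/denominator" string round-trips through a file losslessly:
    my $again = Math::BigRat->new("$Time");
    print "match\n" if $again == $Time;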

      Not all floats are rationals, for example sqrt(2). If you force it to a rational, the process itself reduces precision.

        Not all floats are rationals, for example sqrt(2). If you force it to a rational, the process itself reduces precision.

        What you say is mathematically true, but computationally false. The decimal representation of sqrt(2) requires an infinite number of digits, and hence infinite memory, which would be very expensive. Almost all "floats" in computers are already approximations, and often bad ones. It is a simple matter to represent a rational number as an ordered pair of integers: 1/3 would simply be (1, 3). In floating point, you'd get something like 0.3333333333333333, which is close but inexact. This will come back and bite you unless you take extraordinary precautions, like keeping track of a delta and treating two values as equal when they differ by less than that delta. This is awkward and often unnecessary.

        The OP's interest appeared to be in keeping track of dates, which can of course be represented as integers, even if only as the integer number of seconds (or milliseconds, or whatever) between two events. Simple integer calculations can then recover more useful values such as days, hours, and minutes, as in the sketch below.
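        A minimal illustration of that integer-only breakdown, using the second count from the OP's sample data:

        #!/usr/bin/perl
        use strict;
        use warnings;

        # Split an integer second count into days/hours/minutes/seconds
        # using only integer arithmetic, so nothing is ever approximated.
        my $total = 35_247;   # seconds since the start of the day

        my $days    = int($total / 86_400);
        my $hours   = int(($total % 86_400) / 3_600);
        my $minutes = int(($total % 3_600) / 60);
        my $seconds = $total % 60;

        printf("%d days, %02d:%02d:%02d\n", $days, $hours, $minutes, $seconds);
        # prints: 0 days, 09:47:27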