iKnowNothing has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I have run across a problem where I am losing some precision when writing numbers to a file, and then re-reading them for future use. I have captured an example of my problem in the following script:
#sample data set
$Day          = 0;
$Seconds      = 35247;
$MicroSeconds = 755605;

#combine times into one variable
$Time = $Day + ($Seconds + $MicroSeconds/1000000)/86400;

#write this time to a file
open(FILE, "> test.txt");
printf FILE ("%1.20f", $Time);
close(FILE);

#read that time back in from file
open(FILE, "< test.txt");
@FileContents = <FILE>;
close(FILE);
$FileTime = $FileContents[0];

#calculate difference between the time in memory and the time read from file
$Difference = $Time - $FileTime;

#print results
print "\nFrom Memory: ";
printf("%1.20f", $Time);
print "\nFromFile : ";
printf("%1.20f", $FileTime);
print "\nDifference : ";
printf("%1.20f", $Difference);
When I ran this on my Windows XP machine, it produced the output:
From Memory: 0.40796013431712963000
FromFile : 0.40796013431712957000
Difference : 0.00000000000000005551
However, when I looked in the "test.txt" file, the number 0.40796013431712963000 was sitting in there. Why is this different from the "FromFile" number? Is there a way to make sure that the number is read exactly as it is from the file? Thanks for the help.

Replies are listed 'Best First'.
Re: losing precision reading numbers from file
by ikegami (Patriarch) on Nov 11, 2005 at 23:52 UTC

    Perl uses doubles for floating point numbers. Doubles have 52 bits of mantissa precision, which translates to roughly 15 decimal digits of precision (log10(2^52) ≈ 15.7). Therefore, an error in the 16th significant digit shouldn't be unexpected when converting the string to a number. Remember, to convert 0.4079... from a string to a number, the computer has to do something like 4×10^-1 + 0×10^-2 + 7×10^-3 + 9×10^-4 + 6×10^-5 + ..., and some numbers (such as 0.4, IIRC) have periodic (repeating) representations in binary.
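    You can see this for yourself with a minimal sketch (the digits shown are what a typical IEEE 754 double produces; the exact output may vary by platform and C runtime):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # 0.4 repeats forever in binary (0.0110011001100...), so the nearest
    # double is not exactly 0.4; asking for 20 digits exposes the gap.
    printf("%.20f\n", 0.4);   # 0.40000000000000002220 on many platforms

    # The OP's value behaves the same way: what prints depends on how the
    # decimal literal rounds to the nearest representable double.
    printf("%.20f\n", 0.40796013431712963);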

Re: losing precision reading numbers from file
by Ovid (Cardinal) on Nov 11, 2005 at 23:59 UTC

    ikegami has explained the problem, but I'd like to make it clear that this doesn't have anything to do with files. The following demonstrates the problem:

    my $Time = 0.40796013431713;
    print $Time, sprintf("\n%1.20f", $Time);

    Cheers,
    Ovid


Re: losing precision reading numbers from file
by hossman (Prior) on Nov 12, 2005 at 00:18 UTC

    I believe that if you switch to using Math::BigFloat you will see this discrepancy go away -- at the expense of performance.

    Another option is to store all of your numbers as an integer count of the smallest unit you use -- in this case, microseconds. That way you gain precision (over floats) and performance (over Math::BigFloat) at the cost of range (i.e., very large values risk integer overflow). A sketch follows below.
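    Here's a minimal sketch of that idea, using the OP's sample values (the variable names are just for illustration):

    #!/usr/bin/perl
    use strict;
    use warnings;

    my ($Day, $Seconds, $MicroSeconds) = (0, 35247, 755605);

    # One integer: total microseconds since the start of day 0.
    my $usec = (($Day * 86_400) + $Seconds) * 1_000_000 + $MicroSeconds;

    open(my $out, '>', 'test.txt') or die "open: $!";
    print $out "$usec\n";
    close($out);

    open(my $in, '<', 'test.txt') or die "open: $!";
    chomp(my $back = <$in>);
    close($in);

    print "exact match\n" if $back == $usec;   # integers round-trip exactly

    # Convert to a fractional day only at the last moment, for display:
    printf("%1.20f\n", $back / 1_000_000 / 86_400);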

Re: losing precision reading numbers from file
by pg (Canon) on Nov 12, 2005 at 03:09 UTC

    Here is a table for reference:

    Type     Approximate range                    Precision
    float    ±1.5 × 10^-45  to ±3.4 × 10^38       7 digits
    double   ±5.0 × 10^-324 to ±1.7 × 10^308      15-16 digits
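
    You don't have to take the table on faith: Perl's core POSIX module exposes the float.h limits the interpreter was built against. A quick sketch (the values in the comments are typical, not guaranteed):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use POSIX qw(FLT_DIG DBL_DIG DBL_MANT_DIG DBL_EPSILON);

    printf("float  decimal digits: %d\n", FLT_DIG);       # typically 6
    printf("double decimal digits: %d\n", DBL_DIG);       # typically 15
    printf("double mantissa bits : %d\n", DBL_MANT_DIG);  # typically 53
    printf("double epsilon       : %g\n", DBL_EPSILON);   # ~2.22e-16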
Re: losing precision reading numbers from file
by BrowserUk (Patriarch) on Nov 12, 2005 at 00:22 UTC

    See Re: machine accuracy, Re: Bug? 1+1 != 2 and Re: Re: Re: Bug? 1+1 != 2.


Re: losing precision reading numbers from file
by iKnowNothing (Scribe) on Nov 11, 2005 at 23:30 UTC
    I've done some more investigation and modified the script to also print the in-memory number after passing it through sprintf. Here's the code:
    #sample data set
    $Day          = 0;
    $Seconds      = 35247;
    $MicroSeconds = 755605;

    #combine times into one variable
    $Time  = $Day + ($Seconds + $MicroSeconds/1000000)/86400;
    $sTime = sprintf("%1.20f", $Time);

    #write this time to a file
    open(FILE, "> test.txt");
    printf FILE ("%1.20f", $Time);
    close(FILE);

    #read that time back in from file
    open(FILE, "< test.txt");
    @FileContents = <FILE>;
    close(FILE);
    $FileTime = $FileContents[0];

    #calculate difference between the time in memory and the time read from file
    $Difference = $Time - $FileTime;

    #print results
    print "\nFrom Memory: ";
    printf("%1.20f", $Time);
    print "\nFromsprintf: ";
    printf("%1.20f", $sTime);
    print "\nFromFile : ";
    printf("%1.20f", $FileTime);
    print "\nDifference : ";
    printf("%1.20f", $Difference);
    This produced the output:
    From Memory: 0.40796013431712963000
    Fromsprintf: 0.40796013431712957000
    FromFile : 0.40796013431712957000
    Difference : 0.00000000000000005551
    Notice that when I print the variable directly using printf I get the number ending in 63. However, if I use sprintf on that number first, it gets changed to the number ending in 57. Anybody know what is going on?
      Thanks for the replies. It looks like I can keep the precision if I use binary files (see the sketch below); I was hoping I wouldn't have to go there.
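      For the record, a minimal sketch of the binary-file route (assuming the file is written and read on the same machine, since pack's native "d" format is platform-dependent):

      #!/usr/bin/perl
      use strict;
      use warnings;

      my $Time = (35247 + 755605 / 1_000_000) / 86_400;

      # Write the raw bytes of the native double; no decimal conversion
      # ever happens, so no precision is lost.
      open(my $out, '>', 'test.bin') or die "open: $!";
      binmode($out);
      print $out pack('d', $Time);
      close($out);

      open(my $in, '<', 'test.bin') or die "open: $!";
      binmode($in);
      read($in, my $bytes, length pack('d', 0));
      close($in);
      my $FileTime = unpack('d', $bytes);

      printf("Difference : %1.20f\n", $Time - $FileTime);   # exactly zero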
Re: losing precision reading numbers from file
by spiritway (Vicar) on Nov 12, 2005 at 01:34 UTC

    Instead of using floats, you might consider using rationals (or bigrats). This preserves precision because you retain the number as-is: 1/3 stays 1/3, not 0.333..., so you don't get that rounding problem. Converting to a floating-point representation when needed is not difficult, and often it isn't needed at all, since you're interested in integer values for times. A sketch follows below.
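    A minimal sketch with the Math::BigRat module, using the OP's sample values (the fraction in the comment is what the reduced form works out to):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Math::BigRat;   # exact rational arithmetic, slower than native floats

    my ($Day, $Seconds, $MicroSeconds) = (0, 35247, 755605);

    # Keep the time as an exact fraction of a day; nothing is ever rounded.
    my $Time = Math::BigRat->new($Day)
             + (Math::BigRat->new($Seconds)
                + Math::BigRat->new($MicroSeconds) / 1_000_000) / 86_400;

    print "$Time\n";   # 7049551121/17280000000

    # The "numerator/denominator" string round-trips through a file losslessly:
    my $again = Math::BigRat->new("$Time");
    print "match\n" if $again == $Time;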

      Not all floats are rationals, for example sqrt(2). If you force it to a rational, the process itself reduces precision.

        Not all floats are rationals, for example sqrt(2). If you force it to a rational, the process itself reduces precision.

        What you say is mathematically true, but computationally false. The decimal representation of sqrt(2) requires an infinite number of digits, and hence infinite memory, which would be very expensive. Almost all "floats" in computers are already approximations, and often bad ones. It is a simple matter to represent a rational number as an ordered pair of integers: 1/3 would simply be (1, 3). In floating point, you'd get something like 0.3333333333333333, which is close but inexact. This will come back and bite you unless you take extraordinary precautions, like keeping track of a delta and treating two values as equal when they differ by less than that delta. This is awkward and often unnecessary.

        The OP's interest appeared to be in keeping track of dates, which can of course be represented as integers, even if only as the integer number of seconds (or milliseconds, or whatever) between two events. Simple integer calculations can then recover more useful values such as days, hours, and minutes, as in the sketch below.
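        A minimal illustration of that integer-only breakdown, using the second count from the OP's sample data:

        #!/usr/bin/perl
        use strict;
        use warnings;

        # Split an integer second count into days/hours/minutes/seconds
        # using only integer arithmetic, so nothing is ever approximated.
        my $total = 35_247;   # seconds since the start of the day

        my $days    = int($total / 86_400);
        my $hours   = int(($total % 86_400) / 3_600);
        my $minutes = int(($total % 3_600) / 60);
        my $seconds = $total % 60;

        printf("%d days, %02d:%02d:%02d\n", $days, $hours, $minutes, $seconds);
        # prints: 0 days, 09:47:27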