OfficeLinebacker has asked for the wisdom of the Perl Monks concerning the following question:

Hey ho, esteemed monks!

I am parsing through a csv file and converting a bunch of spreads from percentages to basis points (and stripping off the percent sign). The number of digits in any given field can vary. I want the output to have the same amount of accuracy as the input. However, a simple print of the value seems to round in an unacceptable fashion. The code below is an excerpt, so if a variable is undeclared or undefined, please ignore unless relevant:

use strict;#etc,
my $parser = Text::CSV::Simple->new();
$parser->field_map(@fm);
my @data = $parser->read_file($tf);
my $ln = 0;
open( my $cfh, '>', $cf ) || die "Can't open $cf: $!";
foreach my $row (@data) {
    ++$ln;
    if ( $ln > 1 ) {
        if ( my $date = ParseDate( $row->{Date} ) ) {
            $row->{Date} = UnixDate( $date, "%m/%d/%Y" );
        }
        else {
            print "ERROR: $row->{Date} not a date!";
        }
        foreach my $f ( keys %$row ) {
            if ( $f =~ m/^Spread\d\d?[my]$/ && $row->{$f} ne q{} ){
                print "Fixing a percentage ($f) --now $row->{$f}";
                $row->{$f} =~ s/^($RE{num}{decimal})%$/$1*100/e || die "Pattern match failed!";
                print "--now $row->{$f}";
            }
            if ( index( $row->{$f}, ',' ) != -1 ) {
                $row->{$f} = qq{"$row->{$f}"};
            }
        } ## end foreach my $f ( keys %$row )
    } ## end if ( $ln > 1 )
    print {$cfh} join( ',', @{$row}{@fm} );
} ## end foreach my $row (@data)
## Parens are needed here: "close $cfh || die" parses as close( $cfh || die )
close($cfh) || die "Can't close $cf: $!";
Here's some sample output:
Fixing a percentage (Spread3y) --now 0.781063003540039% --now 78.1063003540039
Fixing a percentage (Spread5y) --now 2.25% --now 225
Fixing a percentage (Spread3y) --now 1.455% --now 145.5
Fixing a percentage (Spread5y) --now 4.9% --now 490
Fixing a percentage (Spread5y) --now 0.79% --now 79
Fixing a percentage (Recovery) --now 75% --now 7500
Fixing a percentage (Spread5y) --now 1.15% --now 115
Fixing a percentage (Spread2y) --now 0.45999999999999996% --now 46
Fixing a percentage (Spread4y) --now 0.9199999999999999% --now 92
Fixing a percentage (Spread1y) --now 0.22999999999999998% --now 23
Fixing a percentage (Spread7y) --now 1.3% --now 130
Not what I am looking for! After reading "My floating point comparison does not work. Why ?", I see that sprintf is probably the way to go, but can I make the format dynamic so that I don't end up with values like 490.00000000000000 in the munged file?
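A minimal sketch of the dynamic sprintf idea, assuming each field is digits, an optional fraction, and a trailing %: count the fractional digits in the input, subtract the two that multiplying by 100 moves across the decimal point, and hand that count to sprintf through the `*` precision. The name pct_to_bp is made up for illustration and is not part of the code above. The multiplication is still done in binary floating point, so inputs with more than about 15 significant digits may not come back digit for digit.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): percent string -> basis points,
# printed with as many decimals as the input had, minus the two that move left.
sub pct_to_bp {
    my ($pct) = @_;
    my ( $int, $frac ) = $pct =~ /^(-?\d*)(?:\.(\d*))?%$/
        or die "Not a percentage: '$pct'\n";
    $frac = '' unless defined $frac;

    # Multiplying by 100 consumes two fractional digits; never go below zero.
    my $places = length($frac) - 2;
    $places = 0 if $places < 0;

    # The trailing 0 keeps the string numeric even when the fraction is empty.
    return sprintf '%.*f', $places, "${int}.${frac}0" * 100;
}

print pct_to_bp('4.9%'),   "\n";    # 490
print pct_to_bp('1.455%'), "\n";    # 145.5
```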

As always, if you see other areas of the snippet where I could improve the logic/efficiency, feel free to comment!

TIA!

EDIT: Changed title to accuracy, to reflect nomenclature per Math::BigFloat; accuracy means total number of digits, while precision (which I call "significant (?:digits|figures)") means number of digits after the decimal.


I like computer programming because it's like Legos for the mind.

Replies are listed 'Best First'.
Re: Maintain accuracy of floating point numbers in simple arithmetic operation
by BrowserUk (Patriarch) on Jun 13, 2007 at 14:13 UTC

    You could avoid doing math altogether and treat the problem as a string manipulation exercise:

    #! perl -slw
    use strict;

    while( <DATA> ) {
        chomp;
        printf "Before '%s' becomes ", $_;
        s[
            ( \d+ )+
            (?:
                \.
                ( \d{0,2} )
                ( \d* )
            )?
            %
        ]{
            local $^W;
            my $n = substr( $2 . '00', 0, 2 );
            ($1||'') . $n . ( length($3) ? ".$3" : '' );
        }xe;
        print "'$_'";
    }
    __DATA__
    0.781063003540039%
    2.25%
    1.455%
    4.9%
    0.79%
    75%
    1.15%
    0.45999999999999996%
    0.9199999999999999%
    0.22999999999999998%
    1.3%

    Gives:

    c:\test>junk4
    Before '0.781063003540039%' becomes '78.1063003540039'
    Before '2.25%' becomes '225'
    Before '1.455%' becomes '145.5'
    Before '4.9%' becomes '490'
    Before '0.79%' becomes '79'
    Before '75%' becomes '7500'
    Before '1.15%' becomes '115'
    Before '0.45999999999999996%' becomes '45.999999999999996'
    Before '0.9199999999999999%' becomes '91.99999999999999'
    Before '0.22999999999999998%' becomes '22.999999999999998'
    Before '1.3%' becomes '130'

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      ++Browser,

      I must admit your solution takes the form of a brain teaser for me. I'll limit my query to the first thing that I can't figure out off the top of my head, which is why the + after the first set of parens? Isn't it redundant, given the + after the \d within?

      In general, IMHO, I think an explanation of your solution would be helpful not only to me, but to others who may come across this thread, especially in the future, perhaps from a search.

      EDIT: I guess what I don't understand is

      • The first thing with the +
      • Why test for $1 when the assumption is that there is always a digit before the decimal (w/ your program, Before '.6789%' becomes '.678900')
        • Note, that assumption is OK, for this application.

      EDIT 2: I see now that the test is not for definedness, but for 0, because you don't want to make 0.779999% become 077.9999.


      I like computer programming because it's like Legos for the mind.
        ... why the + after the first set of parens?

        Because I made a mistake and didn't notice because I was getting the right results. It is redundant and removing it makes no difference--'cept maybe it'll run an insy winsy bit quicker? :)

        There is a second 'artifact' in the replacement. ($1||'') can become just $1.

        I was attempting to avoid the need to suppress warnings, but that turns the code into a mess of bracketed ||s and ternaries, or a long if/then/else chain. Warnings are optional, and turning them off as an informed decision to avoid complicated code is (IMO) legit. So I chose the cleaner option. You can replace local $^W; with no warnings 'uninitialized'; or with the chain of defined and length tests if you prefer.
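A small sketch of the lexically scoped alternative mentioned above, assuming all that is wanted is to silence the "uninitialized" warning around the padding step. The sub name pad_two is hypothetical. Unlike local $^W, the pragma only affects the enclosing block and only that one warning category.

```perl
use strict;
use warnings;

# Hypothetical helper: pad the digits being moved to exactly two characters.
# $digits may be undef when the input had no decimal point at all.
sub pad_two {
    my ($digits) = @_;
    no warnings 'uninitialized';    # in effect only until the end of this sub
    return substr( $digits . '00', 0, 2 );
}

print pad_two(undef), "\n";    # 00
print pad_two('9'),   "\n";    # 90
print pad_two('25'),  "\n";    # 25
```

The other route is to test defined (and length) on each capture before using it, which needs no pragma at the cost of a few more lines.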

        Does this explain it?

        #! perl -slw
        use strict;

        while( <DATA> ) {
            chomp;
            printf "Before '%s' becomes ", $_;
            s[
                ( \d+ )             ## Grab any digits before a decimal point
                (?:
                    \.              ## If there is a decimal point
                    ( \d{0,2} )     ## Try grab two digits after it
                    ( \d* )         ## And any more into another capture
                )?                  ## All conditionally
                %                   ## And finally...
            ]{
                local $^W;  ## I know some of the captures will be empty.

                ## Pad (trailing zeros) the digits being moved to ensure exactly 2
                my $n = substr( $2 . '00', 0, 2 );

                ## Reassemble the number from
                ##   Any digits previously before the decimal point
                ##   The two digits being moved
                ##   A decimal point and the remaining digits if there are any.
                $1 . $n . ( length($3) ? ".$3" : '' );
            }xe;
            print "'$_'";
        }
        __DATA__


        Updates after the fact will often not be noticed unless you /msg the intended recipient.

        I used your tests which did not include a '.nnn%' example so that possibility didn't get tested. If you were my boss, I would probably deserve admonishment.

        This, as with most posts, is intended to give you ideas for solving your problem, not a fully tested piece of code that you can drop into your codebase for free.
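For anyone who does need the '.nnn%' case, here is one possible variant of the string approach, offered as an untested-in-production sketch rather than as the code posted above. It assumes the field is anchored (nothing but the number and the %), allows the integer part to be empty, and strips leading zeros after the shift. The name pct_str_to_bp is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical variant: shift the decimal point two places right using
# string operations only, so no digit of the input is ever rounded.
sub pct_str_to_bp {
    my ($s) = @_;
    my ( $int, $frac ) = $s =~ /^(\d*)(?:\.(\d*))?%$/
        or die "Not a percentage: '$s'\n";
    $frac = '' unless defined $frac;
    die "No digits in '$s'\n" unless length("$int$frac");

    my $moved = substr( $frac . '00', 0, 2 );                  # the two digits that cross the point
    my $rest  = length($frac) > 2 ? substr( $frac, 2 ) : '';   # whatever stays behind it

    my $whole = $int . $moved;
    $whole =~ s/^0+(?=\d)//;    # drop leading zeros but keep a lone 0

    return length($rest) ? "$whole.$rest" : $whole;
}

print pct_str_to_bp('.6789%'), "\n";    # 67.89
print pct_str_to_bp('0.79%'),  "\n";    # 79
print pct_str_to_bp('0.005%'), "\n";    # 0.5
```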

Re: Maintain number of significant digits in simple arithmetic operation
by Moron (Curate) on Jun 13, 2007 at 13:05 UTC
    Math::BigFloat provides control over the accuracy and precision of floating point values. Update: that way you wouldn't need "dynamic" format control anyway!
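A minimal sketch of what that could look like, assuming the percentage is passed to Math::BigFloat as a string (so it is never converted to a native float first) and that no global accuracy or precision has been set. The sub name pct_to_bp_big is hypothetical.

```perl
use strict;
use warnings;
use Math::BigFloat;

# Hypothetical helper: multiply in arbitrary-precision decimal arithmetic,
# so every digit of the input survives the multiplication by 100.
sub pct_to_bp_big {
    my ($pct) = @_;
    ( my $num = $pct ) =~ s/%$//;    # strip the percent sign, keep the string form
    return Math::BigFloat->new($num)->bmul(100)->bstr;
}

print pct_to_bp_big('1.455%'), "\n";
print pct_to_bp_big('4.9%'),   "\n";
```

Note that a value such as 0.45999999999999996% is preserved exactly as given; if that string is itself the product of an earlier floating point conversion, this approach will not turn it back into 0.46%.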
    __________________________________________________________________________________

    ^M Free your mind!