in reply to Outlier test fail

Test::More::is compares using eq, so it's doing stringification on the floating point values. Different builds of perl can stringify differently in the far-out place values. So even if your module is returning exactly the same value, the string comparison might be different on two different systems.

IF you want numerical equivalence testing, you should use cmp_ok instead -- but that's a big "IF".

It's not Test::More that is returning 0.99999999999999996; it is $embed_pass->compare(...) -- that is, your method is what's returning that value. And if I skimmed the source correctly, you are calling a method from Data::CosineSimilarity, which you do not control. For floating-point math, if you don't control the chain 100%, then you cannot control whether rounding differences will occur in unexpected places. For floating-point values -- especially when those values go through many steps, or through one or more steps that you do not control -- I highly recommend you decide what is an acceptable precision for your module's needs, and code your test in such a way that it accepts anything within that range.

If you want to support 32bit floats, then I would say your test should make sure that you are within 1e-6 * $expected -- ie, cmp_ok abs($got/$exp-1), '<=', 1e-6 ... or maybe 0.5e-6 if you're brave. (It's been a while since I've done the calcs for whether that works with a 32bit float, which has a 23bit stored mantissa -- but my back-of-the-envelope says the ULP is 2**-23, about 0.12e-6, relative to the power of two, so relative to the value itself the spacing is between about 0.06e-6 and 0.12e-6, depending on where the mantissa falls. Allowing a few ULPs for accumulated rounding, 0.5e-6 would be as small as I'd want to go.) (Caveat: if $exp is 0, that of course won't work, and you could use abs($got-$exp) with an absolute tolerance instead. But in this example, $exp was 1, so you're safe.)
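As a rough sketch of that relative-tolerance test using only Test::More: relative_delta is a hypothetical helper name, not something Test::More provides, and the literal stands in for whatever $embed_pass->compare(...) returns on your system.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical helper (not part of Test::More): fractional difference
# between $got and $exp, falling back to the absolute difference when
# $exp is 0 so we never divide by zero.
sub relative_delta {
    my ($got, $exp) = @_;
    return $exp == 0 ? abs($got - $exp) : abs($got / $exp - 1);
}

# Stand-in for the value returned by the method under test.
my $got = 0.99999999999999996;

cmp_ok(relative_delta($got, 1), '<=', 1e-6,
    'similarity of identical inputs is 1 to within 1e-6');

done_testing;
```

When this fails, cmp_ok reports the delta and the limit, so you can see how far outside the tolerance the result landed.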

I would think that would be enough precision to make sure your module is doing its part of the job correctly.

Alternately, doing a sprintf '%.6f' on both the $got and $exp would allow you to do a string comparison using 'is', regardless of the floating size and stringification differences.
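A minimal sketch of the sprintf approach, again with a stand-in literal for the method's return value; rounded is a hypothetical helper name chosen for illustration.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical helper: render a number as a string with a fixed number
# of decimal places, so both sides of 'is' are formatted the same way.
sub rounded {
    my ($value, $places) = @_;
    return sprintf "%.${places}f", $value;
}

# Stand-in for the value returned by the method under test.
my $got = 0.99999999999999996;

is(rounded($got, 6), rounded(1, 6),
    'similarity of identical inputs is 1 to six decimal places');

done_testing;
```

Note that '%.6f' fixes the number of decimal places, not significant digits, so this suits values near 1 like the one here; for values spanning many orders of magnitude, the numeric comparison is probably the safer choice.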

To sum up: When testing a module that does floating point math, unless you know for sure you can guarantee exact values, you need to test against an expected accuracy rather than looking for exact values (whether that's done through sprintf-rounding or through numeric comparison of fractional-delta-vs-accuracy instead of got-vs-expected).

Re^2: Outlier test fail
by choroba (Cardinal) on Jan 11, 2024 at 22:32 UTC
    These tests become much easier to read and write if you switch to Test::Deep with num or Test2::V0 with within.
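    For illustration, a short sketch of the Test::Deep variant, assuming Test::Deep is installed (it is not a core module); the literal and the hash keys are made up for the example. The Test2::V0 equivalent would use within($expected, $tolerance) as the expected value passed to its own is.

```perl
use strict;
use warnings;
use Test::More;
use Test::Deep;

# Stand-in for the value returned by the method under test.
my $got = 0.99999999999999996;

# num($expected, $tolerance) passes when abs($got - $expected) <= $tolerance.
cmp_deeply($got, num(1, 1e-6), 'similarity is 1 to within 1e-6');

# The same comparator works inside nested structures (keys are illustrative).
cmp_deeply(
    { score => $got,         label => 'same' },
    { score => num(1, 1e-6), label => 'same' },
    'tolerance applies to one field of a larger structure',
);

done_testing;
```

    Note that num takes an absolute tolerance, not a relative one, which makes no difference when the expected value is 1.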

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re^2: Outlier test fail
by etj (Priest) on Dec 12, 2024 at 18:52 UTC
    PDL recently (as of 2.094) incorporated Test::PDL. I wish that had happened years ago, because being able to use is_pdl is just great compared to the contortions needed otherwise: it does approximate matching with configurable absolute or relative tolerances, checks types and dimensions, and reports any differences in some detail, which nearly always means not needing to capture any results in a variable for failure-reporting.

    And using PDL for this sort of data-handling would be very obvious, and there's even a recently-added vector-cosine function for exactly Bod's use-case. What's not to like?