These aren't the right lessons to learn from this issue.
Comparing floats for equality is not a good idea, and not just because of the 0.1+0.2 != 0.3 behaviour (which, yes, is at least consistent), but because floats - and numerical methods in general - are simply not meant to produce exact results. If exactness is what you want, there are other types you should be using - e.g. big integer or fixed point arithmetic. So the takeaway is not that JS has a "glitch" or is messing up, but that there's a misunderstanding of the guarantees numerical methods actually make.
Also, the suggested solution - comparing actual vs. expected value by requiring their absolute distance to be less than some epsilon - is half correct; what you really should be looking at is the relative error, i.e. the absolute error divided by the magnitude of the number.
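A relative-error comparison can be sketched like this (`approxEqualRel` and the tolerances are my own hypothetical names and choices, not from the article):

```javascript
// Sketch: compare two floats by relative error rather than strict
// equality. Falls back to an absolute tolerance near zero, where
// relative error blows up.
function approxEqualRel(actual, expected, relTol = 1e-9, absTol = 1e-12) {
  const diff = Math.abs(actual - expected);
  const scale = Math.max(Math.abs(actual), Math.abs(expected));
  return diff <= Math.max(relTol * scale, absTol);
}

approxEqualRel(0.1 + 0.2, 0.3);        // true
approxEqualRel(13098426.039156161,     // large magnitudes still pass,
               13098426.039156163);    // where a fixed epsilon would not
```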
> but what you really should be looking at is the relative error
This actually depends on the specific problem you're talking about, and there's also the issue of forward vs. backward error. Unless you're doing something that requires additional care, unit-testing against abs(expected - actual) < 0.001 is relatively safe (and if you are, then God help you).
100% agree on the rest of the comment, requiring numerical methods to converge to a specific IEEE 754 double precision floating point is just dumb.
I'm not very experienced with this in practice and can mostly only speak to what I was taught in numerical methods, but it was my understanding that a) relative errors don't depend arbitrarily on your units, and b) they seem to be better behaved in theory - or at least, it's how algorithms are typically analysed. E.g. if you multiply two numbers that both have some associated error, the relative error of the product is the sum of the individual relative errors (plus quadratic terms, which you would ignore).
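That rule of thumb is easy to check numerically - a toy example with made-up values and injected relative errors:

```javascript
// Inject known relative errors ra, rb into two values and measure the
// relative error of their product; it comes out as ra + rb plus a
// quadratic term ra*rb that is negligible for small errors.
const a = 3.0, b = 7.0;
const ra = 1e-6, rb = 2e-6;
const perturbed = (a * (1 + ra)) * (b * (1 + rb));
const relErr = Math.abs(perturbed - a * b) / Math.abs(a * b);
console.log(relErr); // ≈ 3e-6, i.e. ra + rb
```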
> Comparing floats for equality is not a good idea
This is very true, but the mention of data science is really significant here.
If you're trying to productionize a neural network or something similar, you ideally want to be able to pin down every source of noise to make sure your results are reproducible, so you can evaluate whether a change in your results reflects a change in the data, a change in the code, or just luck from your initial weights - with a big enough network, for all you know it decided that the 10th decimal place of some feature was predictive. I wouldn't be surprised if the author brings up tanh as an example specifically because it's a common activation function.
If you're pinning all your Javascript dependencies, but the native code it dynamically links to gets ripped out from under you, it could completely mislead you about what produces good results. Similarly, if you want to be able to ship a model to run client-side, it would be nice if you could just test it on Node with your Javascript dependencies pinned and be reasonably confident it'll run the same on the browser.
Of course, if you can't do that it's not the end of the world, since you can compensate by doing enough runs, and tests on the target platform, to convince yourself that it's consistent, but it's a lot nicer if things are just as deterministic as possible.
> for all you know it decided that the 10th decimal place of some feature was predictive
Isn't this kind of a misfeature of the model? If it really depends that heavily on a function output that's not guaranteed to be precise even in the best of cases - and certainly not when you have to assume noisy input data - that doesn't seem particularly robust.
If you want true reproducibility you can save everything in an image/container. Lots of data science companies offer this for end-to-end reproducibility.
>> because floats - or numerical methods in general - are simply not meant to be used for exact results
But one should expect that a single function does not vary in the last 5 digits, as one example did. That's over 16 bits. Why should anyone use double precision if that's the kind of slop an implementation can have?
> But one should expect a single function does not vary in the last 5 digits as one example did.
Why not? The whole selling point of floats is that they're a fast approximation to real arithmetic in a bunch of useful cases. I'd personally be much more miffed by an excessively slow implementation of a transcendental than one which was off even by a large number of bits. If there's a fast solution with better precision then that's great, but low precision by itself doesn't strike me as particularly problematic.
> Why should anyone use double precision if that's the kind of slop an implementation can have?
Because by using float64 instead of float32 you can cheaply get a ton more precision even with sloppy algorithms. If there's only 16ish bits of slop you could probably get a full-precision float32 operation just by using an intermediate cast to float64 (proof needed for the algorithm in question of course).
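In JavaScript, that intermediate-cast trick roughly corresponds to computing in double precision and rounding back with Math.fround - a sketch of the idea, not a proof that the result is correctly rounded for every input:

```javascript
// Sketch: emulate a float32 tanh by evaluating in float64 and rounding
// the result to the nearest float32. Even a double implementation that
// is off by a few ulps is usually far more precise than float32 needs.
function tanh32(x) {
  return Math.fround(Math.tanh(Math.fround(x)));
}
```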
Counting the number of different digits has no useful meaning. The example you cite has two results that differ by 1 part in 10 quadrillion. If you have a good reason to care about the precise implementation of hyperbolic functions to that level of precision, then you don't rely on whatever you happen to stumble on in some version of nodejs.
If you're talking about the tanh example, that differs in the last five bits of the mantissa. You don't want to count the number of different bits because differences in the more significant bits are worse. Count the ulps instead.
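A ulp-distance check can be sketched by reinterpreting each double's bit pattern as an integer (`ulpDiff` is a hypothetical helper; the bit-twiddling relies on IEEE 754 doubles, which JS guarantees):

```javascript
// Distance between two doubles in units in the last place (ulps).
// Reinterprets each double's bits as a signed 64-bit integer, mapping
// negative floats so that integer order matches numeric float order.
function ulpDiff(a, b) {
  const buf = new ArrayBuffer(8);
  const f64 = new Float64Array(buf);
  const i64 = new BigInt64Array(buf);
  const ordered = (x) => {
    f64[0] = x;
    const bits = i64[0];
    return bits < 0n ? -9223372036854775808n - bits : bits;
  };
  const d = ordered(a) - ordered(b);
  return d < 0n ? -d : d;
}

ulpDiff(1, 1 + Number.EPSILON); // 1n: adjacent doubles
```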
> What I was seeing between Node versions really should have been a bug in the testing library, or something in my code, or maybe in simple-statistics itself. But in this case, digging deeper revealed that what I was seeing was exactly what you don’t expect: a glitch in the language itself.
This is such a disheartening conclusion. It really was a bug in your code; describing it as a "glitch in the language" is disingenuous. Your code was assuming a higher level of precision (or accuracy? I always get those mixed up) than the system was providing. That is not a glitch in the language, but a bug in your code making that false assumption.
This also leads to the question of where the original "13098426.039156161" reference value came from. Has he actually verified by hand that it is correct to the last decimal? WolframAlpha gives a lot more decimals for the value, so how did he decide on that exact number of digits?
Of course, hindsight is 20/20, but numerical code is just generally much more difficult and complex than many people give it credit for, at least before they get bitten by some issue like the author did.
When Knuth wrote TeX he based all of its mathematical calculations on integer arithmetic. It's one of the reasons that software remains accurate and consistent regardless of platform. Using the corresponding stdlib approach in javascript is not so crazy, if your goal is accurate, consistent results across platforms, including future not-yet-released platforms.
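The integer-arithmetic idea translates directly to JavaScript's BigInt, e.g. as a fixed-point sketch (the values and scale here are just an illustration):

```javascript
// Sketch: fixed-point arithmetic on BigInt - exact and identical on
// every platform, in the spirit of TeX's scaled integers.
const SCALE = 10000n;               // four fractional digits
const x = 31416n;                   // represents 3.1416
const y = 20000n;                   // represents 2.0000
const sum = x + y;                  // 51416n -> 5.1416, exactly
const product = (x * y) / SCALE;    // 62832n -> 6.2832, exactly
```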
You can use integer arithmetic if you're dealing with integer values - or rational ones, since rationals are just (equivalence classes of) pairs of integers.
It doesn't really work when you're dealing with transcendental functions such as the trigonometric ones. There is simply no way to get "accurate" results for such calculations in general, unless of course, you deal with everything symbolically.
Don't compare floats vs floats with an exact comparison.
I've recently seen a version of this error where someone's code was calculating how many bits a field needed to be to hold the maximum value they cared about.
bits = floor(log(max_value) / log(2)) + 1
This is playing with fire: it trusts that dividing one approximated irrational value by another will yield a ratio equal to or greater than the desired value when max_value is a power of two. Depending on exactly how those approximations are rounded, log(8)/log(2) might return 2.999999999999 (bug!), exactly 3.000000000 (their assumption), or epsilon more than 3.0 (not a bug).
This is one reason why even code that looks "obviously correct" probably deserves a test. If your assumption is true on one platform but not another, the test's failure will alert you to the fact.
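One way to sidestep the float logs entirely (at least for 32-bit ranges) is Math.clz32, which counts leading zero bits directly; `bitsNeeded` is a hypothetical replacement for the formula above:

```javascript
// Bit width via integer bit-counting - no floating point involved.
// Valid for 0 <= maxValue <= 2^32 - 1.
function bitsNeeded(maxValue) {
  if (maxValue === 0) return 1;     // one bit still needed to store 0
  return 32 - Math.clz32(maxValue);
}

bitsNeeded(7);   // 3
bitsNeeded(8);   // 4
```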
That's especially a shame, because log2 of a floating point value is particularly easy to calculate, and most math libraries will internally compute logs of other bases (like ln) by finding the equivalent log2 first and then converting.
And if you want just the floor of the base-2 log, you almost don't even need to calculate it at all; it's right there in the exponent bits.
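Pulling floor(log2(x)) straight out of the exponent bits can be sketched like this (it assumes a normal, positive double and a little-endian host, which covers the platforms Node currently ships on):

```javascript
// Read the biased 11-bit exponent from a double's high word and
// subtract the IEEE 754 bias (1023) to get floor(log2(x)).
function floorLog2(x) {
  const f64 = new Float64Array([x]);
  const hi = new Uint32Array(f64.buffer)[1]; // high 32 bits (little-endian)
  const biasedExp = (hi >>> 20) & 0x7ff;
  return biasedExp - 1023;
}

floorLog2(8);    // 3
floorLog2(0.5);  // -1
```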
Yes, the implementation of Node’s Math package functions is changing, and yes, it's not otherwise standardized.
And one should not expect that CPUs will ever standardize the results of those functions. We are de facto left with only IEEE 754, and even there, there are continuous attempts to implement less than it requires.
So the wiggle room is what's left - and not only in JavaScript, whether in the specification or in specific interpreters, but in other languages too.
"stdlib" sounds good, but do they guarantee to never change any implementation of some math function?
I don't agree that what the author saw was really "a glitch in the language itself." The language directly allows it to happen.
> Intel also bears some blame for overstating the accuracy of their trigonometric operations by many magnitudes. That kind of mistake is especially tragic because, unlike software, you can’t patch chips.
It is a great rabbit hole; however, I have to note that the raised issue seems to be with float32, while JavaScript uses float64?
Sorry if I sound rude, but if one needs 15-digit precision for trigonometric functions, is javascript really a suitable choice for such a task (whatever it is)?
However, the point was not so much about precision, it was about consistency.