What range of numbers can be represented in 16-, 32- and 64-bit IEEE-754 systems?

For a given IEEE-754 floating-point number X, if 2^E <= abs(X) < 2^(E+1), then the distance from X to the next larger representable floating-point number (epsilon) is:
epsilon = 2^(E-52) % For a 64-bit float (double precision)
epsilon = 2^(E-23) % For a 32-bit float (single precision)
epsilon = 2^(E-10) % For a …
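The spacing rule above can be checked numerically. The following is a minimal Python sketch, not taken from the original answer: the helper names `spacing` and `roundtrip` are hypothetical, it assumes Python 3.9+ (for `math.ulp`) on a platform where `float` is an IEEE-754 double, and it only covers normal, nonzero values. The 10-bit case is exercised through the 16-bit `'e'` struct format, since binary16 stores 10 fraction bits.

```python
import math
import struct


def spacing(x: float, fraction_bits: int) -> float:
    """Return 2**(E - fraction_bits), where 2**E <= abs(x) < 2**(E + 1).

    Assumes x is a normal, nonzero number (subnormals are not handled).
    """
    _, e = math.frexp(abs(x))  # abs(x) == m * 2**e with 0.5 <= m < 1
    E = e - 1                  # hence 2**E <= abs(x) < 2**(E + 1)
    return math.ldexp(1.0, E - fraction_bits)


def roundtrip(x: float, fmt: str) -> float:
    """Round x to a narrower format via struct: 'e' (16-bit), 'f' (32-bit), 'd' (64-bit)."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


# Double precision: the formula should agree with the standard library's ulp().
print(spacing(1.0, 52) == math.ulp(1.0))
print(spacing(1e16, 52) == math.ulp(1e16))

# Single and half precision: adding one gap to 1.0 survives the round trip.
print(roundtrip(1.0 + spacing(1.0, 23), "f") > 1.0)
print(roundtrip(1.0 + spacing(1.0, 10), "e") > 1.0)
```

Passing the number of stored fraction bits as a parameter keeps one function for all three formats instead of three near-identical ones.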

Which is the first integer that an IEEE 754 float is incapable of representing exactly?

2^(mantissa bits + 1) + 1. The +1 in the exponent (mantissa bits + 1) is because, if the mantissa contains abcdef…, the number it represents is actually 1.abcdef… × 2^e, providing an extra implicit bit of precision. Therefore, the first integer that cannot be exactly represented and will be rounded is: For float, 16,777,217 …
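As a rough check of the formula 2^(mantissa bits + 1) + 1, here is a small Python sketch. The function names are hypothetical, and it assumes Python's `float` is an IEEE-754 double and that `struct`'s `'f'` format rounds to the nearest 32-bit float, which holds on common platforms.

```python
import struct


def first_unrepresentable_integer(mantissa_bits: int) -> int:
    """First positive integer that cannot be stored exactly: 2**(mantissa_bits + 1) + 1."""
    return 2 ** (mantissa_bits + 1) + 1


def roundtrip_float32(n: int) -> float:
    """Round n to the nearest 32-bit float and return it as a Python float."""
    return struct.unpack("<f", struct.pack("<f", float(n)))[0]


n32 = first_unrepresentable_integer(23)  # single precision stores 23 mantissa bits
n64 = first_unrepresentable_integer(52)  # double precision stores 52 mantissa bits

# The integer just below survives the conversion; the first bad one does not.
print(n32, roundtrip_float32(n32 - 1) == n32 - 1, roundtrip_float32(n32) == n32)
print(n64, float(n64 - 1) == n64 - 1, float(n64) == n64)
```

Python compares `int` and `float` values exactly, so the `==` checks really do test whether the conversion lost information.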

Large numbers erroneously rounded in JavaScript

You’re overflowing the capacity of JavaScript’s number type; see §8.5 of the spec for details. Those IDs will need to be strings. IEEE-754 double-precision floating point (the kind of number JavaScript uses) can’t precisely represent all numbers (of course). Famously, 0.1 + 0.2 == 0.3 is false. That can affect whole numbers just like it …
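The same effect can be reproduced outside JavaScript. This is a hedged Python sketch: the payload and its `id` value are made up for illustration, and `parse_int=float` is used only to mimic a parser that, like JavaScript, stores every number as an IEEE-754 double.

```python
import json

# Hypothetical payload; the ID is chosen to sit just above 2**53.
payload_numeric = '{"id": 9007199254740993}'
payload_string = '{"id": "9007199254740993"}'

# Force integers through a double, as a JavaScript JSON parser would.
as_number = json.loads(payload_numeric, parse_int=float)["id"]

# A string ID is carried through untouched.
as_string = json.loads(payload_string)["id"]

print(as_number == 9007199254740993)  # compares the double with the exact integer
print(as_string == "9007199254740993")
print(0.1 + 0.2 == 0.3)
```

Sending the ID as a string sidesteps the problem because no numeric conversion ever happens.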

What is the rationale for all comparisons returning false for IEEE754 NaN values?

I was a member of the IEEE-754 committee, so I’ll try to help clarify things a bit. First off, floating-point numbers are not real numbers, and floating-point arithmetic does not satisfy the axioms of real arithmetic. Trichotomy is not the only property of real arithmetic that does not hold for floats, nor even the most important. …
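The failure of trichotomy mentioned above is easy to observe. The sketch below is illustrative only; `trichotomy_holds` is a hypothetical helper, and it assumes Python's `float` follows IEEE-754 comparison semantics. Note that `!=` is the one comparison that evaluates to true for NaN.

```python
import math

nan = float("nan")


def trichotomy_holds(a: float, b: float) -> bool:
    """True when exactly one of a < b, a == b, a > b is true."""
    return sum([a < b, a == b, a > b]) == 1


print(trichotomy_holds(1.0, 2.0))  # ordinary numbers
print(trichotomy_holds(nan, 1.0))  # NaN against a number
print(trichotomy_holds(nan, nan))  # NaN against itself
print(nan != nan, math.isnan(nan))
```

Because `x == x` is false only for NaN, `math.isnan` is the clearer way to test for it in real code.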