Bit Shaving
A few days ago a coworker asked why Python returned different values for the example below:
(~)$ python
>>> 4.32 - 3.92
0.40000000000000036
>>> 3.32 - 2.92
0.3999999999999999
In base 10 both of these differences are exactly 0.40, so why does Python give two different answers?
The short answer: four in binary is 100 and three is 11, so four takes one more bit to represent. Floating point numbers have a fixed bit budget, so the fractional parts of 4.32 and 3.32 end up stored differently.
The rest of this post walks through the details. I used a decimal-to-IEEE-754 converter and a binary-to-decimal converter along the way to keep things readable.
That bit budget is finite by design. Irrational numbers can never be represented exactly, and neither can most rational numbers: a binary fraction is exact only when its denominator is a power of two, and 0.32 (which is 8/25) does not qualify. Something has to give, and IEEE 754 is the standard that defines how.
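One way to see what actually gets stored is to ask Python for the exact decimal expansion of a float. The following is a minimal sketch using the standard-library decimal module; the variable names are only for illustration.

```python
from decimal import Decimal

# Decimal(float) converts the stored binary value exactly, with no rounding.
stored_432 = Decimal(4.32)
stored_332 = Decimal(3.32)

print(stored_432)  # the value Python really holds for 4.32
print(stored_332)  # the value Python really holds for 3.32

# Decimal("...") parses the literal text, so it is the true base-10 value.
is_above = stored_432 > Decimal("4.32")  # 4.32 is rounded up when stored
is_below = stored_332 < Decimal("3.32")  # 3.32 is rounded down when stored
print(is_above, is_below)
```

Neither literal survives the trip into binary unchanged, and the two happen to be rounded in opposite directions.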
IEEE 754 is a standard for floating-point computation. It defines formats, rounding rules, operations, and exception handling. The double-precision storage format uses 1 bit for the sign of the number, 11 bits for the exponent, and 52 bits for the mantissa. For normal numbers the stored exponent is biased by 1023 and the mantissa holds the fraction bits that follow an implicit leading 1, so the value is computed as: (-1) ^ sign * 1.mantissa * 2 ^ (exponent - 1023). A number represented this way is in binary scientific notation.
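To make the layout concrete, here is a small sketch that pulls the three fields out of a Python float with the standard-library struct module and then rebuilds the value from them. The helper names decode_double and rebuild are made up for this post, and rebuild only handles normal numbers (no zeros, subnormals, infinities, or NaNs).

```python
import struct

def decode_double(value):
    """Split a float into its IEEE 754 sign, exponent, and mantissa fields."""
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    sign = bits >> 63                      # 1 bit
    exponent = (bits >> 52) & 0x7FF        # 11 bits, biased by 1023
    mantissa = bits & ((1 << 52) - 1)      # 52 stored fraction bits
    return sign, exponent, mantissa

def rebuild(sign, exponent, mantissa):
    """Apply (-1)^sign * 1.mantissa * 2^(exponent - 1023) to normal numbers."""
    significand = 1 + mantissa / 2**52     # restore the implicit leading 1
    return (-1) ** sign * significand * 2.0 ** (exponent - 1023)

sign, exponent, mantissa = decode_double(4.32)
print(sign, format(exponent, "011b"), format(mantissa, "052b"))
print(rebuild(sign, exponent, mantissa) == 4.32)
```

Every step in rebuild is exact for a normal double, so the round trip returns the original float.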
Whew. With all of the above digested, we can finally address why the two subtractions return different results even though each pair of numbers is the same distance apart.
#     s exponent    mantissa
4.32: 0 10000000001 0001010001111010111000010100011110101110000101001000
3.32: 0 10000000000 1010100011110101110000101000111101011100001010001111
# binary
4.32: 100.010100011110101110000101000111101011100001010 [01000]
3.32:  11.0101000111101011100001010001111010111000010100 [01111]
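If you would rather not rely on an online converter, the fixed-point binary strings above can be reproduced with math.frexp. This is a sketch, not a general-purpose formatter: the helper name fixed_point_binary is hypothetical, and it assumes a positive float that is at least 1 and small enough that the binary point falls inside the 53 significand bits.

```python
import math

def fixed_point_binary(value):
    """Write a float >= 1 as 'whole.fraction' in binary, using all 53 significand bits."""
    # frexp returns value == fraction * 2**exponent with 0.5 <= fraction < 1.
    fraction, exponent = math.frexp(value)
    # Scaling by 2**53 is exact and yields the 53-bit significand as an integer.
    significand = int(fraction * 2**53)
    digits = format(significand, "053b")
    # The first `exponent` digits are the integer part, the rest the fraction.
    return digits[:exponent] + "." + digits[exponent:]

for number in (4.32, 3.32):
    print(number, fixed_point_binary(number))
```

The printed strings are the same ones shown above, minus the spacing and brackets added for readability.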
At a quick glance you can see that the fractional portions of 4.32 and 3.32 in binary follow the same pattern until their last few bits, which are bracketed above. Due to the IEEE 754 format, 4.32 has one fewer bit available to represent 0.32 than 3.32 does. The mantissa is only allowed 52 bits, and 4 (100 in binary) takes up one more bit than 3 (11 in binary): after the implicit leading 1, 4.32 spends two mantissa bits on its integer part and keeps 50 for the fraction, while 3.32 spends one and keeps 51. Each value is rounded to the nearest number that fits, so even though we typed the same .32 for both in the interpreter, the stored fractional parts are not the same. Below is a translation of the bracketed binary tails of 4.32 and 3.32 to decimal.
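The bit budget can be checked directly. The sketch below counts how many bits are left for the fractional part once the integer part has taken its share; fraction_bits and spacing are hypothetical helper names, and the arithmetic assumes a normal double with a 53-bit significand (52 stored bits plus the implicit leading 1).

```python
import math

def fraction_bits(value):
    """Number of significand bits that fall to the right of the binary point."""
    # frexp returns value == f * 2**exponent with 0.5 <= f < 1, so `exponent`
    # is the number of bits needed for the integer part of value.
    exponent = math.frexp(value)[1]
    return 53 - exponent

def spacing(value):
    """Gap between neighbouring floats near value: one unit in the last bit."""
    return 2.0 ** -fraction_bits(value)

for number in (4.32, 3.32):
    print(number, fraction_bits(number), spacing(number))
```

The gap between representable numbers near 4.32 comes out twice as wide as the gap near 3.32, which is the lost bit.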
4.32's bracketed fractional value in binary
0.000000000000000000000000000000000000000000000 [01000]
4.32's bracketed fractional value converted to decimal
0.00000000000000710542735760100185871124267578125
3.32's bracketed fractional value in binary
0.0000000000000000000000000000000000000000000000 [01111]
3.32's bracketed fractional value converted to decimal
0.000000000000006661338147750939242541790008544921875
error between the two bracketed fractional values in decimal
4.440892098500626e-16
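The same conversion can be done in a few lines of Python instead of an online converter. This sketch assumes, as in the listing above, that the bracketed bits of 4.32 end at fractional position 50 and those of 3.32 end at position 51; the variable names are illustrative.

```python
# Bracketed tail of 4.32: the five bits 01000, ending at fractional bit 50.
tail_432 = int("01000", 2) * 2.0 ** -50
# Bracketed tail of 3.32: the five bits 01111, ending at fractional bit 51.
tail_332 = int("01111", 2) * 2.0 ** -51

difference = tail_432 - tail_332

print(tail_432)
print(tail_332)
print(difference)  # 4.440892098500626e-16
```

Both tails are small multiples of powers of two, so the subtraction is exact and the difference is precisely 2 ** -51.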
Below we revisit our initial example and compare the error between the two representations of 0.4.
(~)$ python
>>> x = 4.32 - 3.92
>>> x
0.40000000000000036
>>> y = 3.32 - 2.92
>>> y
0.3999999999999999
>>> x - y # error between the two terms
4.440892098500626e-16
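The error due to bit shaving and the difference between the two values Python presented to us are the same! That is no coincidence: 3.92 and 2.92 both lie between 2 and 4, so they are stored with the same spacing and their rounding errors cancel, leaving only the mismatch between 4.32 and 3.32.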
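For a final sanity check, the whole argument can be replayed with exact rational arithmetic. This is a sketch using the standard-library fractions module; Fraction(float) converts the stored value without rounding, and the variable names are only illustrative.

```python
from fractions import Fraction

x = 4.32 - 3.92
y = 3.32 - 2.92

# Exact gaps between the stored values; no further rounding happens here.
gap_432_332 = Fraction(4.32) - Fraction(3.32)
gap_392_292 = Fraction(3.92) - Fraction(2.92)

print(gap_432_332 - 1)   # 1/2251799813685248, which is 2**-51
print(gap_392_292 - 1)   # 0
print(Fraction(x) - Fraction(y) == Fraction(1, 2**51))  # True
```

The stored 3.92 and 2.92 are exactly 1 apart, while the stored 4.32 and 3.32 are 1 plus 2 ** -51 apart, and that surplus is what shows up in x - y.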