The Problem of Float Accuracy in DBMS and Programming Languages

1. The "My Compiler is Broken" Phase
Every developer eventually goes through a stage of reality denial. You write a piece of trivial code, and the computer returns a result that contradicts primary school arithmetic.
In PostgreSQL, you run a simple SELECT SUM(amount) and get 10500.55. Five minutes later, you run the exact same query on the same data, and it returns 10500.5499999998. You check the logs—no changes. You start questioning the database's sanity.
In JS or PHP, you write a condition: if (a + 1 === a). Common sense tells you this can never be true. Yet for a large enough a it is, and a loop that increments a float spins forever.
A junior dev rushes to open a bug report in the language's tracker. A senior engineer sighs and reaches for the IEEE 754 standard documentation.
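Both symptoms take seconds to reproduce. A minimal sketch in Python, whose floats are the same IEEE 754 doubles that back JS numbers and Postgres float8:

```python
# The sum of 0.1 and 0.2 is not the double that 0.3 rounds to
print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False

# Past 2**53, adding 1 to a double changes nothing
a = 2.0 ** 53
print(a + 1 == a)        # True: a loop guarded by a+1 never advances
```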
2. The Naive Model
Our mental model of numbers is shaped by school and our habit of using the decimal system. We believe computers treat numbers as abstract mathematical objects.
We rely on two postulates:
- The number `0.1` in code is exactly one-tenth.
- Addition is associative: ((a + b) + c = a + (b + c)). The order of operations does not change the sum.
In the world of REAL, FLOAT, and DOUBLE PRECISION, both of these postulates are false.
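The associativity postulate fails on the very first numbers you try; a quick Python check:

```python
left = (0.1 + 0.2) + 0.3   # rounding error from 0.1 + 0.2 carries through
right = 0.1 + (0.2 + 0.3)  # 0.2 + 0.3 happens to round to exactly 0.5
print(left == right)       # False: float addition is not associative
```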
3. The Reality of Binary Projections
Computers don't store "numbers"; they store binary projections. The problem arises during the translation from "human" decimal to "machine" binary.
Think of the fraction (1/3). In the decimal system, you cannot write it precisely: it’s 0.3333... going on forever. We eventually run out of paper and cut it off, creating a tiny error.
The number 0.1 (one-tenth) faces the same fate in binary. A fraction is finite only if the denominator's prime factors are factors of the system's base. In decimal (base 10 = 2 × 5), fractions with denominators like 2, 5, and 10 are finite. In binary (base 2), only fractions with denominators that are powers of 2 (2, 4, 8, 16...) are finite.
Since the denominator 10 contains the prime factor 5, which 2 does not, 0.1 becomes an infinitely repeating fraction in binary: 0.0001100110011....
A Float or Double is like a fixed-size sticky note (32 or 64 bits). The computer is forced to truncate this infinite "tail." Before any math even happens, 0.1 is no longer 0.1. It becomes something like 0.10000000000000000555....
When you add two of these "errors" together, the result (0.30000000000000004) no longer matches the expected "clean" 0.3.
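You can see the exact binary value Python actually stores for 0.1 by converting the float to Decimal (that conversion is exact, digit for digit):

```python
from decimal import Decimal

# Decimal(float) exposes the exact value of the rounded binary fraction
print(Decimal(0.1))
# 0.1000000000000000055511151231257827021181583404541015625

print(Decimal(0.1 + 0.2))
# 0.3000000000000000444089209850062616169452667236328125
```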
4. The "Giant Eats the Tiny" Problem
The second issue lies in how floating-point numbers are structured: a mantissa [^1] and an exponent [^2].
Imagine you can only store 4 significant digits. You want to add 1000 and 0.001:
[ 1.000 \times 10^3 + 1.000 \times 10^{-3} ]
To add them, the CPU must align the exponents. The smaller number becomes (0.000001 \times 10^3). Now we add the mantissas: (1.000 + 0.000001 = 1.000001). But we only have room for 4 digits! The extra digits are sliced off, and we are left with exactly 1000.
The addition happened, but the result didn't change. This is precision loss due to bit depth. The large number "swallowed" the small one because the data type lacked the "resolution" to see the difference. This is why in floating-point math, (Large + Small) + Small might lose data that (Small + Small) + Large would preserve.
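The 4-digit thought experiment maps directly onto 64-bit doubles, where the swallowing starts at (2^{53}); a sketch of order-dependent data loss:

```python
big = 2.0 ** 53      # above this point, the gap between neighboring doubles is 2
ones = [1.0] * 1000

# Giant first: every 1.0 is swallowed, one at a time
big_first = big
for x in ones:
    big_first += x   # each addition rounds straight back to big

# Small first: the 1.0s accumulate into 1000.0, which is big enough to survive
small_first = sum(ones) + big

print(big_first == big)     # True: a thousand additions changed nothing
print(small_first - big)    # 1000.0
```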
5. PostgreSQL and Parallelism
Why does SUM() in Postgres fluctuate even if the data is static?
Modern DBs use Parallel Aggregation. The planner splits the table into chunks and hands them to multiple Workers. Each Worker calculates a Partial Aggregate, then the Leader sums those results.
The order in which workers return their sums is not guaranteed. One day you sum (A + B) + C, the next day it’s A + (B + C). Because of the "swallowing" effect mentioned above, different summation orders of billions of floats result in mathematically different totals. It’s not a bug; it’s physics.
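The same effect is easy to simulate outside the database. A toy "table" of three rows, summed under two hypothetical plans that differ only in grouping:

```python
rows = [2.0 ** 53, 1.0, 1.0]              # identical data every "run"

plan_a = (rows[0] + rows[1]) + rows[2]    # giant first: both 1.0s are swallowed
plan_b = rows[0] + (rows[1] + rows[2])    # 1.0s meet first; their sum 2.0 survives

print(plan_a)  # 9007199254740992.0
print(plan_b)  # 9007199254740994.0
```

Scale this up to billions of rows and thousands of worker orderings, and a fluctuating SUM() is the expected outcome, not an anomaly.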
6. PHP/JS and Large Integers
In dynamic languages, integers quietly become doubles. In PHP, an Integer silently promotes to Float (Double) once it exceeds PHP_INT_MAX; in JS, every Number already is a Double, and integers simply lose exactness past Number.MAX_SAFE_INTEGER.
A 64-bit Double has 53 bits of precision. Once a number exceeds (2^{53}) (roughly 9 quadrillion), the gap between adjacent representable numbers (one ULP, a "unit in the last place") becomes greater than 1.
In this range, the computer can store number (X) and number (X+2), but (X+1) physically cannot exist—it falls into the "gap" and rounds back to (X). This is how you get an infinite loop while doing $i++.
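Python integers are arbitrary-precision, so we can watch the rounding happen right at the conversion boundary; a sketch:

```python
n = 2 ** 53                       # exact Python int: 9007199254740992

print(float(n + 1) == float(n))   # True: the double for n+1 rounds back to n
print(float(n + 2) == float(n))   # False: n+2 is representable (the gap is 2)
print(int(float(n + 1)))          # 9007199254740992 -- the +1 never existed
```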
7. Practical Solutions
How do we deal with this?
- In Databases (Postgres, MySQL): Use `NUMERIC` or `DECIMAL`. These types don't use the FPU (Floating Point Unit); they are handled in software, storing numbers as arrays of digits. It's slower, but it ensures `0.1 + 0.2` is exactly `0.3`.
- For Money: NEVER use Float/Double for currency. Use `DECIMAL` in the DB, or use the "Money Pattern": store values as integers in the smallest unit (cents/pennies). 100 cents is an integer, and integers don't have "tails."
- In Code: If you need absolute precision (billing, scientific research), use arbitrary-precision libraries:
  - Python: the `decimal` module.
  - PHP: `BCMath` or `GMP`.
  - Java: `BigDecimal`.
  - JS: `BigInt` (for integers) or libraries like `decimal.js`.
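Both recommendations are one-liners in Python; a sketch using the standard decimal module and the integer-cents pattern (the price and quantity here are made-up values):

```python
from decimal import Decimal

# Arbitrary precision: construct Decimals from STRINGS, never from floats
a = Decimal("0.1")
b = Decimal("0.2")
print(a + b == Decimal("0.3"))  # True: exact decimal arithmetic

# Money Pattern: currency lives as integer cents, formatting happens at display
price_cents = 1999              # $19.99
qty = 3
total_cents = price_cents * qty
print(f"${total_cents // 100}.{total_cents % 100:02d}")  # $59.97
```

Note the string constructor: `Decimal(0.1)` would faithfully capture the float's binary error, defeating the purpose.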
8. The Golden Rule
Never compare floats for equality (==). Instead, check if the absolute difference between them is less than a tiny threshold (epsilon):
[ | a - b | < 0.00001 ]
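In Python the standard library already implements this as math.isclose, which uses a relative tolerance by default; that matters, because a fixed absolute epsilon breaks down at large magnitudes where the gap between doubles exceeds it:

```python
import math

a = 0.1 + 0.2
b = 0.3
print(a == b)                # False
print(abs(a - b) < 1e-9)     # True: manual epsilon check
print(math.isclose(a, b))    # True: relative tolerance (default rel_tol=1e-09)

# A fixed epsilon fails at large scales
x = 1e16
y = x + 2.0                  # the very next representable double
print(abs(x - y) < 1e-9)     # False, yet x and y are as close as doubles get
print(math.isclose(x, y))    # True: relative comparison still works
```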
Terminology
A floating-point number is stored as two parts: mantissa x 2^exponent. The mantissa holds the significant digits the computer actually stores. The exponent is a power of two that sets the number's scale (how large or small it is). Simplified:
[^1]: The mantissa is responsible for the precision.
[^2]: The exponent is responsible for the size of the number.
Memory for the mantissa is limited. Therefore, the computer can only store a certain number of digits. When a number becomes very large, there is no room for small changes. Because of this:
- A large number may not change after adding 1
- A small number may "disappear" when added to a larger number