Introduction to Fixed Point Numbers #
Fixedpoint numbers are a common representation of real numbers in computer systems, used for a wide range of applications such as embedded systems, signal processing, and digital communications. Fixedpoint numbers are typically represented as binary numbers with a fixed number of bits allocated for the integer and fractional parts. The range of fixedpoint numbers that can be represented depends on the number of bits allocated for the integer and fractional parts.
Fixed point numbers have the range :
2^(Integer Part) to 2^(Integer Part)2^(Fractional Part)
and a resolution / step size of :
2^(Fractional Part)
8 Bit Signed Fixed Point Number Example #
Sign Integer Fractional
Bit Part Part
  
v v v
++ +++++ ++++
 0   0  1  0  1   1  0  1 
++ +++++ ++++

The sign bit is a single bit that indicates the sign of the fixedpoint number. It can be either 0 or 1, where 0 represents a positive number and 1 represents a negative number.

The integer part consists of a certain number of bits that represent the whole number portion of the fixedpoint number. In this example, we have 4 bits reserved for the integer part, which can represent values from 0 to 15.

The fractional part also consists of a certain number of bits that represent the decimal portion of the fixedpoint number. In this example, we have 3 bits reserved for the fractional part, which can represent values from 2^0 to 2^3.
Adding Fixed Point Numbers #
When adding fixedpoint numbers, the integer and fractional parts are added separately. If there is a carry from the fractional part addition, it is added to the integer part.
Example: Adding FixedPoint Numbers
Number 1: 1 1 0.1 1 (1.25 in decimal) 5 bits
Number 2: 0 1 1.0 1 (3.25 in decimal) 5 bits

Product: 0 0 1 0.0 0 (2 in decimal) 6 bits
In the example above, we add two fixedpoint numbers. The integer parts (1 + 3) and fractional parts (0.25 + 0.25) are added separately. The resulting sum is 2 in decimal.
A Python library that can help us explore fixed point math is fxpmath, if you have pip install via pip install fxpmath
.
from fxpmath import Fxp
a = Fxp(1.25, signed=True, n_word=5, n_frac=2)
b = Fxp(3.25, signed=True, n_word=5, n_frac=2)
c = a+b
c
c.bin()
Multiplying Fixed Point Numbers #
When multiplying fixedpoint numbers, the integer and fractional parts are multiplied separately, and the fractional part of the result is truncated or rounded as needed.
Example: Multiplying FixedPoint Numbers
Number 1: 1 1 0.1 1 (1.25 in decimal) 5 bits
Number 2: 0 1 1.0 1 (3.25 in decimal) 5 bits

Product: 1 1 1 0 1 1.1 1 1 1 (4.0625 in decimal) 10 bits
from fxpmath import Fxp
a = Fxp(1.25, signed=True, n_word=5, n_frac=2)
b = Fxp(3.25, signed=True, n_word=5, n_frac=2)
c = a*b
c
c.bin()
In the example above, we multiply two fixedpoint numbers. Note we grow by adding the number of bits!
BitGrowth #
One important consideration in fixedpoint arithmetic is bit growth, which can occur when adding or multiplying fixedpoint numbers. Bit growth refers to the possibility of the result of an operation having more bits than the original operands, which can result in overflow or loss of precision.
Addition Overflow #
For example, when adding fixedpoint numbers, if the sum of the fractional parts requires more bits than originally allocated, overflow can occur, resulting in loss of precision or truncation.
Example: Bit Growth in Addition
Number 1: 0 1.1 1 (1.75 in decimal) 4
Number 2: 0 0.1 1 (0.75 in decimal) 4

Sum: 0 1 0.1 0 (2.5 in decimal) 5
from fxpmath import Fxp
a = Fxp(1.75, signed=True, n_word=4, n_frac=2)
b = Fxp(0.75, signed=True, n_word=4, n_frac=2)
c = a*b
c
c.bin()
In the example above, when adding the fractional parts (0.75 + 0.75), the result requires 3 bits to represent the sum. What if we didn’t grow the correct number of bits and the number stayed 2 bits for the integer and 2 bits for the fractional part?
In that case our sum would be:
Example: Bit Growth in Addition
Number 1: 0 1.1 1 (1.75 in decimal) 4
Number 2: 0 0.1 1 (0.75 in decimal) 4

Sum: 1 0.1 0 (1.5 in decimal) 4
from fxpmath import Fxp
a = Fxp(1.75, signed=True, n_word=4, n_frac=2)
b = Fxp(0.75, signed=True, n_word=4, n_frac=2)
c = Fxp(a+b, signed=True, n_word=4, n_frac=2, overflow='wrap')
c
c.bin()
That is our sum wrapped around to 1.5. We do not have enough bits to store the result!
Addition Bit Growth Requirements #
Adding two N bit numbers together requires an extra bit to represent the result. If you have a signed number on the range
2^(N1) <= x <= 2^(N1) 1
Then adding two together produces either:
 most positive:
(2^N1) + (2^N1) = 2^(N+1)2
 most negative:
2^N  2^N = 2^(N+1)
When adding M
Nbit
numbers with the together the bitgrowth is
floor(log2(M * 2^N)) + 1
Multiplication Overflow #
Similarly, when multiplying fixedpoint numbers, the product of the fractional parts may require more bits, resulting in overflow
Example: Bit Growth in Multiplication
Number 1: 0 0 . 0 1 (0.25 in decimal) 4 bits
Number 2: 0 0 . 1 0 (0.5 in decimal) 4 bits

Product: 0 0 0 0 . 0 0 1 0 (0.125 in decimal) 8 bits
from fxpmath import Fxp
a = Fxp(0.25, signed=True, n_word=4, n_frac=2)
b = Fxp(0.5, signed=True, n_word=4, n_frac=2)
c = a*b
c
c.bin()
In the example above, when multiplying the fractional parts (0.25 x .5), the result requires 4 bits to represent the fractional product.
What would happen if we only allocated 2 bits for the fractional part, resulting in bit growth and potential overflow?
from fxpmath import Fxp
a = Fxp(0.25, signed=True, n_word=4, n_frac=2)
b = Fxp(0.5, signed=True, n_word=4, n_frac=2)
c = Fxp(a*b, signed=True, n_word=4, n_frac=2)
c
c.bin()
If we don’t grow the require bits and truncate back down to 1 sign bit, 1 integer bit, and 2 fractional bits our product will be 0!
Multiplication Bit Growth Requirements #
Consider multiplying an N
bit number by a M
bit number:
(2^(N1)) * (2^(M1)) = 2^(N+M2) < 2^(N+M1)
When you do multiplication you need to to grow N+M
bits!
Always Allocate Enough Bits! #
To mitigate bit growth, careful consideration of the number of bits allocated for the integer and fractional parts is necessary. It’s important to allocate enough bits to accommodate the expected range and precision of the fixedpoint numbers being used in calculations to avoid overflow or loss of precision.