
Introduction

The finite-word representation of fractional numbers is known as fixed-point. Fixed-point is an interpretation of a 2's complement number; it is usually signed, but unsigned representations are possible as well. It extends our finite word length from a finite set of integers to a finite set of rational numbers [1]. A fixed-point representation of a number consists of integer and fractional components. The bit length is defined as:

$$

X_{Nbits} = X_{Integer Nbits} + X_{Fraction Nbits} + 1

$$







IWL is the integer word length, FWL is the fractional word length, and WL is the word length. For 2's complement fixed-point representation, $ WL = 1 + IWL + FWL $. With this representation the range of the number is $ \left[ -2^{IWL}, 2^{IWL} \right) $ with a step size (resolution) of $ 2^{-FWL} $ [2]. A fixed-point number is defined by its format (wl, iwl, fwl) or by its properties: range, resolution, and bias.
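As a quick check of these relations, here is a minimal sketch in plain Python. The helper name `format_properties` is my own for illustration (it is not part of any library), and it assumes a signed format with non-negative `iwl` and `fwl`.

```python
def format_properties(iwl, fwl):
    """Return (wl, minimum, maximum, resolution) for a signed fixed-point format.

    Hypothetical helper: assumes 2's complement with iwl >= 0 and fwl >= 0.
    """
    wl = 1 + iwl + fwl                    # WL = 1 + IWL + FWL
    resolution = 2.0 ** -fwl              # step size 2^-FWL
    minimum = -(2.0 ** iwl)               # lower bound is included
    maximum = 2.0 ** iwl - resolution     # upper bound 2^IWL is excluded
    return wl, minimum, maximum, resolution


print(format_properties(0, 3))    # W=(4,0,3)
print(format_properties(0, 15))   # W=(16,0,15)
```

For W=(4,0,3) this gives a range of -1.0 to 0.875 in steps of 0.125, which is the half-open interval $ \left[ -1, 1 \right) $ sampled at the resolution.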

As described, a fixed-point number is defined by its range and resolution, instead of the number of bits. When designing a fixed-point system it is logical to use range and resolution requirements and algorithmically resolve the number of bits when it comes time to implement the design. In other words, start with range and resolution when designing / specifying and move to the number of bits required when implementing.
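Going from requirements to bits can be sketched as follows. This is only one possible way to resolve the word lengths; the function name `resolve_format` and its arguments are assumptions of mine, and it targets a signed, symmetric, half-open range $ \left[ -max, max \right) $.

```python
import math


def resolve_format(max_abs, resolution):
    """Return the smallest signed (wl, iwl, fwl) that meets the requirements.

    Hypothetical helper: covers [-2^iwl, 2^iwl) with 2^iwl >= max_abs and a
    step of 2^-fwl <= resolution. The upper bound itself is not representable.
    """
    fwl = max(0, math.ceil(-math.log2(resolution)))   # enough fraction bits
    iwl = max(0, math.ceil(math.log2(max_abs)))       # enough integer bits
    return 1 + iwl + fwl, iwl, fwl


print(resolve_format(1, 2 ** -15))   # requirements of a [-1, 1) coefficient
print(resolve_format(8, 1 / 16))
```

The first call resolves to W=(16,0,15) and the second to W=(8,3,4), so the range and resolution stay in the specification and the bit counts fall out at implementation time.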

Typically, a shorthand is used to represent the format. The Q-format is common when discussing fixed-point processors. I will not use the Q-format because it is not as flexible and can be confused with the notation used in older fixed-point processor documents [3]. Here I will use an explicit notation, W=(wl,iwl,fwl). This notation defines the total bit length, the number of integer bits, and the number of fraction bits. For example, W=(4,0,3) is a 4-bit number with three fractional bits and a sign bit (the sign bit is implicit). All fixed-point numbers in this post will be 2's complement representation (the sign bit is always present).

Why Use Fixed-Point

Why use fixed-point representation? Why not simply normalize all values to an integer range and deal with integers? Depending on your point of view this might be obvious or not. I have found that many designers prefer to normalize all values to integers, but this produces very unreadable code and documentation. Fixed-point is normally used for convenience (similar to the use of $ \exp{j\omega} $ ). For example, if I am looking at source code and I see:

    c0 = fixbv(0.0032, min=-1, max=1, res=2**-15)
    c1 = fixbv(-0.012, min=-1, max=1, res=2**-15)

I can easily relate this back to the coefficients of a filter but if I see the integer use only:

    c0 = intbv(0x0069, min=-2**15, max=2**15)
    c1 = intbv(-0x0189, min=-2**15, max=2**15)

additional information (documentation) is required. With integers only, it is difficult to understand what the values are intended to represent. In addition to the representation benefits, tools to handle rounding and overflow are common with fixed-point types.

Now that we have given the basic definition of a fixed-point binary word and a couple of reasons why fixed-point might be used, let's take a look at some fractional values converted to a fixed-point type. Table 1 is an example of fixed-point representations.

Table 1: Fixed-Point Examples

| Binary | Hex | Integer | Floating Point Fraction | Fixed-Point Fraction | Actual |
|---|---|---|---|---|---|
| 0100000000000000 | 4000 | 16384 | 0.50000000 | 0.50000000 | 1/2 |
| 0010000000000000 | 2000 | 8192 | 0.25000000 | 0.25000000 | 1/4 |
| 0001000000000000 | 1000 | 4096 | 0.12500000 | 0.12500000 | 1/8 |
| 0000100000000000 | 0800 | 2048 | 0.06250000 | 0.06250000 | 1/16 |
| 0000010000000000 | 0400 | 1024 | 0.03125000 | 0.03125000 | 1/32 |
| 0000001000000000 | 0200 | 512 | 0.01562500 | 0.01562500 | 1/64 |
| 0000000100000000 | 0100 | 256 | 0.00781250 | 0.00781250 | 1/128 |
| 0000000010000000 | 0080 | 128 | 0.00390625 | 0.00390625 | 1/256 |
| 0000000001000000 | 0040 | 64 | 0.00195312 | 0.00195312 | 1/512 |
| 0000000000100000 | 0020 | 32 | 0.00097656 | 0.00097656 | 1/1024 |
| 0000000000010000 | 0010 | 16 | 0.00048828 | 0.00048828 | 1/2048 |
| 0000000000001000 | 0008 | 8 | 0.00024414 | 0.00024414 | 1/4096 |
| 0000000000000100 | 0004 | 4 | 0.00012207 | 0.00012207 | 1/8192 |
| 0000000000000010 | 0002 | 2 | 0.00006104 | 0.00006104 | 1/16384 |
| 0000000000000001 | 0001 | 1 | 0.00003052 | 0.00003052 | 1/32768 |
| 0010101010101011 | 2AAB | 10923 | 0.33333000 | 0.33334351 | 0.33333 |
| 0101101001111111 | 5A7F | 23167 | 0.70700000 | 0.70700073 | 0.707 |
| 0000000000001010 | 000A | 10 | 0.0003141592 | 0.00030518 | 0.0003141592 |
| 0000000000000011 | 0003 | 3 | 0.000086476908 | 0.00009155 | 0.000086476908 |
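The rows of Table 1 can be reproduced with a few lines of plain Python. This is a sketch under my own assumptions: the helper `to_fixed` is hypothetical, it targets the 16-bit format with 15 fractional bits used in the table, and it rounds to the nearest step with Python's built-in `round` (which rounds exact ties to even; none of the values below are ties).

```python
def to_fixed(value, fwl=15, wl=16):
    """Quantize a real value to the stored integer of a signed fixed-point word.

    Hypothetical helper: scales by 2^fwl, rounds to nearest, and raises
    instead of wrapping when the result does not fit in wl bits.
    """
    stored = round(value * 2 ** fwl)
    lowest, highest = -(2 ** (wl - 1)), 2 ** (wl - 1) - 1
    if not lowest <= stored <= highest:
        raise OverflowError(f"{value} does not fit in {wl} bits")
    return stored


for value in (0.5, 0.33333, 0.707, 0.0003141592, 0.000086476908):
    stored = to_fixed(value)
    word = stored & 0xFFFF                       # 2's complement bit pattern
    print(f"{word:016b} {word:04X} {stored:6d} {stored / 2 ** 15:.8f}")
```

The same helper maps the filter coefficients from the earlier snippet, 0.0032 and -0.012, to the integers 0x0069 and -0x0189.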

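| Bit | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
|---|---|---|---|---|---|---|---|---|
| Field (S = sign, F = fraction) | S | F | F | F | F | F | F | F |
| Value | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 1 |
| Weight | -1 | 1/2 | 1/4 | 1/8 | 1/16 | 1/32 | 1/64 | 1/128 |

Value = + 1/2 + 1/8 + 1/16 + 1/128, which equals + 0.5 + 0.125 + 0.0625 + 0.0078125 = 0.6953125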

The previous two examples illustrate how to represent a fractional value using fixed-point nomenclature. In general the value of the binary word is the following if the integer and fractional parts are treated as separate bit-vectors (i.e. don't share a bit index):

$$

{\displaystyle\sum_{k=0}^{IWL-1} b_{I}[k]2^k} + {\displaystyle\sum_{k=0}^{FWL-1}b_{F}[k]2^{-(k+1)}}

$$

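In the above equation, the integer bit weights increase with the bit index, while the fractional bit weights decrease with it ($ b_{F}[0] $ is the bit immediately to the right of the point). If visualized, the fractional word would be flipped relative to the depiction above. The sign bit is not part of either sum; in a 2's complement word it contributes $ -2^{IWL} $ when it is set.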
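The equation can be sketched directly in Python. The function `fixed_value` is a hypothetical helper of mine; it takes the word as a string with the sign bit first, and it adds the 2's complement sign term $ -2^{IWL} $, which the equation above leaves out.

```python
def fixed_value(bits, iwl, fwl):
    """Evaluate a signed fixed-point word given as a bit string, sign bit first.

    Hypothetical helper: the integer and fractional parts are treated as
    separate bit-vectors, as in the equation in the text.
    """
    if len(bits) != 1 + iwl + fwl:
        raise ValueError("bit string length must equal 1 + iwl + fwl")
    sign = int(bits[0])
    integer = bits[1:1 + iwl]
    fraction = bits[1 + iwl:]

    value = -sign * 2.0 ** iwl                   # 2's complement sign weight
    # b_I[k]: index 0 is the integer bit next to the point, weight 2^k
    for k, bit in enumerate(reversed(integer)):
        value += int(bit) * 2.0 ** k
    # b_F[k]: index 0 is the fraction bit next to the point, weight 2^-(k+1)
    for k, bit in enumerate(fraction):
        value += int(bit) * 2.0 ** -(k + 1)
    return value


print(fixed_value("01011001", 0, 7))   # the 8-bit word depicted above
print(fixed_value("01101001", 3, 4))   # W=(8,3,4)
```

The first call returns 0.6953125, matching the hand calculation for the word 01011001.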

Multiplication and Addition Examples

The following are fixed-point examples for multiplication and addition. Fixed-point subtraction can be calculated in the same manner as 2's complement subtraction (addition of a negative); the only difference is the "point" bookkeeping, which is the same as for addition. For addition and subtraction the points need to be aligned before the operation. The "point" bookkeeping is usually done at design time and not at run time (I say usually because there is the idea of block fixed-point, but that is beyond the scope of this post).

Multiplication

Fixed-point multiplication is the same as 2's complement multiplication but requires the position of the "point" to be determined after the multiplication to interpret the result correctly. Determining the "point's" position is a design task; the actual implementation does not know (or care) where the "point" is located. This is true because fixed-point multiplication is exactly the same as 2's complement multiplication, so no special hardware is required.

The following is an example and illustrates $ 6.5625 (W=(8,3,4)) * 4.25 (W=(8,5,2)) $.

    0110.1001  ==  6.5625
    000100.01  ==  4.25

            01101001
          x 00010001
        ------------
            01101001
           00000000
          00000000
         00000000
        01101001
       00000000
      00000000
     00000000
    ----------------
    x000011011111001  ==  0000011011.111001  ==  27.890625

The number of bits required for the product (result) is the multiplicand's WL + the multiplier's WL ($WL_{multiplicand} + WL_{multiplier}$). It is typical for the results of multiplication (and addition) to be resized and the number of bits reduced. In fixed-point this makes intuitive sense because the smaller fractional bits are discarded and the value is rounded based on the discarded bits. Additional hardware is commonly included to perform the rounding task and reduce the number of bits to a reasonable data-path bus width.
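The bookkeeping for the example above can be sketched with plain Python integers. Both helpers are hypothetical names of mine. `fixed_multiply` follows the rule stated in the text (product WL is the sum of the operand WLs, and the fractional bits add). `resize_fraction` shows one possible rounding scheme, round-half-up, which is my assumption and not necessarily what a given data path implements.

```python
def fixed_multiply(a, a_fmt, b, b_fmt):
    """Multiply two stored integers and return (product, (wl, iwl, fwl)).

    Hypothetical helper: the multiply itself is an ordinary integer multiply;
    only the format of the result is tracked on the side.
    """
    a_wl, _, a_fwl = a_fmt
    b_wl, _, b_fwl = b_fmt
    wl = a_wl + b_wl              # product word length
    fwl = a_fwl + b_fwl           # fractional bits add
    iwl = wl - 1 - fwl            # remaining bits, minus the sign bit
    return a * b, (wl, iwl, fwl)


def resize_fraction(stored, fwl, new_fwl):
    """Reduce (or extend) the fractional bits, rounding half up (assumed)."""
    shift = fwl - new_fwl
    if shift <= 0:
        return stored << -shift                    # extending is exact
    return (stored + (1 << (shift - 1))) >> shift  # add half an LSB, then drop


product, fmt = fixed_multiply(0b01101001, (8, 3, 4), 0b00010001, (8, 5, 2))
print(product, fmt, product / 2 ** fmt[2])

rounded = resize_fraction(product, fmt[2], 4)      # keep 4 fractional bits
print(rounded, rounded / 2 ** 4)
```

The product is the integer 1785 with format (16, 9, 6), which reads as 27.890625; rounding to four fractional bits gives 446, or 27.875.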

Addition

Addition is a little more complicated because the points need to be aligned before performing the addition. Using the same numbers from the multiplication problem:

    0110.1001  ==  6.5625
    000100.01  ==  4.25

        0110.1001
    + 000100.01
    -------------
      001010.1101  ==  10.8125

When adding (or subtracting) two numbers, an additional bit is required for the result. When adding more than two numbers, all of the same word length $WL$, the number of bits required for the result is $WL + \lceil \log_2(N) \rceil$, where $N$ is the number of elements being summed.
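Here is a minimal sketch of the alignment and bit-growth bookkeeping for addition, again with plain Python integers. The names `fixed_add` and `accumulator_wl` are hypothetical, and the choice to always add one growth bit to the integer part is my assumption for the general case.

```python
def fixed_add(a, a_fmt, b, b_fmt):
    """Add two stored integers after aligning their points.

    Hypothetical helper: returns (sum, (wl, iwl, fwl)) with one extra
    integer bit so the result cannot overflow.
    """
    _, a_iwl, a_fwl = a_fmt
    _, b_iwl, b_fwl = b_fmt
    fwl = max(a_fwl, b_fwl)
    a_aligned = a << (fwl - a_fwl)      # pad the shorter fraction with zeros
    b_aligned = b << (fwl - b_fwl)
    iwl = max(a_iwl, b_iwl) + 1         # one bit of growth for the carry
    return a_aligned + b_aligned, (1 + iwl + fwl, iwl, fwl)


def accumulator_wl(wl, n):
    """Word length needed to sum n values of wl bits: wl + ceil(log2(n))."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return wl + (n - 1).bit_length()    # integer form of ceil(log2(n))


total, fmt = fixed_add(0b01101001, (8, 3, 4), 0b00010001, (8, 5, 2))
print(total, fmt, total / 2 ** fmt[2])
print(accumulator_wl(16, 2), accumulator_wl(16, 8), accumulator_wl(16, 9))
```

The sum is the integer 173 with four fractional bits, which reads as 10.8125. Summing two 16-bit values needs 17 bits, eight values need 19 bits, and nine values need 20 bits.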

Conclusion

Fixed-point representation is convenient and useful when dealing with signal processing implementations. This post is a basic introduction to fixed-point numbers. For more comprehensive coverage of the subject, see the references.

[1]: Yates, Randy, "Fixed-Point Arithmetic: An Introduction"

[2]: K.B. Cullen, G.C.M. Silvestre, and .J. Hurley, "Simulation Tools for Fixed Point DSP Algorithms and Architectures"

[3]: Texas Instruments Example

[4]: Ercegovac, Milos Lang, Thomas, "Digital Arithmetic"

[5]: Kuo, Lee, Tian, "Real-Time Digital Signal Processing"

log:

01-Jul-2015: Fixed typo in the fixed-point binary word equation.

09-Apr-2013: Updated to use tuple format and use mathjax.

04-Apr-2011: Initial post.