The IEEE 754 standard defines how floating point numbers are represented. A number has three parts:
Sign: The sign bit is a single bit at the MSB where
0 = Positive Number
1 = Negative Number
Exponent: The exponent is stored with a bias added to it, so that both positive and negative exponents can be represented. The exponent has a base of 2.
Exponent = Stored value at exponent bits – Bias
Stored value = 150
Bias = 127 (single precision)
Actual Exponent: 150 – 127 = 23
The stored values of all 0s (exponent -127) and all 1s (exponent +128) are reserved for special values: zero and denormalized numbers, and infinity and NaN, respectively.
Mantissa (aka Significand): The mantissa holds the fraction part of the normalized number, which provides the precision bits. The leading 1 is implicit and is not stored.
Example: binary 1101 = 1.101 x 2^3
Mantissa = 101
Exponent = 3
The IEEE 754 has two formats:
- Single Precision Floating Point (32 bit)
- Double Precision Floating Point (64 bit)
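To make the three fields concrete, the following is a minimal Python sketch that splits a single precision value into its sign, stored exponent, and fraction bits. The function name decode_single is made up for this illustration and is not part of the design; the sketch relies only on Python's standard struct module.

```python
import struct

def decode_single(x):
    """Split a value, rounded to single precision, into IEEE 754 fields.

    Hypothetical helper for illustration only.
    """
    # Pack as a big-endian 32 bit float, then reinterpret as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                    # bit 31
    stored_exponent = (bits >> 23) & 0xFF  # bits 30 downto 23, bias included
    fraction = bits & 0x7FFFFF           # bits 22 downto 0, implicit 1 not stored
    return sign, stored_exponent, fraction

# 13 is 1101 in binary, i.e. 1.101 x 2^3
sign, stored, fraction = decode_single(13.0)
print(sign, stored - 127, format(fraction, "023b"))
```

Subtracting the bias of 127 from the stored exponent gives the actual exponent, as in the exponent example above.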
Multiplying Two IEEE 32 bit Floating Point Numbers
Sign bit: The result is negative if exactly one of the operands is negative; otherwise it is positive.
Exponent: The two stored exponents are added. Since both carry the bias of 127, the sum contains the bias twice, so 127 is subtracted from it. Example:
Exp1: 200, Exp2: 150
Exp = Exp1 + Exp2 – Bias = 200 + 150 – 127 = 223
Mantissa: The mantissa is obtained by multiplying the two 23 bit input mantissas, each with the implicit 1 added at the MSB, which makes two 24 bit numbers. The product is then truncated to 23 bits, again leaving the leading 1 implicit.
Example: Mantissa = Mantissa1 x Mantissa2
1.01 x 1.0011 = 1.011111, and the leading 1 again becomes implicit
If the product is 2 or greater, it gains an extra bit at the MSB and must be normalized. For an 8 bit result, Mantissa (8 downto 1) is taken instead of (7 downto 0), which truncates the LSB, and 1 is added to the exponent.
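The three steps above (sign, exponent, mantissa) can be sketched in software as follows. This is an illustrative model, not the VHDL design: it assumes both operands are normal numbers and that the result exponent stays in range, and it truncates instead of rounding. The names fp32_multiply, float_to_bits, and bits_to_float are hypothetical.

```python
import struct

BIAS = 127

def float_to_bits(x):
    """Bit pattern of x rounded to single precision (helper for the demo)."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def bits_to_float(b):
    """Inverse of float_to_bits."""
    return struct.unpack(">f", struct.pack(">I", b))[0]

def fp32_multiply(a_bits, b_bits):
    """Multiply two normal single precision values given as 32 bit patterns.

    Assumptions: normal operands, in-range result, truncation (no rounding).
    """
    # Sign: negative only if exactly one operand is negative.
    sign = (a_bits >> 31) ^ (b_bits >> 31)
    # Exponent: add the stored exponents and remove one copy of the bias.
    exp = ((a_bits >> 23) & 0xFF) + ((b_bits >> 23) & 0xFF) - BIAS
    # Mantissa: restore the implicit 1 to get two 24 bit numbers.
    m1 = (a_bits & 0x7FFFFF) | 0x800000
    m2 = (b_bits & 0x7FFFFF) | 0x800000
    product = m1 * m2  # up to 48 bits
    if product >> 47:
        # Product is 2 or greater: shift the window up by one bit
        # and add 1 to the exponent.
        exp += 1
        fraction = (product >> 24) & 0x7FFFFF
    else:
        fraction = (product >> 23) & 0x7FFFFF
    return (sign << 31) | ((exp & 0xFF) << 23) | fraction

result = fp32_multiply(float_to_bits(1.25), float_to_bits(1.1875))
print(bits_to_float(result))
```

The operands 1.25 and 1.1875 are the binary values 1.01 and 1.0011 from the mantissa example.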
Top Level Design Components:
24 bit Multiplier
Two 8 bit Adders
46 bit 1 selector Multiplexer
2 input XOR gate
Floating Point Multiplier Controller.
M1: 24 bit Multiplier:
The two mantissas [input(22 downto 0)] are fed into a 24 bit multiplier, each with the implicit 1 at the MSB. The multiplier is based on the Booth recoding algorithm and produces a 48 bit output.
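The text does not say which variant of Booth recoding the multiplier uses. As a rough illustration, here is a sketch of the basic radix-2 form in Python; the function name booth_multiply and the choice of radix 2 are assumptions. Because Booth recoding works on two's complement values, the unsigned 24 bit mantissas are treated here as 25 bit numbers with a 0 sign bit.

```python
def booth_multiply(x, y, n):
    """Radix-2 Booth multiplication (illustrative sketch).

    x is any integer; y must fit in n bits as a two's complement number.
    Each pair of adjacent multiplier bits (y[i], y[i-1]) selects an action:
    01 adds x << i, 10 subtracts x << i, 00 and 11 do nothing.
    """
    y_bits = y & ((1 << n) - 1)
    product = 0
    prev = 0  # the implicit bit to the right of the LSB
    for i in range(n):
        cur = (y_bits >> i) & 1
        if cur == 0 and prev == 1:
            product += x << i   # end of a run of 1s
        elif cur == 1 and prev == 0:
            product -= x << i   # start of a run of 1s
        prev = cur
    return product

# Two 24 bit mantissas with the implicit 1 restored: 1.01 and 1.0011 in binary.
m1 = 0xA00000
m2 = 0x980000
product = booth_multiply(m1, m2, 25)  # 25 bits keeps the sign bit at 0
print(hex(product))
```

A run of consecutive 1s in the multiplier costs one subtraction and one addition regardless of its length, which is the motivation for the recoding.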
MX1: 46 bit 1 selector Multiplexer:
MX1 is a special multiplexer that selects the normalized part of the multiplier output. The 48th bit (the MSB) of the multiplier output is the selector. If the 48th bit is 0, multiplier output (45 downto 0) is selected; otherwise the product must be normalized, and output (46 downto 1) is selected. From the 46 bit multiplexer output, mux_out(45 downto 23) is taken as the final truncated mantissa, output (22 downto 0) of the floating point multiplier.
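The bit selection performed by MX1 can be modelled with shifts and masks. The sketch below is a software illustration under the assumption that the 48 bit product is held in a Python integer; mx1_select is a hypothetical name.

```python
def mx1_select(product48):
    """Model of the MX1 selection on a 48 bit multiplier output."""
    mask46 = (1 << 46) - 1
    selector = (product48 >> 47) & 1  # the 48th bit (MSB) of the product
    if selector == 0:
        mux_out = product48 & mask46         # output (45 downto 0)
    else:
        mux_out = (product48 >> 1) & mask46  # output (46 downto 1)
    mantissa = (mux_out >> 23) & 0x7FFFFF    # mux_out(45 downto 23)
    return selector, mantissa

# 1.5 x 1.5 = 2.25 needs normalization; 1.25 x 1.1875 does not.
print(mx1_select(0xC00000 * 0xC00000))
print(mx1_select(0xA00000 * 0x980000))
```

In both cases the leading 1 of the normalized product falls just above the selected window, so it stays implicit in the 23 bit result.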
AD1: 8 bit adder:
This adder adds the exponent parts [input(30 downto 23)] of the two inputs. Its carry in is connected to the 48th bit of the multiplier output, so that 1 is added to the exponent when normalization is needed. It is an 8 bit ripple carry adder.
AD2: 8 bit adder:
This adder subtracts 127 from the sum of the exponents. The one's complement of 127 is fed as the second input and a carry in of 1 is passed to the adder, which together add the two's complement of 127. The result is used as output (30 downto 23).
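The two adders can be modelled bit by bit to show why feeding the one's complement of 127 with a carry in of 1 performs the subtraction. The sketch below is illustrative only. The text does not say how the carry out of AD1 is handled, so this model simply drops it; the result is then correct whenever the final exponent fits in 8 bits. The function names are hypothetical.

```python
def full_adder(a, b, cin):
    """One bit full adder: returns (sum, carry out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def ripple_add8(a, b, cin):
    """8 bit ripple carry adder: the carry ripples from bit 0 to bit 7."""
    carry = cin
    result = 0
    for i in range(8):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry

def exponent_path(exp1, exp2, norm_bit):
    """AD1 followed by AD2 (carry outs dropped, an assumption of this sketch)."""
    # AD1: add the stored exponents; carry in is the 48th bit of the product.
    total, _ = ripple_add8(exp1, exp2, norm_bit)
    # AD2: one's complement of 127 plus a carry in of 1 adds -127 (mod 256).
    result, _ = ripple_add8(total, ~127 & 0xFF, 1)
    return result

print(exponent_path(200, 150, 0))
```

With the exponents 200 and 150 from the earlier example, the intermediate sum 350 does not fit in 8 bits, but the arithmetic is modulo 256 throughout, so the final value still comes out as 223.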
X1: 2 input XOR:
The MSBs of the two inputs are passed through an XOR gate to produce the sign bit of the output (31).
Synthesis, Mapping, Placement, Routing:
After the design was implemented and verified in behavioural simulation in ModelSim, the VHDL code was synthesized in Xilinx ISE. In this phase, the synthesizer translates the VHDL code and produces the post-synthesis simulation model. It then maps the model onto the different blocks of the FPGA and generates the post-map simulation model. Finally, it places the components, routes the interconnections between them, and generates the post-place-and-route simulation model. This last model was used to run the post-route simulation and check the functionality of the design.
1. Steve Hollasch, IEEE Standard 754 Floating Point Numbers
2. Article about Floating Point, Wikipedia
3. Siddamal, S.V., Banakar, R.M., Jinaga, B.C. ,B.V.B C.E.T, Hubli, “Design of High-Speed Floating Point Multiplier” 4th IEEE International Symposium on Electronic Design, Test and Applications, 2008. DELTA 2008.
4. Multiplying Floating Point Numbers
5. Booth's multiplication algorithm, Wikipedia