High-performance low power arithmetic units
This thesis considers the design of power- and area-efficient, high-performance arithmetic units for next-generation integrated circuits. Performance has traditionally been the primary driver of innovation in the semiconductor industry. In recent years, however, the design of low-power integrated circuits has become another significant area of research. In standard CMOS logic families such as static CMOS and domino, performance and power have always been at odds, and the design process has consisted of power-performance tradeoffs. We work around this problem by introducing a new circuit family called Limited Switch Dynamic Logic (LSDL), which is a power-performance optimized solution. The compression process forms the critical portion of a multiply operation, which in turn is a significant part of most floating-point operations on a chip. Compression is performed by 3-to-2 and 4-to-2 adders. A new technique and circuit to perform 4-to-2 carry-save addition in a multiplier is proposed. In addition to the standard performance benefits of LSDL circuits, this adder gains significant performance benefits from a carry relocation mechanism. Simulation results are compared with a product-line chip designed in the same technology, and the LSDL circuits are shown to improve power, performance, leakage, and area. A reference is given for the implementation of the 4-to-2 adder on a multiplier chip fabricated in 90nm technology. The shift operation forms an integral part of all floating-point units. We present a novel shifting technique called partial decode or modulo shift, designed using the LSDL circuit family, and present simulation results in 65nm technology. Traditional floating-point units have separate units for shifting and complementing. We instead integrate these functions into a single unit, and present a unique, high-performance, low-power, low-leakage, and low-area Shift and Negate Unit.
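As a behavioral illustration of the 4-to-2 carry-save compression mentioned above, the following minimal Python sketch reduces four operands to a sum word and a carry word using two cascaded 3-to-2 stages. It is a textbook arithmetic model only: the function names `csa_3to2` and `csa_4to2` are hypothetical, and the sketch does not model the LSDL circuit or the carry relocation mechanism proposed in the thesis.

```python
def csa_3to2(a: int, b: int, c: int) -> tuple[int, int]:
    """Textbook 3-to-2 carry-save stage on non-negative integers.

    Returns (sum_word, carry_word) with sum_word + carry_word == a + b + c.
    No carry propagates between bit positions inside this function.
    """
    sum_word = a ^ b ^ c                             # per-bit sum
    carry_word = ((a & b) | (a & c) | (b & c)) << 1  # per-bit majority, weight 2
    return sum_word, carry_word


def csa_4to2(a: int, b: int, c: int, d: int) -> tuple[int, int]:
    """4-to-2 compression modeled as two cascaded 3-to-2 stages (an assumption,
    not the thesis circuit). The two outputs add up to a + b + c + d."""
    s1, c1 = csa_3to2(a, b, c)
    return csa_3to2(s1, c1, d)


if __name__ == "__main__":
    s, c = csa_4to2(5, 9, 14, 3)
    # A single carry-propagate addition finishes the job.
    print(s + c)  # prints 31
```

In a multiplier, stages like this are applied repeatedly to the partial products, and one carry-propagate adder at the end combines the final sum and carry words.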
Many operations in modern microprocessors require circuitry for fast and power-efficient movement of data. Permute units in many chips move bytes of data quickly and efficiently, with the limitation that the byte boundaries of the data must be maintained. However, many applications, such as encryption, require the extraction and movement of a certain number of consecutive bits that are not necessarily byte aligned. To this end, we present a novel technique for performing bit-aligned permute, along with a circuit that implements it. Simulation results in 65nm technology are presented for the Bit Aligned Permute Unit.
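To make the byte-aligned versus bit-aligned distinction concrete, here is a minimal software sketch that extracts a run of consecutive bits starting at an arbitrary bit offset. The function name `extract_bits`, the big-endian bit numbering (bit 0 is the most significant bit of the first byte), and the sample data are assumptions made for illustration; this is not the circuit technique presented in the thesis.

```python
def extract_bits(data: bytes, bit_offset: int, width: int) -> int:
    """Return `width` consecutive bits of `data` as an integer.

    Bits are numbered from the most significant bit of data[0]
    (an assumed convention). The field may straddle byte boundaries.
    """
    total_bits = len(data) * 8
    if bit_offset < 0 or width < 0 or bit_offset + width > total_bits:
        raise ValueError("requested field lies outside the data")
    value = int.from_bytes(data, "big")
    shift = total_bits - bit_offset - width  # bits to the right of the field
    return (value >> shift) & ((1 << width) - 1)


if __name__ == "__main__":
    data = bytes([0b10110011, 0b01011100])
    # A 7-bit field starting 3 bits in spans both bytes.
    print(bin(extract_bits(data, 3, 7)))  # prints 0b1001101
```

A byte-granular permute could only return `data[0]` or `data[1]` whole; the field above crosses the boundary between them, which is the case a bit-aligned permute is meant to handle.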