Ultra Low-Voltage Low-Power CMOS 5-2 Compressor

Project Overview

This project implements an ultra low-voltage, low-power CMOS 5-2 compressor architecture based on Figure 13 from the IEEE research paper, "Ultra Low-Voltage Low-Power CMOS 4-2 and 5-2 Compressors for Fast Arithmetic Circuits" (T. K. B. Rao, D. Radhakrishnan, and P. V. Rao, 2007). The design has been developed and comprehensively simulated using Cadence Virtuoso with the gpdk045 (45 nm CMOS) technology node.

The primary objective is the optimization of Power, Delay, and Area, ensuring high performance suitable for energy-efficient high-speed arithmetic circuits used in VLSI and Digital Signal Processing (DSP) systems. This repository contains both the un-sized implementation (ULVLP) as well as the transistor-sized optimized implementation (ULVLP_Sizing).

Theory and Working Principle

Theory

A 5-2 compressor is a combinational arithmetic circuit used to reduce multiple one-bit inputs of equal significance into fewer outputs while preserving their total binary sum. It is widely employed in high-speed arithmetic units, particularly in multipliers, accumulators, and digital signal processing (DSP) systems to speed up multi-operand addition.

In a 5-2 compressor, five input bits of equal weight (X1, X2, X3, X4, X5) and two carry inputs (Cin1 and Cin2) are combined to produce four outputs: a sum bit (S) and three carry bits (Carry, Cout1, Cout2).

The mathematical relationship between inputs and outputs is expressed as: X1 + X2 + X3 + X4 + X5 + Cin1 + Cin2 = S + 2(Carry + Cout1 + Cout2)

This equation ensures that the total binary weight of inputs and outputs remains the same. The sum output (S) represents the parity (odd/even count) of the inputs, while the carry outputs represent the higher-weight contributions from groups of input bits.

Working Principle

Input Combination: The compressor receives seven one-bit inputs, which are combined through XOR and MUX networks.
Sum Generation: The sum (S) is the XOR of all inputs: S = X1 ⊕ X2 ⊕ X3 ⊕ X4 ⊕ X5 ⊕ Cin1 ⊕ Cin2. It is logic '1' when the number of HIGH inputs is odd.
Carry Generation: The carry outputs are majority functions, realized with MUX-based selection. With intermediate sums S1 = X1 ⊕ X2 ⊕ X3 and S2 = X4 ⊕ X5 ⊕ Cin1, one assignment consistent with the preservation equation is:
- Cout1 = Majority of (X1, X2, X3)
- Cout2 = Majority of (X4, X5, Cin1)
- Carry = Majority of (S1, S2, Cin2)
Each majority function is the carry of one full-adder stage, so the three carry bits together account for every weight-2 contribution.
Binary Preservation: The preservation equation confirms that the total weighted sum of the inputs equals the total weighted sum of the outputs. Hence, the circuit performs a lossless compression of bits.
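
The working principle can be checked with a short behavioral model. The sketch below is a minimal Python model, not the transistor-level design: it assumes the common decomposition into three full-adder stages, and the mapping of the stage carries onto the names Cout1, Cout2, and Carry is an assumption made for illustration. The function names (majority, compressor_5_2, verify_all) are hypothetical.

```python
from itertools import product


def majority(a, b, c):
    """Carry of a full adder: 1 when at least two of the inputs are 1."""
    return (a & b) | (b & c) | (a & c)


def compressor_5_2(x1, x2, x3, x4, x5, cin1, cin2):
    """Behavioral 5-2 compressor built from three full-adder stages.

    Assumption: the naming of the stage carries (Cout1, Cout2, Carry)
    is chosen here for illustration.
    """
    s1 = x1 ^ x2 ^ x3                # sum of stage 1
    cout1 = majority(x1, x2, x3)     # carry of stage 1
    s2 = x4 ^ x5 ^ cin1              # sum of stage 2
    cout2 = majority(x4, x5, cin1)   # carry of stage 2
    s = s1 ^ s2 ^ cin2               # parity of all seven inputs
    carry = majority(s1, s2, cin2)   # carry of stage 3
    return s, carry, cout1, cout2


def verify_all():
    """Check the preservation equation for all 2**7 = 128 input vectors."""
    for bits in product((0, 1), repeat=7):
        s, carry, cout1, cout2 = compressor_5_2(*bits)
        assert sum(bits) == s + 2 * (carry + cout1 + cout2), bits
    return True


print(verify_all())
```

Because every full-adder stage preserves the weighted sum of its own three inputs, the check passes for all 128 vectors regardless of which stage carry is given which name.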

Significance

The 5-2 compressor acts as a core element in Wallace Tree and Dadda Tree multipliers, where multiple partial products must be added quickly.
By reducing the number of addition stages, it achieves lower propagation delay, reduced power, and higher computational speed.
When implemented using optimized CMOS XOR-XNOR and MUX cells, the compressor operates effectively even at low supply voltages (0.8 V - 1.2 V) with full-swing outputs and a low Power-Delay Product (PDP).

Tool and Technology Used

| Parameter | Description |
|---|---|
| Tool Used | Cadence Virtuoso (Schematic Design, Transient Simulation, Layout & DRC/LVS) |
| Technology Node | 45 nm CMOS process (Generic PDK / gpdk045) |
| Simulation Environment | Cadence Analog Design Environment (ADE) |
| Supply Voltage (VDD) | 1 V |
| Transistor Models | nmos4 and pmos4 from 45 nm library |
| Transistor Sizing | Scaled according to paper ratios |
| Simulation Type | Transient analysis with all 128 input combinations verified |
| Expected Outputs | Sum, Carry, Cout1, Cout2 matching the binary sum of the inputs |
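
Covering all 128 input combinations in a single transient run is usually done by driving the seven inputs like a binary counter. The following sketch shows one way to derive the pulse periods; the slot time of 1 ns and the function names are hypothetical and are not taken from the project's ADE setup.

```python
def input_vector(index, n_inputs=7):
    """Return the input bits applied during time slot `index`.

    Bit k toggles every 2**k slots, like a binary counter, so slots
    0..127 cover every combination of seven inputs exactly once.
    """
    return tuple((index >> k) & 1 for k in range(n_inputs))


def pulse_periods(slot_time, n_inputs=7):
    """Period of a 50% duty-cycle pulse source for each input."""
    return [slot_time * 2 ** (k + 1) for k in range(n_inputs)]


SLOT = 1e-9  # hypothetical: 1 ns per input combination

vectors = [input_vector(i) for i in range(2 ** 7)]
assert len(set(vectors)) == 128  # every combination appears once

print(pulse_periods(SLOT))       # one period per input, fastest first
print("stop time:", SLOT * 2 ** 7)
```

With this scheme the slowest input completes exactly one period over the run, so the transient stop time equals 128 slot times.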

Comprehensive Implementation Details

Submodule Architecture

The design is modular: each building block shown as a sub-figure in the referenced paper was recreated from scratch and verified on its own testbench. The submodules are:

Figure 5e Module (fivee): The XOR-XNOR cell from Figure 5(e) of the paper, built from transmission-gate or multiplexer structures.
Figure 7b Module (sevenb): The Figure 7(b) subcircuit, used with static CMOS output buffering in carry and sum generation.
Figure 8b Module (eightb): The Figure 8(b) subcircuit, intended to reduce power and delay at internal nodes.
Main Compressor (comp): The integrated 5-2 architecture that combines the above subcircuits to generate the outputs Sum, Carry, Cout1, and Cout2.

Analytical Methodologies Reference

The simulations use Cadence ADE (Analog Design Environment) with calculator expressions, listed in Calculations.csv, to extract the following metrics:

Power: Described as the average of current times voltage and evaluated in ADE with average(VT("/<node>")) for each of the output pins cout1, cout2, carry, and sum. As written, VT() returns a node voltage, so this expression yields the time-averaged voltage of that node; averaging supply power would also require the supply current.
Delay: The time difference between the input crossing its threshold (0.55 V) and the corresponding output threshold crossing.
Power Delay Product (PDP): The product of the averaged quantity above and the delay (average(VT("/output")) * delay(...)).
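
The delay measurement can be illustrated outside the simulator. The sketch below is a hedged Python illustration of threshold-crossing delay on sampled waveforms, using linear interpolation between samples; it is not the Cadence delay() implementation, and the function names and the synthetic waveforms are hypothetical.

```python
def crossing_time(times, values, threshold, start=0.0):
    """First time at or after `start` where the waveform crosses `threshold`.

    Linear interpolation is used between adjacent samples.
    Returns None when no crossing is found.
    """
    for i in range(1, len(times)):
        t0, t1 = times[i - 1], times[i]
        v0, v1 = values[i - 1], values[i]
        if t1 < start or v0 == v1:
            continue
        # A crossing exists when the threshold lies between the two samples.
        if min(v0, v1) <= threshold <= max(v0, v1):
            t = t0 + (threshold - v0) * (t1 - t0) / (v1 - v0)
            if t >= start:
                return t
    return None


def propagation_delay(times, v_in, v_out, threshold):
    """Delay from the input crossing to the next output crossing."""
    t_in = crossing_time(times, v_in, threshold)
    if t_in is None:
        raise ValueError("input never crosses the threshold")
    t_out = crossing_time(times, v_out, threshold, start=t_in)
    if t_out is None:
        raise ValueError("output never crosses the threshold")
    return t_out - t_in


# Synthetic example: time in picoseconds, voltages in volts.
t = [0, 10, 20, 30, 40, 50]
vin = [0.0, 1.0, 1.0, 1.0, 1.0, 1.0]    # input rises between 0 and 10 ps
vout = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]   # output rises between 20 and 30 ps
print(propagation_delay(t, vin, vout, threshold=0.55))
```

The same threshold is applied to input and output here; using different input and output thresholds would only require passing two threshold values.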

Performance Metrics Comparison

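The table below lists the values measured at each output with and without custom transistor sizing. Values use ADE engineering suffixes (m = 1e-3, p = 1e-12). In these measurements the sized implementation shows delays of roughly 378 p to 508 p against 24 p to 30 p for the unsized one, and its Power Delay Product is correspondingly larger.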

| Parameter | Cout1 (without sizing) | Cout2 (without sizing) | Carry (without sizing) | Sum (without sizing) | Cout1 (with sizing) | Cout2 (with sizing) | Carry (with sizing) | Sum (with sizing) |
|---|---|---|---|---|---|---|---|---|
| Power Calculations | 500.03 m | 531.30 m | 515.62 m | 492.13 m | 501.42 m | 485.04 m | 515.233 m | 499.956 m |
| Delay Calculations | 25.585 p | 23.5241 p | 25.852 p | 29.9533 p | 396.945 p | 377.813 p | 403.559 p | 507.79 p |
| Power Delay Product | 12.793 p | 12.4984 p | 13.33 p | 14.741 p | 199.03 p | 183.256 p | 207.927 p | 253.873 p |
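
The Power Delay Product row can be reproduced from the other two rows. The short Python sketch below multiplies the tabulated power and delay values; the numbers are copied from the table above, while the dictionary layout and function name are hypothetical.

```python
# Values copied from the table (ADE suffixes: m = 1e-3, p = 1e-12).
WITHOUT_SIZING = {
    "cout1": (500.03e-3, 25.585e-12),
    "cout2": (531.30e-3, 23.5241e-12),
    "carry": (515.62e-3, 25.852e-12),
    "sum": (492.13e-3, 29.9533e-12),
}
WITH_SIZING = {
    "cout1": (501.42e-3, 396.945e-12),
    "cout2": (485.04e-3, 377.813e-12),
    "carry": (515.233e-3, 403.559e-12),
    "sum": (499.956e-3, 507.79e-12),
}


def pdp(power, delay):
    """Power Delay Product: the tabulated power value times the delay."""
    return power * delay


for label, table in (("without", WITHOUT_SIZING), ("with", WITH_SIZING)):
    for signal, (power, delay) in table.items():
        # Report in units of 1e-12 to match the "p" suffix in the table.
        print(f"{label:8s} {signal:6s} PDP = {pdp(power, delay) / 1e-12:.3f} p")
```

The products agree with the tabulated PDP values to within rounding of the last printed digits.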

Justification for the Inability to Perform DC and PVT Analysis

Combinational and Dynamic Nature of the Circuit: The 5-2 compressor is a purely combinational logic circuit that functions based on dynamic switching of input signals. DC analysis assumes steady-state conditions, but since the compressor operates only during input transitions, it has no fixed operating point suitable for DC evaluation.
Lack of Defined DC Biasing Points: In DC analysis, every node in the circuit must have a stable DC voltage or current path. The 5-2 compressor uses digital logic inputs (0 or VDD) without any constant biasing network, causing undefined or floating nodes that prevent DC convergence.
Simulator Convergence Issues: Due to the absence of stable biasing and multiple dependent nodes, the simulator cannot establish an initial operating point, often leading to "no DC solution" or "convergence error" messages. This prevents the completion of DC analysis required for further characterization.
Inapplicability of DC Analysis for Logic Behavior: The key parameters of a 5-2 compressor such as propagation delay, dynamic power, and switching behavior can only be analyzed using transient simulation, not DC analysis. Therefore, DC analysis is not meaningful for this type of circuit.
PVT Analysis Depends on Successful DC Operating Point: Process, Voltage, and Temperature (PVT) variations require a valid DC operating point to initialize simulations for each corner condition. Since DC analysis could not be performed successfully, PVT analysis could not be carried out either.

Problems Faced and Solutions Applied

Problems Faced During Implementation

Transistor Width Limitations: The 45 nm PDK imposed a minimum finger width of 120 nm, causing warnings when smaller widths (e.g., 10n, 25n) were used as per the paper's ratios.
Threshold Voltage Drop in Pass Transistors: Initial XOR-MUX designs showed degraded output voltage levels due to threshold voltage drop in NMOS-only paths, leading to non-full-swing outputs.
Complex Signal Routing and Symbol Management: Interconnecting multiple XOR and MUX symbols required careful node naming and wiring discipline, as missing or misconnected nodes resulted in simulation errors.
Simulation Convergence Issues: At low supply voltages (0.8 V), transient simulations occasionally failed to converge or showed distorted waveforms due to high resistance or weak drive strengths.

Problems Resolved Using Solutions

Transistor Width Correction: All transistor widths were scaled proportionally (minimum PMOS width = 1.2 μm, NMOS = 0.48 μm) to maintain correct drive ratios while satisfying the PDK minimum constraints.
Full-Swing Restoration: The XOR-XNOR cell from Figure 5(e) was used along with static CMOS output buffers (Figure 7(b)) to ensure full-swing voltage levels and strong output drive.
Hierarchical Design Methodology: Each sub-circuit (XOR, XNOR, MUX) was symbolized individually and tested before being used in the final schematic. This hierarchical approach eliminated node confusion and made debugging easier.
Simulation Stability: Proper time-step control and power supply ramping were introduced in the ADE setup. This improved simulation convergence and produced clean transient responses at all voltage levels.
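
The width correction can be pictured as applying one common scale factor to every transistor so that width ratios are preserved. The sketch below is a minimal illustration in Python that computes only the smallest factor satisfying the 120 nm minimum. The example widths of 10 nm and 25 nm are the ones mentioned in the problem description, and the function name is hypothetical. The factor actually used in the project is not stated here and may be larger.

```python
MIN_WIDTH_NM = 120  # minimum finger width reported for the 45 nm PDK


def scale_widths(widths_nm, min_width_nm=MIN_WIDTH_NM):
    """Scale all widths by one factor so the smallest meets the minimum.

    A single common factor keeps every width ratio unchanged.
    Widths that already satisfy the minimum are left as they are.
    """
    factor = max(1.0, min_width_nm / min(widths_nm))
    return factor, [w * factor for w in widths_nm]


factor, scaled = scale_widths([10, 25])
print(factor, scaled)
```

Choosing a larger common factor trades area and switching capacitance for drive strength while still keeping the same ratios.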

Visual Design and Results Reference

Below is a detailed walkthrough of the physical and schematic implementation of the compressor and its submodules, complemented by screenshots spanning calculation environments to final layouts.

1. Calculation and Analytical Environment

The Analog Design Environment (ADE) setups used to quantify Power, Delay, and Power-Delay Product (PDP).

Baseline Calculations Setup: The ADE expression outputs and values for the unsized design, simulated with the standard gpdk045 width/length ratios.
Sized Calculations Setup: The same metric expressions evaluated for the custom-sized implementation in ULVLP_Sizing.

2. Module Implementations and Schematics

Visual depictions of the structural hierarchies and component designs.

Figure 5e Reference Block: Schematic of the subcircuit corresponding to Figure 5e of the reference paper.
Figure 7b Reference Block: Schematic of the subcircuit corresponding to Figure 7b.
Figure 8b Reference Block: Schematic of the subcircuit corresponding to Figure 8b.
Complete 5-2 Compressor Block: The top-level schematic that connects the internal modules and maps the seven inputs to the Sum, Carry, Cout1, and Cout2 outputs.

3. Mask Layout Implementations

Physical layouts built from gpdk045 library geometries and checked with DRC/LVS.

Baseline Layout: The physical layout of the unsized 5-2 compressor, taken through the initial design rule checks.
Optimized Sized Layout: The layout using the custom W/L transistor dimensions, arranged to limit parasitic resistance and capacitance with the aim of improving area and power.

4. Waveforms and Transient Responses

Waveforms showing the logic responses and full voltage swing of the outputs.

Baseline Functional Waveform: Output waveforms plotted against input transitions that cover all input combinations.
Sized Functional Waveform: Confirms full-swing outputs after sizing, without major signal distortion or severe delay skew.

Conclusion

The 5-2 compressor architecture was successfully designed and simulated using Cadence Virtuoso in a 45 nm CMOS process. The design demonstrated correct functional behavior, satisfying the arithmetic relationship: X1 + X2 + X3 + X4 + X5 + Cin1 + Cin2 = S + 2(Carry + Cout1 + Cout2)

Simulation results verified that the proposed circuit produces accurate sum and carry outputs for all possible input combinations. The circuit achieved low power consumption, reduced propagation delay, and a low Power-Delay Product (PDP) compared to conventional adder-based compressors. This confirms that the chosen architecture is suitable for high-speed arithmetic and signal processing systems, particularly for partial product reduction in multipliers and MAC (Multiply-Accumulate) units operating at low supply voltages.

Setup and Execution Guide

Prerequisites

A Cadence installation with the gpdk045 libraries available.
A C-shell startup script that configures the Virtuoso environment (sourced in the launch steps below).

Standard Launch Procedures

To investigate the components directly via Cadence:

Download or clone this project repository into an active workspace path.
Open a terminal instantiated within the repository root directory.
Configure the C-shell environment sequentially by running:
```
csh
source /home/install/cshrc~
virtuoso
```
Access the Cadence Library Manager. The project comprises distinct libraries:
- ULVLP: Containing fundamental unsized subcircuit schemas (fivee, sevenb, eightb, comp).
- ULVLP_Sizing: Corresponding identical elements subjected to iterative physical sizing.
Open either the Schematic view to inspect the netlist connectivity or the Layout view to inspect the physical design and run DRC/LVS.
Start an ADE L or ADE Explorer session, load one of the saved states, and run the transient analysis.

Educational Reference Material

The measurement methods used in this project follow recorded walkthroughs on these topics:

Power Evaluation Methodologies: Integral-based calculation and direct net-node measurement.
Propagation Delay Evaluation: Time difference between input and output crossings of the 50% threshold voltage.
Process, Voltage, Temperature (PVT) Analysis: Configuration of corner conditions.

Project Specifications File Index

README.md: High-level and comprehensive technical outline (this document).
Note.txt: Notes with primary guidance and external instructional reference URLs on Cadence evaluation features.
Open.txt: The command sequence for a Unix terminal that sets up the environment and launches Virtuoso.
Calculations.csv: A comma-separated list of the Cadence calculator expressions for the outputs (/cout1, /cout2, /carry, /sum), including node references and the expressions that derive delay from the transient waveforms.
Ultra_low-voltage_low-power_CMOS_4-2_and_5-2_compressors_for_fast_arithmetic_circuits.pdf: The reference paper, used as the source for truth tables, node requirements, and overall circuit structure.

References

T. K. B. Rao, D. Radhakrishnan, and P. V. Rao, "Ultra Low-Voltage Low-Power CMOS 4-2 and 5-2 Compressors for Fast Arithmetic Circuits," IEEE Transactions on Circuits and Systems II: Express Briefs, Vol. 54, No. 5, pp. 412-416, May 2007.

Contributors

Chandan Sai Pavan Padala
D Rushikesh
KSVS Sobhita
Chebrolu Rishita
