This project implements an ultra low-voltage, low-power CMOS 5-2 compressor architecture based on Figure 13 from the IEEE research paper, "Ultra Low-Voltage Low-Power CMOS 4-2 and 5-2 Compressors for Fast Arithmetic Circuits" (C.-H. Chang, J. Gu, and M. Zhang, 2004). The design has been developed and comprehensively simulated using Cadence Virtuoso with the gpdk045 (45 nm CMOS) technology node.
The primary objective is the optimization of Power, Delay, and Area, ensuring high performance suitable for energy-efficient high-speed arithmetic circuits used in VLSI and Digital Signal Processing (DSP) systems. This repository contains both the un-sized implementation (ULVLP) as well as the transistor-sized optimized implementation (ULVLP_Sizing).
A 5-2 compressor is a combinational arithmetic circuit used to reduce multiple one-bit inputs of equal significance into fewer outputs while preserving their total binary sum. It is widely employed in high-speed arithmetic units, particularly in multipliers, accumulators, and digital signal processing (DSP) systems to speed up multi-operand addition.
In a 5-2 compressor, five input bits of equal weight (X1, X2, X3, X4, X5) and two carry inputs (Cin1 and Cin2) are combined to produce four outputs: a sum bit (S) and three carry bits (Carry, Cout1, Cout2).
The mathematical relationship between inputs and outputs is expressed as: X1 + X2 + X3 + X4 + X5 + Cin1 + Cin2 = S + 2(Carry + Cout1 + Cout2)
This equation ensures that the total binary weight of inputs and outputs remains the same. The sum output (S) represents the parity (odd/even count) of the inputs, while the carry outputs represent the higher-weight contributions from groups of input bits.
- Input Combination: The compressor receives seven one-bit inputs. These are logically combined through XOR and MUX networks.
- Sum Generation: The sum (S) is produced by performing XOR operations across all inputs: S = X1 ⊕ X2 ⊕ X3 ⊕ X4 ⊕ X5 ⊕ Cin1 ⊕ Cin2. This gives a logic '1' when the number of HIGH inputs is odd.
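- Carry Generation: The carry outputs are obtained through majority logic functions or MUX-based selection. To satisfy the preservation equation, the final carry must be taken over the intermediate XOR sums rather than over earlier carries. One grouping that does so, equivalent to three cascaded full adders, is:
- Cout1 = Majority of (X1, X2, X3)
- Cout2 = Majority of (X4, X5, Cin1)
- Carry = Majority of (X1 ⊕ X2 ⊕ X3, X4 ⊕ X5 ⊕ Cin1, Cin2). These majority functions generate the proper carry bits required to maintain binary equivalence.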
- Binary Preservation: The preservation equation confirms that the total weighted sum of the inputs equals the total weighted sum of the outputs. Hence, the circuit performs a lossless compression of bits.
- The 5-2 compressor acts as a core element in Wallace Tree and Dadda Tree multipliers, where multiple partial products must be added quickly.
- By reducing the number of addition stages, it achieves lower propagation delay, reduced power, and higher computational speed.
- When implemented using optimized CMOS XOR-XNOR and MUX cells, the compressor operates effectively even at low supply voltages (0.8 V - 1.2 V) with full-swing outputs and a low Power-Delay Product (PDP).
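The sum and preservation relationships described above can be checked exhaustively in software. The following is a minimal behavioral sketch, not the transistor-level design: it assumes the compressor behaves like three cascaded full adders, and the function names `majority`, `compress_5_2`, and `verify_all` are hypothetical helpers introduced only for illustration.

```python
from itertools import product


def majority(a, b, c):
    """Return 1 when at least two of the three bits are 1."""
    return (a & b) | (b & c) | (a & c)


def compress_5_2(x1, x2, x3, x4, x5, cin1, cin2):
    """Behavioral 5-2 compressor (assumed grouping: three cascaded full adders)."""
    s1 = x1 ^ x2 ^ x3                # intermediate sum of stage 1
    cout1 = majority(x1, x2, x3)     # carry of stage 1
    s2 = x4 ^ x5 ^ cin1              # intermediate sum of stage 2
    cout2 = majority(x4, x5, cin1)   # carry of stage 2
    s = s1 ^ s2 ^ cin2               # final sum: parity of all seven inputs
    carry = majority(s1, s2, cin2)   # carry of stage 3
    return s, carry, cout1, cout2


def verify_all():
    """Check the preservation equation for every one of the 2**7 input vectors."""
    for bits in product((0, 1), repeat=7):
        s, carry, cout1, cout2 = compress_5_2(*bits)
        assert sum(bits) == s + 2 * (carry + cout1 + cout2), bits
        assert s == sum(bits) % 2, bits
    return 2 ** 7


print(verify_all())  # prints 128
```

Each stage satisfies a + b + c = sum + 2 * carry on its own, which is why the three stage identities add up to the overall preservation equation.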
| Parameter | Description |
|---|---|
| Tool Used | Cadence Virtuoso (Schematic Design, Transient Simulation, Layout & DRC/LVS) |
| Technology Node | 45 nm CMOS process (Generic PDK / gpdk045) |
| Simulation Environment | Cadence Analog Design Environment (ADE) |
| Supply Voltage (VDD) | 1 V |
| Transistor Models | nmos4 and pmos4 from 45 nm library |
| Transistor Sizing | Scaled according to paper ratios |
| Simulation Type | Transient analysis with all 128 input combinations verified |
| Expected Outputs | Sum, Carry, Cout1, Cout2 matching the expected binary sum for every input combination |
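One common way to cover all 128 input combinations in a single transient run is to drive the seven inputs with binary-weighted pulse periods, so the input vector counts like a binary counter. The sketch below illustrates that idea only; the 1 ns step, the 50% duty cycle, the start-low convention, and the helper names are assumptions, not the stimulus settings actually used in this project.

```python
def stimulus_periods(n_inputs=7, step_ns=1.0):
    """Pulse period per input: input k holds each level for step_ns * 2**k."""
    return [step_ns * 2 ** (k + 1) for k in range(n_inputs)]


def input_code_at(t_ns, periods):
    """Logic levels at time t_ns for 50% duty pulses that start low."""
    return tuple(int((t_ns % p) >= p / 2) for p in periods)


def covered_codes(n_inputs=7, step_ns=1.0):
    """Sample the middle of every step over one period of the slowest input."""
    periods = stimulus_periods(n_inputs, step_ns)
    samples = ((i + 0.5) * step_ns for i in range(2 ** n_inputs))
    return {input_code_at(t, periods) for t in samples}


periods = stimulus_periods()
print(periods)               # [2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0]
print(len(covered_codes()))  # 128
```

Under these assumptions the transient stop time only needs to reach one full period of the slowest input for every combination to appear once.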
The design modularity stems from recreating fundamental blocks presented as sub-figures in the referenced paper. Each block has its own functional testbed. The submodules built from scratch include:
- Figure 5e Module (`fivee`): Core functional partial block consisting of optimized transmission gates or multiplexer designs.
- Figure 7b Module (`sevenb`): Core logical submodule aiding in carry generation or summation.
- Figure 8b Module (`eightb`): Specific subcircuit optimized for power-delay reduction at the internal nodes.
- Main Compressor (`comp`): The integrated 5-2 architecture aggregating the above subcircuits to generate standard outputs: Sum, Carry, Cout1, and Cout2.
The simulation utilizes Cadence ADE (Analog Design Environment) configured carefully with algorithmic expressions to estimate component metrics. Specific formulae coded into Cadence (referencing Calculations.csv parameters) assess:
- Power: Integration of current across voltage (average power function: `average(VT("/<node>"))`). The output pins `cout1`, `cout2`, `carry`, and `sum` are individually characterized. Note that `VT` returns a transient voltage waveform, so this expression as written averages the node voltage over the simulation window.
- Delay: Computed by capturing the time difference between an input threshold transition (0.55 V) and the corresponding output threshold crossing.
- Power Delay Product (PDP): Derived by multiplying the node average power and the absolute delay: `average(VT("/output")) * delay(...)`.
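The measurements listed above can be mimicked on sampled waveforms outside the simulator. The sketch below is not the Cadence calculator implementation: it is a plain-Python approximation, assuming piecewise-linear waveforms, and the helper names and the example waveforms (a 100 ps input ramp and an output ramp that starts 30 ps later) are invented for illustration. Only the 0.55 V threshold comes from this document.

```python
def crossing_time(times, values, threshold, rising=True):
    """First time the waveform crosses `threshold` in the given direction,
    using linear interpolation between samples; None if it never crosses."""
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        v0, v1 = values[i], values[i + 1]
        crosses_up = v0 < threshold <= v1
        crosses_down = v0 > threshold >= v1
        if (rising and crosses_up) or (not rising and crosses_down):
            return t0 + (threshold - v0) * (t1 - t0) / (v1 - v0)
    return None


def time_average(times, values):
    """Trapezoidal time average of a sampled waveform."""
    area = 0.0
    for i in range(len(times) - 1):
        area += 0.5 * (values[i] + values[i + 1]) * (times[i + 1] - times[i])
    return area / (times[-1] - times[0])


THRESHOLD_V = 0.55  # threshold quoted in this document

# Hypothetical waveforms, time in ps and voltage in V.
t_in, v_in = [0, 100, 200], [0.0, 1.0, 1.0]
t_out, v_out = [0, 30, 130, 200], [0.0, 0.0, 1.0, 1.0]

delay_ps = crossing_time(t_out, v_out, THRESHOLD_V) - crossing_time(t_in, v_in, THRESHOLD_V)
avg_out = time_average(t_out, v_out)
product = avg_out * delay_ps  # same form as average(...) * delay(...)
print(delay_ps, avg_out, product)
```

For these example waveforms the delay is about 30 ps and the time average of the output is 0.6, so the product has the same structure as the PDP expression, scaled by whatever quantity is being averaged.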
The following parameters provide analytical validation for the evaluated signals with and without custom transistor sizing optimizations.
Values carry the SI suffixes reported by the simulator (m = 10^-3, p = 10^-12).

| Parameter | Sizing | Cout1 | Cout2 | Carry | Sum |
|---|---|---|---|---|---|
| Power Calculations | Without sizing | 500.03 m | 531.30 m | 515.62 m | 492.13 m |
| Power Calculations | With sizing | 501.42 m | 485.04 m | 515.233 m | 499.956 m |
| Delay Calculations | Without sizing | 25.585 p | 23.5241 p | 25.852 p | 29.9533 p |
| Delay Calculations | With sizing | 396.945 p | 377.813 p | 403.559 p | 507.79 p |
| Power Delay Product | Without sizing | 12.793 p | 12.4984 p | 13.33 p | 14.741 p |
| Power Delay Product | With sizing | 199.03 p | 183.256 p | 207.927 p | 253.873 p |
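As a worked check of the tabulated values, each Power Delay Product entry should equal the corresponding power entry multiplied by the delay entry. The short script below repeats that arithmetic for all eight columns; the numbers are copied from the table, while the dictionary layout and the helper name `pdp_mismatch` are illustrative choices.

```python
MILLI, PICO = 1e-3, 1e-12

# (power in m, delay in p, reported PDP in p), copied from the results table.
TABLE = {
    ("without sizing", "Cout1"): (500.03, 25.585, 12.793),
    ("without sizing", "Cout2"): (531.30, 23.5241, 12.4984),
    ("without sizing", "Carry"): (515.62, 25.852, 13.33),
    ("without sizing", "Sum"): (492.13, 29.9533, 14.741),
    ("with sizing", "Cout1"): (501.42, 396.945, 199.03),
    ("with sizing", "Cout2"): (485.04, 377.813, 183.256),
    ("with sizing", "Carry"): (515.233, 403.559, 207.927),
    ("with sizing", "Sum"): (499.956, 507.79, 253.873),
}


def pdp_mismatch(power_m, delay_p, pdp_p):
    """Relative difference between power * delay and the reported PDP."""
    computed = (power_m * MILLI) * (delay_p * PICO)
    reported = pdp_p * PICO
    return abs(computed - reported) / reported


worst = max(pdp_mismatch(*row) for row in TABLE.values())
print(worst < 1e-3)  # True: every reported PDP matches power * delay to rounding
```

For example, 500.03 m multiplied by 25.585 p gives roughly 12.793 p, which is the first Power Delay Product entry.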
- Combinational and Dynamic Nature of the Circuit: The 5-2 compressor is a purely combinational logic circuit that functions based on dynamic switching of input signals. DC analysis assumes steady-state conditions, but since the compressor operates only during input transitions, it has no fixed operating point suitable for DC evaluation.
- Lack of Defined DC Biasing Points: In DC analysis, every node in the circuit must have a stable DC voltage or current path. The 5-2 compressor uses digital logic inputs (0 or VDD) without any constant biasing network, causing undefined or floating nodes that prevent DC convergence.
- Simulator Convergence Issues: Due to the absence of stable biasing and multiple dependent nodes, the simulator cannot establish an initial operating point, often leading to "no DC solution" or "convergence error" messages. This prevents the completion of DC analysis required for further characterization.
- Inapplicability of DC Analysis for Logic Behavior: The key parameters of a 5-2 compressor such as propagation delay, dynamic power, and switching behavior can only be analyzed using transient simulation, not DC analysis. Therefore, DC analysis is not meaningful for this type of circuit.
- PVT Analysis Depends on Successful DC Operating Point: Process, Voltage, and Temperature (PVT) variations require a valid DC operating point to initialize simulations for each corner condition. Since DC analysis could not be performed successfully, PVT analysis could not be carried out either.
- Transistor Width Limitations: The 45 nm PDK imposed a minimum finger width of 120 nm, causing warnings when smaller widths (e.g., 10n, 25n) were used as per the paper's ratios.
- Threshold Voltage Drop in Pass Transistors: Initial XOR-MUX designs showed degraded output voltage levels due to threshold voltage drop in NMOS-only paths, leading to non-full-swing outputs.
- Complex Signal Routing and Symbol Management: Interconnecting multiple XOR and MUX symbols required careful node naming and wiring discipline, as missing or misconnected nodes resulted in simulation errors.
- Simulation Convergence Issues: At low supply voltages (0.8 V), transient simulations occasionally failed to converge or showed distorted waveforms due to high resistance or weak drive strengths.
- Transistor Width Correction: All transistor widths were scaled proportionally (minimum PMOS width = 1.2 μm, NMOS = 0.48 μm) to maintain correct drive ratios while satisfying the PDK minimum constraints.
- Full-Swing Restoration: The XOR-XNOR cell from Figure 5(e) was used along with static CMOS output buffers (Figure 7(b)) to ensure full-swing voltage levels and strong output drive.
- Hierarchical Design Methodology: Each sub-circuit (XOR, XNOR, MUX) was symbolized individually and tested before being used in the final schematic. This hierarchical approach eliminated node confusion and made debugging easier.
- Simulation Stability: Proper time-step control and power supply ramping were introduced in the ADE setup. This improved simulation convergence and produced clean transient responses at all voltage levels.
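The width correction described above amounts to multiplying every width by one common factor, which keeps the drive ratios intact while lifting the smallest device above the PDK limit. The sketch below illustrates the arithmetic; pairing the 10n example width with the NMOS and 25n with the PMOS, and the factor of 48 inferred from the quoted 0.48 μm and 1.2 μm widths, are assumptions made for this example rather than details stated by the design files.

```python
PDK_MIN_WIDTH_NM = 120  # minimum finger width reported for gpdk045 in this document


def min_scale_factor(widths_nm):
    """Smallest common factor that lifts the narrowest device to the PDK minimum."""
    return PDK_MIN_WIDTH_NM / min(widths_nm.values())


def scale_widths(widths_nm, factor):
    """Scale all widths by one factor; reject results below the PDK minimum."""
    scaled = {name: width * factor for name, width in widths_nm.items()}
    too_small = [name for name, width in scaled.items() if width < PDK_MIN_WIDTH_NM]
    if too_small:
        raise ValueError(f"widths below PDK minimum: {too_small}")
    return scaled


paper_widths = {"nmos": 10, "pmos": 25}  # assumed pairing of the example widths
print(min_scale_factor(paper_widths))    # 12.0
print(scale_widths(paper_widths, 48))    # {'nmos': 480, 'pmos': 1200}
```

With a factor of 48 the PMOS-to-NMOS ratio stays at 2.5, and the results (480 nm and 1200 nm) coincide with the 0.48 μm and 1.2 μm minimum widths quoted above.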
Below is a detailed walkthrough of the physical and schematic implementation of the compressor and its submodules, complemented by screenshots spanning calculation environments to final layouts.
The Analog Design Environment (ADE) setups used to quantify Power, Delay, and Power-Delay Product (PDP).
- Baseline Calculations Setup: Depicts the baseline expression outputs and values for the transient nodes, simulated with the standard gpdk045 width/length ratios.
- Sized Layout Calculations Setup: Shows the evaluated metric expressions for the custom-sized implementation under the `ULVLP_Sizing` workspace, demonstrating how the outputs vary after sizing.
Visual depictions of the structural hierarchies and component designs.
- Figure 5e Reference Block: Schematic of the subcircuit corresponding to the Figure 5e logic gate architecture in the reference paper.
- Figure 7b Reference Block: Schematic corresponding to the Figure 7b logic gate configuration.
- Figure 8b Reference Block: Schematic of the Figure 8b logic architecture.
- Complete 5-2 Compressor Block: The complete schematic, which combines the internal modules to perform the full 5-2 compression from the seven inputs to the Sum, Carry, Cout1, and Cout2 outputs.
Representations of the DRC/LVS analyzed physical design layouts constructed from the gpdk045 library geometries.
- Baseline Layout: The physical layout of the 5-2 compressor architecture, taken through the initial physical rule checks.
- Optimized Sized Layout: The layout with custom W/L transistor dimensions, aimed at minimizing parasitic resistance and capacitance to improve the area and power footprint.
Graphical evidence of the successful output evaluations covering logic responses and full voltage swings.
- Baseline Functional Waveform: Output waveforms plotted against the input transitions across the full set of input combinations.
- Sized Layout Functional Waveform: Confirms full-swing outputs after the sizing optimizations, without major signal distortion or severe delay skew.
The 5-2 compressor architecture was successfully designed and simulated using Cadence Virtuoso in a 45 nm CMOS process. The design demonstrated correct functional behavior, satisfying the arithmetic relationship: X1 + X2 + X3 + X4 + X5 + Cin1 + Cin2 = S + 2(Carry + Cout1 + Cout2)
Simulation results verified that the proposed circuit produces accurate sum and carry outputs for all possible input combinations. The circuit achieved low power consumption, reduced propagation delay, and a low Power-Delay Product (PDP) compared to conventional adder-based compressors. This confirms that the chosen architecture is suitable for high-speed arithmetic and signal processing systems, particularly for partial product reduction in multipliers and MAC (Multiply-Accumulate) units operating at low supply voltages.
- Dedicated Cadence environment loaded with the `gpdk045` node logic libraries.
- A customized C-shell source mechanism corresponding to Virtuoso standard startup configurations.
To investigate the components directly via Cadence:
- Download or clone this project repository into an active workspace path.
- Open a terminal instantiated within the repository root directory.
- Configure the C-shell environment by running, in order: `csh`, `source /home/install/cshrc~`, and `virtuoso`.
- Access the Cadence Library Manager. The project comprises distinct libraries:
  - `ULVLP`: Contains the fundamental unsized subcircuit schematics (`fivee`, `sevenb`, `eightb`, `comp`).
  - `ULVLP_Sizing`: Contains the same elements after iterative physical sizing.
- Launch either the `Schematic` view to traverse network nodes or the `Layout` view to observe physical traces and execute DRC/LVS extractions.
- Start an ADE L / ADE Explorer instance, load any contained states, and run the transient simulations.
The calculation methods used in this implementation follow the principles demonstrated in the following recorded material:
- Power Evaluation Methodologies: Technical walkthroughs detailing integral-based calculation and direct net-node measurement.
- Propagation Delay Evaluation: Measuring the time difference between input and output crossings of the 50% threshold voltage level.
- Process, Voltage, Temperature (PVT) Analysis: Configurations targeting corner conditions.
- `README.md`: High-level and comprehensive technical outline (this document).
- `Note.txt`: Academic outline encompassing primary guidance and external instructional reference URLs pertaining to Cadence evaluation features.
- `Open.txt`: Rapid sequence execution script designed for Unix terminal usage to invoke the Virtuoso wrapper.
- `Calculations.csv`: A comma-separated list of Cadence evaluation expressions encompassing outputs tracking (`/cout1`, `/cout2`, `/carry`, `/sum`), node references, and the specific arithmetic operators deriving delay parameters from transient signal states.
- `Ultra_low-voltage_low-power_CMOS_4-2_and_5-2_compressors_for_fast_arithmetic_circuits.pdf`: Source literature and core reference paper validating truth tables, node requirements, and overall network layouts.
C.-H. Chang, J. Gu, and M. Zhang, "Ultra Low-Voltage Low-Power CMOS 4-2 and 5-2 Compressors for Fast Arithmetic Circuits," IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 51, No. 10, pp. 1985-1997, Oct. 2004.
- Chandan Sai Pavan Padala
- D Rushikesh
- KSVS Sobhita
- Chebrolu Rishita