Skip to content

chandansaipavanpadala/UltraLowVoltageLowPowerCompressor-CadenceVirtuoso

Repository files navigation

Ultra Low-Voltage Low-Power CMOS 5-2 Compressor

Project Overview

This project implements an ultra low-voltage, low-power CMOS 5-2 compressor architecture based on Figure 13 from the IEEE research paper, "Ultra Low-Voltage Low-Power CMOS 4-2 and 5-2 Compressors for Fast Arithmetic Circuits" (T. K. B. Rao, D. Radhakrishnan, and P. V. Rao, 2007). The design has been developed and comprehensively simulated using Cadence Virtuoso with the gpdk045 (45 nm CMOS) technology node.

The primary objective is the optimization of Power, Delay, and Area, ensuring high performance suitable for energy-efficient high-speed arithmetic circuits used in VLSI and Digital Signal Processing (DSP) systems. This repository contains both the un-sized implementation (ULVLP) as well as the transistor-sized optimized implementation (ULVLP_Sizing).


Theory and Working Principle

Theory

A 5-2 compressor is a combinational arithmetic circuit used to reduce multiple one-bit inputs of equal significance into fewer outputs while preserving their total binary sum. It is widely employed in high-speed arithmetic units, particularly in multipliers, accumulators, and digital signal processing (DSP) systems to speed up multi-operand addition.

In a 5-2 compressor, five input bits of equal weight (X1, X2, X3, X4, X5) and two carry inputs (Cin1 and Cin2) are combined to produce four outputs: a sum bit (S) and three carry bits (Carry, Cout1, Cout2).

The mathematical relationship between inputs and outputs is expressed as: X1 + X2 + X3 + X4 + X5 + Cin1 + Cin2 = S + 2(Carry + Cout1 + Cout2)

This equation ensures that the total binary weight of inputs and outputs remains the same. The sum output (S) represents the parity (odd/even count) of the inputs, while the carry outputs represent the higher-weight contributions from groups of input bits.

Working Principle

  1. Input Combination: The compressor receives seven one-bit inputs. These are logically combined through XOR and MUX networks.
  2. Sum Generation: The sum (S) is produced by performing XOR operations across all inputs: S = X1 ⊕ X2 ⊕ X3 ⊕ X4 ⊕ X5 ⊕ Cin1 ⊕ Cin2 This gives a logic '1' when the number of HIGH inputs is odd.
  3. Carry Generation: The carry outputs are obtained through majority logic functions or MUX-based selection:
    • Carry = Majority of (X1, X2, X3)
    • Cout1 = Majority of (X4, X5, Carry)
    • Cout2 = Majority of (Cout1, Cin1, Cin2) These majority functions generate the proper carry bits required to maintain binary equivalence.
  4. Binary Preservation: The preservation equation confirms that the total weighted sum of the inputs equals the total weighted sum of the outputs. Hence, the circuit performs a lossless compression of bits.

Significance

  • The 5-2 compressor acts as a core element in Wallace Tree and Dadda Tree multipliers, where multiple partial products must be added quickly.
  • By reducing the number of addition stages, it achieves lower propagation delay, reduced power, and higher computational speed.
  • When implemented using optimized CMOS XOR-XNOR and MUX cells, the compressor operates effectively even at low supply voltages (0.8 V - 1.2 V) with full-swing outputs and a low Power-Delay Product (PDP).

Tool and Technology Used

Parameter Description
Tool Used Cadence Virtuoso (Schematic Design, Transient Simulation, Layout & DRC/LVS)
Technology Node 45 nm CMOS process (Generic PDK / gpdk045)
Simulation Environment Cadence Analog Design Environment (ADE)
Supply Voltage (VDD) 1 V
Transistor Models nmos4 and pmos4 from 45 nm library
Transistor Sizing Scaled according to paper ratios
Simulation Type Transient analysis with all 128 input combinations verified
Expected Outputs Sum, Carry, Cout1, Cout2 corresponding to correct binary Sum outputs

Comprehensive Implementation Details

Submodule Architecture

The design modularity stems from recreating fundamental blocks presented as sub-figures in the referenced paper. Each block has its own functional testbed. The submodules built from scratch include:

  1. Figure 5e Module (fivee): Core functional partial block consisting of optimized transmission gates or multiplexer designs.
  2. Figure 7b Module (sevenb): Core logical submodule aiding in carry generation or summation.
  3. Figure 8b Module (eightb): Specific subcircuit optimized for power-delay reduction at the internal nodes.
  4. Main Compressor (comp): The integrated 5-2 architecture aggregating the above subcircuits to generate standard outputs: Sum, Carry, Cout1, and Cout2.

Analytical Methodologies Reference

The simulation utilizes Cadence ADE (Analog Design Environment) configured carefully with algorithmic expressions to estimate component metrics. Specific formulae coded into Cadence (referencing Calculations.csv parameters) assess:

  • Power: Integration of Current across Voltage (Average Power function: average(VT("/<node>"))). Output pins cout1, cout2, carry, and sum are individually characterized.
  • Delay: Computed by capturing the time differential between an input threshold transition (0.55V) and output corresponding thresholds.
  • Power Delay Product (PDP): Derived dynamically by multiplying the node average power and absolute delay (average(VT("/output")) * delay(...)).

Performance Metrics Comparison

The following parameters provide analytical validation for the evaluated signals with and without custom transistor sizing optimizations.

Parameters Value Without Sizing Value With Sizing
Signal W.R.T Cout1 Cout2 Carry Sum Cout1 Cout2 Carry Sum
Power Calculations 500.03 m 531.30 m 515.62 m 492.13 m 501.42 m 485.04 m 515.233 m 499.956 m
Delay Calculations 25.585 p 23.5241 p 25.852 p 29.9533 p 396.945 p 377.813 p 403.559 p 507.79 p
Power Delay Product 12.793 p 12.4984 p 13.33 p 14.741 p 199.03 p 183.256 p 207.927 p 253.873 p

Justification for the Inability to Perform DC and PVT Analysis

  1. Combinational and Dynamic Nature of the Circuit: The 5-2 compressor is a purely combinational logic circuit that functions based on dynamic switching of input signals. DC analysis assumes steady-state conditions, but since the compressor operates only during input transitions, it has no fixed operating point suitable for DC evaluation.
  2. Lack of Defined DC Biasing Points: In DC analysis, every node in the circuit must have a stable DC voltage or current path. The 5-2 compressor uses digital logic inputs (0 or VDD) without any constant biasing network, causing undefined or floating nodes that prevent DC convergence.
  3. Simulator Convergence Issues: Due to the absence of stable biasing and multiple dependent nodes, the simulator cannot establish an initial operating point, often leading to "no DC solution" or "convergence error" messages. This prevents the completion of DC analysis required for further characterization.
  4. Inapplicability of DC Analysis for Logic Behavior: The key parameters of a 5-2 compressor such as propagation delay, dynamic power, and switching behavior can only be analyzed using transient simulation, not DC analysis. Therefore, DC analysis is not meaningful for this type of circuit.
  5. PVT Analysis Depends on Successful DC Operating Point: Process, Voltage, and Temperature (PVT) variations require a valid DC operating point to initialize simulations for each corner condition. Since DC analysis could not be performed successfully, PVT analysis could not be carried out either.

Problems Faced and Solutions Applied

Problems Faced During Implementation

  1. Transistor Width Limitations: The 45 nm PDK imposed a minimum finger width of 120 nm, causing warnings when smaller widths (e.g., 10n, 25n) were used as per the paper's ratios.
  2. Threshold Voltage Drop in Pass Transistors: Initial XOR-MUX designs showed degraded output voltage levels due to threshold voltage drop in NMOS-only paths, leading to non-full-swing outputs.
  3. Complex Signal Routing and Symbol Management: Interconnecting multiple XOR and MUX symbols required careful node naming and wiring discipline, as missing or misconnected nodes resulted in simulation errors.
  4. Simulation Convergence Issues: At low supply voltages (0.8 V), transient simulations occasionally failed to converge or showed distorted waveforms due to high resistance or weak drive strengths.

Problems Resolved Using Solutions

  1. Transistor Width Correction: All transistor widths were scaled proportionally (minimum PMOS width = 1.2 μm, NMOS = 0.48 μm) to maintain correct drive ratios while satisfying the PDK minimum constraints.
  2. Full-Swing Restoration: The XOR-XNOR cell from Figure 5(e) was used along with static CMOS output buffers (Figure 7(b)) to ensure full-swing voltage levels and strong output drive.
  3. Hierarchical Design Methodology: Each sub-circuit (XOR, XNOR, MUX) was symbolized individually and tested before being used in the final schematic. This hierarchical approach eliminated node confusion and made debugging easier.
  4. Simulation Stability: Proper time-step control and power supply ramping were introduced in the ADE setup. This improved simulation convergence and produced clean transient responses at all voltage levels.

Visual Design and Results Reference

Below is a detailed walkthrough of the physical and schematic implementation of the compressor and its submodules, complemented by screenshots spanning calculation environments to final layouts.

1. Calculation and Analytical Environment

The Analog Design Environment (ADE) setups used to quantify Power, Delay, and Power-Delay Product (PDP).

  • Baseline Calculations Setup ADE Calc Setup Explanation: Depicts the baseline expression outputs and values for the transient nodes simulating the standard gpdk045 width/length ratios.

  • Sized Layout Calculations Setup ADE Calc Sized Setup Explanation: Shows the evaluated metric expressions for the custom-sized implementation under the ULVLP_Sizing workspace, demonstrating optimized output variations.

2. Module Implementations and Schematics

Visual depictions of the structural hierarchies and component designs.

  • Figure 5e Reference Block Figure 5e Submodule Explanation: Implementation snippet of the specific subcircuit corresponding to the figure 5e logical gate architecture in the reference paper.

  • Figure 7b Reference Block Figure 7b Submodule Explanation: Implementation corresponding to the figure 7b logic gate configurations.

  • Figure 8b Reference Block Figure 8b Submodule Explanation: Implementation snippet for the figure 8b logical architecture.

  • Complete 5-2 Compressor Block 5-2 Compressor Schematic Explanation: The comprehensive schematic aggregating internal modules to execute the full 5-2 compression with input combinations spanning to standard sum/carry lines.

3. Mask Layout Implementations

Representations of the DRC/LVS analyzed physical design layouts constructed from the gpdk045 library geometries.

  • Baseline Layout Compressor Layout Explanation: The physical layout generation of the 5-2 compressor architecture executing initial physical rule checks.

  • Optimized Sized Layout Compressor Sized Layout Explanation: The complex geometry manipulation in standard layouts targeting minimal resistance/capacitance via optimized W/L transistor dimensions to enhance area/power footprint.

4. Waveforms and Transient Responses

Graphical evidence of the successful output evaluations covering logic responses and full voltage swings.

  • Baseline Functional Waveform Transient Waveforms Explanation: Graph rendering outputs against logical transitions mapping across the total domain of input possibilities.

  • Sized Layout Functional Waveform Transient Waveforms Sized Explanation: Validating full-swing node capabilities post-sizing optimizations while preventing any major signal distortion or severe delay skew.


Conclusion

The 5-2 compressor architecture was successfully designed and simulated using Cadence Virtuoso in a 45 nm CMOS process. The design demonstrated correct functional behavior, satisfying the arithmetic relationship: X1 + X2 + X3 + X4 + X5 + Cin1 + Cin2 = S + 2(Carry + Cout1 + Cout2)

Simulation results verified that the proposed circuit produces accurate sum and carry outputs for all possible input combinations. The circuit achieved low power consumption, reduced propagation delay, and a low Power-Delay Product (PDP) compared to conventional adder-based compressors. This confirms that the chosen architecture is suitable for high-speed arithmetic and signal processing systems, particularly for partial product reduction in multipliers and MAC (Multiply-Accumulate) units operating at low supply voltages.


Setup and Execution Guide

Prerequisites

  1. Dedicated Cadence environment loaded with the gpdk045 node logic libraries.
  2. A customized C-shell source mechanism corresponding to Virtuoso standard startup configurations.

Standard Launch Procedures

To investigate the components directly via Cadence:

  1. Download or clone this project repository into an active workspace path.
  2. Open a terminal instantiated within the repository root directory.
  3. Configure the C-shell environment sequentially by running:
    csh
    source /home/install/cshrc~
    virtuoso
  4. Access the Cadence Library Manager. The project comprises distinct libraries:
    • ULVLP: Containing fundamental unsized subcircuit schemas (fivee, sevenb, eightb, comp).
    • ULVLP_Sizing: Corresponding identical elements subjected to iterative physical sizing.
  5. Launch either the Schematic view to traverse network nodes or the Layout view to observe physical traces and execute DRC/LVS extractions.
  6. Initiate ADEL / ADE Explorer instance, load any contained states, and trigger transient operations.

Educational Reference Material

Calculations relative to the implementation logic have been sourced from fundamental principles demonstrated in the following recorded material:

  • Power Evaluation Methodologies: Refer to technical walkthroughs detailing integral-based calculation and direct net-node measurement methodologies.
  • Propagation Delay Evaluation: Capturing absolute net differentials logic to 50% threshold voltage levels.
  • Process, Voltage, Temperature (PVT) Analysis configurations targeting edge conditions constraints.

Project Specifications File Index

  • README.md: High-level and comprehensive technical outline (this document).
  • Note.txt: Academic outline encompassing primary guidance and external instructional reference URLs pertaining to cadence evaluation features.
  • Open.txt: Rapid sequence execution script designed for Unix terminal usage to invoke the Virtuoso wrapper.
  • Calculations.csv: A comma-separated list of Cadence evaluation expressions encompassing outputs tracking (/cout1, /cout2, /carry, /sum), node references, and the specific arithmetic operators deriving delay parameters from transient signal states.
  • Ultra_low-voltage_low-power_CMOS_4-2_and_5-2_compressors_for_fast_arithmetic_circuits.pdf: Source literature and core reference paper validating truth tables, node requirements, and overall network layouts.

References

T. K. B. Rao, D. Radhakrishnan, and P. V. Rao, "Ultra Low-Voltage Low-Power CMOS 4-2 and 5-2 Compressors for Fast Arithmetic Circuits," IEEE Transactions on Circuits and Systems II: Express Briefs, Vol. 54, No. 5, pp. 412-416, May 2007.

Contributors

  • Chandan Sai Pavan Padala
  • D Rushikesh
  • KSVS Sobhita
  • Chebrolu Rishita

About

Designed and simulated ultra low-voltage, low-power CMOS 5:2 compressors in Cadence Virtuoso, based on Figure 13 from the referenced IEEE paper. Focused on optimizing power, delay, and area for efficient high-speed arithmetic circuits used in VLSI and DSP applications.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors