🤝 Contributing to FormalVerifML

Thank you for your interest in contributing to FormalVerifML! This document provides comprehensive guidelines for contributing to the project.

📋 Table of Contents

Getting Started
Development Setup
Code Standards
Testing Guidelines
Pull Request Process
Issue Reporting
Documentation
Community Guidelines

🚀 Getting Started

Prerequisites

Before contributing, ensure you have:

Python 3.9+ with pip
Lean 4 with mathlib
Git and GitHub account
Docker (optional, for containerized development)
8GB+ RAM for large model verification

Quick Start

Fork the repository on GitHub

Clone your fork:

git clone https://github.com/your-username/formal_verif_ml.git
cd formal_verif_ml

Add upstream remote:

git remote add upstream https://github.com/fraware/formal_verif_ml.git

Create a feature branch:

git checkout -b feature/your-feature-name

🛠️ Development Setup

Environment Setup

# Create virtual environment
python3.9 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install development dependencies
pip install -r requirements-dev.txt

# Install pre-commit hooks
pre-commit install

# Build Lean project
lake build

# Run tests to verify setup
python -m pytest tests/

IDE Configuration

VS Code

Install Extensions:
- Python
- Lean 4
- GitLens
- Pylint
- Black Formatter

Workspace Settings (.vscode/settings.json):

{
  "python.defaultInterpreterPath": "./venv/bin/python",
  "python.linting.enabled": true,
  "python.linting.pylintEnabled": true,
  "python.formatting.provider": "black",
  "python.testing.pytestEnabled": true,
  "python.testing.pytestArgs": ["tests/"],
  "editor.formatOnSave": true,
  "editor.codeActionsOnSave": {
    "source.organizeImports": true
  }
}

PyCharm

Configure Python Interpreter: Point to venv/bin/python
Install Plugins: Python, Docker, Git, Lean 4
Configure Testing: Use pytest framework
Setup Code Style: Use Black formatter

Docker Development

# Build development image
docker build -f Dockerfile.dev -t formalverifml-dev .

# Run development container
docker run -it --rm \
  -v $(pwd):/app \
  -p 5000:5000 \
  -p 8000:8000 \
  formalverifml-dev

📏 Code Standards

Python Standards

Style Guide

Formatter: Black (line length: 88)
Linter: Pylint (score: 9.0+)
Type Hints: Required for all functions
Docstrings: Google style

Example Code

from typing import Dict, List, Optional
import logging

logger = logging.getLogger(__name__)


def process_model_data(
    model_path: str,
    config: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Process model data with configuration.

    Args:
        model_path: Path to the model file.
        config: Optional configuration dictionary.

    Returns:
        Dictionary containing processed model data.

    Raises:
        FileNotFoundError: If model file doesn't exist.
        ValueError: If model data is invalid.

    Example:
        >>> config = {"normalize": True, "batch_size": 32}
        >>> result = process_model_data("model.pth", config)
        >>> print(result["status"])
        'success'
    """
    if not os.path.exists(model_path):
        raise FileNotFoundError(f"Model file not found: {model_path}")

    logger.info(f"Processing model: {model_path}")

    # Implementation here
    return {"status": "success", "data": processed_data}

Error Handling

def safe_operation(data: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Perform operation with comprehensive error handling."""
    try:
        # Validate input
        if not isinstance(data, dict):
            raise ValueError("Data must be a dictionary")

        # Perform operation
        result = perform_operation(data)

        # Validate output
        if not result:
            raise RuntimeError("Operation returned empty result")

        return result

    except ValueError as e:
        logger.error(f"Validation error: {e}")
        return None
    except RuntimeError as e:
        logger.error(f"Runtime error: {e}")
        return None
    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        return None

Lean Standards

Style Guide

Naming: Use camelCase for definitions, snake_case for variables
Documentation: Use /-- for documentation comments
Structure: Group related definitions together

Example Code

/--
Process neural network layer with validation.
-/
def processLayer (layer : LayerType) (input : Array Float) : Array Float :=
  -- Validate input
  if input.size == 0 then
    #[]
  else
    -- Process layer
    match layer with
    | LayerType.linear w b => evalLinear w b input
    | LayerType.relu => evalActivation LayerType.relu input
    | _ => input

Proof Organization

theorem modelRobustness (model : NeuralNet) (ε δ : Float) :
  robust model ε δ := by
  -- Apply robustness definition
  unfold robust

  -- Use triangle inequality
  apply triangleInequality

  -- Apply model properties
  apply modelProperties

  -- Complete proof
  done

Git Standards

Commit Messages

Use Conventional Commits format:

# Format: <type>(<scope>): <description>

# Examples:
feat(translator): add support for vision transformers
fix(webapp): resolve memory leak in model processing
docs(readme): update installation instructions
test(verification): add property-based tests for robustness
refactor(core): simplify model evaluation logic

Branch Naming

# Feature branches
feature/vision-transformer-support
feature/memory-optimization

# Bug fix branches
bugfix/memory-leak-in-export
bugfix/lean-build-failure

# Hotfix branches
hotfix/critical-security-issue
hotfix/performance-regression

🧪 Testing Guidelines

Test Categories

1. Unit Tests

Coverage: 90%+ required
Location: tests/unit/
Framework: pytest

# tests/unit/test_model_export.py
import pytest
from translator.export_from_pytorch import extract_linear_layer

def test_extract_linear_layer():
    """Test linear layer extraction."""
    # Setup
    layer = create_mock_linear_layer()

    # Execute
    result = extract_linear_layer(layer)

    # Assert
    assert "weight" in result
    assert "bias" in result
    assert len(result["weight"]) > 0

2. Integration Tests

Purpose: Test component interactions
Location: tests/integration/
Framework: pytest with fixtures

# tests/integration/test_pipeline.py
def test_model_export_to_verification():
    """Test complete pipeline from export to verification."""
    # Export model
    model_json = export_pytorch_model("test_model.pth")

    # Generate Lean code
    lean_code = generate_lean_code(model_json)

    # Run verification
    result = run_verification(lean_code)

    # Assert results
    assert result.success
    assert len(result.properties) > 0

3. Performance Tests

Purpose: Ensure performance requirements
Location: tests/performance/
Framework: pytest-benchmark

# tests/performance/test_large_models.py
def test_large_model_verification(benchmark):
    """Benchmark large model verification."""
    model = create_large_model(1000000)  # 1M parameters

    def verify_model():
        return run_verification(model)

    result = benchmark(verify_model)

    # Assert performance requirements
    assert result.stats.mean < 60.0  # < 60 seconds
    assert result.stats.max < 120.0  # < 2 minutes

4. Property-Based Tests

Purpose: Test with random inputs
Location: tests/property/
Framework: hypothesis

# tests/property/test_robustness.py
from hypothesis import given, strategies as st

@given(st.lists(st.floats(min_value=-10, max_value=10), min_size=1, max_size=100))
def test_model_robustness(input_data):
    """Test model robustness with random inputs."""
    model = create_test_model()

    # Add small perturbation
    perturbed = [x + 0.01 for x in input_data]

    # Get predictions
    pred1 = model.predict(input_data)
    pred2 = model.predict(perturbed)

    # Assert robustness
    assert abs(pred1 - pred2) < 0.1

Running Tests

# Run all tests
python -m pytest tests/

# Run specific test categories
python -m pytest tests/unit/
python -m pytest tests/integration/
python -m pytest tests/performance/

# Run with coverage
python -m pytest tests/ --cov=translator --cov=webapp --cov-report=html

# Run performance tests
python -m pytest tests/performance/ --benchmark-only

# Run property tests
python -m pytest tests/property/ --hypothesis-profile=ci

🔄 Pull Request Process

Before Submitting

Ensure tests pass:

python -m pytest tests/
python -m pylint translator/ webapp/
python -m mypy translator/ webapp/

Update documentation:
- Update relevant docstrings
- Update user guide if needed
- Update API documentation

Check code quality:

black --check translator/ webapp/
isort --check-only translator/ webapp/
bandit -r translator/ webapp/

PR Guidelines

Title and Description

## Title

feat(translator): add support for vision transformers

## Description

This PR adds comprehensive support for vision transformer models including:

- ViT (Vision Transformer) model export
- Swin Transformer support
- Multi-modal transformer integration
- Vision-specific verification properties

## Changes

- Add `VisionTransformer` structure to Lean definitions
- Implement vision model export in PyTorch translator
- Add vision-specific verification properties
- Update documentation with vision model examples

## Testing

- [x] Unit tests for vision transformer export
- [x] Integration tests for end-to-end pipeline
- [x] Performance tests for large vision models
- [x] Property-based tests for vision robustness

## Breaking Changes

None

## Related Issues

Closes #123
Related to #456

Review Process

Self-review: Review your own code before requesting review
Request review: Assign appropriate reviewers
Address feedback: Respond to all review comments
Update PR: Push additional commits if needed
Merge: Only merge after approval and CI passes

Commit Guidelines

# Make atomic commits
git add translator/vision_export.py
git commit -m "feat(translator): add vision transformer export function"

git add tests/test_vision_export.py
git commit -m "test(translator): add tests for vision transformer export"

git add docs/vision_models.md
git commit -m "docs: add vision transformer documentation"

🐛 Issue Reporting

Bug Reports

## Bug Description

Brief description of the bug

## Steps to Reproduce

1. Load model from `example.json`
2. Run verification with robustness property
3. Observe memory error

## Expected Behavior

Verification should complete successfully

## Actual Behavior

Process crashes with "Out of memory" error

## Environment

- OS: Ubuntu 22.04
- Python: 3.9.12
- Lean: 4.0.0
- Memory: 16GB RAM

## Additional Information

- Model size: 50M parameters
- Error occurs only with large models
- Works fine with models < 10M parameters

Feature Requests

## Feature Description

Support for distributed verification across multiple nodes

## Use Case

Large models (>100M parameters) require significant computational resources

## Proposed Solution

Implement distributed verification with:

- Task sharding across nodes
- Load balancing
- Fault tolerance
- Result aggregation

## Alternatives Considered

- Cloud-based verification (rejected due to cost)
- Memory optimization only (insufficient for very large models)

## Implementation Plan

1. Design distributed architecture
2. Implement node communication
3. Add fault tolerance mechanisms
4. Create monitoring and logging
5. Update documentation

📚 Documentation

Documentation Standards

Python Docstrings

def verify_model_robustness(
    model: NeuralNet,
    epsilon: float,
    delta: float,
    num_samples: int = 1000
) -> RobustnessResult:
    """Verify model robustness against adversarial perturbations.

    This function performs formal verification of model robustness by checking
    that small perturbations to inputs don't cause large changes in outputs.

    Args:
        model: Neural network model to verify.
        epsilon: Maximum allowed input perturbation.
        delta: Maximum allowed output change.
        num_samples: Number of random samples to test.

    Returns:
        RobustnessResult containing verification status and statistics.

    Raises:
        ValueError: If epsilon or delta are negative.
        RuntimeError: If verification fails due to insufficient resources.

    Example:
        >>> model = load_model("model.json")
        >>> result = verify_model_robustness(model, epsilon=0.1, delta=0.05)
        >>> print(result.status)
        'verified'

    Note:
        This function uses formal verification techniques and may take
        significant time for large models.
    """

Lean Documentation

/--
Verify model robustness against adversarial perturbations.

This theorem establishes that a neural network model is robust
against input perturbations of magnitude at most ε, ensuring
output changes are bounded by δ.

## Mathematical Definition

For a model f: ℝⁿ → ℝᵐ, we say f is (ε, δ)-robust if:
∀x, x' ∈ ℝⁿ, ||x - x'||₂ ≤ ε → ||f(x) - f(x')||₂ ≤ δ

## Parameters

- **model**: Neural network to verify
- **epsilon**: Maximum input perturbation
- **delta**: Maximum output change

## Returns

Proof that the model satisfies the robustness property

## Usage

This theorem can be used to establish formal guarantees
about model behavior under adversarial attacks.
-/
theorem modelRobustness (model : NeuralNet) (ε δ : Float) :
  robust model ε δ := by
  -- Proof implementation
  sorry

Documentation Updates

When making changes:

Update docstrings for modified functions
Update user guide for new features
Update API documentation for new endpoints
Update examples to reflect changes
Update README if needed

👥 Community Guidelines

Code of Conduct

We are committed to providing a welcoming and inclusive environment for all contributors. Please:

Be respectful and considerate of others
Be collaborative and open to feedback
Be constructive in criticism and suggestions
Be inclusive and welcoming to new contributors

Communication

GitHub Issues: For bug reports and feature requests
GitHub Discussions: For questions and general discussion
Pull Requests: For code contributions
Discord: For real-time chat (link in README)

Recognition

Contributors are recognized through:

GitHub Contributors page
Release notes for significant contributions
Documentation acknowledgments
Community highlights in blog posts

🎯 Getting Help

Resources

User Guide: Getting started and usage
Developer Guide: Architecture and development
API Reference: Detailed API documentation
Examples: Code examples and tutorials

Support Channels

GitHub Issues: Create an issue
GitHub Discussions: Start a discussion
Discord: Join our community server
Email: Contact the maintainers directly

Mentorship

New contributors can:

Ask for help in GitHub Discussions
Request mentorship from experienced contributors
Start with good first issues labeled as such
Join community calls for guidance

Thank you for contributing to FormalVerifML! 🚀

Your contributions help make formal verification of machine learning models more accessible and reliable for everyone.

FilesExpand file tree

CONTRIBUTING.md

Latest commit

History

CONTRIBUTING.md

File metadata and controls

🤝 Contributing to FormalVerifML

📋 Table of Contents

🚀 Getting Started

Prerequisites

Quick Start

🛠️ Development Setup

Environment Setup

IDE Configuration

VS Code

PyCharm

Docker Development

📏 Code Standards

Python Standards

Style Guide

Example Code

Error Handling

Lean Standards

Style Guide

Example Code

Proof Organization

Git Standards

Commit Messages

Branch Naming

🧪 Testing Guidelines

Test Categories

1. Unit Tests

2. Integration Tests

3. Performance Tests

4. Property-Based Tests

Running Tests

🔄 Pull Request Process

Before Submitting

PR Guidelines

Title and Description

Review Process

Commit Guidelines

🐛 Issue Reporting

Bug Reports

Feature Requests

📚 Documentation

Documentation Standards

Python Docstrings

Lean Documentation

Documentation Updates

👥 Community Guidelines

Code of Conduct

Communication

Recognition

🎯 Getting Help

Resources

Support Channels

Mentorship