Thank you for your interest in contributing to vLLM-lite! This guide will help you get started.
## Prerequisites

Before building from source, ensure you have:

- **Rust 1.75+** - install via rustup
- **CUDA 12.1+** - optional, for GPU support
- **CMake 3.18+** - for building some dependencies
## Quick Start

```bash
# Verify installation
rustc --version   # Should be 1.75 or higher
cargo --version

# Clone the repository
git clone https://github.com/pplmx/vllm-lite.git
cd vllm-lite

# Build
cargo build --workspace

# Run tests
cargo test --workspace

# Run the server
cargo run -p vllm-server
```
## Development Workflow

1. **Fork & Clone**
   - Fork the repository on GitHub
   - Clone your fork:

     ```bash
     git clone https://github.com/YOUR_USERNAME/vllm-lite.git
     ```

2. **Create a Branch**

   ```bash
   git checkout -b feature/your-feature-name
   # or
   git checkout -b fix/bug-description
   ```

3. **Make Changes**
   - Follow the coding standards below
   - Add tests for new features
   - Keep commits atomic and focused

4. **Test Your Changes**

   ```bash
   # Format check
   cargo fmt --all --check

   # Lint
   cargo clippy --workspace -- -D warnings

   # Run tests
   cargo test --workspace
   ```

5. **Submit a Pull Request**
   - Push to your fork
   - Open a PR against `main`
   - Fill out the PR template
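The workflow above asks for tests with every change. As a minimal sketch of the layout `cargo test --workspace` picks up (the `token_count` helper is hypothetical, not part of the vllm-lite codebase):

```rust
// A minimal unit-test sketch. `token_count` is a hypothetical helper;
// the point is the `#[cfg(test)]` module layout that `cargo test` runs.
fn token_count(prompt: &str) -> usize {
    prompt.split_whitespace().count()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn counts_whitespace_separated_tokens() {
        assert_eq!(token_count("hello world"), 2);
        assert_eq!(token_count(""), 0);
    }
}

fn main() {
    // Mirror of the test assertions so the file also runs standalone.
    assert_eq!(token_count("hello world"), 2);
}
```

Tests placed next to the code they cover run automatically in CI alongside the integration tests under `tests/`.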
## Coding Standards

- **Formatting**: Run `cargo fmt --all` before committing
- **Linting**: `cargo clippy --workspace -- -D warnings` must pass
- **Testing**: Add tests for new features; all tests must pass
- **Documentation**: Document public APIs with `///` doc comments
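As an illustration of the `///` doc-comment style (the function itself is hypothetical, not a vllm-lite API):

```rust
/// Returns the upper bound on tokens a batch may contain, given the
/// maximum number of sequences and the per-sequence length limit.
///
/// Note: this function is an illustrative example for doc-comment
/// style only; it is not part of the vllm-lite API.
pub fn max_batched_tokens(max_seqs: usize, max_seq_len: usize) -> usize {
    // Saturate rather than overflow on pathological inputs.
    max_seqs.saturating_mul(max_seq_len)
}

fn main() {
    assert_eq!(max_batched_tokens(8, 512), 4096);
}
```

Doc comments written this way show up in `cargo doc` output, so they are the primary reference for downstream users of the public API.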
## Commit Convention

```
<type>(<scope>): <subject>
```
| Type | Description |
|---|---|
| feat | New feature |
| fix | Bug fix |
| refactor | Code restructuring |
| test | Adding/updating tests |
| docs | Documentation |
| chore | Maintenance |
Example:

```
feat(scheduler): add decode-priority batching

- Prioritize decode sequences over prefill
- Add max_num_batched_tokens limit
- Fix chunked prefill tracking
```
## Project Structure

```
vllm-lite/
├── crates/
│   ├── traits/   # Interface definitions
│   ├── core/     # Engine, Scheduler, KV Cache
│   ├── model/    # Model implementations, kernels
│   ├── dist/     # Tensor Parallelism
│   └── server/   # HTTP API
├── tests/        # Integration tests
└── docs/         # Design documents
```
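The split between a `traits` crate and the implementation crates means consumers depend on interfaces, not concrete types. A purely illustrative sketch of that pattern (`Scheduler` and `FifoScheduler` are hypothetical names, not the actual vllm-lite API):

```rust
// Illustrative only: shows why interface definitions live in a
// separate crate. Implementations can be swapped without touching
// code that depends only on the trait.
use std::collections::VecDeque;

pub trait Scheduler {
    /// Return the next batch of sequence IDs to run.
    fn schedule(&mut self, max_batch: usize) -> Vec<u64>;
}

pub struct FifoScheduler {
    queue: VecDeque<u64>,
}

impl Scheduler for FifoScheduler {
    fn schedule(&mut self, max_batch: usize) -> Vec<u64> {
        // Pop up to `max_batch` sequences, oldest first.
        let n = max_batch.min(self.queue.len());
        self.queue.drain(..n).collect()
    }
}

fn main() {
    let mut s = FifoScheduler {
        queue: VecDeque::from(vec![1, 2, 3]),
    };
    assert_eq!(s.schedule(2), vec![1, 2]);
    assert_eq!(s.schedule(2), vec![3]);
}
```

With this layout, a crate like `server` can accept any `Box<dyn Scheduler>` without a compile-time dependency on `core`'s concrete scheduler.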
## Running Tests

```bash
# Run all tests
cargo test --workspace

# Run a specific crate's tests
cargo test -p vllm-core

# Run with output visible
cargo test --workspace -- --nocapture

# Run only fast tests (skip slow/ignored tests)
just nextest
```

## Getting Help

- **Issues**: Open a GitHub issue for bugs or feature requests
- **Discussions**: Use GitHub Discussions for questions
- **Documentation**: See README.md and the docs/ directory
## License

By contributing, you agree that your contributions will be licensed under the MIT License.