Production-style multimodal retrieval system built with OpenCLIP + Qdrant + FastAPI + Streamlit.
Multimodal Retrieval Lab is an end-to-end image–text retrieval system supporting:
- Text → Image retrieval
- Image → Image similarity search
- Retrieval evaluation (Recall@K, MRR@K, nDCG@K)
- Latency profiling (p50 / p95 / mean)
- Fully containerized deployment (Docker Compose)
| Layer | Technology |
|---|---|
| Embeddings | OpenCLIP (ViT-B/32, OpenAI weights) |
| Vector DB | Qdrant (Cosine similarity, REST API) |
| Backend | FastAPI |
| Frontend | Streamlit |
| Deployment | Docker Compose |
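
The table above notes that the vector database ranks results by cosine similarity. As a rough illustration of what that ranking computes, here is a minimal brute-force sketch in plain Python. It is not how Qdrant works internally, and the names `l2_normalize`, `cosine_top_k`, and the toy index are hypothetical and do not come from this repository.

```python
import math
from typing import Dict, List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_top_k(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Return the top_k (id, cosine similarity) pairs, best match first."""
    q = l2_normalize(query)
    scored = []
    for item_id, vec in index.items():
        v = l2_normalize(vec)
        # For unit vectors, the dot product equals the cosine similarity.
        scored.append((item_id, sum(a * b for a, b in zip(q, v))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Toy 2-D vectors standing in for real image embeddings (illustrative only).
    toy_index = {"img_a": [1.0, 0.0], "img_b": [0.6, 0.8], "img_c": [0.0, 1.0]}
    print(cosine_top_k([1.0, 0.1], toy_index, top_k=2))
```

In the real system the query vector would come from the OpenCLIP text or image encoder, and the search itself would be delegated to Qdrant.
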
Evaluation was performed on the test split, with each query searched against the full image index.
| Metric | Value |
|---|---|
| Recall@1 | 0.294 |
| Recall@5 | 0.533 |
| Recall@10 | 0.631 |
| MRR@10 | 0.397 |
| nDCG@10 | 0.452 |
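
For readers unfamiliar with the metrics in the table, the sketch below shows one common way to compute Recall@K, MRR@K, and nDCG@K. It assumes exactly one relevant image per query, which is an assumption of this example and may differ from the evaluation code in the notebook. The function names `rank_of` and `evaluate` are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def rank_of(relevant_id: str, ranked_ids: Sequence[str], k: int) -> Optional[int]:
    """1-based rank of the relevant item within the top-k, or None if absent."""
    for position, item_id in enumerate(ranked_ids[:k], start=1):
        if item_id == relevant_id:
            return position
    return None


def evaluate(
    relevant: Sequence[str],
    rankings: Sequence[Sequence[str]],
    k: int,
) -> Dict[str, float]:
    """Average Recall@K, MRR@K and nDCG@K over all queries.

    Assumes a single relevant item per query, so the ideal DCG is 1
    and nDCG reduces to 1 / log2(rank + 1).
    """
    if not relevant or len(relevant) != len(rankings):
        raise ValueError("need one ranking per query and at least one query")
    hits = reciprocal_rank = dcg = 0.0
    for relevant_id, ranked_ids in zip(relevant, rankings):
        rank = rank_of(relevant_id, ranked_ids, k)
        if rank is None:
            continue  # a miss contributes 0 to every metric
        hits += 1.0
        reciprocal_rank += 1.0 / rank
        dcg += 1.0 / math.log2(rank + 1)
    n = len(relevant)
    return {
        f"recall@{k}": hits / n,
        f"mrr@{k}": reciprocal_rank / n,
        f"ndcg@{k}": dcg / n,
    }
```

Under the single-relevant-item assumption, MRR@K is never below Recall@1, and nDCG@K falls between MRR@K and Recall@K.
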
| Stage | p50 | p95 |
|---|---|---|
| Encode | ~9 ms | ~11 ms |
| Search | ~24 ms | ~39 ms |
| End-to-End | ~33 ms | ~48 ms |
The first request may take ~1–2 minutes while the OpenCLIP model loads and warms up.
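
The latency figures above are percentiles over repeated calls. The sketch below shows one way such numbers could be collected, with a few untimed warm-up calls so that one-off model loading does not skew the statistics. It uses a nearest-rank percentile. The helper names `percentile` and `profile` are hypothetical, and the notebook's actual benchmarking method may differ.

```python
import math
import statistics
import time
from typing import Callable, Dict, Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty sample (pct in 0..100)."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def profile(fn: Callable[[], object], n_warmup: int = 3, n_runs: int = 50) -> Dict[str, float]:
    """Time fn() n_runs times after n_warmup untimed calls; results in ms."""
    for _ in range(n_warmup):
        fn()  # warm-up calls are excluded from the statistics
    timings_ms = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "p50_ms": percentile(timings_ms, 50),
        "p95_ms": percentile(timings_ms, 95),
        "mean_ms": statistics.fmean(timings_ms),
    }
```

To profile a stage, pass a zero-argument callable that wraps it, for example a lambda that encodes a fixed query or issues a fixed search.
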
MultimodalNN/
│
├── src/
│   ├── embeddings/
│   ├── qdrant/
│   ├── search/
│   ├── eval/
│   ├── api/
│   └── ui/
│
├── notebooks/
│   └── 01_flickr8k_end2end.ipynb
│
├── docker/
│   ├── api/Dockerfile
│   └── ui/Dockerfile
│
├── docker-compose.yml
├── requirements.txt
├── requirements-dev.txt
└── README.md
docker compose up -d --build

| Service | URL |
|---|---|
| UI | http://localhost:8508 |
| API | http://localhost:8008/health |
| Qdrant Dashboard | http://localhost:6334/dashboard |
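
After starting the stack, it can be handy to wait until the API answers before sending queries. The sketch below polls the health URL from the table using only the standard library. The helper names `http_ok` and `wait_until_healthy`, the retry count, and the delay are assumptions of this example. Whether `/health` also reflects model readiness is not documented here, so treat a successful probe only as "the API process is up".

```python
import time
import urllib.error
import urllib.request
from typing import Callable

HEALTH_URL = "http://localhost:8008/health"


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET to url answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts and HTTP error statuses all land here.
        return False


def wait_until_healthy(
    url: str = HEALTH_URL,
    probe: Callable[[str], bool] = http_ok,
    retries: int = 30,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe url up to `retries` times, sleeping between failed attempts."""
    for attempt in range(retries):
        if probe(url):
            return True
        if attempt < retries - 1:
            sleep(delay_s)
    return False
```

Calling `wait_until_healthy()` with the defaults waits for roughly a minute of sleep time plus any probe timeouts before giving up. The `probe` and `sleep` parameters exist so the loop can be exercised without a running server.
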
GET /health
POST /search_text
{
  "query": "a dog running on the grass",
  "top_k": 5
}
POST /search_image
(form-data: image file)
GET /similar_image/{image_id}?top_k=6
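
As an illustration of calling the text search endpoint listed above, here is a minimal standard-library client sketch. The base URL comes from the service table and the request body mirrors the JSON example. The response schema is not documented in this README, so the sketch returns the decoded JSON as-is. The function names and the generous timeout are assumptions of this example.

```python
import json
import urllib.request

API_BASE = "http://localhost:8008"


def build_search_text_request(
    query: str,
    top_k: int = 5,
    base_url: str = API_BASE,
) -> urllib.request.Request:
    """Build (but do not send) a POST /search_text request with a JSON body."""
    payload = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/search_text",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search_text(
    query: str,
    top_k: int = 5,
    base_url: str = API_BASE,
    timeout: float = 180.0,
):
    """Send the request and return the decoded JSON response unchanged.

    The long default timeout is a guess that leaves room for a slow
    first request while the model loads.
    """
    request = build_search_text_request(query, top_k, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the stack running, `search_text("a dog running on the grass", top_k=5)` would send the example request shown above.
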
- Download the Flickr8k dataset
- Place the images in:
data/flickr8k/images/
- Run the notebook:
notebooks/01_flickr8k_end2end.ipynb
The notebook performs:
- Dataset preparation
- CLIP embedding generation
- Qdrant indexing
- Retrieval evaluation
- Latency benchmarking
- Artifact export
- Multimodal embeddings engineering
- Vector search architecture
- Retrieval evaluation methodology
- Modular ML system design
- Deployable ML stack
MIT License
Built for ML Engineering Portfolio • 2026