AI/ML Engineer with 2+ years of experience designing and deploying production-grade LLM systems, RAG pipelines, and multi-agent architectures serving thousands of concurrent users.
Specialized in LLMOps, MLOps, GPU optimization, and scalable microservices on AWS, Azure, and Kubernetes.
Delivered systems that cut design cycles by more than 90%, reduced GPU memory usage by 40%, and sustained high-throughput inference across distributed workloads.
- 🧠 LLM Systems & Agents — RAG pipelines, multi-agent orchestration, prompt engineering, guardrails (LangChain, LangGraph, LlamaIndex)
- ⚙️ MLOps & LLMOps — model deployment, inference optimization, CI/CD, observability, distributed tracing
- 🎨 Generative AI — LLM fine-tuning (LoRA, PEFT, QLoRA), Stable Diffusion, SDXL, diffusion pipelines
- 🚀 Scalable Infrastructure — FastAPI, WebSocket, Docker, Kubernetes, Redis, Celery, microservices architecture
- 🧮 GPU Optimization — xFormers, ONNX, TensorRT, quantization, mixed-precision inference, CUDA
- Zedny INC (Cairo, Egypt) — AI & DevOps Engineer architecting production AI platforms on microservices, delivering scalable LLM inference across distributed GPU clusters with 99.9% uptime, end-to-end MLOps/LLMOps pipelines, and Kubernetes-based autoscaling.
- VEEM Solutions (Saudi Arabia) – Developing a large multi-agent system with RAG, tools, templates, and scalable deployment; building the next evolution of brand intelligence (Shrwd.ai).
- Freelance — Delivered 10+ RAG systems, multi-agent architectures, and real-time voice assistants deployed on Azure, AWS, and SaladCloud.
- Built the DataOps LLM Engine — an LLM-powered data operations engine with a 7-layer security architecture (AST validation, sandboxed execution, audit logging) enabling natural-language interaction with Excel, CSV, and pandas DataFrames.
- Architected a large-scale multi-agent LLM system at Shrwd.ai with dynamic tool orchestration, improving answer relevance by 40% and reducing hallucinations by 50%.
- Fine-tuned Stable Diffusion with LoRA for NFT Wear AI, cutting design cycles from 4+ hours to under 5 minutes (a reduction of more than 95%).
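
To give a flavor of the AST validation layer mentioned for the DataOps LLM Engine, here is a minimal sketch of how LLM-generated pandas code could be screened before it reaches a sandbox. It is not the engine's actual implementation: the function name `validate_generated_code` and the deny-lists are hypothetical, chosen for illustration only, and a real deployment would pair a check like this with sandboxed execution and audit logging rather than rely on it alone.

```python
import ast

# Hypothetical policy for illustration; the engine's real rules are not listed here.
FORBIDDEN_NODES = (ast.Import, ast.ImportFrom, ast.Global, ast.Nonlocal)
FORBIDDEN_NAMES = {"eval", "exec", "open", "compile", "__import__", "getattr", "setattr"}


def validate_generated_code(source: str) -> list[str]:
    """Return a list of policy violations; an empty list means the code passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    violations = []
    for node in ast.walk(tree):
        if isinstance(node, FORBIDDEN_NODES):
            # Imports and scope manipulation are rejected outright.
            violations.append(f"line {node.lineno}: {type(node).__name__} is not allowed")
        elif isinstance(node, ast.Name) and node.id in FORBIDDEN_NAMES:
            # Dangerous builtins are rejected wherever the name appears.
            violations.append(f"line {node.lineno}: use of '{node.id}' is not allowed")
        elif isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            # Dunder attribute access is a common sandbox-escape route.
            violations.append(f"line {node.lineno}: dunder attribute '{node.attr}' is not allowed")
    return violations


if __name__ == "__main__":
    print(validate_generated_code("df['total'] = df['price'] * df['qty']"))
    print(validate_generated_code("import os\nos.system('ls')"))
```

A deny-list like this is easy to read but easy to outgrow; an allow-list of permitted node types is the stricter alternative when the generated code only needs a narrow slice of pandas.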
📥 Download CV (PDF)
🌐 GitHub Pages CV site
- Languages: Python, C/C++, Java, C#, Go, Bash, SQL
- LLMs & GenAI: LangChain, LangGraph, LlamaIndex, Hugging Face Transformers, Diffusers, LiteLLM, OpenAI, Anthropic, Google Gemini
- ML & DL: PyTorch, TensorFlow, scikit-learn, Stable Diffusion, SDXL, CLIP, DINOv2, SAM, faster-whisper
- Vector DBs & Retrieval: FAISS, Qdrant, Chroma, Pinecone, hybrid retrieval, re-ranking
- MLOps & LLMOps: CI/CD, model versioning, observability, xFormers, ONNX, TensorRT, quantization
- Cloud & Infra: AWS (EC2 g5.x GPU, S3, IAM), Azure (VMs, Container Apps), SaladCloud, Docker, Kubernetes, Helm, GitHub Actions
- Backend: FastAPI, WebSocket, REST/Async APIs, Redis, Celery, microservices, distributed systems

