RAG Document Search

A lightweight Retrieval-Augmented Generation (RAG) application that enables semantic document search with LLM-powered answers. The system indexes documents into a FAISS vector store, retrieves the most relevant context for each query, and generates grounded responses via a language model, all wrapped in a Streamlit UI.

🚀 Features

  • 📚 Document ingestion and chunking
  • 🔎 Semantic search using vector embeddings
  • 🧠 Retrieval-Augmented Generation (RAG) pipeline
  • ⚡ Fast similarity search with FAISS
  • 🎛️ Interactive frontend built with Streamlit
  • 🧩 Modular and easy-to-extend architecture

🏗️ Architecture Overview

  1. Document Processing
    • Load and split documents into chunks
  2. Embeddings & Vector Store
    • Generate embeddings
    • Store vectors using FAISS
  3. Retrieval
    • Retrieve top-k relevant chunks for a query
  4. Generation
    • Pass retrieved context to an LLM
    • Generate grounded answers
  5. Frontend
    • Streamlit UI for querying documents
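The chunk → embed → retrieve steps above can be sketched in plain Python. This is a toy illustration, not the repository's code: the bag-of-words "embedding" stands in for a real embedding model, and the function names (`chunk_text`, `top_k`) are hypothetical.

```python
# Toy sketch of the document-processing and retrieval stages of a RAG
# pipeline. A real system would use model embeddings and a FAISS index;
# here a word-count vector and cosine similarity stand in for both.
from collections import Counter
import math

def chunk_text(text, chunk_size=50, overlap=10):
    """Split text into overlapping character chunks (step 1)."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query (step 3)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

docs = ["FAISS performs fast similarity search over vectors.",
        "Streamlit renders the interactive user interface.",
        "The LLM generates answers grounded in retrieved context."]
print(top_k("How is similarity search done?", docs, k=1))
```

In the real pipeline, the retrieved chunks would then be concatenated into the LLM prompt (step 4) so the generated answer is grounded in the indexed documents.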

🖥️ Tech Stack

  • Python
  • Streamlit – Frontend UI
  • FAISS – Vector store
  • LangChain / LangGraph
  • LLM Provider (OpenAI)
  • Embedding Models (OpenAI)

📦 Installation

```bash
git clone https://github.com/saadtariq-ds/rag-document-search.git
cd rag-document-search
```

Create a virtual environment (recommended)

```bash
uv init
uv venv
.venv\Scripts\activate        # Windows
source .venv/bin/activate     # macOS/Linux
```

Install dependencies

```bash
uv add -r requirements.txt
```

🔑 Environment Variables

Create a .env file and add your API keys if required:

```
OPENAI_API_KEY=your_api_key_here
```

(Adjust based on the LLM provider you are using.)
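Libraries such as python-dotenv typically read this file at startup. As a stdlib-only illustration of what that loading step does (the repository's actual code may differ, and `load_env` is a hypothetical name), a minimal loader might look like:

```python
import os

def load_env(path=".env"):
    """Read KEY=VALUE lines from a .env file into os.environ.

    Blank lines and '#' comments are skipped; variables already set in
    the shell take precedence over values from the file.
    """
    try:
        with open(path) as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, _, value = line.partition("=")
                    os.environ.setdefault(key.strip(), value.strip())
    except FileNotFoundError:
        pass  # no .env file: rely on the existing environment
```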

▶️ Run the Application

```bash
streamlit run app.py
```

Then open your browser at:

http://localhost:8501
