A production-ready Next.js + Python application deployed to AWS ECS Fargate with Terraform and GitHub Actions. The Next.js frontend provides cite-mode and answer-mode search interfaces, while the Python search service handles BM25 + vector retrieval with query expansion.
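As a rough illustration of how BM25 and vector results can be merged into one list, the sketch below uses reciprocal rank fusion (RRF). This is an assumption made for illustration: the source does not say which fusion method the search service uses, and the function name, the `k=60` constant, and the document IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) for every list it appears in,
    where rank starts at 1. This is NOT necessarily what the search
    service does; it is one common fusion technique.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by document ID for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    bm25_hits = ["doc-a", "doc-b", "doc-c"]    # hypothetical BM25 ranking
    vector_hits = ["doc-b", "doc-d", "doc-a"]  # hypothetical vector ranking
    print(reciprocal_rank_fusion([bm25_hits, vector_hits]))
```

Documents that rank well in both lists rise to the top, which is the usual motivation for combining lexical and vector retrieval.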
graph TB
User([User]) --> ALB
subgraph AWS Cloud
subgraph VPC
subgraph Public Subnets
ALB[Application Load Balancer]
NAT[NAT Gateway]
end
subgraph Private Subnets
subgraph ECSCluster[ECS Fargate Cluster]
NextJS[Next.js Service<br/>cite-mode · answer-mode]
Search[Search Service<br/>BM25 + vector retrieval]
end
end
ALB --> NextJS
NextJS -->|Service Discovery| Search
NextJS --> NAT
Search --> NAT
end
subgraph Supporting Services
ECR[ECR Repositories]
CW[CloudWatch Logs]
S3[(S3<br/>TF State · Documents · Eval)]
RDS[(RDS PostgreSQL<br/>Query Logs · Feedback)]
end
NextJS --> RDS
NextJS --> S3
Search --> S3
NextJS -.-> CW
Search -.-> CW
ECR -.-> NextJS
ECR -.-> Search
end
.
├── .github/workflows/
│   ├── deploy-qa.yml                  # QA deployment workflow
│   ├── deploy-production.yml          # Production deployment workflow
│   ├── pr-check.yml                   # Pull request validation
│   └── destroy.yml                    # Infrastructure teardown
├── docs/plans/                        # Design & implementation docs
├── evaluation/                        # Retrieval & synthesis eval framework
│   ├── run-answer-retrieval-eval.ts
│   ├── run-answer-synthesis-capture.ts
│   ├── run-answer-synthesis-llm-eval.ts
│   ├── run-cite-eval.ts
│   ├── calibrate-answer-thresholds.ts
│   ├── calibrate-cite-thresholds.ts
│   ├── diagnostics/                   # Eval diagnostic utilities
│   └── lib/                           # Shared eval helpers
├── search-service/                    # Python retrieval service
│   └── app/
│       ├── main.py                    # FastAPI entry (BM25 + vector)
│       ├── cache_system.py            # S3-backed caching
│       ├── config.py                  # Service configuration
│       ├── query_expansion.py         # Query expansion logic
│       └── routers/                   # API route handlers
├── src/
│   ├── app/                           # Next.js App Router
│   │   ├── api/                       # API routes
│   │   │   ├── answer/                # Answer-mode endpoints
│   │   │   ├── alignment/             # Alignment endpoints
│   │   │   ├── catalog/               # Catalog endpoints
│   │   │   ├── cite-mode-*/           # Cite-mode feedback & query logs
│   │   │   ├── answer-mode-*/         # Answer-mode feedback & query logs
│   │   │   ├── eval/                  # Evaluation endpoints
│   │   │   ├── health/                # Health check
│   │   │   ├── relates/               # Related questions
│   │   │   └── why/                   # Why endpoints
│   │   ├── components/                # React components
│   │   │   ├── AnswerMode/            # Answer-mode UI
│   │   │   ├── results/               # Results display
│   │   │   └── Footer/
│   │   ├── results/                   # Results page (cite-mode)
│   │   └── utils/                     # Client utilities
│   ├── config/                        # App configuration
│   ├── db/                            # TypeORM database layer
│   │   ├── entities/                  # DB entities (feedback, query logs)
│   │   ├── queries/                   # Query helpers
│   │   └── migrations/                # Database migrations
│   └── lib/                           # Server-side libraries
│       ├── llamacloud.ts              # LlamaCloud integration
│       ├── llamaindex-client.ts       # LlamaIndex client
│       ├── multi-query-strategy.ts    # Multi-query retrieval
│       ├── catalog-cache.ts           # Catalog caching
│       └── eval-storage.ts            # Eval data S3 storage
├── terraform/
│   ├── backend-setup/                 # Terraform state backend
│   ├── infrastructure/                # Main infrastructure (VPC, ECS, ALB, etc.)
│   └── environments/                  # Environment configs (qa, production)
├── Dockerfile                         # Next.js container
├── search-service/Dockerfile          # Search service container
└── package.json
- Node.js 20.x or later
- Python 3.12.x or later
- Docker
- Terraform 1.0+
- AWS CLI configured with appropriate credentials
- GitHub account
git clone <repository-url>
cd askwri-app
npm install

Before deploying infrastructure, you need to create the S3 bucket and DynamoDB table for Terraform state:
cd terraform/backend-setup
# Make the script executable
chmod +x setup.sh
# Run setup (uses default values)
./setup.sh
# Or customize with environment variables
AWS_REGION=us-east-2 PROJECT_NAME=askwri-app ./setup.sh

- Create a new GitHub repository
- Push this code to the repository
- Create the following branches:
  - main or production - Production deployments
  - qa - QA deployments
- Add GitHub variables for AWS permissions (Settings → Secrets and variables → Actions → Variables):
  - OIDC_ROLE - ARN from AWS console for role GitHubActionsOIDC
The AWS credentials need permissions for:
- ECR (create/push images)
- ECS (manage clusters, services, tasks)
- EC2 (VPC, subnets, security groups, NAT gateways)
- ELB (Application Load Balancers)
- IAM (create roles and policies)
- CloudWatch (logs)
- S3 (Terraform state)
- DynamoDB (Terraform locks)
Push to the appropriate branch to trigger deployment:
# Deploy to QA
git checkout -b qa
git push origin qa
# Deploy to Production
git checkout main
git push origin main

# Install dependencies
npm install
# Run development server
npm run dev
# Run tests
npm test
# Build for production
npm run build
# Start production server
npm start

# Build image
docker build -t askwri-app .
# Run container
docker run -p 3000:3000 askwri-app

- VPC CIDR: 10.0.0.0/16
- Resources (Next.js): 256 CPU / 512 MB Memory
- Resources (Python): 1024 CPU / 4096 MB Memory
- Desired count: 1 task
- Auto-scaling: 1-2 tasks (disabled)

- VPC CIDR: 10.1.0.0/16
- Resources (Next.js): 512 CPU / 1024 MB Memory
- Resources (Python): 1024 CPU / 4096 MB Memory
- Desired count: 1 task
- Auto-scaling: 1-10 tasks (disabled)
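
Fargate only accepts certain CPU and memory pairings, so it can help to check a pair before putting it into a tfvars file. The sketch below is a minimal check covering task sizes up to 4 vCPU; the table and function name are illustrative and not part of this repository, and the current AWS Fargate documentation remains the authority on valid sizes.

```python
# CPU units -> allowed memory values in MiB (Fargate task sizes up to 4 vCPU).
# Illustrative table; confirm against current AWS documentation before relying on it.
FARGATE_MEMORY_BY_CPU = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def is_valid_fargate_size(cpu, memory_mib):
    """Return True if the CPU/memory pair appears in the table above."""
    return memory_mib in FARGATE_MEMORY_BY_CPU.get(cpu, [])


if __name__ == "__main__":
    # The pairs listed in this README.
    for cpu, memory in [(256, 512), (512, 1024), (1024, 4096)]:
        print(cpu, memory, is_valid_fargate_size(cpu, memory))
```

All three pairs used in this README pass the check; a pair such as 256 CPU with 4096 MB would not.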
| Workflow | Trigger | Description |
|---|---|---|
| deploy-qa.yml | Push to qa branch | Deploy to QA environment |
| deploy-production.yml | Push to main/production | Deploy to Production |
| pr-check.yml | Pull requests | Run tests and validate |
| destroy.yml | Manual | Tear down infrastructure |
- Go to Actions → Destroy Infrastructure
- Select the environment (qa or production)
- Type DESTROY to confirm
- Run workflow
cd terraform/backend-setup
chmod +x teardown.sh
./teardown.sh

Notes:
- This assumes that documents.csv has already been generated and that the list of documents has been compiled.
- The KPs are stored in AWS S3 and shared by the QA and production environments, so both environments are affected (some parts may require service restarts).

- Update the /tmp/askWRI_docs directory with the new documents.csv and the new documents (this may also require removing some files).
- rm -rf /tmp/askWRI_cache/*
- Ensure the local search-service/.env file has the same contents as the AWS Parameter Store entry for search-service. It is also worth verifying that the root-level .env matches the contents of ASKWRI_APP_ENV.
- In the search-service directory:
  - pip install -r requirements.txt
  - python -m uvicorn app.main:app --host 0.0.0.0 --port 8000
  - This should rebuild the cache directory (/tmp/askWRI_cache).
  - Indexing time depends on the number and size of documents; see search-service/README.md for up-to-date details.
  - When finished, the Python code will output app.main - INFO - Background indexing complete
- Test the changes by running npm run dev from the root directory.
- Run the following aws s3 commands.
  - Note: this requires a properly configured AWS_PROFILE and a recent aws sso login.
  - Note: the sync commands do not remove files, so any file removals should be done separately, or delete everything first with aws s3 rm --recursive s3://askwri-data/documents/
  - aws s3 sync /tmp/askWRI_docs s3://askwri-data/documents/
  - aws s3 rm --recursive s3://askwri-data/cache/
  - aws s3 sync /tmp/askWRI_cache s3://askwri-data/cache/
- Restart both services (the search service and the app) so they pick up the new files from AWS S3, either by deploying or through the ECS service in the AWS Console.
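
The upload step above can be scripted so the commands always run in the same order. The sketch below only builds and prints the aws s3 commands listed in this section and defaults to a dry run. The function names, the `purge_documents` flag, and the dry-run behaviour are assumptions made for illustration, not part of this repository.

```python
import shlex
import subprocess

BUCKET = "askwri-data"  # bucket name taken from the commands in this README


def build_sync_commands(docs_dir="/tmp/askWRI_docs", cache_dir="/tmp/askWRI_cache",
                        bucket=BUCKET, purge_documents=False):
    """Return the aws s3 commands from this section as argv lists, in order."""
    commands = []
    if purge_documents:
        # Optional: sync does not delete remote files, so clear them first.
        commands.append(["aws", "s3", "rm", "--recursive", f"s3://{bucket}/documents/"])
    commands.append(["aws", "s3", "sync", docs_dir, f"s3://{bucket}/documents/"])
    commands.append(["aws", "s3", "rm", "--recursive", f"s3://{bucket}/cache/"])
    commands.append(["aws", "s3", "sync", cache_dir, f"s3://{bucket}/cache/"])
    return commands


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run(build_sync_commands())  # prints the commands without touching S3
```

Running with `dry_run=False` still requires a working AWS_PROFILE and a recent aws sso login, as noted above.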
- CloudWatch Logs (Next.js): /ecs/askwri-app-{environment}
- CloudWatch Logs (Search Service): /ecs/askwri-app-{environment}-search-service
- Container Insights: Enabled on ECS cluster
- Health Check (Next.js): GET /api/health
- Service Discovery: Internal DNS via {service}.askwri-app-{environment}.local
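
The naming patterns above are easy to mistype, so a small helper can expand them for a given environment. This is a minimal sketch: the function names are hypothetical, and the service name "search-service" in the example is an assumption, since the README only gives the {service} placeholder.

```python
def log_group(environment, search_service=False):
    """CloudWatch log group for the Next.js app or, optionally, the search service."""
    name = f"/ecs/askwri-app-{environment}"
    return f"{name}-search-service" if search_service else name


def discovery_host(service, environment):
    """Internal service-discovery DNS name for a service in an environment."""
    return f"{service}.askwri-app-{environment}.local"


if __name__ == "__main__":
    print(log_group("qa"))
    print(log_group("production", search_service=True))
    # "search-service" is an assumed service name, used only as an example.
    print(discovery_host("search-service", "qa"))
```

A resulting log group name can then be passed to a command such as `aws logs tail <log-group> --follow` to stream logs.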
- VPC with public/private subnet isolation
- NAT Gateways for private subnet internet access
- Security groups limiting traffic
- ECS managed tags propagated to ENIs and runtime resources
- S3 bucket versioning and encryption for Terraform state
- ECR image scanning on push
- Non-root container user
- HTTPS headers configured in Next.js
- Use FARGATE_SPOT for non-production workloads
- Auto-scaling based on CPU/Memory utilization
- ECR lifecycle policies to clean old images
- Consider reducing NAT Gateway count for non-production
- Update terraform/environments/{env}.tfvars:
app_environment_variables = {
  "MY_VAR" = "my-value"
}
- Redeploy
Environment secrets for the search service are stored in AWS Parameter Store and copied to GitHub secrets, so be sure to update both. The Parameter Store key is SEARCH_SERVICE_ENV, stored in JSON format. The GitHub secret uses the same key and is expected to be copied and pasted from Parameter Store.
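
Because the same values live in several places, it can be useful to diff a local .env file against the JSON stored under SEARCH_SERVICE_ENV. The sketch below assumes the JSON is a flat object of key/value pairs and that the .env file uses simple KEY=VALUE lines. Both are assumptions, and the function names are hypothetical.

```python
import json


def parse_dotenv(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def diff_env(dotenv_text, parameter_json):
    """Compare .env contents with a flat JSON object (assumed format)."""
    local = parse_dotenv(dotenv_text)
    remote = {key: str(value) for key, value in json.loads(parameter_json).items()}
    shared = local.keys() & remote.keys()
    return {
        "missing_locally": sorted(remote.keys() - local.keys()),
        "missing_remotely": sorted(local.keys() - remote.keys()),
        "different": sorted(key for key in shared if local[key] != remote[key]),
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(diff_env('A=1\nB="two"\n', '{"A": "1", "B": "2"}'))
```

An empty result in all three lists means the local file and the stored JSON agree.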
Edit terraform/environments/{env}.tfvars:
container_cpu = 512 # 0.5 vCPU
container_memory = 1024 # 1 GB
desired_count = 3

- Deployment fails at ECS service stability
  - Check CloudWatch logs
  - Verify health check endpoint returns 200
  - Check security group rules
- Terraform state lock error
  - Wait for other deployments to complete
  - If stuck, manually release lock in DynamoDB
- Docker build fails
  - Ensure all dependencies are in package.json
  - Check that .dockerignore does not exclude files the build needs
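
When a deployment stalls at ECS service stability, polling the health endpoint directly can show whether the app ever returns 200. The sketch below is a minimal poller for GET /api/health using only the standard library. The function names, retry counts, and base URL are assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=5):
    """Return the HTTP status of a GET request, or None if the connection fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # server answered with a non-2xx status
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, etc.


def wait_until_healthy(base_url, attempts=10, delay=3.0,
                       fetch=http_status, sleep=time.sleep):
    """Poll <base_url>/api/health; return the attempt number that got 200, else None."""
    url = base_url.rstrip("/") + "/api/health"
    for attempt in range(1, attempts + 1):
        if fetch(url) == 200:
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None


if __name__ == "__main__":
    # Hypothetical local URL; replace with the ALB address of the environment.
    print(wait_until_healthy("http://localhost:3000", attempts=3, delay=1.0))
```

The `fetch` and `sleep` parameters are injectable so the polling logic can be exercised without a running server.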
MIT