# Agentic RAG Demo System

"Try-before-you-build" Retrieval-Augmented Generation demo for showcasing grounded chat flows. It combines a LangChain + FastAPI backend, Qdrant vector search, and a Next.js frontend.

## Why this repo exists

- Interview/demo ready: launch a full LangChain + FastAPI + Next.js stack in minutes to prove you can ship end-to-end agentic workflows.
- Reference implementation: show how ingestion, retrieval, agent tooling, and a deterministic calculator can live behind one API.
- Jump-off point: the docs call out what's intentionally simple (auth, observability, guardrails) so contributors can extend it responsibly.

## What you get out of the box

- FastAPI backend with LangChain agents, ingestion pipeline, and loan calculator API.
- Qdrant vector search for grounded retrieval.
- Next.js 15 frontend featuring chat, ingestion console, and amortization tool.

If you've never installed a Python or Node project before, follow the steps below exactly: every command is provided.

## Repository Layout

```text
backend/         FastAPI app, LangChain services, ingestion scripts, tests
frontend/        Next.js App Router UI (chat, ingestion, calculator)
infrastructure/  Docker Compose + deployment helpers
data/            Local document sources + embeddings cache (gitignored)
docs/            Architecture notes, chat exports, interview prep material
scripts/         Convenience scripts for bootstrapping on macOS/Linux/Windows
```

## Quick Start

### Fast path

```bash
# macOS/Linux
./scripts/bootstrap.sh
make dev
```

```powershell
# Windows PowerShell
pwsh -File scripts/bootstrap.ps1
docker compose -f infrastructure/docker-compose.yml up --build
```

The bootstrap script installs both dependency stacks and copies `.env` templates so you only have to fill in secrets once. `make dev` (or the `docker compose` command) brings up Qdrant, the FastAPI backend, and the Next.js frontend in a single process.

### Manual setup (if you prefer step-by-step)

1. Clone and pick a shell

   ```bash
   git clone <repo-url>
   cd RAG-System
   ```
2. Install backend dependencies

   ```bash
   cd backend
   uv sync              # installs everything declared in pyproject
   cp .env.example .env # then add your keys (see below)
   ```

   > No `uv` yet? Install it with `pip install uv` or download it from https://github.com/astral-sh/uv.

3. Install frontend dependencies

   ```bash
   cd ../frontend
   npm install
   cp .env.example .env.local # create if it doesn't exist
   ```

   Set `NEXT_PUBLIC_API_BASE_URL=http://localhost:8000` inside `.env.local` for local dev.

4. Start the stack (two terminals if running outside Docker Compose)

   - Backend/Qdrant

     ```bash
     cd backend
     docker compose -f ../infrastructure/docker-compose.yml up -d qdrant
     uv run uvicorn app.main:app --reload
     ```

   - Frontend

     ```bash
     cd frontend
     npm run dev
     ```

   Visit http://localhost:3000 for the UI and http://localhost:8000/docs for the FastAPI Swagger docs.

5. Chat + ingest

   - Navigate to Chat to ask policy questions; responses include citations like `[S1]`.
   - Go to Ingest to point at any folder under `../data/sources` and stream chunk/embedding counts.
   - Open Loan Calculator to generate deterministic repayment numbers during demos.

## Environment Variables

Create `backend/.env` using the template provided:

```env
OPENAI_API_KEY=sk-...
QDRANT_URL=http://localhost:6333
QDRANT_API_KEY= # leave blank for local docker
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
FRONTEND_ORIGINS=http://localhost:3000
```

The frontend needs `NEXT_PUBLIC_API_BASE_URL`, optionally pointing to a deployed backend.

### Bring your own data + keys

- No documents are shipped with the repo; drop redacted PDFs/HTML under `data/sources/*` before ingesting.
- Secrets live in `.env` / `.env.local`; never commit them.
- The stack intentionally skips auth/rate-limiting/secrets management. Add a gateway (e.g., API keys, NextAuth, Clerk) before exposing it beyond trusted users.

## Manual Workflow (if Make is unavailable)

```bash
# 1. Start Qdrant (docker compose or managed instance)
docker compose -f infrastructure/docker-compose.yml up -d qdrant
```
```bash
# 2. Run FastAPI with hot reload
cd backend
uv run uvicorn app.main:app --reload

# 3. Run Next.js dev server
cd ../frontend
npm run dev
```

`make dev` wraps those commands if GNU Make is available.

## Production caveats

- Security & auth: demo-only; no authentication, authorization, or rate limiting is provided. Wrap the FastAPI/Next.js surfaces in your own identity solution before real users touch it.
- Guardrails: responses are unfiltered beyond the RAG context. Add Pydantic/Guardrails or LangChain validation before using this in regulated flows.
- Observability: logs are local; instrument LangSmith, Prometheus, OpenTelemetry, etc., if you need traces/metrics.
- Costs/quotas: you own the OpenAI/Azure usage; mimic the `.env.example` format for whichever model/key you prefer.

## Share your local demo (free tunnels)

You can keep everything running on your own machine and still hand out a public URL using free tunneling services:

1. Expose the backend with ngrok

   ```bash
   ngrok http http://localhost:8000
   ```

   Copy the `https://*.ngrok-free.app` URL that ngrok prints and update `frontend/.env.local` so `NEXT_PUBLIC_API_BASE_URL` points to that URL. If the FastAPI server is already running, it will start honoring remote calls immediately.

2. Expose the frontend with Localtonet

   ```powershell
   localtonet http --port 3000
   ```

   (Use the CLI from https://localtonet.com/; a free account gives you one HTTP tunnel.) Share the generated link with viewers.

3. Allow cross-origin requests

   In `backend/.env`, append your Localtonet hostname to `FRONTEND_ORIGINS`, for example:

   ```env
   FRONTEND_ORIGINS=http://localhost:3000,https://glowing-owl.localtonet.com
   ```

With those three tweaks you get a fully remote-friendly demo at zero cost: the frontend is reachable via Localtonet, every button still talks to the backend through the ngrok URL, and FastAPI's CORS settings allow both domains.

## Roadmap / good first issues

1. LangGraph agent planner – replace the single-step LangChain chain with a graph-based retrieval/decision loop.
2. Guardrails & evals – wire in Pydantic/Guardrails validation plus RAGAS-style evaluation notebooks.
3. Telemetry & analytics – stream LangChain callbacks to LangSmith or Prometheus and plot ingestion/chat metrics.
4. Auth & rate limits – add API key middleware on FastAPI and NextAuth/Clerk on the frontend.
5. Frontend tests – extend Vitest beyond the smoke test (ChatPanel, DocumentUpload, LoanCalculator).
6. Ingestion observability – persist chunk/file counts and timestamps so the UI can show ingestion history.

## Ingestion CLI

You can ingest documents straight from the terminal; the UI calls the same pipeline.

```bash
cd backend

# Default: uses the data/sources directory
uv run python -m app.ingestion.loader

# Target a specific folder
uv run python -m app.ingestion.loader --path ../data/sources/demo-docs
```

The job logs how many files were processed, how many were skipped, and how many chunks were written to Qdrant.

## Testing & Linting

- Backend tests: `cd backend && uv run pytest`
- Frontend tests: `cd frontend && npm run test`
- Frontend lint: `cd frontend && npm run lint`

## Troubleshooting

- "Module not found" in frontend → ensure `npm install` ran inside the `frontend` folder and restart `npm run dev`.
- FastAPI cannot reach Qdrant → confirm the Docker container is running: `docker ps | grep qdrant`. Update `QDRANT_URL` if using a remote instance.
- Permission errors on ingestion path (Windows) → use absolute paths (e.g., `E:\AI\rag-demo\...`) or keep documents under the provided `data/sources` hierarchy.
- Citations missing file names → rerun ingestion after updating PDFs/HTML so the new metadata is stored in Qdrant.

## Additional Reading

- `docs/architecture.md` – system diagram, component responsibilities, future roadmap.

Bring your own OpenAI (or Azure OpenAI) key, drop approved documents under `data/sources`, and you can demo a full agentic workflow (chat, ingestion telemetry, and calculator) within a few minutes.
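For reference, the "deterministic repayment numbers" the Loan Calculator produces follow the standard fixed-rate annuity formula. The sketch below is illustrative only; the function name and rounding are assumptions, not the repo's actual `app` code:

```python
def monthly_payment(principal: float, annual_rate: float, years: int) -> float:
    """Fixed monthly payment that fully amortizes a loan (annuity formula)."""
    n = years * 12        # total number of monthly payments
    r = annual_rate / 12  # periodic (monthly) interest rate
    if r == 0:
        return principal / n  # zero-interest loans amortize linearly
    return principal * r / (1 - (1 + r) ** -n)

# Example: a $250,000 loan at 6% APR over 30 years
print(round(monthly_payment(250_000, 0.06, 30), 2))  # ~1498.88
```

Because this is pure arithmetic with no LLM in the loop, the calculator gives repeatable numbers every time, which is exactly what you want mid-demo.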
If this project helps you, you can support my work here: https://buymeacoffee.com/chadpkeith