Add Dewey managed RAG pipeline example#2586
Conversation
f95685a to
1e64afa
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f95685a454
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "# Production Document Q&A with Dewey's Managed RAG Backend\n", |
There was a problem hiding this comment.
Register new notebook in registry.yaml
This commit adds examples/dewey_rag_pipeline.ipynb but does not add a matching registry.yaml entry, so the new content will not be discoverable/published on cookbook.openai.com even though the notebook exists in the repo. Please add a registry record for this path in the same change set to keep metadata in sync with content additions.
Useful? React with 👍 / 👎.
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "%pip install meetdewey openai --quiet" |
There was a problem hiding this comment.
Install requests in setup cell
The setup cell installs only meetdewey and openai, but the next import cell uses requests for PDF downloads; in a clean virtual environment this will raise ModuleNotFoundError before ingestion starts. Add requests to the %pip install line (or remove the dependency) so the notebook runs end-to-end from a fresh environment.
Useful? React with 👍 / 👎.
| "print(\"BYOK configured: Dewey will route generation through your OpenAI account.\")\n", | ||
| "print(\"Credit metering is bypassed. deep/exhaustive depths are unlocked.\")" |
There was a problem hiding this comment.
Avoid claiming BYOK is configured when no API call ran
This cell only shows the provider-key creation as comments, but then unconditionally prints that BYOK is configured and deep/exhaustive are unlocked. Users who skip actual key registration will get failures later while believing setup succeeded, so this should either execute/verify the registration or print instructional text that clearly states configuration is still pending.
Useful? React with 👍 / 👎.
1e64afa to
2db7628
Compare
Demonstrates building production document Q&A with Dewey's managed RAG backend alongside the OpenAI Python SDK. Covers: - Uploading PDFs to a Dewey collection - Hybrid BM25 + vector search with citation metadata - Section-aware retrieval (scan titles before loading chunks) - Streaming agentic research endpoint with source attribution - BYOK (bring your own OpenAI key) for cost transparency - RAG chat loop using Dewey retrieval + OpenAI generation
2db7628 to
785edfb
Compare
Summary
Adds a notebook demonstrating how to build production document Q&A using Dewey as a managed RAG backend alongside the OpenAI Python SDK.
Dewey handles the full ingestion pipeline (PDF conversion, section extraction, chunking, embedding) behind a single API, letting developers focus on the application layer rather than infrastructure assembly.
The notebook covers:
gpt-4o-minigenerationNotebook location
examples/dewey_rag_pipeline.ipynbDependencies
meetdewey— Dewey Python SDKopenai— OpenAI Python SDKrequests— for downloading ArXiv PDFs (stdlib-only alternative available)Both are installed via
%pip installat the top of the notebook.