Add: Eliminating mathematical hallucinations with deterministic tool use#2599
Add: Eliminating mathematical hallucinations with deterministic tool use#2599michaelwinczuk wants to merge 1 commit intoopenai:mainfrom
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 760ea983e3
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| " tools=[math_tool],\n", | ||
| " tool_choice=\"auto\",\n", | ||
| " temperature=0,\n", |
There was a problem hiding this comment.
Force math tool invocation in deterministic path
The “WITH SYMPY TOOL (deterministic computation)” call uses tool_choice="auto", which still allows the model to skip the tool and answer from token prediction (your own else branch already handles [NO TOOL USED]). That means this path cannot guarantee deterministic/no-hallucination behavior and can regress to incorrect answers for some prompts; use a required/specific function tool choice for this section.
Useful? React with 👍 / 👎.
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "# Eliminating Mathematical Hallucinations with Deterministic Tool Use\n", |
There was a problem hiding this comment.
Add registry entry for the new notebook
This commit introduces a new cookbook notebook but does not update registry.yaml, so the publication pipeline will not index/render this page on cookbook.openai.com. Per this repo’s metadata workflow, new content must be added to the registry in the same change to avoid shipping an effectively hidden example.
Useful? React with 👍 / 👎.
Summary
A practical notebook showing how to eliminate LLM mathematical hallucinations by routing computation to SymPy via tool use.
What it covers
compute_mathtool using SymPyWhy it belongs in the cookbook
Developing_hallucination_guardrails.ipynbShort, practical, copy-pasteable. Links to Math Swarm for the full 1,079-test, 12-category implementation.