
feat(streaming): emit ReasoningDeltaEvent for reasoning/thinking deltas (#825) #3

Open
adityasingh2400 wants to merge 2 commits into main from feat/reasoning-delta-stream-event-825

@adityasingh2400 adityasingh2400 commented Apr 14, 2026

Summary

When models like o3 or DeepSeek-R1 produce reasoning/thinking tokens during streaming, those deltas currently surface only as RawResponsesStreamEvent wrappers around the low-level response.reasoning_summary_text.delta or response.reasoning_text.delta events. To consume them, callers have to inspect .data.type and cast the event themselves; the StreamEvent union offers no dedicated signal for reasoning.

This PR adds ReasoningDeltaEvent to StreamEvent and emits it alongside the existing raw event so reasoning deltas are as easy to consume as message deltas.
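For context, the manual filtering described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not code from the PR: `FakeRawData` and `FakeRawEvent` are hypothetical stand-ins for the low-level response event and its RawResponsesStreamEvent wrapper, and only the two event type strings come from the description.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Iterator

# Raw event type strings named in the PR description.
REASONING_DELTA_TYPES = {
    "response.reasoning_summary_text.delta",
    "response.reasoning_text.delta",
}


@dataclass
class FakeRawData:
    """Hypothetical stand-in for a low-level response stream event."""

    type: str
    delta: str = ""


@dataclass
class FakeRawEvent:
    """Hypothetical stand-in for the raw wrapper: the payload lives in .data."""

    data: Any


def reasoning_deltas(events: Iterable[Any]) -> Iterator[str]:
    """Yield reasoning fragments by inspecting each raw event's data.type."""
    for event in events:
        data = getattr(event, "data", None)
        if getattr(data, "type", None) in REASONING_DELTA_TYPES:
            yield data.delta


stream = [
    FakeRawEvent(FakeRawData("response.reasoning_summary_text.delta", "Let me ")),
    FakeRawEvent(FakeRawData("response.output_text.delta", "42")),
    FakeRawEvent(FakeRawData("response.reasoning_text.delta", "think.")),
]
print("".join(reasoning_deltas(stream)))
```

With a dedicated event type, this string matching on `.data.type` collapses to a single `isinstance` check.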

Closes openai#825

What changed

  • Added ReasoningDeltaEvent dataclass to stream_events.py with delta, snapshot, and type fields
  • Updated StreamEvent type alias to include ReasoningDeltaEvent
  • Exported ReasoningDeltaEvent from agents/__init__.py
  • In run_internal/run_loop.py, the run_single_turn_streamed loop now emits a ReasoningDeltaEvent after each ResponseReasoningSummaryTextDeltaEvent (o-series) and ResponseReasoningTextDeltaEvent (DeepSeek/LiteLLM)
  • The snapshot field accumulates the full reasoning text so far in the turn, so callers don't have to maintain their own buffer
  • Raw events are still emitted unchanged — fully backwards compatible
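The shape of the event and the snapshot accumulation listed above might look roughly like this. It is a sketch under assumptions: `ReasoningDeltaSketch` and `accumulate` are illustrative names rather than the identifiers in the PR, and giving `type` a default value is a guess about the dataclass definition.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Literal


@dataclass
class ReasoningDeltaSketch:
    """Illustrative stand-in for the event: one fragment plus the text so far."""

    delta: str
    snapshot: str
    type: Literal["reasoning_delta"] = "reasoning_delta"  # default is assumed


def accumulate(deltas: Iterable[str]) -> Iterator[ReasoningDeltaSketch]:
    """Turn raw text fragments into events carrying a running snapshot."""
    snapshot = ""
    for delta in deltas:
        snapshot += delta
        yield ReasoningDeltaSketch(delta=delta, snapshot=snapshot)


events = list(accumulate(["Hello", " ", "world"]))
print([e.snapshot for e in events])
```

Because each event carries the accumulated text, a consumer that only wants the final reasoning can keep the last event it saw and read its `snapshot`.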

Usage example

import asyncio

from agents import Agent, Runner
from agents.stream_events import ReasoningDeltaEvent


async def main() -> None:
    agent = Agent(name="thinker", model="o3-mini")
    result = Runner.run_streamed(agent, "prove P != NP")

    # Print each reasoning fragment as soon as it arrives.
    async for event in result.stream_events():
        if isinstance(event, ReasoningDeltaEvent):
            print(event.delta, end="", flush=True)

    print()  # reasoning complete


asyncio.run(main())

Tests

Added tests/test_reasoning_delta_stream_event.py covering:

  • ReasoningDeltaEvent is emitted for reasoning items
  • Snapshot grows monotonically and ends with full text
  • No event emitted for plain text responses
  • Raw events still emitted alongside
  • Importable directly from agents
  • Correct dataclass fields

Also updated tests/test_stream_events.py::test_complete_streaming_events to account for the new event in the event sequence (count goes from 27 → 28).

Summary by CodeRabbit

  • New Features

    • Added ReasoningDeltaEvent to stream incremental reasoning updates with delta and snapshot fields; emitted during agent streaming when reasoning is produced.
  • Tests

    • Added and updated tests to validate reasoning-delta emission, snapshot accumulation, and revised streaming event sequence expectations.

coderabbitai bot commented Apr 14, 2026

Walkthrough

A new ReasoningDeltaEvent stream event type was added, exported from the package, produced by the streaming run loop as reasoning-text deltas arrive (with per-delta delta and accumulated snapshot), and covered by new and updated tests.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Event Type Definition (src/agents/stream_events.py) | Add ReasoningDeltaEvent dataclass (delta: str, snapshot: str, type: "reasoning_delta") and include it in the StreamEvent union. |
| Package Exports (src/agents/__init__.py) | Import and add ReasoningDeltaEvent to __all__, exposing the new event at package top-level. |
| Streaming Loop Implementation (src/agents/run_internal/run_loop.py) | Detect ResponseReasoningTextDeltaEvent/ResponseReasoningSummaryTextDeltaEvent, maintain _reasoning_snapshot, and enqueue ReasoningDeltaEvent(delta=..., snapshot=...) for each reasoning delta; reset snapshot on ResponseCreatedEvent. |
| Test Coverage (tests/test_reasoning_delta_stream_event.py) | New async tests asserting emission, fields (delta, snapshot, type), monotonic snapshot accumulation, coexistence with other events, and top-level importability. |
| Stream Event Test Updates (tests/test_stream_events.py) | Adjust expected streamed-event sequence and counts (total events increased from 27 to 28) to account for the new ReasoningDeltaEvent and shifted indices. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant Runner
    participant Agent
    participant Model
    participant StreamLoop
    participant EventQueue

    Client->>Runner: run_streamed()
    Runner->>Agent: start execution (streaming)
    Agent->>Model: request reasoning + response
    Model-->>Agent: ResponseReasoningTextDeltaEvent / ResponseReasoningSummaryTextDeltaEvent
    Agent->>StreamLoop: deliver response events
    loop for each reasoning delta
        StreamLoop->>StreamLoop: append event.delta → _reasoning_snapshot
        StreamLoop->>EventQueue: enqueue ReasoningDeltaEvent(delta, snapshot)
    end
    EventQueue-->>Client: stream_events() yields ReasoningDeltaEvent and other events
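
The enqueue-and-yield flow in the diagram can be approximated with a plain asyncio.Queue. This is only a sketch: `RawSketch`, `ReasoningSketch`, `produce`, and `consume` are hypothetical names, the sentinel-based shutdown is an illustrative choice, and placing the raw event before the derived event is an assumption based on the description saying the new event is emitted after each raw reasoning delta.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class RawSketch:
    """Hypothetical stand-in for the raw event that is still emitted."""

    delta: str
    type: str = "raw_response_event"


@dataclass
class ReasoningSketch:
    """Hypothetical stand-in for the derived reasoning delta event."""

    delta: str
    snapshot: str
    type: str = "reasoning_delta"


_DONE = object()  # sentinel that tells the consumer the stream has ended


async def produce(queue: asyncio.Queue, deltas: list[str]) -> None:
    """Enqueue the raw event, then the derived event with the running snapshot."""
    snapshot = ""
    for delta in deltas:
        queue.put_nowait(RawSketch(delta))
        snapshot += delta
        queue.put_nowait(ReasoningSketch(delta, snapshot))
    queue.put_nowait(_DONE)


async def consume(queue: asyncio.Queue) -> list[object]:
    """Drain the queue until the sentinel arrives."""
    seen: list[object] = []
    while (item := await queue.get()) is not _DONE:
        seen.append(item)
    return seen


async def main() -> list[object]:
    queue: asyncio.Queue = asyncio.Queue()
    producer = asyncio.create_task(produce(queue, ["a", "b"]))
    seen = await consume(queue)
    await producer
    return seen


seen = asyncio.run(main())
print([event.type for event in seen])
```

Consumers that already handle raw events see the same sequence as before, with the derived events interleaved between them.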

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I nibble deltas, stitch thought by bit,
Snapshots grow gentle, piece after bit,
From model to queue my whiskers hop,
Reasoning stitched—no fragment will drop,
A rabbit applause for the streaming crop.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: adding a ReasoningDeltaEvent for streaming reasoning/thinking deltas, which is the primary objective across all modified files. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 90.00% which is sufficient. The required threshold is 80.00%. |


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/agents/run_internal/run_loop.py`:
- Around line 1113-1114: The _reasoning_snapshot field is never cleared across
stream retries, causing ReasoningDeltaEvent.snapshot to include duplicated text;
import ResponseCreatedEvent and, inside the event loop where stream events are
handled (the loop that processes events from stream_response_with_retry /
get_stream after rewind()), detect events of type ResponseCreatedEvent and reset
_reasoning_snapshot = "" when such an event is received to mark a fresh response
attempt; ensure you reference and update the existing _reasoning_snapshot
variable (not a new local) so subsequent ReasoningDeltaEvent handling produces a
clean snapshot for the new stream.

In `@tests/test_reasoning_delta_stream_event.py`:
- Around line 100-104: The test currently breaks out when a ReasoningDeltaEvent
is seen but does nothing if none are emitted; update the test to explicitly fail
when no ReasoningDeltaEvent is observed by either setting a flag (e.g.,
seen_reasoning = False) and asserting seen_reasoning is True after the async for
loop, or by using an else branch on the loop to raise an AssertionError;
reference the async iterator result.stream_events(), the ReasoningDeltaEvent
type check and the event.type assertion to locate where to add the post-loop
failure assertion.
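
The first finding is easier to see with a toy stream. The sketch below is not the code in run_loop.py: `Created` and `Delta` are hypothetical stand-ins for ResponseCreatedEvent and a reasoning text delta, and the replayed event list is an assumed shape for what a retried stream looks like.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Created:
    """Hypothetical stand-in for the event that marks a fresh response attempt."""


@dataclass
class Delta:
    """Hypothetical stand-in for a reasoning text delta event."""

    delta: str


def snapshots(events: Iterable[object], reset_on_created: bool) -> list[str]:
    """Return the snapshot recorded after each delta, optionally resetting per attempt."""
    snapshot = ""
    out: list[str] = []
    for event in events:
        if isinstance(event, Created) and reset_on_created:
            snapshot = ""  # a new attempt starts from an empty buffer
        elif isinstance(event, Delta):
            snapshot += event.delta
            out.append(snapshot)
    return out


# The first attempt dies after one delta; the retry replays from the start.
events = [Created(), Delta("Hello"), Created(), Delta("Hello"), Delta(" world")]
print(snapshots(events, reset_on_created=False)[-1])  # HelloHello world
print(snapshots(events, reset_on_created=True)[-1])  # Hello world
```

One consequence of resetting is that a snapshot can shrink across a retry boundary, so any monotonic-growth guarantee holds only within a single response attempt.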

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: da5ae3ef-04e2-4c5c-a0aa-b83bf9abdb96

📥 Commits

Reviewing files that changed from the base of the PR and between 5c9fb2c and 531b9ea.

📒 Files selected for processing (5)
  • src/agents/__init__.py
  • src/agents/run_internal/run_loop.py
  • src/agents/stream_events.py
  • tests/test_reasoning_delta_stream_event.py
  • tests/test_stream_events.py

adityasingh2400 pushed a commit that referenced this pull request Apr 14, 2026
- Import ResponseCreatedEvent and reset _reasoning_snapshot to "" when
  a ResponseCreatedEvent is received inside the retry stream loop, fixing
  the bug where snapshot text would be duplicated across retries
- In test_reasoning_delta_event_type_field: add found=False flag and
  assert found after the loop so the test properly fails when no
  ReasoningDeltaEvent is emitted
Ubuntu and others added 2 commits April 16, 2026 16:08
…as (openai#825)

Add a new ReasoningDeltaEvent to StreamEvent so callers can react to
reasoning/thinking tokens in real time without unpacking low-level raw
response events.

The event is emitted whenever a ResponseReasoningSummaryTextDeltaEvent
(o-series extended thinking via the Responses API) or a
ResponseReasoningTextDeltaEvent (third-party models like DeepSeek-R1
via LiteLLM) passes through the stream.  The underlying
RawResponsesStreamEvent is still emitted as well, so nothing breaks for
consumers that already inspect raw events.

Fields:
  delta    - the incremental text fragment from this chunk
  snapshot - full accumulated reasoning text so far in this turn
  type     - always 'reasoning_delta'

Closes openai#825
- Import ResponseCreatedEvent and reset _reasoning_snapshot to "" when
  a ResponseCreatedEvent is received inside the retry stream loop, fixing
  the bug where snapshot text would be duplicated across retries
- In test_reasoning_delta_event_type_field: add found=False flag and
  assert found after the loop so the test properly fails when no
  ReasoningDeltaEvent is emitted
@adityasingh2400 adityasingh2400 force-pushed the feat/reasoning-delta-stream-event-825 branch from 01d8b3d to 996be13 Compare April 16, 2026 16:09
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_reasoning_delta_stream_event.py`:
- Around line 59-70: The test currently allows a vacuous pass when no
ReasoningDeltaEvent snapshots are emitted; update the test around the snapshots
collection from result.stream_events() to require at least one snapshot before
performing length-order and content checks: after collecting snapshots (variable
snapshots) add an assertion that snapshots is not empty (e.g., assert snapshots,
"no reasoning snapshots emitted") so the subsequent loop and final check that
"Hello world" appears in snapshots[-1] will fail if no ReasoningDeltaEvent
objects were produced.
- Around line 82-85: The test currently only asserts that no individual event is
a ReasoningDeltaEvent but doesn't ensure the stream produced any events; update
the test that uses result.stream_events() to also verify the stream yielded at
least one event (e.g., accumulate events or increment a counter while iterating)
and assert the collected events list length (or counter) is greater than zero,
while still asserting none of the yielded events are instances of
ReasoningDeltaEvent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: b02a6b46-9175-4392-97c4-428c546aef1f

📥 Commits

Reviewing files that changed from the base of the PR and between 531b9ea and 996be13.

📒 Files selected for processing (5)
  • src/agents/__init__.py
  • src/agents/run_internal/run_loop.py
  • src/agents/stream_events.py
  • tests/test_reasoning_delta_stream_event.py
  • tests/test_stream_events.py
✅ Files skipped from review due to trivial changes (1)
  • src/agents/run_internal/run_loop.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/agents/__init__.py

Comment on lines +59 to +70
    snapshots: list[str] = []
    async for event in result.stream_events():
        if isinstance(event, ReasoningDeltaEvent):
            snapshots.append(event.snapshot)

    # Each snapshot must be at least as long as the previous one
    for i in range(1, len(snapshots)):
        assert len(snapshots[i]) >= len(snapshots[i - 1])

    # Last snapshot must contain the full reasoning text
    if snapshots:
        assert "Hello world" in snapshots[-1]

⚠️ Potential issue | 🟡 Minor

Avoid vacuous pass when no reasoning snapshots are emitted.

Line 69 currently guards the final assertion with if snapshots, so the test passes even when zero ReasoningDeltaEvent objects are produced.

✅ Minimal hardening diff
     snapshots: list[str] = []
     async for event in result.stream_events():
         if isinstance(event, ReasoningDeltaEvent):
             snapshots.append(event.snapshot)

+    assert snapshots, "Expected at least one ReasoningDeltaEvent snapshot"
+
     # Each snapshot must be at least as long as the previous one
     for i in range(1, len(snapshots)):
         assert len(snapshots[i]) >= len(snapshots[i - 1])

     # Last snapshot must contain the full reasoning text
-    if snapshots:
-        assert "Hello world" in snapshots[-1]
+    assert "Hello world" in snapshots[-1]

Comment on lines +82 to +85
    async for event in result.stream_events():
        assert not isinstance(event, ReasoningDeltaEvent), (
            "Got unexpected ReasoningDeltaEvent for a plain text response"
        )

⚠️ Potential issue | 🟡 Minor

Assert that the stream produced events in the negative-case test.

Lines 82–85 verify the event type but not stream liveness. A fully empty stream would incorrectly pass this test.

✅ Minimal hardening diff
-    async for event in result.stream_events():
+    saw_event = False
+    async for event in result.stream_events():
+        saw_event = True
         assert not isinstance(event, ReasoningDeltaEvent), (
             "Got unexpected ReasoningDeltaEvent for a plain text response"
         )
+    assert saw_event, "Expected at least one streamed event"

Development

Successfully merging this pull request may close these issues.

Yield reasoning delta in the response or add hooks to handle

1 participant