feat(lifecycle): add on_turn_start and on_turn_end hooks to RunHooksBase (#2671)#1
feat(lifecycle): add on_turn_start and on_turn_end hooks to RunHooksBase (#2671)#1adityasingh2400 wants to merge 3 commits intomainfrom
Conversation
…ase (openai#2671) Both RunHooksBase and AgentHooksBase get two new hook methods: - on_turn_start(context, agent, turn_number): fires before each LLM call - on_turn_end(context, agent, turn_number): fires after all tool calls for the turn complete (i.e. just before the next-step decision) Turn numbers are 1-indexed and increment each time through the agent loop, regardless of handoffs. The hooks are called in both the sync and streaming code paths. Agent-level hooks on agent.hooks are also called, matching the existing on_tool_start/on_tool_end pattern. Closes openai#2671
📝 WalkthroughWalkthroughAdded TurnControl type and two async lifecycle hooks—on_turn_start and on_turn_end—to RunHooksBase and AgentHooksBase; the run loop now concurrently invokes runner- and agent-level turn hooks at each agent turn and respects a hook-requested "stop" by halting the run. Changes
Sequence Diagram(s)sequenceDiagram
rect rgba(135,206,235,0.5)
participant Runner
end
rect rgba(144,238,144,0.5)
participant Agent
end
rect rgba(255,228,181,0.5)
participant Hooks
end
Runner->>Hooks: on_turn_start(context, agent, turn_number)
Runner->>Agent: agent.hooks.on_turn_start(context, agent, turn_number)
Note right of Runner: await asyncio.gather(...)
alt any hook returns "stop"
Hooks-->>Runner: "stop"
Agent-->>Runner: None/"stop"
Runner->>Runner: raise MaxTurnsExceeded / emit QueueCompleteSentinel
Runner-->>Agent: (halt — no LLM call for this turn)
else all continue
Hooks-->>Runner: None/"continue"
Agent-->>Runner: None/"continue"
Runner->>Agent: proceed with turn (LLM/tool calls)
Runner->>Hooks: on_turn_end(context, agent, turn_number)
Runner->>Agent: agent.hooks.on_turn_end(context, agent, turn_number)
Note right of Runner: await asyncio.gather(...)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_turn_lifecycle_hooks.py`:
- Around line 115-119: The tests define async hook methods on_llm_start and
on_llm_end that append literal strings using unnecessary f-strings
(call_order.append(f"llm_start") and call_order.append(f"llm_end")), which
triggers Ruff F541; remove the f-string prefixes and use plain string literals
(call_order.append("llm_start") and call_order.append("llm_end")) in the
on_llm_start and on_llm_end implementations to satisfy the linter.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: bdc7731b-45ac-4b18-afdd-c275d8cb7287
📒 Files selected for processing (4)
src/agents/lifecycle.pysrc/agents/run.pysrc/agents/run_internal/run_loop.pytests/test_turn_lifecycle_hooks.py
- Remove unnecessary f-string prefixes from on_llm_start/on_llm_end (Ruff F541) - Add missing docstrings to class methods to improve docstring coverage - Add docstring to OrderTrackingHooks inner class
ff99d90 to
05a4f15
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/agents/run.py`:
- Around line 1107-1116: The on_turn_end hook is being emitted prematurely when
run_single_turn yields NextStepInterruption; update the logic around the await
asyncio.gather call that invokes hooks.on_turn_end and
current_agent.hooks.on_turn_end so it skips calling those callbacks if the turn
was interrupted (i.e., detected NextStepInterruption returned from
run_single_turn) and only invokes on_turn_end when the resumed path completes a
non-interrupted step; apply the same conditional guard/change to the analogous
block in run_internal.run_loop (the section around run_loop lines handling
on_turn_end) so sync and streaming behavior match, and add a regression test
exercising interrupt -> resume -> complete to assert on_turn_end is emitted
exactly once at final completion.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 5e8532f9-f6ab-4e6f-a350-24009b7e485a
📒 Files selected for processing (4)
src/agents/lifecycle.pysrc/agents/run.pysrc/agents/run_internal/run_loop.pytests/test_turn_lifecycle_hooks.py
🚧 Files skipped from review as they are similar to previous changes (2)
- src/agents/run_internal/run_loop.py
- src/agents/lifecycle.py
| await asyncio.gather( | ||
| hooks.on_turn_end(context_wrapper, current_agent, current_turn), | ||
| ( | ||
| current_agent.hooks.on_turn_end( | ||
| context_wrapper, current_agent, current_turn | ||
| ) | ||
| if current_agent.hooks | ||
| else _coro.noop_coroutine() | ||
| ), | ||
| ) |
There was a problem hiding this comment.
Don't emit on_turn_end before an interrupted turn actually finishes.
When run_single_turn(...) yields NextStepInterruption, Line 1107 still fires on_turn_end(...) even though the turn is only paused for approval/resume. The resumed path above (Lines 648-760) then completes that same turn without re-emitting the hook, so interrupted runs observe on_turn_end too early and never at the real completion point.
Please skip this callback for NextStepInterruption and emit it only once the resumed turn reaches a non-interrupted step. The same fix should be mirrored in src/agents/run_internal/run_loop.py:923-931 to keep sync and streaming behavior aligned. A regression test for the interrupt/resume flow would also help here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/agents/run.py` around lines 1107 - 1116, The on_turn_end hook is being
emitted prematurely when run_single_turn yields NextStepInterruption; update the
logic around the await asyncio.gather call that invokes hooks.on_turn_end and
current_agent.hooks.on_turn_end so it skips calling those callbacks if the turn
was interrupted (i.e., detected NextStepInterruption returned from
run_single_turn) and only invokes on_turn_end when the resumed path completes a
non-interrupted step; apply the same conditional guard/change to the analogous
block in run_internal.run_loop (the section around run_loop lines handling
on_turn_end) so sync and streaming behavior match, and add a regression test
exercising interrupt -> resume -> complete to assert on_turn_end is emitted
exactly once at final completion.
Address seratch's review feedback on openai#2911: hooks that only observe cannot affect agent loop orchestration. This commit adds a TurnControl return type ('continue' | 'stop') so on_turn_start can now halt the run before the LLM is called for that turn. Changes: - lifecycle.py: on_turn_start now returns Union[TurnControl, None] (None and 'continue' are equivalent; 'stop' halts the loop) - run.py (non-streaming path): checks return value; raises MaxTurnsExceeded with descriptive message on 'stop' - run_internal/run_loop.py (streaming path): checks return value; signals QueueCompleteSentinel on 'stop' - __init__.py: exports TurnControl, RunHooksBase, AgentHooksBase - tests: 4 new test cases covering stop-on-turn-N, stop-on-turn-1, explicit 'continue', and agent-level stop The MaxTurnsExceeded raise on 'stop' keeps behaviour consistent with the existing max_turns limit: callers can catch and inspect .run_data if needed.
05a4f15 to
f1cafcd
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/agents/lifecycle.py`:
- Around line 115-118: The docstring for on_turn_start incorrectly claims
raising StopAgentRun will gracefully halt a run, but runners only handle the
hook's return value; remove the exception-based stopping text and only document
returning "stop" (or None/"continue") as the supported mechanism in the
on_turn_start documentation, and ensure the symbol names mentioned are
on_turn_start and StopAgentRun so future implementers know the discrepancy if
runners are later updated.
In `@src/agents/run_internal/run_loop.py`:
- Around line 935-944: The resumed-interrupted-turn path (where
resolve_interrupted_turn(...) completes a previously interrupted turn) never
emits the on_turn_end hooks, so add the same hook invocation used at the bottom
of the loop to the resume path: after resolve_interrupted_turn(...) finishes,
await asyncio.gather(hooks.on_turn_end(context_wrapper, current_agent,
current_turn), (current_agent.hooks.on_turn_end(context_wrapper, current_agent,
current_turn) if current_agent.hooks else _coro.noop_coroutine())) so resumed
runs fire both the global hooks.on_turn_end and the agent-specific
current_agent.hooks.on_turn_end; locate the resume branch around
resolve_interrupted_turn and mirror the existing invocation around on_turn_end
in run_loop.py.
- Around line 823-844: The on_turn_start stop branch currently terminates the
stream directly; instead, mirror the existing max-turns flow by setting
streamed_result._max_turns_handled = True, updating streamed_result.current_turn
and run_state._current_turn/_current_step as you already do, then raise the same
MaxTurnsExceeded exception (the one used by the max-turns handler) so the outer
max-turns handling path processes the stop and produces the unified final output
(preserve the logger.debug and QueueCompleteSentinel enqueueing behavior before
raising so behavior matches the sync path).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 68cf7fac-dc17-4621-858a-5666b3fcf559
📒 Files selected for processing (5)
src/agents/__init__.pysrc/agents/lifecycle.pysrc/agents/run.pysrc/agents/run_internal/run_loop.pytests/test_turn_lifecycle_hooks.py
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/test_turn_lifecycle_hooks.py
- src/agents/run.py
| Returning ``"stop"`` (or raising :class:`StopAgentRun`) will halt the run | ||
| gracefully — the model is **not** called for this turn and | ||
| :meth:`on_turn_end` is **not** fired. Returning ``None`` or ``"continue"`` | ||
| proceeds normally. |
There was a problem hiding this comment.
Don’t document exception-based stopping unless the runners support it.
These lines say on_turn_start can raise StopAgentRun to halt gracefully, but the reviewed runner changes only check the hook’s return value for "stop". A hook exception here will currently propagate as an error instead of taking the graceful stop path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/agents/lifecycle.py` around lines 115 - 118, The docstring for
on_turn_start incorrectly claims raising StopAgentRun will gracefully halt a
run, but runners only handle the hook's return value; remove the exception-based
stopping text and only document returning "stop" (or None/"continue") as the
supported mechanism in the on_turn_start documentation, and ensure the symbol
names mentioned are on_turn_start and StopAgentRun so future implementers know
the discrepancy if runners are later updated.
| run_hook_control, agent_hook_control = await asyncio.gather( | ||
| hooks.on_turn_start(context_wrapper, current_agent, current_turn), | ||
| ( | ||
| current_agent.hooks.on_turn_start( | ||
| context_wrapper, current_agent, current_turn | ||
| ) | ||
| if current_agent.hooks | ||
| else _coro.noop_coroutine() | ||
| ), | ||
| ) | ||
| if run_hook_control == "stop" or agent_hook_control == "stop": | ||
| logger.debug( | ||
| "Turn %s: on_turn_start hook requested stop; halting run.", | ||
| current_turn, | ||
| ) | ||
| streamed_result._max_turns_handled = True | ||
| streamed_result.current_turn = current_turn | ||
| if run_state is not None: | ||
| run_state._current_turn = current_turn | ||
| run_state._current_step = None | ||
| streamed_result._event_queue.put_nowait(QueueCompleteSentinel()) | ||
| break |
There was a problem hiding this comment.
Route hook-requested stops through the existing max-turns path.
Lines 833-844 terminate the stream directly, but they skip the max_turns handler flow already implemented at Lines 746-821. That means a configured max_turns error handler cannot translate on_turn_start(...)->"stop" into a final output, and it also diverges from the sync path in src/agents/run.py:971-988, which raises MaxTurnsExceeded for the same condition.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/agents/run_internal/run_loop.py` around lines 823 - 844, The
on_turn_start stop branch currently terminates the stream directly; instead,
mirror the existing max-turns flow by setting streamed_result._max_turns_handled
= True, updating streamed_result.current_turn and
run_state._current_turn/_current_step as you already do, then raise the same
MaxTurnsExceeded exception (the one used by the max-turns handler) so the outer
max-turns handling path processes the stop and produces the unified final output
(preserve the logger.debug and QueueCompleteSentinel enqueueing behavior before
raising so behavior matches the sync path).
| await asyncio.gather( | ||
| hooks.on_turn_end(context_wrapper, current_agent, current_turn), | ||
| ( | ||
| current_agent.hooks.on_turn_end( | ||
| context_wrapper, current_agent, current_turn | ||
| ) | ||
| if current_agent.hooks | ||
| else _coro.noop_coroutine() | ||
| ), | ||
| ) |
There was a problem hiding this comment.
Fire on_turn_end when a resumed interrupted turn finishes too.
This is the only place that emits turn-end hooks, so the resumed path around src/agents/run_internal/run_loop.py:608-705 never calls on_turn_end after resolve_interrupted_turn(...) completes a previously interrupted turn. Resumed runs will therefore miss the hook for exactly the turn whose tool calls just finished.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/agents/run_internal/run_loop.py` around lines 935 - 944, The
resumed-interrupted-turn path (where resolve_interrupted_turn(...) completes a
previously interrupted turn) never emits the on_turn_end hooks, so add the same
hook invocation used at the bottom of the loop to the resume path: after
resolve_interrupted_turn(...) finishes, await
asyncio.gather(hooks.on_turn_end(context_wrapper, current_agent, current_turn),
(current_agent.hooks.on_turn_end(context_wrapper, current_agent, current_turn)
if current_agent.hooks else _coro.noop_coroutine())) so resumed runs fire both
the global hooks.on_turn_end and the agent-specific
current_agent.hooks.on_turn_end; locate the resume branch around
resolve_interrupted_turn and mirror the existing invocation around on_turn_end
in run_loop.py.
Summary
Right now there's no way to react to the start or end of an individual agent turn — you can hook into LLM calls and tool calls, but not the turn boundary itself. This matters for things like compacting context between turns, injecting messages, logging per-turn metrics, or cancelling after a specific turn.
This PR adds
on_turn_startandon_turn_endto bothRunHooksBaseandAgentHooksBase.Closes openai#2671
What changed
on_turn_start(context, agent, turn_number)andon_turn_end(context, agent, turn_number)toRunHooksBaseandAgentHooksBaseinlifecycle.pyrun.py) and streaming (run_internal/run_loop.py) code pathsUsage example
Tests
Added
tests/test_turn_lifecycle_hooks.pycovering:on_turn_startfires before the LLM,on_turn_endfires after tool callsagent.hooks) also fireSummary by CodeRabbit
New Features
Tests