Fix GraphFlow to support multiple task execution without explicit reset by Copilot · Pull Request #6747 · microsoft/autogen · GitHub

Copilot AI commented Jul 3, 2025

Problem

When using GraphFlow with a termination condition, the second task execution would immediately terminate without running any agents. The first task would run successfully, but subsequent tasks would skip all agents and go directly to the stop agent.

The issue can be reproduced as follows:

# First task runs correctly
result1 = await team.run(task="First task")  # ✅ Works fine

# Second task fails immediately  
result2 = await team.run(task="Second task")  # ❌ Only user + stop messages

Root Cause

The GraphFlowManager was not resetting its execution state when termination occurred. After the first task completed:

  1. The _ready queue was empty (all nodes had been processed)
  2. The _remaining and _enqueued_any tracking structures remained in "completed" state
  3. The _message_thread retained history from the previous task

This left the graph in a "completed" state, causing subsequent tasks to immediately trigger the stop agent instead of executing the workflow.
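
The failure mode described above can be reproduced with a toy model. This is a minimal sketch, not AutoGen's code: `ToyGraphManager` and its `run` method are hypothetical stand-ins, and only the `_ready` queue name is borrowed from the description. It assumes a linear graph whose ready queue is drained once and never refilled.

```python
from collections import deque


class ToyGraphManager:
    """Hypothetical, simplified scheduler; not AutoGen's GraphFlowManager."""

    def __init__(self, order):
        self._order = list(order)
        self._ready = deque(self._order)  # nodes waiting to run

    def run(self, task):
        """Run nodes until the ready queue drains, then emit a stop marker."""
        messages = [f"user: {task}"]
        while self._ready:
            node = self._ready.popleft()
            messages.append(f"{node}: handled {task!r}")
        messages.append("stop")
        # The modeled bug: self._ready is never refilled after completion.
        return messages


manager = ToyGraphManager(["A", "B", "C"])
first = manager.run("First task")
second = manager.run("Second task")
print(len(first))  # 5
print(second)      # ['user: Second task', 'stop']
```

The second call sees an empty queue and goes straight to the stop marker, which matches the "only user + stop messages" symptom.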

Solution

Added an override of the _apply_termination_condition method in GraphFlowManager to automatically reset the graph execution state when termination occurs:

async def _apply_termination_condition(
    self, delta: Sequence[BaseAgentEvent | BaseChatMessage], increment_turn_count: bool = False
) -> bool:
    # Call the base implementation first
    terminated = await super()._apply_termination_condition(delta, increment_turn_count)
    
    # If terminated, reset the graph execution state and message thread for the next task
    if terminated:
        self._remaining = {target: Counter(groups) for target, groups in self._graph.get_remaining_map().items()}
        self._enqueued_any = {n: {g: False for g in self._enqueued_any[n]} for n in self._enqueued_any}
        self._ready = deque([n for n in self._graph.get_start_nodes()])
        # Clear the message thread to start fresh for the next task
        self._message_thread.clear()
    
    return terminated

This ensures that when a task completes (termination condition is met), the graph is automatically reset to its initial state ready for the next task.

Testing

Added a comprehensive test case test_digraph_group_chat_multiple_task_execution that validates:

  • Multiple tasks can be run sequentially without explicit reset calls
  • All agents are executed the expected number of times
  • Both tasks produce the correct number of messages
  • The fix works with various termination conditions (MaxMessageTermination, TextMentionTermination)

Result

GraphFlow now behaves like SelectorGroupChat: multiple tasks can be run sequentially without an explicit reset between them:

# Both tasks now work correctly
result1 = await team.run(task="First task")   # ✅ 5 messages, all agents called
result2 = await team.run(task="Second task")  # ✅ 5 messages, all agents called again

Fixes #6746.


Co-authored-by: ekzhu <320302+ekzhu@users.noreply.github.com>
Copilot AI changed the title from "[WIP] bug: GraphFlow with termination condition automatically ends after first query." to "Fix GraphFlow to support multiple task execution without explicit reset" Jul 3, 2025
Copilot AI requested a review from ekzhu July 3, 2025 16:29
Copilot finished work on behalf of ekzhu July 3, 2025 16:29

ekzhu commented Jul 3, 2025

@copilot we should not clear the message thread upon termination, so that the message history is retained.

Also, can you refactor GraphFlowManager so that the execution state reset code is separated into a private method and reused?

…et logic

Co-authored-by: ekzhu <320302+ekzhu@users.noreply.github.com>

Copilot AI commented Jul 3, 2025

Done! I've made both changes:

  1. Removed the _message_thread.clear() call to preserve message history across tasks
  2. Refactored the execution state reset logic into a private _reset_execution_state() method that's now reused by both reset() and _apply_termination_condition()

The message thread history is now preserved while only the graph execution state (ready queue, remaining counters, etc.) gets reset between tasks. Commit: 39aa458
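
The shape of this refactor can be sketched with a toy class. This is an illustration under assumptions, not the merged code: `ToyGraphManager`, `on_terminated`, `_start_nodes`, and `_remaining_map` are hypothetical names, the real methods are async, and the assumption that `reset()` also clears the message thread is mine. Only `_reset_execution_state`, `_remaining`, `_ready`, and `_message_thread` come from the discussion above.

```python
from collections import Counter, deque


class ToyGraphManager:
    """Hypothetical model of the refactor; not AutoGen's GraphFlowManager."""

    def __init__(self, start_nodes, remaining_map):
        self._start_nodes = list(start_nodes)
        self._remaining_map = {t: dict(g) for t, g in remaining_map.items()}
        self._message_thread = []
        self._reset_execution_state()

    def _reset_execution_state(self):
        """Rebuild only the scheduling state; the message thread is untouched."""
        self._remaining = {t: Counter(g) for t, g in self._remaining_map.items()}
        self._ready = deque(self._start_nodes)

    def reset(self):
        """Full reset (assumed here to clear history as well)."""
        self._message_thread.clear()
        self._reset_execution_state()

    def on_terminated(self):
        """Stand-in for the termination path: reuse the shared helper."""
        self._reset_execution_state()


manager = ToyGraphManager(["A"], {"B": {"A": 1}})
manager._message_thread.append("user: First task")
manager._ready.popleft()           # simulate A having run
manager._remaining["B"]["A"] -= 1  # simulate B's dependency being satisfied
manager.on_terminated()
print(list(manager._ready), manager._remaining["B"]["A"], manager._message_thread)
```

Keeping the helper separate means both call sites rebuild the scheduling state the same way, while the decision to keep or clear history stays with the caller.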

Copilot finished work on behalf of ekzhu July 3, 2025 17:11

@ekzhu ekzhu left a comment

Add a unit test to verify that GraphFlow can be resumed from the graph execution state it was left in when a termination condition is reached, i.e.:

  1. Create a graph flow and add a termination condition, e.g., MaxMessageTermination.
  2. Run the graph flow and observe that the termination condition is reached.
  3. Resume the graph flow with no task and observe that the execution state was kept and the flow continues from where it left off.

Co-authored-by: ekzhu <320302+ekzhu@users.noreply.github.com>

Copilot AI commented Jul 6, 2025

Added the unit test test_digraph_group_chat_resume_with_termination_condition that verifies GraphFlow can be resumed with the same execution state after a termination condition is reached. The test demonstrates:

  1. Graph runs until MaxMessageTermination(3) is reached after A and B execute
  2. Graph state is preserved (C hasn't run yet)
  3. Graph can be resumed with team.run() and continues from where it left off
  4. C executes followed by the stop agent to complete the flow

Commit: 08e297b
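
The resume behaviour this test checks can be sketched with a stdlib-only toy. `ToyResumableFlow` is hypothetical and only mimics a three-node chain A, B, C with a message cap of 3. The reset on full completion is an assumption of this sketch; the thread does not show how the merged code combines resuming with the automatic reset.

```python
from collections import deque


class ToyResumableFlow:
    """Hypothetical linear flow with a message cap; not AutoGen's GraphFlow."""

    def __init__(self, order, max_messages):
        self._order = list(order)
        self._max_messages = max_messages
        self._ready = deque(self._order)

    def run(self, task=None):
        messages = []
        if task is not None:
            messages.append(f"user: {task}")
        while self._ready:
            if len(messages) >= self._max_messages:
                # Terminated early: self._ready still holds the unrun nodes.
                return messages
            messages.append(f"{self._ready.popleft()}: reply")
        messages.append("stop")
        # Assumption: state is reset only once the whole graph has finished.
        self._ready = deque(self._order)
        return messages


flow = ToyResumableFlow(["A", "B", "C"], max_messages=3)
first = flow.run("task")  # stops after user, A, B
second = flow.run()       # resumes with C, then stop
print(first)
print(second)
```

In this model the early return leaves C queued, so a second `run()` with no task picks up at C and then reaches the stop marker, mirroring the four steps listed above.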

Copilot AI requested a review from ekzhu July 6, 2025 04:08
Copilot finished work on behalf of ekzhu July 6, 2025 04:08
… select_speaker

Co-authored-by: ekzhu <320302+ekzhu@users.noreply.github.com>
Copilot AI requested a review from ekzhu July 6, 2025 05:45
Copilot finished work on behalf of ekzhu July 6, 2025 05:45
@ekzhu ekzhu marked this pull request as ready for review July 6, 2025 06:04

codecov bot commented Jul 6, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.27%. Comparing base (aa0d835) to head (cfaf09a).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #6747   +/-   ##
=======================================
  Coverage   80.27%   80.27%           
=======================================
  Files         232      232           
  Lines       17648    17651    +3     
=======================================
+ Hits        14167    14170    +3     
  Misses       3481     3481           
Flag Coverage Δ
unittests 80.27% <100.00%> (+<0.01%) ⬆️

@ekzhu ekzhu merged commit c23b945 into main Jul 6, 2025
68 checks passed
@ekzhu ekzhu deleted the copilot/fix-6746 branch July 6, 2025 06:32
Development

Successfully merging this pull request may close these issues.

bug: GraphFlow with termination condition automatically ends after first query.

2 participants