Add `RollbackToolRegistry` in the `Persistency` feature in order to roll back tool calls with side effects when checkpointing. Make `AIAgent` state-manageable and introduce `AIAgentService` to manage multiple uniform running agents. Deprecate concurrent unsafe `AIAgent.asTool` in favor of `AIAgentService.createAgentTool` by Ololoshechkin · Pull Request #873 · JetBrains/koog · GitHub

Ololoshechkin
Collaborator

@Ololoshechkin Ololoshechkin commented Sep 26, 2025

  1. Add RollbackToolRegistry in the Persistency feature in order to roll back tool calls with side effects when checkpointing:
val agent = AIAgent(
    toolRegistry = ToolRegistry {
        tool(::createUser)
        tool(::sendMessage)
        tool(::inviteMember)
    },
    ...
) {
    install(Persistency) {
        storage = myStateStorageImpl
    
        rollbackToolRegistry = RollbackToolRegistry {
            // TOOL -> "REVERSE" TOOL
            
            registerRollback(::createUser, ::deleteUser)
            registerRollback(::sendMessage, ::undoMessage)
            registerRollback(::inviteMember, ::revokeInvitation)
        }
    }
}
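
As a rough illustration of the idea behind pairing each tool with a "reverse" tool, here is a minimal, self-contained sketch. It is not Koog's implementation: `ToolCall`, `SketchRollbackRegistry`, and `demoRollback` are hypothetical names, and the sketch assumes a rollback simply replays the reverse actions for every call made after the checkpoint, newest first.

```kotlin
// Illustrative model only: these types are NOT Koog's classes.

/** One recorded tool call: which tool ran and with what argument. */
data class ToolCall(val tool: String, val arg: String)

/** Maps a tool name to the action that undoes one call of that tool. */
class SketchRollbackRegistry {
    private val reverse = mutableMapOf<String, (String) -> Unit>()

    fun registerRollback(tool: String, undo: (String) -> Unit) {
        reverse[tool] = undo
    }

    /** Undoes every call recorded after the checkpoint, newest first. */
    fun rollback(history: List<ToolCall>, checkpointIndex: Int) {
        for (call in history.drop(checkpointIndex).asReversed()) {
            // Tools without a registered reverse are skipped in this sketch.
            reverse[call.tool]?.invoke(call.arg)
        }
    }
}

fun demoRollback(): Pair<Set<String>, List<String>> {
    val users = mutableSetOf<String>()
    val sent = mutableListOf<String>()
    val history = mutableListOf<ToolCall>()

    // Hypothetical tools with side effects; each call is recorded in history.
    fun createUser(name: String) { users += name; history += ToolCall("createUser", name) }
    fun sendMessage(text: String) { sent += text; history += ToolCall("sendMessage", text) }

    val registry = SketchRollbackRegistry().apply {
        registerRollback("createUser") { name -> users.remove(name) }
        registerRollback("sendMessage") { text -> sent.remove(text) }
    }

    createUser("alice")
    val checkpoint = history.size // checkpoint taken after the first call
    createUser("bob")
    sendMessage("welcome, bob")

    registry.rollback(history, checkpoint)
    return users to sent
}

fun main() {
    println(demoRollback()) // ([alice], [])
}
```

Undoing in reverse order matters when later calls depend on earlier ones, for example a message sent to a user who was created after the checkpoint.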
  2. You can now also manage (create, run, find, and inspect the running state of) multiple AI agents using AIAgentService:
val agentService = AIAgentService(
    toolRegistry = ToolRegistry {
        tool(::createUser)
        tool(::sendMessage)
        tool(::inviteMember)
    },
    ...
) {
    install(Persistency) {
        storage = myStateStorageImpl
    
        rollbackToolRegistry = RollbackToolRegistry {
            registerRollback(::createUser, ::deleteUser)
            registerRollback(::sendMessage, ::undoMessage)
            registerRollback(::inviteMember, ::revokeInvitation)
        }
    }
}

You can then create AIAgent instances and manage their running state. This is particularly useful for rolling back long-running operations when the user realizes that the agent has taken a wrong direction:

// user creates new agent:
post("/agent") {
    val input = call.receive<String>()

    val agent = agentService.createAgent()

    launch {
        agent.run(input)
    }
    
    call.respond(agent.id)
}

// user checks the agent's state:
get("/agent") {
    val id = call.receive<String>()
    val agent = agentService.agentById(id)

    if (!agent.finished()) call.respondText("Agent is still running...")
    else call.respond(agent.resultIfReady()!!)
}

// user asks a running agent to roll back:
data class RollbackRequest(val agentId: String, val checkpoint: String)

put("/agent/rollback") {
    val userRequest = call.receive<RollbackRequest>()
    val agent = agentService.agentById(userRequest.agentId)

    if (agent.finished()) {
        call.respondText("Agent has already finished!")
    } else {
        // Rolling back agent to a checkpoint
        agent.withRunningContext {
            withPersistency(this) { ctx ->
                rollbackToCheckpoint(userRequest.checkpoint, ctx)
            }
        }
    
        call.respond(HttpStatusCode.OK)
    }
}
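
The routes above rely on three things: a service that hands out agents, lookup of an agent by id, and a way to poll whether a run has finished. The following self-contained sketch models just that shape using only the JDK. `SketchAgent` and `SketchAgentService` are hypothetical stand-ins, not Koog's `AIAgent` or `AIAgentService`, and their internals are assumptions made for illustration.

```kotlin
import java.util.UUID
import java.util.concurrent.CompletableFuture
import java.util.concurrent.ConcurrentHashMap

// Illustrative model only: these types are NOT Koog's classes.

/** An agent whose running state can be polled from another thread. */
class SketchAgent(val id: String, private val work: (String) -> String) {
    private val result = CompletableFuture<String>()

    /** Runs the agent and publishes the result for pollers. */
    fun run(input: String): String {
        val output = work(input)
        result.complete(output)
        return output
    }

    fun finished(): Boolean = result.isDone

    /** Returns the result, or null while the agent has not finished. */
    fun resultIfReady(): String? = result.getNow(null)
}

/** Creates uniform agents and keeps them reachable by id. */
class SketchAgentService(private val work: (String) -> String) {
    private val agents = ConcurrentHashMap<String, SketchAgent>()

    fun createAgent(): SketchAgent {
        val agent = SketchAgent(UUID.randomUUID().toString(), work)
        agents[agent.id] = agent
        return agent
    }

    fun agentById(id: String): SketchAgent? = agents[id]
}

fun demoService(): List<String?> {
    val service = SketchAgentService { input -> input.uppercase() }
    val agent = service.createAgent()

    // Poll before the run: nothing is ready yet.
    val before = service.agentById(agent.id)?.resultIfReady()
    agent.run("hello")
    // Poll after the run: the result is available.
    val after = service.agentById(agent.id)?.resultIfReady()

    return listOf(before, after)
}

fun main() {
    println(demoService()) // [null, HELLO]
}
```

In this sketch `agentById` returns a nullable value, so a route handler would need to decide how to respond to an unknown id.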
  3. Make AIAgent explicitly single-run. Previously, AIAgent.run could be called multiple times, but a call made while another run was in progress threw an exception saying that the agent was currently running.
    This made the contract of run() hard to manage and very error-prone (see item 4 below for an example of such an error).

  4. Fix AIAgent.asTool for parallel use (#864: "AIAgent.asTool() fails with parallelTools due to agent is already running"). Previously, AIAgentTool held a single AIAgent instance and ran it. When the LLM decided to call tools in parallel, the previous AIAgent contract made the call fail at runtime with an exception, so AIAgent.asTool was broken.
    The intended usage is now AIAgentService.createAgentTool().
    AIAgent.asTool is deprecated and now works correctly by delegating to AIAgentService.fromAgent(this).createAgentTool() (that is, it creates an AIAgentService from the current AIAgent, then creates a tool that holds a new copy of the AIAgent).
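
To make the difference between a tool that wraps one agent instance and a tool that creates a new agent per call concrete, here is a small self-contained sketch. All names (`SingleRunAgent`, `SharedAgentTool`, `PerCallAgentTool`, `demoTools`) are hypothetical and the guard is an assumption about how a single-run contract could be enforced; it is not Koog's code. The second sequential call stands in for a parallel tool call.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

// Illustrative model only: these types are NOT Koog's classes.

/** An agent that may be run exactly once. */
class SingleRunAgent(private val work: (String) -> String) {
    private val started = AtomicBoolean(false)

    fun run(input: String): String {
        // compareAndSet succeeds only for the first caller, even under contention.
        check(started.compareAndSet(false, true)) { "Agent has already been run" }
        return work(input)
    }
}

/** Tool that wraps ONE agent instance: any second call fails. */
class SharedAgentTool(private val agent: SingleRunAgent) {
    fun call(input: String): String = agent.run(input)
}

/** Tool that asks a factory for a fresh agent on every call. */
class PerCallAgentTool(private val newAgent: () -> SingleRunAgent) {
    fun call(input: String): String = newAgent().run(input)
}

fun demoTools(): Pair<Boolean, List<String>> {
    val work: (String) -> String = { it.reversed() }

    val shared = SharedAgentTool(SingleRunAgent(work))
    shared.call("ab")
    val secondCallFailed = runCatching { shared.call("cd") }.isFailure

    val perCall = PerCallAgentTool { SingleRunAgent(work) }
    val results = listOf("ab", "cd").map { perCall.call(it) }

    return secondCallFailed to results
}

fun main() {
    println(demoTools()) // (true, [ba, dc])
}
```

Because every call gets its own agent, no state is shared between tool invocations, which is the property the service-based tool relies on in this model.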

Motivation and Context

Breaking Changes


Type of the changes

  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Documentation update
  • [ ] Tests improvement
  • [ ] Refactoring

Checklist

  • [x] The pull request has a description of the proposed change
  • [x] I read the Contributing Guidelines before opening the pull request
  • [x] The pull request uses develop as the base branch
  • [x] Tests for the changes have been added
  • [ ] All new and existing tests passed

Additional steps for pull requests adding a new feature

  • [ ] An issue describing the proposed change exists
  • [ ] The pull request includes a link to the issue
  • [ ] The change was discussed and approved in the issue
  • [x] Docs have been added / updated

@Ololoshechkin Ololoshechkin changed the title Add RollbackToolRegistry in the Persistency feature in order to roll back tool calls with side effects when checkpointing Add RollbackToolRegistry in the Persistency feature in order to roll back tool calls with side effects when checkpointing. Also make agent statales and introduce session objects Sep 27, 2025
@github-actions

github-actions bot commented Sep 27, 2025

Qodana for JVM

862 new problems were found

| Inspection name | Severity | Problems |
|---|---|---|
| Check Kotlin and Java source code coverage | 🔶 Warning | 838 |
| Vulnerable imported dependency | 🔶 Warning | 17 |
| Unused import directive | 🔶 Warning | 6 |
| Missing KDoc for public API declaration | 🔶 Warning | 1 |

@@ Code coverage @@
+ 70% total lines covered
12679 lines analyzed, 8901 lines covered
# Calculated according to the filters of your coverage tool


@Ololoshechkin Ololoshechkin changed the title Add RollbackToolRegistry in the Persistency feature in order to roll back tool calls with side effects when checkpointing. Also make agent statales and introduce session objects Add RollbackToolRegistry in the Persistency feature in order to roll back tool calls with side effects when checkpointing. Also make AIAgent manageable, and introduce AIAgentService to manage uniform running agents Sep 28, 2025
@Ololoshechkin Ololoshechkin changed the title Add RollbackToolRegistry in the Persistency feature in order to roll back tool calls with side effects when checkpointing. Also make AIAgent manageable, and introduce AIAgentService to manage uniform running agents Add RollbackToolRegistry in the Persistency feature in order to roll back tool calls with side effects when checkpointing. Also make AIAgent manageable, and introduce AIAgentService to manage uniform running agents Sep 28, 2025
@Ololoshechkin Ololoshechkin changed the title Add RollbackToolRegistry in the Persistency feature in order to roll back tool calls with side effects when checkpointing. Also make AIAgent manageable, and introduce AIAgentService to manage uniform running agents Add RollbackToolRegistry in the Persistency feature in order to roll back tool calls with side effects when checkpointing. Also make AIAgent manageable, and introduce AIAgentService to manage multiple uniform running agents Sep 28, 2025
@Ololoshechkin Ololoshechkin changed the title Add RollbackToolRegistry in the Persistency feature in order to roll back tool calls with side effects when checkpointing. Also make AIAgent manageable, and introduce AIAgentService to manage multiple uniform running agents Add RollbackToolRegistry in the Persistency feature in order to roll back tool calls with side effects when checkpointing. Also make AIAgent manageable, and introduce AIAgentService to manage multiple uniform running agents. Other than that, deprecate concurrent unsafe AIAgent.asTool in favor of AIAgentService.createAgentTool Sep 29, 2025
@Ololoshechkin Ololoshechkin changed the title Add RollbackToolRegistry in the Persistency feature in order to roll back tool calls with side effects when checkpointing. Also make AIAgent manageable, and introduce AIAgentService to manage multiple uniform running agents. Other than that, deprecate concurrent unsafe AIAgent.asTool in favor of AIAgentService.createAgentTool Add RollbackToolRegistry in the Persistency feature in order to roll back tool calls with side effects when checkpointing. Make AIAgent manageable, and introduce AIAgentService to manage multiple uniform running agents. Deprecate concurrent unsafe AIAgent.asTool in favor of AIAgentService.createAgentTool Sep 29, 2025
@Ololoshechkin Ololoshechkin changed the title Add RollbackToolRegistry in the Persistency feature in order to roll back tool calls with side effects when checkpointing. Make AIAgent manageable, and introduce AIAgentService to manage multiple uniform running agents. Deprecate concurrent unsafe AIAgent.asTool in favor of AIAgentService.createAgentTool Add RollbackToolRegistry in the Persistency feature in order to roll back tool calls with side effects when checkpointing. Make AIAgent state-manageable and introduce AIAgentService to manage multiple uniform running agents. Deprecate concurrent unsafe AIAgent.asTool in favor of AIAgentService.createAgentTool Sep 29, 2025
@Ololoshechkin Ololoshechkin force-pushed the vbr/rollback-sideeffect-tools-in-persistency branch from 8a3a423 to e80d688 Compare September 30, 2025 12:34
@Ololoshechkin Ololoshechkin merged commit eca6b46 into develop Sep 30, 2025
11 checks passed
@Ololoshechkin Ololoshechkin deleted the vbr/rollback-sideeffect-tools-in-persistency branch September 30, 2025 14:05
valery1707 added a commit to valery1707/koog that referenced this pull request Oct 17, 2025
karloti pushed a commit to karloti/koog that referenced this pull request Oct 21, 2025