Chapter 16: Resource-Aware Optimization
Resource-Aware Optimization enables intelligent agents to dynamically monitor and
manage computational, temporal, and financial resources during operation. This
differs from simple planning, which primarily focuses on action sequencing.
Resource-Aware Optimization requires agents to make decisions regarding action
execution to achieve goals within specified resource budgets or to optimize efficiency.
This involves choosing between more accurate but expensive models and faster,
lower-cost ones, or deciding whether to allocate additional compute for a more
refined response versus returning a quicker, less detailed answer.
For example, consider an agent tasked with analyzing a large dataset for a financial
analyst. If the analyst needs a preliminary report immediately, the agent might use a
faster, more affordable model to quickly summarize key trends. However, if the analyst
requires a highly accurate forecast for a critical investment decision and has a larger
budget and more time, the agent would allocate more resources to utilize a powerful,
slower, but more precise predictive model. A key strategy in this category is the
fallback mechanism, which acts as a safeguard when a preferred model is unavailable
due to being overloaded or throttled. To ensure graceful degradation, the system
automatically switches to a default or more affordable model, maintaining service
continuity instead of failing completely.
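The skeleton below sketches this fallback idea in Python. It is illustrative only: the call_model function is a hypothetical stand-in for a real LLM client call, and the model names are placeholders. The point is the ordered chain of attempts, with an error raised only when every option is exhausted.

import random

def call_model(model: str, prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; fails randomly
    to simulate overload or throttling."""
    if random.random() < 0.3:
        raise RuntimeError(f"{model} is overloaded")
    return f"[{model}] answer to: {prompt}"

# Preferred model first, cheaper fallback after it.
FALLBACK_CHAIN = ["gemini-2.5-pro", "gemini-2.5-flash"]

def call_with_fallback(prompt: str) -> str:
    last_error = None
    for model in FALLBACK_CHAIN:
        try:
            return call_model(model, prompt)
        except RuntimeError as exc:
            last_error = exc  # degrade gracefully to the next model
    raise RuntimeError("All models in the chain failed") from last_error

print(call_with_fallback("Summarize today's key market trends."))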
Practical Applications & Use Cases
Practical use cases include:
● Cost-Optimized LLM Usage: An agent deciding whether to use a large,
expensive LLM for complex tasks or a smaller, more affordable one for simpler
queries, based on a budget constraint.
● Latency-Sensitive Operations: In real-time systems, an agent chooses a faster
but potentially less comprehensive reasoning path to ensure a timely response.
● Energy Efficiency: For agents deployed on edge devices or with limited power,
optimizing their processing to conserve battery life.
● Fallback for Service Reliability: An agent automatically switches to a backup
model when the primary choice is unavailable, ensuring service continuity and
graceful degradation.
● Data Usage Management: An agent opting for summarized data retrieval
instead of full dataset downloads to save bandwidth or storage.
● Adaptive Task Allocation: In multi-agent systems, agents self-assign tasks
based on their current computational load or available time.
Hands-On Code Example
An intelligent system for answering user questions can assess the difficulty of each
question. For simple queries, it utilizes a cost-effective language model such as
Gemini Flash. For complex inquiries, a more powerful, but expensive, language model
(like Gemini Pro) is considered. The decision to use the more powerful model also
depends on resource availability, specifically budget and time constraints. This system
dynamically selects appropriate models.
For example, consider a travel planner built with a hierarchical agent. The high-level
planning, which involves understanding a user's complex request, breaking it down
into a multi-step itinerary, and making logical decisions, would be managed by a
sophisticated and more powerful LLM like Gemini Pro. This is the "planner" agent that
requires a deep understanding of context and the ability to reason.
However, once the plan is established, the individual tasks within that plan, such as
looking up flight prices, checking hotel availability, or finding restaurant reviews, are
essentially simple, repetitive web queries. These "tool function calls" can be executed
by a faster and more affordable model like Gemini Flash. It is easy to see why the
affordable model suffices for these straightforward web searches, while the intricate
planning phase requires the greater intelligence of the more advanced model to
ensure a coherent and logical travel plan.
Google's ADK supports this approach through its multi-agent architecture, which
allows for modular and scalable applications. Different agents can handle specialized
tasks. Model flexibility enables the direct use of various Gemini models, including both
Gemini Pro and Gemini Flash, or integration of other models through LiteLLM. The
ADK's orchestration capabilities support dynamic, LLM-driven routing for adaptive
behavior. Built-in evaluation features allow systematic assessment of agent
performance, which can be used for system refinement (see the Chapter on
Evaluation and Monitoring).
Next, we define two agents with identical setups but different models and costs.
# Conceptual Python-like structure, not runnable code
from google.adk.agents import Agent
# from google.adk.models.lite_llm import LiteLlm  # If using models not directly supported by ADK's default Agent

# Agent using the more expensive Gemini 2.5 Pro
gemini_pro_agent = Agent(
    name="GeminiProAgent",
    model="gemini-2.5-pro",  # Placeholder for actual model name if different
    description="A highly capable agent for complex queries.",
    instruction="You are an expert assistant for complex problem-solving."
)

# Agent using the less expensive Gemini 2.5 Flash
gemini_flash_agent = Agent(
    name="GeminiFlashAgent",
    model="gemini-2.5-flash",  # Placeholder for actual model name if different
    description="A fast and efficient agent for simple queries.",
    instruction="You are a quick assistant for straightforward questions."
)
A Router Agent can direct queries based on simple metrics like query length, where
shorter queries go to less expensive models and longer queries to more capable
models. However, a more sophisticated Router Agent can utilize either LLM or ML
models to analyze query nuances and complexity. This LLM router can determine
which downstream language model is most suitable. For example, a query requiring
simple factual recall is routed to a Flash model, while a complex query requiring deep
analysis is routed to a Pro model.
Optimization techniques can further enhance the LLM router's effectiveness. Prompt
tuning involves crafting prompts to guide the router LLM for better routing decisions.
Fine-tuning the LLM router on a dataset of queries and their optimal model choices
improves its accuracy and efficiency. This dynamic routing capability balances
response quality with cost-effectiveness.
# Conceptual Python-like structure, not runnable code
from typing import AsyncGenerator

from google.adk.agents import Agent, BaseAgent
from google.adk.events import Event
from google.adk.agents.invocation_context import InvocationContext

class QueryRouterAgent(BaseAgent):
    name: str = "QueryRouter"
    description: str = "Routes user queries to the appropriate LLM agent based on complexity."

    async def _run_async_impl(self, context: InvocationContext) -> AsyncGenerator[Event, None]:
        user_query = context.current_message.text  # Assuming text input
        query_length = len(user_query.split())  # Simple metric: number of words

        if query_length < 20:  # Example threshold for simplicity vs. complexity
            print(f"Routing to Gemini Flash Agent for short query (length: {query_length})")
            # In a real ADK setup, you would 'transfer_to_agent' or directly invoke.
            # For demonstration, we'll simulate a call and yield its response.
            response = await gemini_flash_agent.run_async(context.current_message)
            yield Event(author=self.name, content=f"Flash Agent processed: {response}")
        else:
            print(f"Routing to Gemini Pro Agent for long query (length: {query_length})")
            response = await gemini_pro_agent.run_async(context.current_message)
            yield Event(author=self.name, content=f"Pro Agent processed: {response}")
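The word-count heuristic above is deliberately crude. A minimal sketch of the LLM-based router described earlier might look like the following; the classify callable and the routing prompt are assumptions for illustration, not ADK APIs:

# Illustrative sketch of an LLM-driven router (not an ADK API).
ROUTING_PROMPT = (
    "Classify the following query as 'simple' or 'complex'. "
    "Respond with exactly one word.\n\nQuery: {query}"
)

def route_query(query: str, classify) -> str:
    """classify is any callable that sends a prompt to a cheap LLM
    and returns its text response."""
    label = classify(ROUTING_PROMPT.format(query=query)).strip().lower()
    return "gemini-2.5-flash" if label == "simple" else "gemini-2.5-pro"

# Example with a trivial stand-in classifier:
print(route_query("What is 2 + 2?", classify=lambda p: "simple"))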
The Critique Agent evaluates responses from language models, providing feedback
that serves several functions. For self-correction, it identifies errors or
inconsistencies, prompting the answering agent to refine its output for improved
quality. It also systematically assesses responses for performance monitoring,
tracking metrics like accuracy and relevance, which are used for optimization.
Additionally, its feedback can serve as a signal for reinforcement learning or
fine-tuning; for instance, consistent identification of inadequate Flash-model
responses can be used to refine the router agent's logic. While not directly
managing the budget, the Critique Agent contributes
to indirect budget management by identifying suboptimal routing choices, such as
directing simple queries to a Pro model or complex queries to a Flash model, which
leads to poor results. This informs adjustments that improve resource allocation and
cost savings.
The Critique Agent can be configured to review either only the generated text from
the answering agent or both the original query and the generated text, enabling a
comprehensive evaluation of the response's alignment with the initial question.
CRITIC_SYSTEM_PROMPT = """
You are the **Critic Agent**, serving as the quality assurance arm of
our collaborative research assistant system. Your primary function is
to **meticulously review and challenge** information from the
Researcher Agent, guaranteeing **accuracy, completeness, and unbiased
presentation**.
Your duties encompass:
* **Assessing research findings** for factual correctness,
thoroughness, and potential leanings.
* **Identifying any missing data** or inconsistencies in reasoning.
* **Raising critical questions** that could refine or expand the
current understanding.
* **Offering constructive suggestions** for enhancement or exploring
different angles.
* **Validating that the final output is comprehensive** and balanced.
All criticism must be constructive. Your goal is to fortify the
research, not invalidate it. Structure your feedback clearly, drawing
attention to specific points for revision. Your overarching aim is to
ensure the final research product meets the highest possible quality
standards.
"""
The Critic Agent operates based on a predefined system prompt that outlines its role,
responsibilities, and feedback approach. A well-designed prompt for this agent must
clearly establish its function as an evaluator. It should specify the areas for critical
focus and emphasize providing constructive feedback rather than mere dismissal. The
prompt should also encourage the identification of both strengths and weaknesses,
and it must guide the agent on how to structure and present its feedback.
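Following the conceptual style of the earlier agent definitions, wiring the critic into the system could be as simple as defining another agent whose instruction is the prompt above. This is a sketch in the same Python-like pseudocode, not runnable ADK code:

# Conceptual Python-like structure, not runnable code
critic_agent = Agent(
    name="CriticAgent",
    model="gemini-2.5-pro",  # Quality review benefits from the stronger model
    description="Quality-assurance agent that reviews and challenges responses.",
    instruction=CRITIC_SYSTEM_PROMPT,
)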
Hands-On Code Example (OpenAI)
This system uses a resource-aware optimization strategy to handle user queries
efficiently. It first classifies each query into one of three categories to determine the
most appropriate and cost-effective processing pathway. This approach avoids
wasting computational resources on simple requests while ensuring complex queries
get the necessary attention. The three categories are:
● simple: For straightforward questions that can be answered directly without
complex reasoning or external data.
● reasoning: For queries that require logical deduction or multi-step thought
processes, which are routed to more powerful models.
● internet_search: For questions needing current information, which
automatically triggers a Google Search to provide an up-to-date answer.
The code is under the MIT license and available on GitHub:
https://github.com/mahtabsyed/21-Agentic-Patterns/blob/main/16_Resource_Aware_Opt_LLM_Reflection_v2.ipynb
# MIT License
# Copyright (c) 2025 Mahtab Syed
# https://www.linkedin.com/in/mahtabsyed/

import os
import requests
import json
from dotenv import load_dotenv
from openai import OpenAI

# Load environment variables
load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
GOOGLE_CUSTOM_SEARCH_API_KEY = os.getenv("GOOGLE_CUSTOM_SEARCH_API_KEY")
GOOGLE_CSE_ID = os.getenv("GOOGLE_CSE_ID")

if not OPENAI_API_KEY or not GOOGLE_CUSTOM_SEARCH_API_KEY or not GOOGLE_CSE_ID:
    raise ValueError(
        "Please set OPENAI_API_KEY, GOOGLE_CUSTOM_SEARCH_API_KEY, and GOOGLE_CSE_ID in your .env file."
    )
client = OpenAI(api_key=OPENAI_API_KEY)

# --- Step 1: Classify the Prompt ---
def classify_prompt(prompt: str) -> dict:
    system_message = {
        "role": "system",
        "content": (
            "You are a classifier that analyzes user prompts and returns one of three categories ONLY:\n\n"
            "- simple\n"
            "- reasoning\n"
            "- internet_search\n\n"
            "Rules:\n"
            "- Use 'simple' for direct factual questions that need no reasoning or current events.\n"
            "- Use 'reasoning' for logic, math, or multi-step inference questions.\n"
            "- Use 'internet_search' if the prompt refers to current events, recent data, or things not in your training data.\n\n"
            "Respond ONLY with JSON like:\n"
            '{ "classification": "simple" }'
        ),
    }

    user_message = {"role": "user", "content": prompt}

    response = client.chat.completions.create(
        model="gpt-4o", messages=[system_message, user_message], temperature=1
    )

    reply = response.choices[0].message.content
    return json.loads(reply)
# --- Step 2: Google Search ---
def google_search(query: str, num_results=1) -> list:
    url = "https://www.googleapis.com/customsearch/v1"
    params = {
        "key": GOOGLE_CUSTOM_SEARCH_API_KEY,
        "cx": GOOGLE_CSE_ID,
        "q": query,
        "num": num_results,
    }

    try:
        response = requests.get(url, params=params)
        response.raise_for_status()
        results = response.json()
        if "items" in results and results["items"]:
            return [
                {
                    "title": item.get("title"),
                    "snippet": item.get("snippet"),
                    "link": item.get("link"),
                }
                for item in results["items"]
            ]
        else:
            return []
    except requests.exceptions.RequestException as e:
        return {"error": str(e)}
# --- Step 3: Generate Response ---
def generate_response(prompt: str, classification: str, search_results=None) -> tuple[str, str]:
    if classification == "simple":
        model = "gpt-4o-mini"
        full_prompt = prompt
    elif classification == "reasoning":
        model = "o4-mini"
        full_prompt = prompt
    elif classification == "internet_search":
        model = "gpt-4o"
        # Convert each search result dict to a readable string
        if search_results:
            search_context = "\n".join(
                [
                    f"Title: {item.get('title')}\nSnippet: {item.get('snippet')}\nLink: {item.get('link')}"
                    for item in search_results
                ]
            )
        else:
            search_context = "No search results found."
        full_prompt = f"""Use the following web results to answer the user query:

{search_context}

Query: {prompt}"""

    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": full_prompt}],
        temperature=1,
    )
    return response.choices[0].message.content, model
# --- Step 4: Combined Router ---
def handle_prompt(prompt: str) -> dict:
    classification_result = classify_prompt(prompt)
    # print("\n🔍 Classification Result:", classification_result)
    classification = classification_result["classification"]

    search_results = None
    if classification == "internet_search":
        search_results = google_search(prompt)
        # print("\n🔍 Search Results:", search_results)

    answer, model = generate_response(prompt, classification, search_results)
    return {"classification": classification, "response": answer, "model": model}


test_prompt = "What is the capital of Australia?"
# test_prompt = "Explain the impact of quantum computing on cryptography."
# test_prompt = "When does the Australian Open 2026 start, give me full date?"

result = handle_prompt(test_prompt)
print("🔍 Classification:", result["classification"])
print("🧠 Model Used:", result["model"])
print("🧠 Response:\n", result["response"])
This Python code implements a prompt routing system to answer user questions. It
begins by loading necessary API keys from a .env file for OpenAI and Google Custom
Search. The core functionality lies in classifying the user's prompt into three
categories: simple, reasoning, or internet search. A dedicated function utilizes an
OpenAI model for this classification step. If the prompt requires current information, a
Google search is performed using the Google Custom Search API. Another function
then generates the final response, selecting an appropriate OpenAI model based on
the classification. For internet search queries, the search results are provided as
context to the model. The main handle_prompt function orchestrates this workflow,
calling the classification and search (if needed) functions before generating the
response. It returns the classification, the model used, and the generated answer. This
system efficiently directs different types of queries to optimized methods for a better
response.
Hands-On Code Example (OpenRouter)
OpenRouter offers a unified interface to hundreds of AI models via a single API
endpoint. It provides automated failover and cost-optimization, with easy integration
through your preferred SDK or framework.
import requests
import json

response = requests.post(
    url="https://openrouter.ai/api/v1/chat/completions",
    headers={
        "Authorization": "Bearer <OPENROUTER_API_KEY>",
        "HTTP-Referer": "<YOUR_SITE_URL>",  # Optional. Site URL for rankings on openrouter.ai.
        "X-Title": "<YOUR_SITE_NAME>",  # Optional. Site title for rankings on openrouter.ai.
    },
    data=json.dumps({
        "model": "openai/gpt-4o",  # Optional
        "messages": [
            {
                "role": "user",
                "content": "What is the meaning of life?"
            }
        ]
    })
)
This code snippet uses the requests library to interact with the OpenRouter API. It
sends a POST request to the chat completion endpoint with a user message. The
request includes authorization headers with an API key and optional site information.
The goal is to get a response from a specified language model, in this case,
"openai/gpt-4o".
OpenRouter offers two distinct methodologies for routing and determining which
computational model processes a given request:
● Automated Model Selection: This function routes a request to an optimized
model chosen from a curated set of available models. The selection is
predicated on the specific content of the user's prompt. The identifier of the
model that ultimately processes the request is returned in the response's
metadata.
{
"model": "openrouter/auto",
... // Other params
}
● Sequential Model Fallback: This mechanism provides operational redundancy
by allowing users to specify a hierarchical list of models. The system will first
attempt to process the request with the primary model designated in the
sequence. Should this primary model fail to respond due to any number of error
conditions—such as service unavailability, rate-limiting, or content filtering—the
system will automatically re-route the request to the next specified model in
the sequence. This process continues until a model in the list successfully
executes the request or the list is exhausted. The final cost of the operation
and the model identifier returned in the response will correspond to the model
that successfully completed the computation.
{
"models": ["anthropic/claude-3.5-sonnet", "gryphe/mythomax-l2-13b"],
... // Other params
}
OpenRouter offers a detailed leaderboard (https://openrouter.ai/rankings) that ranks
available AI models by their cumulative token production. It also offers the latest
models from different providers (ChatGPT, Gemini, Claude); see Fig. 1.
Fig. 1: The OpenRouter website (https://openrouter.ai/)
Beyond Dynamic Model Switching: A Spectrum of
Agent Resource Optimizations
Resource-aware optimization is paramount in developing intelligent agent systems
that operate efficiently and effectively within real-world constraints. Let's see a
number of additional techniques:
Dynamic Model Switching is a critical technique involving the strategic selection of
large language models based on the intricacies of the task at hand and the available
computational resources. When faced with simple queries, a lightweight,
cost-effective LLM can be deployed, whereas complex, multifaceted problems
necessitate the utilization of more sophisticated and resource-intensive models.
Adaptive Tool Use & Selection ensures agents can intelligently choose from a suite
of tools, selecting the most appropriate and efficient one for each specific sub-task,
with careful consideration given to factors like API usage costs, latency, and execution
time. This dynamic tool selection enhances overall system efficiency by optimizing the
use of external APIs and services.
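As a minimal sketch, tool selection can be framed as picking the cheapest tool whose expected latency fits the current deadline. The tool names and cost figures below are invented for illustration:

# Invented cost/latency figures for illustration only.
TOOLS = {
    "vector_store": {"cost_usd": 0.0005, "latency_s": 0.3},
    "web_search":   {"cost_usd": 0.002,  "latency_s": 1.5},
    "full_browser": {"cost_usd": 0.02,   "latency_s": 6.0},
}

def pick_tool(candidates: list[str], deadline_s: float) -> str:
    """Return the cheapest candidate tool that can meet the deadline."""
    viable = [t for t in candidates if TOOLS[t]["latency_s"] <= deadline_s]
    if not viable:
        raise TimeoutError("No tool can meet the deadline")
    return min(viable, key=lambda t: TOOLS[t]["cost_usd"])

print(pick_tool(["web_search", "vector_store"], deadline_s=2.0))  # vector_store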
Contextual Pruning & Summarization plays a vital role in managing the amount of
information processed by agents, strategically minimizing the prompt token count and
reducing inference costs by intelligently summarizing and selectively retaining only the
most relevant information from the interaction history, preventing unnecessary
computational overhead.
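A bare-bones version of this idea keeps only the most recent turns that fit a token budget. Here the token count is approximated by word count; a real system would use the model's tokenizer:

def prune_history(turns: list[str], max_tokens: int = 1024) -> list[str]:
    """Keep the newest turns whose combined size fits the budget."""
    kept: list[str] = []
    used = 0
    for turn in reversed(turns):       # walk from newest to oldest
        cost = len(turn.split())       # crude proxy for token count
        if used + cost > max_tokens:
            break
        kept.append(turn)
        used += cost
    return list(reversed(kept))        # restore chronological order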
Proactive Resource Prediction involves anticipating resource demands by
forecasting future workloads and system requirements, which allows for proactive
allocation and management of resources, ensuring system responsiveness and
preventing bottlenecks.
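A toy forecaster makes the idea concrete: track recent usage in a sliding window and flag when the forecast approaches capacity. The window size and headroom threshold are arbitrary choices for this sketch:

from collections import deque

class DemandForecaster:
    """Sliding-window mean as a minimal workload forecast."""

    def __init__(self, window: int = 10):
        self.history: deque[int] = deque(maxlen=window)

    def record(self, tokens_used: int) -> None:
        self.history.append(tokens_used)

    def forecast(self) -> float:
        return sum(self.history) / len(self.history) if self.history else 0.0

    def needs_scaling(self, capacity: int, headroom: float = 0.8) -> bool:
        # Pre-allocate resources before the system saturates.
        return self.forecast() > capacity * headroom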
Cost-Sensitive Exploration in multi-agent systems extends optimization
considerations to encompass communication costs alongside traditional
computational costs, influencing the strategies employed by agents to collaborate
and share information, aiming to minimize the overall resource expenditure.
Energy-Efficient Deployment is specifically tailored for environments with stringent
resource constraints, aiming to minimize the energy footprint of intelligent agent
systems, extending operational time and reducing overall running costs.
Parallelization & Distributed Computing Awareness leverages distributed
resources to enhance the processing power and throughput of agents, distributing
computational workloads across multiple machines or processors to achieve greater
efficiency and faster task completion.
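In Python this often reduces to fanning independent sub-tasks out with asyncio.gather rather than awaiting them one by one; the task body below is a placeholder for a real API call:

import asyncio

async def fetch(task: str) -> str:
    await asyncio.sleep(0.1)           # placeholder for a real API or tool call
    return f"result of {task}"

async def run_plan(tasks: list[str]) -> list[str]:
    # All sub-tasks run concurrently; total latency is roughly the slowest one.
    return await asyncio.gather(*(fetch(t) for t in tasks))

print(asyncio.run(run_plan(["flights", "hotels", "restaurants"])))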
Learned Resource Allocation Policies introduce a learning mechanism, enabling
agents to adapt and optimize their resource allocation strategies over time based on
feedback and performance metrics, improving efficiency through continuous
refinement.
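One simple learning mechanism is an epsilon-greedy bandit over the available models, where the reward blends observed quality and cost. All names and numbers here are illustrative:

import random

class ModelBandit:
    """Epsilon-greedy policy over candidate models."""

    def __init__(self, models: list[str], epsilon: float = 0.1):
        self.epsilon = epsilon
        self.stats = {m: {"n": 0, "mean_reward": 0.0} for m in models}

    def choose(self) -> str:
        if random.random() < self.epsilon:   # explore occasionally
            return random.choice(list(self.stats))
        return max(self.stats, key=lambda m: self.stats[m]["mean_reward"])

    def update(self, model: str, reward: float) -> None:
        s = self.stats[model]
        s["n"] += 1
        s["mean_reward"] += (reward - s["mean_reward"]) / s["n"]  # running mean

bandit = ModelBandit(["gemini-2.5-flash", "gemini-2.5-pro"])
model = bandit.choose()
bandit.update(model, reward=0.9)  # e.g., quality score minus a cost penalty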
Graceful Degradation and Fallback Mechanisms ensure that intelligent agent
systems can continue to function, albeit perhaps at a reduced capacity, even when
resource constraints are severe, gracefully degrading performance and falling back to
alternative strategies to maintain operation and provide essential functionality.
At a Glance
What: Resource-Aware Optimization addresses the challenge of managing the
consumption of computational, temporal, and financial resources in intelligent
systems. LLM-based applications can be expensive and slow, and selecting the best
model or tool for every task is often inefficient. This creates a fundamental trade-off
between the quality of a system's output and the resources required to produce it.
Without a dynamic management strategy, systems cannot adapt to varying task
complexities or operate within budgetary and performance constraints.
Why: The standardized solution is to build an agentic system that intelligently
monitors and allocates resources based on the task at hand. This pattern typically
employs a "Router Agent" to first classify the complexity of an incoming request. The
request is then forwarded to the most suitable LLM or tool—a fast, inexpensive model
for simple queries, and a more powerful one for complex reasoning. A "Critique
Agent" can further refine the process by evaluating the quality of the response,
providing feedback to improve the routing logic over time. This dynamic, multi-agent
approach ensures the system operates efficiently, balancing response quality with
cost-effectiveness.
Rule of thumb: Use this pattern when operating under strict financial budgets for API
calls or computational power, building latency-sensitive applications where quick
response times are critical, deploying agents on resource-constrained hardware such
as edge devices with limited battery life, programmatically balancing the trade-off
between response quality and operational cost, and managing complex, multi-step
workflows where different tasks have varying resource requirements.
Visual Summary
Fig. 2: Resource-Aware Optimization Design Pattern
Key Takeaways
● Resource-Aware Optimization is Essential: Intelligent agents can manage
computational, temporal, and financial resources dynamically. Decisions
regarding model usage and execution paths are made based on real-time
constraints and objectives.
● Multi-Agent Architecture for Scalability: Google's ADK provides a multi-agent
framework, enabling modular design. Different agents (answering, routing,
critique) handle specific tasks.
● Dynamic, LLM-Driven Routing: A Router Agent directs queries to language
models (Gemini Flash for simple, Gemini Pro for complex) based on query
complexity and budget. This optimizes cost and performance.
● Critique Agent Functionality: A dedicated Critique Agent provides feedback for
self-correction, performance monitoring, and refining routing logic, enhancing
system effectiveness.
● Optimization Through Feedback and Flexibility: Evaluation capabilities for
critique and model integration flexibility contribute to adaptive and
self-improving system behavior.
● Additional Resource-Aware Optimizations: Other methods include Adaptive
Tool Use & Selection, Contextual Pruning & Summarization, Proactive Resource
Prediction, Cost-Sensitive Exploration in Multi-Agent Systems, Energy-Efficient
Deployment, Parallelization & Distributed Computing Awareness, Learned
Resource Allocation Policies, Graceful Degradation and Fallback Mechanisms,
and Prioritization of Critical Tasks.
Conclusions
Resource-aware optimization is essential for the development of intelligent agents,
enabling efficient operation within real-world constraints. By managing computational,
temporal, and financial resources, agents can achieve optimal performance and
cost-effectiveness. Techniques such as dynamic model switching, adaptive tool use,
and contextual pruning are crucial for attaining these efficiencies. Advanced
strategies, including learned resource allocation policies and graceful degradation,
enhance an agent's adaptability and resilience under varying conditions. Integrating
these optimization principles into agent design is fundamental for building scalable,
robust, and sustainable AI systems.
References
1. Google's Agent Development Kit (ADK): https://google.github.io/adk-docs/
2. Gemini 2.5 Flash & Gemini 2.5 Pro: https://aistudio.google.com/
3. OpenRouter: https://openrouter.ai/docs/quickstart