Gemini-backed Agent Tool-Call Messages Break With OpenAI Agent - A Bug Analysis
Introduction
Hey guys! Today, we're diving deep into a fascinating issue that arises when using Gemini-backed agents and trying to reuse their messages with OpenAI agents. Imagine building an intelligent agent using Google's Gemini, having it interact with the world, and then attempting to seamlessly switch over to OpenAI's powerhouse. Sounds smooth, right? Well, there's a snag! It turns out that the way Gemini formats tool-call messages can cause a bit of a hiccup when those messages are fed into an OpenAI agent. This can be particularly frustrating because one of the major draws of platforms like Agno is the ability to mix and match different language model backends in a single session. We expect a smooth handoff, but this bug throws a wrench in the works. So, let's break down what's happening, why it matters, and how we might fix it. We'll walk through the technical details, the steps to reproduce the issue, the expected vs. actual behavior, and even some potential solutions. Buckle up; it's going to be a nerdy but crucial ride!
Problem Description
The heart of the problem lies in how Gemini structures its `tool` messages. When a Gemini agent configured with a `@tool` generates a message after a tool call, it formats the `content` as a one-element list containing a string. Now, when you decide to switch gears and use an `OpenAIChat` model within the same Agno agent session, and you feed the same message history back into `agent.run()`, OpenAI throws a fit. It complains about an invalid type: when `content` arrives as a list, the API expects each element to be a content object, not a bare string nested inside the list. This is a classic case of miscommunication between systems that are supposed to play nice together. Agno is designed to normalize these outputs to the Function-Calling schema, but it seems this case slips through the normalization process. The issue directly impacts the user experience, especially for those who want to leverage the strengths of both Gemini and OpenAI in their applications. Imagine building a complex workflow that needs the creative flair of Gemini for initial brainstorming and then switches to OpenAI for more structured tasks. This bug breaks that seamless transition, forcing developers to implement workarounds or stick to a single model.
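To make the mismatch concrete, here's a minimal sketch of the two shapes as plain Python dicts. These are illustrative only, inferred from the error described above rather than copied from Agno's internals:

```python
# Hypothetical message shapes, inferred from the error described above.

# What the Gemini-backed agent leaves in the replayed history:
gemini_style_tool_message = {
    "role": "tool",
    "content": ['{"echo": "hello world"}'],  # one-element list holding a bare string
}

# What the OpenAI API will accept: a plain string (or a list of typed
# content objects -- never a bare string sitting inside a list).
openai_style_tool_message = {
    "role": "tool",
    "content": '{"echo": "hello world"}',
}
```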
Steps to Reproduce
Alright, let's get our hands dirty and reproduce this bug ourselves. Here’s a step-by-step guide that will allow you to see the issue firsthand:
- **Set up your environment variables:** First, you'll need to make sure your API keys for both Google and OpenAI are set up. This is crucial because the script will be interacting with both services.

  ```bash
  export GOOGLE_API_KEY=…
  export OPENAI_API_KEY=…
  ```

  Make sure to replace the `…` with your actual API keys. Without these, the script won't be able to authenticate with the respective services.
- **Save the script:** Copy the Python script provided below and save it as `repro_agno_tool_error.py`. This script is designed to demonstrate the bug clearly.
- **Install dependencies:** You'll need to install the necessary Python packages using pip. This includes `agno`, `openai`, and `google-genai` (the Google Gen AI SDK that Agno's Gemini model builds on).

  ```bash
  pip install agno openai google-genai
  ```

  These libraries provide the Agno framework and the clients needed to interact with the Gemini and OpenAI models.
- **Run the script:** Execute the Python script from your terminal.

  ```bash
  python repro_agno_tool_error.py
  ```

  This will initiate the agent interaction, first with Gemini and then with OpenAI.
- **Observe the error:** Pay close attention to the output in your terminal. You should see that the Gemini agent successfully prints a `tool` message, but when the same messages are passed to the OpenAI agent, it throws the invalid-type error. This confirms the bug we're investigating.
Here’s the Python script you’ll need:
```python
import os
import json

from agno.agent import Agent
from agno.models.google import Gemini
from agno.models.openai import OpenAIChat
from agno.tools import tool


@tool(
    name="dummy_tool",
    description="A test tool that echoes its input",
    show_result=True,
    stop_after_tool_call=False,
    requires_confirmation=False,
    cache_results=False,
)
def dummy_tool(query: str) -> str:
    # Echo the query back as a JSON string.
    return json.dumps({"echo": query})


# Start with a Gemini-backed agent that can call dummy_tool.
agent = Agent(
    model=Gemini(id="gemini-2.0-flash-lite", api_key=os.environ["GOOGLE_API_KEY"]),
    tools=[dummy_tool],
)

initial_messages = [
    {"role": "system", "content": "You can call dummy_tool to echo."},
    {"role": "user", "content": "Please echo back 'hello world'."},
]

gemini_response = agent.run(messages=initial_messages, timeout=10)

# Print out what Gemini produced
for m in gemini_response.messages:
    print(m.content)

# Switch the same agent over to an OpenAI backend
agent.model = OpenAIChat(id="gpt-4o", api_key=os.environ["OPENAI_API_KEY"])

# This will now error: the replayed Gemini tool message has list-shaped content
agent.run(messages=gemini_response.messages, timeout=10)
```
By following these steps, you'll be able to see the error in action and understand the context in which it occurs. This is a crucial step in debugging and finding a solution.
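If you need to unblock yourself before a proper fix lands, one workaround is to normalize the history yourself before handing it to the OpenAI-backed agent. The helper below is a sketch, not an official Agno API: it assumes the offending messages carry their payload as a one-element list of strings and that Agno's message objects expose a mutable `content` attribute, as the script above suggests.

```python
def flatten_tool_content(messages):
    """Workaround sketch: collapse one-element list content (as emitted by
    the Gemini backend) into a plain string so the OpenAI backend accepts it."""
    for m in messages:
        content = getattr(m, "content", None)
        if isinstance(content, list) and len(content) == 1 and isinstance(content[0], str):
            m.content = content[0]  # replace the list with its single string element
    return messages

# Usage before re-running with the OpenAI-backed agent:
# agent.run(messages=flatten_tool_content(gemini_response.messages), timeout=10)
```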
Agent Configuration
Let's take a closer look at the agent configuration that triggers this bug. The Python script defines an Agno `Agent` that's initially backed by Google Gemini. It sets up a simple `@tool` called `dummy_tool`, which is designed to echo its input. This tool is essential for demonstrating the tool-call message formatting issue.
```python
import os
import json

from agno.agent import Agent
from agno.models.google import Gemini
from agno.models.openai import OpenAIChat
from agno.tools import tool


@tool(
    name="dummy_tool",
    description="A test tool that echoes its input",
    show_result=True,
    stop_after_tool_call=False,
    requires_confirmation=False,
    cache_results=False,
)
def dummy_tool(query: str) -> str:
    # Echo the query back as a JSON string.
    return json.dumps({"echo": query})


# Start with a Gemini-backed agent that can call dummy_tool.
agent = Agent(
    model=Gemini(id="gemini-2.0-flash-lite", api_key=os.environ["GOOGLE_API_KEY"]),
    tools=[dummy_tool],
)

initial_messages = [
    {"role": "system", "content": "You can call dummy_tool to echo."},
    {"role": "user", "content": "Please echo back 'hello world'."},
]

gemini_response = agent.run(messages=initial_messages, timeout=10)

# Print out what Gemini produced
for m in gemini_response.messages:
    print(m.content)

# Switch the same agent over to an OpenAI backend
agent.model = OpenAIChat(id="gpt-4o", api_key=os.environ["OPENAI_API_KEY"])

# This will now error: the replayed Gemini tool message has list-shaped content
agent.run(messages=gemini_response.messages, timeout=10)
```
Here’s a breakdown of the key components:

- **`@tool` decorator:** This decorator from Agno transforms the `dummy_tool` function into a tool that the agent can use. The tool takes a string `query` as input and returns a JSON string echoing the query.
- **`Agent` instantiation:** An `Agent` is created, initially using the `Gemini` model (`gemini-2.0-flash-lite`). The API key is pulled from the environment variables. The `tools` parameter is set to a list containing our `dummy_tool`.
- **Initial messages:** A list of messages is defined to kickstart the conversation. The system message instructs the agent that it can call `dummy_tool`, and the user message asks the agent to echo back “hello world”.
- **`agent.run()`:** The agent is run with the initial messages. The `timeout` parameter is set to 10 seconds.
- **Printing Gemini’s output:** The messages produced by the Gemini agent are printed to the console. This is where we can observe the problematic `tool` message format.
- **Switching to OpenAI:** The agent’s model is then switched to `OpenAIChat` (`gpt-4o`), again using an API key from the environment.
- **Error trigger:** Finally, `agent.run()` is called again with the same messages. This is where the error occurs, as the OpenAI agent cannot process the Gemini-formatted tool-call message.
This configuration highlights the core issue: the incompatibility between Gemini’s tool-call message format and OpenAI’s expectations. By setting up this agent and running the script, we can clearly see the problem in action.
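When you run the walkthrough, it helps to print the role and the Python type of each message's content rather than the content alone, so the malformed entry stands out. Here's a small diagnostic loop over the same `gemini_response` (assuming, as the script does, that each message exposes `role` and `content`):

```python
# Print role, content type, and raw content for every replayed message;
# the problematic tool message shows a list where the others show strings.
for m in gemini_response.messages:
    print(m.role, type(m.content).__name__, repr(m.content))
```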
Expected Behavior
The expected behavior in this scenario is seamless compatibility between different language model backends within the Agno framework. Specifically, the messages generated by a Gemini agent should be fully digestible by an OpenAI-backed agent, without any hiccups. This is a core promise of Agno: the ability to mix and match LLMs without manual data massaging. In the case of tool calls, this means that the `tool` message produced by Gemini should be automatically normalized to a format that OpenAI understands. According to OpenAI’s Function-Calling specification, the tool-call message should look something like this:
```json
{
  "role": "function",
  "name": "dummy_tool",
  "content": "{\"echo\":\"hello world\"}"
}
```
Let's break down what this means:

- **`role`**: Set to `"function"`, marking this message as the result of a function (tool) call rather than a user or assistant turn.