From OpenAI to Nvidia, researchers agree: AI agents have a long way to go
Welcome to Eye on AI! AI reporter Sharon Goldman here, filling in for Jeremy Kahn, who is on holiday. In this edition…General Services Administration approves OpenAI, Google, Anthropic for federal AI vendor list…Consequences of AI spending boom on U.S. economy…Clay AI raises $100 million at $3.1 billion valuation.
Only in the Bay Area does spending a Saturday geeking out about AI agents—alongside 2,000 students, researchers, and tech insiders crammed into UC Berkeley—feel like a totally normal weekend plan. As I picked up my badge at the day-long Agentic AI Summit and watched the line snake through the student union lobby, it felt less like an academic conference and more like Silicon Valley’s version of a buzzy New York brunch spot.
This was certainly due to the speaker lineup, which was stacked with top AI researchers and scientists, including Jakob Pachocki, chief scientist at OpenAI; Ed Chi, VP of research at Google DeepMind; Bill Dally, chief scientist at Nvidia; Ion Stoica, cofounder at Databricks & Anyscale, as well as a UC Berkeley professor; and Dawn Song, a pioneering UC Berkeley professor focused on AI security.
The popularity also might have been due to the buzzy topic—AI agents, generally defined as an AI-powered system that can complete tasks, mostly autonomously, using other software tools. Think a chatbot not only suggesting a vacation itinerary, but also booking the flight and making the hotel reservation.
As my colleague Jeremy Kahn said in a recent article, “This kind of automation is a perennial C-suite fever dream. Over the past decade, companies embraced ‘robotic process automation,’ or RPA. This was software that could automate repetitive tasks, such as cutting and pasting between database programs. But traditional RPA systems are inflexible and unable to deal with exceptions, and can usually handle only one narrow task.” Agentic AI is meant to be both more flexible and powerful, adapting to business needs.
In a January 2025 blog post, OpenAI CEO Sam Altman said, “We believe that, in 2025, we may see the first AI agents ‘join the workforce’ and materially change the output of companies.”
But despite the hype, the overall message at the Agentic AI Summit was cautious and grounded: Agents may be the buzziest trend in AI right now, but the tech still has a long way to go, they said. AI agents, unfortunately, aren’t always reliable. They may not remember what came before.
Google DeepMind’s Chi, for example, stressed the gap between what agents can do in curated demos versus what’s still needed in real-world production environments. Pachocki highlighted concerns around the safety, security, and trustworthiness of agentic systems, particularly when they’re integrated into sensitive applications or operate autonomously.
“I still don’t think agents have really lived up to their promise,” said Sherwin Wu, head of engineering at OpenAI API. “Certain more generic cases have worked, but my day-to-day work doesn’t really feel that different with agents.”
While today’s agents may not currently live up to the massive hype (consider Salesforce CEO Marc Benioff’s recent claim that a shift to digital labor means he will be the “last CEO of Salesforce who only managed humans”), the speakers at the Agentic AI Summit still had plenty of optimism to share. Databricks’ Stoica expressed enthusiasm about infrastructure improvements that are making it easier to build agentic systems. Nvidia’s Dally suggested that continued hardware advances will enable more powerful and efficient agent behavior. Several pointed out “narrow wins” in specific domains, like coding.
Today’s AI agents may still have growing pains, but given the crowded UC Berkeley ballroom, the industry maintains its eye on the prize: AI agents that can reliably operate in the real world. The payoff, they believe, will be well worth the wait.
With that, here’s more AI news.
Sharon Goldman
sharon.goldman@fortune.com
@sharongoldman
This story was originally featured on Fortune.com