S2E5 | What senses do agents need to act? — The Shift Podcast by Microsoft Azure

The Shift: Your open questions about agents, honest discussi... by Microsoft

Episode notes

For AI agents to move from reasoning to action, they need more than text alone. But what “senses” actually matter?

In this episode of The Shift Podcast: Agentic Edition, members of the Microsoft Foundry team discuss how multimodal inputs—such as text, vision, and speech—shape how agents perceive and interact with the world. The conversation explores what’s practical today, rather than assuming fully autonomous systems.

The discussion covers:

· Why multimodal AI expands what agents can understand.

· How vision, voice, and text models are combined in applications.

· The role of tools and APIs in enabling agent action.

· Where modality adds value—and where it introduces complexity.

Rather than framing modalities as future cap ... 

Keywords
generative AI, agents, agentic AI, Agentic Systems, ai agents, building agents, ai platforms, Multimodal AI, AI app development, API management, API orchestration