Episode notes
The AI world just flipped. ChatGPT’s once-clear lead? Fading fast. A hacker used Claude to run a full-blown cybercrime spree. And a new benchmark just crushed most “smart” research agents.
We’ll talk about:
- Google Gemini & Grok catching up to ChatGPT in real time
- How a hacker used Claude to extort 17 companies, end to end
- AstaBench: the new gold standard for testing science agents
- Why a physics-powered robot brain just raised $405M (and what it can actually do)
Keywords: ChatGPT, Gemini, Grok, Claude, AstaBench, FieldAI, AI benchmarks, agentic workflows, Claude hack, GPT-5
Links:
- Newsletter: Si ...