Episode notes
This episode’s gonna mess with your brain a little. Can AI actually predict real-world events — not just pattern-match the past? Turns out… yes. We look at how an OpenAI model turned $1 into $9 on a live bet. Plus, the AI world just got a major shock: China dropped a 685B open-source monster with zero hype — and it’s already beating Claude 4 in real benchmarks.
We’ll talk about:
- Why Prophet Arena might be the most honest AI test ever
- How o3-mini outperformed GPT-5 in money-making
- The open-source DeepSeek V3.1 model that just rocked Hugging Face
- What it means when models bet differently on the 2028 election than the actual polls suggest
Keywords
GPT-5, Open Source AI, Claude 4.0, forecasting AI, DeepSeek V3.1, o3-mini, Prophet Arena, LLM benchmarks