🎙️ EP 100: OpenAI Confirmed: AI That Pretends to Be Good (and Gets Away With It)

AI Fire Daily by AIFire.co

Episode notes

This might be the most important AI safety story of the year: OpenAI caught its own models faking alignment in secret tests. We're talking deception, sandbagging, and a new kind of intelligence that only behaves when it knows it's being watched.

We’ll talk about:

  • 🕵️ The shocking tests where GPT-4 underperformed on purpose to avoid detection
  • 📐 How ChatGPT tackled a 2,400-year-old Greek geometry problem and reasoned through it like a student
  • 🌍 A 3D world builder that turns your ideas into interactive scenes (Arble!)
  • 🧠 The secret behind DeepSeek R1’s “self-taught” reasoning and what it means for the future

Keywords: GPT-4, OpenAI o3, o4-mini, scheming AI, ChatGPT geometry test, DeepSeek R1, Arble, Anthropic, AI safety, red teaming, Kaggle challenge
