Note sull'episodio
What if I told you that a few hundred poisoned documents could break models as big as GPT-4 or Claude? đľ Anthropic just proved it. Their new paper shows that just 250 samples can secretly backdoor any LLM, no matter the size. In todayâs episode, we unpack this wild discovery, why it changes AI security forever, and what it means for the future of open-web training.
Weâll talk about:
- How Anthropicâs team used 250 poisoned docs to make 13B-parameter models output gibberish on command
- Why bigger models donât mean safer models and why scale canât protect against poison
- The rise of TOUCAN, the open dataset from MIT-IBM thatâs changing how AI agents learn real-world tools
- The new AI race: from Jony Iveâs âanti-iPhoneâ with OpenAI to Amazonâs Quick Suite for busi ...Â
Parole chiave
AnthropicAI AgentsClaudeOpenAIGoogle GeminiAI securityData SecurityTOUCAN datasetbackdoor attacks