Poison an AI with a Secret Trigger Phrase? What Chatbot "Backdoor" Really Means
Can you really hijack an LLM just by repeating a secret activation word? A recent research paper made headlines claiming that a language model can be corrupted with only a few hundred documents—and a “hidden backdoor” can then be opened with a specific codeword. In this episode, we break down what the paper actually showed, what a real “backdoor” means in machine learning, and why everyday chatbots aren’t secretly learning from you, storing memories, or being corrupted by your conversations.
Previous
Google Willow quantum supremacy—Computing revolution, or just neat lab trick?
Next