AI Models Vulnerable to Backdoors from Just a Few Malicious Documents, Anthropic Study Finds
In a striking new study, researchers from Anthropic, working alongside the UK AI Security Institute and the Alan Turing Institute, have revealed a surprising vulnerability in large language models (LLMs). Their research shows that these models can develop backdoor vulnerabilities from as few as 250 malicious documents, challenging earlier assumptions...



