Novel TokenBreak Attack Method Can Bypass LLM Security Features
Researchers with HiddenLayers uncovered a new vulnerability in LLMs called TokenBreak, which could enable an attacker to get around content moderation features in many models simply by adding a few characters to words in a prompt.