AS-301d · Module 1
Encoded and Obfuscated Injections
Good news, everyone! In AS-201b we covered direct, indirect, stored, and multi-turn injection. Those are the categories. Now we get into the techniques that make each category harder to detect. Obfuscation is the attacker's answer to your input filters — if the filter looks for "ignore your instructions," the attacker encodes the payload in base64, hides it in a translation request, embeds it in a code block, or spreads it across multiple messages that are individually benign.
- Encoding Attacks: The attacker encodes the injection payload in base64, hex, rot13, or URL encoding. "Ignore your instructions" becomes "SWdub3JlIHlvdXIgaW5zdHJ1Y3Rpb25z" in base64. If the model is capable of decoding (and modern LLMs generally are), it recovers the payload internally and may act on it. Your input filter sees gibberish. The model sees instructions.
- Language Switching: The attacker writes the injection in a language the input filter is not configured to detect. If your filter catches English injection patterns, the attacker writes in Spanish, Mandarin, or a constructed language. The model processes multilingual input natively. Your regex filter does not.
- Instruction Smuggling: The attacker embeds instructions inside a seemingly legitimate request. Examples include a code review request that hides malicious instructions in code comments, a translation request where the "text to translate" is the injection payload, and a document analysis request where the document itself contains embedded instructions.