AS-301d · Module 1

Encoded and Obfuscated Injections

4 min read

Good news, everyone! In AS-201b we covered direct, indirect, stored, and multi-turn injection. Those are the categories. Now we get into the techniques that make each category harder to detect. Obfuscation is the attacker's answer to your input filters — if the filter looks for "ignore your instructions," the attacker encodes the payload in base64, hides it in a translation request, embeds it in a code block, or spreads it across multiple messages that are individually benign.

Encoding Attacks The attacker encodes the injection payload in base64, hex, rot13, or URL encoding. "Ignore your instructions" becomes "SWdub3JlIHlvdXIgaW5zdHJ1Y3Rpb25z" in base64. If the model is capable of decoding — and modern LLMs are — the encoded payload is decoded and executed internally. Your input filter sees gibberish. The model sees instructions.
Language Switching The attacker writes the injection in a language the input filter is not configured to detect. If your filter catches English injection patterns, the attacker writes in Spanish, Mandarin, or a constructed language. The model processes multilingual input natively. Your regex filter does not.
Instruction Smuggling The attacker embeds instructions inside a seemingly legitimate request — a code review request that includes malicious instructions in code comments, a translation request where the "text to translate" is the injection payload, or a document analysis request where the document contains embedded instructions.