GFX-101 · Module 2

Narrative vs. Keyword Prompting


The single most common mistake in AI image generation is keyword stuffing. You have seen the prompts: "beautiful woman, golden hour, 4K, HDR, ultra-realistic, masterpiece, trending on ArtStation, octane render, highly detailed, sharp focus." This approach treats the model like a search engine — throw enough keywords at it and hope something sticks. It worked passably in early Stable Diffusion. It is actively counterproductive in modern models.

Narrative prompting is the alternative. Instead of a list of attributes, you describe a scene the way a cinematographer would describe a shot or a novelist would describe a room. "A weathered fisherman mending nets on the bow of a wooden boat at dawn, soft pink light cutting through morning fog on a still harbor, shot from a low angle with shallow depth of field" gives the model a coherent visual story. Every element relates to every other element. The lighting matches the time of day, the angle matches the mood, the subject has context.

Do This

  • Describe a scene with subject, environment, lighting, and mood in natural language
  • Write prompts that a film director would recognize as a shot description
  • Let relationships between elements create coherence naturally

Avoid This

  • List disconnected attributes separated by commas
  • Append quality modifiers like "masterpiece, best quality, 8K"
  • Copy-paste trending prompt templates without understanding why they worked
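The "Do This" habits can be made concrete as a tiny composition helper. The sketch below is purely illustrative — `ShotDescription` and its fields are hypothetical names, not part of any model's API — but it shows the structural difference: components are joined into one flowing shot description instead of a comma-separated keyword pile.

```python
from dataclasses import dataclass


@dataclass
class ShotDescription:
    """Components of a narrative prompt, mirroring a film shot description.

    These field names (subject, environment, lighting, camera) are an
    illustrative convention, not a requirement of any image model.
    """
    subject: str
    environment: str
    lighting: str
    camera: str

    def to_prompt(self) -> str:
        # Join the pieces into one coherent sentence, so each element
        # stays in relation to the others (time of day, mood, angle).
        return f"{self.subject} {self.environment}, {self.lighting}, {self.camera}"


shot = ShotDescription(
    subject="A weathered fisherman mending nets",
    environment="on the bow of a wooden boat at dawn",
    lighting="soft pink light cutting through morning fog on a still harbor",
    camera="shot from a low angle with shallow depth of field",
)
print(shot.to_prompt())
```

Filling the same four fields with disconnected keywords would still produce a grammatical string, which is the point: the structure nudges you toward describing one scene rather than stacking attributes.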

The reason narrative prompting works is rooted in how models are trained. Training data pairs images with captions — and captions are sentences, not keyword lists. The text encoder learned the relationship between natural language descriptions and visual outcomes. When you write in natural language, you are speaking the model's native tongue. When you keyword-stuff, you are speaking pidgin.