Synthetic Media at Scale: Building the Prompt Stack
When a client asks for a product campaign, the old workflow was: find a photographer, find a director, find a location, find talent, book everyone, shoot, edit. The new workflow: write a brief, run the prompt stack, deliver in 72 hours.
That's not always the right call. But it's increasingly often the right call for the right client.
Here's how we built the pipeline.
The Four-Tool Stack
Each tool does one thing well. We don't try to make one model do everything.
Midjourney handles stills. Brand imagery, product shots, key visuals. The model's spatial reasoning and grasp of brand aesthetics are still unmatched for static images. We use reference images when a client has an established visual language.
Kling 1.6 handles video-from-image. We generate a strong Midjourney frame and animate it in Kling. This gives precise control over the first frame — critical for product work where consistency matters. Kling's physics simulation has improved enough that liquid, fabric, and particle effects now hold up at delivery quality.
Higgsfield Cinema handles narrative video. When we need something that feels shot rather than animated, Higgsfield is the right tool. Documentary feel, news-style footage, establishing shots.
ElevenLabs handles audio. We're not scoring original music on every brief. For corporate and social content, ElevenLabs generates VO that matches the visual tone, mixed with licensed music beds.
The Prompt Stack Document
We maintain a living document of prompts that have shipped in client work. Not abstract descriptions — specific, versioned, annotated prompts that work.
Structure for each saved prompt:
Tool: Midjourney v6.1
Prompt: Industrial port at golden hour, shipping containers stacked,
crane in background, moody sky, editorial photography style,
wide angle, sharp foreground --ar 16:9 --style raw
Reference: /refs/port-series-01.jpg
Rating: 4/5
Used for: MendisOne Shipping Campaign Q1 2026
Date: 2026-02-14
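An entry like the one above maps naturally to structured data. A minimal sketch in Python — the field names mirror the entry format shown here, not any real internal schema:

```python
from dataclasses import dataclass

@dataclass
class SavedPrompt:
    tool: str       # e.g. "Midjourney v6.1"
    prompt: str     # full prompt text, including flags
    reference: str  # path to a reference image, if any
    rating: int     # 1-5, set at delivery review
    used_for: str   # campaign the prompt shipped on
    date: str       # ISO date of last use

port = SavedPrompt(
    tool="Midjourney v6.1",
    prompt=("Industrial port at golden hour, shipping containers stacked, "
            "crane in background, moody sky, editorial photography style, "
            "wide angle, sharp foreground --ar 16:9 --style raw"),
    reference="/refs/port-series-01.jpg",
    rating=4,
    used_for="MendisOne Shipping Campaign Q1 2026",
    date="2026-02-14",
)
```

Keeping entries this regular is what makes the stack searchable six months in.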
After six months, we have 340 prompts in the stack. When a new brief comes in, we search by category before writing anything new. Most briefs have a match or near-match that gets us 80%+ of the way there.
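At 340 entries, the lookup step needs nothing more than a keyword filter. A sketch, assuming each saved prompt is a dict with the fields shown above (the helper name and sample data are ours):

```python
def search_stack(stack, *keywords):
    """Return saved prompts whose text or campaign matches every keyword,
    best-rated first."""
    kws = [k.lower() for k in keywords]
    hits = [
        entry for entry in stack
        if all(k in (entry["prompt"] + " " + entry["used_for"]).lower()
               for k in kws)
    ]
    return sorted(hits, key=lambda e: e["rating"], reverse=True)

stack = [
    {"prompt": "Industrial port at golden hour, shipping containers stacked",
     "used_for": "MendisOne Shipping Campaign Q1 2026", "rating": 4},
    {"prompt": "Container ship at night, rain, harsh sodium lighting",
     "used_for": "MendisOne Shipping Campaign Q1 2026", "rating": 3},
]
print([e["rating"] for e in search_stack(stack, "port", "golden")])  # → [4]
```

Anything fancier — embeddings, tags, a database — can wait until simple string matching stops finding the near-matches.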
QA Before Delivery
AI-generated content fails in specific ways. Our checklist before client review:
- Temporal consistency: does the footage drift mid-clip? Common in Kling on durations over 4 seconds
- Edge artifacts: soft halos around subjects in composited shots
- Text in frame: any generated text reads correctly — models still hallucinate text
- Colour accuracy: Midjourney's colour interpretation vs. client brand guide
- Audio sync: ElevenLabs VO timing against video cuts
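Two of these checks are mechanical enough to script before a human ever looks at the clip. A sketch of an automated pre-screen — the metadata fields and the 4-second threshold come from the checklist above; everything else is illustrative, not a tool limit:

```python
def prescreen(clip):
    """Flag generations that need closer human QA.
    `clip` is a dict of render metadata; field names are illustrative."""
    flags = []
    # Temporal drift risk: Kling clips beyond ~4 s get a frame-by-frame check
    if clip["tool"].startswith("Kling") and clip["duration_s"] > 4.0:
        flags.append("check temporal consistency")
    # Audio sync: a VO track longer than the clip it sits on can't ship
    if clip.get("vo_duration_s", 0) > clip["duration_s"]:
        flags.append("VO overruns clip")
    return flags

print(prescreen({"tool": "Kling 1.6", "duration_s": 6.2,
                 "vo_duration_s": 5.8}))
# → ['check temporal consistency']
```

Edge artifacts, hallucinated text, and colour accuracy still need eyes; the script just decides which clips get them first.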
We reject roughly 20% of generations at QA and regenerate. That cost is baked into project estimates.
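The 20% rejection rate translates directly into the estimate: to land N deliverable assets you budget roughly N / (1 − 0.2) generations. A quick sketch of that arithmetic (function name and sample numbers are ours):

```python
import math

def generations_needed(deliverables, reject_rate=0.20):
    """Expected generations to yield `deliverables` passing assets,
    assuming a flat per-generation rejection rate."""
    return math.ceil(deliverables / (1 - reject_rate))

print(generations_needed(40))  # → 50
```

So a 40-asset campaign is estimated as 50 generations' worth of compute and review time, not 40.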
What the Client Sees
We don't explain the pipeline in proposals. Clients brief us on the outcome. If the outcome delivers — and it does — the method is our business.
We've had clients ask to license images we generated because they wanted them for print. That's the right result.