How Splitting “Thinking” from “Formatting” Improved Our AI Pipeline
Bradley Smith · CTO & Co-Founder · February 17, 2026
Our prompts were killing our output.
If you are building LLM pipelines and getting mediocre results, check whether you are asking the model to think and format at the same time.
We were. And it was quietly degrading everything the model produced.
At Threadline Studio, we build AI editing tools for professional video production. Our pipeline reads hours of interview transcripts and selects the best clips for a first cut.
Early on, we asked one LLM call to do both jobs at once. The creative work: decide what story to tell. And the structural work: output precise JSON with clip references and timestamps.
The model kept compromising on both. We would get interesting creative choices with broken references. Or perfect JSON with boring, safe selections.
The Research Behind the Fix
Turns out this is not just us.
Researchers at Appier AI Research and National Taiwan University published a study ("Let Me Speak Freely?", 2024) showing that format restrictions significantly degrade LLM reasoning ability. The stricter the output constraints, the worse the thinking gets.
Their recommended fix: let the model reason in natural language first, then convert to structured format in a second pass.
That is exactly what we landed on independently before finding the paper.
The Writer and the Clerk
Now our pipeline has two distinct roles.
The "Writer" reasons freely about narrative, story beats, and emotional pacing with no formatting constraints at all. It thinks about what makes a compelling story from the raw material.
Then the "Clerk" takes those creative decisions and maps them to precise JSON with clip IDs, timestamps, and sequence metadata.
The Writer never sees a schema. The Clerk never makes a creative choice.
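In code, the split is just two sequential model calls with very different prompts. Here is a minimal sketch of the pattern; `call_llm` is a hypothetical stand-in (stubbed here so the example runs offline), and the prompts, clip IDs, and JSON keys are illustrative, not Threadline's actual schema:

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical model call; swap in your provider's SDK.
    Stubbed with canned responses so this sketch runs without a network."""
    if "story" in prompt.lower():
        return ("Open on the founder describing the near-failure, "
                "cut to the investor's reaction, close on the product demo.")
    return json.dumps({"clips": [
        {"clip_id": "int_04", "start": "00:12:31", "end": "00:13:02"},
        {"clip_id": "int_07", "start": "00:41:10", "end": "00:41:44"},
    ]})

def writer(transcript: str) -> str:
    """Pass 1: free-form creative reasoning. No schema in sight."""
    prompt = (
        "You are a documentary editor. Read this transcript and describe, "
        "in plain prose, the most compelling story to tell:\n\n" + transcript
    )
    return call_llm(prompt)

def clerk(plan: str) -> dict:
    """Pass 2: mechanical conversion of the plan into strict JSON."""
    prompt = (
        "Convert this editing plan into JSON with keys clip_id, start, end. "
        "Do not change any creative decisions:\n\n" + plan
    )
    return json.loads(call_llm(prompt))

plan = writer("...hours of interview transcript...")
cut = clerk(plan)
print(cut["clips"][0]["clip_id"])
```

The key design point is that the schema only ever appears in the second prompt, so the first call is free to reason without a validator looking over its shoulder. In production you would also validate the Clerk's JSON and retry that step alone on failure, leaving the creative plan untouched.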
Quality jumped immediately. The model started making bolder editorial decisions because it was not trying to simultaneously satisfy a JSON validator.
Practical Takeaways
If you are running LLM pipelines where the output feels "safe" or generic, look at whether you are asking the model to reason and format in the same breath. Splitting those two jobs might be the cheapest quality upgrade you can make.
The pattern generalizes beyond video editing. Any pipeline where you need both creative judgment and structured output will likely benefit from separating the two steps.
