If You're Building LLM Pipelines and Getting Mediocre Results, Read This
Bradley Smith · CTO & Co-Founder · February 28, 2026
If you're building LLM pipelines and getting mediocre results, check whether you're asking the model to think and format at the same time.
We were. It was killing our output quality.
We build AI editing tools for documentary filmmakers at Threadline Studio. Our pipeline reads hours of interview transcripts and selects the best clips for a rough cut.
Early on, we asked one LLM call to do both:
Creative work: decide what story to tell.
Structural work: output precise JSON with clip references and timestamps.
The model kept compromising on both. Interesting creative choices with broken references. Or perfect JSON with boring, safe selections.
Turns out this isn't just us.
Researchers at Appier AI Research and National Taiwan University published a study showing that format restrictions significantly degrade LLM reasoning ability. The stricter the output constraints, the worse the thinking gets.
Their recommended fix: let the model reason in natural language first, then convert to structured format in a second pass.
That's exactly the design we had landed on independently, before we found the paper.
Now our pipeline has two distinct roles. The "Writer" reasons freely about narrative, story beats, and emotional pacing with no formatting constraints. Then the "Clerk" takes those creative decisions and maps them to precise JSON with clip IDs and timestamps.
The Writer never sees a schema. The Clerk never makes a creative choice.
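Here's a minimal sketch of that split in Python. `call-llm`-style details are hypothetical: the `llm` callable stands in for whatever model API you use, and the prompts and clip schema are illustrative, not our production code.

```python
import json


def writer(transcript: str, llm) -> str:
    """Creative pass: reason freely about the story. No schema, no JSON."""
    prompt = (
        "You are a documentary editor. Read the interview transcript below "
        "and describe, in plain prose, which moments belong in the rough cut "
        "and why. Do not worry about formatting.\n\n" + transcript
    )
    return llm(prompt)


def clerk(notes: str, llm) -> list:
    """Structural pass: map the Writer's prose onto a strict schema."""
    prompt = (
        "Convert the editorial notes below into a JSON array of objects "
        'with keys "clip_id", "start", and "end" (seconds). Output JSON '
        "only, no commentary.\n\n" + notes
    )
    # The schema constraint, and its validation, live only in this pass.
    return json.loads(llm(prompt))


def select_clips(transcript: str, llm) -> list:
    """Two-pass pipeline: creative reasoning first, structure second."""
    return clerk(writer(transcript, llm), llm)
```

In practice you'd also retry the Clerk when `json.loads` fails; that retry is cheap, because it re-runs only the formatting step, never the creative one.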
Quality jumped immediately. The model started making bolder editorial decisions because it wasn't trying to simultaneously satisfy a JSON validator.
If you're running LLM pipelines where the output feels "safe" or generic, look at whether you're asking the model to reason and format in the same breath. Splitting those two jobs might be the cheapest quality upgrade you can make.
Link to the paper: https://arxiv.org/abs/2408.02442
Have you run into this tension between reasoning quality and structured output in your own LLM work?
