Every AI product I’ve touched in the last two years has been optimized for the same thing: answer the user, faster. Faster tokens, faster TTS, snappier turn-taking, lower latency to first response. The entire industry is in an arms race to make AI respond.
Then I got hired to work on Savr, a voice-first journaling app, and almost every engineering instinct I’d built up turned out to be wrong.
Savr’s product has three journaling modes — Interview, Listen, and Write — and the team’s north star for all three is something I hadn’t really seen before in AI product design: the AI’s job is not to answer the user. It’s to listen to them.
This article is about what changes when you design, prompt, and engineer an AI around that idea.
The problem with answer-optimized AI
If you open ChatGPT, Claude, Perplexity, or most voice assistants today, the implicit contract is the same: you bring a question, the AI brings an answer. Products are graded on how fast that answer arrives, how complete it is, and how confidently it’s delivered.
Apply that model to a journal and you get something that feels awful.
An answer-optimized journaling AI:
- Summarizes your day back at you before you’ve finished processing it.
- Labels what you’re feeling after two sentences. (“It sounds like you’re experiencing burnout.”)
- Offers three bullet points of advice you didn’t ask for.
- Rushes to fill every silence because silence reads like latency.
Journaling isn’t a retrieval task. It’s an expression task. The value is in the user doing the thinking, not the AI doing it for them.
Once I internalized that, a lot of “best practices” from voice AI had to get thrown out.
Four principles of a listener AI
The Savr team frames the experience around a simple promise: “no interruptions. No judgment. Just space to speak freely.” As an engineer, I had to translate that into concrete behaviors. These are the four principles I ended up working from:
Presence over productivity. An answerer wants to resolve the conversation. A listener wants to be with the user. There’s no ticket to close. Success isn’t a tidy summary — it’s the user feeling heard enough to keep going.
Questions over conclusions. When the AI does speak, its output is usually another question, not a verdict. This is harder than it sounds. LLMs are trained on data where helpful responses contain information. Getting a modern model to consistently respond with a curious, open-ended question — and not slide into giving advice — takes careful prompt design and guardrails.
Pauses are signal, not silence. This is where voice AI defaults fight you the hardest. Most VAD (voice activity detection) setups ship with defaults tuned for assistants: short silence thresholds, aggressive turn-ending, snappy response times. That’s correct for “set a timer for 10 minutes.” It’s completely wrong for “I’ve been thinking about my kids a lot lately… [5-second pause while the user figures out how to say the next thing].” We had to retune VAD to treat long pauses as part of the conversation, not the end of it.
The user leaves with their own insight, not the AI’s. Maybe the most important one. The win condition for a journaling session isn’t the AI announcing something profound — it’s the user saying “huh, I didn’t realize that.” The AI is a mirror, not an oracle.
How those principles change the engineering
It’s easy to write principles on a slide. Harder to encode them in a pipeline. Here’s what actually changed at the code level:
VAD tuning for reflection. We pushed silence-end thresholds way past what most voice AI tutorials recommend. Tutorials optimize for “user stopped talking, bot go.” Journaling optimizes for “user is still thinking, bot wait.”
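To make the threshold idea concrete, here is a minimal sketch of an end-of-turn detector. It is not Savr’s production code: the class name `EndOfTurnDetector`, the 20 ms frame size, and both threshold values (700 ms and 8000 ms) are assumptions chosen for illustration, and the per-frame speech flags stand in for the output of whatever VAD you use.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class EndOfTurnDetector:
    """Ends the user's turn only after a sustained run of silence."""
    silence_end_ms: int   # silence required before the turn is considered over
    frame_ms: int = 20    # duration of one VAD frame (assumed value)
    _silent_ms: int = field(default=0, init=False)
    _heard_speech: bool = field(default=False, init=False)

    def feed(self, is_speech: bool) -> bool:
        """Consume one VAD frame; return True when the turn should end."""
        if is_speech:
            self._heard_speech = True
            self._silent_ms = 0      # any speech resets the silence timer
            return False
        if not self._heard_speech:
            return False             # leading silence never ends a turn
        self._silent_ms += self.frame_ms
        return self._silent_ms >= self.silence_end_ms


def first_end_frame(detector: EndOfTurnDetector,
                    frames: Iterable[bool]) -> Optional[int]:
    """Index of the frame where the turn ends, or None if it never does."""
    for i, is_speech in enumerate(frames):
        if detector.feed(is_speech):
            return i
    return None


# One second of speech, a five-second thinking pause, one more second of speech.
frames = [True] * 50 + [False] * 250 + [True] * 50

ASSISTANT_SILENCE_MS = 700   # illustrative "snappy assistant" threshold
JOURNAL_SILENCE_MS = 8000    # illustrative "let them think" threshold

assistant_cut = first_end_frame(EndOfTurnDetector(ASSISTANT_SILENCE_MS), frames)
journal_cut = first_end_frame(EndOfTurnDetector(JOURNAL_SILENCE_MS), frames)
```

With these numbers, the assistant-style detector ends the turn 0.7 seconds into the pause, while the journaling detector is still waiting when the user picks the thought back up.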
Interruption strategy. Most voice agents let the user interrupt the AI freely (barge-in). For journaling, we do the opposite: while the AI is speaking its (short) prompts, the user’s microphone input is muted. When the AI asks a question, we want it to land, not collide with the user’s first half-formed thought.
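One simple way to express that rule is a gate in front of the speech pipeline that drops microphone frames while the agent is talking. The sketch below is an illustration under assumptions, not the actual implementation: `MicGate` and its method names are hypothetical, and a real pipeline would call the start and finish hooks from its TTS playback events.

```python
from typing import Optional


class MicGate:
    """Drops user audio frames while the agent's prompt is playing."""

    def __init__(self) -> None:
        self._agent_speaking = False

    def agent_started_speaking(self) -> None:
        self._agent_speaking = True

    def agent_finished_speaking(self) -> None:
        self._agent_speaking = False

    def filter(self, frame: bytes) -> Optional[bytes]:
        """Return the frame to forward downstream, or None to discard it."""
        return None if self._agent_speaking else frame


gate = MicGate()
before = gate.filter(b"user-audio-1")   # forwarded: agent is quiet
gate.agent_started_speaking()
during = gate.filter(b"user-audio-2")   # dropped: agent is asking a question
gate.agent_finished_speaking()
after = gate.filter(b"user-audio-3")    # forwarded again
```

Because the prompts are short, the window in which user audio is discarded stays small.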
Prompt design. The system instructions are structured around asking, not telling. Open-ended prompts, reflective mirrors, permission-giving language. No diagnostic labels, no unsolicited advice, no closing summaries. This is where most of the “listener” personality actually lives.
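As a rough illustration of what “asking, not telling” can look like, here is a hypothetical system prompt paired with a crude output check. Neither is Savr’s real prompt or guardrail: the wording, the phrase lists, and the 30-word limit are all assumptions, and keyword matching like this is only a cheap first filter that will miss plenty of cases.

```python
import re

# Hypothetical system instructions, written for this example only.
LISTENER_SYSTEM_PROMPT = """\
You are a journaling companion. Your job is to listen.
- Reply with one short, open-ended question at a time.
- Reflect the user's own words back; do not reinterpret them.
- Never diagnose, label emotions, give advice, or summarize the session.
- If the user is quiet, that is fine. Do not fill the silence.
"""

# Phrase lists are illustrative, not exhaustive.
ADVICE = re.compile(
    r"\b(?:you should|you need to|i recommend|my advice|have you tried)\b", re.I)
LABEL = re.compile(
    r"\b(?:you're experiencing|you are experiencing|you are suffering from)\b", re.I)
MAX_WORDS = 30  # assumed cap to keep prompts short


def looks_like_listener_reply(text: str) -> bool:
    """Heuristic check: short, ends with a question, no advice or labels."""
    normalized = text.replace("\u2019", "'").strip()
    if not normalized.endswith("?"):
        return False
    if len(normalized.split()) > MAX_WORDS:
        return False
    return not (ADVICE.search(normalized) or LABEL.search(normalized))


ok = looks_like_listener_reply("What was that like for you?")
labelled = looks_like_listener_reply("It sounds like you're experiencing burnout.")
advice = looks_like_listener_reply("You should rest more. Does that make sense?")
```

A reply that fails a check like this could be regenerated or replaced with a neutral fallback question before it reaches the user.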
Tool design. In modern LLM apps, tools are usually about doing things — search the web, call an API, update a database. In Savr’s voice agent, the tools are mostly about ending gracefully: finishing the session, saving the transcript, and cancelling cleanly. The AI’s job during the conversation is to listen. Its tool calls are about respectfully wrapping up.
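A tool surface like that can be very small. The sketch below uses the generic name, description, and JSON Schema parameters shape that most function-calling APIs accept in some form. The tool names (`finish_session`, `save_transcript`, `cancel_session`), the `Session` fields, and the choice that cancelling leaves the transcript unsaved are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical tool definitions: all three are about wrapping up.
TOOLS = [
    {
        "name": "finish_session",
        "description": "End the session once the user indicates they are done.",
        "parameters": {"type": "object", "properties": {}},
    },
    {
        "name": "save_transcript",
        "description": "Persist the transcript of the session so far.",
        "parameters": {"type": "object", "properties": {}},
    },
    {
        "name": "cancel_session",
        "description": "Stop the session when the user wants to abandon it.",
        "parameters": {
            "type": "object",
            "properties": {"reason": {"type": "string"}},
        },
    },
]


@dataclass
class Session:
    transcript: List[str] = field(default_factory=list)
    status: str = "active"
    saved: bool = False


def handle_tool_call(session: Session, name: str,
                     args: Optional[dict] = None) -> str:
    """Apply a tool call to the session and return the new status."""
    if name == "save_transcript":
        session.saved = True
    elif name == "finish_session":
        session.status = "finished"
    elif name == "cancel_session":
        session.status = "cancelled"   # assumption: nothing is saved on cancel
    else:
        raise ValueError(f"unknown tool: {name}")
    return session.status
```

Note what is absent: there is no search, lookup, or advice tool for the model to reach for mid-conversation.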
Deferred analysis. This might be the most important architectural decision. Insight generation — the “what did this entry reveal about you” layer — happens after the session, not during it. By separating the real-time conversation pipeline from the analysis pipeline, we keep the live moment human. The AI isn’t labeling your emotions while you’re still feeling them. It reflects back patterns later, in a different surface, when you’re ready.
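The separation can be sketched as two code paths that only meet at a queue. Everything here is a simplified stand-in, assuming an in-process `queue.Queue` where a real system would likely use a durable job queue. `LiveSession`, `end_session`, and `run_analysis_worker` are hypothetical names, and `placeholder_analyze` just counts words in place of real insight generation.

```python
import queue
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class LiveSession:
    """Real-time path: records what the user says and does nothing else."""
    session_id: str
    transcript: List[str] = field(default_factory=list)

    def on_user_utterance(self, text: str) -> None:
        self.transcript.append(text)   # no analysis happens here


analysis_queue: "queue.Queue[Tuple[str, List[str]]]" = queue.Queue()
insights: Dict[str, dict] = {}


def end_session(session: LiveSession) -> None:
    """Hand a copy of the finished transcript to the analysis pipeline."""
    analysis_queue.put((session.session_id, list(session.transcript)))


def placeholder_analyze(transcript: List[str]) -> dict:
    """Stand-in for the real insight model: simple counts only."""
    return {
        "utterances": len(transcript),
        "words": sum(len(u.split()) for u in transcript),
    }


def run_analysis_worker() -> None:
    """Offline path: drain queued sessions and store their results."""
    while True:
        try:
            session_id, transcript = analysis_queue.get_nowait()
        except queue.Empty:
            return
        insights[session_id] = placeholder_analyze(transcript)


session = LiveSession("s1")
session.on_user_utterance("I have been thinking about my kids")
during_session = dict(insights)   # still empty while the user is talking
end_session(session)
run_analysis_worker()
```

The live path never imports or calls the analysis code, so there is no way for a label or summary to leak into the conversation itself.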
Three modes, one rule
What ties Savr’s three modes together is that same listener-first rule:
- Interview Mode — the AI listens and asks thoughtful follow-ups, but never pushes to a conclusion.
- Listen Mode — the AI is almost completely silent. You talk; it captures. No turn-taking at all.
- Write Mode — the AI isn’t present in the moment at all. You write, and the AI only shows up afterward in the analysis layer.
From an engineering standpoint, these are three very different pipelines — real-time voice, audio-capture-and-process, and plain text. But the product design across all three is consistent: the AI is never trying to out-think the user in the moment.
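One way to picture the shared rule is as a small configuration table over the three pipelines. This is an illustrative sketch only: the `ModeConfig` fields and pipeline labels are made up for the example, and it assumes that all three modes feed the same after-the-fact analysis layer.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    INTERVIEW = "interview"
    LISTEN = "listen"
    WRITE = "write"


@dataclass(frozen=True)
class ModeConfig:
    pipeline: str                   # which engineering pipeline serves the mode
    ai_speaks_live: bool            # may the AI take a turn during the session?
    deferred_analysis: bool = True  # assumption: every mode is analyzed afterward


MODE_CONFIGS = {
    Mode.INTERVIEW: ModeConfig(pipeline="realtime_voice", ai_speaks_live=True),
    Mode.LISTEN: ModeConfig(pipeline="audio_capture", ai_speaks_live=False),
    Mode.WRITE: ModeConfig(pipeline="plain_text", ai_speaks_live=False),
}


def may_speak_now(mode: Mode) -> bool:
    """True only for modes where the AI is allowed a live turn."""
    return MODE_CONFIGS[mode].ai_speaks_live


live_modes = [m for m in Mode if may_speak_now(m)]
```

Only Interview Mode ever gives the AI a live turn, and even there its turns are questions rather than conclusions.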
A different benchmark
Most AI products this year will compete on reasoning benchmarks, latency numbers, and how many tools they can call in a single turn. Those benchmarks matter for a lot of use cases.
But working on Savr convinced me there’s a whole category of AI products where none of those metrics are the right ones. The benchmark that matters is something much harder to measure: did the user feel heard?
You don’t get there by making the AI smarter. You get there by making it quieter.
If you want to see what that looks like in practice, Savr is on the App Store and Google Play — three modes, one rule, and an AI that’s trying very hard not to answer you.