Model selection and routing
Picking the right model per task, routing between providers, managing cost vs quality trade-offs at runtime.
Question by Mahmoud Zalt
AI Product Managers own the problem, users, and roadmap. AI Engineers own the model, infrastructure, and shipped behavior. Same product, different jobs.
An AI Product Manager owns the question of what to build and why, framed around real users and a measurable outcome. They translate fuzzy demand ("founders want help with marketing") into a sharp product hypothesis ("a solo founder will pay $29 a month to ship one campaign a week without hiring"), define the success metric, prioritize the roadmap, and write the spec the team executes against. The work is heavy on customer interviews, competitive mapping, pricing logic, retention analysis, and trade-off conversations about scope. What is different from a normal PM job is that the unit of value is non-deterministic: the model can fail in shapes a feature flag cannot. So an AI PM also owns failure-mode budgets, evaluation criteria for quality, and the policy for what the agent is allowed to do. The deliverable looks like a PRD, but the spine of it is a behavior contract, not a screen flow.
An AI Engineer owns how the product actually works under the hood and whether the model behavior matches the contract the PM wrote. The job covers prompt engineering, model selection and routing, retrieval pipelines, tool integration, evaluation harnesses, latency and cost budgets, guardrails, and the production wiring that makes a chat session feel reliable. On a serious team they also own observability for LLM calls (token spend, regression on quality, hallucination rates), the rollback story when a model upgrade breaks something, and the feedback loop that turns user thumbs-down into a fix. The deliverable is shipped behavior, not a doc: a feature that responds correctly, fast enough, cheap enough, and recovers when the model is wrong. They are the only person on the team who can actually answer the question "why did the agent do that" with evidence pulled from traces.
Picking the right model per task, routing between providers, managing cost vs quality trade-offs at runtime.
Designing system prompts, context windows, memory injection, and the data the model sees on every call.
Writing eval datasets, regression tests on behavior, and gates that block bad model deployments.
Wiring the agent to APIs, databases, and external actions, with safety rails on destructive operations.
Capturing every LLM call, token, and decision so failures can be debugged with evidence, not guesses.
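The first responsibility above, picking a model per task under cost and quality constraints, can be sketched as a small routing function. This is only an illustration under stated assumptions: the model names, prices, quality scores, and latencies below are hypothetical placeholders, not real provider data, and a production router would also handle provider outages and retries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str                  # hypothetical identifier, not a real model name
    cost_per_1k_tokens: float  # USD, illustrative number
    quality: float             # offline eval score in [0, 1], illustrative
    p95_latency_ms: int        # illustrative number


# Hypothetical catalogue; in practice these values would come from your own evals.
CATALOG = [
    ModelOption("small-fast", 0.10, 0.72, 400),
    ModelOption("mid-balanced", 0.50, 0.85, 900),
    ModelOption("large-accurate", 3.00, 0.94, 2500),
]


def route(min_quality: float, max_latency_ms: int, catalog=CATALOG) -> ModelOption:
    """Return the cheapest model meeting the quality floor and latency ceiling.

    If no model satisfies both constraints, fall back to the highest-quality
    model so the caller still gets an answer (an assumed policy choice).
    """
    eligible = [
        m for m in catalog
        if m.quality >= min_quality and m.p95_latency_ms <= max_latency_ms
    ]
    if eligible:
        return min(eligible, key=lambda m: m.cost_per_1k_tokens)
    return max(catalog, key=lambda m: m.quality)


if __name__ == "__main__":
    # A summarization task that tolerates lower quality but must be fast.
    print(route(min_quality=0.70, max_latency_ms=500).name)
    # A contract-review task that needs high quality and can wait.
    print(route(min_quality=0.90, max_latency_ms=3000).name)
```

The design choice worth noting is that the quality floor and latency ceiling are inputs per task, which keeps the trade-off explicit in the call site rather than buried in a global default.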
The healthy collaboration pattern is a tight loop, not a handoff. The PM brings a user problem with a sharp framing, the engineer prototypes the simplest behavior that could work, the two of them watch real sessions together, and the spec updates after the prototype, not before. Most AI features die when the PM writes a 12-page document, throws it over the wall, and only sees the result two sprints later when the demo embarrasses everyone. The reverse failure (engineer ships a clever model without a user problem) produces beautiful demos that nobody renews. The cadence I have seen work, including on Sistava, is short loops: ship a thin slice to a real user inside a week, measure the one metric that matters, and decide on evidence whether to deepen or kill. The PM holds the metric, the engineer holds the implementation, and both hold the trace viewer when something looks wrong.
What that loop guards against is the most expensive AI mistake in 2025: shipping a feature that is technically impressive and commercially invisible. The PM keeps the team honest about whether anyone actually wants what is being built, and the engineer keeps the team honest about whether the demo holds up under real traffic. Neither role can replace the other, even with a smart model in the loop. On a one-person team you can wear both hats, but you cannot skip either set of work.
If you have neither role on payroll yet, you have three honest choices: hire one and stretch them, contract a fractional specialist for two days a week, or let an AI Employee absorb the parts of the job that are mostly judgement plus repetition. None of those replaces a senior human on the hard calls, but all three buy you time to learn which role binds you harder. Most solo founders find the PM judgement is what they can credibly hold themselves, and the engineering execution is what falls behind.
AI Product Managers need the standard PM toolkit (user research, prioritization, roadmap, stakeholder writing) plus three AI-specific muscles: comfort with evaluation as a product surface (you do not ship without an eval, the way you would not ship a screen without a design), fluency in failure-mode language (hallucination, drift, regression, refusal), and a calibrated intuition for what the model can and cannot do this quarter so the roadmap is not science fiction. AI Engineers need solid software engineering (Python or TypeScript, distributed systems basics, API design, observability) plus AI-specific depth: prompt engineering as a discipline, not a vibe; retrieval and context construction; evaluation harnesses; agent frameworks; and the runtime trade-offs between speed, cost, and quality. The biggest gap on most teams is not raw model knowledge; it is the eval discipline shared between the two.
Interviews, ICP work, competitive mapping, pricing logic, retention analysis. The non-AI half of the PM job stays heavy.
Reads eval results like a designer reads a Figma. Writes the behavior contract and the failure budget.
APIs, queues, retries, observability, cost controls. Most of the work is normal software, done well.
Prompt patterns, retrieval, agents, evals, routing. The AI-specific half is where seniority shows.
The right answer is boring and depends on what is actually broken in your team. If the team ships features that nobody uses or the roadmap is a list of demos, hire the AI PM first. If the team has a clear roadmap but the agent breaks in production every week and nobody can explain why, hire the AI Engineer first. If you are a solo founder, hold the PM hat yourself (you know the user) and contract or hire the engineer to ship the behavior. If you are a 5-10 person team without either, the cheaper bet is usually a senior AI Engineer who can sit close to the founder doing the PM work, because the engineer can ship while the PM thinking is being formed, and the reverse rarely works. The wrong answer is to hire both and have neither own the metric: that is how you end up with a roadmap full of model upgrades that move no business number.
An AI PM is not quite a regular PM with a new title. The standard PM toolkit (research, prioritization, roadmap, writing) carries over fully, but the AI PM also owns evaluation as a product surface, the behavior contract for non-deterministic features, and the policy for what the agent is allowed to do. A regular PM treats quality as binary (it works or it does not). An AI PM treats quality as a distribution with a budget for failure modes.
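To make "a distribution with a budget for failure modes" concrete, here is a minimal sketch of a release gate that compares per-mode failure rates from an eval run against a budget. The failure-mode names come from the text above; the budget numbers and the label format are assumptions made for illustration, not a recommended threshold.

```python
from collections import Counter

# Hypothetical budget: maximum tolerated share of eval cases per failure mode.
FAILURE_BUDGET = {"hallucination": 0.02, "refusal": 0.05, "regression": 0.01}


def failure_rates(labels, modes):
    """Compute the share of eval cases that landed in each failure mode.

    `labels` holds one label per eval case: "ok" or a failure-mode name.
    """
    total = len(labels)
    if total == 0:
        raise ValueError("eval run contains no cases")
    counts = Counter(labels)
    return {mode: counts.get(mode, 0) / total for mode in modes}


def gate(labels, budget=FAILURE_BUDGET):
    """Return (passed, breaches), where breaches maps mode -> observed rate."""
    rates = failure_rates(labels, budget.keys())
    breaches = {mode: rate for mode, rate in rates.items() if rate > budget[mode]}
    return (not breaches, breaches)


if __name__ == "__main__":
    # Illustrative run: 100 cases, 3 hallucinations, 2 refusals.
    run = ["ok"] * 95 + ["hallucination"] * 3 + ["refusal"] * 2
    passed, breaches = gate(run)
    print("deploy" if passed else f"blocked: {breaches}")
```

In this framing the PM owns the numbers in the budget and the engineer owns the harness that produces the labels, which is one way to keep the behavior contract testable.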
One person can hold both roles on a small team or in a solo founder setup. I do both on Sistava. The trade-off is depth: you will be shallower in one half, usually the half further from your background. The risk is letting the comfortable half eat all your time. If you are an engineer doing both, force yourself to do real customer interviews weekly. If you are a PM doing both, force yourself to read traces and write evals weekly.
Ranges shift quickly, but as of writing, senior AI Engineers in the US command total comp in the $250-450K range at well-funded startups and well above that at top AI labs. AI Product Managers run roughly $200-350K total comp at the same companies. Both roles compress significantly at early-stage startups and in Europe, where $120-180K base is more common. The premium over a non-AI counterpart is real but not absurd.
A research ML background is almost never required for product AI work. The job is applied: prompt engineering, retrieval, evals, tool integration, production reliability, cost control. A strong software engineer with six months of focused AI practice is usually a better hire than a research ML engineer with no production experience. Hire research backgrounds for foundation model work, not for shipping agents.
Data PMs own a data product (dashboards, pipelines, BI) and the team that builds it. Analytics PMs own the measurement layer for other product teams. AI PMs own a product feature where the core capability is a model. They overlap on the data literacy side but the AI PM also carries the behavior contract, eval discipline, and model policy that the other two roles do not.
Before you write the job spec, the more useful exercise is to clarify what your team would do with each hire in their first 60 days. If the first-60-days plan for an AI PM is mostly meetings and no shipped behavior, you do not need a PM yet. If the first-60-days plan for an AI Engineer is mostly research with no production code, you do not need a research hire, you need a builder. The same logic applies if you are running the work with AI Employees instead of humans: pick the role that has a concrete, measurable first job.
The honest framing I use when founders ask me to choose between the two roles: AI PMs and AI Engineers are not competitors for the same headcount, they are two halves of the same job-to-be-done. The PM half asks "are we building the right thing for someone who will pay" and the engineer half asks "is the thing we built actually reliable under real load." Skip either half and the product gets worse in a predictable way. If you are a solo founder, you are already doing one half; the question is which half you are systematically neglecting and what the cheapest move to close the gap is. Sometimes that is a contractor for two days a week. Sometimes that is an AI Employee absorbing the repetitive parts. Sometimes it is sitting down with the trace viewer yourself and learning the discipline. None of those answers is exciting, but the exciting answers tend to be the ones that cost the most and ship the least.