The model is the replaceable part

If you switch LLM and you have to rebuild your stack, the LLM was never the asset. The thing you had to rebuild was.

I run my coding workflow across three different AI tools. Claude Code, Codex CLI, Kimi CLI. Three different models. Three different runtimes. One toolkit. I want to pick the model that fits the task in front of me without rebuilding anything to switch.

That used to mean three versions of every capability, each drifting away from the others by Wednesday. Now it's one definition that all three follow. This post is the pattern that fixed it.

What "locked in" actually means

People talk about model lock-in like it's a billing problem. It isn't. It's a structural problem. Lock-in is when the work you've done assumes one specific model on the other end, so changing the model means redoing the work.

Most "AI tooling" you build incrementally has this shape by default. You sit inside Claude. You write a Python script that calls the Anthropic SDK. You build a prompt template tuned to Claude's idioms. You create a workflow that depends on a tool name only Claude exposes. None of it asked to be Claude-specific. It just is, because you happened to be in Claude when you wrote it.

The day you want to do the same task in another harness, you have two choices. Rewrite for the new SDK. Or accept the capability doesn't exist there.

I rewrote. Then I rewrote again. Then I forgot which version was the canonical one. Each port drifted. The version I built second was always slightly behind the one I built first.

Two panels: on the left, three capability boxes funnel into a single CLAUDE bubble while GPT and KIMI sit greyed out. On the right, a CONTRACTS / MEMORY / CLIs stack fans out to three equal CLAUDE, GPT, KIMI model boxes.

That's the mistake. The model isn't supposed to be the centre of your stack. The contracts you give the model are.

The three layers that fix it

Once I worked through it, the pattern came out as three rules. They're boring on purpose. The whole point is that they're things you can build incrementally without learning anything exotic.

1. Markdown is the contract

A capability is a markdown file. The model reads it and follows it. Not a Python function calling an API. Not a model-specific prompt buried in code. A document that says: here's what to do, here's the format, here are the anti-rules.

Any model that can read English can run it. Claude has its own way to discover the file. Codex has another. Kimi has another. All three read the same file when pointed at it. The file is the contract. The harness is just how it gets opened.

This is the most counterintuitive part of the pattern for engineers, because the engineer's instinct is to wrap things in code. Wrapping it in code is what makes it model-specific. Leaving it as a document is what makes it portable.

2. Files are the memory

Long-term context lives in shared markdown on disk at a stable path. Any tool, any model, any session, reads the same files.

When I learn something about how I work (a preference, a recurring mistake to avoid, a belief about how a project should run), it goes into a memory file. When the next session starts, whatever harness I'm in, that session loads the same context. The memory doesn't know which model is reading it. It doesn't need to.

The price of this is discipline about the format. Every memory file has the same frontmatter shape. Every entry has the same fields. The benefit is a single source of truth that survives every model swap and every harness migration.

3. CLIs are the tools

Anything that needs deterministic execution (search, similarity, embeddings, transformations) is a command-line program. Models shell out.

A model can't reliably compute cosine similarity. A Python script can. So the script lives at the shell, the model invokes it with arguments, and gets JSON back. The model handles judgement. The script handles math. Both layers are model-agnostic by construction.

What this buys you

Once the toolkit is portable, the model becomes a knob, not a foundation. You're not choosing one model and building your life on it. You're choosing the right model for the task in front of you, and the toolkit you use to express the task doesn't care.

Hard reasoning task? Reach for the biggest model. Mechanical refactor across forty files? Reach for the cheapest fast one. Need to read a hundred-thousand-line transcript? Reach for whichever has the longest context window today.

Same capabilities every time. No port. No drift. No "I'll re-add that feature when I get around to it."

A single shared toolkit box at the top labelled CONTRACTS · MEMORY · CLIs. A connector fans out to four task boxes: Hard reasoning, Mechanical work, Long context, Quick draft. Each task box arrows down to a different model: OPUS, HAIKU, KIMI K2, SONNET. Tagline: 'choose by task · swap any time · no rewrite'.

The shape of the choice changes. It stops being "which model do I commit to?" and starts being "which model serves this turn?". Those are very different questions and they produce very different working habits.

The anti-pattern this kills

Vendor SDKs as your primary interface.

Every time you wrap your capability in import anthropic or import openai, you've built a model-specific abstraction. It will work beautifully right up until the moment you want to switch.

I'm not saying never use the SDKs. I'm saying don't let them be the surface your capabilities are defined against. The SDK is fine as an internal detail of one specific tool. It's a problem as the place where the behaviour lives.

The portable version doesn't import anything from the model vendor. It writes a markdown file. The vendor is whatever happens to be reading the file that session.

The one rule that does the most work

Build the interface, not the integration.

An integration ties you to one vendor. An interface accepts any vendor that can meet its contract. Markdown contracts, files, and CLIs are all interfaces. SDKs are integrations.

When in doubt, lean interface. Even when the integration is more ergonomic today. Even when "you're only using one model anyway, why bother." The cost of writing it as an interface is small. The cost of unwinding an integration after it's grown roots is enormous.

What it doesn't promise

This isn't model-agnostic in the sense that every model is equally good at every task. They aren't. Claude reasons differently from GPT differently from Kimi. Each has things it's better at and things it's worse at.

What this pattern gives you is the freedom to pick the best one for the job without throwing away your work. The capability is portable. The choice of model is still a judgement call. You just don't pay the rebuild tax anymore.

You also still need taste. The toolkit being portable doesn't mean every task should be run on every model. You're still making real engineering choices about which model belongs where, in the same way you'd make choices about which database belongs where. The point is the choice is reversible.

How this changes the way you build

A few practical knock-ons once you've adopted the pattern:

You stop reading "model X just released" announcements as a threat or a chore. You read them as "could be a new option for the knob." The release doesn't invalidate your toolkit. It expands the set of models that can plug into it.

You stop optimising your prompts for one model's quirks. You optimise the contract for clarity instead. A clear contract reads well in every model. A model-specific quirk-prompt reads weirdly in the others.

You stop fearing the cost surprise from any single vendor. If one model gets too expensive for a task, you reach for a cheaper one and the toolkit doesn't notice.

You stop equating "which AI tool I use" with "which way I work." You can switch harnesses freely and your workflow follows you.

The takeaway

The model is the replaceable part. Your contracts are the asset.

If your toolkit can't survive a model swap, you don't have a toolkit. You have a vendor relationship.

Build the interface, not the integration.