The 2026 AI Stack — How we think about it
Most AI stack posts are shopping lists. Three logos, two sentences each, a referral link buried in the URL. We read them the way you might read sponsored rankings — useful as signals about who is paying whom, less useful as advice. We do not write them, because your stack should not be ours. Your stack should reflect your constraints, your people, and the work you actually do.
What we can share is the way we think about picking AI tools in 2026 — the decision process underneath any recommendation we would make once we knew your business. If the brand names change next quarter, the framework should still hold.
This is a first version — short enough to read once, honest enough to act on. A later version will go deeper on evaluation, observability, and the categories that still carry too many "it depends" answers to commit to in one pass.
Foundation models — route, do not marry
We do not have a default model. Different work goes to different models. Coding, long-document reading, batch classification, customer-facing drafting — each has a different best answer at different times, and the best answer moves every few months as vendors release new versions. The failure mode we see most often is "we picked a vendor the way we pick a sports team." That is not strategy; it is identity.
The transferable habit is to keep a small set of evaluations written against your actual tasks, and re-run them whenever a vendor puts out a new model. The best model for a job is the one that wins your tests, not the one that wins a leaderboard screenshot.
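The habit above can be sketched concretely. This is a minimal, assumption-laden illustration, not a framework: `call_model` stands in for whatever your vendor SDK exposes, and the two cases are placeholders for tests written against your real tasks.

```python
# Minimal sketch of a task-level eval harness.
# `call_model` is a stand-in for a real vendor SDK call.
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    prompt: str
    check: Callable[[str], bool]  # True if the output is acceptable for this task

def run_evals(call_model: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return one model's pass rate over your own task suite."""
    passed = sum(1 for c in cases if c.check(call_model(c.prompt)))
    return passed / len(cases)

# A tiny suite written against *your* work, not a public leaderboard.
cases = [
    EvalCase("Classify this ticket: 'refund please'", lambda out: "billing" in out.lower()),
    EvalCase("Summarize the attached doc in one sentence.", lambda out: out.count(".") <= 2),
]

# Placeholder model so the sketch runs; swap in each candidate model
# and re-run the same suite whenever a vendor ships a new version.
def fake_model(prompt: str) -> str:
    return "billing."

print(f"pass rate: {run_evals(fake_model, cases):.0%}")
```

The point is the shape, not the code: a fixed suite, one number per model, re-run on every release. Whichever model wins your suite wins the job.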
A few factors that should guide the choice:
- Where your data already lives — residency, admin, compliance.
- Whether privacy-first tenancy is non-negotiable — in which case owning inference on open-weight models is sometimes the only answer.
- Integration surface — who already lives inside your team's daily work.
- The shape of the task — reasoning depth, instruction-following, throughput.
None of those factors collapse to "pick brand X." They collapse to "measure on your own work."
Developer infrastructure — boring on purpose
If you are building AI into real software, the durable layer is not the AI layer. It is the testing, deployment, database, and CI layer that keeps AI-generated work from turning your repository into a slot machine. Lint, typecheck, tests, preview deploys, migration checks — if your AI setup does not end in green checks, you do not have a setup. You have vibes.
Two habits we hold to:
- Start with the vendor's official SDK. If you cannot explain why you need a framework on top of it, you do not need one.
- Keep boring databases boring. Source-of-truth stores with real constraints and real migrations beat novelty databases you will regret during incident week.
How the work runs — who owns it when it breaks
Automation tools are useful when a non-engineer needs to move data today and the cost of wrongness is low. They become debt when they quietly become the production backend. The test: if a piece of work needs branching logic, retries, idempotency, audit logs, and an on-call rotation, you have outgrown the friendly boxes.
When work graduates into code, invest in the boring nouns: idempotency (so retries do not double-charge), dead-letter handling (so poison messages do not vanish), and human checkpoints anywhere the cost of a wrong action is asymmetrically bad. AI can draft the glue. It cannot take ownership.
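The first two of those nouns fit in a few lines. This is a sketch under loud assumptions: the stores are in-memory for illustration, where production would use a database with a unique constraint on the key, and `charge` is whatever side effect you must not run twice.

```python
# Idempotency + dead-letter handling, in miniature.
# In-memory stores are for illustration only; use a real database in production.
processed: set[str] = set()   # idempotency keys already handled
dead_letter: list[dict] = []  # poison messages, kept instead of vanishing

def handle(msg: dict, charge, max_attempts: int = 3) -> None:
    """Run `charge(msg)` at most once per key, retrying transient failures."""
    key = msg["idempotency_key"]
    if key in processed:
        return                    # a retry of finished work: no double charge
    for _ in range(max_attempts):
        try:
            charge(msg)
            processed.add(key)
            return
        except Exception:
            continue              # transient failure: retry
    dead_letter.append(msg)       # give up loudly, not silently

calls = []
handle({"idempotency_key": "order-42"}, charge=calls.append)  # first attempt charges
handle({"idempotency_key": "order-42"}, charge=calls.append)  # retry is a no-op
print(len(calls))  # one charge, not two
```

The human checkpoint is the one piece that does not fit in code: anywhere the cost of a wrong action is asymmetric, route the dead-letter queue and the approval step to a person who owns it.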
What we do not recommend
Not companies — patterns. These keep recurring and we keep saying no:
- Affiliate-first "stack" posts that exist to monetize clicks. If someone's income rises when you buy a worse tool, their advice is not advice.
- Course-as-business models sold as passive income. A certificate does not install discipline in your company.
- Thin "ChatGPT for X" wrappers with no data advantage and no depth inside a real workflow. You rent their margin and their downtime.
- "AI employees" framing that promises headcount replacement without mapping responsibility. Tools do not have duty, judgment, or liability.
- Passive-income AI arbitrage on someone else's terms of service. That is countdown to a ban, not a business model.
- Vendors that silently swap backends to save cost, leaving sudden quality cliffs unexplained. If they cannot answer plainly about model identity, retention, and change management, you are renting a mystery box.
A preflight checklist — before you buy anything
Use this like a preflight when a vendor slides in with the word "revolutionary."
- New capability or new wrapper? If it only automates what a spreadsheet and discipline did last year, you are buying theater.
- Where is the moat? Good prompts are not a moat. Data, distribution, compliance posture, or depth inside a real workflow can be.
- Data residency and access. Who sees prompts, attachments, outputs? What happens in a breach? If the vendor cannot answer plainly, pass.
- Vendor death test. If they disappear tomorrow, what breaks?
- Pricing sustainability. If the unit economics do not make sense on a napkin, your renewal will hurt.
- Lock-in shape. Open formats, export paths, APIs you can mock in tests. If leaving requires a pilgrimage, say no early.
- Hype-fade test. If language models became boring utilities tomorrow, would you still want this product on its non-AI merits?
If there is one meta-rule above the list: buy a working setup, not theater. The best tools reduce calendar time, reduce defect rate, or increase margin with a straight line you can explain to a skeptical CFO. Everything else is a hobby wearing a KPI costume.
What is intentionally deferred to a later version
Some categories need more case-by-case nuance than a single page can hold honestly. A later version will go deeper on content and distribution, production observability, lead generation beyond cold outbound, and specialized tools (vision pipelines, OCR, media generation) — with the kind of caveats you only write after you have been close to the sharp edges, or chosen to stay clear of them on purpose.
Closing
If you want help building your stack — not ours, not a template someone resold — that is the point of an engagement with Ochre & Co. We do not build AI for you. We hand over the frameworks, the tools, and the thinking your team needs to build it themselves, and we stay close while they do. If we work together, you walk away knowing how to keep going without us.
If you only want a single note when this page gets its next revision — no newsletter, no sequence, no funnel — leave your address wherever we wire the form on the site. The content stays visible either way. The email is optional.
Going deeper
Rather than a stack of logos, the more useful thing we can hand you is a short list of people whose writing and teaching are worth your time. If you read any three of these, you will be better equipped than ninety-five percent of buyers to sit across from a vendor without getting rushed.
- Simon Willison — simonwillison.net — the clearest ongoing blog on what is actually working inside production LLM systems. If you want one person to follow for hands-on analysis of new tools without the hype, this is ours.
- Andrej Karpathy — Intro to Large Language Models — a 1-hour plain-English explanation of how these systems actually work, from someone who has built them. Watch this before your next vendor call.
- Nate B. Jones — Nate's Newsletter — former Amazon Prime Video product lead; daily practitioner-grade AI strategy for operators and executives.
- Ethan Mollick, One Useful Thing — Wharton professor, best writing we know of for non-technical leaders trying to actually use this in real work.
For the in-house version of the arguments above, see our companion pieces: What "context" really costs you and Organization context before models.
If something in here maps to a problem you are sitting on
Two sentences on what you are trying to do is enough to start. We reply personally — no sequences, no one else to route you through.
New writing is announced via the same list, site-wide.