The Batting Stance: a 3-layer architecture for software products that expect to pivot: Auxil Field Notes

The premise

This is for solo founders, small teams, and small autonomous groups of builders inside innovation units in larger organisations: anyone building under uncertainty rather than scaling something already proven.

Founding a startup is a search problem. You don’t know the answer; you have a hypothesis. The hypothesis will probably be wrong in ways you can’t predict. Your job, until you find strong product-market fit, is to take as many cheap, well-aimed swings as you can before you run out of runway, energy, or both.

The architectural question that follows is: how do you structure your code so that swinging again is cheap?

Most startup architectures are accidentally optimised for the swing you’re currently taking. When that swing misses, the rewrite cost is enormous: auth, billing, infrastructure, deploy pipelines, all of it tangled with the failed product hypothesis. Founders end up choosing between rewriting from scratch (slow, demoralising) and pivoting on top of code that’s shaped for the wrong domain (worse).

The batting stance is a way of structuring the codebase up front so that the failure of the current swing doesn’t take the whole codebase with it. It does this by giving you two clean places to pivot: the first is cheap, the second costs more, and both are far cheaper than a rewrite from scratch.

This post assumes you’ve already chosen a backend architecture roughly in the “modular monolith with shared transactions” family (Flavour 3 in The Postgres-first Architecture Spectrum, with possibly one or two Flavour 4 async edges). The batting stance is orthogonal to that choice (it’s about how you arrange your modules into layers, not about how each module is built internally), but it composes most naturally with that choice and sits awkwardly with pure hexagonal or microservices stances. If you’ve adopted a pure-hex orchestrated stance because your product has multiple device-class adapters, the layer model still applies; you’ll just have more abstraction inside each layer.

The three layers

Layer 3, Vertical / Domain. Ontology, domain commands, UI views, copy, deep domain expertise, the data flywheel.
↑ Pivot point #1 ↑
Layer 2, Engine / Mechanism (the deep tech layer). Novel algorithms, architectures, and IP: capture, processing, payload storage, the technical special sauce.
↑ Pivot point #2 ↑
Layer 1, SaaS Chassis. Identity, tenancy, authorisation, billing, infrastructure.

Each layer has a different probability of survival across a pivot. Each layer has a different design discipline.

Layer 1: The chassis (probability of survival: ~95%)

This is the part that’s the same across any B2C or B2B SaaS you might ever build. It is unfashionable but expensive to get right and almost never the place where your product fails.

What lives here:

Identity (off-the-shelf: Kinde, Clerk, WorkOS, Auth0, etc.)
Three-layer tenancy defence, all owned by the chassis:
- Tenant isolation via RLS. Postgres policies enforce that tenant data can’t cross tenant boundaries even if application code is buggy. Tenant context set once at the request boundary; re-established at every async boundary from the principal stored in the job payload.
- Access envelope. Every request and every job carries an actor context; downstream code reads from it rather than re-deriving. This catches “missing context” bugs and gives you the substrate for audit, metering, and structured logging.
- Authorisation (RBAC at minimum; ReBAC if your product has sharing models). A can(actor, action, resource) primitive exposed to higher layers. Catches “wrong action” bugs that RLS can’t reason about.
These are layers, not alternatives. A correctly-functioning request typically passes through all three. The Postgres-first Architecture Spectrum covers the principal-passing security model in detail.
Billing (off-the-shelf: Stripe, subscriptions, usage metering hooks)
Transactional email (off-the-shelf: Resend, Postmark, etc.)
Error tracking, logging, observability (off-the-shelf)
Deploy pipeline, CI/CD, environments
Database connection management, transaction threading
Background job runner with principal-passing discipline (jobs inherit the actor context they were enqueued under)
Generic admin tooling
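As a rough illustration of the access envelope and the can(actor, action, resource) primitive from the list above, here is a minimal sketch. Every name in it (ActorContext, ResourceRef, GRANTS, JobPayload, the roles and actions) is an assumption made for the example, and the grant table is plain RBAC; a product with sharing models would likely replace it with relationship checks.

```typescript
// Hypothetical names throughout; this is a sketch, not a prescribed chassis API.
type Role = "owner" | "member" | "viewer";
type Action = "read" | "write" | "manage_billing";

// The access envelope: built once at the request boundary, then passed along.
interface ActorContext {
  readonly userId: string;
  readonly tenantId: string;
  readonly roles: readonly Role[];
}

// A resource only needs to say which tenant owns it.
interface ResourceRef {
  readonly tenantId: string;
  readonly kind: string;
  readonly id: string;
}

// Static role-to-action grants (plain RBAC).
const GRANTS: Record<Role, readonly Action[]> = {
  owner: ["read", "write", "manage_billing"],
  member: ["read", "write"],
  viewer: ["read"],
};

function can(actor: ActorContext, action: Action, resource: ResourceRef): boolean {
  // Application-level echo of tenant isolation; RLS remains the real backstop.
  if (actor.tenantId !== resource.tenantId) return false;
  return actor.roles.some((role) => GRANTS[role].includes(action));
}

// A job payload carries the same envelope so a worker can re-establish it.
interface JobPayload<T> {
  readonly actor: ActorContext;
  readonly data: T;
}
```

Higher layers would call can(...) and read from the envelope; they would not re-derive the tenant or the roles themselves.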

Design discipline: Boring. Bought, not built. Standard patterns from public docs. You should be able to swap your product idea for any other SaaS idea without changing a line in this layer.

The trap: Building “your own” auth or billing because you think you’ll save money or “it’s simple.” It’s not simple. It’s a tar pit. Every day you spend on it is a day you didn’t spend on the part that might actually fail.

Adapter discipline. Most chassis dependencies (auth provider, email provider, payment provider, blob storage) are well-suited to thin port-and-adapter abstractions: their interfaces are naturally clean, their fakes are useful for tests, and their local-dev variants are useful for offline development. Three adapters per port is common and earns its keep:

Production: Resend, Stripe live, S3, Auth0.
In-memory fake for unit tests: captures emails to an in-memory list, returns canned payment results, simulates blob storage in a Map. Fast.
Local-dev container: Mailpit (real SMTP server with a web UI for browsing sent messages), MinIO (S3-compatible storage with a real console), Stripe test mode or a fake-Stripe container, sometimes a local Auth0 mock or stub identity provider. Real software, just running on your laptop.

This isn’t speculative abstraction. Every adapter exists because it solves a concrete problem (production needs to actually work; tests need to be fast and isolated; local dev needs to work offline and let developers inspect the real pipeline). Apply it for the chassis ports where the production adapter is slow, expensive, or fragile to call from tests and dev. For ports where the production substrate runs locally trivially (Postgres, Redis), the choice is between a thin port for regularity (every external dependency reached the same way; consistent shape across the codebase) and direct calls (less ceremony). Both are valid; the first is the right choice when you value uniformity across the system, the second when you want to keep ceremony to a minimum. Choose one and apply it consistently.
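To make the adapter discipline concrete, here is a minimal sketch of one port with its in-memory fake. The EmailPort interface, the EmailMessage shape, and sendWelcomeEmail are hypothetical; they are not the API of Resend, Postmark, or Mailpit. A production adapter and a local-dev adapter would implement the same interface.

```typescript
// Hypothetical port and message shape, for illustration only.
interface EmailMessage {
  readonly to: string;
  readonly subject: string;
  readonly body: string;
}

interface EmailPort {
  send(message: EmailMessage): Promise<void>;
}

// In-memory fake for unit tests: records messages instead of sending them.
class InMemoryEmail implements EmailPort {
  readonly sent: EmailMessage[] = [];

  async send(message: EmailMessage): Promise<void> {
    this.sent.push(message);
  }
}

// Chassis code depends on the port, never on a provider SDK.
async function sendWelcomeEmail(email: EmailPort, address: string): Promise<void> {
  await email.send({ to: address, subject: "Welcome", body: "Thanks for signing up." });
}
```

A unit test can pass an InMemoryEmail and inspect its sent list; nothing in sendWelcomeEmail needs to change when the adapter does.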

Specific tenancy advice: Even if you launch B2C with tenant_id == user_id, model the tenant as a separate concept from day one. RLS policies key off tenant_id. There is a tenants table, even if it has one row per user. The B2C-to-B2B migration becomes additive (add a tenant_members table, change tenant resolution at the request boundary) rather than invasive (rewrite every query and policy). The cost of doing this on day one is approximately one day of work; the cost of not doing it and migrating later is months. See The Postgres-first Architecture Spectrum for the full tenancy treatment.
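One way to picture why the later migration is additive: if everything downstream only consumes a resolved tenant id, moving from B2C to B2B amounts to swapping the resolver at the request boundary. The sketch below uses in-memory stand-ins for the tenants and tenant_members tables; ResolveTenant, makeB2CResolver, and makeB2BResolver are assumed names.

```typescript
// Hypothetical in-memory stand-in for rows of the tenant_members table.
interface TenantMember {
  readonly tenantId: string;
  readonly userId: string;
}

// Everything downstream of the request boundary sees only the resolved tenant id.
type ResolveTenant = (userId: string, requestedTenantId?: string) => string;

// B2C launch: tenant_id == user_id, but the tenant is still its own row.
function makeB2CResolver(tenantIds: ReadonlySet<string>): ResolveTenant {
  return (userId) => {
    if (!tenantIds.has(userId)) throw new Error(`no tenant row for user ${userId}`);
    return userId;
  };
}

// B2B later: resolution goes through memberships; queries and policies are untouched.
function makeB2BResolver(members: readonly TenantMember[]): ResolveTenant {
  return (userId, requestedTenantId) => {
    const mine = members.filter((m) => m.userId === userId);
    const match =
      requestedTenantId === undefined
        ? mine[0]
        : mine.find((m) => m.tenantId === requestedTenantId);
    if (match === undefined) throw new Error(`user ${userId} has no access to that tenant`);
    return match.tenantId;
  };
}
```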

Layer 2: The engine (probability of survival: ~50%)

This is the mechanism that’s distinctive to your category of product but generic across verticals. Call it the deep tech layer: where any novel algorithms, architectures, IP, or solutions sit, the technical special sauce. It need not be LLMs or pipelines. It could be a deterministic solver, a niche signal-processing chain, a custom datastore, whatever the hard, distinctive part of your product is. An engine might be “documents → structured records,” or “sensor stream → typed events,” or “events → cohorts → dashboards.” The engine is the bet that the kind of product you’re building has value, independent of any particular industry that consumes its output.

What lives here:

Generic event/record envelope (an opaque payload field plus identifiers, timestamps, type discriminator)
Capture pipeline (whatever produces input: file upload, webhook, sensor, form submission)
Processing pipeline (whatever transforms input into structured records: an extraction or classification step, a solver, a signal transform)
Validation framework (the application validates payloads against schemas at the trust boundary; the storage layer holds opaque payloads)
Whatever specialised infrastructure the mechanism needs (for an LLM engine, this is prompt assembly, retry, anti-hallucination guards, and evals)
Generic query / read interface

Design discipline: Generic in the application layer (types, schemas, validators), not in the storage schema. The DB stores opaque payloads (JSONB). The application code is parameterised over the ontology shape. The same engine code runs against any vertical’s ontology by passing it different configuration.

The trap: Trying to make Layer 2 generic in the database. Entity-Attribute-Value tables. “Configurable” relational schemas. Don’t. The DB is dumb storage; the application is where the genericity lives. Postgres holds opaque payloads; the application layer holds the types.
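A minimal sketch of "opaque in storage, typed in the application", under stated assumptions: RecordEnvelope, Validator, readPayload, and the Visit shape are hypothetical names, and the hand-written validator stands in for whatever schema library you use.

```typescript
// Hypothetical envelope: identifiers, timestamp, type discriminator, opaque payload.
interface RecordEnvelope {
  readonly id: string;
  readonly tenantId: string;
  readonly type: string;
  readonly createdAt: string; // ISO 8601
  readonly payload: unknown; // stored as JSONB; the storage layer never inspects it
}

// A validator narrows an opaque payload to a typed shape, or returns null.
type Validator<T> = (payload: unknown) => T | null;

// Illustrative vertical-supplied shape.
interface Visit {
  readonly site: string;
  readonly minutes: number;
}

const validateVisit: Validator<Visit> = (payload) => {
  if (typeof payload !== "object" || payload === null) return null;
  const fields = payload as Record<string, unknown>;
  const site = fields["site"];
  const minutes = fields["minutes"];
  if (typeof site !== "string" || typeof minutes !== "number") return null;
  return { site, minutes };
};

// Trust boundary: the engine applies whichever validator it was handed.
function readPayload<T>(envelope: RecordEnvelope, validate: Validator<T>): T {
  const value = validate(envelope.payload);
  if (value === null) throw new Error(`invalid payload for type ${envelope.type}`);
  return value;
}
```

The envelope type never changes when the vertical does; only the validator passed to readPayload is swapped.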

The deeper trap: Building Layer 2 generically before you understand what genericity means for your product. Two approaches look similar here; the first is the trap and the second is not:

Building a configurable abstract framework before you’ve shipped any concrete vertical. Plugin systems. EAV tables. “Anyone can define an ontology.” This is premature platforming and will eat your roadmap.
Building a concrete engine that’s parameterised over the ontology shape, deliberately, because the engine IS the product hypothesis. This is correct when the bet is “this engine works across verticals” and you can’t test the bet without building the engine.

The distinction: a parameterised engine is concrete code that takes ontology shapes (Zod schemas, prompts, reducer functions) as typed inputs and runs against any one of them. A configurable framework tries to abstract over the mechanism itself and ends up shaped by no specific use case; the giveaway is that it takes a JSON configuration that describes a schema rather than taking a schema. Build the first; resist the second.
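The difference can be sketched in a few lines. Below, the engine is concrete code that accepts an ontology as typed inputs (a parse function and a reducer) and never reads a JSON description of a schema. Ontology, runEngine, and the Scan example are assumed names for illustration only.

```typescript
// Hypothetical shape: what a vertical hands to the engine as typed inputs.
interface Ontology<TRecord, TState> {
  readonly parse: (raw: unknown) => TRecord | null; // the schema, as code
  readonly initial: TState;
  readonly reduce: (state: TState, record: TRecord) => TState;
}

interface EngineResult<TState> {
  readonly state: TState;
  readonly rejected: number;
}

// Concrete engine code with no vertical vocabulary in it.
function runEngine<TRecord, TState>(
  ontology: Ontology<TRecord, TState>,
  rawInputs: readonly unknown[],
): EngineResult<TState> {
  let state = ontology.initial;
  let rejected = 0;
  for (const raw of rawInputs) {
    const record = ontology.parse(raw);
    if (record === null) {
      rejected += 1;
      continue;
    }
    state = ontology.reduce(state, record);
  }
  return { state, rejected };
}

// A vertical supplies the shapes (illustrative only).
interface Scan {
  readonly sku: string;
  readonly qty: number;
}

const stockCount: Ontology<Scan, Record<string, number>> = {
  parse: (raw) => {
    if (typeof raw !== "object" || raw === null) return null;
    const { sku, qty } = raw as { sku?: unknown; qty?: unknown };
    return typeof sku === "string" && typeof qty === "number" ? { sku, qty } : null;
  },
  initial: {},
  reduce: (state, scan) => ({ ...state, [scan.sku]: (state[scan.sku] ?? 0) + scan.qty }),
};

const result = runEngine(stockCount, [{ sku: "A", qty: 2 }, { sku: "A", qty: 3 }, "junk"]);
```

Here result.state holds a count of 5 for sku "A" and result.rejected is 1. A second vertical would define its own Ontology value and call the same runEngine.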

If your product hypothesis is the engine itself, build it concretely with explicit parameterisation hooks from day one. Resist the urge to make it “any kind of input → any kind of output”; make it concrete for one vertical’s specific inputs and outputs, with explicit parameters only for the parts you know will vary, and prove that on one vertical first. If your product hypothesis is the vertical and the engine is incidental, build the engine for that vertical hard-coded and extract genericity when (if) a second vertical appears.

Layer 2 is where most of your application’s async complexity lives. In an LLM-driven product, the request handler is short (validate, persist a pending row, enqueue, return); the worker is long (fetch, pre-process, extract, validate, persist, project, notify). Treat the worker code path with the same engineering discipline as the HTTP path: tenancy context, transaction boundaries, structured logging, observability, tests. The Postgres-first Architecture Spectrum covers this in its async-ness section. For products in this category, the worker isn’t a side-channel. It’s where the value-creating work happens, even if it’s not the most-frequent code path by request count.
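The short-handler, long-worker split can be sketched with in-memory stand-ins for the table and the queue. All names here (Principal, CaptureRow, ExtractJob, handleCapture, runNextJob) are hypothetical, and a real worker would also open a transaction and set the tenant context in the database session before touching rows.

```typescript
// Hypothetical in-memory stand-ins for a table and a job queue.
interface Principal {
  readonly userId: string;
  readonly tenantId: string;
}

type CaptureState = "pending" | "done" | "failed";

interface CaptureRow {
  id: string;
  tenantId: string;
  input: string;
  state: CaptureState;
  output?: string;
}

interface ExtractJob {
  readonly principal: Principal;
  readonly captureId: string;
}

const captures = new Map<string, CaptureRow>();
const pendingJobs: ExtractJob[] = [];

// Request path: validate, persist a pending row, enqueue, return.
function handleCapture(principal: Principal, input: string): string {
  if (input.trim() === "") throw new Error("empty input");
  const id = `cap_${captures.size + 1}`;
  captures.set(id, { id, tenantId: principal.tenantId, input, state: "pending" });
  pendingJobs.push({ principal, captureId: id });
  return id;
}

// Worker path: re-establish tenant context from the job, then do the long work.
function runNextJob(extract: (input: string) => string): void {
  const job = pendingJobs.shift();
  if (job === undefined) return;
  const row = captures.get(job.captureId);
  if (row === undefined || row.tenantId !== job.principal.tenantId) {
    throw new Error("job principal does not match row tenant");
  }
  try {
    row.output = extract(row.input);
    row.state = "done";
  } catch {
    row.state = "failed";
  }
}
```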

If your engine has multi-step processing (pre-process → extract → enrich → project → notify, with each step long-running and fallible), you’ll want a workflow orchestrator (DBOS, Temporal, Restate) for the engine. Two reasonable approaches:

Adopt the orchestrator from day one if you know the engine will need orchestration eventually and you have the experience to use it well. Avoids the migration pain later. Means the engine’s pipelines run on durable workflow semantics from the start: per-step retry, compensation, resumability. DBOS in particular fits this naturally because workflow state lives in your existing Postgres, no new substrate.
Defer until the pain is real. Start with a simpler job runner (pg-boss, River) and migrate to the orchestrator when you’ve hand-rolled retry/idempotency twice and want it as a runtime concern. Lower up-front cost; some migration work later.

Either is valid; the first is better when you’re experienced and committed to the shape, the second when you want to minimise upfront infrastructure. For agentic engines specifically (LLM tool-calling loops, dynamic step graphs, agent delegation), adopt the orchestrator on day one. Hand-rolling tool-calling loops in a simple job runner produces ugly code with poor failure semantics, no resumability, and cost overruns that are easy to miss. The orchestrator gives you per-step durability, principal-propagation discipline across hops, per-workflow cost limits, and observability for each tool call.
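To show what per-step durability and a per-workflow cost limit mean in practice, here is a toy step runner. It is a sketch of the idea only, not the DBOS, Temporal, or Restate API; WorkflowRun and runStep are invented names, and the completed map stands in for durable storage that a real orchestrator would manage for you.

```typescript
// Hypothetical sketch of orchestrator semantics; `completed` stands in for durable storage.
interface WorkflowRun {
  readonly completed: Map<string, unknown>; // step name -> recorded output
  spent: number; // cost units consumed so far
  readonly budget: number; // per-workflow cost limit
}

function runStep<T>(run: WorkflowRun, stepName: string, cost: number, fn: () => T): T {
  // Resumability: a step that already finished is replayed from its recorded output.
  if (run.completed.has(stepName)) return run.completed.get(stepName) as T;
  // Cost limit: refuse to start a step that would exceed the budget.
  if (run.spent + cost > run.budget) throw new Error(`budget exceeded at step ${stepName}`);
  // The cost is charged on attempt, even if the step then throws.
  run.spent += cost;
  const output = fn();
  run.completed.set(stepName, output);
  return output;
}
```

Re-running a workflow after a crash would call runStep for each step again; finished steps return their recorded output without re-executing or re-charging.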

Layer 3: The vertical (probability of survival: ~10–30%)

This is the bet on a specific domain, audience, and use case. The ontology, the prompts, the UI copy, the workflows: these are the parts most likely to be wrong.

What lives here:

Ontology schemas (the domain shapes the engine produces or consumes: event types, command shapes, entity references)
Engine configuration bound to this ontology (for an LLM engine, the prompts that reference it)
Domain commands (the operations that take a validated engine output and update domain tables)
Domain queries and read shapes (what the UI fetches)
UI views (whatever the user actually sees)
Onboarding flow, copy, marketing positioning
Deep domain expertise (the hard-won understanding of the field that shapes the ontology, the prompts, and the workflows)
The data value / flywheel (proprietary domain data that accumulates with use; the asset that compounds into a moat if the vertical works)

Design discipline: Treat the code as disposable. Be willing to throw it all away. Don’t share code between verticals; copy and modify when you build vertical 2. The data and the domain expertise are the exception: the ontology, prompts, and UI are cheap to discard, but the proprietary data you accumulate and the understanding you build of the field are the parts that compound, and they often carry into the next swing.

The trap: Investing in this layer like it’s permanent. Beautiful ontologies. Refined prompts. Polished UI. If the vertical is wrong, all of it is wasted, and the polish makes you reluctant to throw it away even when you should.

The two pivot points

Each boundary between layers is a place you can pivot cheaply.

Pivot 1: Same engine, new vertical

You shipped your product for vertical A, say, field service operations. Users tried it. The pull isn’t there. But the engine (your generic pipeline that ingests, processes, and serves typed records) feels real. People are excited about that part. You think the same machinery would work for, say, warehouse operations.

What you do:

Delete domains/field-service/ (or move it to a stale branch).
Create domains/warehouse-ops/ with new ontology schemas, new engine configuration, new domain commands, new UI.
Layers 1 and 2 are unchanged.
Ship.

Pivot cost: 2–4 weeks for a serious second attempt. Days if you’re just sketching.

Pivot is cheap because: You kept all the auth, billing, infra, and (critically) the engine machinery. The expensive parts didn’t move.

Pivot 2: Same chassis, new product category

You shipped your product. The engine premise itself didn’t hold: the input mechanism was wrong, or the extraction was too slow, or the framing missed the actual job-to-be-done. You realise the customers want something different, maybe a fast keyboard-driven CRM, or a different kind of tool entirely.

What you do:

Delete engine/ (or archive).
Delete domains/.
Build a new product on top of the chassis.

Pivot cost: 1–3 months for a serious new product. Still vastly cheaper than starting from scratch.

Pivot is cheap because: You kept your auth provider, billing, RLS policies, deploy pipeline, observability, the multi-tenant database, all the stuff that takes weeks to set up correctly. You’re starting from “I have a working multi-tenant SaaS chassis” not “I have an empty repo.”

Pivot 3: Burn it all down

You realise SaaS is wrong entirely. Maybe you’re going to do consulting, or build a desktop app, or quit. This is not a pivot the architecture protects you from, and that’s fine. At this point you’re outside the search problem the architecture was designed for.

The discipline that makes this work

The three layers are not magic. They only work if you maintain certain disciplines. Each one is a place teams accidentally collapse the layers and lose the pivot-ability.

Discipline 1: Layer 3 cannot reach into Layer 1

The vertical code does not import directly from auth or billing internals. It goes through narrow facades. If you find yourself writing import { kindeClient } from "core/auth/internal" inside your vertical’s ontology code, you’ve broken the boundary.

When the vertical needs something the chassis doesn’t expose, push the special case up, not down: add a generic primitive to the lower layer (a “tenant metadata blob”, say) and let the vertical use it, rather than adding vertical-specific fields to the auth or billing schema.

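The test: can you rm -rf domains/<your-vertical>/ and have the rest of the app compile? If yes, nothing below depends on the vertical. The other direction needs its own check: the vertical’s imports should name only chassis facades, never internals.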
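Boundaries like this can also be checked mechanically. The sketch below scans import specifiers against a small rule table. It assumes a layout of core/ (chassis), engine/, and domains/<vertical>/ with root-relative import paths, as in the kindeClient example above; the rule set, the function names, and the regex-based scan are all illustrative, and a real project would more likely use a lint rule backed by a proper parser.

```typescript
// Hypothetical layout: core/ (chassis), engine/, domains/<vertical>/.
interface BoundaryRule {
  readonly from: RegExp; // which files the rule applies to
  readonly forbidden: RegExp; // which import specifiers those files may not use
  readonly why: string;
}

const RULES: readonly BoundaryRule[] = [
  { from: /^domains\//, forbidden: /^core\/.+\/internal/, why: "vertical reaches into chassis internals" },
  { from: /^engine\//, forbidden: /^domains\//, why: "engine depends on a vertical" },
  { from: /^core\//, forbidden: /^(engine|domains)\//, why: "chassis depends on a higher layer" },
];

// Finds `from "..."` specifiers; a rough regex, not a real parser.
function importSpecifiers(source: string): string[] {
  const found: string[] = [];
  const pattern = /from\s+["']([^"']+)["']/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const spec = match[1];
    if (spec !== undefined) found.push(spec);
  }
  return found;
}

// Takes a map of file path -> file contents and reports every rule violation.
function findViolations(files: Record<string, string>): string[] {
  const violations: string[] = [];
  for (const [path, source] of Object.entries(files)) {
    for (const spec of importSpecifiers(source)) {
      for (const rule of RULES) {
        if (rule.from.test(path) && rule.forbidden.test(spec)) {
          violations.push(`${path} -> ${spec}: ${rule.why}`);
        }
      }
    }
  }
  return violations;
}
```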

Discipline 2: Layer 2 must be parameterised, not vertical-specific

The engine should not contain literal strings from your vertical. Pick any example domain: if your vertical were real estate, that means no "property" or "viewing" in engine code; if it were field service, no "work order" or "technician". The engine takes ontology shapes as parameters (schemas and configuration) and processes them generically.

The test: could you point the engine at a totally different ontology (e.g., medical visits, or shipping events) and have it work? If you’d need to change engine code to do that, the engine has leaked vertical concerns.

This discipline does not mean building all the genericity up front. It means: when you find yourself adding a vertical-specific case to engine code, stop and put it in the vertical instead, even if it means duplicating something. Genericity earns its keep when you have two clients of it.

Discipline 3: Layer 1 cannot know about Layers 2 or 3

The chassis doesn’t know what kind of product runs on top of it. Auth doesn’t have product-specific role concepts. Billing doesn’t have product-specific usage units. The chassis exposes generic primitives (a user, a tenant, a subscription tier, a usage counter) and Layer 2 or 3 maps domain concepts onto them.

The test: could the chassis support a totally different product idea without modification? If it has hardcoded references to product-specific concepts, product concerns have leaked down into the chassis.
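As an illustration of a generic primitive with the mapping held higher up, here is a sketch of a usage counter. UsageMeter, recordExtraction, and the pages_extracted unit are assumptions for the example; the point is only that the chassis class sees an opaque counter key while the product's billable unit is named in a higher layer.

```typescript
// Hypothetical chassis primitive: counters keyed by an opaque string.
class UsageMeter {
  private readonly counts = new Map<string, number>();

  increment(tenantId: string, counterKey: string, amount = 1): number {
    const key = `${tenantId}:${counterKey}`;
    const next = (this.counts.get(key) ?? 0) + amount;
    this.counts.set(key, next);
    return next;
  }

  read(tenantId: string, counterKey: string): number {
    return this.counts.get(`${tenantId}:${counterKey}`) ?? 0;
  }
}

// Higher-layer mapping: the billable unit is named here, not in the chassis.
const PAGES_EXTRACTED = "pages_extracted"; // illustrative unit

function recordExtraction(meter: UsageMeter, tenantId: string, pageCount: number): number {
  return meter.increment(tenantId, PAGES_EXTRACTED, pageCount);
}
```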

Discipline 4: Resist over-typing in Layer 3

This matters most when an LLM is the producer. The temptation to lock down “the real ontology” with strict discriminated unions creates schema capture (the producer manufactures whatever your schema asks for, regardless of whether reality is shaped that way). Keep Layer 3 schemas loose until you have evidence (frequency in real captures, predictive value, downstream operations that branch on the case) that a tighter shape earns its keep.
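One form of that evidence, frequency in real captures, is cheap to gather while the schema is still loose. The sketch below keeps the discriminator as a free string and tallies what actually shows up. LooseCapture, kindFrequencies, tighteningCandidates, and the minShare threshold are illustrative assumptions, and frequency alone does not settle the question.

```typescript
// Loose Layer 3 shape: `kind` is a free string, not yet a closed union.
interface LooseCapture {
  readonly kind: string;
  readonly details: Record<string, unknown>;
}

// Tally how often each kind appears in real captures.
function kindFrequencies(items: readonly LooseCapture[]): Map<string, number> {
  const tally = new Map<string, number>();
  for (const item of items) {
    tally.set(item.kind, (tally.get(item.kind) ?? 0) + 1);
  }
  return tally;
}

// Kinds seen in at least `minShare` of captures are candidates for a tighter schema.
function tighteningCandidates(items: readonly LooseCapture[], minShare: number): string[] {
  const total = items.length;
  if (total === 0) return [];
  return Array.from(kindFrequencies(items).entries())
    .filter(([, count]) => count / total >= minShare)
    .map(([kind]) => kind)
    .sort();
}
```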

A note on async profiles

One implementation detail worth naming: the layers host different kinds of work, so don’t force one async pattern across all of them.

Layer 1 (chassis): mostly synchronous CRUD. Login, profile updates, subscription queries are all request-response, with one or two reliable-async edges (Stripe webhooks, audit shipping).
Layer 2 (engine): async-heavy. Long-running work, possibly multi-step or agentic. This is where the workflow-orchestrator question lives and where streaming endpoints attach.
Layer 3 (vertical): mixed. Reads are mostly sync; writes dispatch into Layer 2’s async pipelines; progress streams surface here.

The chassis stays sync, the engine uses a job runner (and possibly an orchestrator), the vertical reads sync and dispatches async. Each layer picks the mechanism that matches its work; they need only agree on conventions for crossing between sync, async, and streaming cleanly (principal-passing, idempotency, transaction boundaries). This composes naturally with Flavour 3 + selective orchestration + selective streaming (see The Postgres-first Architecture Spectrum).

Anti-patterns and how to recognise them

Anti-pattern: The eternal vertical

You shipped vertical 1, it sort of works, and you’re three years in still adding features to it without ever testing the pivot. Layers 2 and 3 have grown together. Now you couldn’t pivot if you wanted to.

Fix: Periodically (every 6–12 months early on) ask: “could I throw away domains/X/ and ship a new vertical in 4 weeks?” If the answer is no, find what’s coupling and fix it.

Anti-pattern: The shadow vertical in the engine

You needed something specific for your vertical, and the engine was the convenient place to put it. Now the engine has if (event.type === 'CustomerCreated') branches scattered through it. The engine is no longer generic.

Fix: When the engine starts knowing about specific event types, that’s a smell. Either it’s a shape that should be expressed in the ontology (and the engine just dispatches), or it’s domain logic that belongs in the vertical.
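A minimal sketch of "the engine just dispatches", assuming a registry the vertical fills in at startup. HandlerRegistry and RecordHandler are hypothetical names; the engine-side class treats the type discriminator as an opaque key and contains no branch on any specific event type.

```typescript
// Hypothetical engine-side registry: discriminators are opaque keys to the engine.
type RecordHandler = (payload: unknown) => string;

class HandlerRegistry {
  private readonly handlers = new Map<string, RecordHandler>();

  register(type: string, handler: RecordHandler): void {
    if (this.handlers.has(type)) throw new Error(`duplicate handler for ${type}`);
    this.handlers.set(type, handler);
  }

  dispatch(type: string, payload: unknown): string {
    const handler = this.handlers.get(type);
    if (handler === undefined) throw new Error(`no handler registered for ${type}`);
    return handler(payload);
  }
}

// Vertical code registers its own cases; the engine has no `if (type === ...)`.
const registry = new HandlerRegistry();
registry.register("CustomerCreated", () => "customer projection updated");
```

Deleting the vertical removes the register calls and leaves the registry class untouched.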

When the batting stance is the wrong stance

This whole architecture is shaped for iterative search: a founder who knows the bet on the domain is uncertain and expects to pivot. It’s not always the right shape. It is probably the wrong one if any of these describe you:

You have strong domain conviction and a single product hypothesis you’re committed to for 3+ years. Build a regular monolith. The pivot-readiness is paying for an option you won’t exercise.
You’re doing classic CRUD with no generic engine to speak of. There’s no Layer 2 to extract. Just build the app.
You have funding and a team and you’re scaling, not searching. The strategic concerns are different. Read books about scaling, not about pivoting.
You’re doing a single-purpose tool (a calculator, a converter, a content site). The three-layer model is overkill.

The batting stance is a strategy for the part of a startup’s life where you don’t know what you’re building yet. When you know, you trade pivot-readiness for execution speed. That’s a good trade, but only when you’ve actually figured out what you’re building.

Summary

Three layers, two pivot points, one chassis you never throw away.
Layer 1 (chassis) is bought, boring, and survives both pivots. Don’t underbuild it.
Layer 2 (engine, the deep tech layer) is generic in the application layer (types, schemas, validators), not in Postgres. Don’t generalise it before you’ve shipped Layer 3 once, unless the engine itself is the product hypothesis.
Layer 3 (vertical) code is disposable; its data and domain expertise compound. Don’t over-invest in the code, don’t over-type its ontology (especially with an LLM in the loop), and don’t mistake the disposable parts for the assets that survive.
The discipline is at the boundaries: each layer should be deletable without breaking the layers below.
This stance is for founders who are still searching. When you’ve found PMF, the trade-offs change.

You are reading Field Notes by Auxil

Tim Farland
