System architecture

The Batting Stance: a 3-layer architecture for software products that expect to pivot

The premise

Founding a startup is a search problem. You don’t know the answer; you have a hypothesis. The hypothesis will probably be wrong in ways you can’t predict. Your job, until you find product-market fit, is to take as many cheap, well-aimed swings as you can before you run out of runway, energy, or both.

The architectural question that follows is: how do you structure your code so that swinging again is cheap?

Most startup architectures are accidentally optimised for the swing you’re currently taking. When that swing misses, the rewrite cost is enormous: auth, billing, infrastructure, deploy pipelines, all of it tangled with the failed product hypothesis. Founders end up choosing between rewriting from scratch (slow, demoralising) and pivoting on top of code that’s shaped for the wrong domain (worse).

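The batting stance is a way of structuring the codebase up front so that the failure of the current swing doesn’t take the whole codebase with it. It does this by giving you two clean places to pivot: a shallow one that replaces only the vertical, and a deeper one that replaces the engine as well. Both are far cheaper than a rewrite from scratch.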

This post assumes you’ve already chosen a backend architecture roughly in the “modular monolith with shared transactions” family (Flavour 3 in The Architecture Spectrum, with possibly one or two Flavour 4 async edges). The batting stance is orthogonal to that choice (it’s about how you arrange your modules into layers, not about how each module is built internally), but it composes most naturally with that choice and sits awkwardly with pure hexagonal or microservices stances. If you’ve adopted a pure-hex orchestrated stance because your product has multiple device-class adapters, the layer model still applies; you’ll just have more abstraction inside each layer.

The three layers

  • Layer 3, Vertical / Domain. Ontology, domain commands, UI views, copy.
  • ↑ Pivot point #1 ↑
  • Layer 2, Engine / Mechanism. Capture, processing, payload storage, prompting infrastructure.
  • ↑ Pivot point #2 ↑
  • Layer 1, SaaS Chassis. Identity, tenancy, authorisation, billing, infrastructure.

Each layer has a different probability of survival across a pivot. Each layer has a different design discipline.

Layers have different async profiles

Worth naming explicitly because it matters for how you build each layer: the three layers naturally host different kinds of work, and trying to apply one async pattern across all of them produces friction.

  • Layer 1 (chassis) is mostly synchronous CRUD. Login is a request-response. Updating a profile is a request-response. Querying subscription state is a request-response. There are one or two reliable-async edges (Stripe webhooks, audit log shipping), but the bulk of the chassis is sync.
  • Layer 2 (engine) is async-heavy. Long-running LLM pipelines, possibly multi-step, possibly agentic. This is where the workflow-orchestrator question (when to add one) lives. Streaming endpoints (token-by-token LLM output, progress updates) usually attach at the edge of this layer.
  • Layer 3 (vertical) is mixed. Domain reads are mostly synchronous (the user wants to see their entities now). Domain writes triggered by extraction outputs are dispatched into Layer 2’s async pipelines. UI dispatching to the engine is an async-edge invocation; UI reading from domain state is sync. Streaming progress updates from the engine surface here.

The architectural implication: don’t pick one async pattern and force it through every layer. The chassis stays sync apart from its one or two reliable-async edges. The engine uses a job runner (and possibly an orchestrator). The vertical reads sync, dispatches async, and consumes streams. Each layer picks the mechanism that matches its work.

This composes naturally with Flavour 3 + selective orchestration + selective streaming (see The Architecture Spectrum). The shared transaction handles within-request work cleanly; the job runner handles long async work; the orchestrator is added for multi-step flows specifically; SSE handles streaming where the UX needs it. The layers don’t need to agree on one pattern. They need to agree on conventions for crossing between sync, async, and streaming cleanly (principal-passing, idempotency, transaction boundaries, connection lifecycle).
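Two of those crossing conventions, principal-passing and idempotency, can be sketched with an in-memory queue. This is a minimal illustration, not a real job runner: `Principal`, `Job`, `InMemoryQueue`, `handleUpload`, and `extractWorker` are all hypothetical names, and the key format is an assumption.

```typescript
// Hypothetical names throughout; an in-memory stand-in for a real job runner.
type Principal = { tenantId: string; userId: string };

type Job<P> = {
  idempotencyKey: string; // a retried request enqueues at most once
  principal: Principal;   // the actor the job was enqueued under
  payload: P;
};

class InMemoryQueue<P> {
  private jobs = new Map<string, Job<P>>();

  // Returns false when a job with the same key is already queued.
  enqueue(job: Job<P>): boolean {
    if (this.jobs.has(job.idempotencyKey)) return false;
    this.jobs.set(job.idempotencyKey, job);
    return true;
  }

  // Hands every queued job to the worker together with its stored principal.
  drain(worker: (principal: Principal, payload: P) => void): number {
    let processed = 0;
    this.jobs.forEach((job) => {
      worker(job.principal, job.payload);
      processed += 1;
    });
    this.jobs.clear();
    return processed;
  }
}

// Sync side: the handler enqueues and returns immediately.
function handleUpload(
  queue: InMemoryQueue<{ documentId: string }>,
  principal: Principal,
  documentId: string,
): "accepted" | "duplicate" {
  const accepted = queue.enqueue({
    idempotencyKey: `${principal.tenantId}:extract:${documentId}`,
    principal,
    payload: { documentId },
  });
  return accepted ? "accepted" : "duplicate";
}

// Async side: context comes from the job, never from ambient state.
const processedLog: string[] = [];
function extractWorker(principal: Principal, payload: { documentId: string }): void {
  // A real worker would set the RLS tenant context here before any query.
  processedLog.push(`${principal.tenantId}/${payload.documentId}`);
}
```

The point of the shape is that the worker has no way to run without a principal, so a “missing context” bug becomes a type error rather than a production incident.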

Layer 1: The chassis (probability of survival: ~95%)

This is the part that’s the same across any B2C or B2B SaaS you might ever build. It is unfashionable, expensive to get right, and almost never the place where your product fails.

What lives here:

  • Identity (off-the-shelf: Kinde, Clerk, WorkOS, Auth0, etc.)

  • Three-layer tenancy defence, all owned by the chassis:

    • Tenant isolation via RLS. Postgres policies enforce that tenant data can’t cross tenant boundaries even if application code is buggy. Tenant context set once at the request boundary; re-established at every async boundary from the principal stored in the job payload.
    • Access envelope. Every request and every job carries an actor context; downstream code reads from it rather than re-deriving. This catches “missing context” bugs and gives you the substrate for audit, metering, and structured logging.
    • Authorisation (RBAC at minimum; ReBAC if your product has sharing models). A can(actor, action, resource) primitive exposed to higher layers. Catches “wrong action” bugs that RLS can’t reason about.

    These are layers, not alternatives. A correctly-functioning request typically passes through all three. The Architecture Spectrum covers the principal-passing security model in detail.

  • Billing (off-the-shelf: Stripe, subscriptions, usage metering hooks)

  • Transactional email (off-the-shelf: Resend, Postmark, etc.)

  • Error tracking, logging, observability (off-the-shelf)

  • Deploy pipeline, CI/CD, environments

  • Database connection management, transaction threading

  • Background job runner with principal-passing discipline (so jobs inherit the actor context they were enqueued under, and workers re-establish RLS + envelope from the payload before doing anything)

  • Generic admin tooling
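As a rough illustration of the access envelope and the can(actor, action, resource) primitive from the tenancy bullets above, here is a minimal RBAC sketch. The role names, the action set, and the `Envelope` shape are assumptions made for the example; a real chassis would source roles from its identity provider and still rely on RLS underneath.

```typescript
// Hypothetical roles and actions, chosen only for illustration.
type Role = "owner" | "member" | "viewer";
type Action = "read" | "write" | "delete";

type Actor = { tenantId: string; userId: string; role: Role };
type Resource = { tenantId: string; ownerId: string };

// The access envelope: built once at the request boundary, then passed down.
type Envelope = { actor: Actor; requestId: string };

const allowedActions: Record<Role, Action[]> = {
  owner: ["read", "write", "delete"],
  member: ["read", "write"],
  viewer: ["read"],
};

// RBAC check exposed to higher layers. The tenant comparison is defence in
// depth on top of RLS, not a replacement for it.
function can(actor: Actor, action: Action, resource: Resource): boolean {
  if (actor.tenantId !== resource.tenantId) return false;
  return allowedActions[actor.role].indexOf(action) !== -1;
}

// Downstream code reads the actor from the envelope rather than re-deriving it.
function deleteRecord(envelope: Envelope, resource: Resource): string {
  if (!can(envelope.actor, "delete", resource)) {
    return `denied (${envelope.requestId})`;
  }
  return "deleted";
}
```

Note that the layers above only ever call `can`; they never inspect the role table directly, which keeps a later move to ReBAC inside the chassis.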

Design discipline: Boring. Bought, not built. Standard patterns from public docs. You should be able to swap your product idea for any other SaaS idea without changing a line in this layer.

The trap: Building “your own” auth or billing because you think you’ll save money or “it’s simple.” It’s not simple. It’s a tar pit. Every day you spend on it is a day you didn’t spend on the part that might actually fail.

Adapter discipline. Most chassis dependencies (auth provider, email provider, payment provider, blob storage) are well-suited to thin port-and-adapter abstractions: their interfaces are naturally clean, their fakes are useful for tests, and their local-dev variants are useful for offline development. Three adapters per port is common and earns its keep:

  • Production: Resend, Stripe live, S3, Auth0.
  • In-memory fake for unit tests: captures emails to an in-memory list, returns canned payment results, simulates blob storage in a Map. Fast.
  • Local-dev container: Mailpit (real SMTP server with a web UI for browsing sent messages), MinIO (S3-compatible storage with a real console), Stripe test mode or a fake-Stripe container, sometimes a local Auth0 mock or stub identity provider. Real software, just running on your laptop.

This isn’t speculative abstraction. Every adapter exists because it solves a concrete problem (production needs to actually work; tests need to be fast and isolated; local dev needs to work offline and let developers inspect the real pipeline). Apply it for the chassis ports where the production adapter is slow, expensive, or fragile to call from tests and dev. For ports where the production substrate runs locally trivially (Postgres, Redis), the choice is between a thin port for regularity (every external dependency reached the same way; consistent shape across the codebase) and direct calls (less ceremony). Both are valid; the first is the right choice when you value uniformity across the system, the second when you want to keep ceremony to a minimum. Choose one and apply it consistently.
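A minimal sketch of one such port, using email as the example. `EmailPort`, `InMemoryEmail`, `LoggingEmail`, and `sendWelcome` are hypothetical names; the second adapter is only a stand-in for a production or local-dev adapter, and the real provider call is deliberately left out rather than guessed at.

```typescript
// Hypothetical port and adapter names, for illustration only.
type Email = { to: string; subject: string; body: string };

// The port: the only email surface the rest of the codebase sees.
interface EmailPort {
  send(email: Email): Promise<void>;
}

// Unit-test adapter: captures messages in memory, no network.
class InMemoryEmail implements EmailPort {
  readonly sent: Email[] = [];
  async send(email: Email): Promise<void> {
    this.sent.push(email);
  }
}

// Stand-in for a production or local-dev adapter. A real one would call the
// provider's SDK or an SMTP client here; that call is omitted on purpose.
class LoggingEmail implements EmailPort {
  readonly lines: string[] = [];
  async send(email: Email): Promise<void> {
    this.lines.push(`to=${email.to} subject=${email.subject}`);
  }
}

// Application code depends on the port, never on a concrete provider.
async function sendWelcome(mailer: EmailPort, to: string): Promise<void> {
  await mailer.send({ to, subject: "Welcome", body: "Thanks for signing up." });
}
```

In a unit test you would pass an `InMemoryEmail` to `sendWelcome` and assert on its `sent` list; which adapter runs in each environment is then a wiring decision made once at startup.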

Specific tenancy advice: Even if you launch B2C with tenant_id == user_id, model the tenant as a separate concept from day one. RLS policies key off tenant_id. There is a tenants table, even if it has one row per user. The B2C-to-B2B migration becomes additive (add a tenant_members table, change tenant resolution at the request boundary) rather than invasive (rewrite every query and policy). The cost of doing this on day one is approximately one day of work; the cost of not doing it and migrating later is months. See The Architecture Spectrum for the full tenancy treatment.
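To make “additive rather than invasive” concrete, here is a small sketch in which tenant resolution is the only thing that changes between B2C and B2B. The resolver classes and the `invoicesFor` helper are hypothetical, and arrays stand in for the tenants and tenant_members tables.

```typescript
// Hypothetical in-memory stand-ins for the tenants and tenant_members tables.
type Membership = { tenantId: string; userId: string };

interface TenantResolver {
  resolve(userId: string): string | undefined;
}

// B2C launch: every user owns exactly one tenant whose id equals the user id.
class PersonalTenantResolver implements TenantResolver {
  private tenantIds: string[];
  constructor(tenantIds: string[]) {
    this.tenantIds = tenantIds;
  }
  resolve(userId: string): string | undefined {
    return this.tenantIds.indexOf(userId) !== -1 ? userId : undefined;
  }
}

// B2B later: the additive change is a membership lookup at the request boundary.
class MembershipTenantResolver implements TenantResolver {
  private members: Membership[];
  constructor(members: Membership[]) {
    this.members = members;
  }
  resolve(userId: string): string | undefined {
    for (const member of this.members) {
      if (member.userId === userId) return member.tenantId;
    }
    return undefined;
  }
}

// Downstream code keys off tenantId only, so it is identical in both modes.
type Invoice = { tenantId: string; total: number };
function invoicesFor(resolver: TenantResolver, userId: string, invoices: Invoice[]): Invoice[] {
  const tenantId = resolver.resolve(userId);
  if (tenantId === undefined) return [];
  return invoices.filter((invoice) => invoice.tenantId === tenantId);
}
```

Swapping `PersonalTenantResolver` for `MembershipTenantResolver` lets a second user see the first user’s tenant data without touching `invoicesFor` or anything else that filters by tenant.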

Layer 2: The engine (probability of survival: ~50%)

This is the mechanism that’s distinctive to your category of product but generic across verticals. An engine might be “documents → structured records,” or “sensor stream → typed events,” or “events → cohorts → dashboards.” The engine is the bet that the kind of product you’re building has value, independent of any particular industry that consumes its output.

What lives here:

  • Generic event/record envelope (an opaque payload field plus identifiers, timestamps, type discriminator)
  • Capture pipeline (whatever produces input: file upload, webhook, sensor, form submission)
  • Processing pipeline (transcription, extraction, classification: whatever transforms input into structured records)
  • Validation framework (the application validates payloads against schemas at the trust boundary; the storage layer holds opaque payloads)
  • LLM prompting infrastructure (prompt assembly, retry, anti-hallucination guards, evals)
  • Generic query / read interface

Design discipline: Generic in the application layer (types, schemas, validators, prompts), not in the storage schema. The DB stores opaque payloads (JSONB). The application code is parameterised over the ontology shape. The same engine code runs against any vertical’s ontology by passing different schemas and prompts.
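A minimal sketch of that split: an opaque-payload envelope that the database would store as-is, and a validator applied in the application at the trust boundary. `RecordEnvelope`, `Validator`, and `validateAtBoundary` are hypothetical names, and the hand-written `validateVisit` stands in for whatever schema library you actually use.

```typescript
// Hypothetical envelope: the storage layer treats `payload` as opaque JSON.
type RecordEnvelope = {
  id: string;
  tenantId: string;
  type: string;       // type discriminator
  occurredAt: string; // ISO-8601 timestamp
  payload: unknown;
};

type Validated<T> = { ok: true; value: T } | { ok: false; error: string };
type Validator<T> = (payload: unknown) => Validated<T>;

// Engine side: looks up a validator by discriminator, knows nothing about the domain.
function validateAtBoundary<T>(
  envelope: RecordEnvelope,
  validators: { [type: string]: Validator<T> | undefined },
): Validated<T> {
  const validator = validators[envelope.type];
  if (validator === undefined) return { ok: false, error: `unknown type: ${envelope.type}` };
  return validator(envelope.payload);
}

// Vertical side: one hand-written validator standing in for a schema.
type Visit = { customer: string; minutes: number };
const validateVisit: Validator<Visit> = (payload) => {
  if (typeof payload !== "object" || payload === null) return { ok: false, error: "not an object" };
  const { customer, minutes } = payload as { customer?: unknown; minutes?: unknown };
  if (typeof customer !== "string" || typeof minutes !== "number") {
    return { ok: false, error: "expected { customer: string, minutes: number }" };
  }
  return { ok: true, value: { customer, minutes } };
};
```

The engine function never mentions visits; a different vertical passes a different validator map and the same code path runs unchanged.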

Two valid shapes for Layer 2, depending on the product hypothesis:

  • CRUD-with-audit-log. Records flow through the engine and update domain tables transactionally. Past records are kept for audit and re-processing, but state is stored canonically as current state, not derived from an event log. Right when the LLM extraction is a one-shot translation step and the read models are the truth.
  • FRP-shaped event sourcing. Raw input (audio, sensor readings, etc.) is the canonical event stream. Typed events are throwaway derivations produced by running the current interpreter (prompts + schemas) over raw inputs. Read models are projections over typed events. Re-derivation is a normal operation, not exceptional. Right when the product hypothesis depends on iterating interpreters and reducers over time, or when multiple verticals need different views over the same input stream. See the ES discussion in The Architecture Spectrum for the eyes-open commitments.

These are different shapes with different trade-offs. The CRUD-with-audit-log shape is simpler; the FRP/ES shape is more powerful when the product needs interpreter iteration as a primary operation. Most products want the first; products whose value proposition is “explore the same raw stream with different interpretations” want the second. Pick deliberately.
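The FRP-shaped option is easier to judge with a toy version in hand. In the sketch below, raw inputs are the canonical log, an `Interpreter` function stands in for “prompts + schemas”, and a read model is a fold over the derived events. All names and the text format are invented for the example; a real interpreter would be an LLM extraction step, not a regular expression.

```typescript
// Hypothetical shapes: raw inputs are canonical, typed events are derived.
type RawInput = { id: string; text: string };
type TypedEvent = { rawId: string; kind: string; amount: number };
type Interpreter = (raw: RawInput) => TypedEvent[]; // stands in for prompts + schemas
type Reducer<S> = (state: S, event: TypedEvent) => S;

// Re-derivation: run the current interpreter over the whole raw log.
function derive(raws: RawInput[], interpret: Interpreter): TypedEvent[] {
  const events: TypedEvent[] = [];
  for (const raw of raws) events.push(...interpret(raw));
  return events;
}

// Projection: a read model is a fold over typed events.
function project<S>(events: TypedEvent[], reducer: Reducer<S>, initial: S): S {
  return events.reduce(reducer, initial);
}

// Interpreter v1 only understands payments.
const interpreterV1: Interpreter = (raw) => {
  const match = /^paid (\d+)$/.exec(raw.text);
  return match ? [{ rawId: raw.id, kind: "payment", amount: Number(match[1]) }] : [];
};

// Interpreter v2 also understands refunds; old raw inputs are simply re-read.
const interpreterV2: Interpreter = (raw) => {
  const match = /^(paid|refunded) (\d+)$/.exec(raw.text);
  if (!match) return [];
  const kind = match[1] === "paid" ? "payment" : "refund";
  return [{ rawId: raw.id, kind, amount: Number(match[2]) }];
};

const balanceReducer: Reducer<number> = (total, event) =>
  event.kind === "refund" ? total - event.amount : total + event.amount;

const rawLog: RawInput[] = [
  { id: "a", text: "paid 30" },
  { id: "b", text: "refunded 10" },
  { id: "c", text: "paid 5" },
];

const balanceV1 = project(derive(rawLog, interpreterV1), balanceReducer, 0); // 35
const balanceV2 = project(derive(rawLog, interpreterV2), balanceReducer, 0); // 25
```

Upgrading the interpreter changes the answer without any data migration, which is the property you are paying for. In the CRUD-with-audit-log shape, the same change would be a backfill job.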

The trap: Trying to make Layer 2 generic in the database. Entity-Attribute-Value tables. “Configurable” relational schemas. Don’t, regardless of which shape you pick. The DB is dumb storage; the application is where the genericity lives. Postgres holds JSONB (or opaque event payloads); the application layer holds the types.

The deeper trap: Building Layer 2 generically before you understand what genericity means for your product. Two approaches look similar here, and only the first is the trap:

  • Building a configurable abstract framework before you’ve shipped any concrete vertical. Plugin systems. EAV tables. “Anyone can define an ontology.” This is premature platforming and will eat your roadmap.
  • Building a concrete engine that’s parameterised over the ontology shape, deliberately, because the engine IS the product hypothesis. This is correct when the bet is “this engine works across verticals” and you can’t test the bet without building the engine.

The distinction: a parameterised engine is concrete code that takes ontology shapes (Zod schemas, prompts, reducer functions) as inputs and runs against any one of them. A configurable framework tries to abstract over the mechanism and ends up shaped by no specific use case.
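A parameterised engine in this sense can be quite small. The sketch below assumes a hypothetical `Ontology` type carrying a prompt, a parser (standing in for a Zod schema), and a reducer; `Extractor` stands in for the LLM call. The warehouse ontology and the fake extractor exist only to exercise it.

```typescript
// Hypothetical ontology shape supplied by a vertical.
type Ontology<T, S> = {
  prompt: string;
  parse: (candidate: unknown) => T | undefined; // stand-in for a schema
  reduce: (state: S, item: T) => S;
  initial: S;
};

// Stand-in for the LLM call: prompt plus raw input in, untrusted value out.
type Extractor = (prompt: string, input: string) => unknown;

// The engine: concrete code with no vertical-specific strings in it.
function runEngine<T, S>(
  ontology: Ontology<T, S>,
  extract: Extractor,
  inputs: string[],
): { state: S; rejected: number } {
  let state = ontology.initial;
  let rejected = 0;
  for (const input of inputs) {
    const parsed = ontology.parse(extract(ontology.prompt, input));
    if (parsed === undefined) {
      rejected += 1;
      continue;
    }
    state = ontology.reduce(state, parsed);
  }
  return { state, rejected };
}

// One vertical's ontology, living outside the engine.
type StockMove = { sku: string; qty: number };
const warehouseOntology: Ontology<StockMove, Record<string, number>> = {
  prompt: "Extract the SKU and quantity moved.",
  parse: (candidate) => {
    if (typeof candidate !== "object" || candidate === null) return undefined;
    const { sku, qty } = candidate as { sku?: unknown; qty?: unknown };
    return typeof sku === "string" && typeof qty === "number" && !isNaN(qty)
      ? { sku, qty }
      : undefined;
  },
  reduce: (stock, move) => ({ ...stock, [move.sku]: (stock[move.sku] ?? 0) + move.qty }),
  initial: {},
};

// Fake extractor for the example: parses "SKU:QTY" instead of calling a model.
const fakeExtract: Extractor = (_prompt, input) => {
  const [sku, qty] = input.split(":");
  return { sku, qty: Number(qty) };
};

const engineRun = runEngine(warehouseOntology, fakeExtract, ["A1:5", "A1:-2", "garbage"]);
```

Everything vertical-specific arrives as typed arguments. There is no configuration file to interpret and no plugin registry, which is what keeps this on the “concrete code” side of the distinction.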

If your product hypothesis is the engine itself, build it concretely with explicit parameterisation hooks from day one. Resist the urge to make it “any kind of input → any kind of output”; make it “raw observation stream → typed derivations via vertical-supplied reducers” and prove that on one vertical first. If your product hypothesis is the vertical and the engine is incidental, build the engine for that vertical hard-coded and extract genericity when (if) a second vertical appears.

Layer 2 is where most of your application’s async complexity lives. In an LLM-driven product, the request handler is short (validate, persist a pending row, enqueue, return); the worker is long (fetch, pre-process, extract, validate, persist, project, notify). Treat the worker code path with the same engineering discipline as the HTTP path: tenancy context, transaction boundaries, structured logging, observability, tests. The Architecture Spectrum covers this in its async-ness section. For products in this category, the worker isn’t a side-channel. It’s where the value-creating work happens, even if it’s not the most-frequent code path by request count.

If your engine has multi-step processing (pre-process → extract → enrich → project → notify, with each step long-running and fallible), you’ll want a workflow orchestrator (DBOS, Temporal, Restate) for the engine. Two reasonable approaches:

  • Adopt the orchestrator from day one if you know the engine will need orchestration eventually and you have the experience to use it well. Avoids the migration pain later. Means the engine’s pipelines run on durable workflow semantics from the start: per-step retry, compensation, resumability. DBOS in particular fits this naturally because workflow state lives in your existing Postgres, no new substrate.
  • Defer until the pain is real. Start with a simpler job runner (pg-boss, River) and migrate to the orchestrator when you’ve hand-rolled retry/idempotency twice and want it as a runtime concern. Lower up-front cost; some migration work later.

Either is valid; the first is better when you’re experienced and committed to the shape, the second when you want to minimise upfront infrastructure. For agentic engines specifically (LLM tool-calling loops, dynamic step graphs, agent delegation), adopt the orchestrator on day one. Hand-rolling tool-calling loops in a simple job runner produces ugly code with poor failure semantics, no resumability, and cost overruns that go unnoticed until the bill arrives. The orchestrator gives you per-step durability, principal-propagation discipline across hops, per-workflow cost limits, and observability for each tool call.
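To show what a per-workflow cost limit means in practice, here is a deliberately tiny tool-calling loop with a hard budget and a step cap. It is not an orchestrator and does not use any DBOS, Temporal, or Restate API; `Decision`, `Limits`, and `runAgentLoop` are hypothetical, and the loop has none of the durability the paragraph above argues for.

```typescript
// Hypothetical shapes for one agent workflow.
type Decision =
  | { kind: "call"; tool: string; estimatedCost: number }
  | { kind: "finish"; answer: string };

type Limits = { maxCost: number; maxSteps: number };

type LoopResult = {
  status: "finished" | "cost_limit" | "step_limit";
  spent: number;
  steps: string[];
  answer?: string;
};

// `decide` stands in for the model choosing the next tool call.
function runAgentLoop(decide: (steps: string[]) => Decision, limits: Limits): LoopResult {
  const steps: string[] = [];
  let spent = 0;
  while (steps.length < limits.maxSteps) {
    const decision = decide(steps);
    if (decision.kind === "finish") {
      return { status: "finished", spent, steps, answer: decision.answer };
    }
    // Check the budget before spending, so the limit is a hard ceiling.
    if (spent + decision.estimatedCost > limits.maxCost) {
      return { status: "cost_limit", spent, steps };
    }
    spent += decision.estimatedCost;
    steps.push(decision.tool); // an orchestrator would persist this step durably
  }
  return { status: "step_limit", spent, steps };
}

// A runaway planner that never finishes is stopped by the budget.
const runaway = runAgentLoop(
  () => ({ kind: "call", tool: "search", estimatedCost: 3 }),
  { maxCost: 10, maxSteps: 100 },
);
```

What this sketch lacks is the argument for the orchestrator: if the process dies after the second tool call, `steps` and `spent` are gone, and a retry starts the spend from zero.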

Layer 3: The vertical (probability of survival: ~10–30%)

This is the bet on a specific domain, audience, and use case. The ontology, the prompts, the UI copy, the workflows: these are the parts most likely to be wrong.

What lives here:

  • Ontology schemas (the domain shapes the LLM is asked to extract: event types, command shapes, entity references)
  • LLM prompts that reference the ontology
  • Domain commands (the operations that take a validated extraction and update domain tables)
  • Domain queries and read shapes (what the UI fetches)
  • UI views (whatever the user actually sees)
  • Onboarding flow, copy, marketing positioning

Design discipline: Treat this layer as disposable. Be willing to throw it all away. Don’t share code between verticals; copy and modify when you build vertical 2.

The trap: Investing in this layer like it’s permanent. Beautiful ontologies. Refined prompts. Polished UI. If the vertical is wrong, all of it is wasted, and the polish makes you reluctant to throw it away even when you should.

The two pivot points

Each boundary between layers is a place you can pivot cheaply.

Pivot 1: Same engine, new vertical

You shipped your product for vertical A, say, field service operations. Users tried it. The pull isn’t there. But the engine (your generic pipeline that ingests, processes, and serves typed records) feels real. People are excited about that part. You think the same machinery would work for, say, warehouse operations.

What you do:

  1. Delete domains/field-service/ (or move it to a stale branch).
  2. Create domains/warehouse-ops/ with new ontology schemas, new prompts, new domain commands, new UI.
  3. Layers 1 and 2 are unchanged.
  4. Ship.

Pivot cost: 2–4 weeks for a serious second attempt. Days if you’re just sketching.

Pivot is cheap because: You kept all the auth, billing, infra, and (critically) the engine machinery. The expensive parts didn’t move.

Pivot 2: Same chassis, new product category

You shipped your product. The engine premise itself didn’t hold: the input mechanism was wrong, or the extraction was too slow, or the framing missed the actual job-to-be-done. You realise the customers want something different, maybe a fast keyboard-driven CRM, or a different kind of tool entirely.

What you do:

  1. Delete engine/ (or archive).
  2. Delete domains/.
  3. Build a new product on top of the chassis.

Pivot cost: 1–3 months for a serious new product. Still vastly cheaper than starting from scratch.

Pivot is cheap because: You kept your auth provider, billing, RLS policies, deploy pipeline, observability, the multi-tenant database, all the stuff that takes weeks to set up correctly. You’re starting from “I have a working multi-tenant SaaS chassis” not “I have an empty repo.”

Pivot 3: Burn it all down

You realise SaaS is wrong entirely. Maybe you’re going to do consulting, or build a desktop app, or quit. This is not a pivot the architecture protects you from, and that’s fine. At this point you’re outside the search problem the architecture was designed for.

The discipline that makes this work

The three layers are not magic. They only work if you maintain certain disciplines. Each one marks a place where teams accidentally collapse the layers and lose the pivot-ability.

Discipline 1: Layer 3 cannot reach into Layer 1

The vertical code does not import directly from auth or billing internals. It goes through narrow facades. If you find yourself writing import { kindeClient } from "core/auth/internal" inside your vertical’s ontology code, you’ve broken the boundary.

The test: can you rm -rf domains/<your-vertical>/ and have the rest of the app compile? If yes, you’re clean.

Discipline 2: Layer 2 must be parameterised, not vertical-specific

The engine should not contain literal strings from your vertical: no "property", no "viewing", no "invoice", no "appointment". It takes ontology shapes as parameters (schemas, prompt fragments) and processes them generically.

The test: could you point the engine at a totally different ontology (e.g., medical visits, or shipping events) and have it work? If you’d need to change engine code to do that, the engine has leaked vertical concerns.

This discipline does not mean building all the genericity up front. It means: when you find yourself adding a vertical-specific case to engine code, stop and put it in the vertical instead, even if it means duplicating something. Genericity earns its keep when you have two clients of it.

Discipline 3: Layer 1 cannot know about Layers 2 or 3

The chassis doesn’t know what kind of product runs on top of it. Auth doesn’t have product-specific role concepts. Billing doesn’t have product-specific usage units. The chassis exposes generic primitives (a user, a tenant, a subscription tier, a usage counter) and Layer 2 or 3 maps domain concepts onto them.

The test: could the chassis support a totally different product idea without modification? If it has hardcoded references to product-specific concepts, you’ve leaked upward.
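The import rules in Disciplines 1 and 3 are mechanical enough to check in CI. The sketch below assumes a directory layout of core/ (chassis), engine/, and domains/ (verticals), with chassis internals under a path segment named internal. That layout is inferred from the paths used in this post, and applying the internals rule to the engine as well as the vertical is an extra assumption.

```typescript
type Layer = "chassis" | "engine" | "vertical";

// Assumed layout: core/ is Layer 1, engine/ is Layer 2, domains/ is Layer 3.
function layerOf(path: string): Layer | undefined {
  if (path.indexOf("core/") === 0) return "chassis";
  if (path.indexOf("engine/") === 0) return "engine";
  if (path.indexOf("domains/") === 0) return "vertical";
  return undefined; // third-party package or unknown location
}

const layerRank: Record<Layer, number> = { chassis: 1, engine: 2, vertical: 3 };

// Returns a description of the broken rule, or undefined if the import is fine.
function importViolation(importer: string, imported: string): string | undefined {
  const from = layerOf(importer);
  const to = layerOf(imported);
  if (from === undefined || to === undefined) return undefined;

  // Lower layers must not know about the layers above them.
  if (layerRank[to] > layerRank[from]) {
    return `${from} must not import from ${to}`;
  }
  // Higher layers reach the chassis through facades, never internals.
  if (from !== "chassis" && to === "chassis" && imported.indexOf("/internal") !== -1) {
    return `${from} must use a chassis facade, not ${imported}`;
  }
  return undefined;
}
```

You would feed this the importer and imported paths from whatever dependency graph your build tooling already produces, and fail the build on any non-undefined result.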

Discipline 4: Build Layer 2 deliberately, not reflexively

The default failure mode is “build the platform first”: abstract generic infrastructure before any vertical has shipped. Don’t.

But there’s a legitimate exception: when the engine itself is the product hypothesis. If you’re betting that a specific kind of processing pipeline works across verticals (and that bet is the actual product), you have to build the engine to test it. In that case:

  • Build it concretely, parameterised over ontology shape, for one vertical first.
  • Resist abstracting beyond what one vertical needs.
  • Treat the parameterisation as a hypothesis-test, not as a configurable framework.

The distinction that matters: are you building Layer 2 as infrastructure for product features (defer until needed), or as the product itself (build it now, but build it concretely)? Most teams are in the first case and should defer. Some teams (particularly LLM-pipeline products) are in the second and should not defer, but should still build concretely rather than abstractly.

Discipline 5: Resist over-typing in Layer 3

Especially when an LLM is the producer. The temptation to lock down “the real ontology” with strict discriminated unions creates schema capture (the producer manufactures whatever your schema asks for, regardless of whether reality is shaped that way). Keep Layer 3 schemas loose until you have evidence (frequency in real captures, predictive value, downstream operations that branch on the case) that a tighter shape earns its keep.
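One way to gather the “frequency in real captures” evidence is to keep the schema loose and count what actually shows up. A minimal sketch, assuming a hypothetical `LooseEvent` shape with a free-text kind, and a promotion threshold that you would pick yourself:

```typescript
// Loose on purpose: kind is free text, attributes are unconstrained.
type LooseEvent = { kind: string; attributes: Record<string, unknown> };

// Count how often each kind appears in real captures.
function kindCounts(captures: LooseEvent[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const capture of captures) {
    counts[capture.kind] = (counts[capture.kind] ?? 0) + 1;
  }
  return counts;
}

// Kinds frequent enough to be worth a tighter schema. The threshold is an
// assumption; frequency is only one of the kinds of evidence you might use.
function promotionCandidates(captures: LooseEvent[], minShare: number): string[] {
  if (captures.length === 0) return [];
  const counts = kindCounts(captures);
  return Object.keys(counts).filter(
    (kind) => (counts[kind] ?? 0) / captures.length >= minShare,
  );
}
```

A kind that never crosses the threshold stays a loose record, and nothing downstream has been built on a case the producer invented to satisfy the schema.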

Anti-patterns and how to recognise them

Anti-pattern: The premature platform

You build Layer 2 with elaborate genericity before any concrete vertical exists. Configurable everything. Plugin architecture. Six abstractions deep. Three months later you’ve shipped nothing and you’re not sure your product idea works.

Fix: Build concretely, parameterised over the ontology shape, for one vertical. Resist abstracting beyond what that one vertical needs. The genericity that earns its keep is “this engine takes a Zod schema and a set of reducer functions”; the genericity that doesn’t is “this engine takes a JSON configuration that describes the schema.” The first is concrete code with type-safe inputs; the second is a framework with a configuration DSL. Build the first; resist the second.

Anti-pattern: The leaky vertical

Your vertical reaches into auth to add custom user fields, into billing to model vertical-specific subscription tiers, and into the engine to hardcode domain-specific extraction logic. Now pivoting means rewriting all three layers.

Fix: Push the special cases up, not down. Add a generic primitive to the lower layer (e.g., “tenant metadata blob”) and let the vertical use it, rather than adding vertical-specific fields to the auth schema.
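A sketch of the “tenant metadata blob” idea, under the assumption that the chassis stores an opaque key-value blob per tenant and the vertical namespaces its own settings inside it. `TenantRecord`, `FieldServiceSettings`, and the key name are hypothetical.

```typescript
// Chassis side (hypothetical shape): the metadata blob is opaque to Layer 1.
type TenantRecord = { id: string; metadata: Record<string, unknown> };

// Vertical side: the vertical owns the meaning of its own keys.
type FieldServiceSettings = { defaultRegion: string; maxTechnicians: number };

const SETTINGS_KEY = "fieldService"; // hypothetical namespacing convention

function writeSettings(tenant: TenantRecord, settings: FieldServiceSettings): TenantRecord {
  return { ...tenant, metadata: { ...tenant.metadata, [SETTINGS_KEY]: settings } };
}

// The vertical validates on read, because the chassis guarantees nothing
// about the blob's contents.
function readSettings(tenant: TenantRecord): FieldServiceSettings | undefined {
  const raw = tenant.metadata[SETTINGS_KEY];
  if (typeof raw !== "object" || raw === null) return undefined;
  const { defaultRegion, maxTechnicians } = raw as {
    defaultRegion?: unknown;
    maxTechnicians?: unknown;
  };
  if (typeof defaultRegion !== "string" || typeof maxTechnicians !== "number") return undefined;
  return { defaultRegion, maxTechnicians };
}
```

Deleting the vertical leaves a stale key in the blob and nothing else: no column to drop, no auth schema to migrate.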

Anti-pattern: The eternal vertical

You shipped vertical 1, it sort of works, and you’re three years in still adding features to it without ever testing the pivot. Layers 2 and 3 have grown together. Now you couldn’t pivot if you wanted to.

Fix: Periodically (every 6–12 months early on) ask: “could I throw away domains/X/ and ship a new vertical in 4 weeks?” If the answer is no, find what’s coupling and fix it.

Anti-pattern: The shadow vertical in the engine

You needed something specific for your vertical, and the engine was the convenient place to put it. Now the engine has if (event.type === 'CustomerCreated') branches scattered through it. The engine is no longer generic.

Fix: When the engine starts knowing about specific event types, that’s a smell. Either it’s a shape that should be expressed in the ontology (and the engine just dispatches), or it’s domain logic that belongs in the vertical.

Anti-pattern: Underbuilding the chassis

You skipped over Layer 1 because “auth and billing are boring.” Now you have homegrown half-broken auth, no real billing, no observability, and every customer issue requires you to SSH into the server. You’re spending all your time on the chassis instead of the product.

Fix: Buy. Resend, Kinde/Clerk, Stripe, Sentry. The chassis is the place where boring is the highest virtue.

When the batting stance is the wrong stance

This whole architecture is shaped for iterative search: a founder who knows the domain is uncertain and is planning to pivot. It’s not always the right shape.

  • You have strong domain conviction and a single product hypothesis you’re committed to for 3+ years. Build a regular monolith. The pivot-readiness is paying for an option you won’t exercise.
  • You’re doing classic CRUD with no generic engine to speak of. There’s no Layer 2 to extract. Just build the app.
  • You have funding and a team and you’re scaling, not searching. The strategic concerns are different. Read books about scaling, not about pivoting.
  • You’re doing a single-purpose tool (a calculator, a converter, a content site). The three-layer model is overkill.

The batting stance is a strategy for the part of a startup’s life where you don’t know what you’re building yet. When you know, you trade pivot-readiness for execution speed. That’s a good trade, but only when you’ve actually figured out what you’re building.

Summary

  • Three layers, two pivot points, one chassis you never throw away.
  • Layer 1 (chassis) is bought, boring, and survives every pivot. Don’t underbuild it.
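  • Layer 2 (engine) is generic in the application layer (types, schemas, prompts), not in Postgres. Don’t build it before you’ve shipped Layer 3 once, unless the engine itself is the product hypothesis; in that case build it concretely, for one vertical first.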
  • Layer 3 (vertical) is disposable. Don’t over-invest in it. Don’t over-type its ontology, especially with an LLM in the loop.
  • The discipline is at the boundaries: each layer should be deletable without breaking the layers below.
  • This stance is for founders who are still searching. When you’ve found PMF, the trade-offs change.
// Context & author

You are reading Field Notes by Auxil

Auxil is an independent software systems consultancy and active product factory operated by veteran software practitioner Tim Farland alongside a vetted peer network of senior specialists. Based on Waiheke Island, Auckland, we design, build, and audit high-stakes SaaS systems and production-grade AI pipelines globally.

Tim Farland

Operator / Architect / Engineer