Architecture

Aver implements a three-layer acceptance testing architecture inspired by Dave Farley’s four-layer model and the Screenplay pattern from Serenity.js.

The name “aver” means “to declare with confidence” — your tests aver that the system behaves as intended.

The three layers

Farley’s model separates acceptance tests into four layers: Tests, DSL, Protocol Driver, and the System Under Test. Aver gives you three of those — the fourth is your system itself.

Test (verify)  →  Domain (what)  →  Adapter (how)  →  [your system]

Test composes domain operations into scenarios. Tests never import adapters — they speak only domain language. The suite resolves adapters from configuration at runtime.

Domain declares the vocabulary of a bounded context — what the system does, in business language. No implementation details. Just names and type signatures for actions, queries, and assertions.

Adapter binds the vocabulary to a real implementation via a protocol. One adapter per interaction mode: unit (direct code interfaces), HTTP (API), Playwright (browser). The adapt() function enforces at compile time that every domain operation has a handler.

Why three layers, not two

Page objects and test helpers give you two layers: tests and implementation. This works until you need the same behavior verified at multiple levels. With two layers, you either duplicate test logic per level or couple tests to a specific protocol.

The domain layer is the missing abstraction. It sits between your tests and your implementation, giving both a shared language. Tests compose domain operations without knowing the protocol. Adapters implement domain operations without knowing the tests. The domain is the contract that lets both sides evolve independently.

Domain vocabulary

Three concepts make up the domain language:

Concept	Purpose	Returns
Action	Do something (side effect)	void
Query	Read something	typed data
Assertion	Check something	pass/fail

Actions perform operations. Queries extract data. Assertions verify expectations.

Assertions could be expressed as query + expect, but they earn their place because they express intent in domain language, enable protocol-optimized checks (Playwright’s auto-waiting vs manual polling), and produce better traces on failure.

Here’s what a domain definition looks like:

import { defineDomain, action, query, assertion } from '@averspec/core'

export const taskBoard = defineDomain({
  name: 'task-board',
  actions: {
    createTask: action<{ title: string }>(),
    moveTask: action<{ title: string; status: string }>(),
  },
  queries: {
    taskDetails: query<{ title: string }, Task>(),
  },
  assertions: {
    taskInStatus: assertion<{ title: string; status: string }>(),
  },
})

from averspec import domain, action, query, assertion

@domain("task-board")
class TaskBoard:
    create_task = action(str)        # title
    move_task = action(str, str)     # title, status
    task_details = query(str, Task)  # title -> Task
    task_in_status = assertion(str, str)  # title, status

class TaskBoard < Aver::Domain
  domain_name "task-board"

  action :create_task    # title
  action :move_task      # title, status
  query  :task_details   # title -> Task
  assertion :task_in_status  # title, status
end

type TaskBoardDomain struct {
    CreateTask   aver.Action[CreateTaskParams]
    MoveTask     aver.Action[MoveTaskParams]
    TaskDetails  aver.Query[TaskDetailsParams, Task]
    TaskInStatus aver.Assertion[TaskInStatusParams]
}

var TaskBoard = aver.NewDomain[TaskBoardDomain]("task-board")

aver_domain! {
    name: "task-board",
    struct TaskBoard {
        create_task: Action<CreateTaskParams>,
        move_task: Action<MoveTaskParams>,
        task_details: Query<TaskDetailsParams, Task>,
        task_in_status: Assertion<TaskInStatusParams>,
    }
}

val taskBoard = domain("task-board") {
    action<CreateTaskParams>("createTask")
    action<MoveTaskParams>("moveTask")
    query<TaskDetailsParams, Task>("taskDetails")
    assertion<TaskInStatusParams>("taskInStatus")
}

Given/When/Then

Tests can use given, when, and then as narrative aliases for act and assert. They call the same adapter handlers — the difference is in step labeling:

Test steps (unit):
  [PASS] GIVEN  ShoppingCart.addItem({"name":"Widget","qty":2})  12ms
  [PASS] WHEN   ShoppingCart.checkout()  45ms
  [PASS] THEN   ShoppingCart.totalCharged({"amount":35})  2ms

Protocols

A protocol manages session lifecycle. It creates a context in setup() that every adapter handler receives as its first argument, and cleans up in teardown().

For the unit protocol, the context is your domain object. For playwright, it’s a Playwright Page. For http, it’s an HTTP client pointed at a running server.

Protocols can also hook into test lifecycle events: onTestStart runs before each test body, onTestFail runs on failure and can return attachments (screenshots, logs), and onTestEnd runs after each test for cleanup.

Aver ships three protocols:

Protocol	Context	Use Case
`unit(factory)`	Your object	Direct interface testing
`http({ baseUrl })`	HTTP client	API-level testing
`playwright()`	Playwright `Page`	Browser UI testing

The unit protocol is built into core (zero dependencies). HTTP and Playwright are separate packages.

You can write custom protocols for any interaction mode — WebSocket, gRPC, CLI, whatever your system exposes.

Protocol composition with `withFixture`

withFixture(protocol, { before?, after? }) wraps a protocol with setup/teardown hooks — useful when an adapter needs external infrastructure (a database, a server process) that sits outside the protocol’s own context:

import { withFixture } from '@averspec/core'
import { http } from '@averspec/protocol-http'

const httpWithServer = withFixture(http({ baseUrl: 'http://localhost:3000' }), {
  before: () => startServer(),
  after: () => stopServer(),
})

from averspec import with_fixture
from averspec.protocol_http import http

http_with_server = with_fixture(
    http(base_url="http://localhost:3000"),
    before=start_server,
    after=stop_server,
)

http_with_server = Aver.with_fixture(
  Aver::Protocol::Http.new(base_url: "http://localhost:3000"),
  before: -> { start_server },
  after: -> { stop_server }
)

httpWithServer := aver.WithFixture(
    http.New("http://localhost:3000"),
    aver.FixtureOpts{
        Before: func() error { return startServer() },
        After:  func() error { return stopServer() },
    },
)

let http_with_server = with_fixture(
    http("http://localhost:3000"),
    |_| start_server(),
    |_| stop_server(),
);

val httpWithServer = withFixture(
    http(baseUrl = "http://localhost:3000"),
    before = { startServer() },
    after = { stopServer() },
)

before runs before protocol.setup(). after runs after protocol.teardown(), even if teardown throws. All other protocol behavior (lifecycle hooks, telemetry) passes through unchanged.

Multi-adapter resolution

When you register multiple adapters for the same domain, every test runs against all of them automatically:

 ✓ add item to cart [unit]           1ms
 ✓ add item to cart [http]          12ms
 ✓ add item to cart [playwright]   280ms

Each test gets an isolated protocol context. Test names are parameterized with the protocol name. When two adapters disagree on a behavior, that disagreement surfaces a real bug.

You can filter adapters at runtime with AVER_ADAPTER=unit to run only the fast tests during development.

Cross-domain composition with named-config `suite()`

When a test scenario spans multiple bounded contexts, suite() accepts a named config object instead of a single domain. Each key maps to a [domain, adapter] tuple, and the test callback receives a context with a namespace per key:

const { test } = suite({
  cart: [shoppingCart, cartAdapter],
  payments: [paymentGateway, paymentAdapter],
})

test('checkout charges the card', async ({ cart, payments }) => {
  await cart.given.addItem({ name: 'Widget', qty: 1 })
  await cart.when.checkout()
  await payments.then.chargeRecorded({ amount: 25 })
})

s = suite({
    "cart": (shopping_cart, cart_adapter),
    "payments": (payment_gateway, payment_adapter),
})

@s.test
def test_checkout_charges_the_card(ctx):
    ctx.cart.given.add_item(name="Widget", qty=1)
    ctx.cart.when.checkout()
    ctx.payments.then.charge_recorded(amount=25)

s = Aver.suite(
  cart: [shopping_cart, cart_adapter],
  payments: [payment_gateway, payment_adapter]
)

s.test "checkout charges the card" do |ctx|
  ctx.cart.given.add_item(name: "Widget", qty: 1)
  ctx.cart.when.checkout
  ctx.payments.then.charge_recorded(amount: 25)
end

s := aver.ComposeSuite(map[string]aver.DomainAdapter{
    "cart":     {ShoppingCart, cartAdapter},
    "payments": {PaymentGateway, paymentAdapter},
})

s.Run(t, "checkout charges the card", func(ctx aver.ComposedCtx) {
    aver.Given(ctx.NS("cart"), ShoppingCart.AddItem, AddItemParams{Name: "Widget", Qty: 1})
    aver.When(ctx.NS("cart"), ShoppingCart.Checkout, CheckoutParams{})
    aver.Then(ctx.NS("payments"), PaymentGateway.ChargeRecorded, ChargeParams{Amount: 25})
})

let s = suite_composed! {
    cart: (shopping_cart, cart_adapter),
    payments: (payment_gateway, payment_adapter),
};

s.run("checkout charges the card", |ctx| {
    ctx.ns("cart").given(&shopping_cart.add_item, AddItemParams { name: "Widget", qty: 1 });
    ctx.ns("cart").when(&shopping_cart.checkout, ());
    ctx.ns("payments").then(&payment_gateway.charge_recorded, ChargeParams { amount: 25 });
});

val s = suite(
    "cart" to (shoppingCart to cartAdapter),
    "payments" to (paymentGateway to paymentAdapter),
)

@Test fun `checkout charges the card`() = s.run { ctx ->
    ctx.ns("cart").Given(shoppingCart.addItem, AddItemParams("Widget", 1))
    ctx.ns("cart").When(shoppingCart.checkout)
    ctx.ns("payments").Then(paymentGateway.chargeRecorded, ChargeParams(25))
}

Each domain gets its own protocol lifecycle. The shared trace() function on the context collects trace entries from all domains in execution order.

Domain extensions

Domains can be extended with additional vocabulary for protocol-specific concerns:

const taskBoardUI = taskBoard.extend('task-board-ui', {
  assertions: {
    showsLoadingSpinner: assertion(),
  },
})

@TaskBoard.extend("task-board-ui")
class TaskBoardUI:
    shows_loading_spinner = assertion()

class TaskBoardUI < TaskBoard
  domain_name "task-board-ui"
  assertion :shows_loading_spinner
end

type TaskBoardUIDomain struct {
    TaskBoardDomain
    ShowsLoadingSpinner aver.Assertion[struct{}]
}

var TaskBoardUI = aver.ExtendDomain[TaskBoardUIDomain](TaskBoard, "task-board-ui")

aver_extend! {
    base: TaskBoard,
    name: "task-board-ui",
    struct TaskBoardUI {
        shows_loading_spinner: Assertion<()>,
    }
}

val taskBoardUI = taskBoard.extend("task-board-ui") {
    assertion<Unit>("showsLoadingSpinner")
}

Extensions inherit all vocabulary from the parent. An adapter for the extended domain must implement everything from both the parent and the extension. This lets you write shared behavioral tests against the base domain, and protocol-specific tests against the extension.

Error reporting

On failure, Aver shows the test steps — every domain operation leading to the error:

FAIL  shopping-cart.spec.ts > full checkout flow [unit]

Test steps (unit):
  [PASS] GIVEN  ShoppingCart.addItem({"name":"Widget","qty":2})  12ms
  [PASS] THEN   ShoppingCart.hasItems({"count":1})  1ms
  [PASS] QUERY  ShoppingCart.cartTotal()  0ms
  [FAIL] WHEN   ShoppingCart.checkout() — Expected order to be confirmed  45ms

  Expected order to be confirmed

The trace is recorded automatically as the suite proxies domain calls through the adapter. Each step is logged as it executes, so failures show the full sequence leading to the error.

Telemetry verification

Observability data is made powerful by context — the relationships between spans matter more than any individual span. A checkout span alone tells you little. A checkout span causally connected to a payment span and a fulfillment span, all sharing an order ID within the same trace, tells you the whole story. When those relational seams break — a refactor drops the trace propagation, a renamed attribute disconnects two spans — your dashboards go dark and your agents can’t validate what they shipped.

Aver treats these relationships as a testable contract. Domains declare expected OTel spans on operations, and the framework verifies both the spans and their connections.

Two verification layers run automatically:

Per-step: After each operation, verify a matching span exists with the expected name and attributes
End-of-test correlation: After all steps, verify that correlated steps (shared attribute key + value) are causally connected — same traceId or span links

This catches instrumentation bugs that behavioral tests miss: the API returns the right data, but the spans are missing, misnamed, or disconnected. The @averspec/telemetry package extends this to production — extract a behavioral contract from passing tests, then verify that production traces conform to the same contract. See the telemetry guide for details.

Design principles

Zero runtime dependencies in core. Protocols are separate packages.
Type-safe — the type system enforces that adapters implement every domain item.
Adapter authors receive ready-to-use context — protocols handle lifecycle.
Tests are protocol-agnostic — they import domains, never adapters.

Economics

The cost model determines when Aver earns its keep.

What grows with what: Vocabulary grows with domain surface area — the number of distinct behaviors. Tests grow with scenarios — the number of ways those behaviors compose. Surface area grows slowly; scenarios grow fast. Five domain operations can support fifty tests.

Cost per operation: One vocabulary entry in the domain, plus one handler per adapter. At one adapter, this is comparable to extracting a page object method. At three adapters, it’s a 1:3 ratio — but each handler is isolated.

The breakeven: With a single adapter, Aver’s overhead roughly equals well-structured page objects. The cross-adapter benefit kicks in at the second adapter. By the time you have two adapters, the bugs caught by cross-level verification exceed the cost of maintaining two sets of handlers.

AI agent integration

Aver’s domain layer naturally separates what an AI agent is good at (generating code, iterating until tests pass) from what requires human judgment (deciding what the system should do, naming the vocabulary). The human defines the outer loop — domain specs in business language. The agent works the inner loop — implementing code until aver run passes.

See AI-Assisted Testing for the full rationale and setup.