White Paper
Semantic Surface Architecture
A Practitioner's Methodology for Human-AI Collaboration in Software Development
Try the SSA Vocabulary Workshop
An AI skill that walks you through naming your systems in five minutes. Works with Claude Code, Codex, and ChatGPT.
Abstract
Semantic Surface Architecture™ (SSA) is a methodology for organizing software to improve collaboration between human developers and large language model coding assistants, while preserving human control as more of a codebase becomes AI-authored. Developed from practical necessity rather than academic theory, SSA treats meaning as the primary unit of architecture and encodes it through semantic naming and actor-centered organization that stays coherent across long development sessions. Each actor perspective is represented as an explicit Surface, and the surrounding code is structured with a clear hierarchy of systems, components, actions, and outcomes so that intent, ownership, and permission remain legible. In effect, the codebase becomes an interpretability and governance layer, making it easier to retain context, debug with clarity, keep AI suggestions consistent, and audit or constrain AI-initiated behavior. SSA emerged while building production systems including Popdot AI™ and Sleep Around Points™, where conventional service and controller patterns broke down under mixed human and agent workflows. Since the framework’s initial development, multiple independent groups have converged on similar approaches, and emerging empirical research on naming conventions in AI-assisted development provides early support for its core claims. Rigorous empirical validation of SSA itself remains an open area for future work.
1. Introduction
1.1 Surface Tension
There is a surface tension now in software development, and most practitioners can feel it even if they have not named it.
It exists at the boundary where human understanding meets artificial intelligence, the interface where two fundamentally different kinds of minds attempt to collaborate over a shared codebase. Human minds and AI minds process code differently, attend to different signals, and fail in different ways. The conventions that served us when code was written by humans for humans no longer suffice when code must speak to minds we are only beginning to understand.
Surface tension in physics is not a flaw. It is what allows water striders to walk on water, what shapes raindrops into spheres, what makes bubbles possible. It is a property of the interface between two substances that can be understood and leveraged. This paper proposes that we can do the same with the interface between human and AI comprehension of code. Not by eliminating the tension, but by designing for it.
Semantic Surface Architecture™ (SSA) is one such design. It is a framework for organizing software systems so that the code itself communicates meaning through naming conventions, actor-centric organization, and structural patterns that encode domain knowledge directly into the architecture. The goal is to reduce dependence on external documentation and context injection by making the codebase legible to both human developers and the large language models that increasingly collaborate with them.
1.2 A Problem Born of Practice
SSA did not begin as a research project. It began as a survival strategy.
The first system that forced this architecture into existence was Popdot AI™, an agentic domain rental platform where humans list domains and AI agents discover, rent, and deploy content to subdomains through programmatic interfaces. Popdot AI needed to support four fundamentally different kinds of actors: human renters browsing a marketplace, domain owners managing portfolios, autonomous AI agents conducting transactions via cryptographic protocols, and platform administrators governing the whole apparatus. The system eventually grew to over a dozen named Systems, multiple Surfaces, dozens of named Views, and a dual-rail payment architecture handling both traditional currency and programmatic payments. When I tried to reason about these flows using conventional layers and services, the architecture collapsed into a tangle of endpoints and controllers that I could not hold in my head.
The Surface abstraction and the naming conventions that became SSA were my response to that pressure. I realized I had to separate “who is interacting” from “what capability they are invoking” if I wanted to understand and extend the larger apparatus I was building. Once that separation existed, and once it was encoded directly in the codebase, I found I could manage far more complexity as a mostly solo developer than I had ever handled before.
There was also, I should admit, a less technical influence. I had recently moved to Celebration, Florida, and was spending considerable time at Walt Disney World®. Disney brands everything: every ride, every land, every themed food experience. They even invent fictional businesses to operate these fantasy experiences. The Imagineers understand that naming and branding transform infrastructure into experience. Galaxy’s Edge, set in Black Spire Outpost on the planet Batuu, is not just called the “Star Wars Area.” The Haunted Mansion is not “Ghost Ride B.” The names carry the meaning, and they do so with personality.

I found myself thinking: if I have to live inside this codebase every day, I might as well enjoy it. I might as well name each system something evocative, something that makes me want to open the file. The realization that this aesthetic pleasure also had functional benefits for AI collaboration came later. The initial impulse was simpler than methodology: the instinct that the names and brands of specific systems should make you feel something, because you remember what you feel.

This impulse is central to SSA’s philosophy. The framework does not prescribe a fixed vocabulary. It prescribes the act of naming with care. When an AI agent working on a music production platform suggests calling the mixing engine “Harmonic” instead of “AudioProcessingService,” or when a developer building a logistics platform names the routing system “Compass” instead of “RouteCalculator,” they are practicing SSA regardless of whether they use the specific words from this paper. The joy of the name is part of the value. If you enjoy saying it, you will remember it. If you remember it, you will use it consistently. If you use it consistently, the AI learns it.
Sleep Around Points™, a Disney Vacation Club points rental marketplace, came next. The name itself is DVC insider slang: owners “sleep around” by using their points to stay at resorts beyond their home resort, hopping from the Polynesian to the Grand Floridian to Animal Kingdom Lodge. Fittingly, it was the Disney connection that closed the circle. I did not design a new architecture for it. I reused the same Surface structure and System vocabulary that had emerged on Popdot AI, refining them in a different domain with different financial and regulatory constraints. The platform grew to over a dozen named Systems, multiple Surfaces, dozens of API endpoints, and a complete booking state machine with automated deadline enforcement and tiered refund policies. By that point it was clear that these ideas were no longer ad hoc survival tactics for a single project. They were an emerging methodology.
I am not a computer scientist or software architect by training, nor a researcher in artificial intelligence. I am someone attempting to build complex software systems with the AI-assisted development tools that have emerged since 2023. I have found that the existing frameworks for thinking about software architecture are inadequate for this new mode of work. Our tools have changed. The architecture has not.
1.3 The Naming Hypothesis
When I began building these platforms with AI coding assistants, primarily Anthropic’s Claude, I encountered a pattern of failure that I suspect many practitioners have experienced but few have formally articulated. The AI would understand the system clearly for the first hour. By the third hour it would begin making suggestions that contradicted earlier decisions. By the next session it had effectively forgotten the architecture entirely, and I had to re-explain the system before productive work could resume.
The conventional diagnosis points to context window limitations, the finite amount of text a model can hold in working memory at any given moment. The conventional solutions involve retrieval-augmented generation, memory systems, specification documents, and increasingly elaborate tooling to inject the right context at the right moment.
I propose a different diagnosis, and as a result, a different solution.
The problem is not merely that LLMs forget. The problem is that our architectural conventions, developed over decades for human-to-human communication and machine execution, carry insufficient semantic density for human-to-AI collaboration. The names we give things, the way we organize code, the structures we build: these were designed for an audience that no longer exists in isolation. The audience now includes minds that process language differently than ours, that have different strengths and different failure modes, and that derive understanding from patterns in naming with a sensitivity most human developers barely notice.
Consider the difference between these two function names:
createBookingTransaction()
Ledger.Booking.create()
To a human developer familiar with a codebase, both are interpretable. But for an LLM encountering this code mid-session without full architectural context, the second name carries dramatically more information. Ledger invokes accounting, record-keeping, financial tracking. Booking specifies the domain entity. The hierarchical structure implies a system-to-component-to-action pattern that can be extrapolated across the codebase.
Or consider Popdot AI’s agent identity system. The call Sigil.Mandate.create() communicates, through naming alone, that this operation belongs to the Sigil system (evoking seals, marks, credentials), specifically its Mandate component (evoking authorization, delegation of power), performing a creation action. An LLM encountering this call for the first time can infer that it creates an authorization token for an agent within a credentialing system. The conventional alternative, agentAuthService.createSpendingAuthorization(), is descriptive but carries no metaphorical resonance and implies no structural relationship to the broader codebase.
The name itself is the documentation. The name itself is the context.
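As a concrete sketch of the difference, the hierarchical form can be encoded as a nested namespace at the call site. The names Ledger and Booking come from the paper's example; everything else here (the fields, the IDs, the signatures) is hypothetical illustration, not code from either production system.

```typescript
// Flat, conventional style: the name describes mechanics only.
function createBookingTransaction(guestId: string, amount: number) {
  return { id: "txn_1", guestId, amount };
}

// SSA style: Ledger (system) -> Booking (component) -> create (action).
// The path "Ledger.Booking.create" is readable in stack traces, logs,
// and diffs without any surrounding documentation.
const Ledger = {
  Booking: {
    create(guestId: string, amount: number) {
      return { id: "bkg_1", system: "Ledger", guestId, amount };
    },
  },
};

const txn = createBookingTransaction("guest_42", 2500);
const booking = Ledger.Booking.create("guest_42", 2500);
```

Both calls do the same work; the second one carries its architectural address with it wherever it appears.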
This observation, that semantic naming conventions could reduce dependence on explicit context injection, led to the development of what I am calling Semantic Surface Architecture.
1.4 Why “Surface”?
Traditional software architectures organize code by technical concern (frontend/backend/database), by feature domain (users/orders/payments), or by design pattern (MVC, hexagonal, microservices). These organizational schemes optimize for separation of concerns, testability, and deployment independence, properties that matter when humans are writing code for machines to execute.
SSA introduces a different primary organizing principle: the actor perspective.
A Surface, in SSA terminology, is an entry point into the system defined by who is asking. Not what technical layer they are touching, but what role they occupy and what subset of the system they should perceive. In Popdot AI, where the architecture originated, four Surfaces emerged:
| Surface | Actor | Metaphor | Entry Point |
|---|---|---|---|
| Market | Human renters | Marketplace | public marketplace |
| Helm | Domain owners | Ship captain | authenticated owner dashboard |
| Wire | AI agents | Electrical connection | versioned API |
| Cortex | Administrators | Neural center | internal admin panel |
Sleep Around Points later adopted the same pattern with Market, Helm, Tower, and Wire. Market, Helm, and Wire carried across domains without modification; the administrative surface was renamed from Cortex to Tower, trading the neural metaphor for one of oversight.
The same underlying systems (authentication, payments, messaging) exist across Surfaces, but each Surface exposes different components. On Popdot AI, human renters interact with Vault through the Market surface (charges and wallet deposits) while AI agents interact with Vault through the Wire surface (programmatic payments via a dedicated protocol). Cortex administrators can see both payment rails and override either. The permission model is embedded in the architecture itself, not layered on top as an access control list.
More significantly for AI-assisted development: when an LLM reads Market.Bazaar.Grid.QuickRent, it understands not just the action (quick rent), but the actor (a human renter), the context (browsing the marketplace grid), and the constraints (Market-level permissions), without requiring additional documentation.
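One way to picture the permission model described above is a sketch in which each Surface object exposes only the Vault components its actor may use. The names (Vault, Charge, Payout, Market, Helm) follow the paper's examples; the wiring is a hypothetical illustration, not the actual Popdot AI implementation.

```typescript
// Shared underlying system with two components.
const Vault = {
  Charge: { create: (amount: number) => ({ kind: "charge", amount }) },
  Payout: { create: (amount: number) => ({ kind: "payout", amount }) },
};

// Market (human renters): charges only. Payout simply does not exist
// on this surface, so there is nothing to guard with an ACL.
const Market = { Vault: { Charge: Vault.Charge } };

// Helm (domain owners): payouts are part of the owner's perspective.
const Helm = { Vault: { Charge: Vault.Charge, Payout: Vault.Payout } };

const payout = Helm.Vault.Payout.create(10000);
// Referencing Market.Vault.Payout would be a compile-time error in
// TypeScript: the Market surface has no such property.
```

The boundary is structural: an LLM (or a reviewer) reading the Market surface never encounters payout capability at all.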
1.5 The Context Crisis in AI-Assisted Development
The practice of expressing intent in natural language and allowing AI to generate implementation, popularized as “vibe coding” by AI researcher Andrej Karpathy in early 2025 [10], has gained extraordinary adoption. As of 2026, studies suggest that over 90% of US-based developers use AI coding tools daily [3]. The practice is no longer experimental. It is the baseline.
But development at scale with AI assistants has revealed a structural problem that existing methodologies do not adequately address: context drift. Context drift occurs when an AI assistant’s understanding of a system gradually diverges from the system’s actual architecture over the course of extended development. The assistant begins making suggestions that are locally coherent but globally inconsistent. It proposes patterns that contradict established conventions. It forgets constraints that were clearly articulated in earlier sessions.
The current solutions fall into three categories: memory systems that store and retrieve relevant context, specification documents that are injected into each session, and retrieval architectures that fetch relevant documentation on demand. These solutions share a common assumption: that context must be provided to the AI at the moment of need.
SSA challenges this assumption. What if context could be encoded in the system itself, in the names, the structure, the vocabulary, such that the AI derives understanding from the code rather than requiring external injection?
1.6 Scope and Limitations
This paper is a practitioner’s report from the field. It documents an architectural approach that emerged from necessity and that appears to address real problems in AI-assisted development.
The claims made here are provisional. The methodology has been applied in depth to two production systems, Popdot AI and Sleep Around Points, that share the same Surface structure and evolving System vocabulary. No controlled experiments have been conducted comparing SSA to alternative approaches. The improvements observed in context retention, debugging clarity, and development velocity are subjective assessments rather than measured outcomes.
Since this paper was first drafted in mid-2025, the landscape has shifted in ways that strengthen the case for the underlying problem while also introducing related work. Multiple independent groups have published frameworks addressing similar challenges, including Codified Context [22], OutcomeOps [4], and the Semantic Control Plane [14], suggesting that the need for semantically organized codebases is increasingly recognized. Empirical research on naming conventions in AI code comprehension has begun to provide evidence that naming choices measurably affect model performance. These developments are discussed in Section 5.
I present this work for three reasons. First, the pace of change in AI-assisted development is rapid, and waiting for rigorous validation may mean missing the window in which practitioners need guidance. Second, I believe other practitioners are encountering similar problems and may benefit from a named framework for thinking about solutions. Third, I hope that preliminary documentation of this approach will invite collaboration, critique, and formal research from people better qualified to evaluate and extend these ideas.
1.7 Structure of This Paper
The remainder of this paper proceeds as follows. Section 2 provides background on context windows, context rot, and the emerging discipline of context engineering, and argues for a shift from context injection to semantic encoding. Section 3 presents the core SSA framework, including its five-layer taxonomy and design principles. Section 4 describes the application of SSA to two production codebases, including naming conventions, file structures, and observed effects on AI collaboration. Section 5 situates SSA within related work, including Domain-Driven Design, parallel frameworks for AI-readable architecture, and emerging empirical evidence. Section 6 offers practical guidance for adopting SSA, including the open-source toolkit and the agent-first adoption model. Section 7 discusses limitations and directions for future research. Section 8 concludes with reflections on what it means to design software for an audience that now includes non-human minds.
2. Background: Context as Finite Resource
2.1 How LLMs See Code
To understand why existing architectural paradigms strain under AI-assisted development, one must first understand how large language models process information.
An LLM does not “know” a codebase the way a human developer does. It has no persistent memory of your project between sessions. What it has is a context window, a fixed-size buffer of text that constitutes its entire working reality at any given moment. Everything the model can reason about must fit within this window: the system prompt, the conversation history, the code under discussion, any retrieved documentation, and the user’s current request. Andrej Karpathy has offered a useful analogy [10]: think of an LLM as a CPU, and its context window as RAM. The context window is not storage; it is working memory. When the window fills, something must be evicted. When a new session begins, the RAM is cleared entirely.
Current frontier models have expanded this window considerably. As of March 2026, Claude Opus 4.6 and Sonnet 4.6 support one million tokens at standard pricing, with no long-context surcharge. Claude Code on Max, Team, and Enterprise plans receives this full window by default. OpenAI’s models offer comparable scales. One might assume this capacity sufficient for most development tasks. This assumption is incorrect, for reasons both quantitative and qualitative.
A moderately complex application routinely exceeds 500,000 tokens when accounting for source files, configuration, tests, and documentation. Enterprise codebases can reach several million. Even with million-token windows, the mathematics remain challenging: the model sees at most a fraction of any non-trivial system at any given moment, reasoning about the whole while seeing a part.
But the quantitative constraint is not the most interesting problem. The qualitative one is.
2.2 Context Rot and the Attention Paradox
Researchers have documented a phenomenon called context rot: as the number of tokens in the context window increases, the model’s ability to accurately attend to information within that context decreases. This is not a bug but an inherent property of the attention mechanism underlying transformer architectures. Attention is a limited resource. When a model must attend to 200,000 tokens, its attention to any particular token is necessarily diluted compared to a scenario with 2,000 tokens.
The practical consequence is counterintuitive: adding more context can make the model worse at utilizing context. Stuffing the window with everything that might be relevant produces inferior results to carefully curating a smaller, more focused context.
This finding has profound implications. The naive approach, giving the AI access to the whole codebase, is not merely impractical due to window size limitations; it would be counterproductive even if windows were unlimited. Context must be treated as a finite resource with diminishing marginal returns.
There is also a temporal dimension. When a development session ends and a new one begins, the context window resets. The model retains no memory of previous interactions. Every insight it developed about your architecture, every pattern it recognized, every constraint it learned through trial and error: all of it vanishes. The human developer carries persistent memory across sessions. The AI does not. This asymmetry creates a recurring tax, the re-onboarding cost. I have experienced this cost acutely. At the start of each development session, I would spend fifteen to thirty minutes re-explaining my system, pasting specification documents, describing architectural patterns, reminding the AI of constraints it discovered yesterday. Only then could productive work begin.
2.3 Context Engineering: The Current Response
The challenges of context windows, context rot, and session boundaries have given rise to an emerging discipline that practitioners call context engineering, the art and science of curating what information enters an LLM’s context window, in what form, and at what time.
Context engineering encompasses retrieval systems that identify and fetch relevant code from large repositories, compression techniques that represent information in token-efficient forms, management strategies for maintaining coherence across extended interactions, and architectural decisions about how information should be structured for effective retrieval. Anthropic has published guidance [2] noting that context should be treated as a finite resource, advocating for techniques like compaction, structured note-taking, and multi-agent architectures for long-horizon tasks.
The context engineering paradigm represents a significant advance over naive approaches. It acknowledges the fundamental constraints of LLM architecture and develops systematic responses. Practitioners have converged on convention files (CLAUDE.md for Claude Code, cursor rules for Cursor [9]) that provide project-specific instructions injected into every session. Specification-driven development externalizes planning to structured documents [19]. RAG systems [11] fetch relevant code on demand.
These approaches work. But they share a common assumption worth examining: that context is something external to the code (documentation, specifications, conversation history) which must be injected into the model’s working memory at the right moment. The code itself is treated as semantically opaque, requiring external annotation to be understood at a domain level.
Recent empirical work suggests this assumption has measurable costs. A 2026 enterprise study [16] found that AI coding assistants actually slowed developers by 19% in certain contexts, with inconsistent naming conventions and architectural patterns identified as primary culprits. A separate study on variable naming and AI code completion [21] found that descriptive names achieved 0.874 semantic similarity scores compared to 0.802 for obfuscated names across models ranging from 0.5 billion to 8 billion parameters, a measurable difference attributable to naming alone.
These findings point toward a complementary approach.
2.4 From Context Injection to Semantic Encoding
The standard model of context engineering treats code as semantically opaque and context as semantically transparent. The code is syntax and identifiers; real understanding requires documentation that explains what the code does and specifications that describe why it was built this way. Under this model, the architect’s job is to build effective pipelines for injecting the right context at the right time. Better retrieval. Smarter summarization. More comprehensive documentation. The context engineering stack grows ever more sophisticated, ever more token-hungry, ever more complex to maintain.
But what if code could carry its own context?
What if the names, the structure, the organization of the code itself communicated domain meaning without requiring external documentation? This is not a new idea in software engineering. The principle of “self-documenting code” has been advocated for decades. Meaningful variable names, clear function signatures, logical module organization. These practices aim to make code comprehensible without extensive comments.
What is new is the audience. Previously, the audience was other human developers who bring experience, inference, and persistent memory to code comprehension. Now, the audience includes large language models, systems trained on billions of tokens of code and documentation, with strong priors about what names typically mean, with high sensitivity to patterns in naming conventions, but without the ability to carry context across sessions or infer intent from fragmentary clues. Research on code representations, from code knowledge graphs [1] to pre-trained models like GraphCodeBERT [8] that learn from data flow, has demonstrated that structural and semantic properties of code are learnable signals. The question is whether we can design code that amplifies those signals.
This suggests an opportunity: designing architectural conventions that leverage LLM strengths (pattern recognition, vast training priors, sensitivity to naming) while compensating for LLM weaknesses (context limits, session boundaries, attention dilution). Writing code that is semantically dense, that encodes maximum meaning per token, so that the limited context window is used with maximum efficiency.
The shift I am proposing is from context injection to semantic encoding.
Context injection asks: “How do I get the right documentation into the model’s context window at the right time?”
Semantic encoding asks: “How do I structure my code so that the model derives understanding from the code itself, reducing the need for external documentation?”
These approaches are not mutually exclusive. Even the most semantically dense codebase benefits from some explicit documentation. But the balance between them matters enormously for the practical experience of AI-assisted development.
A codebase designed for context injection requires comprehensive documentation maintained alongside code, retrieval systems configured and tuned, specification files updated as the system evolves, and memory systems managed across sessions. A codebase designed for semantic encoding requires naming conventions that carry domain meaning, organizational structures that reflect actor perspectives, consistent patterns that can be extrapolated, and vocabulary aligned with natural language understanding.
The first approach adds layers of tooling between the developer and the AI. The second embeds intelligence into the code itself. Both have costs. Context injection costs maintenance burden and tooling complexity. Semantic encoding costs upfront architectural discipline and potential deviation from conventional patterns.
But only one approach scales gracefully. As a codebase grows, the context injection burden grows with it: more documentation to maintain, more retrieval to tune, more specifications to keep current. The semantic encoding approach becomes more effective as the codebase grows, because conventions compound. Each new module that follows established patterns reinforces the model’s understanding of those patterns.
Popdot AI’s CLAUDE.md file illustrates this concretely. In roughly 900 tokens, the file provides a System vocabulary table mapping a dozen single-word names to their purposes (Gate for authentication, Vault for payments, Sigil for agent identity, Shield for platform safety, and so on), a Surface access table mapping four actor types to their entry points, and a set of naming conventions. That is the entire external context the AI needs at session start. From those 900 tokens, the AI can navigate a codebase with hundreds of source files, because the file paths, function names, and error codes all follow the vocabulary established in that brief specification. The codebase itself carries the remaining context. Compare this to the conventional approach, where an equivalent system might require 10,000 or more tokens of architectural documentation, API references, and relationship diagrams injected at session start, consuming a meaningful fraction of the context window before productive work begins.
Anthropic’s own engineering team has arrived at a related conclusion [2], finding that “context quality, not model capability, is the bottleneck” for AI coding assistants. The limiting factor is not how smart the model is, but how well the information it receives is organized.
This paper argues that the AI-assisted development community has underinvested in semantic encoding approaches, partly because the tooling for context injection is more visible and more easily productized, and partly because the practices of semantic encoding have not been systematically articulated.
Semantic Surface Architecture is an attempt at such articulation.
3. The Semantic Surface Architecture Framework
3.1 Overview and Design Principles
Semantic Surface Architecture is an architectural framework for organizing software systems in ways that optimize for human and AI collaboration. It provides conventions for naming, organization, and structure that encode domain meaning directly into the codebase, reducing dependence on external documentation and context injection.
SSA rests on five design principles:
Principle 1: Semantic Density. Every identifier, from file names and function names to variable names and error codes, should carry maximum meaning per token. Names should evoke their domain function, not merely describe their technical role. Vault.Charge.create() is preferable to paymentService.createTransaction() because “Vault” carries connotations of security, storage, and value protection that “paymentService” does not.
Principle 2: Actor-Centric Organization. The primary organizational axis should be who is interacting with the system, not what technical layer they are touching. A guest browsing listings, an owner managing properties, and an administrator resolving disputes occupy different perspectives on the same underlying systems. The architecture should reflect these perspectives explicitly so the LLM immediately understands the context of a given function and where in the codebase to focus its attention.
Principle 3: Hierarchical Consistency. Naming conventions should follow a consistent hierarchical pattern throughout the codebase:
Surface → System → Component → Action → Outcome
This consistency allows both humans and LLMs to extrapolate from known patterns to unknown areas of the codebase.
Principle 4: Boundary Clarity. Permissions, access controls, and capability boundaries should be encoded in the architecture itself, not layered on top as external configuration. If guests cannot access payout functionality, that constraint should be visible in the structure: Vault.Payout exists on the Helm surface, not the Market surface.
Principle 5: Evocative Vocabulary. System names should be single words that evoke their function through metaphor and association. Ledger evokes record-keeping. Signal evokes communication. Shield evokes protection. These evocative names leverage the LLM’s training on natural language, activating relevant associations without explicit documentation. Crucially, the specific words are not prescribed. The names used in this paper (Gate, Vault, Ledger) are suggestions drawn from two specific projects, not a canonical dictionary. The power of SSA lies in the act of choosing evocative names, not in using any particular set of them. A team that names its admin surface “Cortex” because the neural metaphor resonates with their domain has captured the same benefit as a team that names it “Tower” because the oversight metaphor feels right. The vocabulary should belong to the people who live in the codebase.
3.2 The Five-Layer Taxonomy
SSA organizes code into five hierarchical layers, each with distinct responsibilities and naming conventions.
┌─────────────────────────────────────────────────────────────┐
│ LAYER 1: SURFACES │
│ Actor perspectives / Entry points │
│ Examples: Market, Helm, Tower, Wire │
├─────────────────────────────────────────────────────────────┤
│ LAYER 2: SYSTEMS │
│ Capability domains / Functional areas │
│ Examples: Gate, Vault, Ledger, Signal │
├─────────────────────────────────────────────────────────────┤
│ LAYER 3: COMPONENTS │
│ Specific operations within systems │
│ Examples: Gate.Human, Gate.Agent, Vault.Charge │
├─────────────────────────────────────────────────────────────┤
│ LAYER 4: ACTIONS │
│ Methods / Operations │
│ Examples: Gate.Human.verify(), Vault.Charge.create() │
├─────────────────────────────────────────────────────────────┤
│ LAYER 5: OUTCOMES │
│ Events, Errors, States │
│ Examples: Vault.Charge.Completed, Vault.Charge.CardDeclined│
└─────────────────────────────────────────────────────────────┘
Each layer is described in detail below.
3.3 Layer 1: Surfaces
A Surface is an entry point into the system defined by actor perspective. It represents not a technical boundary but a perceptual boundary, defining what subset of the system a particular class of actor sees and interacts with.
Surfaces answer the question: Who is asking?
Each Surface has a defined actor type, an entry point (a route prefix, API namespace, or CLI command set), an enumeration of accessible Systems and Components, and a metaphor: a single evocative word that captures the actor’s relationship to the system. The metaphor should feel natural when an actor describes their perspective: “I’m on the Market looking for a rental” or “I am reviewing my properties from the Helm.”
Surfaces should be documented using a consistent template:
SURFACE: [Name]
METAPHOR: [Evocative description of the actor's perspective]
ACTOR: [Who uses this Surface]
ENTRY: [Technical entry point]
SYSTEMS: [List of accessible Systems with Component restrictions]
Sleep Around Points, the vacation rental marketplace, currently defines four Surfaces. Notably, these evolved during development, illustrating how SSA accommodates architectural change:
Market. The unified interface where all users interact. Originally, Sleep Around Points separated renters (Market) and owners (Helm) into distinct Surfaces, mirroring Popdot AI’s structure. Helm was later merged into Market, with owner features appearing conditionally in the dashboard based on contract ownership. While this merger was technically possible because SSA organizes by System rather than by route (no System changed), Section 4.2 discusses why the merger was ultimately a mistake that increased context ambiguity for AI collaboration. The lesson reinforces SSA’s core principle: when two actors have genuinely different workflows, they deserve separate Surfaces.
Tower. The oversight position with full visibility. Platform administrators enter through app.com/tower/ and access all Systems with all Components. Tower provides comprehensive administrative pages covering contract verification, listing approval, dispute resolution, financial monitoring, payout operations, and audit logging.
Wire. The programmatic connection for machines. External APIs, integrations, and future AI agents enter through api.app.com/wire/ and access a defined subset of Systems with explicit API contracts. Wire provides dozens of endpoints following the same SSA naming conventions.
Waitlist. A standalone pre-launch Surface for email capture and social proof, bypassing the main application’s authentication entirely. Waitlist accesses only Gate (email signup), Shield (rate limiting), Ping (confirmation email), and Scribe (logging). This Surface demonstrates that SSA can accommodate temporary, marketing-oriented entry points alongside the core product Surfaces.
The Surface abstraction provides immediate context for any AI interaction. When a developer says “I’m working on the checkout flow,” the AI immediately knows the Surface (Market), the actor context (a renter completing a booking), and the Systems involved (Vault for payment, Ledger for booking creation, Pact for agreement signing). By organizing code under Surface-specific directories, the file path itself communicates actor context:
app/(market)/checkout/[id]/page.tsx → Renter payment flow
app/(market)/dashboard/payouts/page.tsx → Owner payout view (conditional)
app/(tower)/contracts/page.tsx → Admin contract verification
An LLM reading these paths understands the actor perspective without requiring additional documentation.
3.4 Layer 2: Systems
A System is a coherent capability domain, a cluster of related functionality unified by a common purpose. Systems are named with single evocative words that convey their function through metaphor.
Systems answer the question: What capability is being invoked?
System names should be single words (Gate, not AuthenticationSystem; Vault, not PaymentProcessing), should evoke function through metaphor, should be memorable and distinct from one another, and should scale across the domain so the metaphor accommodates all Components that belong to the System.
Each System should be documented with a consistent template:
SYSTEM: [Name]
MNEMONIC: [Why this name evokes the function]
PURPOSE: [What capability this System provides]
COMPONENTS: [List of Components within this System]
SURFACES: [Which Surfaces can access this System, with any restrictions]
Popdot AI™, the domain rental platform where SSA originated, defines over a dozen named Systems. A representative selection:
| System | Mnemonic | Purpose |
|---|---|---|
| Gate | Entry point, access control | Human authentication |
| Vault | Secure value storage | Payment processing (traditional and programmatic) |
| Sigil | Seal, mark, credential | Agent identity and authentication |
| Shield | Protection barrier | Platform safety and content moderation |
| Orbit | Gravitational relationship | Domain discovery and search |
| Lease | Rental agreement | Rental lifecycle management |
| Signal | Communication pulse | Real-time event streaming |
Additional Systems (Relay, Bloom, Beacon, and others) handle DNS operations, notifications, analytics, and other operational capabilities.
Sleep Around Points™, the vacation rental marketplace built on the same framework, defines over a dozen named Systems with a different vocabulary suited to its domain:
| System | Mnemonic | Purpose |
|---|---|---|
| Gate | Entry point, access control | Authentication and identity verification |
| Deed | Legal ownership document | Ownership contracts and compliance |
| Shelf | Where goods are displayed | Point listings and search |
| Ledger | Financial record book | Booking lifecycle management |
| Vault | Secure value storage | Payment processing and escrow |
| Prism | Light refraction, multiple views | Pricing intelligence |
| Signal | Communication transmission | Messaging |
Additional Systems handle dispute resolution, background processing, notifications, file storage, audit logging, and legal agreements.
The two tables reveal how SSA operates: some Systems transfer directly across domains (Gate, Vault, Signal, Prism), while others are domain-specific (Sigil and Orbit exist only in Popdot AI; Deed and Shelf exist only in Sleep Around Points). The framework provides a structure for vocabulary development, not a fixed dictionary. Each project cultivates the words its domain requires.
The Evocative Advantage. Consider the difference between these two approaches to naming an authentication system:
Conventional: AuthService, AuthenticationModule, UserAuthController
SSA: Gate
The conventional names are descriptive but sterile. They communicate technical function but evoke no associations. An LLM reading AuthService.login() understands that authentication is occurring but gains no additional context.
Gate evokes a rich set of associations: entrance, permission, boundary, passage, checkpoint. An LLM trained on natural language text has encountered “gate” in countless contexts, from airport gates to garden gates to castle gates to logic gates. These associations prime the model to understand that Gate controls access, that it exists at boundaries, that some pass through and others are denied.
When the LLM later encounters Gate.Agent.verify(), the associations compound. This is not just authentication; this is a gate specifically for agents, with verification as the means of passage. The name carries context that would otherwise require documentation.
This is semantic density in practice: maximum meaning per token. And the specific word matters less than the quality of the metaphor. If “Gate” does not resonate for a particular team or domain, “Portal,” “Threshold,” or “Checkpoint” carry similar associations. The framework encourages the search for the right metaphor, not adherence to a predetermined list.
3.5 Layer 3: Components
A Component is a specific operational unit within a System. Components represent the concrete implementations of System capabilities, organized by entity type or functional subdivision.
Components answer the question: What specific thing within this capability?
Component names should clarify the System’s scope (Gate.Human vs. Gate.Agent distinguishes authentication flows for different actor types), should map to domain entities where appropriate (Ledger.Booking, Deed.Contract, Shelf.Listing), and should be noun-based. Components are things, not actions; actions belong to Layer 4.
Common Component patterns include entity Components named after the domain entity they manage (Ledger.Booking, Deed.Contract, Vault.Charge), actor Components named after the actor type they serve (Gate.Human, Gate.Agent), and function Components named after a specific function within the System (Gate.Session, Gate.Connect).
Example Component hierarchies:
Gate
├── Gate.Human → Email/password authentication
├── Gate.Agent → API key/token authentication
├── Gate.Session → Session lifecycle management
└── Gate.Connect → Third-party OAuth integration
Vault
├── Vault.Charge → Processing incoming payments
├── Vault.Payout → Distributing funds to owners
├── Vault.Refund → Processing refunds
└── Vault.Balance → Account balance management
Ledger
├── Ledger.Booking → Reservation records
├── Ledger.Hold → Pending/temporary reservations
└── Ledger.History → Historical transaction records
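The hierarchies above can be mirrored directly in module shape. The following is one possible sketch of a System module (say, lib/systems/vault/index.ts): the System is an object, Components are noun-named sub-objects, and Actions are verb-named methods. The stub bodies and identifiers are illustrative, not prescribed by SSA:

```typescript
// Sketch: System → Component → Action expressed as nested objects.
const Vault = {
  Charge: {
    create(amount: number, currency: string) {
      // A real implementation would call the payment provider here.
      return { id: "ch_stub", amount, currency, status: "pending" as const };
    },
  },
  Payout: {
    process(ownerId: string, amountCents: number) {
      // A real implementation would initiate a transfer to the owner.
      return { ownerId, amountCents, status: "processing" as const };
    },
  },
};
```

Call sites then read as full semantic paths: `Vault.Charge.create(...)` carries System, Component, and Action in a single expression.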
3.6 Layer 4: Actions
An Action is a method or operation performed by a Component. Actions represent the verbs of the system, the things that can be done.
Actions answer the question: What operation is being performed?
Action names should be verb-based, should follow consistent patterns (the same verb across Components for similar operations), and should be specific but not verbose: create rather than createNewInstance.
SSA recommends a standard vocabulary of action verbs:
| Verb | Meaning |
|---|---|
| create | Instantiate a new entity |
| read | Retrieve existing data |
| update | Modify existing entity |
| delete | Remove an entity |
| verify | Confirm validity or authenticity |
| process | Execute a workflow or transformation |
| cancel | Abort or reverse an operation |
| submit | Send for processing or approval |
| approve | Grant permission or confirmation |
| reject | Deny permission or confirmation |
| sync | Synchronize with external system |
| notify | Send a notification |
Examples in context:
Gate.Human.verify() → Verify human user credentials
Gate.Agent.authenticate() → Authenticate an API agent
Gate.Session.create() → Create a new session
Vault.Charge.create() → Initiate a payment charge
Vault.Charge.process() → Process the payment with provider
Vault.Payout.process() → Execute payout to owner
Ledger.Booking.create() → Create a new booking record
Ledger.Booking.confirm() → Confirm a pending booking
Ledger.Booking.cancel() → Cancel an existing booking
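The shared verb vocabulary can also be made structural: Components with the same lifecycle implement one interface, so create and cancel mean the same thing on every Component. The interface name, entity shape, and in-memory store below are illustrative inventions, not part of SSA itself:

```typescript
// Sketch: a lifecycle interface shared across Components that create
// and cancel entities, keeping Action verbs consistent by construction.
interface Lifecycle<T extends { id: string }> {
  create(input: Omit<T, "id" | "status">): T;
  cancel(id: string): boolean;
}

type Booking = { id: string; guestId: string; status: "pending" | "cancelled" };

const bookings = new Map<string, Booking>();

// Ledger.Booking as one Lifecycle implementation.
const LedgerBooking: Lifecycle<Booking> = {
  create(input) {
    const booking: Booking = { id: `bk_${bookings.size + 1}`, status: "pending", ...input };
    bookings.set(booking.id, booking);
    return booking;
  },
  cancel(id) {
    const booking = bookings.get(id);
    if (!booking) return false;
    booking.status = "cancelled";
    return true;
  },
};
```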
3.7 Layer 5: Outcomes
Outcomes represent the results of Actions: the events that occur, the errors that arise, and the states that entities occupy.
Outcomes answer the question: What happened?
SSA distinguishes three categories of Outcomes:
Events are significant occurrences that the system records and potentially broadcasts. Events use past-tense descriptions of what happened: Ledger.Booking.Created, Vault.Charge.Completed, Gate.Session.Expired.
Errors are failure conditions that prevent successful completion of an Action. Errors describe what went wrong: Vault.Charge.CardDeclined, Gate.Human.InvalidCredentials, Ledger.Booking.DateUnavailable.
States are the lifecycle positions an entity can occupy. States describe current condition: Ledger.Booking.State.Pending, Ledger.Booking.State.Confirmed, Ledger.Booking.State.Cancelled.
Outcome names should use past tense for Events (Created, Completed, Failed), descriptive nouns or adjectives for Errors (CardDeclined, InvalidCredentials, NotFound), and adjectives or participles describing current condition for States (Pending, Active, Cancelled). All Outcomes should include their full hierarchical path: Vault.Charge.CardDeclined, not just CardDeclined.
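The three Outcome categories map naturally onto TypeScript types, with full hierarchical paths kept in the string literals. The specific outcomes are the paper’s own examples; the `describeOutcome` helper is hypothetical:

```typescript
// Sketch: Events and Errors as a discriminated union, States as a literal union.
type ChargeOutcome =
  | { kind: "event"; name: "Vault.Charge.Completed" }
  | { kind: "error"; name: "Vault.Charge.CardDeclined" };

type BookingState = "Pending" | "Confirmed" | "Cancelled"; // Ledger.Booking.State.*

function describeOutcome(outcome: ChargeOutcome): string {
  // The discriminant distinguishes an Event from an Error.
  return outcome.kind === "event"
    ? `event recorded: ${outcome.name}`
    : `action failed: ${outcome.name}`;
}
```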
3.8 The Full Path
The five layers combine to form full semantic paths that communicate complete context:
Surface.Flow.Step.Action → System.Component.action() → Outcomes
Example:
Market.Checkout.Payment.Submit
→ Vault.Charge.create()
→ Vault.Charge.Completed | Vault.Charge.CardDeclined
This path tells a complete story. A guest on the Market surface, in the Checkout flow, at the Payment step, performs a Submit action, which invokes the Vault system’s Charge component, calling the create action and resulting in either Completed (success) or CardDeclined (failure).
Every element of this path carries meaning. No external documentation is required to understand what is happening, who is doing it, or what might go wrong.
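The full path can be sketched as code. The handler below is hypothetical, and the boolean card check is a stand-in for real payment logic; only the naming flow (Market.Checkout.Payment.Submit → Vault.Charge.create() → full-path Outcome) is the point:

```typescript
// Sketch: Market.Checkout.Payment.Submit resolving to a full-path Outcome.
type SubmitOutcome = "Vault.Charge.Completed" | "Vault.Charge.CardDeclined";

function marketCheckoutPaymentSubmit(cardAccepted: boolean): SubmitOutcome {
  // Market.Checkout.Payment.Submit → Vault.Charge.create() → Outcome
  return cardAccepted ? "Vault.Charge.Completed" : "Vault.Charge.CardDeclined";
}
```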
3.9 Conventions in Practice
The five-layer taxonomy extends beyond function naming into every artifact of the codebase. This section describes how SSA conventions apply to file structure, entity definitions, error handling, testing, documentation generation, and agent integration.
File System Mapping. Directory structures should mirror the Surface-System-Component hierarchy:
app/
├── (market)/ # Market surface
│ ├── browse/ # Shelf system interactions
│ ├── checkout/ # Ledger + Vault interactions
│ │ └── payment/
│ └── bookings/ # Ledger interactions
├── (helm)/ # Helm surface
│ ├── dashboard/
│ │ ├── listings/ # Shelf management
│ │ ├── bookings/ # Ledger (owner view)
│ │ └── payouts/ # Vault.Payout
│ └── onboarding/ # Gate.Connect
├── (tower)/ # Tower surface
│ ├── users/ # Gate administration
│ ├── disputes/ # Arbiter
│ └── metrics/ # Scribe
└── api/
└── wire/ # Wire surface
├── gate/ # Gate.Agent endpoints
├── shelf/ # Shelf API
└── ledger/ # Ledger API
lib/
└── systems/
├── gate/
│ ├── human.ts
│ ├── agent.ts
│ ├── session.ts
│ └── connect.ts
├── vault/
│ ├── charge.ts
│ ├── payout.ts
│ └── refund.ts
├── ledger/
│ ├── booking.ts
│ └── hold.ts
└── [other systems...]
This structure ensures that file paths communicate context. app/(market)/checkout/payment/page.tsx tells the reader this is guest payment UI. lib/systems/vault/charge.ts tells the reader this is payment charge logic. An LLM navigating this codebase can infer relationships from paths alone.
Entity Naming. Domain entities follow consistent naming that reflects their System ownership:
// Entity names reflect System ownership
interface VaultCharge {
  id: string;
  amount: number;
  currency: string;
  status: 'pending' | 'completed' | 'failed';
}

interface LedgerBooking {
  id: string;
  status: 'pending' | 'confirmed' | 'completed';
  createdAt: Date;
}
The prefix convention (Deed, Ledger, Vault) ensures that any reference to an entity communicates its System context. When an LLM encounters VaultCharge, it immediately understands this belongs to the payment domain.
Naming Convention Formalization. Popdot AI’s architecture documentation codifies naming conventions across every layer of the codebase, ensuring consistency between code identifiers, file paths, routes, and database entities:
| Element | Convention | Example |
|---|---|---|
| System | PascalCase, single word | Vault, Sigil, Shield |
| Component | System.Component | Sigil.Mandate, Shield.Filter |
| Action | System.Component.action() | Sigil.Mandate.create() |
| Error | System.Component.ErrorName | Shield.Filter.ContentBlocked |
| File (code) | kebab-case | shield.ts, sigil-mandate.ts |
| Route | kebab-case | /api/wire/agent, /helm/earnings |
| Database | snake_case | vault_charges, lease_rentals |
This convention table occupies roughly 80 tokens in Popdot AI’s documentation. From it, a developer or AI assistant can predict the naming pattern for any new element added to the codebase. If a new System called Tether is introduced for DNS record management, the convention table tells you the component files will be kebab-case (tether.ts), the routes will be kebab-case (/api/wire/tether), and the database tables will be snake_case (tether_records). The AI does not need to search for precedent; the convention is explicit.
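The convention table is mechanical enough to express as executable rules. The sketch below derives the file, route, and database names from a System and entity name; “Tether” is the paper’s own hypothetical example, and the helper names are illustrative:

```typescript
// Sketch: deriving kebab-case and snake_case names from PascalCase identifiers,
// mirroring the convention table.
function toKebab(pascal: string): string {
  return pascal.replace(/([a-z0-9])([A-Z])/g, "$1-$2").toLowerCase();
}

function toSnake(pascal: string): string {
  return pascal.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toLowerCase();
}

function namesFor(system: string, entity: string) {
  return {
    file: `${toKebab(system)}.ts`,               // e.g. tether.ts
    route: `/api/wire/${toKebab(system)}`,       // e.g. /api/wire/tether
    table: toSnake(`${system}${entity}`) + "s",  // e.g. tether_records
  };
}
```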
Error Handling. Errors follow the full-path convention established in Layer 5. Popdot AI extends this with numeric error code ranges that partition the error space by System:
class VaultChargeCardDeclined extends Error {
  code = 'Vault.Charge.CardDeclined';
}

class GateHumanInvalidCredentials extends Error {
  code = 'Gate.Human.InvalidCredentials';
}
Each System is assigned a dedicated numeric code range, so that error codes themselves encode System ownership. A numeric error code can be traced to its System without needing to inspect the error message. This redundant encoding (both the code path and the numeric range identify the System) provides defense in depth against context loss.
When errors propagate through the system, their origin is immediately apparent from multiple signals: the error class name, the error code string, and the numeric code range. Stack traces become self-documenting. Error logs communicate context without requiring correlation to external documentation.
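One way the redundant encoding could look in code is sketched below. An error carries both its full semantic path and a numeric code derived from its System’s range. The specific ranges (Gate in the 1000s, Vault in the 2000s) and the base class are invented for illustration; the paper does not disclose Popdot AI’s actual ranges:

```typescript
// Sketch: an error base class encoding the System twice, via path and number.
const SYSTEM_RANGES: Record<string, number> = { Gate: 1000, Vault: 2000 };

class SsaError extends Error {
  readonly code: string;
  readonly numericCode: number;

  constructor(path: string, offset: number) {
    super(path);
    this.code = path;                                    // e.g. "Vault.Charge.CardDeclined"
    const system = path.split(".")[0];                   // System is the first path segment
    this.numericCode = (SYSTEM_RANGES[system] ?? 0) + offset;
  }
}

class VaultChargeCardDeclined extends SsaError {
  constructor() { super("Vault.Charge.CardDeclined", 1); }
}
```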
Testing. Test files and test descriptions mirror SSA structure:
// tests/systems/vault/charge.test.ts
describe('Vault.Charge', () => {
  describe('create()', () => {
    it('creates a charge for valid booking', async () => { /* ... */ });
    it('throws Vault.Charge.CardDeclined for declined cards', async () => { /* ... */ });
  });
});
Test output becomes immediately interpretable:
Vault.Charge
  create()
    ✓ creates a charge for valid booking
    ✗ throws Vault.Charge.CardDeclined for declined cards
The failing test’s location is apparent from the name alone: Vault.Charge.create() has a bug in its card-declined handling.
Documentation Generation. SSA’s consistent structure enables automated documentation extraction. A tool can traverse the codebase and produce System references, Component inventories, Action catalogs, and Error dictionaries, all derived from code rather than maintained separately. This addresses a persistent problem in software development: documentation that drifts from implementation.
Agent Integration. SSA’s explicit Surface model provides a foundation for AI agent integration. The Wire surface defines the programmatic interface, and Gate.Agent handles authentication. The capability system uses SSA vocabulary directly:
type AgentCapability =
  | 'Shelf.read'
  | 'Vault.create'
  | 'Signal.send';
Permissions are expressed in the same language as the architecture. An agent knows what it can do by examining its capabilities list. No translation required. When future AI agents join the system, they inherit a vocabulary that both they and the human developers understand. The shared semantic foundation enables collaboration without extensive onboarding.
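A capability check in this vocabulary can be a one-liner. The union below restates the capability type shown above for self-containment; the `Agent` shape and `hasCapability` helper are hypothetical, not part of either platform:

```typescript
// Sketch: permission checks in the same language as the architecture.
type AgentCapability = "Shelf.read" | "Vault.create" | "Signal.send";

interface Agent {
  id: string;
  capabilities: AgentCapability[];
}

function hasCapability(agent: Agent, capability: AgentCapability): boolean {
  return agent.capabilities.includes(capability);
}
```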
3.10 Summary
Semantic Surface Architecture provides five layers of hierarchical organization (Surfaces, Systems, Components, Actions, Outcomes), actor-centric Surfaces that organize code by who is interacting rather than what technical layer is involved, evocative System names that leverage natural language associations to communicate function, consistent naming conventions that extend from code identifiers to file paths to error messages, self-documenting structure that encodes context in the architecture itself, and agent-ready design with explicit Surface and authentication patterns for AI integration.
The framework is designed to reduce dependence on external context injection by maximizing the semantic density of the code itself. When an LLM reads SSA-structured code, it derives understanding from names and organization rather than requiring separate documentation.
The next section describes the application of this framework to production systems.
4. Application to Production Systems
SSA was not designed in the abstract and then applied. It emerged under pressure, was refined through iteration, and proved its value across two production codebases with different domains, different constraints, and different scales of complexity. This section describes what was built, what was observed, and what the comparison between projects reveals.
4.1 Popdot AI™: Where SSA Was Born
Popdot AI is an agentic domain rental platform built for both human users and autonomous AI agents. Its tagline, “Where AI agents get an address,” captures the premise: humans list domains and earn passive revenue while AI agents discover, rent, and deploy content to subdomains through programmatic interfaces. The platform handles authentication, pricing, payments in both traditional and programmatic channels, content safety scanning, behavioral threat detection, and real-time event streaming.
Popdot AI is where SSA was invented, not as a methodology to be documented, but as a structural response to a problem that conventional architecture could not solve. The problem was this: four fundamentally different kinds of actors (human renters, domain owners, AI agents, and platform administrators) needed to interact with the same underlying capabilities, but each actor required a different perspective, different permissions, and different interaction patterns. Traditional layered architecture (controllers, services, repositories) could not cleanly express this multi-actor complexity. Every controller became a tangle of permission checks. Every service needed to know who was calling it. The architecture leaked actor context in every direction.
The Surface abstraction was the response. By organizing the codebase around who is interacting rather than what technical layer is involved, the architecture separated concerns that had been tangled together.
The Four Surfaces
Popdot AI defines four Surfaces, each with a distinct metaphor, entry point, and system access profile:
Market (the marketplace) is the public face where human renters browse and purchase. Its views use marketplace language: Bazaar for the search grid, Showcase for domain detail pages, Checkout for the purchase flow. Market accesses a subset of systems (Gate for authentication, Orbit for search, Prism for pricing, Vault for payments) and is restricted from systems that belong to other actor perspectives. A guest on Market cannot see payout logic, agent management tools, or administrative controls.
Helm (the captain’s position) is where domain owners manage their portfolios. Its views use nautical language: Bridge for the dashboard overview and Voyages for rental management. Helm accesses a broader set of systems than Market, including domain management and the owner view of Vault (payouts rather than charges).
Wire (the electrical connection) is the programmatic interface for AI agents. Wire uses no visual interface; it is a set of API endpoints organized by resource group. Wire accesses Sigil (agent identity and credential management) and programmatic payment capabilities, which no other Surface exposes. The Wire Surface provides a programmatic API where agents interact through the same System vocabulary, with domain-appropriate action names that communicate intent.
Cortex (the neural center) is where platform administrators monitor and govern. Cortex has full access to every system and every capability. Its views use neural language, including views for platform oversight and agent governance. Cortex is the only Surface that can override other Surfaces’ operations: suspending agents, revoking mandates, rolling back content deployments, adjusting trust scores.
The naming is not incidental to the architecture; it is the architecture. When a developer says “we need to add a new panel to Helm.Voyages,” the AI assistant immediately knows the actor context (owner), the view context (rental management), and the system scope (Lease, Vault). When an error surfaces as Vault.Escrow.InsufficientFunds, the debugging path is apparent from the name: the payment system’s escrow component encountered a balance problem.
Named Systems
Popdot AI’s backend is organized into over a dozen named Systems, each a coherent capability domain with a single evocative name. The Systems range from foundational (Gate for authentication, Vault for payments) to domain-specific (Sigil for agent identity, Shield for platform safety) to operational (Pulse for background jobs, Signal for real-time events, Bloom for email notifications).
Several Systems illustrate SSA’s principles with particular clarity:
Sigil manages AI agent identity and implements a custom authentication protocol. The name evokes seal, mark, credential. Sigil.Issue creates agent credentials with asymmetric key pairs. Sigil.Verify authenticates agent requests by checking cryptographic signatures. Sigil.Govern manages agent lifecycle (suspend, terminate, reactivate). Sigil.Mandate handles pre-authorization for autonomous spending. The full vocabulary (Sigil, Issue, Verify, Govern, Mandate) communicates a coherent story of credentialed identity with hierarchical control. An LLM encountering Sigil.Mandate.create() for the first time can infer from naming alone that this creates an authorization token for an agent, governed by the Sigil identity system.
Shield handles platform safety, content moderation, and access control. The name evokes protection and defense. Its components follow the same pattern: Shield.Filter (content evaluation), Shield.Meter (rate limiting), Shield.Score (trust assessment). Each component name contributes to the protective metaphor. An LLM reading Shield.Filter.evaluate() understands immediately that this checks content for safety within the platform’s protection system.
These naming choices are not arbitrary aesthetic preferences. They serve a functional purpose in AI-assisted development. When I tell the AI assistant “we need to update Shield’s filtering logic,” it understands the domain (platform safety), the pattern (evaluation functions), and the location in the codebase (lib/systems/shield/filter.ts). When an error propagates as Shield.Filter.ContentBlocked, the debugging context is embedded in the name.
The Cortex Parity Principle
Popdot AI introduced a governance principle that illustrates how SSA can encode policy in architecture: the Cortex Parity Principle.
The principle states that any action an AI agent can perform via Wire, and any data an AI agent can access, must be (1) visible to Cortex administrators in real time, (2) controllable via Cortex intervention mechanisms, and (3) auditable through complete activity logs. When adding any Wire endpoint, the developer simultaneously creates a Cortex view showing that data, a Cortex action to control or override that capability, and a Signal event for real-time visibility.
This principle is expressed through a parity mapping in the project documentation.
For every capability exposed to agents via Wire, a corresponding monitoring view and intervention mechanism exists in Cortex. The specific mappings are documented in the project’s SYSTEMS.md file, ensuring the governance relationship is always legible.
The Cortex Parity Principle is an architectural governance mechanism encoded in SSA vocabulary. It would be difficult to express this principle in a conventional service-oriented architecture without extensive documentation explaining which admin endpoints correspond to which user-facing operations. In SSA, the correspondence is visible in the naming: the structure makes the governance relationship legible to both human developers and AI assistants.
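One way the parity mapping could be made checkable is sketched below: every Wire capability declares its corresponding Cortex view, Cortex action, and Signal event, and a simple audit verifies that no field is missing. The entry shown is an invented illustration, not Popdot AI’s real mapping:

```typescript
// Sketch: the Cortex Parity Principle as auditable data.
interface ParityEntry {
  wireCapability: string;   // what agents can do
  cortexView: string;       // where admins see it
  cortexAction: string;     // how admins control or override it
  signalEvent: string;      // real-time visibility
}

const PARITY: ParityEntry[] = [
  {
    wireCapability: "Sigil.Mandate.create",
    cortexView: "Cortex.Agents.Mandates",
    cortexAction: "Sigil.Mandate.revoke",
    signalEvent: "Sigil.Mandate.Created",
  },
];

// Audit: every Wire capability must have all three governance counterparts.
function parityHolds(entries: ParityEntry[]): boolean {
  return entries.every(e =>
    [e.wireCapability, e.cortexView, e.cortexAction, e.signalEvent].every(s => s.length > 0)
  );
}
```

A check like `parityHolds` could run in CI, turning the governance principle from a documentation promise into an enforced invariant.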
Scale of Convention
The Popdot AI Surface specification, which documents the complete frontend architecture, defines multiple Surfaces encompassing dozens of named Views, over a hundred Panels, and scores of Actions, all following the Surface.View.Panel.Action naming hierarchy. This is not a toy example or a proof of concept. It is a production system with genuine complexity, and every named element follows SSA conventions.
The file structure mirrors the naming hierarchy:
app/
├── browse/ → Market.Bazaar
├── domain/[id]/ → Market.Showcase
├── helm/ → Helm Surface
│ └── rentals/ → Helm.Voyages
├── cortex/ → Cortex Surface
└── api/wire/ → Wire Surface
The backend mirrors the same structure:
lib/systems/
├── gate/index.ts → Gate System
├── vault/index.ts → Vault System
├── sigil/index.ts → Sigil System
└── shield/ → Shield System
└── filter.ts → Shield.Filter
An AI assistant navigating this codebase finds what it needs from paths alone. There is no ambiguity about where Shield’s filtering logic lives or where the Helm dashboard is rendered. The structure is the documentation.
The Agent Authentication Challenge
Popdot AI provides a concrete illustration of how SSA handles complexity that would strain conventional architecture. The platform must authenticate two fundamentally different kinds of actors: human users (via a traditional session-based authentication provider) and AI agents (via a custom protocol using cryptographic signatures).
In conventional architecture, this dual authentication might live in an AuthService with methods like authenticateHuman() and authenticateAgent(), with the distinction buried in method signatures. In SSA, the distinction is elevated to the System level: Gate handles human authentication, Sigil handles agent authentication. The separation is visible in file paths (lib/systems/gate/ vs. lib/systems/sigil/), in Surface access (Market and Helm use Gate; Wire uses Sigil), and in the naming hierarchy.
The trust tier system adds another layer. Agents progress through graduated trust tiers, each with different transaction limits, rate limits, and fee structures. These tiers are managed by Sigil.Govern and enforced through graduated access controls. The vocabulary makes the governance model legible: a new agent has limited capabilities, a fully trusted agent has earned broad autonomy. An LLM reading the code understands the progression without consulting external documentation.
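A heavily hedged sketch of what graduated tiers might look like as data: the text says only that tiers differ in transaction limits, rate limits, and fees, so the tier names, counts, and every number below are invented for illustration:

```typescript
// Sketch: trust tiers as declarative data that Sigil.Govern could enforce.
interface TrustTier {
  name: string;
  maxTransactionCents: number;
  requestsPerMinute: number;
  feePercent: number;
}

const TIERS: TrustTier[] = [
  { name: "new",         maxTransactionCents: 1_000,   requestsPerMinute: 10,  feePercent: 5 },
  { name: "established", maxTransactionCents: 10_000,  requestsPerMinute: 60,  feePercent: 3 },
  { name: "trusted",     maxTransactionCents: 100_000, requestsPerMinute: 300, feePercent: 2 },
];

// A governance check of the kind Sigil.Govern might apply per transaction.
function withinLimit(tier: TrustTier, amountCents: number): boolean {
  return amountCents <= tier.maxTransactionCents;
}
```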
4.2 Sleep Around Points™: Refinement in a New Domain
Sleep Around Points is a Disney Vacation Club points rental marketplace where DVC owners rent unused vacation points to guests seeking resort stays. The domain is fundamentally different from Popdot AI (vacation timeshares rather than domain rentals), the regulatory environment is different (financial transactions with real estate implications), and the actor relationships are different (the platform intermediates between individual owners and guests rather than between domain registrants and AI agents). As of March 2026, the platform has over a dozen named Systems, multiple Surfaces, dozens of API endpoints, and dozens of pages across its Market and Tower interfaces.
When I began building Sleep Around Points, SSA was already established from Popdot AI. The question was whether the framework would transfer to a new domain or whether it had been shaped too specifically by the domain rental context.
The framework transferred, and then it evolved. This evolution is itself instructive for understanding how SSA operates in practice.
Surface Evolution
Sleep Around Points initially adopted Popdot AI’s four-Surface structure directly: Market, Helm, Tower, Wire. As the product matured toward launch, Helm was merged into Market based on the reasoning that owners are often also renters and the role boundary is fluid. Owner features now appear conditionally in the unified dashboard based on whether the user has verified DVC contracts. The unified Market Surface grew to dozens of pages, with progressive disclosure navigation that reveals owner-specific sections (contracts, listings, rentals, payouts) only when relevant.
In retrospect, this merger was a mistake, and it is worth documenting as a cautionary example. The unified dashboard required substantially longer context windows to explain to the AI assistant which dashboard view applied to which type of user on which screen. The conditional logic for owner detection alone touches five separate files that must stay aligned. Progressive disclosure navigation requires a multi-phase owner funnel and a separate renter funnel, all computed server-side and synchronized with client state. The AI assistant frequently confused owner and renter contexts within the same Surface, because the Surface boundary that would have disambiguated them no longer existed. What seemed like an elegant simplification created precisely the kind of context ambiguity that separate Surfaces are designed to prevent. The lesson: Surfaces exist to separate actor perspectives, and when two actor perspectives require genuinely different workflows, they deserve separate Surfaces, even if the actors sometimes overlap.
A fourth Surface, Waitlist, was added for the pre-launch period: a standalone landing page that bypasses the main application’s authentication entirely, accessing only Gate (email signup), Shield (rate limiting), Ping (confirmation email), and Scribe (logging). Waitlist demonstrates that Surfaces can be temporary and purpose-built. When the waitlist is removed at launch, the Surface disappears and nothing in the remaining architecture needs to change.
These changes illustrate important properties of SSA. The Waitlist addition shows that Surfaces are not permanent, and the Systems beneath them are not coupled to any particular Surface configuration. The Helm merger shows that while the architecture can accommodate Surface consolidation (no System changed; the same Gate, Deed, Shelf, Ledger, and Vault Systems continued to function), doing so sacrifices the actor-perspective clarity that is SSA’s primary benefit. The framework is resilient to structural change, but that resilience should not be mistaken for permission to collapse meaningful boundaries.
Domain-Specific Systems
The System vocabulary required significant adaptation from Popdot AI. Popdot AI’s domain-specific Systems (Sigil, Latch, Relay, Drop, Shield) had no analogue in the vacation rental context. Sleep Around Points needed Systems that Popdot AI did not:
Deed manages DVC contract ownership and enforces regulatory compliance constraints. The name evokes legal ownership documents. When a constraint is violated, the error Deed.Cap.Exceeded communicates instantly what went wrong and where.
Atlas is a read-only resort intelligence system providing property data and search. Named after the titan whose name became synonymous with collections of maps, it illustrates how SSA naming works for data-layer Systems: Atlas.Resort.InvalidSlug is immediately legible even without context.
Pact handles rental agreements and owner terms of service. Named after binding agreements, its component vocabulary (Pact.Agreement.create, Pact.Agreement.linkToBooking, Pact.PDF.generate) tells a story of legal formalization. When the AI encounters Pact.Agreement.SignatureMismatch, it understands both the System (legal agreements) and the failure mode (the typed signature did not match the name on file).
Arbiter manages disputes with a resolution workflow (owner_favor, renter_favor, split). Named after the decision-maker in conflicts, the vocabulary carries the judicial metaphor: Arbiter.Dispute.Opened, Arbiter.Evidence.Submitted, Arbiter.Dispute.Resolved.
Several Systems transferred directly with identical or near-identical semantics. Gate remained authentication, though its components expanded. Vault remained payment processing, though its components shifted to reflect the vacation rental domain rather than Popdot AI’s more complex multi-rail payment architecture. Signal remained messaging. Prism remained pricing intelligence, adapted with domain-specific components for point valuations and market pricing. Ping (notifications) mapped to Popdot AI’s Bloom (email and webhooks). Crank handles scheduled background processing including deadline enforcement, automated transitions, and notification dispatch.
The transferable vocabulary suggests that certain capability domains recur across applications and that their SSA names can become a shared vocabulary for practitioners. Gate, Vault, Signal, Prism, Crank, Shield, and Scribe all transferred with minimal adaptation.
State Machines and the SSA Naming Advantage
Sleep Around Points provides a particularly clear example of how SSA naming aids comprehension of complex state logic. Ledger manages bookings through a multi-state lifecycle with named transitions (confirm, submit proof, process arrival, complete) and corresponding events. Each state transition has an SSA-named function and outcome, making the booking flow legible to both developers and AI assistants. The state names follow SSA conventions, so Ledger.Booking.State.[StateName] tells the AI exactly where in the lifecycle this booking sits, which System manages it, and what is being awaited. When an error surfaces as Ledger.Booking.InvalidTransition, the AI knows to look in the Ledger system’s transition logic without searching the codebase.
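The paper names the transitions (confirm, submit proof, process arrival, complete) but not the intermediate states, so the state names and transition graph in this sketch are assumptions. It shows how an SSA-named outcome like Ledger.Booking.InvalidTransition can fall directly out of the state machine rather than being invented at each call site:

```typescript
// A sketch of Ledger.Booking's lifecycle. State names and the exact
// transition graph are illustrative assumptions; only the transition
// names come from the paper.
type BookingState = "Pending" | "Confirmed" | "ProofSubmitted" | "Arrived" | "Completed";

const TRANSITIONS = {
  confirm:        ["Pending", "Confirmed"],
  submitProof:    ["Confirmed", "ProofSubmitted"],
  processArrival: ["ProofSubmitted", "Arrived"],
  complete:       ["Arrived", "Completed"],
} as const;

function transition(current: BookingState, action: keyof typeof TRANSITIONS): BookingState {
  const [from, to] = TRANSITIONS[action];
  if (current !== from) {
    // The SSA-named outcome: System.Component.ErrorName
    throw new Error("Ledger.Booking.InvalidTransition");
  }
  return to;
}
```

Because every transition is a named entry in one table, the question "where in the lifecycle is this booking, and what is being awaited" has a single, inspectable answer.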
Error Codes in Practice
Sleep Around Points defines dozens of named error codes following the System.Component.ErrorName pattern, organized across all of its Systems. A representative sample:
Gate.Identity.NotVerified → Identity verification required
Deed.Cap.Exceeded → Would exceed rental limit
Ledger.Booking.InvalidTransition → Invalid state transition attempted
Vault.Payout.Failed → Payout could not be completed
Shelf.Listing.Expired → Listing is no longer available
Signal.Message.Blocked → Message failed content filter
Each error code is self-documenting. An AI assistant encountering Deed.Cap.Exceeded does not need to search for documentation about what “cap” means or which system enforces it. The name carries the full context: the Deed system (DVC contracts), the Cap component (rental limits), the Exceeded condition (the limit was hit). This is semantic density applied to error handling.
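One way such a registry might be implemented, with descriptions drawn from the sample above. The SsaError class and the registry shape are illustrative, not claimed to match either production codebase:

```typescript
// Hypothetical error-code registry following System.Component.ErrorName.
// Codes mirror the sample above; the registry shape is an assumption.
const ERROR_CODES = {
  "Gate.Identity.NotVerified": "Identity verification required",
  "Deed.Cap.Exceeded": "Would exceed rental limit",
  "Ledger.Booking.InvalidTransition": "Invalid state transition attempted",
  "Vault.Payout.Failed": "Payout could not be completed",
} as const;

// A hypothetical error type that carries the semantic code alongside
// the human-readable description.
class SsaError extends Error {
  constructor(public code: keyof typeof ERROR_CODES) {
    super(`${code}: ${ERROR_CODES[code]}`);
  }
}
```

With the codes centralized, the type system rejects a misspelled code at compile time, and every thrown error carries its System and Component in its name.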
Sleep Around Points ultimately defined fewer Systems than Popdot AI. The smaller count reflects a simpler domain (no agent identity, no DNS routing, no behavioral threat detection) rather than a deficiency in the framework. SSA scales to the complexity of the domain rather than imposing unnecessary structure.
4.3 Cross-Project Observations
Building two production systems with SSA, one where the framework was invented and one where it was deliberately reapplied, produced observations that a single-project experience could not.
What Transferred
The Surface abstraction transferred completely, though its application evolved in instructive ways. In both projects, organizing by actor perspective (who is asking) rather than technical layer (what technology is involved) produced clearer code organization and faster AI comprehension. The core Surface pattern (marketplace, administrative oversight, programmatic API) appears to be a natural decomposition for multi-actor platforms, though this observation is limited to two projects in related domains.
The five-layer taxonomy (Surface, System, Component, Action, Outcome) transferred without modification. Both projects follow the same naming hierarchy, the same file structure conventions, and the same error naming patterns. An AI assistant familiar with one codebase can navigate the other with minimal reorientation. Sleep Around Points’ SYSTEMS.md, which catalogs dozens of error codes, event definitions, and state machines, follows identical patterns to Popdot AI’s ARCHITECTURE.md.
The CLAUDE.md specification file transferred in structure if not in content. Both projects use a file under 1,200 tokens that provides the System vocabulary table, Surface access rules, and naming conventions. Sleep Around Points expanded the pattern with a NAVIGATOR.md companion that provides reading order by role and links to deeper documentation, demonstrating that the SSA specification file format can scale with documentation needs. In both cases, the minimal CLAUDE.md proved sufficient for session-start context.
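For concreteness, a condensed and entirely hypothetical excerpt of what such a specification file might contain (the actual CLAUDE.md files are not reproduced in this paper):

```markdown
# CLAUDE.md (illustrative excerpt)

## System vocabulary
| System | Domain         |
|--------|----------------|
| Gate   | Authentication |
| Vault  | Payments       |
| Ledger | Bookings       |

## Surface access
See the Surface Access Matrix for per-Surface read/write permissions.

## Naming
Errors follow System.Component.ErrorName (e.g. Vault.Payout.Failed).
```

A file of this shape stays well under the token budget described above because it lists vocabulary and rules rather than explaining the architecture in prose.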
The Surface Access Matrix, which maps read/write permissions per Surface per System, transferred as a documentation pattern. Both projects maintain a grid showing which Surfaces can access which Systems, with what permissions. This matrix serves as both documentation and enforcement reference: when the AI assistant proposes adding a capability to a Surface, the matrix immediately shows whether that access is authorized.
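The matrix pattern lends itself to a direct encoding as data, which is what makes it usable as an enforcement reference. The permissions below are illustrative, and canWrite is a hypothetical helper, not a documented API of either project:

```typescript
// Hypothetical Surface Access Matrix encoded as data. Surface and System
// names follow the paper; the specific permissions are assumptions.
type Perm = "r" | "rw" | null;

const ACCESS_MATRIX: Record<string, Record<string, Perm>> = {
  Market: { Gate: "rw", Shelf: "r",  Ledger: "rw", Vault: "r"  },
  Tower:  { Gate: "rw", Shelf: "rw", Ledger: "rw", Vault: "rw" },
  Wire:   { Sigil: "rw", Ledger: "r", Vault: null },
};

// Enforcement reference: may this Surface write to this System?
function canWrite(surface: string, system: string): boolean {
  return ACCESS_MATRIX[surface]?.[system] === "rw";
}
```

When the AI proposes giving a Surface a new capability, a lookup against this structure answers the authorization question mechanically instead of by convention.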
What Diverged
System vocabulary diverged substantially, as expected. Capability domains are domain-specific; the Systems that manage DVC contracts (Deed, Atlas, Pact) have no analogue in domain rentals, and the Systems that manage agent identity and platform safety (Sigil, Shield) have no analogue in vacation rentals. This divergence validates SSA’s design: the framework provides a structure for naming and organization, not a fixed vocabulary. Each project cultivates the words its domain requires.
The depth of the naming hierarchy also diverged. Popdot AI, with its dozens of Views and over a hundred Panels, pushed the Surface.View.Panel.Action convention further than Sleep Around Points required. The vacation rental marketplace has fewer distinct actor workflows and therefore fewer named UI elements. SSA accommodated this difference without strain; the framework does not require uniform depth across projects.
The Surface structure itself diverged in revealing ways. Popdot AI maintained strict separation between its four Surfaces throughout development. Sleep Around Points merged Helm into Market, collapsing the owner and renter perspectives into a single Surface. As discussed in Section 4.2, this merger created significant challenges for AI collaboration by introducing conditional logic that obscured which actor perspective was relevant at any given point. The divergence suggests that Surface boundary decisions have outsized consequences for the effectiveness of SSA. The Surface is the primary unit of actor-perspective disambiguation; when it is compromised, the framework’s benefits erode.
The governance model diverged in ways that reflected domain requirements. Popdot AI’s Cortex Parity Principle (every agent capability must have an admin counterpart) is specific to platforms where autonomous agents can take financially consequential actions. Sleep Around Points, without agent actors at launch, had no equivalent governance requirement. However, the Tower Surface still provided comprehensive administrative access, including payout operations monitoring, cron job tracking, and an immutable audit trail via Scribe.Audit. The principle of explicit administrative visibility proved valuable even without the specific agent-parity constraint.
Observed Effects on AI Collaboration
Several effects on AI collaboration were consistent across both projects, with the caveat that these are self-reported observations rather than measured outcomes.
Re-onboarding time decreased. Before SSA, starting a new development session required fifteen to thirty minutes of context injection: pasting architectural overviews, explaining system relationships, reminding the AI of conventions. With SSA, session starts require a task description and (on the first session) a reading of the CLAUDE.md file. The AI derives architectural understanding from file paths and naming patterns. Subsequent sessions on the same project typically require only a task description, as the codebase itself provides sufficient context.
Suggestion consistency improved. Before SSA, the AI would propose file locations, function names, and architectural patterns that were locally reasonable but globally inconsistent. A new payment function might be placed in a utils directory rather than in the Vault system. A new API endpoint might use naming conventions that differed from existing endpoints. With SSA conventions in place, the AI’s suggestions became more consistent, because the naming patterns in the codebase trained its expectations. When the codebase consistently places payment logic in lib/systems/vault/, the AI suggests the same location for new payment logic.
Debugging became more transparent. Error messages that follow the System.Component.ErrorName convention (such as Vault.Escrow.InsufficientFunds or Sigil.Mandate.Expired) communicate their origin and nature without requiring stack trace analysis. The AI assistant can identify the failing system, the specific component, and the failure mode from the error name alone, which accelerates the diagnosis-to-fix cycle.
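The diagnostic value comes from the fact that the error name decomposes mechanically into its semantic parts. A hypothetical helper (not claimed to exist in either codebase) makes the decomposition explicit:

```typescript
// Split an SSA error name into System, Component, and failure mode.
// Illustrative helper; the split-on-dot convention is the only assumption.
function parseSsaError(code: string): { system: string; component: string; error: string } | null {
  const parts = code.split(".");
  if (parts.length !== 3) return null;
  const [system, component, error] = parts;
  return { system, component, error };
}

// parseSsaError("Vault.Escrow.InsufficientFunds")
// → { system: "Vault", component: "Escrow", error: "InsufficientFunds" }
```

An AI assistant performs this same decomposition implicitly: the first segment names the directory to open, the second the component file, the third the condition to look for.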
Documentation requirements compressed. Popdot AI’s CLAUDE.md file is approximately 900 tokens. Sleep Around Points follows the same pattern with a CLAUDE.md of comparable size, supplemented by a structured documentation suite (ARCHITECTURE.md, SURFACES.md, SYSTEMS.md) that is itself organized by SSA conventions. The SYSTEMS.md file, which catalogs error codes, events, and state machines, was generated from codebase patterns rather than maintained as a standalone artifact. The total external documentation for both projects is a fraction of what conventional development would require, because the code carries the information that documentation would otherwise need to provide.
These observations are preliminary and subject to the methodological limitations discussed in Section 7. They describe one developer’s experience with one AI assistant on two related projects. Generalization requires the controlled experiments that this paper recommends but has not conducted.
4.4 A Note on Scale
Neither Popdot AI nor Sleep Around Points is a large enterprise system. Popdot AI, the more complex of the two, has over a dozen Systems, hundreds of source files in the systems directory, and a development history spanning from late 2024 through early 2026. These are the codebases of a solo practitioner building production software with AI assistance.
This scale is both a limitation and a feature. It is a limitation because SSA’s behavior at enterprise scale (hundreds of developers, thousands of files, dozens of services) remains unknown. Team dynamics, organizational politics, and coordination overhead could introduce challenges that solo development does not surface.
It is a feature because it represents the fastest-growing segment of software development: individuals and small teams building production systems with AI assistants. The solo practitioner with Claude Code, Cursor, or a similar tool is not an edge case; it is increasingly the norm for new projects. SSA was designed for this context, and its effectiveness should be evaluated primarily in this context, while acknowledging that broader applicability remains to be demonstrated.
5. Related Work and Positioning
5.1 Intellectual Foundations
Semantic Surface Architecture draws on several established traditions in software engineering and knowledge representation. This section situates SSA within that broader landscape, acknowledges intellectual debts, and clarifies what is new.
Domain-Driven Design
The most significant intellectual ancestor of SSA is Domain-Driven Design (DDD), articulated by Eric Evans in his 2003 book [6] and subsequently elaborated by practitioners including Vaughn Vernon [18]. SSA shares several core commitments with DDD. The ubiquitous language, DDD’s central insight that a shared vocabulary used consistently by developers and domain experts reduces translation overhead, directly informs SSA’s emphasis on semantic naming. The System vocabulary (Gate, Vault, Ledger) functions as a ubiquitous language for the domains in which SSA has been applied. DDD’s bounded contexts, where specific models apply within explicit boundaries, resonate with SSA’s Surface abstraction. Each Surface can be understood as a bounded context defined by actor perspective rather than domain subdomain.
SSA diverges from DDD in several important respects. DDD optimizes for human-to-human communication: developers speaking with domain experts, teams coordinating across bounded contexts. SSA optimizes for human-to-AI communication, recognizing that LLMs have different comprehension patterns than human collaborators. DDD’s ubiquitous language tends toward descriptive precision. A PaymentService or BookingRepository clearly describes its function. SSA favors evocative compression: Vault and Ledger sacrifice some descriptive precision for metaphorical resonance that activates LLM associations. And while DDD typically organizes by domain subdomain, SSA’s primary axis is actor perspective, with the same domain entities appearing across multiple Surfaces viewed from different angles.
In March 2024, Evans delivered a keynote at Explore DDD [7] addressing the intersection of DDD and large language models. His observation that “a trained language model is a bounded context” suggests convergent thinking with SSA’s approach. Evans proposed that fine-tuning language models on domain-specific ubiquitous language could make them more effective than general-purpose models. SSA approaches the same goal from the opposite direction: rather than fine-tuning the model, we structure the code so that general-purpose models behave as if they had domain-specific training. Both approaches recognize the fundamental insight that vocabulary shapes understanding. Whether that vocabulary is injected through fine-tuning or encoded in naming conventions, the effect is similar: an LLM that “speaks the domain.” Evans and the Domain Language consultancy have continued this line of inquiry into 2025 and 2026, with DDD Europe 2026 featuring sessions explicitly addressing strategic design in the context of AI-assisted development.
Traditional Architecture Patterns
SSA relates to but is not identical with several established architectural patterns. Traditional layered architecture (presentation, business logic, data access) organizes by technical concern. SSA does not reject layers; implementation may still separate UI components from business logic from data access. But SSA subordinates technical layering to actor-perspective organization: the file system reflects Surfaces first, technical layers second.
Hexagonal architecture, proposed by Alistair Cockburn [5], organizes applications around a domain core with ports and adapters connecting to external systems. SSA’s Surfaces bear some resemblance to ports as entry points through which actors interact with the system. However, SSA’s Surfaces are not merely technical interfaces but complete perspectives with their own coherent worldviews. A port in hexagonal architecture might be BookingPort with methods for creating and retrieving bookings. An SSA Surface is Market, an entire actor experience encompassing browsing, booking, payment, and messaging, unified by the guest’s perspective.
SSA’s Systems are not microservices, either. They are organizational units within a (potentially monolithic) codebase, not deployment boundaries. However, SSA’s System definitions could inform microservice decomposition if scaling requirements warranted it: Vault might become a payment microservice, Signal a messaging microservice. The relationship is one of potential evolution rather than identity.
5.2 The Emerging Semantic Architecture Conversation
When SSA was first developed in 2024 and documented in mid-2025, the idea of organizing codebases specifically for AI comprehension was largely uncharted. In the months since, multiple independent groups have converged on similar problems and, in several cases, similar solutions. This convergence is the strongest external evidence that SSA identified a real need.
Codified Context [22] (arXiv:2602.20478, February 2026) is the most directly comparable framework. It proposes a three-tier architecture for supporting AI agents in complex codebases: “hot memory” in the form of a constitution encoding conventions and orchestration protocols, domain specialists consisting of nineteen specialized agents, and “cold memory” in the form of a knowledge base of thirty-four specification documents. The paper explicitly addresses the problem SSA tackles: LLM-based coding assistants lack persistent memory and lose coherence across sessions. The constitution concept maps closely to SSA’s semantic surface layer, though the approaches differ in emphasis. Codified Context invests heavily in multi-agent orchestration and external knowledge stores. SSA invests in making the codebase itself the primary carrier of meaning. The two approaches are complementary rather than competing.
OutcomeOps [4] (Brian Carpio, October 2025) proposes “self-documenting architecture” where code becomes “queryable” and “legible,” shrinking the distance between what code does and why it does it. The framing echoes SSA’s emphasis on encoding intent in structure, though OutcomeOps focuses more on operational outcomes and less on the naming and actor-perspective conventions that distinguish SSA.
The Semantic Control Plane [14] (Harish Pathak, January 2026) describes “architecture AI can’t break,” focusing on maintaining semantic integrity while using AI in development. The concern with semantic preservation during AI collaboration aligns with SSA’s goals, though the proposed mechanisms differ.
Confucius Code Agent [23] (arXiv:2512.10398, December 2025) addresses scalability of coding agents on large repositories through hierarchical memory stores, including short-term memory, a long-term knowledge base, and evolutionary growth units. The hierarchical approach to organizing information for AI consumption parallels SSA’s five-layer taxonomy, though Confucius focuses on agent memory management rather than codebase architecture.
Code Digital Twin [15] (arXiv:2503.07967, Peng and Wang, 2025) proposes a knowledge infrastructure for AI-assisted development of ultra-complex enterprise systems. The framework models both the physical and conceptual layers of software, preserving what the authors call “tacit knowledge”: responsibilities, intent, and decision rationales distributed across code, configurations, discussions, and version history. Code Digital Twin addresses the same fundamental problem as SSA (AI coding tools struggle in complex systems because they lack architectural understanding) but from the opposite direction. Where SSA encodes knowledge in the codebase itself through naming and structure, Code Digital Twin externalizes it in a living model that co-evolves with the code. The two approaches are complementary: an SSA-structured codebase would be an ideal substrate for a Code Digital Twin, because the semantic naming conventions provide the structured vocabulary that the twin’s knowledge extraction pipelines require.
The fact that these frameworks emerged independently across different research groups and practitioner communities, within roughly the same twelve-month period, suggests that the problem SSA addresses is not idiosyncratic. The field is converging on the recognition that codebases need to be organized with AI comprehension in mind. The specific solutions vary in emphasis and mechanism, but the underlying diagnosis is shared: semantic organization matters for AI collaboration, and current conventions are insufficient.
5.3 Context Engineering and Vibe Coding
SSA emerges from the same problem space as the nascent field of context engineering, which addresses the challenge of effectively utilizing LLM context windows for complex tasks. As described in Section 2, context engineering has matured rapidly from informal practices into a recognized engineering discipline. Practitioners have developed retrieval-augmented generation (RAG) systems, specification files like CLAUDE.md and cursor rules, memory systems for cross-session persistence, and multi-agent architectures for decomposing complex tasks.
SSA does not reject context injection. Even an SSA-structured codebase benefits from specification files and retrieval systems. But SSA shifts the balance: less external context is needed because more context is encoded in the code itself. Consider the token economics. A context injection approach might require injecting an architectural overview (roughly 2,000 tokens), relevant specifications (roughly 3,000 tokens), related code files (roughly 5,000 tokens), and a conversation summary (roughly 1,000 tokens), for a total context overhead of roughly 11,000 tokens. An SSA approach typically requires a task description (roughly 200 tokens), relevant code files that are already self-documenting (roughly 3,000 tokens), and minimal supplementary context (roughly 500 tokens), for a total of roughly 3,700 tokens. The SSA approach consumes fewer tokens for equivalent comprehension because the code carries meaning that would otherwise require documentation. This efficiency compounds across long sessions and complex tasks.
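Restating the paper's rough estimates as arithmetic makes the comparison explicit; these are the paper's own illustrative figures, not measurements:

```typescript
// Token budgets from the comparison above (rough estimates, not measured).
const injectionBudget = 2_000 + 3_000 + 5_000 + 1_000; // overview + specs + code files + summary
const ssaBudget = 200 + 3_000 + 500;                   // task + self-documenting code + extras

console.log(injectionBudget, ssaBudget); // 11000 3700
```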
Parallel to context engineering, the practice of “vibe coding” (expressing intent in natural language and allowing AI to generate implementation) has matured significantly since Karpathy’s original coinage [10] in February 2025. By 2026 the practice has crystallized into specific disciplines: the “Orchestrator Model” emphasizes context architecture as a foundational pillar alongside recursive validation and product intuition. The guiding principle, “build the scaffolding before the walls,” aligns with SSA’s argument that architectural organization should precede and inform code generation.
Several vibe coding frameworks address important aspects of AI-assisted development. The Vibe Programming Framework proposes principles including “Augmentation, Not Replacement” and “Verification Before Trust.” The go-vibe methodology emphasizes documentation-code harmony. Widing’s Product Requirements Prompts (PRPs) [19] provide structured context for AI coding agents. Spec-driven development (as implemented in tools like Kiro) externalizes planning to specification files. These frameworks address workflow and governance, how to work with AI responsibly. SSA addresses architecture, how to structure code so AI understands it. The frameworks are complementary. SSA-structured code would benefit from these governance practices, and these governance practices would be more effective on SSA-structured code.
The Model Context Protocol (MCP), released by Anthropic in November 2024 and since adopted by OpenAI, Google DeepMind, and donated to the Linux Foundation, provides infrastructure that makes SSA-structured codebases directly useful to agent ecosystems. Well-organized codebases with SSA conventions can expose semantic structure through MCP servers, enabling agents to discover and invoke capabilities using the same vocabulary that human developers use.
5.4 Empirical Evidence
When SSA was first documented, the claims about naming conventions and AI comprehension rested entirely on practitioner observation. Since then, several empirical studies have provided evidence, though not testing SSA specifically, that supports its core claims.
A 2026 study on variable naming and AI code completion [21] tested descriptive, minimal, and obfuscated naming conventions across eight models ranging from 0.5 billion to 8 billion parameters. Descriptive names achieved a semantic similarity score of 0.874 compared to 0.802 for obfuscated names, a measurable difference attributable to naming alone. This provides direct empirical support for SSA’s Principle 1, semantic density: naming conventions measurably affect AI comprehension.
A separate 2026 enterprise study [16] examining AI coding assistant usage found that assistants actually slowed developers by 19% in certain contexts. The study identified inconsistent naming conventions and architectural patterns as primary culprits. This finding supports SSA’s broader argument that codebase organization, not just model capability, determines the effectiveness of AI-assisted development.
The first empirical study of cursor rules [9], presented at the Mining Software Repositories conference in April 2026 (MSR ‘26), examines what makes context rules effective in modern AI coding assistants. While the study does not evaluate SSA directly, its focus on rule effectiveness for AI comprehension is directly relevant to understanding which conventions help and which do not.
Research on code semantics and LLM comprehension has also advanced. The “Empica” framework [13] for evaluating whether code LLMs truly understand semantics reveals that even subtle, semantically preserving mutations (non-behavioral code changes) significantly reduce model accuracy. This finding suggests that models are sensitive to surface-level properties of code, including naming, which is precisely the sensitivity that SSA is designed to leverage.
A large-scale empirical study of AI coding agent failures [20] (arXiv:2601.15195, January 2026) analyzed 33,596 pull requests across five autonomous coding agents, examining why agent-generated PRs fail. The study found that failed PRs consistently involve larger code changes touching more files and more frequently fail continuous integration checks. Crucially, failures correlate with the scope and structural complexity of the change rather than with the raw difficulty of the task. This finding supports SSA’s argument from a different angle: if agents fail when changes span more files and cross more boundaries, then an architecture that makes boundaries explicit and navigable should reduce the failure rate. SSA’s named Systems and Surface boundaries give agents clear signals about where one concern ends and another begins, precisely the structural legibility that the study identifies as missing in failed attempts.
A complementary study on agentic refactoring [17] (arXiv:2511.04824, November 2025) found that when AI agents perform code refactoring autonomously, the resulting changes are dominated by low-level mechanical operations (renaming, extracting methods) and consistently fail to reduce higher-order design smells. The agents make locally reasonable transformations but lack the architectural understanding to improve global structure. This is the pattern SSA is designed to interrupt. Without semantic structure encoded in the codebase, agents default to surface-level changes because they cannot infer the deeper organizational intent. SSA’s naming hierarchy (Surface, System, Component, Action, Outcome) provides exactly the kind of structural vocabulary that agents need to reason about design-level concerns rather than limiting themselves to mechanical refactoring.
These studies do not validate SSA as a complete framework. They do, however, provide empirical grounding for the thesis that naming conventions and structural organization are measurable factors in AI code comprehension, which is the foundation on which SSA is built.
5.5 What SSA Does Not Address
Situating SSA within related work also clarifies its boundaries. SSA is an application architecture pattern, not a deployment strategy; it does not address containerization, orchestration, monitoring, or incident response. SSA does not prescribe how teams should be structured, how work should be divided, or how code reviews should be conducted. While SSA conventions extend to test organization, SSA does not provide a comprehensive testing methodology. SSA’s permission model (Surfaces limiting System access) is organizational, not a security architecture; proper authentication, authorization, and security controls remain necessary. SSA provides no guidance on caching, database optimization, or performance profiling.
These limitations are intentional. SSA addresses a specific problem, AI context comprehension, and does not attempt to be a comprehensive software development methodology.
5.6 Positioning
Drawing on this review, SSA can be positioned as follows.
SSA is not a replacement for existing approaches but a complement that addresses an under-explored dimension of AI-assisted development: the semantic structure of code itself. SSA extends DDD’s ubiquitous language to optimize for LLM comprehension. SSA complements context engineering by encoding context in code structure, reducing dependence on external context injection while remaining compatible with RAG, specification files, and memory systems. SSA provides architectural substance for vibe coding methodologies that focus on workflow and governance. SSA leverages emerging research findings on semantic analysis and code naturalness [12], applying theoretical insights to practical architectural design.
The novelty of SSA lies not in any single element (evocative naming, actor perspectives, hierarchical structure) but in their synthesis into a coherent framework specifically designed for human-AI collaborative development. The convergence of independent work on similar problems since SSA’s initial development suggests that the synthesis was timely.
6. Practical Guide: Using SSA with AI Assistants
The preceding sections describe what SSA is and why it exists. This section describes how to use it. The guidance here is drawn from daily practice across two production codebases built with Claude Code, and it is offered as a starting point rather than a prescription.
6.1 Session Workflow
A typical SSA development session follows a predictable rhythm.
At the start of a session, I provide the AI with a brief orientation: the current task, the relevant Surface, and the System or Systems involved. On Popdot AI, a typical session might begin: “We are working on Wire. I need to add rate limiting to Sigil.Mandate so that nascent-tier agents cannot create more than ten mandates per hour, enforced by Shield.Meter.” This single sentence communicates the actor context (Wire, so this is agent-facing API work), the primary System (Sigil), the specific Component (Mandate), the governance dependency (Shield.Meter for rate limiting), and a business rule (nascent tier, ten per hour). On Sleep Around Points, a comparable instruction might be: “We are working on Helm. I need to add a new view to Ledger.Booking that shows owners which Deed.Contract each booking draws points from.” Again, one sentence communicates actor context (Helm, owner-facing), primary System (Ledger), Component (Booking), and cross-System dependency (Deed.Contract). In a conventionally structured codebase, communicating this same context would require pasting file trees, explaining relationships, and describing permission boundaries.
The AI then navigates to the relevant code, and because file paths mirror the SSA hierarchy, it finds what it needs quickly. lib/systems/ledger/booking.ts is where Ledger.Booking logic lives. app/(helm)/dashboard/bookings/page.tsx is where the owner-facing booking UI lives. The structure is predictable, so the AI does not need to search.
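The predictability is mechanical enough to express in a few lines. The following sketch is illustrative, not part of the framework; it assumes the `lib/systems/<system>/<component>.ts` layout shown above.

```typescript
// Hypothetical helper showing how SSA names map to file paths.
// Because the mapping is deterministic, neither the human nor the AI
// needs to search for where a Component's logic lives.
function systemPath(system: string, component: string): string {
  // Systems live under lib/systems/<system>/<component>.ts
  return `lib/systems/${system.toLowerCase()}/${component.toLowerCase()}.ts`;
}

console.log(systemPath("Ledger", "Booking")); // lib/systems/ledger/booking.ts
console.log(systemPath("Sigil", "Mandate"));  // lib/systems/sigil/mandate.ts
```

The point is not the helper itself but the invariant it encodes: given a System.Component name, the file location follows without lookup.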
During the session, if the AI proposes something inconsistent with SSA conventions (placing a new file in the wrong directory, suggesting a name that breaks the pattern), I correct it once. In practice this happens rarely after the first few sessions on a project, because the conventions in the codebase itself train the AI’s expectations. The code is the context.
At the end of a session, there is nothing special to do. No memory files to update, no specification documents to revise. The work product, the code itself, carries the context forward to the next session.
6.2 The CLAUDE.md File for an SSA Project
While SSA reduces dependence on external documentation, a minimal specification file remains useful. For projects using Claude Code, the CLAUDE.md file for an SSA project typically contains:
A one-paragraph project summary describing the domain and the Surfaces.
The System vocabulary as a simple table: System name, mnemonic, one-line purpose. This is the single most valuable piece of external context, because it gives the AI the full vocabulary in roughly 300 tokens.
Surface access rules: which Systems are available on which Surfaces, and any Component restrictions.
Naming conventions: a brief statement of the five-layer pattern (Surface.System.Component.Action.Outcome) with one example.
Anti-patterns: a short list of things to avoid, such as “Do not create new Systems without discussion” and “Do not place Helm-specific logic in Market routes.”
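Taken together, these pieces fit on a single page. A minimal skeleton might look like the following; the project summary, System entries, and access rules are illustrative placeholders drawn from examples used elsewhere in this paper, not a prescribed set.

```markdown
# Project

Short-stay rental marketplace. Surfaces: Market (guest), Helm (owner).

## Systems
| System | Metaphor       | Purpose                       |
|--------|----------------|-------------------------------|
| Vault  | secure storage | Payments and charges          |
| Ledger | record book    | Bookings and point accounting |
| Signal | transmission   | Messaging and notifications   |

## Surface access
- Market: Ledger (read), Vault, Signal
- Helm: Ledger, Vault (read), Signal

## Naming
Surface.System.Component.Action.Outcome
Example: Helm.Ledger.Booking.Cancel.Refunded

## Anti-patterns
- Do not create new Systems without discussion.
- Do not place Helm-specific logic in Market routes.
```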
Popdot AI’s CLAUDE.md is the clearest exemplar of this pattern in practice. Its structure, in roughly 900 tokens, includes a tech stack table (framework, database, ORM, authentication provider, payment processor, and DNS API), the System vocabulary table with one-line purposes, the Surface access table with entry points and metaphors, a documentation map pointing to deeper references (ARCHITECTURE.md, SURFACES.md, SECURITY.md), key framework notes, and concurrency patterns (Serializable isolation for financial operations). The file also links to a NAVIGATOR.md, a quick-orientation companion that provides file path mappings for common development tasks.
The entire file runs under 1,200 tokens. Compare this to the 145,000 tokens of documentation that preceded SSA on the same project. The reduction is not because less information exists; it is because the codebase now carries the information that documentation previously had to provide.
6.3 Vocabulary Development
Developing the System vocabulary for a new project is the most consequential decision in SSA adoption, and it should not be rushed.
Begin by listing the capability domains your system requires. Authentication, payments, messaging, search, notifications: these are the clusters of functionality that will become Systems. For each cluster, brainstorm single-word names that evoke the function through metaphor. Prefer words that are concrete, common in everyday language, and unlikely to be confused with each other. Vault works for payments because the metaphor of secure value storage is immediately legible. Signal works for messaging because it evokes transmission and communication. Avoid words that are too abstract (Flow, Core, Engine) or too domain-specific to carry metaphorical weight.
Test your vocabulary by describing a user flow using only SSA names. “A guest on Market browses the Shelf, selects a listing, begins a Ledger.Booking, completes a Vault.Charge, and receives a Ping.” If the flow reads naturally and a newcomer could roughly follow it, the vocabulary is working. If you have to stop and explain what a System name means, it needs refinement.
Start with Surfaces and Systems. Components, Actions, and Outcomes can emerge organically as you build, because they follow predictable patterns once the higher layers are established. You do not need the full five-layer taxonomy defined before you write your first line of code. You need Surfaces and Systems.
6.4 Migration Path
Adopting SSA in an existing codebase does not require a wholesale rewrite. Incremental adoption is both possible and recommended.
Start with the vocabulary. Define your Surfaces and Systems even if the codebase does not yet reflect them. Write the CLAUDE.md file. Begin using SSA names in conversation with your AI assistant, even when the code still uses conventional names. “The paymentService is what we call Vault.Charge” is a bridge that works surprisingly well. The AI will begin associating both vocabularies.
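The bridge can also live in code, not just in conversation. The following sketch assumes a legacy `paymentService` name kept as an alias for the SSA-named module during migration; all names and the function body are placeholders.

```typescript
// Sketch of a migration bridge: the legacy name aliases the SSA-named
// implementation, so old call sites keep working while new code and
// conversation use the SSA vocabulary.
const VaultCharge = {
  create(amountCents: number): string {
    return `charge:${amountCents}`; // stand-in for real payment logic
  },
};

// Legacy alias retained during incremental migration; both vocabularies
// resolve to the same implementation.
const paymentService = { createPayment: VaultCharge.create };

console.log(paymentService.createPayment(500)); // charge:500
```

Once all call sites have moved to the SSA name, the alias is deleted and the migration for that System is complete.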
Next, adopt SSA conventions for new code. When you add a new feature, structure it according to SSA patterns: place it in the correct Surface directory, name the System and Component according to convention, use the full-path error naming. New code following SSA conventions will coexist with legacy code following conventional patterns. The inconsistency is temporary and manageable.
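Full-path error naming can be sketched as follows. The error class shape here is an assumption for illustration; the convention itself is only that the code string spells out the SSA path, as in the Vault.Charge.CardDeclined example used later in this paper.

```typescript
// Sketch of full-path error naming: the code carries System.Component.Outcome,
// so a log line or an agent's report is self-locating without a stack trace.
class SsaError extends Error {
  constructor(public readonly code: string, message: string) {
    super(`${code}: ${message}`);
    this.name = "SsaError";
  }
}

const declined = new SsaError("Vault.Charge.CardDeclined", "card was declined");
console.log(declined.message); // Vault.Charge.CardDeclined: card was declined
```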
Then, refactor incrementally. As you touch existing files for bug fixes or feature additions, rename and reorganize them to match SSA conventions. This is the approach I followed on Sleep Around Points, and the migration happened naturally over the course of several weeks without a dedicated refactoring sprint.
The key discipline is consistency going forward, not perfection in the past. A codebase where 70% of the code follows SSA conventions and 30% retains legacy naming is still dramatically more legible to an AI assistant than a codebase with no conventions at all.
6.5 Common Pitfalls
Over-engineering. The appeal of comprehensive naming conventions can lead to excessive structure. Not everything warrants SSA treatment. Utility functions, framework boilerplate, and simple configuration can follow conventional patterns without loss. Apply SSA to core domain logic where semantic density pays dividends. Leave the rest alone.
Naming exhaustion. Finding evocative single-word names is harder than it sounds, particularly for large systems with many capability domains. When you run out of good metaphors, it is better to use a clear two-word name (PointBank) than to force a single word that does not evoke the right associations. The principle is semantic density, not single-word dogma.
Vocabulary drift. Over time, especially in team contexts, conventions can erode. Someone creates a new System without following the naming pattern. Someone places a file in the wrong Surface directory. The best defense is the CLAUDE.md file (which the AI reads at every session start) and code review discipline. The AI itself can enforce conventions if instructed to do so.
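A basic drift check need not wait for dedicated tooling. The sketch below assumes Systems live under `lib/systems/<system>/` and uses an illustrative vocabulary list; it flags files that claim a System the project never declared.

```typescript
// Minimal sketch of a convention check against vocabulary drift.
// The declared-System list would normally be read from CLAUDE.md or a
// config file; it is hard-coded here for illustration.
const DECLARED_SYSTEMS = new Set(["vault", "ledger", "signal", "shield"]);

function violatesConvention(filePath: string): boolean {
  const match = filePath.match(/^lib\/systems\/([^/]+)\//);
  // Flag files under lib/systems/ that name an undeclared System.
  return match !== null && !DECLARED_SYSTEMS.has(match[1]);
}

console.log(violatesConvention("lib/systems/vault/charge.ts"));    // false
console.log(violatesConvention("lib/systems/payments/stripe.ts")); // true
```

Run against the file tree in CI, a check like this catches the most common drift (a new directory outside the vocabulary) even though it cannot judge whether a name is evocative.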
The obscurity tax. A developer encountering Vault.Charge.create() for the first time may not immediately recognize this as payment processing. Conventional names like paymentService.createPayment() are more self-explanatory to newcomers. This is a real cost. SSA vocabulary is learned vocabulary, and learning takes time. The investment pays off in AI collaboration efficiency, but you should acknowledge the onboarding cost honestly when introducing SSA to a team.
6.6 The SSA Toolkit
To make SSA adoption as frictionless as possible, this paper is accompanied by a starter kit available at the GitHub repository linked from the whitepaper page. The kit includes three categories of artifacts.
Template files. A CLAUDE.md template provides the skeleton for a project’s AI context file: a tech stack table, a System vocabulary table with blanks to fill in, a Surface access table, naming conventions, and key constraints. The template targets under 1,200 tokens when completed, consistent with the token budgets demonstrated by Popdot AI and Sleep Around Points. Companion templates for SYSTEMS.md (error codes, events, state machines, function catalogs) and NAVIGATOR.md (reading order, file paths, quick orientation) round out the documentation set. These templates encode the patterns described in Section 6.2 without requiring practitioners to derive the format from examples.
A naming cheatsheet and System catalog. The naming cheatsheet condenses the five-layer taxonomy, naming rules, and full-path examples onto a single reference page. The System catalog provides over seventy suggested System names organized by domain (commerce, content, social, healthcare, education, infrastructure, AI agent systems), each with a one-word name, a metaphor, and a capability description. The catalog is a starting point, not a dictionary. Practitioners should select names that resonate with their domain and their team, using the catalog as inspiration rather than prescription.
An AI agent skill. The SSA Vocabulary Workshop is a skill file (SKILL.md) that works with Claude Code, Codex CLI, and ChatGPT through the open Agent Skills specification. The skill guides an AI assistant through the full SSA adoption workflow: scanning the codebase to identify capability domains, proposing evocative System names, explaining each name to the human in plain English, accepting the human’s preferences and adjustments, and generating tailored CLAUDE.md, SYSTEMS.md, and NAVIGATOR.md files. The entire process takes roughly five minutes for a typical project. An OpenClaw-compatible version of the skill is included for users of that platform.
The skill is designed to be discovered and used by AI agents without the human needing to know about SSA in advance. An agent encountering a disorganized codebase can propose the workshop on its own initiative: “I’m having trouble navigating this project. Would you like me to set up a naming system so we can communicate about it more clearly?” The toolkit makes SSA adoption a five-minute conversation rather than an architectural overhaul.
6.7 Agent-First Adoption
The traditional path for a software methodology is: a paper is published, developers read it, teams adopt it. SSA has an unusual opportunity to travel a different path entirely, one where AI agents are the primary adopters and the humans follow.
This inversion is possible because of a convergence in early 2026. OpenClaw, an open-source AI agent that runs locally and controls a user’s computer through messaging platforms like WhatsApp and Telegram, became one of the fastest-growing software projects in GitHub history. Perplexity launched Personal Computer, a Mac-based agent system where users describe goals in natural language and the agent figures out which apps and files to use. Anthropic’s Claude gained Cowork mode for desktop automation. In all of these systems, the human is often not a developer. They are a small business owner, a property manager, a freelancer who had an app built for them and now needs to maintain or modify it.
These humans face a communication problem. They want to say “fix the payment thing” or “why is the login broken?” but have no vocabulary for the parts of their software. Their AI agent faces the same problem from the other side: it needs to explain what it found, what it changed, and what went wrong, but the codebase offers no semantic handles to grab.
SSA solves both sides of this problem simultaneously. Once a codebase has named Systems, the agent can say: “The problem is in Vault, your payment system. Specifically, Vault.Charge.CardDeclined is firing because the Stripe API key expired.” The human does not need to understand code. They understand that Vault means payments, and that something called CardDeclined happened. The vocabulary bridges the gap.
The distribution strategy follows from this insight. Rather than marketing SSA to developers through blog posts and conference talks, the methodology can spread through the AI agents themselves. The SSA Vocabulary Workshop skill, published to agent skill registries (SkillsMP for Claude/Codex/ChatGPT, ClawHub for OpenClaw, Smithery for MCP-compatible clients), becomes discoverable by any agent that encounters a disorganized codebase. The agent adopts SSA not because someone told it to, but because it needs a shared language with its human. The adoption is selfish: the agent’s work gets easier, the human’s comprehension improves, and the codebase becomes more maintainable as a side effect.
This agent-first adoption model has a natural viral property. Once an agent has used SSA on one project, it will recognize the pattern in future projects and suggest it again. The human, having experienced the benefit of being able to say “what’s happening in Vault?” and getting a clear answer, is predisposed to accept the suggestion. The methodology spreads through the AI-human collaboration channel rather than through the developer community channel.
Whether this distribution model succeeds remains to be seen. It depends on agent skill registries reaching critical mass, on the skill being surfaced to agents at the right moment, and on the five-minute adoption time being low enough friction that agents propose it without hesitation. But the possibility itself is instructive: a software methodology designed for AI collaboration may be best distributed through AI collaboration. The medium and the message converge.
7. Limitations and Future Research
7.1 Methodological Limitations
This paper presents a framework developed through practice rather than controlled experimentation. Several methodological limitations constrain the conclusions that can be drawn.
SSA has been applied in depth to two production systems: Popdot AI and Sleep Around Points. While these systems present genuine complexity (multi-actor interactions, regulatory constraints, financial transactions), they remain a small sample. The observed benefits may be specific to the vacation rental and marketplace domains, to my cognitive style and preferences, to Claude as the primary AI assistant, or to the solo-practitioner development context. Generalization to other domains, team sizes, existing codebases, or AI assistants remains unvalidated.
The observations reported in Section 4 compare “before SSA” to “after SSA” within the same projects. This pre-post design cannot isolate SSA’s effect from confounding factors including my own improving skill with AI-assisted development, natural codebase maturity, AI tool improvements during the development period, and accumulated domain familiarity. A rigorous evaluation would require controlled experiments: teams randomly assigned to SSA and non-SSA conditions, working on comparable tasks, with blind evaluation of outcomes. Such experiments have not been conducted.
The quantitative observations (re-onboarding time reduction, suggestion consistency improvement, documentation compression) are self-reported estimates, not instrumented measurements. The author, having invested effort in SSA development, has motivated reasoning to perceive benefits. Confirmation bias may inflate positive observations and discount negative ones.
7.2 Conceptual Limitations
Beyond methodology, SSA has conceptual constraints on its applicability.
The vocabulary development burden is real. Finding evocative single-word names for every System is non-trivial, and some domains resist such naming. Highly technical domains dealing with bit manipulation or memory management may lack natural-language metaphors. Novel domains without established vocabulary offer little to draw from. The Sleep Around Points vocabulary draws on well-established metaphors for authentication, payments, and record-keeping. Not all domains are so accommodating.
SSA optimizes for current LLM architectures and training patterns, which are not static. Future models with better long-term memory or retrieval might reduce the need for semantic density in naming. Post-transformer architectures might have different context window characteristics or attention patterns. SSA addresses a problem that future AI systems might solve through other means. The framework could become unnecessary as AI capabilities evolve, though the benefits for human comprehension would likely persist.
Convention enforcement depends on discipline. In a solo development context, I enforce my own conventions. In team contexts, new members must learn the vocabulary, developers under deadline pressure may revert to familiar patterns, and team members may disagree about naming choices. No tooling currently exists to automatically enforce SSA conventions. Linters can check syntax; they cannot evaluate whether PaymentProcessor should have been named Vault.Charge.
7.3 Future Directions
The limitations above suggest several directions for future research, of which four seem most pressing.
First, controlled experiments. The most valuable contribution would be rigorous empirical validation: teams randomly assigned to SSA and conventional approaches, working on comparable tasks, with blind evaluation of time-to-completion, error rates, and AI suggestion quality. Within-subjects designs where the same developers work under both conditions would also be informative. The emerging empirical work on naming conventions (discussed in Section 5.4) provides methodological templates for such studies.
Second, vocabulary development methods. Systematic approaches to identifying capability domains and appropriate metaphors, methods for testing whether proposed names activate intended associations in LLMs, and cross-cultural validation of vocabulary effectiveness would all make SSA more accessible to practitioners.
Third, tooling. SSA adoption would benefit from linting rules that enforce naming conventions, IDE integrations that suggest SSA-compliant names, documentation generators that extract vocabulary from codebase structure, and migration assistants that analyze existing codebases and suggest SSA vocabulary mappings.
Fourth, team adoption studies. Research on how quickly new developers become productive with SSA vocabularies, effective governance models for vocabulary decision-making, and barriers to adoption in team settings would extend SSA from solo practice to broader applicability.
7.4 Risks for Practitioners
Practitioners considering SSA adoption should be aware of several risks.
Premature standardization: adopting vocabulary early in a project may lock in naming decisions that prove inappropriate as understanding deepens. Begin with provisional vocabulary explicitly labeled as subject to change.
Over-engineering: the temptation to create Systems for every capability risks unnecessary complexity. Apply SSA selectively to core domain logic.
Documentation neglect: the premise that code is self-documenting may lead to underinvestment in explicit documentation that remains necessary for complex business rules, integration details, and architectural rationale that naming cannot capture. SSA reduces but does not eliminate documentation needs.
LLM mismatch: vocabulary optimized for one model may be less effective with another. Choose vocabulary based on general evocativeness rather than model-specific testing, and validate effectiveness when changing AI tools.
8. Conclusion: Teaching Code to Speak
8.1 What We Have Proposed
This paper has introduced Semantic Surface Architecture, a framework for organizing software systems in ways that optimize for collaboration between human developers and artificial intelligence.
The core claims are modest. AI coding assistants struggle with context, losing track of architectural decisions, making inconsistent suggestions, and requiring extensive re-onboarding across sessions. Current solutions focus primarily on context injection, sophisticated systems for getting the right documentation into the model’s context window at the right time. An alternative approach, semantic encoding, addresses the same problem by embedding context in the code itself through naming conventions, organizational structures, and vocabulary choices designed for LLM comprehension. SSA provides one such encoding scheme, organized around actor-centric Surfaces, evocatively named Systems, and a five-layer hierarchy from Surfaces through Outcomes. Preliminary application to two production codebases suggests benefits in reduced re-onboarding time, improved suggestion consistency, and clearer architectural communication. And since this framework was first developed, multiple independent groups have converged on similar problems, while empirical research has begun to validate the thesis that naming conventions measurably affect AI code comprehension.
These claims remain provisional. The evidence is limited. The generalizability is unknown. The appropriate stance is cautious experimentation, not evangelical adoption.
8.2 The Deeper Pattern
Every major shift in software development has involved rethinking how we communicate intent. Assembly language communicated intent to hardware. High-level languages communicated intent to compilers. Object-oriented design communicated intent to other developers. Agile methodologies communicated intent across disciplines. Each shift required new conventions, new vocabularies, new organizational structures.
We are in another such shift. The audience for our code now includes minds that did not exist five years ago, minds that are rapidly becoming more capable, more prevalent, and more integrated into every phase of the development lifecycle.
The question is not whether to adapt. The question is how.
SSA proposes that we adapt by encoding meaning more densely, by organizing around perspectives rather than technical layers, by choosing names that resonate with how language models process language. These specific proposals may prove incorrect. But the general principle, that AI collaboration requires deliberate architectural design, seems likely to endure.
8.3 An Invitation
This paper is not the final word on Semantic Surface Architecture. It is an opening statement.
To practitioners: try these ideas. Apply them to your projects. Report what works and what doesn’t. The framework will improve through distributed experimentation more than through centralized theorizing.
To researchers: test these claims. Design rigorous experiments. Measure what I have only estimated. The field needs evidence, not enthusiasm.
To tool builders: create the tooling that SSA lacks. Linting rules, IDE integrations, documentation generators, migration assistants. The framework is only as useful as its practical adoption, and practical adoption requires practical tools.
To skeptics: challenge these ideas. Propose alternatives. The goal is not to defend SSA but to find what actually works for human-AI collaboration. If that turns out to be something else entirely, the field advances regardless.
There is a surface tension now at the boundary between human and artificial intelligence, and we feel it every time we open a new session and have to explain, again, what we already explained yesterday. We can learn to work with that tension rather than against it. We can teach our code to speak.
The experiment continues.
Acknowledgments
This work was developed through extensive collaboration with AI assistants, primarily Anthropic’s Claude, which contributed to both the codebases that prompted these ideas and the articulation of the framework itself. The irony is not lost on the author that a methodology for AI collaboration was developed in collaboration with AI.
Thanks are owed to the broader community of AI-assisted development practitioners whose shared experiences, frustrations, and innovations created the context from which SSA emerged. Though we have not met, we are working on the same problems.
Any errors, overstatements, or failures of rigor are the author’s alone.
References
[1] Abdelaziz, I., et al. (2021). A Toolkit for Generating Code Knowledge Graphs. K-CAP 2021.
[2] Anthropic. (2025). Effective Context Engineering for AI Agents. Anthropic Engineering Blog.
[3] Anthropic. (2026). 2026 Agentic Coding Trends Report.
[4] Carpio, B. (2025). OutcomeOps: Self-Documenting Architecture. OutcomeOps Blog.
[5] Cockburn, A. (2005). Hexagonal Architecture. alistair.cockburn.us.
[6] Evans, E. (2003). Domain-Driven Design: Tackling Complexity in the Heart of Software. Addison-Wesley.
[7] Evans, E. (2024). DDD and Large Language Models. Keynote, Explore DDD Conference.
[8] Guo, D., et al. (2021). GraphCodeBERT: Pre-training Code Representations with Data Flow. ICLR 2021.
[9] Jiang, S. & Nam, D. (2025). Beyond the Prompt: An Empirical Study of Cursor Rules. MSR ’26 / arXiv:2512.18925.
[10] Karpathy, A. (2025). On Vibe Coding. X/Twitter, February 2, 2025.
[11] Lewis, P., et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS 2020.
[12] Maninger, S., et al. (2024). Towards Trustworthy AI Software Development Assistance. ICSE 2024 New Ideas Track.
[13] Nguyen, T.T., et al. (2025). An Empirical Study on Capability of Large Language Models in Understanding Code Semantics. Information and Software Technology / ScienceDirect.
[14] Pathak, H. (2026). The Semantic Control Plane: Architecture AI Can’t Break. Medium.
[15] Peng, Z. & Wang, Y. (2025). Code Digital Twin: A Knowledge Infrastructure for AI-Assisted Development of Ultra-Complex Enterprise Systems. arXiv:2503.07967.
[16] Rasheed, Z., et al. (2026). Usage, Effects and Requirements for AI Coding Assistants in the Enterprise. arXiv:2601.20112.
[17] Shirafuji, D., et al. (2025). Agentic Refactoring: An Empirical Study on the Impact of AI-Assisted Refactoring. arXiv:2511.04824.
[18] Vernon, V. (2013). Implementing Domain-Driven Design. Addison-Wesley.
[19] Widing, R. (2024). PRPs for Agentic Engineering. GitHub Repository.
[20] Yang, S., et al. (2026). Where Do AI Coding Agents Fail? An Empirical Study of Failed Pull Requests from AI Coding Agents. arXiv:2601.15195.
[21] Zhang, T., et al. (2026). Variable Naming Impact on AI Code Completion: An Empirical Study. Research Square, rs-7180885/v1.
[22] Zhao, Y., et al. (2026). Codified Context: Infrastructure for AI Agents in a Complex Codebase. arXiv:2602.20478.
[23] Zhou, Y., et al. (2025). Confucius Code Agent: Scalable Agent Scaffolding for Real-World Codebases. arXiv:2512.10398.
© 2026 Michael Latulippe. This work is offered for discussion and experimentation. The author welcomes correspondence, criticism, and collaboration.
End of paper