Does this apply to Training and Audit too, or only to builds?

The discipline applies to all three offers: scope before we act, lead with evidence, and leave you owning the artifacts. The four phases on this page are the in-depth method for Build & Run specifically, when the work is to ship and operate AI in production. Training and Audit & Consulting are shorter engagements that share the same habits but produce different deliverables (a trained team, a sequenced roadmap).

At the end, do we own the AI: the prompts, models, and evals?

Yes. The code is in your repositories from the first commit, the IP transfer is in the contract, and the handover includes the prompts, the eval suite, and the guardrails, not just a running endpoint. The point is that your team can change the system safely after we step back, including swapping in a better or cheaper model, without depending on us.

Do you work on a fixed price or time-and-materials basis?

Both, depending on the engagement shape. Scoped deliverables (an audit, a defined-scope build) work on fixed price against a written statement of work. Open-ended engagements (a long-term build with an evolving roadmap, or a support retainer) work on T&M with a documented monthly ceiling. We are explicit about which model fits your project during phase one.

What does the team composition look like during a typical project?

Two to four senior engineers full-time on the engagement, drawn from the discipline leads you can meet on the about page. There is no junior bench, no offshore overflow, and no separate project manager. The lead engineer on the project is your single point of contact and stays accountable from scoping through delivery.

Do you sign NDAs and DPAs?

Yes. Mutual NDAs are standard before phase one begins. For projects that involve personal data, a Data Processing Agreement (DPA) is signed before any production data is touched, and our processing posture is documented on the security engineering page.

What happens if scope changes mid-project?

Scope change is a structured event, not a surprise. Change requests go through a written change-order process that reviews the technical and timeline impact, and both sides sign off on the change before we build against it. We do not silently absorb scope creep, and we do not pretend the change was always in scope.

Do you offer post-launch retainers?

Yes. After the defined support window, we offer ongoing engineering retainers for clients whose product we built, typically two days per week scoped to maintenance, security, and incremental feature work, with a documented monthly ceiling. We do not offer retainers for products we did not build, because we cannot vouch for the underlying engineering.

Do you guarantee timelines?

We commit to milestones with written acceptance criteria; we do not guarantee a date against an ambiguous scope. The honest answer is that any vendor who guarantees a date against a vague scope is either lying or planning to cut corners, neither of which is the engineering we want to ship. Once scope is defined, our delivery dates are reliable because we say no to scope changes that the timeline cannot absorb.

Can we audit the code while it is being written?

Yes, and you should. The code is in your repositories from the first commit, your engineers have read access, and we expect the CTO or accountable technical sponsor on your side to spot-review pull requests during the engagement. The visibility is the point.

How we work

How we work,
across every offer.

Whichever offer you start with (Training, Audit & Consulting, or Build & Run), the engagement runs on one discipline and ends the same way: your team owns the AI and runs it without us. Below is how that plays out, from the shared method to the four deliberate phases of a production build.

One path

One path to
AI autonomy.

The three offers are rungs of one ladder. You can step on at any rung; the discipline behind each is the same, and the endpoint never changes: your team masters the AI in-house.

Training gets your people fluent (Understand). Audit & Consulting tells you where AI is worth it and where it isn't (Decide). Build & Run designs, ships, and operates the AI itself (Implement → Operate). Whatever the rung, three habits run through every engagement: we scope before we act, we lead with evidence over opinion, and we leave artifacts you own, vendor-portable, with no dependency on SDEN baked into the next step.

That is what 'autonomy' means here. We are not optimizing for a long retainer; we are optimizing for the day you no longer need us. The more we work together, the less you depend on us, and every deliverable, from a training recording to a production codebase, is written to be picked up by your team without us in the room.

01 Understand
Your teams get fluent in AI, its risks, and its limits.
02 Decide
You learn where AI is worth it, where it's risky, and what to do first.
03 Implement
We design, build, and ship the AI into production.
04 Operate
We run it with you, then hand it over. You own it.

The method, in depth

The Build & Run method,
phase by phase.

When the work is to build and operate AI in production, here is exactly how it runs: four deliberate phases, each ending in artifacts you own, plus the engineering disciplines that keep the system honest and free of debt.

Phase 01

Scoping & architecture

Phase one produces a concrete decision: build, buy, or do not start. SDEN delivers the architecture, the risk register, and a go / no-go recommendation (including whether AI is even the right tool) before any production code is written.

Scoping at SDEN is not a discovery call. It is a structured investigation that produces a written problem statement naming the actual job the software has to do, an architecture and decision log explaining the technical choices we recommend (and the alternatives we considered), and a risk register that ranks what could go wrong by exploitability and business impact. For AI work it goes further: an honest feasibility read (is this a problem a model can solve reliably, or is it a rules engine wearing an AI costume?), a build-versus-buy call against off-the-shelf options, and a look at whether your data is actually ready to feed a model. The phase ends with a go / no-go we are willing to stand behind in front of your board.
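As a rough illustration of ranking a risk register by exploitability and business impact, here is a minimal sketch. The 1-to-5 scales, the multiplicative score, the tie-break rule, and the example entries are all assumptions made for illustration; the page does not specify a scoring formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    exploitability: int   # assumed scale: 1 (hard to trigger) to 5 (trivial)
    business_impact: int  # assumed scale: 1 (minor) to 5 (severe)

    @property
    def score(self) -> int:
        # Assumed scoring rule: a simple product of the two axes.
        return self.exploitability * self.business_impact


def rank_risks(risks: list[Risk]) -> list[Risk]:
    """Return risks from highest to lowest score; ties break on business impact."""
    return sorted(risks, key=lambda r: (r.score, r.business_impact), reverse=True)


# Hypothetical entries, for illustration only.
register = [
    Risk("Model provider outage", exploitability=3, business_impact=3),
    Risk("Prompt injection via uploaded documents", exploitability=4, business_impact=4),
    Risk("Personal data retained past its legal limit", exploitability=2, business_impact=5),
]

for risk in rank_risks(register):
    print(f"{risk.score:>2}  {risk.name}")
```

A product of two small integer scales is only one possible convention; the useful property is that the ordering is explicit and reproducible rather than argued case by case.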

We ask the questions other vendors skip. Who is the accountable decision-maker on your side? What does the work look like for the user the day after launch, not at the demo? What data does the product create, where does it live, and who is allowed to see it? Where does the system sit under the EU AI Act's risk tiers? Above all: what does 'good' mean, measured? We define the eval criteria here, in phase one, so 'it works' is a number we agreed on rather than a vibe at the demo. The phase is paid, time-boxed, and short, typically one to three weeks depending on scope.

What this phase produces

Written problem statement with measurable success and eval criteria
Architecture diagram + decision log (ADRs), including the build-vs-buy and 'AI vs not' call
Risk register ranked by exploitability and business impact, with EU AI Act classification where it applies
Data-readiness read for any AI use case (sources, quality, access, retention)
Go / no-go recommendation, with the scope we would commit to

Rituals we keep

One stakeholder interview per accountable decision-maker on your side
Architecture review with the engineers who would build the project
Mid-phase checkpoint with a draft of the artifacts
Final phase review where we present the go / no-go recommendation

What we refuse during this phase

We will not skip scoping because 'we already know what we want.' We have rescued enough projects to know that statement is rarely true.
We will not recommend AI where a simpler tool wins. If a script, a rule, or an off-the-shelf product solves it, that is what the go / no-go will say.
We will not write production code in this phase. Anything we ship here is a thought-prototype, clearly marked as throwaway.

Phase 02

Design & prototyping

Phase two turns the scoping decision into something you can validate before commitment: a real prototype against real data, with the model and architecture choices made on the record.

A demo is a video. A prototype is something the user can actually use, against the actual data shape, on the actual stack we will ship to production. Phase two delivers the latter. The team designs the core flows, builds an interactive prototype for the highest-risk paths (the ones that decide whether the product is usable, not the ones that decide whether it is pretty), and validates them against real data and real users before any production code lands.

For AI, this is where the consequential choices are made and written down: which model, and why; retrieval (RAG) versus fine-tuning versus a plain prompt; a single call versus an agent; and the cost and latency budget each path has to live inside. We bias toward boring tech under the AI (frameworks, databases, and hosting the community has already debugged at scale) and we name the reason for every choice. The phase also designs the eval harness: the graded test set that will tell us, automatically and on every change, whether the AI is getting better or worse. Phase two produces the prototype, a design system (tokens, components, accessibility baseline), and the eval and test plan for the production phase.
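To make the idea of an eval harness concrete, here is a minimal sketch of a graded test set run against a model. The `EvalCase` shape, the exact-match grader, and the stand-in `toy_model` function are assumptions for illustration; a real harness would call the chosen model and would likely use richer graders.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    expected: str  # reference answer the output is graded against


def exact_match(output: str, expected: str) -> bool:
    # Assumed grader: case- and whitespace-insensitive exact match.
    return output.strip().lower() == expected.strip().lower()


def run_evals(model: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases the model passes (0.0 to 1.0)."""
    if not cases:
        raise ValueError("the graded test set is empty")
    passed = sum(exact_match(model(c.prompt), c.expected) for c in cases)
    return passed / len(cases)


def toy_model(prompt: str) -> str:
    # Stand-in for a real model call; answers from a fixed lookup table.
    answers = {"Capital of France?": "Paris", "2 + 2?": "4"}
    return answers.get(prompt, "unknown")


# Hypothetical graded test set.
cases = [
    EvalCase("Capital of France?", "paris"),
    EvalCase("2 + 2?", "4"),
    EvalCase("Capital of Peru?", "Lima"),
]

pass_rate = run_evals(toy_model, cases)
print(f"pass rate: {pass_rate:.2f}")
```

Because the model is passed in as a plain function, the same test set can be rerun unchanged when the prompt or the model is swapped.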

What this phase produces

Interactive prototype of the highest-risk flows, running on real data
Model and architecture decision (model choice, RAG / fine-tune / agent) with written rationale (ADRs)
An eval harness: a graded test set and the metrics production will measure on every change
Cost and latency budget per AI path, named up front
Design system (tokens, components, accessibility baseline)

Rituals we keep

Two user-testing sessions on the prototype before phase three begins
Weekly design review with engineering in the room
Mid-phase stakeholder demo of the live prototype
End-of-phase walkthrough of the test plan

What we refuse during this phase

We will not present a prototype that mocks the failure paths. If the demo breaks when the user makes a mistake, the prototype is incomplete.
We will not pick the biggest model by reflex. The default is the smallest model that passes the evals inside the cost and latency budget.
We will not pick a stack because it is trending. The default answer is the boring stack that the team can still maintain in three years.
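The "smallest model that passes the evals inside the cost and latency budget" rule can be sketched as a simple filter-then-minimize step. The candidate labels, the `size_rank` ordering, and every number below are made up for illustration; they are not measurements of any real model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    name: str                 # hypothetical label, not a real model name
    size_rank: int            # 1 = smallest candidate
    eval_pass_rate: float     # share of the graded test set passed
    cost_per_call_usd: float
    p95_latency_ms: float


def pick_smallest_passing(
    candidates: list[Candidate],
    min_pass_rate: float,
    max_cost_usd: float,
    max_latency_ms: float,
) -> Optional[Candidate]:
    """Return the smallest candidate that meets every threshold, or None."""
    eligible = [
        c for c in candidates
        if c.eval_pass_rate >= min_pass_rate
        and c.cost_per_call_usd <= max_cost_usd
        and c.p95_latency_ms <= max_latency_ms
    ]
    return min(eligible, key=lambda c: c.size_rank, default=None)


# Made-up measurements, for illustration only.
candidates = [
    Candidate("small", 1, 0.82, 0.001, 300.0),
    Candidate("medium", 2, 0.93, 0.004, 700.0),
    Candidate("large", 3, 0.96, 0.020, 1800.0),
]

choice = pick_smallest_passing(
    candidates, min_pass_rate=0.90, max_cost_usd=0.01, max_latency_ms=1000.0
)
print(choice.name if choice else "no candidate fits the budget")
```

Returning `None` when nothing qualifies keeps the "no candidate fits" outcome visible instead of silently falling back to the biggest model.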

Phase 03

Development & hardening

Phase three turns the validated prototype into production-grade software: in short iterations, with code review on every change, security built in from the design stage, and no technical debt left behind.

Development at SDEN runs in two-week iterations with a working build at the end of every one. Every pull request is reviewed by a second engineer before merge; the merge gate runs the test suite, the eval suite from phase two, the security scanners (SCA, SAST, secret scanning), and the type checker. Branch protection is on, signed commits are on, and no engineer (including the most senior) can bypass the checks. The result is a codebase where the second engineer to read any file already understands it, and where a change that quietly makes the AI worse fails CI instead of reaching production.
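One way a merge gate might fail a change that quietly lowers AI quality is to compare the current eval score with a stored baseline. This is a sketch under assumptions: the function names, the tolerance value, and the example scores are hypothetical, and in a real pipeline the scores would come from the eval run rather than being hard-coded.

```python
def eval_gate(baseline: float, current: float, tolerance: float = 0.01) -> bool:
    """Pass only if the score has not dropped more than `tolerance` below baseline."""
    return current >= baseline - tolerance


def gate_exit_code(baseline: float, current: float) -> int:
    """Return a process exit code: 0 lets the merge proceed, 1 blocks it."""
    if eval_gate(baseline, current):
        print(f"eval gate passed: {current:.3f} (baseline {baseline:.3f})")
        return 0
    print(f"eval gate FAILED: {current:.3f} is below baseline {baseline:.3f}")
    return 1


# Hypothetical scores for two pull requests.
improving_change = gate_exit_code(baseline=0.91, current=0.93)
regressing_change = gate_exit_code(baseline=0.91, current=0.87)
```

A CI job would typically pass the returned code to `sys.exit`, so that a regression blocks the merge the same way a failing unit test does.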

Hardening is the part most vendors skip. It is the work that turns 'it runs' into 'it runs in production for the next three years.' For AI that means guardrails on inputs and outputs, cost ceilings so a runaway loop cannot bankrupt the feature, and a security review against prompt injection and the OWASP LLM Top 10 alongside the OWASP Top 10 and the relevant ASVS level. It also means load testing against expected traffic shapes, chaos testing the failure modes the risk register flagged in phase one, and the no-technical-debt clause: we will not ship code we would not be willing to take a 2 a.m. page on. If a deadline forces a shortcut, the shortcut lands in the issue tracker as a P0 and gets paid back before the next feature lands. We document the discipline explicitly in our security engineering posture.
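A cost ceiling of the kind described here can be as small as a counter that refuses the call that would cross the limit. The sketch below is illustrative only: the class names, the use of integer cents, and the per-call cost are assumptions, and it covers spend limits alone, not input or output guardrails.

```python
class CostCeilingExceeded(RuntimeError):
    """Raised when a call would push spend past the configured ceiling."""


class SpendLimiter:
    def __init__(self, ceiling_cents: int) -> None:
        if ceiling_cents <= 0:
            raise ValueError("ceiling must be positive")
        self.ceiling_cents = ceiling_cents
        self.spent_cents = 0

    def charge(self, cost_cents: int) -> None:
        """Record a call's cost, refusing it if it would cross the ceiling."""
        if self.spent_cents + cost_cents > self.ceiling_cents:
            raise CostCeilingExceeded(
                f"spend {self.spent_cents} + {cost_cents} exceeds {self.ceiling_cents} cents"
            )
        self.spent_cents += cost_cents


# Simulate a runaway loop: each hypothetical model call costs 30 cents.
limiter = SpendLimiter(ceiling_cents=100)
calls_made = 0
try:
    while True:
        limiter.charge(30)  # check the budget before making the call
        calls_made += 1
except CostCeilingExceeded as stop:
    print(f"stopped after {calls_made} calls: {stop}")
```

Charging before the call, rather than after, means the loop stops at the last affordable call instead of one call past the ceiling.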

What this phase produces

Production-grade application in your repositories, deployed to staging
Test and eval suites covering success paths, error paths, edge cases, and AI quality
Guardrails and cost ceilings wired in (input/output checks, spend limits)
Security review against the OWASP Top 10, the OWASP LLM Top 10, and the relevant ASVS level
Load and chaos test results against the documented traffic shapes

Rituals we keep

Two-week iteration cadence with a working build at the end of each
Weekly demo to the client stakeholder, live on staging
Code review on every PR, no exceptions
End-of-iteration retrospective with the engineering team

What we refuse during this phase

We will not bypass the merge checks under deadline pressure. If the check is wrong, the check is the bug.
We will not ship an AI feature that fails its own evals, or one with no guardrails or cost ceiling. 'It looked fine in the demo' is not a release criterion.
We will not ship code with a known unmitigated high-severity vulnerability. The vulnerability gets fixed; the deadline gets renegotiated honestly.

Phase 04

Delivery & support

Phase four is a controlled production release with the operational artifacts a team needs to run the software (and the AI inside it) once SDEN's role tapers: runbook, monitoring, SLOs, on-call playbook.

Delivery is staged. The first production release lands behind a feature flag for a single tenant or a small user cohort. We watch the SLOs (and, for AI, the quality and cost numbers) against real production traffic for a defined observation window, then expand to the full audience once the data confirms the system behaves as expected. The release is not a calendar event. It is a measurable state change.
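A staged rollout of this shape usually needs two pieces: a stable way to decide who is in the cohort, and explicit criteria for expanding it. The following is a minimal sketch, assuming hash-based bucketing and two example SLO thresholds; the flag name, user IDs, and threshold values are hypothetical.

```python
import hashlib


def in_rollout(user_id: str, flag: str, percent: int) -> bool:
    """Deterministically place a user in the first `percent` of 100 buckets."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent


def may_expand(
    error_rate: float,
    p95_latency_ms: float,
    *,
    max_error_rate: float,
    max_p95_latency_ms: float,
) -> bool:
    """Expand only when every observed number is inside its agreed limit."""
    return error_rate <= max_error_rate and p95_latency_ms <= max_p95_latency_ms


# Hypothetical users and flag name.
users = [f"user-{n}" for n in range(1000)]
cohort = [u for u in users if in_rollout(u, "ai-assistant", percent=10)]
print(f"{len(cohort)} of {len(users)} users are in the first cohort")

# Hypothetical observation-window numbers.
print(may_expand(0.002, 640.0, max_error_rate=0.01, max_p95_latency_ms=800.0))
```

Hashing the flag name together with the user ID keeps each user's assignment stable across requests, and a user admitted at 10 percent stays admitted when the rollout widens.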

What we hand over with the release is what makes the engagement durable: a runbook for every operational task, a monitoring dashboard that exposes the SLOs we committed to plus the AI's behavior in production (quality, drift, hallucination rate, and spend, not just uptime), an on-call playbook for the incidents the risk register anticipated, an incident response template, and documentation written for the next engineer who joins your team. Crucially, the handover transfers the AI itself: the prompts, the eval suite, and the guardrails, so your team can change the system safely, not just keep it running. SDEN engagements do not end at production. They taper. We commit to a defined support window after launch (usually three to six months, scoped to the engagement) during which we operate the system jointly with your team, and during which the operational knowledge transfers, not just the code.
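Monitoring AI behavior rather than just uptime often comes down to tracking a rate over recent traffic and alerting when it crosses a threshold. The sketch below assumes each response has already been marked as flagged or not (for example, by a grader that detects unsupported claims); how that flag is produced, the window size, and the alert threshold are all assumptions.

```python
from collections import deque


class RollingRate:
    """Track the share of flagged events over the last `window` observations."""

    def __init__(self, window: int, alert_above: float) -> None:
        self.events: deque = deque(maxlen=window)
        self.alert_above = alert_above

    def record(self, flagged: bool) -> None:
        self.events.append(flagged)

    @property
    def rate(self) -> float:
        return sum(self.events) / len(self.events) if self.events else 0.0

    def should_alert(self) -> bool:
        # Only alert once the window is full, to avoid noise on small samples.
        return len(self.events) == self.events.maxlen and self.rate > self.alert_above


# Hypothetical stream of graded responses: True means the response was flagged.
monitor = RollingRate(window=4, alert_above=0.25)
for flagged in [False, False, True, True]:
    monitor.record(flagged)

print(f"flagged rate: {monitor.rate:.2f}, alert: {monitor.should_alert()}")
```

The same structure works for any per-response signal, such as a failed quality check or a cost overrun, as long as it can be reduced to a yes-or-no flag.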

What this phase produces

Staged production release with feature-flag rollout
Operational runbook for every routine production task
Monitoring of SLOs and AI behavior: quality, drift, hallucination rate, and cost
On-call playbook for the incidents the risk register anticipated
Handover of the prompts, evals, and guardrails, with documentation for the next engineer

Rituals we keep

Pre-release readiness review against the published checklist
Staged rollout with documented expansion criteria
Joint on-call rotation during the support window
Monthly review of the SLO data during the support window

What we refuse during this phase

We will not declare a release 'live' before the SLOs have been observed against real traffic.
We will not abandon the team at handover. The handover is a defined process, not an email with a ZIP attached.

FAQ

Approach:
questions about engagement.

Direct answers to the questions we get asked the most. If yours isn't covered, write to the team.

Contact the team
