Enforcement

The Flowstate Agent ships in observe mode by default — it sees every request, forwards every response, blocks nothing. Enforcement turns it into a gate: the same agent applies your organisation's AI policy at request time, blocking calls that violate a rule.

This is a deliberate, ratchet-style step. Once enforcement is on, the agent becomes part of the request path — uptime, latency, and rule correctness all matter to your engineers.

Before you turn it on

Watch the data in observe mode for at least two weeks. You're looking for:

Which providers your engineers actually use. Your provider allow-list should match.
What models they use most. Banning a model nobody uses is harmless; banning a popular one without warning is not.
What "normal" looks like per team — calls/day, tokens/day, model mix. Enforcement thresholds (monthly caps, anomaly alerts) calibrate against this baseline.
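One way to turn observe-mode data into the per-team baseline described above is sketched below. It assumes you have exported observed calls into records with a team, a date, a model, and a token count; the `ObservedCall` shape and the `team_baseline` helper are hypothetical and not part of the Flowstate API.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ObservedCall:
    # Hypothetical record shape; a real export may use different fields.
    team: str
    day: date
    model: str
    tokens: int


def team_baseline(calls):
    """Summarise calls/day, tokens/day and model mix for each team."""
    by_team = defaultdict(list)
    for call in calls:
        by_team[call.team].append(call)

    baseline = {}
    for team, team_calls in by_team.items():
        # Averages are taken over days with at least one call,
        # not over calendar days.
        active_days = len({c.day for c in team_calls})
        total_calls = len(team_calls)
        model_counts = Counter(c.model for c in team_calls)
        baseline[team] = {
            "calls_per_day": total_calls / active_days,
            "tokens_per_day": sum(c.tokens for c in team_calls) / active_days,
            "model_mix": {m: n / total_calls for m, n in model_counts.items()},
        }
    return baseline
```

Averaging over active days keeps weekends and holidays from dragging the baseline down; if you would rather calibrate against calendar days, divide by the length of the observation window instead.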

Once the baseline is clear, you've earned the right to enforce.

What enforcement evaluates

Every captured request is evaluated against the active AI policy:

Rule type	What it checks
`PROVIDER_ALLOWLIST`	The provider hostname is in the allow-list.
`PROVIDER_DENYLIST`	The provider hostname is not in the deny-list.
`MODEL_ALLOWLIST`	The model identifier is in the allow-list.
`MODEL_DENYLIST`	The model identifier is not in the deny-list.
`MONTHLY_SPEND_CAP_PER_SUBJECT`	The subject's month-to-date observed spend on one provider is below the cap.
`BUDGET_CROSS_PROVIDER`	The subject's rolling-window aggregate spend across many providers is below the cap. In development — see callout below.
`COST_CENTRE_REQUIRED`	The session is attributed to a cost centre (i.e. the user is allocated to a project that has one).
`CONTENT_DLP`	The prompt does not match a DLP rule (requires Enterprise capture mode).
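To make the four list-based rule types concrete, here is a minimal sketch of how such checks could be evaluated. It is an illustration only, not the agent's actual evaluator: the `ListRule` and `Request` shapes, the function names, and the hostname in the usage example are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ListRule:
    rule_type: str      # one of the four list-based rule types
    values: frozenset   # hostnames or model identifiers


@dataclass(frozen=True)
class Request:
    provider_host: str
    model: str


def violates(rule, request):
    """Return True if the request breaks this allow-list or deny-list rule."""
    if rule.rule_type == "PROVIDER_ALLOWLIST":
        return request.provider_host not in rule.values
    if rule.rule_type == "PROVIDER_DENYLIST":
        return request.provider_host in rule.values
    if rule.rule_type == "MODEL_ALLOWLIST":
        return request.model not in rule.values
    if rule.rule_type == "MODEL_DENYLIST":
        return request.model in rule.values
    raise ValueError(f"unsupported rule type: {rule.rule_type}")


def first_violation(rules, request):
    """Return the first rule the request violates, or None if it passes."""
    for rule in rules:
        if violates(rule, request):
            return rule
    return None
```

For example, with an allow-list containing only `api.example-ai.test` and a deny-list containing `big-model`, a request for `big-model` on the allowed host would be reported against the `MODEL_DENYLIST` rule.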

For worked examples of the spend-cap rules — including the cross-provider "John's £1000/month total" pattern — see Budgets and spend caps.

In development: BUDGET_CROSS_PROVIDER

You can author rules of this type today; the policy editor and the params schema are live. The runtime evaluator that adds up cross-provider spend and enforces the cap at request time is currently in development. Per-provider caps (MONTHLY_SPEND_CAP_PER_SUBJECT) work end-to-end now. Watch the changelog for cross-provider availability.
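The per-provider check that `MONTHLY_SPEND_CAP_PER_SUBJECT` describes can be pictured roughly as follows. This is a sketch under assumptions: the `SpendEvent` record, the function names, and the reading of "below the cap" as strictly less than are illustrative, and the real evaluator may differ.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal


@dataclass(frozen=True)
class SpendEvent:
    # Hypothetical shape for one observed, costed call.
    subject: str
    provider: str
    at: datetime
    cost: Decimal


def month_to_date_spend(events, subject, provider, now):
    """Sum the subject's spend on one provider in the current calendar month."""
    return sum(
        (
            e.cost
            for e in events
            if e.subject == subject
            and e.provider == provider
            and (e.at.year, e.at.month) == (now.year, now.month)
            and e.at <= now
        ),
        Decimal("0"),
    )


def under_cap(events, subject, provider, cap, now):
    """Assumption: 'below the cap' means strictly less than the cap."""
    return month_to_date_spend(events, subject, provider, now) < cap
```

Spend on other providers is deliberately ignored here, which is exactly the gap the cross-provider rule is meant to close once its evaluator ships.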

Rules can be scoped to:

The whole organisation.
A specific team (and its descendants).
A specific project.
A specific employee or contractor.
A specific AI service.
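The scopes listed above can be thought of as a match against the context of each call. The sketch below assumes a simple parent map for the team hierarchy; the `Scope` and `CallContext` shapes and the scope kind names are hypothetical, chosen only to illustrate how "a team and its descendants" might resolve.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scope:
    kind: str                     # "ORG", "TEAM", "PROJECT", "SUBJECT" or "SERVICE"
    target: Optional[str] = None  # unused for "ORG"


@dataclass(frozen=True)
class CallContext:
    team: str
    project: str
    subject: str   # employee or contractor
    service: str


def team_lineage(team, parent_of):
    """Yield the team and each of its ancestors, walking up the hierarchy."""
    seen = set()
    while team is not None and team not in seen:  # guard against cycles
        seen.add(team)
        yield team
        team = parent_of.get(team)


def scope_applies(scope, ctx, parent_of):
    """Return True if a rule with this scope covers the given call."""
    if scope.kind == "ORG":
        return True
    if scope.kind == "TEAM":
        # A team scope covers the team itself and all of its descendants.
        return scope.target in team_lineage(ctx.team, parent_of)
    if scope.kind == "PROJECT":
        return scope.target == ctx.project
    if scope.kind == "SUBJECT":
        return scope.target == ctx.subject
    if scope.kind == "SERVICE":
        return scope.target == ctx.service
    raise ValueError(f"unknown scope kind: {scope.kind}")
```

Under this model, a rule scoped to a parent team applies to a call made from any of its sub-teams, because the sub-team's lineage includes the parent.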

Turning it on

Settings → AI → Agent Policy → Enforcement → On.

The setting propagates to every agent on its next poll (within an hour). When it lands, the agent starts enforcing immediately; no restart is needed.

We recommend a phased rollout: enable enforcement on a single team first, watch for a week, then expand.

What an engineer sees when blocked

A blocked call returns a structured error from the agent — never a silent drop. The error includes:

The rule that fired (e.g. MODEL_DENYLIST).
The model that was blocked.
A human-readable reason ("model claude-3-opus is not approved for production code generation").
A link back to the policy page.
Instructions for requesting an exemption (configurable per organisation).
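As an illustration of the fields listed above, a structured error could be modelled and serialised like this. The field names, the JSON envelope, and the placeholder URL are assumptions made for the example; the agent's actual wire format is not documented here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BlockedCallError:
    # Hypothetical field names mirroring the list above.
    rule: str
    model: str
    reason: str
    policy_url: str
    exemption_instructions: str


def render_error(err):
    """Serialise the error as a JSON document."""
    return json.dumps({"error": asdict(err)}, indent=2)


example = BlockedCallError(
    rule="MODEL_DENYLIST",
    model="claude-3-opus",
    reason="model claude-3-opus is not approved for production code generation",
    policy_url="https://flowstate.example/policy",  # placeholder URL
    exemption_instructions="Use the 'Request exemption' link.",
)
```

Calling `render_error(example)` produces a single JSON object that a client can either display verbatim or parse to pick out the rule and the reason.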

In Claude desktop, ChatGPT, Cursor, Copilot Chat, and similar tools, this error appears in the chat surface as a regular assistant message. Engineers don't need to learn a new error format: the error reads like any other API failure they've seen.

Exemption workflow

Exemption requests route to the AI policy owner in Flowstate. The flow:

Engineer hits a blocked rule.
The agent returns the structured error with a one-line "Request exemption" link.
The link opens a Flowstate form pre-filled with the offending call's metadata (provider, model, time, team).
Engineer adds a justification.
Policy owner reviews; an exemption is either a one-off override (next 24 h) or a rule edit.
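The one-off override in the last step can be pictured as a time-boxed grant for one subject and one rule. The sketch below assumes the 24-hour window starts when the policy owner approves the request; the `Override` shape and `override_active` helper are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

OVERRIDE_WINDOW = timedelta(hours=24)


@dataclass(frozen=True)
class Override:
    # Hypothetical record of an approved one-off exemption.
    subject: str
    rule: str
    granted_at: datetime
    reviewer: str
    justification: str


def override_active(override, subject, rule, now):
    """Return True while the override covers this subject and rule."""
    return (
        override.subject == subject
        and override.rule == rule
        # Assumption: the window is half-open and starts at approval time.
        and override.granted_at <= now < override.granted_at + OVERRIDE_WINDOW
    )
```

A rule edit, by contrast, changes the policy for everyone in scope and has no expiry, so it would not need a check like this.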

When to use observe-only mode for enforcement

There's a useful middle ground for high-impact rules: enable the rule, but set the severity to WARN. The agent forwards the request as normal and emits an alert; nothing is blocked. Once a few weeks of WARN data tell you the rule is correct, flip to BLOCK.

Severity levels:

INFO — capture-only, surfaces in dashboards.
WARN — capture + alert, but forward the request.
BLOCK — capture + block. The engineer's tool gets the structured error.
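The three severity levels map onto three observable outcomes, which the sketch below spells out. The `Severity` enum and `Outcome` record are illustrative names, and whether a BLOCK also raises an alert is not stated above, so the example leaves it off.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    INFO = "INFO"
    WARN = "WARN"
    BLOCK = "BLOCK"


@dataclass(frozen=True)
class Outcome:
    capture: bool   # the call is recorded and surfaces in dashboards
    alert: bool     # an alert is emitted
    forward: bool   # the request is passed on to the provider


_OUTCOMES = {
    Severity.INFO: Outcome(capture=True, alert=False, forward=True),
    Severity.WARN: Outcome(capture=True, alert=True, forward=True),
    # Assumption: BLOCK is shown without an alert because the text above
    # only says "capture + block".
    Severity.BLOCK: Outcome(capture=True, alert=False, forward=False),
}


def outcome_for(severity):
    """Return what happens to a request that matches a rule at this severity."""
    return _OUTCOMES[severity]
```

Seen this way, moving a rule from WARN to BLOCK changes only one thing for the engineer: the request stops being forwarded.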

Limits

Enforcement runs on observed traffic. It does not prevent:

An engineer logging into a personal AI account on a personal device.
An engineer using a service whose hostname isn't on the agent's list of known providers (the agent inspects what it knows about; unknown providers go through).
A network path that bypasses the agent (a sufficiently determined developer can always defeat a single endpoint agent — defence in depth via DNS-level controls is complementary).

Enforcement is a control on managed traffic on managed machines. It complements, but doesn't replace, identity-and-access governance at the provider level (e.g. SSO-only access on the AI provider's admin console).

Compliance posture

Enforcement is audit-logged end-to-end:

Every blocked call is recorded with the rule that fired, the user, the project, and the model.
Every exemption is recorded with the reviewer, the justification, and the duration.
Every rule edit is recorded in the audit log with diff (before → after).

The complete audit trail is available via the API for forwarding into your SIEM.
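If you forward audit events into a SIEM, a rule edit with its before and after values might be flattened into one JSON line along these lines. The event shape, the field names, and both helper functions are assumptions for illustration; consult the API reference for the real audit schema.

```python
import json


def params_diff(before, after):
    """Return only the keys whose values changed, with before/after values."""
    keys = sorted(set(before) | set(after))
    return {
        key: {"before": before.get(key), "after": after.get(key)}
        for key in keys
        if before.get(key) != after.get(key)
    }


def rule_edit_record(rule_id, editor, before, after):
    """Serialise a hypothetical rule-edit audit event as one JSON line."""
    return json.dumps(
        {
            "event": "RULE_EDIT",
            "rule_id": rule_id,
            "editor": editor,
            "diff": params_diff(before, after),
        },
        sort_keys=True,
    )
```

Emitting only the changed keys keeps each event small and makes a severity flip, such as WARN to BLOCK, easy to search for downstream.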
