Audit log¶
Every request that reaches a /v1/* endpoint produces one row in
audit_events and (when audit.capture_payloads is true, the
default) one row in audit_payloads. Health, readiness, well-known,
and portal-auth flows sit outside the middleware stack and don't
generate audit rows.
The pipeline is async: the request handler enqueues into a buffered
channel; a background goroutine drains into Postgres. A stalled DB
can never inflate request latency. On a full buffer the event is
dropped and counted (logged every 1000th drop). For lossless audit,
size the buffer for your peak rate. See
Architecture › Audit pipeline
for the data-flow diagram, and Deployment › Logging
for how to correlate audit rows with the access log via request_id.
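The enqueue-or-drop behaviour described above can be sketched as follows. This is a minimal illustration, not the gateway's actual code: the `event` and `asyncLogger` types and their methods are hypothetical names, and the "log on the 1st, 1001st, ... drop" rule is inferred from the sample log lines in the Backpressure section.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// event stands in for the real audit event type (hypothetical).
type event struct{ RequestID string }

// asyncLogger is an illustrative stand-in, not the gateway's AsyncLogger.
type asyncLogger struct {
	ch      chan event
	dropped atomic.Int64
}

func newAsyncLogger(bufferSize int) *asyncLogger {
	return &asyncLogger{ch: make(chan event, bufferSize)}
}

// enqueue never blocks the request handler: when the buffer is full
// the event is dropped and counted instead.
func (l *asyncLogger) enqueue(e event) bool {
	select {
	case l.ch <- e:
		return true
	default:
		n := l.dropped.Add(1)
		if n%1000 == 1 { // log the 1st, 1001st, 2001st, ... drop
			fmt.Printf("WARN audit buffer full; dropping events dropped_total=%d\n", n)
		}
		return false
	}
}

// drain runs in a background goroutine and hands events to the store
// until the channel is closed.
func (l *asyncLogger) drain(store func(event), done chan<- struct{}) {
	for e := range l.ch {
		store(e)
	}
	close(done)
}

func main() {
	l := newAsyncLogger(2)
	for i := 0; i < 3; i++ { // the third event finds the buffer full
		l.enqueue(event{RequestID: fmt.Sprint("req-", i)})
	}
	done := make(chan struct{})
	stored := 0
	go l.drain(func(event) { stored++ }, done)
	close(l.ch)
	<-done
	fmt.Println("stored:", stored, "dropped:", l.dropped.Load())
}
```

The `select` with a `default` branch is what keeps the handler from waiting on the database: the send either succeeds immediately or is abandoned.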
Schema¶
Two tables. Each request writes one row to audit_events and, when
payload capture is on, one row to audit_payloads; the two join 1:1 on
audit_events.id = audit_payloads.event_id.
audit_events¶
The indexable summary. One row per request.
| Column | Type | Notes |
|---|---|---|
| id | TEXT (UUID) | Primary key. Auto-generated when the inserter omits it. |
| ts | TIMESTAMPTZ | Request start, UTC. Indexed (DESC). |
| duration_ms | BIGINT | Total handler time including audit middleware overhead. |
| request_id | TEXT | X-Request-Id (preserved or generated). |
| session_id | TEXT | Reserved for the portal Try-It flow. |
| user_subject | TEXT | Resolved Identity.Subject. Indexed. |
| user_email | TEXT | OIDC only. |
| auth_type | TEXT | anonymous / apikey / bearer / oauth2. |
| api_key_name | TEXT | KeyName for apikey/bearer; client_id for OIDC. |
| method | TEXT | HTTP method. |
| path | TEXT | Path the request landed on. Indexed. |
| route_name | TEXT | The matched route's name (group-scoped). |
| endpoint_group | TEXT | The owning group: identity, data, etc. |
| status | INTEGER | Response status. Indexed. |
| bytes_in | INTEGER | Inbound body size (raw). |
| bytes_out | INTEGER | Outbound body size (full, including any portion truncated from capture). |
| success | BOOLEAN | 200 <= status < 400. |
| error_message | TEXT | Reserved for handlers that surface errors. |
| error_category | TEXT | Operator-defined free text. |
| remote_addr | TEXT | r.RemoteAddr. |
| user_agent | TEXT | User-Agent header. |
Indexes: (ts DESC), (route_name, ts DESC), (path, ts DESC),
(user_subject, ts DESC), (session_id, ts DESC),
(status, ts DESC).
audit_payloads¶
The detail row. Same shape as Event.Payload in Go.
| Column | Type | Notes |
|---|---|---|
| event_id | TEXT | PK; FK to audit_events.id ON DELETE CASCADE. |
| request_headers | JSONB | Map[name]=values, with redaction applied. |
| request_query | JSONB | Map[name]=values, with redaction applied. |
| request_content_type | TEXT | Inbound Content-Type. |
| request_body | BYTEA | Up to audit.max_payload_bytes; longer bodies write the prefix and set request_truncated. |
| request_size_bytes | INTEGER | Captured prefix size (not the full inbound size). |
| request_truncated | BOOLEAN | True when the inbound body exceeded the cap. |
| request_remote_addr | TEXT | Mirror of summary; convenient for joins. |
| response_headers | JSONB | Same redaction. |
| response_content_type | TEXT | Outbound Content-Type. |
| response_body | BYTEA | Same cap rules as request side. |
| response_size_bytes | INTEGER | Captured prefix size. |
| response_truncated | BOOLEAN | True when the outbound body exceeded the cap. |
| replayed_from | TEXT | When this event was a replay of another, points back. |
| captured_at | TIMESTAMPTZ | Insert time of the payload row. |
Indexes: (replayed_from) partial, GIN on request_headers and
response_headers (jsonb_path_ops, used for portal filter queries).
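The cap rules for request_body, request_size_bytes, and request_truncated (and their response-side twins) can be illustrated with a short sketch. The `captured` type and `capturePrefix` function are hypothetical names chosen for this example; only the three-column behaviour comes from the table above.

```go
package main

import "fmt"

// captured mirrors the three body-related columns described in the
// table. The type and function names are hypothetical.
type captured struct {
	Body      []byte // stored prefix (request_body / response_body)
	SizeBytes int    // size of the stored prefix, not of the full body
	Truncated bool   // true when the full body exceeded the cap
}

// capturePrefix keeps at most maxPayloadBytes of body and records
// whether anything was cut off.
func capturePrefix(body []byte, maxPayloadBytes int) captured {
	if maxPayloadBytes < 0 {
		maxPayloadBytes = 0
	}
	if len(body) <= maxPayloadBytes {
		return captured{Body: body, SizeBytes: len(body)}
	}
	prefix := body[:maxPayloadBytes]
	return captured{Body: prefix, SizeBytes: len(prefix), Truncated: true}
}

func main() {
	c := capturePrefix([]byte("hello, audit log"), 5)
	fmt.Printf("%q size=%d truncated=%v\n", c.Body, c.SizeBytes, c.Truncated)
}
```

Because the size columns record the captured prefix, the full body size of a truncated payload has to come from bytes_in or bytes_out on the summary row.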
Redaction¶
The middleware runs every header name and query-param key through
audit.SanitizeHeaders / audit.SanitizeQuery before insert.
audit.redact_keys is a list of case-insensitive substrings; matches
have their values replaced by [redacted].
Default list:
password, token, secret, authorization, api_key, api-key,
credentials, bearer, cookie, jwt, session_id, private_key, passwd
Both api_key and api-key are present so that X-API-Key headers (dash)
and ?api_key=... query params (underscore, the usual param naming
convention) are both caught. Authorization is matched by its own entry.
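As a rough illustration of case-insensitive substring matching, the sketch below redacts a header or query map against the default list. It is not the code behind audit.SanitizeHeaders or audit.SanitizeQuery: the `redact` and `matches` helpers are hypothetical, and replacing each value individually (rather than the whole value list) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultRedactKeys copies the default list from the text above.
var defaultRedactKeys = []string{
	"password", "token", "secret", "authorization", "api_key", "api-key",
	"credentials", "bearer", "cookie", "jwt", "session_id", "private_key", "passwd",
}

// matches reports whether name contains any key, ignoring case.
func matches(name string, keys []string) bool {
	lower := strings.ToLower(name)
	for _, k := range keys {
		if strings.Contains(lower, strings.ToLower(k)) {
			return true
		}
	}
	return false
}

// redact returns a copy of values (headers or query params) in which
// every value under a matching name is replaced by "[redacted]".
// Assumption: each value is replaced one for one.
func redact(values map[string][]string, keys []string) map[string][]string {
	out := make(map[string][]string, len(values))
	for name, vals := range values {
		if matches(name, keys) {
			masked := make([]string, len(vals))
			for i := range masked {
				masked[i] = "[redacted]"
			}
			out[name] = masked
			continue
		}
		out[name] = append([]string(nil), vals...)
	}
	return out
}

func main() {
	in := map[string][]string{
		"X-API-Key":  {"k-123"},
		"api_key":    {"k-456"},
		"User-Agent": {"curl/8.0"},
	}
	fmt.Println(redact(in, defaultRedactKeys))
}
```

Substring matching is deliberately broad: a header named X-Upstream-Token is caught by the token entry without being listed explicitly.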
Bodies are not redacted. If a request body carries credentials in a
JSON field, they are stored verbatim (up to the payload cap). In
privacy-sensitive deployments, consider lowering
audit.max_payload_bytes or disabling audit.capture_payloads entirely.
Retention¶
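Default is 7 days. This is lower than mcp-test's 30 because api-test responses can be huge: export endpoints emit 100 MiB bodies. A periodic cleanup job lands in a future migration; for now, prune manually by deleting audit_events rows older than the retention window. The ON DELETE CASCADE foreign key removes the matching audit_payloads rows.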
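A manual prune might look like the sketch below, which deletes summary rows older than a cutoff and relies on the cascade for payload rows. The `execer` interface, `prune`, `cutoff`, and `demoNow` are hypothetical names for this example; `*sql.DB` from database/sql satisfies `execer`, and a Postgres driver of your choice is assumed.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"time"
)

// execer is the subset of *sql.DB this sketch needs.
type execer interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}

// pruneSQL deletes summary rows older than the cutoff passed as $1.
// Payload rows follow via ON DELETE CASCADE.
const pruneSQL = `DELETE FROM audit_events WHERE ts < $1`

// demoNow is a fixed clock value used only by main below.
var demoNow = time.Date(2025, 1, 8, 0, 0, 0, 0, time.UTC)

// cutoff returns the oldest timestamp that should be kept.
func cutoff(now time.Time, retention time.Duration) time.Time {
	return now.UTC().Add(-retention)
}

// prune runs the delete and reports how many summary rows were removed.
func prune(ctx context.Context, db execer, now time.Time, retention time.Duration) (int64, error) {
	res, err := db.ExecContext(ctx, pruneSQL, cutoff(now, retention))
	if err != nil {
		return 0, fmt.Errorf("prune audit_events: %w", err)
	}
	return res.RowsAffected()
}

func main() {
	// Show the statement and the cutoff for the 7-day default.
	fmt.Println(pruneSQL)
	fmt.Println(cutoff(demoNow, 7*24*time.Hour).Format(time.RFC3339))
}
```

On a large table a single delete can hold locks for a while; running it in smaller time slices is one way to soften that, at the cost of more round trips.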
Backpressure¶
AsyncLogger wraps the inner store with a buffered channel (default
4096 events) and a per-call timeout (default 5s). On a full buffer the
event is dropped and the cumulative drop count is logged every 1000th
drop:
WARN audit buffer full; dropping events dropped_total=1
WARN audit buffer full; dropping events dropped_total=1001
If you see drops in steady-state load, raise bufferSize in
internal/server.Build or scale the database.
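When picking a new buffer size, a back-of-envelope estimate is the peak event rate multiplied by the longest database stall you want to ride out. The sketch below encodes that arithmetic; it assumes a complete stall (nothing drains) and a steady rate, and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// stallAbsorbed reports how long a complete database stall a buffer of
// bufferSize events can absorb at a steady rate of eventsPerSecond.
func stallAbsorbed(bufferSize int, eventsPerSecond float64) time.Duration {
	if eventsPerSecond <= 0 {
		return time.Duration(math.MaxInt64) // no traffic: the buffer never fills
	}
	return time.Duration(float64(bufferSize) / eventsPerSecond * float64(time.Second))
}

// bufferFor returns the smallest buffer that absorbs a stall of the
// given length at eventsPerSecond without dropping.
func bufferFor(eventsPerSecond float64, stall time.Duration) int {
	return int(math.Ceil(eventsPerSecond * stall.Seconds()))
}

func main() {
	// How long the default 4096-event buffer lasts at 500 requests/s.
	fmt.Println(stallAbsorbed(4096, 500))
	// Buffer needed to cover a 5s stall at 2000 requests/s.
	fmt.Println(bufferFor(2000, 5*time.Second))
}
```

Under these assumptions the default buffer covers only a few seconds of stall at a few hundred requests per second, so sustained drops usually point at database throughput rather than buffer size alone.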
Querying¶
The Go side exposes audit.QueryFilter with these predicates: time
range, method, path, route_name, user_subject, session_id, event_id,
status, success, search (ILIKE on path + error_message), limit,
offset. The Postgres store builds a parameterized SQL SELECT ... FROM
audit_events WHERE … ORDER BY ts DESC, id ASC LIMIT … OFFSET … from
the filter.
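To make the filter-to-SQL step concrete, here is a minimal sketch of a parameterized query builder. The `filter` struct is a trimmed-down, hypothetical stand-in for audit.QueryFilter (only three predicates), and the selected column list is an assumption; the ORDER BY clause and the ILIKE targets follow the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// filter is a hypothetical, trimmed-down stand-in for audit.QueryFilter.
type filter struct {
	Method string // "" means any
	Status int    // 0 means any
	Search string // ILIKE on path and error_message
	Limit  int
	Offset int
}

// buildQuery returns a parameterized SELECT and its arguments. Values
// never appear in the SQL text, only $n placeholders.
func buildQuery(f filter) (string, []any) {
	var where []string
	var args []any
	// add registers val and replaces each "?" in cond with its placeholder.
	add := func(cond string, val any) {
		args = append(args, val)
		where = append(where, strings.ReplaceAll(cond, "?", fmt.Sprintf("$%d", len(args))))
	}
	if f.Method != "" {
		add("method = ?", f.Method)
	}
	if f.Status != 0 {
		add("status = ?", f.Status)
	}
	if f.Search != "" {
		add("(path ILIKE ? OR error_message ILIKE ?)", "%"+f.Search+"%")
	}
	q := "SELECT id, ts, method, path, status FROM audit_events"
	if len(where) > 0 {
		q += " WHERE " + strings.Join(where, " AND ")
	}
	args = append(args, f.Limit, f.Offset)
	q += fmt.Sprintf(" ORDER BY ts DESC, id ASC LIMIT $%d OFFSET $%d", len(args)-1, len(args))
	return q, args
}

func main() {
	q, args := buildQuery(filter{Method: "GET", Status: 500, Limit: 50})
	fmt.Println(q)
	fmt.Println(args...)
}
```

The secondary sort on id keeps pagination stable when several rows share the same ts. This sketch does not escape % or _ inside the search term, so those characters act as wildcards.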
The portal API wraps these filters in HTTP query params and returns paginated JSON. Direct SQL is also fine; the schema is documented above and Postgres is the source of truth.
What it proves about the gateway¶
The audit log is the assertion surface for "did the gateway forward what we expected". Common assertions:
- The gateway forwarded the credential under the right transport (auth_type matches what the connection promised).
- The gateway did not forward a header you blocked (request_headers doesn't carry it).
- The gateway forwarded a custom tracing header (request_headers["X-Trace-Id"] is present).
- The gateway returned the upstream status verbatim (compare status here vs the gateway-side record).
- The gateway retried (multiple audit rows for one client call, visible by request_id correlation when the gateway preserves it).
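The header assertions above can be automated against the request_headers column. The sketch below decodes the JSONB value (a map of header name to values, per the schema) and checks presence and absence; `headerValues` and `checkForwarding` are hypothetical helpers, and fetching the row from Postgres is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// headerValues decodes a request_headers JSONB value (name -> values)
// and returns the values stored under name, compared case-insensitively.
func headerValues(requestHeaders []byte, name string) ([]string, error) {
	var headers map[string][]string
	if err := json.Unmarshal(requestHeaders, &headers); err != nil {
		return nil, fmt.Errorf("decode request_headers: %w", err)
	}
	var out []string
	for k, v := range headers {
		if strings.EqualFold(k, name) {
			out = append(out, v...)
		}
	}
	return out, nil
}

// checkForwarding returns the first violated expectation: a header in
// mustHave that is missing, or a header in mustNotHave that is present.
func checkForwarding(requestHeaders []byte, mustHave, mustNotHave []string) error {
	for _, h := range mustHave {
		vals, err := headerValues(requestHeaders, h)
		if err != nil {
			return err
		}
		if len(vals) == 0 {
			return fmt.Errorf("expected header %q was not forwarded", h)
		}
	}
	for _, h := range mustNotHave {
		vals, err := headerValues(requestHeaders, h)
		if err != nil {
			return err
		}
		if len(vals) > 0 {
			return fmt.Errorf("blocked header %q was forwarded", h)
		}
	}
	return nil
}

func main() {
	row := []byte(`{"X-Trace-Id":["abc123"],"Accept":["*/*"]}`)
	fmt.Println(checkForwarding(row, []string{"X-Trace-Id"}, []string{"X-Internal-Debug"}))
}
```

Presence checks still work for redacted headers, since redaction replaces values and keeps names; asserting on the value of a header that matches audit.redact_keys will not.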