Troubleshooting¶
A list of failure modes that come up often enough to be worth writing down. Every entry: symptom, what it actually means, fix.
401 Unauthorized everywhere¶
Symptom: every /v1/* call returns 401, even with a key that
worked yesterday.
Likely causes:
- auth.allow_anonymous: false and the key isn't in any store. Confirm with make dev-anon (anonymous mode): if that works, the issue is the credential, not the wiring.
- Bad credential, not missing. A bad key does not fall back to anonymous even when allow_anonymous: true. Send no auth header at all to take the anonymous path. See Authentication › Anonymous mode.
- File-store key value is ${APITEST_DEV_KEY} and the env var is empty. The ${VAR:-default} interpolation lets you set a fallback; without a :-, the literal ${VAR} survives only if VAR is set.
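The empty-variable case is quick to rule out from the shell before starting the binary. A minimal sketch, assuming api-test's ${VAR:-default} interpolation follows the same rule as the shell's own parameter expansion (fallback applies when the variable is unset or empty). The helper names key_or_default and key_is_set are hypothetical and not part of api-test.

```shell
# key_or_default FALLBACK: print APITEST_DEV_KEY, or FALLBACK when the
# variable is unset or empty (the shell's ${VAR:-default} rule).
key_or_default() {
  printf '%s\n' "${APITEST_DEV_KEY:-$1}"
}

# key_is_set: succeed only when APITEST_DEV_KEY is non-empty.
key_is_set() {
  [ -n "${APITEST_DEV_KEY:-}" ]
}

# Usage:
#   key_is_set || echo "APITEST_DEV_KEY is unset or empty" >&2
```

If key_is_set fails in the environment that launches api-test, the file-store key is most likely resolving to an empty value.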
Diagnose:
curl -i http://localhost:8080/v1/whoami # see WWW-Authenticate header
curl -i -H "X-API-Key: $APITEST_DEV_KEY" http://localhost:8080/v1/whoami
The WWW-Authenticate response header tells you whether api-test saw
"no credential" (Bearer realm="api-test") or "bad credential"
(Bearer realm="api-test", error="invalid_token").
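The two header values above can be told apart mechanically. A small sketch, assuming the header looks exactly as quoted in this section; classify_www_authenticate is a hypothetical helper name.

```shell
# classify_www_authenticate VALUE: map a WWW-Authenticate header value
# to the two cases described in this section.
classify_www_authenticate() {
  case "$1" in
    *'error="invalid_token"'*)   echo "bad credential" ;;
    *'Bearer realm="api-test"'*) echo "no credential" ;;
    *)                           echo "unknown" ;;
  esac
}

# Usage:
#   hdr=$(curl -si http://localhost:8080/v1/whoami | grep -i '^www-authenticate:')
#   classify_www_authenticate "$hdr"
```

The invalid_token pattern is checked first because that header value also contains the realm.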
401 on the portal API only¶
Symptom: the SPA loads, but every /api/v1/portal/* request 401s.
Likely causes:
- No portal session cookie and no API key on the request. The portal requires one or the other. Sign in via OIDC, or paste an API key on the sign-in screen.
- portal.cookie_secure: true over plain HTTP. The browser refuses to send a Secure cookie back to a non-TLS endpoint. Either run behind TLS or flip the flag off for local dev.
- portal.cookie_secret is empty. The session store fails to start cleanly with no secret; check the boot log for "session store:".
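To confirm the cookie_secure case, inspect the Set-Cookie header the portal returns after sign-in and check for the Secure attribute. A rough sketch; cookie_has_secure is a hypothetical helper, and the cookie name in the usage comment is a placeholder, since the portal's actual cookie name is not stated here.

```shell
# cookie_has_secure LINE: succeed if a Set-Cookie header line carries the
# Secure attribute. Attribute names are case-insensitive; a trailing
# carriage return from curl output is stripped first.
cookie_has_secure() {
  printf '%s\n' "$1" | tr -d '\r' |
    grep -Eiq '(^|;)[[:space:]]*secure[[:space:]]*(;|$)'
}

# Usage (placeholder cookie):
#   cookie_has_secure 'Set-Cookie: sid=abc; Path=/; HttpOnly; Secure' \
#     && echo "browser will only return this cookie over TLS"
```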
/portal/ returns {"status":"banner"...} instead of the SPA¶
Symptom: visiting /portal/ in a browser shows raw JSON, not the
React UI.
Cause: internal/ui/dist/ only contains .gitkeep — the
//go:embed is empty, so the mux falls back to a stub JSON banner.
Fix:
make ui # builds ui/dist/ → internal/ui/dist/
make build # rebuild the binary so the embed picks up the bundle
make build (and make verify) refuse to build when the embed is
empty. Bare go build ./... does not — that's the path that produces
this surprise.
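If you build with bare go build, a pre-check along these lines catches the empty embed before it ships. This is a sketch, not the check that make build performs; embed_is_empty is a hypothetical helper.

```shell
# embed_is_empty DIR: succeed when DIR holds no files besides .gitkeep,
# i.e. the UI bundle has not been built into it. A missing directory
# also counts as empty.
embed_is_empty() {
  [ -z "$(find "$1" -type f ! -name .gitkeep 2>/dev/null | head -n 1)" ]
}

# Usage:
#   embed_is_empty internal/ui/dist && echo "run 'make ui' first" >&2
```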
Audit log is empty in the portal¶
Symptom: requests succeed, but /portal/audit is empty or
out-of-date.
Likely causes:
- audit.enabled: false in config. The shipped *.dev.yaml profile has it off; only *.live.yaml enables it.
- database.url is empty. With audit.enabled: true and no database, the binary fails to start; if it started, you're on the dev config.
- Health, readiness, well-known, and the portal's own auth flow are intentionally skipped and don't generate audit rows. Only /v1/* requests do.
- The async buffer dropped the events. Check the binary's stderr for "audit buffer full; dropping events". Default depth is 4096; raise it if you hit sustained drop warnings.
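To see whether drops are occasional or sustained, count the warning in a captured stderr log. A minimal sketch, assuming stderr was redirected to a file; count_audit_drops and the log path in the usage comment are hypothetical.

```shell
# count_audit_drops FILE: print how many lines in FILE contain the
# buffer-full warning. Prints 0 when there are none.
count_audit_drops() {
  grep -c 'audit buffer full; dropping events' "$1" || true
}

# Usage (hypothetical log path):
#   count_audit_drops /var/log/api-test/stderr.log
```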
OIDC login redirects loop¶
Symptom: the IdP redirects back to api-test, which redirects back to the IdP, repeatedly.
Likely causes:
- oidc.issuer mismatches the IdP's actual issuer claim. Visit ${issuer}/.well-known/openid-configuration and confirm the issuer field in the response matches the config exactly (including trailing slash).
- oidc.audience doesn't match the IdP's token aud claim. Decode a token at jwt.io and compare.
- Clock skew. oidc.clock_skew_seconds defaults to 30; if the binary and the IdP disagree by more, validation fails with exp or nbf errors. Check the binary log for "oidc:" warnings.
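The issuer comparison can be scripted so a trailing-slash difference doesn't slip past a visual check. A sketch under stated assumptions: extract_issuer and issuer_matches are hypothetical helpers, and the sed extraction is deliberately naive (it won't handle escaped slashes such as \/ in the JSON), so prefer a real JSON parser if one is available.

```shell
# extract_issuer: read a discovery document on stdin and print the value
# of its "issuer" field. Naive text match, not a JSON parser.
extract_issuer() {
  sed -n 's/.*"issuer"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# issuer_matches CONFIGURED DISCOVERED: exact string comparison, so a
# trailing slash on only one side counts as a mismatch.
issuer_matches() {
  [ "$1" = "$2" ]
}

# Usage, with ISSUER set to the value of oidc.issuer:
#   found=$(curl -s "$ISSUER/.well-known/openid-configuration" | extract_issuer)
#   issuer_matches "$ISSUER" "$found" || echo "issuer mismatch: '$found'" >&2
```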
make integration hangs or times out¶
Symptom: integration suite stalls at "starting postgres container."
Likely causes:
- Docker isn't running. The make integration target gates on docker info; if you see a hang, you started Docker after the target gate ran.
- Resource limits on the Docker VM. testcontainers pulls postgres:16-alpine (~250 MiB) and needs ~512 MiB free.
- Ryuk (the testcontainers reaper) is being blocked by a corporate proxy. Set TESTCONTAINERS_RYUK_DISABLED=true if you trust your own cleanup, or whitelist quay.io/testcontainers/ryuk.
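A wrapper that re-checks the daemon right before launching the suite turns a stall into an immediate error. This is a sketch of the idea, not the gate the Makefile implements; docker_ready and run_integration are hypothetical names.

```shell
# docker_ready: succeed only if the Docker daemon answers `docker info`.
docker_ready() {
  docker info >/dev/null 2>&1
}

# run_integration: refuse to start the suite when Docker is down, so the
# failure is immediate rather than a stall at container start.
run_integration() {
  if ! docker_ready; then
    echo "Docker is not running; start it and retry" >&2
    return 1
  fi
  make integration
}
```

For the Ryuk case, the variable can be set for a single run: TESTCONTAINERS_RYUK_DISABLED=true make integration.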
make verify passes locally but CI fails¶
Symptom: green make verify, red CI.
First check: pinned tool versions in the Makefile
(GOLANGCI_LINT_VERSION, GOSEC_VERSION, SEMGREP_VERSION) must
match the versions in .github/workflows/ci.yml. CI installs from
those refs, while the Makefile installs to bin/tools/. If the
versions drift, the two runs can produce different outcomes.
Second check: semgrep is the most likely culprit. The Makefile
warns on version drift but doesn't fail — if CI uses a newer rule
set, it can flag code the local pinned version accepts. Run
pipx install --force semgrep==<CI version> to align.
Third check: integration tests sometimes flake on docker compose
up race conditions in CI. Re-run; if it persists, it's a real bug.
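The first check can be automated by reading the three pins from both files and reporting any that differ. A sketch only: it assumes each pin is a single-line assignment (NAME := value, NAME ?= value, NAME = value, or YAML-style NAME: value), which may not match how the real Makefile and workflow are laid out. pinned_version and check_pins are hypothetical helpers.

```shell
# pinned_version FILE NAME: print the value assigned to NAME in FILE.
# Accepts `:=`, `?=`, `=`, or `:` as the separator and strips quotes.
pinned_version() {
  sed -nE "s/^[[:space:]]*$2[[:space:]]*(:=|\?=|=|:)[[:space:]]*[\"']?([^[:space:]\"'#=]+).*/\2/p" "$1" | head -n 1
}

# check_pins MAKEFILE WORKFLOW: print one line per drifting pin and
# return non-zero if any pin differs.
check_pins() {
  makefile=$1
  workflow=$2
  status=0
  for name in GOLANGCI_LINT_VERSION GOSEC_VERSION SEMGREP_VERSION; do
    local_v=$(pinned_version "$makefile" "$name")
    ci_v=$(pinned_version "$workflow" "$name")
    if [ "$local_v" != "$ci_v" ]; then
      echo "$name drift: Makefile=$local_v CI=$ci_v"
      status=1
    fi
  done
  return $status
}

# Usage:
#   check_pins Makefile .github/workflows/ci.yml
```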
Plexara can't reach api-test¶
Symptom: Plexara connection registration succeeds, but invoking the connection returns "upstream unreachable."
Likely causes:
- server.base_url doesn't match the actual reachable URL. Plexara uses this for redirect and OpenAPI server URLs; if it points at localhost while Plexara is in a different network namespace, every redirect breaks.
- TLS: api-test is plain HTTP behind a TLS-terminating LB and the Plexara connection is configured with https://.... Check that the LB is actually forwarding to api-test.
- Health probe disabled. Plexara may pre-flight /healthz; if you blocked that path in front of api-test, the connection looks dead even when /v1/* would work.
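A loopback server.base_url is the easiest of these to detect up front. A small sketch that flags URLs pointing at localhost or a loopback address; url_host and is_loopback_url are hypothetical helpers, and the check only covers the obvious loopback spellings.

```shell
# url_host URL: print the host part of a URL (scheme, path, and port removed).
url_host() {
  printf '%s\n' "$1" |
    sed -E 's|^[a-zA-Z][a-zA-Z0-9+.-]*://||; s|[/?#].*$||; s|:[0-9]+$||'
}

# is_loopback_url URL: succeed when the host is localhost, 127.x.x.x, or [::1].
is_loopback_url() {
  case "$(url_host "$1")" in
    localhost|127.*|'[::1]') return 0 ;;
    *) return 1 ;;
  esac
}

# Usage, with BASE_URL set to the value of server.base_url:
#   is_loopback_url "$BASE_URL" && echo "base_url is loopback-only" >&2
```

A non-loopback URL can still be unreachable from Plexara's network, so run the /healthz and /v1/whoami probes from where Plexara runs as the real test.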
When in doubt¶
- The Architecture diagrams document the exact request flow.
- The audit log (when enabled) is the source of truth for what api-test actually saw — query it before assuming the gateway is at fault.
- File an issue with the binary's startup log (config, "listening", any WARN/ERROR lines) at https://github.com/plexara/api-test/issues.