Security
PrexorCloud has three distinct authentication paths. Each one solves a specific problem and uses different credentials. Mixing them up is the most common cause of “why is this 401?” — and getting them right is how the cluster stays trustworthy. This page is the reference for all three, plus the RBAC model, the cosign signing posture, and the threat model.
What you’ll learn
- The three auth paths: operators (JWT), daemons (mTLS), and in-server code (plugin tokens).
- How role-based access control works, what permissions exist, and how to add a custom role.
- The cosign signing model — keyless for our own releases, key-based for third-party modules — and how Rekor SET enforcement works.
- The threat model: what each control defends against and where it ends.
The three auth paths
| Caller | Credential | Lifetime | Purpose |
|---|---|---|---|
| Operator (dashboard, CLI, scripts) | JWT | configurable, default 60 min | Normal authenticated REST calls |
| Daemon | mTLS client certificate | rotating, issued by controller CA | gRPC stream to controller |
| MC plugin (server / proxy) | Plugin token (ptk_...) | per-instance, refreshable | Plugin → controller REST |
There is no OIDC, no SAML, no MFA, no passkeys in v1. The reasons are in the project’s decisions record.
Operator auth: JWT
Operators authenticate with username plus password and receive a signed JWT.
Login
POST /api/v1/auth/loginContent-Type: application/json
{ "username": "alice", "password": "..." }Returns:
{ "token": "eyJhb...", "user": { "username": "alice", "role": "OPERATOR", "..." }}Passwords are bcrypt-hashed on creation. AuthManager.login verifies
the hash, checks lockout state, and issues a JWT signed with the
configured HS256 secret. On verification failure the account-lockout
counter increments under prexor:v1:login:fail:<username>; after
security.lockout.maxAttempts failures inside
security.lockout.windowSeconds, the account is locked for
security.lockout.lockoutSeconds.
JWT shape
The JWT carries:
sub— usernamerole— the role at issue timeiat/exp— standardjti— JWT ID, used for revocation
Permissions are not in the JWT. Middleware re-resolves the role’s permission set on every request from the role store, so role changes take effect immediately without a re-login.
Refresh
POST /api/v1/auth/refresh with the current JWT issues a new JWT with
a fresh expiry. The middleware looks up the user’s current role at
refresh time. If the user has been deleted, refresh is denied.
Revocation
prexor:v1:jwt:revoked:<jti> is set on logout, password change, and
explicit revoke (POST /api/v1/auth/revoke). The TTL matches the
remaining JWT lifetime so revocations clean up automatically.
JwtAuthMiddleware checks this on every request.
In production profile, revocation is shared across controllers via
Valkey. In development it is in-memory and lost on restart.
Key rotation
JwtManager.rotate(newSecret) swaps the signing key. Verification uses
a dual-window: the new key is preferred, the previous keys are accepted
up to their configured tail. See
runbooks/rotate-secrets.md.
Password reset (optional)
Off by default. When enabled (security.passwordReset.enabled=true),
the flow is:
POST /auth/password-reset/request { email }— always returns 202 regardless of whether the email is known (no enumeration leak).- If the email matches a user,
PasswordResetManagermints a 256-bit URL-safe token, stores it underprexor:v1:pwreset:<token>with a 30-minute TTL, emails the link. - User clicks the link, posts the new password to
/auth/password-reset/complete { token, newPassword }. - The token is consumed (single-use) and the bcrypt hash swapped.
Mailers: LogMailer (default, writes to controller log) or
SmtpMailer with STARTTLS / implicit TLS / optional AUTH.
RBAC
Three built-in roles, defined in defaults/roles.yml (operator-editable):
- ADMIN —
["*"]. Wildcard, every permission. - OPERATOR — most operational permissions (groups, instances, modules, deployments, networks, events).
- VIEWER — read-only across the system.
There are roughly 30 permissions on Permission. They are not
hierarchical — groups.update does not imply groups.view. Role
definitions enumerate every permission they grant.
Custom roles: edit roles.yml (or call the role REST surface) to add a
role with a curated permission list. Wildcards inside lists are not
supported — ["*"] is the only special string and is reserved for
ADMIN-equivalent.
# Example: a role for module operators onlyroles: - name: MODULE_OPERATOR permissions: - modules.view - modules.manage - system.healthPermissions are checked in route handlers via
ctx.requirePermission(...):
reg.post("/admin/recompute", ctx -> { ctx.requirePermission(Permissions.MODULES_MANAGE); leaderboard.recompute();});Daemon auth: mTLS
Daemons authenticate to the controller’s gRPC server with client certificates issued by the controller’s internal CA. There is no shared secret.
First contact: the join token
A new daemon does not have a certificate yet. To bootstrap:
# On the controllerprexorctl token create --description "node-1" --ttl 1h# -> Token: prxn_...
# On the daemon hostsudo prexorctl setup --role daemon \ --controller-grpc <controller-host>:9090 \ --join-token prxn_...The daemon setup flow:
- Calls the unauthenticated
BootstrapService.RegisterRPC presenting the join token. - The controller verifies the token, generates a private key plus cert
via
CertificateAuthority.issue(...), returns the cert chain plus private key in the RPC response. - The daemon writes the cert and key to
data/certs/, deletes the join token from its config, switches to mTLS for all future gRPC calls.
Join tokens are single-use. Replay rejection happens server-side, so you cannot accidentally register the same daemon twice.
Cert lifecycle
mTLS material live-reloads when the on-disk cert and key change
(ReloadableServerSslContext + TlsMaterialWatcher). Operators rotate
without controller downtime.
Per-node revocation: POST /api/v1/nodes/{id}/revoke-cert (gated on
nodes.manage) adds the node’s cert serial to the revocation store.
The mTLS interceptor rejects connections with revoked serials
immediately. Recovery is to revoke and re-bootstrap with a new join
token.
Plugin auth: plugin tokens
The third path is used by code running inside MC server / proxy JVMs.
When the controller dispatches a Start to a daemon, it generates a
per-instance plugin token (ptk_<UUID>) and includes it in the
composition plan. The daemon
writes this into the instance’s environment as CLOUD_PLUGIN_TOKEN.
The cloud plugin reads the env var on startup and presents it as a
Bearer token on /api/proxy/* and /api/plugin/* REST routes.
Why a separate credential:
- Plugin tokens are per-instance — a compromised server exposes only one instance’s REST surface.
- Plugin tokens have a short TTL (15 minutes by default). The plugin refreshes proactively before expiry.
- Plugin tokens are revoked when the instance stops. There is no lingering “old plugin token still works” risk.
- Plugin tokens carry a sequence window for replay protection
(
prexor:v1:workloadseq:).
Route segregation
| Route prefix | Auth | Caller |
|---|---|---|
/api/v1/* | JWT (with specific exemptions) | Operator, dashboard, CLI |
/api/proxy/* | Plugin token | Proxy plugin (Velocity / BungeeCord) |
/api/plugin/* | Plugin token | Server plugin (Paper / Spigot / Folia) |
Plugin tokens never grant access to /api/v1/* and JWTs never grant
access to /api/proxy/* or /api/plugin/*. The middleware enforces
this — there is no “magic admin JWT” that skips plugin auth.
Public routes
A few routes skip auth entirely:
/api/v1/auth/login/api/v1/auth/password-reset/requestand/complete(when password reset is enabled; 404 otherwise)/api/v1/system/health(liveness)/api/v1/system/ready(readiness)/api/v1/system/version/api/v1/events/stream— but this stream requires an SSE ticket query parameter; the ticket exchange itself is JWT-authenticated./metrics— Prometheus exposition; deliberately not authenticated by default. Gate with a reverse-proxy ACL if needed.
Console SSE (/services/{id}/console), controller-log SSE
(/system/logs/stream), and daemon-log SSE
(/nodes/{id}/logs/stream) all use the same SSE-ticket pattern.
Cosign signing
We sign two distinct artefact families with two distinct cosign flows.
Keyless: our own releases
Each release tag (v*) signs prexorctl binaries plus container images
with cosign keyless using the GitHub Actions OIDC identity. We do
not maintain a long-lived private key. Verification proves “this
artefact was signed by that GitHub Actions workflow on this repo.”
cosign verify-blob \ --certificate-identity-regexp "^https://github.com/prexorjustin/prexorcloud/.github/workflows/release.yml@refs/tags/" \ --certificate-oidc-issuer "https://token.actions.githubusercontent.com" \ --signature checksums.txt.sig \ --certificate checksums.txt.pem \ checksums.txt
sha256sum -c checksums.txtThe release workflow runs cosign verify against its own freshly-signed
images as the last step, so a broken signature fails CI before
operators ever see it.
Key-based: third-party modules
Module signing is a different problem — third-party authors with their own keys, not us. The verifier accepts two formats:
<jar>.cosign.bundle— the new sign-blob bundle format. Supports raw-keyed signatures, embedded-cert signatures with PKIX validation, and embedded-cert pinning against raw pubkey trust roots.<jar>.sig— legacy keyed PEM sidecar. Deprecated; supported for back-compat.
Configure under modules.signing:
modules: signing: required: true # production default mode: COSIGN_BUNDLE # COSIGN_BUNDLE | KEYED trustRoot: "config/cosign-roots.pem" rekor: policy: REQUIRE_SET # DISABLED | REQUIRE_SET publicKey: "config/rekor.pub"REQUIRE_SET enforces offline Rekor SET verification: the
controller loads Rekor’s public key locally, parses the bundle’s
SignedEntryTimestamp, reconstructs the canonical JSON of the Rekor
payload, and rejects bundles whose SET does not verify. No network
access is required.
Inclusion-proof Merkle-path verification is not implemented. SET is enough — see the decisions record.
A signature failure on install returns HTTP 422
SIGNATURE_VERIFICATION_FAILED. The integration test
CosignSignedModuleInstallTest exercises the full path.
Daemon-side enforcement
Daemons that re-receive modules over the gRPC stream can verify
signatures locally. The cloud-security/signing package is shared
between controller and daemon; the daemon writes the inbound jar plus
sidecar to a temp directory as siblings (the on-disk shape the
verifier expects) and runs verify() before commit. Configure under
daemon.modules.signing.
Threat model
The detail is in the project’s security threat model doc. The short version of what auth defends:
| Threat | Mitigation |
|---|---|
| Brute-force operator login | Lockout (configurable threshold, window, duration), shared via Valkey across HA controllers |
| Stolen JWT | Short lifetime, revocable. Caller must know the JWT is stolen for revocation to fire. |
| Stolen daemon key | Per-node revocation; mTLS interceptor enforces immediately. Recovery: revoke, re-bootstrap. |
| Stolen plugin token | Short TTL, per-instance scope, revoked on stop, sequence-window replay protection. |
| Operator account takeover via reset email | Reset token single-use, 30-minute TTL, bound to the user. Compromised email is still a real risk. |
| Compromised module supply chain | Cosign verification with optional Rekor SET enforcement. Unsigned modules reject in production. |
What auth does not defend against:
- A compromised controller host (root on the controller is game over).
- A compromised daemon host (the daemon’s instances are compromised; the cluster’s other instances are not, modulo proxy-side player data).
- Operator misuse of a legitimate ADMIN account. Audit logs reveal it after the fact; nothing prevents it.
Production hardening checklist
Within five minutes of first install:
- Change the bootstrap admin password.
- Shred the
.initial-admin-passwordfile. - Restrict
network.allowedSubnetsto operator and daemon subnets. - Terminate TLS at a reverse proxy if the controller REST is exposed
beyond a private network. Set
http.trustedProxyCidrssonetwork.allowedSubnetsevaluates the real client IP. - Enable module-signing enforcement
(
modules.signing.required=true) plus a configured trust root if you plan to install third-party modules.
The full checklist lives at Operations / Production Checklist.
Next up
- Cluster Model — what Valkey holds, what the lease-and-fencing model protects.
- Module Lifecycle — what happens at signature-verification time on install.
- Plugins — plugin token issuance and route segregation in detail.
- Operations / Authentication — the operator-facing how-to.