fix: cognix session adopt CLI for orphan sessions #84

Merged
jesse merged 5 commits from fix/dashboard-orphan-session-adopt into main 2026-05-19 15:55:33 +02:00

Summary

  • Adds cognix session adopt CLI to grant Primary on sessions written before the F-3 registry existed (a legitimate never-owned case that the server's one-shot repair deliberately does not handle).
  • Hardens the adopt path against concurrency with a live server, nil-UUID owners, secret leakage in TOML parse errors, and partial-batch corruption.

Test Plan

  • cargo test -p cognix-cli — unit tests for adopt (orphan promotion, dry-run, include-closed, nil-UUID rejection, malformed session id, stale-snapshot race, production-migrated schema).
  • CLI parses session adopt --path … --owner … --include-closed --dry-run.
  • CI: fmt + clippy + test + audit + deny + coverage.

Self-Review Checklist

  • No hardcoded secrets
  • No unwrap() in library code
  • No println!/dbg! (CLI binary uses println! for user-facing output only)
  • Fail-closed on missing seed file; nil-UUID rejected from both --owner and seed
  • Per-row SAVEPOINTs so one bad orphan doesn't abort the batch
  • TOML parse errors sanitized — never echo file contents
  • max_agents resolved from CognixConfig rather than a magic number
  • Tests cover happy path, dry-run, races, malformed input, and a production-migrated schema
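
The sanitized-error item above can be illustrated with a minimal sketch. It assumes the parser's error exposes a byte offset plus a message that may quote the offending input. `RawParseError`, `SeedParseError`, and `sanitize` are hypothetical names for illustration, not the types used in this PR, and `RawParseError` only stands in for a real TOML parser's error type.

```rust
use std::fmt;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a TOML parser error: it knows where parsing
/// failed and carries a message that may quote the offending input.
struct RawParseError {
    offset: usize,
    message: String, // may contain secret file contents
}

/// Sanitized error shown to the operator: file path and byte offset only.
#[derive(Debug, PartialEq)]
struct SeedParseError {
    path: PathBuf,
    offset: usize,
}

impl fmt::Display for SeedParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "failed to parse {} at byte offset {}",
            self.path.display(),
            self.offset
        )
    }
}

impl std::error::Error for SeedParseError {}

/// Keep only the location; the raw message is discarded so it can never
/// reach logs or the terminal.
fn sanitize(path: &Path, raw: RawParseError) -> SeedParseError {
    let RawParseError { offset, message: _ } = raw;
    SeedParseError {
        path: path.to_path_buf(),
        offset,
    }
}
```

Because `SeedParseError` has no field that can hold file contents, a later change to the `Display` impl cannot accidentally reintroduce the leak.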

Notes

Trust model: any caller with write access to the SQLite file can use --owner to grant Primary to an arbitrary UUID; the DB file's permissions (mode 0o600) are the only access control on this path. Documented in both AdoptCommand and the clap --owner help.
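
Since the file mode is the only gate on this path, an operator may want to confirm it before running adopt. The following is a sketch of such a check on Unix, not something this PR adds; `is_owner_only` is a hypothetical helper name.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Returns true when the permission bits are exactly 0o600, i.e. only the
/// owning user can read or write the file. Unix-only (hypothetical helper).
fn is_owner_only(path: &Path) -> io::Result<bool> {
    let mode = fs::metadata(path)?.permissions().mode();
    // Mask off the file-type bits and compare the permission bits.
    Ok((mode & 0o777) == 0o600)
}
```

A caller could refuse to proceed, or just warn, when this returns `false` for the database path.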

The dashboard's per-caller RBAC filter hides any row in `sessions` that
has no matching row in `shared_sessions`, because `resolve_session_role`
cannot prove ownership. Sessions created before `register_local_session`
was wired into `create_session` (PR #50/#82) are stranded that way and
the one-shot legacy-owner repair deliberately does not adopt them.

`cognix session adopt` is the explicit operator escape hatch: it finds
orphans and inserts `shared_sessions` + a Primary `session_permissions`
grant for the local owner (or an explicit --owner uuid), all in one
transaction. Flags: --path, --owner, --include-closed, --dry-run.
Missing seed file fails closed so we never silently re-own orphans
under a fresh random identity.
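
The owner resolution described above (explicit --owner first, otherwise the seed file, failing closed when the seed is missing) can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the seed is modelled as a file holding a bare UUID string rather than the real secrets TOML, and `OwnerError`, `parse_owner`, and `resolve_owner_sketch` are hypothetical names.

```rust
use std::fs;
use std::io;
use std::path::Path;

#[derive(Debug, PartialEq)]
enum OwnerError {
    SeedMissing,
    SeedUnreadable(io::ErrorKind),
    Malformed,
    NilUuid,
}

/// Accepts only the canonical hyphenated form (8-4-4-4-12 hex digits) and
/// rejects the all-zero nil UUID.
fn parse_owner(text: &str) -> Result<String, OwnerError> {
    let text = text.trim();
    let bytes = text.as_bytes();
    if bytes.len() != 36 {
        return Err(OwnerError::Malformed);
    }
    for (i, b) in bytes.iter().enumerate() {
        let hyphen_slot = matches!(i, 8 | 13 | 18 | 23);
        let ok = if hyphen_slot { *b == b'-' } else { b.is_ascii_hexdigit() };
        if !ok {
            return Err(OwnerError::Malformed);
        }
    }
    if bytes.iter().all(|b| *b == b'0' || *b == b'-') {
        return Err(OwnerError::NilUuid);
    }
    Ok(text.to_ascii_lowercase())
}

/// Explicit owner wins; otherwise read the seed. A missing seed is an error
/// rather than a trigger to mint a fresh identity.
fn resolve_owner_sketch(explicit: Option<&str>, seed_path: &Path) -> Result<String, OwnerError> {
    if let Some(owner) = explicit {
        return parse_owner(owner);
    }
    match fs::read_to_string(seed_path) {
        Ok(contents) => parse_owner(&contents),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Err(OwnerError::SeedMissing),
        Err(e) => Err(OwnerError::SeedUnreadable(e.kind())),
    }
}
```

The important property is that no branch generates a new UUID: every failure to find a usable identity surfaces as an error to the operator.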
fix: harden session adopt against concurrency, nil owners, and secret leaks
Some checks failed
CI / Detect Changes (pull_request) Successful in 13s
CI / Format (pull_request) Has been skipped
CI / Clippy (pull_request) Has been skipped
CI / Integration Tests (pull_request) Has been skipped
CI / Benchmarks (pull_request) Has been skipped
CI / Build (release) (pull_request) Has been skipped
CI / RSS gate (P-15) (pull_request) Has been skipped
CI / Security Scan (pull_request) Successful in 23s
CI / Test (pull_request) Failing after 28s
CI / PR Size Check (pull_request) Successful in 28s
CI / Conventional Validation (pull_request) Successful in 51s
CI / Clean Build Sample 1 (pull_request) Has been skipped
CI / Check file lengths (pull_request) Failing after 39s
CI / Clean Build Sample 3 (pull_request) Has been skipped
CI / Clean Build Sample 2 (pull_request) Has been skipped
CI / Clean Build Summary (pull_request) Has been skipped
CI / Dashboard Browser (pull_request) Successful in 2m17s
CI / Documentation (pull_request) Successful in 5m22s
CI / Check (linux-aarch64 compile-validation) (pull_request) Successful in 5m41s
CI / Deny (pull_request) Failing after 10m6s
CI / Audit (CVEs) (pull_request) Failing after 10m8s
CI / D-02 Clean Build Gate (pull_request) Failing after 15m15s
CI / Dashboard UI Build (pull_request) Successful in 15m55s
CI / Coverage (80% gate) (pull_request) Successful in 16m42s
CI / CI Report (pull_request) Successful in 7s
0b84d2d0c7
- Hold an IMMEDIATE transaction with connection busy_timeout so adopt
  cooperates with a running server instead of failing on SQLITE_BUSY.
- Wrap each orphan in its own SAVEPOINT; FK violations from a vanished
  sessions row roll back only that orphan, not the whole batch.
- Skip orphans already owned via INSERT OR IGNORE so a race with the
  server cannot grant Primary to a session another agent owns.
- Reject the nil UUID from --owner and from the seed file so a forgotten
  flag or empty seed cannot grant Primary to a sentinel identity.
- Validate every orphan session id before any writes; abort cleanly on
  malformed rows rather than corrupting the registry mid-batch.
- Sanitize TOML parse errors from the secrets file so they report only the
  file and offset; the seed file is expected to hold real credentials.
- Resolve max_agents from CognixConfig instead of a hardcoded constant
  so adopted sessions match what the running server would assign.
- Reuse cognix-core's default_local_state_dir + SECRETS_FILE_NAME
  (now pub) rather than redeclaring constants in the CLI.
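
The batch behaviour listed above (validate every id before writing, isolate each orphan, never overwrite an existing owner) can be modelled in memory. The sketch below is not the PR's SQLite code: `Registry` stands in for the `sessions` and `shared_sessions` tables, the id rule in `is_valid_session_id` is invented for the example, and `Outcome`, `BatchError`, `adopt_one`, and `adopt_batch` are hypothetical names.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// In-memory stand-in for the two tables involved (layout is assumed).
#[derive(Default)]
struct Registry {
    sessions: BTreeSet<String>,                // ids present in `sessions`
    shared_sessions: BTreeMap<String, String>, // session id -> owner
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Adopted,
    AlreadyOwned,    // models INSERT OR IGNORE hitting an existing row
    SessionVanished, // models an FK violation rolled back to the savepoint
}

#[derive(Debug, PartialEq)]
enum BatchError {
    MalformedId { index: usize },
}

/// Hypothetical id rule for this sketch: non-empty ASCII alphanumerics and '-'.
fn is_valid_session_id(id: &str) -> bool {
    !id.is_empty() && id.bytes().all(|b| b.is_ascii_alphanumeric() || b == b'-')
}

/// One orphan, one isolated unit of work: failure here affects only this id.
fn adopt_one(reg: &mut Registry, id: &str, owner: &str) -> Outcome {
    if !reg.sessions.contains(id) {
        return Outcome::SessionVanished;
    }
    if reg.shared_sessions.contains_key(id) {
        return Outcome::AlreadyOwned;
    }
    reg.shared_sessions.insert(id.to_string(), owner.to_string());
    Outcome::Adopted
}

fn adopt_batch(
    reg: &mut Registry,
    snapshot: &[&str],
    owner: &str,
) -> Result<Vec<Outcome>, BatchError> {
    // Phase 1: validate the whole snapshot before touching the registry.
    if let Some(index) = snapshot.iter().position(|id| !is_valid_session_id(id)) {
        return Err(BatchError::MalformedId { index });
    }
    // Phase 2: adopt row by row; a bad row does not abort the others.
    let mut outcomes = Vec::with_capacity(snapshot.len());
    for id in snapshot {
        outcomes.push(adopt_one(reg, id, owner));
    }
    Ok(outcomes)
}
```

The snapshot may be stale by the time phase 2 runs, which is why `adopt_one` rechecks both tables instead of trusting the orphan list.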
ci: use cargo-binstall for deny and audit to avoid compile timeouts
Some checks failed
CI / Detect Changes (pull_request) Successful in 11s
CI / Integration Tests (pull_request) Has been skipped
CI / Check file lengths (pull_request) Failing after 2s
CI / Check (linux-aarch64 compile-validation) (pull_request) Failing after 2s
CI / D-02 Clean Build Gate (pull_request) Failing after 2s
CI / Benchmarks (pull_request) Has been skipped
CI / Format (pull_request) Successful in 20s
CI / Security Scan (pull_request) Successful in 22s
CI / Audit (CVEs) (pull_request) Successful in 31s
CI / Test (pull_request) Has been cancelled
CI / Deny (pull_request) Has been cancelled
CI / Clippy (pull_request) Has been cancelled
CI / Dashboard UI Build (pull_request) Has been cancelled
CI / Conventional Validation (pull_request) Has been cancelled
CI / Dashboard Browser (pull_request) Has been cancelled
CI / Documentation (pull_request) Has been cancelled
CI / Coverage (80% gate) (pull_request) Has been cancelled
CI / Build (release) (pull_request) Has been cancelled
CI / Clean Build Sample 1 (pull_request) Has been cancelled
CI / Clean Build Sample 3 (pull_request) Has been cancelled
CI / PR Size Check (pull_request) Has been cancelled
CI / RSS gate (P-15) (pull_request) Has been cancelled
CI / CI Report (pull_request) Has been cancelled
CI / Clean Build Sample 2 (pull_request) Has been cancelled
CI / Clean Build Summary (pull_request) Has been cancelled
5a609295f0
cargo install from source takes 10+ min on a cold runner, exceeding the
10-minute timeout. cargo-binstall downloads pre-compiled binaries for
cargo-deny and cargo-audit in ~20 seconds, fixing the flaky CI gate.

Also bumps timeout-minutes from 10 to 15 for the affected jobs as a
safety margin in case binstall needs to fall back to source compilation.
chore: remove stale RUSTSEC-2026-0097 advisory ignore from deny.toml
Some checks failed
CI / Detect Changes (pull_request) Successful in 17s
CI / Integration Tests (pull_request) Has been skipped
CI / Benchmarks (pull_request) Has been skipped
CI / Format (pull_request) Successful in 17s
CI / Conventional Validation (pull_request) Successful in 50s
CI / Clean Build Sample 1 (pull_request) Has been skipped
CI / Clean Build Sample 2 (pull_request) Has been skipped
CI / Check file lengths (pull_request) Failing after 31s
CI / Clean Build Sample 3 (pull_request) Has been skipped
CI / Test (pull_request) Failing after 36s
CI / Clean Build Summary (pull_request) Has been skipped
CI / Audit (CVEs) (pull_request) Successful in 37s
CI / Deny (pull_request) Successful in 43s
CI / Documentation (pull_request) Successful in 1m30s
CI / Dashboard Browser (pull_request) Successful in 1m34s
CI / Check (linux-aarch64 compile-validation) (pull_request) Successful in 2m50s
CI / Clippy (pull_request) Successful in 2m57s
CI / Dashboard UI Build (pull_request) Successful in 5m21s
CI / D-02 Clean Build Gate (pull_request) Successful in 5m40s
CI / Coverage (80% gate) (pull_request) Successful in 6m29s
CI / RSS gate (P-15) (pull_request) Successful in 5m15s
CI / Build (release) (pull_request) Successful in 6m21s
CI / PR Size Check (pull_request) Successful in 9s
CI / Security Scan (pull_request) Successful in 11s
CI / CI Report (pull_request) Successful in 5s
cf2c52adf1
rand was upgraded past the affected version range; cargo deny check now
reports advisory-not-detected, so the ignore entry is dead config.
refactor: split session.rs into session/ module to satisfy file-length gate
All checks were successful
CI / Detect Changes (pull_request) Successful in 19s
CI / Integration Tests (pull_request) Has been skipped
CI / Benchmarks (pull_request) Has been skipped
CI / Security Scan (pull_request) Successful in 33s
CI / Deny (pull_request) Successful in 59s
CI / Dashboard Browser (pull_request) Successful in 2m12s
CI / Conventional Validation (pull_request) Successful in 2m54s
CI / Clean Build Sample 1 (pull_request) Has been skipped
CI / Clean Build Sample 2 (pull_request) Has been skipped
CI / Clean Build Sample 3 (pull_request) Has been skipped
CI / Clean Build Summary (pull_request) Has been skipped
CI / Dashboard UI Build (pull_request) Successful in 10m18s
CI / Test (pull_request) Successful in 10m20s
CI / Format (pull_request) Successful in 15s
CI / Audit (CVEs) (pull_request) Successful in 20s
CI / Check file lengths (pull_request) Successful in 24s
CI / Documentation (pull_request) Successful in 1m28s
CI / Clippy (pull_request) Successful in 1m49s
CI / Check (linux-aarch64 compile-validation) (pull_request) Successful in 1m51s
CI / D-02 Clean Build Gate (pull_request) Successful in 4m40s
CI / Coverage (80% gate) (pull_request) Successful in 6m32s
CI / RSS gate (P-15) (pull_request) Successful in 5m1s
CI / Build (release) (pull_request) Successful in 6m13s
CI / PR Size Check (pull_request) Successful in 10s
CI / CI Report (pull_request) Successful in 4s
e37dfb41fe
session.rs reached 1043 lines (>800 hard ceiling), failing CI / Check file
lengths and CI / Test (which also runs the gate). Convert to a module
directory with three focused files:

- session/mod.rs  — public API: AdoptCommand, AdoptReport, run() + run-through tests
- session/db.rs   — DB internals: OrphanSession, find_orphan_sessions,
                    adopt_orphans, AdoptOutcome, try_adopt_one + 2 direct tests
- session/resolve.rs — identity resolution: resolve_owner, resolve_max_agents,
                       load_owner_seed

All 115 tests pass; clippy clean; file-length gate exits 0.
jesse merged commit 7696511413 into main 2026-05-19 15:55:33 +02:00
jesse deleted branch fix/dashboard-orphan-session-adopt 2026-05-19 15:55:34 +02:00
Reference
jesse/cognix!84