0.1.104 — handshake auto-derives version + commit + build_time; sidebar Inbox restored for signed-out users; Casdoor probe failures now logged

Three small fixes, diagnosed by probing the deployed https://notechondria.trance-0.com/api/v1/handshake/ endpoint and finding version: "0.0.0" plus a 404 on /api/v1/auth/casdoor/config/. Each symptom was masking a real bug.

1. /api/v1/handshake/ never lies about the deployed build again

Old shape:

def _read_backend_version():
    env_version = os.getenv("BACKEND_VERSION") or ""
    if env_version:
        return env_version
    # ... try a few VERSION file paths ...
    return "0.0.0"  # silent fallback

The fallback to "0.0.0" was indistinguishable from a real version and silently masked stale-deploy bugs in production. It also incentivised setting BACKEND_VERSION as an env var on every deploy target, which drifts from the actual VERSION file in the image.

New shape (in backend/notechondria/api_views.py):

  • Filesystem first. _read_backend_version() tries the same candidate paths, but now before the env var. The filesystem is more reliable than env vars: the file is COPY'd into the image at build time, whereas env vars require deploy-time wiring that's easy to forget.
  • BACKEND_VERSION env var becomes a runtime override, not the primary source.
  • git rev-parse --short=12 HEAD as a dev-machine fallback — if no VERSION file exists and no env var is set, return "git-<sha>" so the developer at least sees their commit on the handshake.
  • "unknown" instead of "0.0.0" when every source fails. Logged at WARNING so the operator can grep journalctl for Backend.Notechondria.Handshake/read_version.
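
The resolution order above can be sketched roughly as follows. This is a minimal illustration, not the code in backend/notechondria/api_views.py: the function name resolve_version, its parameters, the logger name, and the log message are all hypothetical, and the candidate paths are passed in by the caller rather than guessed here.

```python
import logging
import os
import subprocess
from pathlib import Path
from typing import Callable, Iterable, Mapping

logger = logging.getLogger("handshake.sketch")  # hypothetical logger name


def _git_short_sha() -> str:
    """Return the 12-character short SHA of HEAD, or "" if git is unusable."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "--short=12", "HEAD"],
            capture_output=True, text=True, timeout=5, check=True,
        )
    except (OSError, subprocess.SubprocessError):
        return ""
    return out.stdout.strip()


def resolve_version(
    candidate_paths: Iterable[str],
    env: Mapping[str, str] = os.environ,
    git_sha: Callable[[], str] = _git_short_sha,
) -> str:
    # 1. Filesystem first: a VERSION file baked into the image.
    for path in candidate_paths:
        try:
            text = Path(path).read_text(encoding="utf-8").strip()
        except OSError:
            continue
        if text:
            return text
    # 2. BACKEND_VERSION env var, consulted only after the filesystem.
    from_env = (env.get("BACKEND_VERSION") or "").strip()
    if from_env:
        return from_env
    # 3. Dev-machine fallback: surface the current commit.
    sha = git_sha()
    if sha:
        return f"git-{sha}"
    # 4. Every source failed: say so instead of inventing a version.
    logger.warning("version could not be derived from any source")
    return "unknown"
```

Injecting env and git_sha as parameters is only a convenience of this sketch; it makes each rung of the fallback ladder testable without touching the real environment or repository.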

Same pattern for the build block:

  • commit reads /home/BUILD_COMMIT → env var → git rev-parse.
  • build_time reads /home/BUILD_TIME → env var → mtime of the VERSION file (the file's mtime approximates the image build time because the Dockerfile COPYs it from the repo root).
  • deploy_target stays env-driven — it's a label, not a derived fact, so the same image can serve under different identities.
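
As a rough sketch of the build_time chain described above: the function name resolve_build_time, the default VERSION path, and the choice to return an empty string when every source fails are assumptions of this example. The env var name BACKEND_BUILD_TIME is taken from the boot-script note later in this entry and is assumed to be the one the backend reads.

```python
import os
from datetime import datetime, timezone
from pathlib import Path
from typing import Mapping


def _read_stripped(path: str) -> str:
    """Read a small text file, returning "" if it is missing or unreadable."""
    try:
        return Path(path).read_text(encoding="utf-8").strip()
    except OSError:
        return ""


def resolve_build_time(
    build_time_file: str = "/home/BUILD_TIME",
    version_file: str = "VERSION",  # placeholder; the real path is not shown here
    env: Mapping[str, str] = os.environ,
) -> str:
    # 1. Value baked into the image by the Dockerfile RUN step.
    baked = _read_stripped(build_time_file)
    if baked:
        return baked
    # 2. Env var (name assumed, see the lead-in).
    from_env = (env.get("BACKEND_BUILD_TIME") or "").strip()
    if from_env:
        return from_env
    # 3. Last resort: mtime of the VERSION file, rendered as UTC ISO-8601.
    try:
        mtime = Path(version_file).stat().st_mtime
    except OSError:
        return ""  # assumption: the source does not say what happens here
    stamp = datetime.fromtimestamp(mtime, tz=timezone.utc)
    return stamp.strftime("%Y-%m-%dT%H:%M:%SZ")
```

The commit chain would follow the same file, env var, then git shape, with /home/BUILD_COMMIT as the first source.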

_build_metadata() is now also cached on the worker (was being recomputed on every handshake request).

Dockerfile changes

backend/Dockerfile gains two build ARGs: GIT_COMMIT and BUILD_TIME. A RUN step writes them to /home/BUILD_COMMIT and /home/BUILD_TIME respectively, falling back to the current UTC timestamp when BUILD_TIME is empty. CI / Northflank can populate them via docker build --build-arg:

docker build \
  --build-arg GIT_COMMIT="$(git rev-parse HEAD)" \
  --build-arg BUILD_TIME="$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  -f backend/Dockerfile .

Northflank template

northflank.json buildArguments now maps GIT_COMMIT from the NORTHFLANK_GIT_COMMIT build-time variable. No env-var wiring required.

northflank_start.sh

The boot script no longer writes BACKEND_VERSION / BACKEND_BUILD_COMMIT / BACKEND_BUILD_TIME env vars — the image bakes them in. Only BACKEND_DEPLOY_TARGET (label) survives as an env var.

Local sanity check

$ DJANGO_SECRET_KEY=test python -c "...; from notechondria.api_views import _read_backend_version, _build_metadata; ..."
version = '0.1.104'
build_metadata = {'version': '0.1.104', 'commit': '<sha>', 'build_time': '<utc-iso>', 'deploy_target': ''}

2. Default Inbox now shows in the sidebar for signed-out users

Reproduction:

  1. User signs in once. Their local Inbox syncs to the cloud (or the cloud auto-seeded one for them via seed_inbox_and_welcome_note).
  2. User signs out (or token expires server-side).
  3. They cold-boot the editor.

Before this fix:

  • _loadLocalState populates _courses (cached cloud) — contains the Inbox row from the prior sign-in.
  • _ensureStarterWorkspace checks hasInbox(_courses) || hasInbox(_localCourses), finds the cached cloud Inbox in _courses, and early-returns.
  • _localCourses stays empty (the seeder never ran).
  • _allCategories for signed-out users excludes _courses (per _AppShellState's getter — cached cloud rows aren't shown to the signed-out user since the editor reads as a local-only workspace until the user re-authenticates).
  • Result: _allCategories.isNotEmpty evaluates to false, and the whole Categories section in the sidebar (gated on that bool) collapses entirely. The user sees no Inbox at all.

Fix in local_starter.dart:_ensureStarterWorkspace: when signed out, only consult _localCourses for the "do we already have an Inbox?" check. If _localCourses doesn't have one, seed a fresh local Inbox via _seedStarterInboxAlongsideExisting so the sidebar is never empty for signed-out users.

When signed in, the original logic stands — a cached cloud Inbox is authoritative because it'll be in _allCategories via the spread.
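
The decision logic is easier to see in a small model. The real code is Dart (local_starter.dart); the Python below only mirrors the behaviour described above, and the Workspace class, its field names, the string rows, and the ordering of the signed-in spread are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Workspace:
    signed_in: bool
    courses: List[str] = field(default_factory=list)        # cached cloud rows
    local_courses: List[str] = field(default_factory=list)  # local-only rows

    @property
    def all_categories(self) -> List[str]:
        # Signed-out users see only local rows; signed-in users see both.
        if self.signed_in:
            return [*self.courses, *self.local_courses]
        return list(self.local_courses)


def has_inbox(rows: List[str]) -> bool:
    return "Inbox" in rows


def ensure_starter_workspace(ws: Workspace) -> None:
    if ws.signed_in:
        # A cached cloud Inbox counts, because it is visible in all_categories.
        already = has_inbox(ws.courses) or has_inbox(ws.local_courses)
    else:
        # Signed out: the cached cloud rows are hidden, so only local rows count.
        already = has_inbox(ws.local_courses)
    if not already:
        ws.local_courses.append("Inbox")  # seed a fresh local Inbox
```

With this model, a signed-out workspace whose only Inbox is a cached cloud row ends up with a local Inbox and a non-empty category list, while a signed-in workspace with the same cache is left untouched.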

3. Casdoor config-probe failures now leave a debug-log breadcrumb

The probe in each app's _loadInitialData was wrapped in catch (_) { /* shadow mode or transient */ }. That was great for shadow-mode setups (where /auth/casdoor/config/ returns {configured: false} as a 200 response — no exception, no log), but it silently masked stale-deploy bugs. Specifically: the production deploy at notechondria.trance-0.com reported version: "0.0.0" and 404'd on /auth/casdoor/config/, so the SSO pill never appeared and the user had no UI breadcrumb to diagnose why.

editor_app, planner_app, and portal_app now log a WARNING line shaped per AGENTS.md §1.7:

"Casdoor SSO surface unavailable: <App>.Auth/casdoor.config.probe — <cause>."

The <App> prefix comes from the per-app canonical module list in docs/AGENTS.md. Shadow-mode setups (where the probe returns 200 with configured: false) still don't log — only actual exceptions.

Verification

  • python manage.py check → System check identified no issues (0 silenced).
  • flutter analyze from notechondria_shared, editor_app, planner_app, portal_app → zero new errors / warnings.
  • Local handshake smoke test reports the live commit + version, not 0.0.0.

Operator runbook (after this round deploys)

  1. Trigger a Northflank rebuild of the backend service. The build step picks up NORTHFLANK_GIT_COMMIT automatically via the updated northflank.json buildArguments block.
  2. Hit https://notechondria.trance-0.com/api/v1/handshake/ and confirm:
    • version matches the latest committed VERSION file.
    • build.commit is the SHA of the deployed branch.
    • build.build_time is the image build timestamp (UTC ISO).
    • build.deploy_target is "northflank".
  3. Hit https://notechondria.trance-0.com/api/v1/auth/casdoor/config/ and confirm it's no longer 404. Should return {configured: true, endpoint, client_id, organization, application, signin_url} (assuming CASDOOR_* env vars are populated on the service).
  4. Open any frontend signed out — confirm the Casdoor SSO pill, "Login via third party" outlined button, and "No account? Sign up via Casdoor" link all render. The UI debug log no longer hides probe failures.
  5. Cold-boot the editor signed out — confirm the Inbox row appears in the sidebar Categories section regardless of whether the cached _courses (cloud) has an Inbox from a prior session.
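
Step 2 of the runbook can be turned into a small checker once the handshake JSON has been fetched. This sketch assumes the payload has a top-level version plus a nested build object, which is inferred from the field names in the runbook rather than confirmed; check_handshake and its parameters are hypothetical.

```python
from typing import Any, List, Mapping


def check_handshake(
    payload: Mapping[str, Any],
    expected_version: str,
    expected_target: str = "northflank",
) -> List[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    problems: List[str] = []

    version = payload.get("version")
    if version in (None, "", "0.0.0", "unknown"):
        problems.append(f"version is a fallback value: {version!r}")
    elif version != expected_version:
        problems.append(f"version {version!r} != expected {expected_version!r}")

    build = payload.get("build") or {}
    if not build.get("commit"):
        problems.append("build.commit is empty")
    if not build.get("build_time"):
        problems.append("build.build_time is empty")
    if build.get("deploy_target") != expected_target:
        problems.append(
            f"build.deploy_target is {build.get('deploy_target')!r}, "
            f"expected {expected_target!r}"
        )
    return problems
```

Feeding it the parsed response body and the contents of the committed VERSION file gives a quick pass/fail signal; comparing build.commit against the deployed branch still has to be done by the operator.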

Carryover

  • User-migration to Casdoor still pending — the migrate_users_to_casdoor command exists (since 0.1.101) but hasn't been run against production. See the operator notes the agent left in chat for the access constraints.
  • The dead RegistrationWizard / EmailCodeDialog / PasswordResetDialog widgets and the per-app _SettingsPage callback fields are still around as carryover from 0.1.103.