GloriousFlywheel Dogfood Reality Gap Analysis
Date: 2026-04-25
Linear: TIN-548
Purpose
This note records a cautious source-repo dogfood reality pass after the stability, pooled-substrate, cache-authority, auth-authority, and TIN-545 hardening work completed.
The goal is not to invent a new architecture. The goal is to compare the written contract against the live repo surfaces and separate:
- proved source-repo dogfood truth
- implementation gaps
- compatibility surfaces that must not steer the product story
- historical residue that still looks executable enough to mislead agents
Current Proved Truth
At audit start, /Users/jess/git/GloriousFlywheel was clean on main at
5850bdacd9b46416a729a6002e24105a978258dd.
The latest default-branch proof package on that commit was green:
- Platform Proof: run 24922182403
- Source Bazel Proof: run 24922182401
- Validate: run 24922182399
- Secret Detection: run 24922182398
- Tranche Proof Status: run 24922182395
- Deploy Docs: run 24922182402
- Publish to FlakeHub: run 24922182393
The green proof matters. It does not mean every source-repo workflow already executes on the pooled GloriousFlywheel substrate.
The strongest source-repo dogfood proof is currently:
- Source Bazel Proof runs on tinyland-nix
- it requires BAZEL_REMOTE_CACHE
- it requires GF_BAZEL_SUBSTRATE_MODE=shared-cache-backed
- it calls scripts/cache-attachment-contract.sh --strict
- it enters nix develop .#ci
- it runs Bazel through scripts/bazel-cache-backed.sh
That proves shared cache acceleration through the wrapper path. It does not prove universal remote execution or full remote builder offload for every local developer workload.
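The two environment requirements in the list above can be sketched as a small preflight function. This is a minimal illustration, not the contents of scripts/cache-attachment-contract.sh: the function name check_cache_attachment, the accepted URL schemes, and the messages are assumptions made for the example.

```shell
# Hypothetical preflight sketch; not the repo's cache-attachment-contract.sh.
# Prints a confirmation and returns 0 only when both contract variables
# named in this note are present and the endpoint has a plausible scheme.
check_cache_attachment() {
  if [ -z "${BAZEL_REMOTE_CACHE:-}" ]; then
    echo "BAZEL_REMOTE_CACHE is not set" >&2
    return 1
  fi
  if [ "${GF_BAZEL_SUBSTRATE_MODE:-}" != "shared-cache-backed" ]; then
    echo "GF_BAZEL_SUBSTRATE_MODE must be shared-cache-backed" >&2
    return 1
  fi
  # Assumed scheme allowlist; adjust to whatever the real contract permits.
  case "$BAZEL_REMOTE_CACHE" in
    grpc://*|grpcs://*|http://*|https://*) ;;
    *)
      echo "BAZEL_REMOTE_CACHE has an unexpected scheme: $BAZEL_REMOTE_CACHE" >&2
      return 1
      ;;
  esac
  echo "cache attachment ok: $BAZEL_REMOTE_CACHE"
}
```

A check like this only proves that the variables are set and well-formed. It says nothing about whether the endpoint is routable or whether any action actually hit the cache.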
2026-04-26 Refresh
The latest audited default-branch proof package is stronger than the original TIN-548 snapshot, but the core boundary remains the same.
Current verified truth:
- main at 7ae6b3d653199ec1dc5299f2a541a63225a9aa94 passed the proof package: Source Bazel Proof, Platform Proof, Validate, Secret Detection, Deploy Docs, and Publish to FlakeHub
- the just attic-cache-authority-check recipe reports the live main Attic cache as public-read, with anonymous metadata returning HTTP 200
- the Source Bazel Proof passed a real BAZEL_REMOTE_CACHE=grpc://bazel-cache.nix-cache.svc.cluster.local:9092 endpoint through scripts/bazel-cache-backed.sh and reported one remote cache hit
- the Platform Proof showed runner-dashboard fetched from http://attic.nix-cache.svc.cluster.local/main and a post-job Attic delta push
- ARC listener pods for the Tinyland and Jess owner-overlay scale sets were running during the audit, and no runner payload pods were pending at that instant
Current negative truth:
- the active Bazel config still contains no --remote_executor or equivalent Bazel remote-execution path; the implemented surface is remote cache
- local developer sessions are compatibility-local-only unless an operator provides a routable BAZEL_REMOTE_CACHE
- Bazel external repository fetches still happen before action-cache hits can help, so TIN-643 remains real product debt
- owner-overlay scale-set names solve GitHub registration/auth boundaries, but they do not create a global concurrency policy across shared labels
- direct full-repo public visibility remains blocked by internal history and current-tree exposure; the safe route is still the scrubbed public-alpha export/mirror
So the current product status is: source-repo shared cache dogfood is green; developer-machine parity, broad cross-repo adoption, and Bazel remote execution are not complete.
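The distinction between a remote-cache surface and a remote-execution surface can be checked mechanically against a bazelrc-style file. The sketch below is illustrative only: the function name classify_bazel_surface is hypothetical, and it inspects a single file, so flags passed on the command line or imported from other rc files are out of its view.

```shell
# Classify a bazelrc-style file by the strongest remote surface it configures.
# Prints one of: remote-execution, remote-cache, local-only.
classify_bazel_surface() {
  local rc_file="$1"
  local active
  # Drop comment lines so a commented-out flag does not count as configured.
  active=$(grep -v '^[[:space:]]*#' "$rc_file" || true)
  if printf '%s\n' "$active" | grep -q -e '--remote_executor[= ]'; then
    echo "remote-execution"
  elif printf '%s\n' "$active" | grep -q -e '--remote_cache[= ]'; then
    echo "remote-cache"
  else
    echo "local-only"
  fi
}
```

Even a remote-execution result here would only show that the flag is configured. The proof bar described in this note is a default-branch run showing actual remote processes.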
Contract Baseline
The active contract is:
- GloriousFlywheel is a pooled build, cache, and runner substrate.
- Local development and CI are meant to ride the same shared substrate.
- Capability classes, not repo names, define runner taxonomy.
- Raw local Bazel and Bazelisk are not the default product path.
- If the implementation only proves cache-backed local execution, say that explicitly.
This remains the right contract. The gaps below are places where implementation, examples, or workflow reality still fall short of that contract.
Gap 1: Hosted Default-Branch Workflows Still Exist
The default-branch proof package is green, but several workflows still use
ubuntu-latest.
Current hosted surfaces:
- .github/workflows/validate.yml
- .github/workflows/secrets-scan.yml
- .github/workflows/flakehub-publish.yml
- .github/workflows/pages.yml
- .github/workflows/mirror-images.yml
- .github/workflows/release.yml for release metadata creation
- .github/workflows/tranche-proof-status.yml
Assessment:
- This is a real dogfood gap.
- It is not the same as a failed proof.
- Some jobs may remain control-plane or third-party-action exceptions for a while, but they should be named exceptions or migrated intentionally.
- The current repo should not imply that all default-branch work already avoids hosted runners.
Likely next slice:
- Create a hosted-runner exception register and migrate the low-risk jobs first.
- Candidate low-risk migrations: validate.yml, secrets-scan.yml, and tranche-proof-status.yml.
- Leave GitHub Pages deploy, FlakeHub OIDC publish, image mirroring, and release metadata as separately evaluated control-plane jobs until proved on self-hosted lanes.
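One possible shape for the exception register is a plain text file that names each workflow allowed to stay on a hosted runner, plus a check that fails when a hosted workflow is missing from it. Everything here is an assumption for illustration: the function name, the register format (one workflow filename per line, with # comments), and the decision to match only the literal ubuntu-latest label.

```shell
# List workflow files that use ubuntu-latest but are not named in the
# exception register. Returns non-zero if any such workflow is found.
# Arguments: <workflow-dir> <register-file>
unclassified_hosted_workflows() {
  local workflow_dir="$1"
  local register="$2"
  local found=0
  local wf name
  for wf in "$workflow_dir"/*.yml "$workflow_dir"/*.yaml; do
    # Skip unmatched glob patterns and anything that is not a regular file.
    [ -f "$wf" ] || continue
    # Only the plain scalar form is detected; matrix or list forms are not.
    grep -Eq '^[[:space:]]*runs-on:[[:space:]]*ubuntu-latest' "$wf" || continue
    name=$(basename "$wf")
    # Exact whole-line match, so comment lines in the register never match.
    if ! grep -Fxq "$name" "$register"; then
      echo "$name"
      found=1
    fi
  done
  return "$found"
}
```

Called as unclassified_hosted_workflows .github/workflows path/to/register.txt (the register path is hypothetical), this prints each unclassified filename, so a CI step can fail with a readable list.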
Gap 2: Active Bazel Docs Still Normalize Direct Bazel Commands
Several active docs and examples still show direct bazel build or
bazel test commands as the normal invocation after cache attachment.
Examples:
- docs/build-system/bazel-targets.md
- docs/architecture/bazel-version-policy.md
- docs/guides/adoption-quickstart.md
- docs/runners/downstream-migration-checklist.md
- examples/github/cache-backed-workflow.yml
- examples/gitlab/.gitlab-ci.yml
- examples/flake/flake.nix
Assessment:
- This is a real narrative and agent-safety gap.
- Passing --remote_cache="$BAZEL_REMOTE_CACHE" explicitly is better than the old literal-placeholder bug, but it still teaches direct Bazel invocation as the path users copy.
- The product story should center a wrapper or repo-managed entrypoint that performs the strict cache-attachment preflight before Bazel executes.
Likely next slice:
- Add or expose a reusable consumer-side cache-backed Bazel wrapper example.
- Rewrite active examples to call that wrapper or a just recipe, not direct bazel build.
- Keep raw Bazel text only in explicit compatibility/debug sections.
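A consumer-side wrapper along these lines might look like the sketch below. It is not scripts/bazel-cache-backed.sh: the function name bazel_cache_backed, the BAZEL_BIN override, the return codes, and the restriction to build, test, and run are all assumptions made to keep the example small and exercisable without Bazel installed.

```shell
# Hypothetical consumer-side entrypoint: refuse to run without a cache
# endpoint, then hand off to Bazel with --remote_cache attached.
# BAZEL_BIN is an illustrative override (defaults to "bazel").
bazel_cache_backed() {
  if [ -z "${BAZEL_REMOTE_CACHE:-}" ]; then
    echo "refusing to run: BAZEL_REMOTE_CACHE is not set" >&2
    return 2
  fi
  local subcommand="${1:-}"
  case "$subcommand" in
    build|test|run) shift ;;
    *)
      echo "usage: bazel_cache_backed {build|test|run} [args...]" >&2
      return 2
      ;;
  esac
  # The cache flag goes right after the subcommand, before user arguments.
  "${BAZEL_BIN:-bazel}" "$subcommand" --remote_cache="$BAZEL_REMOTE_CACHE" "$@"
}
```

Setting BAZEL_BIN=echo turns the call into a dry run that prints the command line instead of executing it, which is a cheap way to check what an example would invoke.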
Gap 3: The Devshell Still Exposes A Bare bazel Wrapper
flake.nix exposes bazel as a compatibility wrapper around Bazelisk. The
comments say routine usage should go through just bazel-build-cached, but the
command is still available.
Assessment:
- This is not automatically wrong, because scripts/bazel-cache-backed.sh needs a Bazel binary to call.
- It is still an enforcement gap: the repo relies on docs and agent guidance to stop raw bazel use instead of making misuse harder.
Likely next slice:
- Consider a guarded devshell bazel shim that refuses heavy commands unless GF_BAZEL_SUBSTRATE_MODE=shared-cache-backed and BAZEL_REMOTE_CACHE are present, with an explicit escape hatch for compatibility debugging.
- Do this carefully because bazel clean, bazel query, and wrapper-invoked commands need a clear allowance model.
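The allowance model could start as a small decision function that the shim consults before executing anything. The sketch below is one possible shape, not a settled design: the function name, the list of always-allowed commands, and the GF_BAZEL_ALLOW_RAW escape-hatch variable are hypothetical.

```shell
# Decide whether a bare `bazel <command>` call should go ahead.
# Prints "allow" or "deny" and returns 0 or 1 to match.
bazel_shim_decision() {
  local subcommand="${1:-}"
  # Explicit escape hatch for compatibility debugging (hypothetical name).
  if [ "${GF_BAZEL_ALLOW_RAW:-}" = "1" ]; then
    echo "allow"
    return 0
  fi
  case "$subcommand" in
    # Light commands that do not need the shared cache.
    clean|query|info|version|help|"")
      echo "allow"
      return 0
      ;;
  esac
  # Everything else is treated as heavy and needs the shared-cache contract.
  # A wrapper that exports both variables passes this check automatically.
  if [ "${GF_BAZEL_SUBSTRATE_MODE:-}" = "shared-cache-backed" ] \
     && [ -n "${BAZEL_REMOTE_CACHE:-}" ]; then
    echo "allow"
    return 0
  fi
  echo "deny"
  return 1
}
```

Keeping the decision separate from the exec step makes the policy easy to test on its own, and the default-deny branch means newly added Bazel commands are guarded until someone classifies them.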
Gap 4: GitLab Compatibility Surfaces Preserve Stale Cache Drift
The primary GitHub path is the current product path, but GitLab compatibility files are still live enough to mislead.
Observed drift:
- .gitlab-ci.yml still sets BAZEL_REMOTE_CACHE=grpc://bazel-cache.attic-cache-dev.svc.cluster.local:9092
- .gitlab/ci/jobs/bazel-build.gitlab-ci.yml builds a user.bazelrc and runs direct nix develop .#ci --command bazel ...
- config/organization-s3.example.yaml and config/organization-ha.example.yaml still use attic-cache-dev namespace examples
- docs/infrastructure/overlay-creation.md still teaches the old GitLab-first overlay path with attic-cache-dev examples
Assessment:
- This is a compatibility-surface gap, not the primary source-repo proof path.
- It is still dangerous because these files are active tracked examples and validate targets, not archived research notes.
- The stale endpoint is explicitly out of contract elsewhere in the repo.
Likely next slice:
- Remove hard-coded stale Bazel endpoint defaults from GitLab compatibility.
- Require operator-provided BAZEL_REMOTE_CACHE for GitLab Bazel jobs.
- Mark overlay creation as legacy compatibility or rewrite it against the current S3 state and shared cache contract.
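Drift of this kind is easy to catch with a fixed-string scan over the live surfaces. The sketch below is a minimal illustration: the function name and return codes are assumptions, and which paths count as live (as opposed to archived notes that may legitimately mention the old namespace) is left to the caller.

```shell
# Print file:line matches for the out-of-contract namespace in the given
# files or directories. Returns 1 when matches exist, 0 when clean, and
# 2 when grep itself failed (for example, a missing path).
find_stale_cache_endpoints() {
  local stale_pattern='attic-cache-dev'
  local status=0
  # -r recurse, -n line numbers, -H always show filenames, -F fixed string.
  grep -rnHF -- "$stale_pattern" "$@" || status=$?
  case "$status" in
    0) return 1 ;;
    1) return 0 ;;
    *) return 2 ;;
  esac
}
```

A call such as find_stale_cache_endpoints .gitlab-ci.yml .gitlab config would cover the GitLab and config files named above. Documents like this audit note, which quote the stale endpoint on purpose, would need to stay outside the scanned paths.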
Gap 5: The Active Superpowers Plan Still Contains Executable Old Body Text
docs/superpowers/plans/2026-04-23-gloriousflywheel-pooled-substrate-dogfood-reset.md
has accurate progress checkpoints at the top, but the older checkbox body still
contains stale local paths and direct Bazel snippets.
Assessment:
- This is not current implementation truth.
- It remains a drift hazard because it is still presented as an implementation plan for agentic workers.
- The top checkpoint says the route has moved on, but the old body is still easy to over-follow.
Likely next slice:
- Collapse the completed plan body into a historical appendix, or add an explicit “do not execute the original checklist without reconciling against current canon” boundary.
- Move active productization work into fresh issue-backed plan surfaces.
Gap 6: Future Runner Types Are Correctly Bounded For Now
Native aarch64, riscv, Dawn-native dispatch, and localized warm-cache
guarantees for Hackage, Chapel, GPU backends, and similar toolchains are not
currently implemented as dispatch contracts.
Assessment:
- The current docs mostly classify these correctly as future-lane research.
- They should stay out of the current product contract until there is a named proof surface, runner class, owner, and cache-warming plan.
Likely next slice:
- Do not implement these inside the immediate dogfood repair lane.
- Keep them on the productization roadmap as future proof packages.
Gap 7: RBE Planning Exists, But Is Not Yet Authority
The April 26 RBE planning pass correctly identified that Bazel “remote build”
means remote execution, not only remote cache hits. It also correctly found
BUILD_WORKSPACE_DIRECTORY and shell-environment hazards in the Tofu rules.
Assessment:
- The plan is useful as a candidate sprint shape.
- It is not implementation authority.
- The NativeLink-shaped Linear scaffold created on April 26 assumes a peer backend choice before the repo has recorded that decision.
- Buildbarn, Buildfarm, BuildBuddy, and NativeLink are class-peer projects with overlapping build cache, CAS, worker, scheduling, and REAPI concerns. They are not ordinary GloriousFlywheel dependencies.
- ARC/GitHub Actions dispatch is real remote job execution, but not Bazel action-level remote execution. It must not be counted as --remote_executor proof.
- The README must not claim remote build until a default-branch proof shows actual remote processes through --remote_executor.
- See RBE Sprint Gate for the execution boundary.
Likely next slice:
- Add or annotate the Linear RBE work with an architecture-decision gate.
- Keep TIN-650 as the nearer-term developer-machine cache attachment proof.
- Treat a backend-neutral REAPI adapter, NativeLink, BuildBuddy, Buildbarn, Buildfarm, or deferral as candidates until the architecture decision is recorded.
Recommended Execution Order
- Add a repeatable source-repo dogfood contract audit. It should fail on unclassified hosted workflows, stale cache endpoints in live surfaces, and direct Bazel examples in active docs.
- Migrate the lowest-risk hosted workflows onto shared lanes. Start with validation/status jobs, not third-party publish/deploy jobs.
- Rewrite consumer Bazel examples around a wrapper entrypoint. Make direct Bazel commands compatibility/debug-only.
- Repair GitLab compatibility drift. Remove stale endpoints and require the same cache-variable contract.
- Evaluate a guarded devshell Bazel shim. Do this only after wrapper docs and CI checks are settled.
- Gate RBE work through a backend decision and minimum executor proof. Do not wire runner env vars, worker images, or public claims before the proof contract is explicit.
Non-Goals For This Audit
- Do not treat downstream blocked repos as proof criteria for the source repo.
- Do not invent repo-specific runner labels to close gaps.
- Do not claim remote execution where the implementation only proves shared cache acceleration.
- Do not move all hosted workflows in one broad PR without separating third-party control-plane risk.