GF REAPI Cell
gf-reapi-cell is the first GloriousFlywheel-owned remote execution proof cell.
It exists to turn the RBE lane from planning language into a runnable REAPI
endpoint without adopting BuildBuddy, Buildbarn, Buildfarm, or NativeLink as the
product authority.
Current status:
- implements Capabilities, ByteStream, CAS, Action Cache, Execution, and WaitExecution
- stores CAS and action-cache data on the service-local filesystem, scoped by
  validated REAPI instance_name (default, system, or spoke-<slug>)
- can optionally enforce JWT-backed tenant authorization for CAS, AC,
  ByteStream, Execute, and WaitExecution using Authorization: Bearer <jwt>; the
  default deployment remains GF_REAPI_AUTHZ_MODE=off until rollout
- builds through nix build .#gf-reapi-cell
- exposes an OCI image package as nix build .#gf-reapi-cell-image
- publishes through .github/workflows/publish-gf-reapi-cell.yml to
  ghcr.io/tinyland-inc/gf-reapi-cell by digest
- carries a minimal worker runtime envelope including /bin/sh, /usr/bin/env,
  Node 22, Python 3, glibc, the /lib64/ld-linux-x86-64.so.2 loader bridge, the
  compiler C++ runtime needed by hermetic Linux tool inputs such as rules_nodejs
  Node, and common POSIX archive tools needed by first-proof Bazel actions
- carries Chromium in the browser-capable image path for the proved bounded
  Playwright static-site smoke class and the proved public consumer Puppeteer
  static-output smoke class; additional browser target classes still need
  forced proof before promotion, and browser binaries must come from the pinned
  Browser Runtime Authority, not from npm lifecycle downloads
- can be exercised through the manual .github/workflows/gf-reapi-cell-proof.yml
  workflow or scripts/run-gf-reapi-cell-proof.sh
- emits worker, platform, action digest, and command evidence in execution logs
- is intended only for the explicit proof wrapper, scripts/bazel-rbe-proof.sh
- is not wired into .bazelrc, ordinary Just build recipes, ARC runners, or
  public adoption docs as the default path
The initial countable proof target was //app:build. PR #564 proved that target
through the explicit wrapper with worker image
sha256:be2832171ac69cc9a2d012b3c789e8b765afb7cae0df8f7e9677dd6d8542dbc0;
Bazel reported 2308 processes: 1439 internal, 869 remote, and both
app/sveltekit_sync and app/vite_build exited 0 on the REAPI worker. PR #565
made the proof strict by default with GF_RBE_PROOF_FORCE_EXECUTION=true,
--remote_accept_cached=false, fresh-window worker logs, and cache-hit-only
rejection. A proof counts only when Bazel is invoked with a non-empty executor
endpoint, the explicit wrapper passes both cache and executor flags, and logs
show an action running through the REAPI worker rather than a cache hit or
remote CI runner.
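The summary half of that counting rule can be checked mechanically. The following is a minimal sketch, assuming the Bazel summary line has the shape quoted in this document ("N processes: a internal, b remote"). The names parse_process_summary and has_remote_execution are hypothetical and are not part of the proof wrapper or the artifact verifier; worker-log evidence still has to be checked separately.

```python
import re

# Matches the summary shape quoted in this document, for example
# "2308 processes: 1439 internal, 869 remote".
SUMMARY_RE = re.compile(r"(\d+) processes?: (.+)")


def parse_process_summary(line: str) -> dict[str, int]:
    """Return {"total": N, "<label>": count, ...} for a Bazel summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"no process summary in: {line!r}")
    counts = {"total": int(match.group(1))}
    for part in match.group(2).rstrip(".").split(","):
        number, _, label = part.strip().partition(" ")
        counts[label] = int(number)
    return counts


def has_remote_execution(counts: dict[str, int]) -> bool:
    """Cache hits alone do not count; at least one action must run remotely."""
    return counts.get("remote", 0) > 0
```

Because "remote cache hit" and "remote" are parsed as separate labels, a run that only shows remote cache hits is reported as having no remote execution.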
PR #582 added explicit build/test proof selection and proved //app:unit_tests
on the default branch with GF_RBE_PROOF_BAZEL_COMMAND=test. Main run
25601913985 reported 1249 processes: 722 internal, 527 remote, 20 Vitest
test files, 168 passing tests, and remote worker evidence for
external/bazel_tools/tools/test/test-setup.sh app/unit_tests_/unit_tests.
Main run 25602726443 proved //:deployment_bundle with the Bazel build
command, forced execution, 7 processes: 6 internal, 1 remote, and remote
worker evidence for the rules_pkg build_tar action that writes
deployment_bundle.tar.gz.
PR #605 fixed a proof-cell response-contract bug discovered while testing
//examples/hello-go:hello_test: the cell was inlining every declared output
file into ActionResult.OutputFile.contents instead of honoring only
ExecuteRequest.InlineOutputFiles. The fixed cell image,
sha256:bb5455a038bdbff2560f22491c131c2163d3089ffafedee08f937d63f35fa848,
removed the prior 4 MiB Execute-response failure. The follow-up Go proof run
25632300253 then reached real rules_go remote actions before failing in
GoStdlib runtime/cgo with cc: no such file or directory. Run
25634296833 then proved the intentionally pure-Go pure = "on" target with
bazel_command=test, forced execution, 11 remote processes, and remote
test-setup evidence for //examples/hello-go:hello_test. After the worker
image carried the C/C++ wrapper closure, run 25649628233 proved the separate
cgo-backed //examples/hello-go-cgo:cgo_test target with remote
runtime/cgo, cgo compile, link, and test-setup evidence.
Storage Boundary
Do not back this v0 CAS/action-cache with the current RustFS service. RustFS
still has bucket-index reliability debt and remains guarded interim
infrastructure for existing cache/state checks. The disqualification is about
the current evidence, not the product ambition: RustFS returned NoSuchBucket
for existing bucket data/metadata and recovered only after a controlled restart,
and there is no proved signed non-restart repair runbook for that recurrence.
The proof cell should use separate ephemeral local storage for the first
execution proof, then graduate to a separately designed CAS/action-cache
authority only after backend reliability, auth, retention, tenant isolation,
write admission, restore, and observability are selected and proved.
The durability seam for that graduation now exists in code: CAS and
action-cache persistence flow through a provider-neutral BlobStore interface
(internal/cell/blobstore.go) selected by GF_REAPI_BLOBSTORE_BACKEND. The
default local backend preserves the historical service-local filesystem
layout byte-for-byte. The s3 backend (GF_REAPI_S3_ENDPOINT,
GF_REAPI_S3_BUCKET, GF_REAPI_S3_*) targets S3-compatible endpoints without
selecting a provider. The live GloriousFlywheel storage plane already uses
RustFS for existing self-hosted S3-compatible cache/state paths. That does not
automatically promote the current RustFS topology to CAS/AC authority; TIN-1147
must prove repair, restore, retention, failure-domain behavior, and bucket-index
coherence before any RustFS-backed CAS/action-cache namespace is trusted. The
S3 client is dependency-free (Go stdlib plus an AWS SigV4 signer), so no vendor
SDK enters the data path and the Nix vendorHash is unchanged. S3 credentials
need bucket reachability plus object GET, PUT, and HEAD; /readyz
performs a signed HeadBucket. The default backend stays local until an
operator selects and proves an S3 endpoint and namespace for CAS/action-cache.
Age-based garbage collection (W1.3/TIN-1460) is also wired: setting
GF_REAPI_BLOB_TTL (a Go duration, e.g. 168h) enables a background sweeper
that evicts CAS and action-cache entries not accessed within the TTL,
scoped strictly to instances/<name>/{cas,ac} so execution scratch, the AC
audit log, and quarantine markers are never touched. GF_REAPI_GC_INTERVAL
sets the sweep cadence (default 1h). The local backend implements the sweep;
the S3 backend deliberately does not, because object expiry there belongs to
bucket lifecycle/ILM rules, so the sweeper logs that it is relying on backend
policy and exits. Sweep activity is exported as gf_reapi_gc_runs_total,
gf_reapi_gc_evicted_objects_total, gf_reapi_gc_evicted_bytes_total, and
gf_reapi_gc_errors_total. /readyz returns 503 (with the failing reason)
when the configured blob backend is unreachable.
GC is reconciled with the Bazel client cache lease (W1.3/TIN-1460): Bazel trusts
that a referenced blob stays in the CAS for --experimental_remote_cache_ttl
(default 3h, set explicitly on the ci-cached config). If GC evicted a blob
inside that window it would break a build mid-flight, so when the deployment
declares the served lease via GF_REAPI_MIN_CLIENT_CACHE_TTL, the cell refuses
to start unless GF_REAPI_BLOB_TTL >= that floor. just gf-reapi-cell-lease-gc-reconcile-check is the static CI mirror of that guard.
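The startup guard described above amounts to a duration comparison. This is a minimal sketch, not the cell's Go implementation: parse_duration handles only the h, m, and s units of Go duration syntax, and check_lease_floor is a hypothetical name.

```python
import re

# Seconds per unit for the subset of Go duration units handled here.
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0}
_TERM_RE = re.compile(r"(\d+(?:\.\d+)?)(h|m|s)")


def parse_duration(text: str) -> float:
    """Parse a Go-style duration such as "168h" or "1h30m" into seconds."""
    position, total = 0, 0.0
    while position < len(text):
        term = _TERM_RE.match(text, position)
        if term is None:
            raise ValueError(f"unsupported duration: {text!r}")
        total += float(term.group(1)) * _UNIT_SECONDS[term.group(2)]
        position = term.end()
    if position == 0:
        raise ValueError("empty duration")
    return total


def check_lease_floor(blob_ttl: str, min_client_cache_ttl: str) -> None:
    """Raise if GC could evict a blob inside the client's cache lease."""
    if parse_duration(blob_ttl) < parse_duration(min_client_cache_ttl):
        raise ValueError(
            f"GF_REAPI_BLOB_TTL={blob_ttl} is below "
            f"GF_REAPI_MIN_CLIENT_CACHE_TTL={min_client_cache_ttl}"
        )
```

With the values mentioned in this document, check_lease_floor("168h", "3h") passes, while a one-hour blob TTL against a three-hour lease is refused.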
The first W1.4/TIN-1461 size-bound primitive is present for the local backend:
GF_REAPI_CAS_MAX_BYTES accepts a byte count with optional Ki/Mi/Gi/Ti
suffix. It requires GF_REAPI_MIN_CLIENT_CACHE_TTL, evicts only CAS blobs older
than that lease floor, orders eligible candidates by LRU, and reconciles durable
quota counters after reclamation. It emits
gf_reapi_size_eviction_runs_total,
gf_reapi_size_evicted_objects_total,
gf_reapi_size_evicted_bytes_total,
gf_reapi_size_eviction_errors_total, and the
gf_reapi_evicted_while_referenced_total tripwire that should remain zero by
construction. Sharded/replicated topology, p99-tuned TTL from live dashboards,
and threading request contexts into blob I/O remain separate gates.
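The size-bound selection described above can be illustrated with a small planner. This is a sketch under stated assumptions: blob age is measured from last access, the Blob type and plan_size_eviction are hypothetical names, and the real cell also reconciles durable quota counters, which is omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blob:
    digest: str
    size_bytes: int
    last_access: float  # seconds since epoch (assumed age source)


def plan_size_eviction(blobs, max_bytes, lease_floor_seconds, now):
    """Pick blobs to evict, least recently used first, until under max_bytes.

    Blobs accessed within the lease floor are never eligible, so the plan may
    leave the store over budget rather than break an in-flight build.
    """
    total = sum(blob.size_bytes for blob in blobs)
    eligible = sorted(
        (b for b in blobs if now - b.last_access > lease_floor_seconds),
        key=lambda b: b.last_access,
    )
    evicted = []
    for blob in eligible:
        if total <= max_bytes:
            break
        evicted.append(blob)
        total -= blob.size_bytes
    return evicted, total
```

Filtering by the lease floor before ordering is what keeps the evicted-while-referenced tripwire at zero by construction: a blob inside the lease window never enters the candidate list.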
The first W4.4/TIN-1475 quota primitive is present as an in-process policy
behind GF_REAPI_QUOTAS. The policy is JSON with a default object and
optional instances map keyed by REAPI instance_name; each entry can set
maxConcurrentExecutions and maxBlobBytes, where zero means unlimited.
Incoming CAS blob-size breaches and Execute concurrency breaches return gRPC
ResourceExhausted and increment
gf_reapi_quota_rejected_total{dimension="execution_concurrency|blob_size"}.
maxCasBytes and maxAcEntries (W4.6/TIN-1718) add durable per-tenant
limits on total stored CAS bytes and action-cache entries. Their counters are
seeded from an authoritative backend scan at startup, maintained live on each
new write (dedup and overwrites do not double-count), and reconciled after every
GC sweep, so the accounting survives process restarts. Breaches emit
gf_reapi_quota_rejected_total{dimension="cas_bytes|ac_entries"}, and current
usage is exported as the gf_reapi_tenant_cas_bytes / gf_reapi_tenant_ac_entries
gauges (the data source for a per-tenant quota dashboard panel, which lives in
the observability lane). Durable accounting requires a usage-scannable backend:
the local backend supports it, while configuring a byte/entry quota on the
s3 backend fails startup because that retention belongs to bucket policy.
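A policy of the shape described for GF_REAPI_QUOTAS might look like the example below. This is an illustrative sketch: the field names come from the text above, but the concrete limits, the helper names, and the rule that a missing per-instance field falls back to the default object are assumptions.

```python
import json

# Hypothetical policy value; limits are illustrative only.
EXAMPLE_POLICY = """
{
  "default": {"maxConcurrentExecutions": 8, "maxBlobBytes": 0},
  "instances": {
    "spoke-alpha": {"maxConcurrentExecutions": 2, "maxBlobBytes": 4194304}
  }
}
"""


def limit_for(policy: dict, instance_name: str, key: str) -> int:
    """Resolve one limit; 0 means unlimited, as in the policy description."""
    default = policy.get("default", {})
    override = policy.get("instances", {}).get(instance_name, {})
    # Assumption: a missing per-instance field falls back to the default rule.
    return int(override.get(key, default.get(key, 0)))


def blob_size_allowed(policy: dict, instance_name: str, size_bytes: int) -> bool:
    """False corresponds to a ResourceExhausted blob_size rejection."""
    limit = limit_for(policy, instance_name, "maxBlobBytes")
    return limit == 0 or size_bytes <= limit


def execution_allowed(policy: dict, instance_name: str, in_flight: int) -> bool:
    """False corresponds to a ResourceExhausted execution_concurrency rejection."""
    limit = limit_for(policy, instance_name, "maxConcurrentExecutions")
    return limit == 0 or in_flight < limit
```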
The first W4.3/TIN-1474 executor-pool admission primitive is present behind
GF_REAPI_EXECUTOR_POOLS. The policy is JSON with an optional propertyName
(default Pool), a default rule, and optional instances overrides keyed by
REAPI instance_name; each rule lists allowedPools. When a rule has pools,
Execute loads the stored Action, reads its Action.platform exec property,
and rejects missing, duplicated, or unauthorized pool values before AC lookup or
execution. Rejections return gRPC PermissionDenied/InvalidArgument and
increment gf_reapi_executor_pool_rejected_total{reason="missing|unauthorized|ambiguous"}.
Admitted executions then pass through the first scheduler/placement seam, which
records gf_reapi_scheduler_enqueued_total,
gf_reapi_scheduler_started_total,
gf_reapi_scheduler_completed_total, gf_reapi_scheduler_queued,
gf_reapi_scheduler_inflight, and
gf_reapi_scheduler_queue_seconds_bucket/_sum/_count by REAPI
instance_name and pool before acquiring a scheduler worker lease. The first
operator-facing W5.3 fairness view lives in
docs/monitoring/gf-reapi-fairness-dashboard.json; it derives p95 queue time,
max/median tenant skew, queued/running executions, throughput, and worker-pool
saturation from those metrics. The first runner-side W5.4 TTFCH view lives in
docs/monitoring/gf-runner-ttfch-dashboard.json and is fed by
.github/workflows/ttfch-probe.yml rather than by the cell itself.
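The executor-pool admission rule described above (missing, duplicated, or unauthorized pool values are rejected before AC lookup) can be sketched as follows. The policy keys follow the text; PoolRejected, admit_pool, the pool names in the usage line, and the assumption that an instance override replaces the default rule wholesale are all hypothetical.

```python
class PoolRejected(Exception):
    """Carries a reason matching the rejection metric's reason label."""

    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


def admit_pool(policy: dict, instance_name: str, platform_properties: list):
    """Return the admitted pool name, or None when no pools are enforced.

    platform_properties is a list of (name, value) pairs from Action.platform.
    """
    property_name = policy.get("propertyName", "Pool")
    instances = policy.get("instances", {})
    # Assumption: an instance override replaces the default rule wholesale.
    if instance_name in instances:
        rule = instances[instance_name]
    else:
        rule = policy.get("default", {})
    allowed = rule.get("allowedPools", [])
    if not allowed:
        return None  # the rule lists no pools, so nothing is enforced
    values = [value for name, value in platform_properties if name == property_name]
    if not values:
        raise PoolRejected("missing")
    if len(values) > 1:
        raise PoolRejected("ambiguous")
    if values[0] not in allowed:
        raise PoolRejected("unauthorized")
    return values[0]
```

For example, admit_pool({"default": {"allowedPools": ["small"]}}, "default", [("Pool", "small")]) returns "small", while an action with no Pool property raises PoolRejected with reason "missing".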
GF_REAPI_WORKER_POOLS adds the first local
worker-pool dispatch guardrail: a JSON policy with a default rule and
optional pools overrides keyed by admitted executor-pool name, where slots
bounds concurrent local worker leases for that pool and optional static
workers give the lease concrete worker identity/provenance. Saturated pools
queue before worker start, and /metrics exports gf_reapi_worker_pool_slots,
gf_reapi_worker_pool_available_slots, and
gf_reapi_worker_pool_registered_workers. This is measurable scheduler
plumbing and a real local worker-pool inventory boundary.
GF_REAPI_WORKER_REGISTRY_TTL adds the first live heartbeat registry seam.
When enabled with GF_REAPI_WORKER_REGISTRY_TOKEN, proof-cell workers can post
authenticated heartbeats to /worker/heartbeat; non-expired live workers are
preferred for scheduler leases and reflected in Execute worker provenance.
Aggregate known/live/stale/leased/available counts are exported under
gf_reapi_worker_registry_*. This is still an in-memory, single-cell scheduler
registry. It proves live worker identity and heartbeat plumbing, not
distributed remote dispatch or a durable worker-control plane.
Within that proof-local boundary, CAS and action-cache entries are now keyed by
REAPI instance_name. Empty request fields map to default; explicit
spoke-<slug> traffic lands under that spoke’s local namespace; ByteStream
uses the leading path segment for instance routing. This closes the first
routing primitive for TIN-1472.
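The routing rule above can be sketched in a few lines. The accepted names (default, system, spoke-<slug>) come from this document and the resource-name shapes follow the REAPI ByteStream convention; the exact slug pattern and the function names are assumptions.

```python
import re

# Assumption: slugs are lowercase alphanumerics separated by single hyphens.
INSTANCE_RE = re.compile(r"^(default|system|spoke-[a-z0-9]+(?:-[a-z0-9]+)*)$")


def normalize_instance(name: str) -> str:
    """Map an empty instance_name to "default" and validate the rest."""
    name = name or "default"
    if INSTANCE_RE.match(name) is None:
        raise ValueError(f"invalid instance_name: {name!r}")
    return name


def split_bytestream_resource(resource_name: str) -> tuple[str, str]:
    """Split a ByteStream resource name into (instance_name, remainder)."""
    segments = resource_name.strip("/").split("/")
    # With an empty instance_name the resource starts directly with the
    # blobs/, uploads/, or compressed-blobs/ component.
    if segments[0] in ("blobs", "uploads", "compressed-blobs"):
        return "default", "/".join(segments)
    return normalize_instance(segments[0]), "/".join(segments[1:])
```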
The first W4.2/TIN-1473 authz primitive is also now present in code, but it is
explicitly opt-in. GF_REAPI_AUTHZ_MODE=warn|enforce enables validation of
RSA-signed OIDC-shaped bearer JWTs from configured JWKS issuers. In enforce
mode, the cell maps RPCs to scopes (cas:Read, cas:Write,
actioncache:Read, actioncache:Write, remoteexecution:Run) and rejects a
request whose token tenant/scope set does not authorize the request
instance_name. Execute can run with remoteexecution:Run, but it only reads
or populates the action cache when the same caller also has the corresponding
actioncache:Read or actioncache:Write scope. This is still not the full
tenant model. The first Bazel credential-helper slice is present as
gf-reapi-credhelper: it implements Bazel’s get protocol, reads a
short-lived JWT from GF_REAPI_CREDENTIAL_HELPER_TOKEN_FILE,
GF_REAPI_CREDENTIAL_HELPER_TOKEN, or the default projected-token path, and
returns an Authorization: Bearer header with an expiry one minute before the
token’s exp claim. That makes projected-token and explicit-token callers able
to exercise the authz middleware without putting credentials into .bazelrc.
Token exchange, default read-only policy, full IAM/OIDC tenant-claim rollout,
remote worker dispatch, and multi-replica behavior remain separate production
gates.
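The credential-helper behavior described above (a Bearer header that expires one minute before the token's exp claim) can be sketched as below. This is not the gf-reapi-credhelper source: the function names are hypothetical, the token is read without signature verification because the server performs validation, and the response shape (a headers map plus an RFC 3339 expires field) reflects the Bazel credential-helper protocol as understood here.

```python
import base64
import json
from datetime import datetime, timedelta, timezone


def jwt_exp(token: str) -> int:
    """Read the exp claim from a JWT payload without verifying the signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return int(json.loads(base64.urlsafe_b64decode(payload))["exp"])


def credential_response(token: str, skew_seconds: int = 60) -> dict:
    """Build a credential-helper get response for a bearer token."""
    expires = datetime.fromtimestamp(jwt_exp(token), tz=timezone.utc)
    expires -= timedelta(seconds=skew_seconds)
    return {
        "headers": {"Authorization": [f"Bearer {token}"]},
        "expires": expires.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

A real helper would print json.dumps(credential_response(token)) to stdout after reading the request from stdin. Reporting an expiry slightly earlier than exp lets the client refresh before the server starts rejecting the token.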
The first W2.1/TIN-1462 AC writer-attestation primitive is also present in code
behind GF_REAPI_AC_WRITE_ATTESTATION_MODE=off|warn|enforce. When enabled,
actioncache:Write is necessary but not sufficient: the caller’s validated JWT
sub must also match GF_REAPI_AC_WRITE_TRUSTED_SUBJECTS. In enforce mode,
direct UpdateActionResult writes from untrusted subjects return
PermissionDenied; Execute still returns the execution result but refuses to
populate the action cache when the writer subject is not trusted. The cell emits
gf_reapi_ac_write_rejected_total for rejected attempts and structured log
lines with the source RPC, instance name, tenant, subject, JTI, action digest,
and reject reason. This is AC-only by design; CAS writes remain governed by
digest verification and tenant authz.
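The "necessary but not sufficient" rule for AC writes can be written as a small decision function. This is a sketch only: ac_write_decision is a hypothetical name, and the treatment of warn mode (log the violation but still admit the write) is an assumption that the text above does not spell out.

```python
def ac_write_decision(mode: str, scopes: set, subject: str, trusted_subjects: set) -> str:
    """Return "allow", "warn", or "reject" for an action-cache write."""
    if mode not in ("off", "warn", "enforce"):
        raise ValueError(f"unknown attestation mode: {mode!r}")
    if "actioncache:Write" not in scopes:
        return "reject"  # the scope is necessary in every mode
    if mode == "off" or subject in trusted_subjects:
        return "allow"
    # The subject is untrusted from here on.
    # Assumption: warn mode records the violation but still admits the write.
    return "warn" if mode == "warn" else "reject"
```

In enforce mode a "reject" maps to PermissionDenied for direct UpdateActionResult calls, while Execute still returns its result and only skips the cache population.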
The first W2.3/TIN-1464 AC audit-log primitive is present as a cell-owned JSONL
append log. By default, rows land at
${GF_REAPI_STORE_ROOT}/audit/ac-writes.jsonl; GF_REAPI_AC_AUDIT_LOG_PATH
can point at an absolute path, a store-root-relative path, or off. Each row
records timestamp, source RPC, REAPI instance_name, JWT tenant/subject/JTI,
worker image digest, platform digest, action digest, outcome, gRPC code, reject
reason, and attestation mode. Accepted AC writes fail closed if the audit append
fails. Rejected writes keep the original PermissionDenied behavior while the
cell logs any audit append error. This is durable local evidence and an
in-process tenant query primitive; the operator Resource Usage API,
30-day retention policy, and dashboard surface remain follow-on production
gates.
The first W2.2/TIN-1463 AC entry-tagging primitive is present in code. The
cell strips caller-supplied GF platform tags and stores a server-attached tag in
ActionResult.execution_metadata.auxiliary_metadata with
worker_image_digest, platform_digest, platform_digest_recipe_version,
written_at, writer_subject, and writer_token_id. Direct
GetActionResult reads fail closed with FailedPrecondition when the tag is
missing, malformed, written by a different worker image digest, or mismatched
against the action platform digest. Execute treats the same refusal as an
unsafe cache miss and re-runs the action instead of returning a poisoned AC hit.
Refusals increment gf_reapi_ac_platform_tag_mismatch_total and append a
refused AC audit row. This is still a first primitive: heterogeneous worker
allow-lists, platform-recipe migration policy, and dashboard surfacing remain
follow-on gates.
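The read-side tag check described above can be sketched as a single validation function. The tag field names come from the text; the refusal reason strings and the name tag_refusal_reason are hypothetical, and the tag is modeled as a plain dictionary rather than the auxiliary_metadata encoding the cell actually uses.

```python
REQUIRED_TAG_FIELDS = (
    "worker_image_digest",
    "platform_digest",
    "platform_digest_recipe_version",
    "written_at",
    "writer_subject",
    "writer_token_id",
)


def tag_refusal_reason(tag, worker_image_digest: str, action_platform_digest: str):
    """Return None when the tag is acceptable, else a short refusal reason."""
    if tag is None:
        return "missing"
    if not isinstance(tag, dict) or any(f not in tag for f in REQUIRED_TAG_FIELDS):
        return "malformed"
    if tag["worker_image_digest"] != worker_image_digest:
        return "worker_image_mismatch"
    if tag["platform_digest"] != action_platform_digest:
        return "platform_mismatch"
    return None
```

A non-None result corresponds to FailedPrecondition on a direct GetActionResult and to an unsafe cache miss, followed by re-execution, on the Execute path.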
The first W2.4/TIN-1465 nuke-key primitive is also present for surgical AC
invalidation. scripts/gf-reapi-ac-nuke-key.py nuke requires an
instance_name, an action_digest, and a matching AC audit row by default.
When executed, it backs up and removes exactly one
instances/<instance_name>/ac/<hash>-<size>.pb entry, appends a nuke-key event,
and writes a quarantine tombstone under
instances/<instance_name>/ac-quarantine/<hash>-<size>.json. The server checks
that tombstone on AC writes: direct UpdateActionResult fails with
FailedPrecondition, while Execute returns the remote execution result but
does not populate the quarantined AC key. rollback restores from the backup
and removes the tombstone. See
gf-reapi-cell AC Nuke-Key Runbook.
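The on-disk layout named above can be derived from an instance name and an action digest. This is a minimal sketch: the path shapes come from the text, paths are relative to the store root, and the function names are hypothetical rather than taken from scripts/gf-reapi-ac-nuke-key.py.

```python
from pathlib import PurePosixPath


def ac_entry_path(instance_name: str, digest_hash: str, size_bytes: int) -> PurePosixPath:
    """Store-root-relative path of one action-cache entry."""
    return PurePosixPath("instances", instance_name, "ac", f"{digest_hash}-{size_bytes}.pb")


def ac_tombstone_path(instance_name: str, digest_hash: str, size_bytes: int) -> PurePosixPath:
    """Store-root-relative path of the quarantine tombstone for the same key."""
    return PurePosixPath(
        "instances", instance_name, "ac-quarantine", f"{digest_hash}-{size_bytes}.json"
    )
```

Keeping the tombstone under a sibling ac-quarantine directory means it sits outside the instances/<name>/{cas,ac} scope that the garbage collector sweeps.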
The first W2.5/TIN-1466 chaos gate is also present as a dedicated workflow and
contract test. just gf-reapi-ac-attestation-chaos-check runs the
TestActionCacheAttestationChaosRejectsNonAttestedWriterWithPermissionDenied
probe and validates the nightly/path-triggered workflow wiring. The probe gives
an authenticated caller actioncache:Write and tenant:spoke-alpha, but uses a
subject outside GF_REAPI_AC_WRITE_TRUSTED_SUBJECTS. Expected result:
UpdateActionResult returns gRPC PermissionDenied (HTTP 403 equivalent), no
AC entry is created, gf_reapi_ac_write_rejected_total increments, and the AC
audit log records one outcome=rejected / reject_reason=untrusted_subject
row. Authentication failures such as no token or wrong audience still fail
before AC attestation; this chaos gate is the non-attested writer case named by
TIN-1466.
For the live lab lane, the intended deployment boundary is:
- namespace: gf-rbe
- node class: sting or another explicit compute-expansion KVM/worker lane when
  available
- storage: node-local proof PVC such as local-path-sting-fast-ephemeral
- resource envelope: 4 CPU / 8Gi requested and 16 CPU / 16Gi limited for the
  single-cell proof deployment, because web TypeScript fanout can run many
  remote actions concurrently inside the proof cell. The memory limit preserves
  the accepted TypeScript proof envelope; the lower request keeps the pod
  schedulable while ARC runner pods drain after adjacent proof jobs.
- residency policy: the committed manifest and local/operator script default
  remain scale-to-zero between proof runs; --apply proof runs temporarily scale
  the deployment to one replica and wait for rollout. The GitHub
  gf-reapi-cell-proof.yml workflow default keeps the cell resident after a
  successful apply so GF dogfoods a stable REAPI endpoint for TTFCH and
  follow-on proofs. Operators can opt back into teardown with the
  scale_to_zero_after_proof input.
- no RustFS bucket, Attic bucket, or OpenTofu state bucket reuse
The scale-to-zero policy is the TIN-1249 capacity boundary for the committed
manifest and for local/operator bounded proof windows. It exists because a
resident gf-reapi-cell on sting can block the sting-pinned
tinyland-nix-heavy lane used by Platform Proof if its request is treated as
a standing reservation. The current GF-operated workflow intentionally differs:
while TTFCH and RBE dogfood are active, its workflow default keeps the cell
resident so hourly probes and back-to-back proofs do not race a missing service
endpoint. If capacity pressure returns, operators should either dispatch with
scale_to_zero_after_proof=true or move the cell behind the next scheduler /
worker-pool placement gate rather than letting TTFCH silently measure an absent
endpoint.
The reference manifest is deploy/gf-rbe/gf-reapi-cell.yaml. It is guarded by
just gf-reapi-cell-manifest-check, which verifies the proof namespace, local
proof storage class, digest-pinned image shape, platform identity, ingress
boundary, deny-egress policy, and idle replica policy. The proof-window capacity
assumption is guarded by just gf-reapi-cell-capacity-policy-check. The
publication/rendering path is guarded by
just gf-reapi-cell-publish-contract-check.
The image publication path writes registry credentials to a temporary Docker
authfile and passes that file to the nix2container.copyTo/skopeo copy path.
Do not replace that with --dest-creds or another token-bearing argv form:
runner process listings are operator-visible during live incident work, and the
publisher must keep registry write credentials out of child process arguments.
The proof-cell image carries only a bounded worker compatibility runtime. It
includes common POSIX tools, Node, Python, glibc, the Nix C/C++ wrapper
closure, the C++ runtime, zlib, and now the Chromium runtime path so
remote actions can run the currently proved JavaScript, packaging, pure-Go,
cgo-backed Go, Rust, C++, browser/web, and private app typecheck
target-class probes.
The current Worker Toolchain Model records this
runtime envelope. That does not make every language target eligible: target
classes still need forced proof evidence before promotion, and missing worker
runtime dependencies must stay recorded as blockers in
config/rbe-target-eligibility.json.
Render the manifest with the published gf-reapi-cell image digest before
applying it:
GF_REAPI_CELL_DIGEST="$(just gf-reapi-cell-resolve-digest --tag latest)"
GF_REAPI_CELL_DIGEST="${GF_REAPI_CELL_DIGEST}" \
bash ./scripts/render-gf-reapi-cell-manifest.sh | kubectl apply -f -
latest is only a lookup convenience. Proofs and manifests must keep recording
the resolved immutable digest, not a floating tag. The resolver reads GitHub
Packages/GHCR package metadata and rejects cosign signature objects as proof
image inputs.
The end-to-end proof harness records the rendered manifest, Kubernetes status, Bazel output, profile, and worker logs under an evidence directory:
GF_RBE_PROOF_MODE=explicit \
GF_RBE_PROOF_FORCE_EXECUTION=true \
GF_RBE_PROOF_BAZEL_COMMAND=build \
GF_REAPI_CELL_DIGEST=sha256:<published digest> \
bash ./scripts/run-gf-reapi-cell-proof.sh --apply --target //app:build
GF_RBE_PROOF_FORCE_EXECUTION=true is the default workflow posture. It passes
--remote_accept_cached=false, adds --nocache_test_results for test proofs,
injects a non-secret GF_RBE_PROOF_NONCE action environment value so
cache-warm target classes get fresh action keys, and rejects proof runs that
only show remote cache hits, because cache-hit continuity is not fresh
remote-worker evidence.
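The forced-execution posture above reduces to a small set of Bazel flags. This is a sketch, not the wrapper's actual code: forced_execution_flags is a hypothetical name, and passing the nonce through Bazel's --action_env flag is an assumption about how the harness injects the action environment value.

```python
def forced_execution_flags(bazel_command: str, nonce: str) -> list[str]:
    """Return the extra Bazel flags for a forced-execution proof run."""
    if bazel_command not in ("build", "test"):
        raise ValueError(f"unsupported bazel command: {bazel_command!r}")
    flags = ["--remote_accept_cached=false"]
    if bazel_command == "test":
        flags.append("--nocache_test_results")
    # Assumption: the nonce reaches actions through --action_env, which
    # changes action keys so cache-warm targets must execute again.
    flags.append(f"--action_env=GF_RBE_PROOF_NONCE={nonce}")
    return flags
```

The nonce is non-secret by design; its only job is to perturb action keys so a previously cached result cannot satisfy the proof.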
The current proof cell does not advertise Bazel remote cache compression, so
the proof harness also passes --noremote_cache_compression after consumer
configs. Broad/default RBE must either keep that compatibility override or
teach the production cell to advertise and serve compressed remote cache
traffic before accepting consumer .bazelrc compression defaults.
After a successful run, verify the downloaded artifact before citing it as countable evidence:
bash ./scripts/verify-gf-reapi-proof-artifact.sh \
--evidence-dir /path/to/gf-reapi-cell-proof \
--target //:public_vendor_handoff_fixture \
--require-force-execution \
--require-injected-repo was110_vendor_blobs \
--require-platform gloriousflywheel-rbe-linux-x86_64 \
--require-image-digest sha256:<published digest>
The manual workflow runs this verifier before uploading the artifact.
For future remote test candidates, set the workflow bazel_command input or
GF_RBE_PROOF_BAZEL_COMMAND=test. A build-mode proof of a js_test target is
not remote test evidence.
The first web proof is target=//docs-site:playwright_chromium_smoke with
bazel_command=test on browser-capable image digest
sha256:a567696e341f6eb0589ece9efd6014a2133a4f10831bdad31e8dd84055eff8a0.
Run 25712694947 reported 2549 processes: 1489 internal, 1060 remote,
remote sveltekit_sync, remote vite_build, remote
external/bazel_tools/tools/test/test-setup.sh docs-site/playwright_chromium_smoke_/playwright_chromium_smoke, and
docs-site Playwright Chromium smoke passed with /bin/chromium. The test uses
playwright-core against a system Chromium from the worker image; it must not
download a browser during the action. The smoke harness must also create
writable HOME, XDG_CONFIG_HOME, and XDG_CACHE_HOME directories in remote
scratch space before launching Chromium; the non-root proof cell runs with a
read-only root filesystem, and Chromium crashpad fails without a writable
profile/cache home.
The first public consumer Puppeteer proof is
target=//:puppeteer_chromium_smoke in tinyland-inc/omux.xoxd.ai on the
same browser-capable image digest. Run 25826953857 reported
3162 processes: 1 action cache hit, 1043 remote cache hit, 1982 internal, 137 remote, remote sveltekit_sync, remote vite_build, remote
external/bazel_tools/tools/test/test-setup.sh puppeteer_chromium_smoke_/puppeteer_chromium_smoke, and a passing
puppeteer-core smoke with /bin/chromium. That proof only counts because the
consumer disabled Puppeteer lifecycle browser downloads and launched the pinned
worker Chromium by explicit executable path.
The first public consumer standalone SvelteKit/Vite build proof is
target=//:build in tinyland-inc/omux.xoxd.ai on the same browser-capable
image digest. Run 25891956165 reported
3155 processes: 1 action cache hit, 1173 remote cache hit, 1978 internal, 4 remote, recorded proof nonce 20260514T234057Z-25891956165-1, and showed
remote lifecycle-hook actions for @tailwindcss/oxide and esbuild, remote
sveltekit_sync, and remote vite_build. That proof only counts because the
harness injected the non-secret GF_RBE_PROOF_NONCE action environment value
and the artifact verifier rejected cache-hit-only evidence.
The public omux Playwright static-output proof is
target=//:playwright_chromium_smoke in tinyland-inc/omux.xoxd.ai on the
same browser-capable image digest. Run 25897326537 reported 3162 processes: 1 action cache hit, 1174 remote cache hit, 1982 internal, 6 remote, recorded
proof nonce 20260515T024138Z-25897326537-1, and showed remote
@tailwindcss/oxide and esbuild lifecycle hooks, remote sveltekit_sync,
remote vite_build, remote external/bazel_tools/tools/test/test-setup.sh playwright_chromium_smoke_/playwright_chromium_smoke, remote
generate-xml.sh, and a passing Playwright Chromium smoke with
/bin/chromium. That proof only promotes one public consumer target class; it
does not prove broad Playwright, Vitest browser mode, hosted E2E, or Firefox
(WebKit is now proved separately for one consumer static-smoke class via run
27330688866).
The public jesssullivan.github.io consumer proofs add a second Puppeteer
class, a Playwright runtime-smoke class, and a SvelteKit/Vite build-smoke
class on the same browser-capable image. Runs 25777472760, 25894297074,
and 25779597385 reported 2331 processes: 1477 internal, 855 remote each
with bazel_command=test, forced execution, and remote test-setup evidence
for puppeteer_chromium_smoke, playwright_chromium_smoke, and
sveltekit_vite_build_smoke. The Playwright proof recorded proof nonce
20260515T005745Z-25894297074-1 and remote
test-setup.sh playwright_chromium_smoke_/playwright_chromium_smoke with
exit_code=0. These are target-class proofs only: the Puppeteer and
Playwright proofs depend on disabled browser lifecycle downloads, and the
SvelteKit/Vite proof does not imply publication, deployment, or hosted E2E.
The public jesssullivan.github.io Vitest refresh is
target=//:types_unit_tests on the same browser-capable image digest. Run
25892939448 reported 2331 processes: 1477 internal, 855 remote, recorded
proof nonce 20260515T001050Z-25892939448-1, and showed remote npm
extraction, remote lifecycle-hook actions for esbuild, sharp, and
puppeteer, and remote test-setup.sh types_unit_tests_/types_unit_tests
with exit_code=0. This is a public SvelteKit/Vite/Vitest unit-test target
class only, not broad/default web RBE.
Browser runtime authority for that proof is Chromium
138.0.7204.49 from pkgs.chromium at locked nixpkgs revision
9b008d60392981ad674e04016d25619281550a9d, exposed as
GF_RBE_CHROMIUM_EXECUTABLE=/bin/chromium in the worker image. Playwright and
Puppeteer target classes must consume that or another explicit browser
authority by executablePath; they must not run playwright install,
Puppeteer postinstall Chrome downloads, or npm/pnpm lifecycle browser downloads
inside REAPI actions.
Workflow Consumer Canary
The manual GF REAPI Cell Proof workflow can also run a public-input consumer
canary against the WAS-110 firmware workspace. Dispatch it with:
- image_digest: published gf-reapi-cell image digest
- target: //:public_vendor_handoff_fixture
- consumer_repository: Jesssullivan/8311-was-110-firmware-builder
- consumer_ref: main by default; override only for consumer repositories whose
  default branch is still different.
- was110_public_handoff: true
- force_execution: true
For the Darwin proof wrapper, resolve the same immutable digest before running readiness against the operator-provided macOS REAPI endpoint:
GF_REAPI_CELL_DIGEST="$(just gf-reapi-cell-resolve-digest --tag latest)"
just darwin-rbe-proof-readiness \
--image-digest "${GF_REAPI_CELL_DIGEST}" \
--target //build/macos:darwin_package_release_artifacts_unsigned \
--bazel-command build \
--remote-executor grpcs://<macos-reapi-host>:8980 \
--consumer-repository tinyland-inc/tummycrypt \
--check-gh-workflow \
--probe-endpoint
The WAS-110 canary path checks out the consumer repo, uses the consumer-owned public input
pinning scripts to materialize and verify vendor-blobs/public-community-repo,
passes it as
GF_BAZEL_INJECT_REPOSITORIES=was110_vendor_blobs=/absolute/vendor/repo, and
sets GF_RBE_PROOF_BAZEL_CONFIG= so the proof does not require
GloriousFlywheel’s .bazelrc inside the consumer workspace.
For private consumer repositories, set require_consumer_app_token=true. The
Actions workflow then requires the existing tranche-proof GitHub App secrets
and mints a repository-scoped checkout token for supported owners
(tinyland-inc and Jesssullivan) with contents: read. The token is used
only for the consumer checkout and is not persisted into the checked-out
workspace. If those secrets are absent, the workflow fails before
actions/checkout instead of producing a misleading RBE failure. Public
consumer proofs should leave
require_consumer_app_token=false; those checkouts use the workflow’s normal
read token and do not mint an App token.
If the GitHub App permission update is not available yet, operators may choose
the explicit alternate authority with
consumer_checkout_authority=repo-scoped-deploy-key or
consumer_checkout_authority=owner-scoped-secret instead of
require_consumer_app_token=true. The deploy-key path is preferred when the
operator can create read-only deploy keys on the consumer repositories. It only
supports fixed per-repo secrets:
GF_REAPI_CONSUMER_CHECKOUT_SSH_KEY_TINYLAND_DEV and
GF_REAPI_CONSUMER_CHECKOUT_SSH_KEY_MASSAGEITHACA. The token path only
supports the fixed owner-scoped secrets
GF_REAPI_CONSUMER_CHECKOUT_TOKEN_TINYLAND_INC and
GF_REAPI_CONSUMER_CHECKOUT_TOKEN_JESSSULLIVAN; the token must be a
repository-scoped read credential for the consumer repo. It is proof-only,
still uses persist-credentials: false, and must not be replaced by a broad
PAT or workflow input secret.
If the token mint step fails with "The permissions requested are not granted
to this installation.", the GitHub App installation does not currently grant
repository Contents: Read-only for the requested consumer repo. That is an
App permission and installation-approval blocker, not a Bazel target, worker,
or remote-execution failure. After the App permission is updated and the
installation is approved, rerun the same dispatch command and evaluate only the
new proof artifact.
Current private consumer evidence is explicit. MassageIthaca run 25928429263
used consumer_checkout_authority=repo-scoped-deploy-key, checked out the
private repo, forced execution, reported 3319 remote processes, and passed
//:booking_operation_unit_tests; that promotes one private booking-operation
Vitest class only. MassageIthaca run 25938855554 then used the same
repo-scoped deploy-key authority, forced execution, proof nonce
20260515T200641Z-25938855554-1, and passed //:svelte_check_test with
3319 remote processes plus remote sveltekit_sync_bin_/sveltekit_sync_bin,
test-setup.sh svelte_check_test_/svelte_check_test, and generate-xml.sh
evidence. That promotes one private SvelteKit/svelte-check class only.
MassageIthaca run 25948484331 used the same repo-scoped deploy-key authority,
forced execution, proof nonce 20260516T005553Z-25948484331-1, and passed
//:tsc_noemit_test with 3319 remote processes plus remote
sveltekit_sync_bin_/sveltekit_sync_bin, test-setup.sh tsc_noemit_test_/tsc_noemit_test, generate-xml.sh, and a 24.2s passing
TypeScript no-emit action. That promotes one private TypeScript no-emit class
only. MassageIthaca run 25953478878 used the same repo-scoped deploy-key
authority, forced execution, proof nonce 20260516T050753Z-25953478878-1, and
passed //:playwright_tmd_smoke with 3318 remote processes plus remote
sveltekit_sync_bin_/sveltekit_sync_bin, vite_build_bin_/vite_build_bin,
test-setup.sh playwright_tmd_smoke_/playwright_tmd_smoke,
generate-xml.sh, and a 4.5s passing Playwright TMD smoke action. That
promotes one private browser-smoke class only.
MassageIthaca run 25983800544 used the same repo-scoped deploy-key
authority, forced execution, proof nonce 20260517T064447Z-25983800544-1, and
passed //:sveltekit_node_build with 3193 remote processes plus remote
lifecycle-hook execution for esbuild, msw, and sharp, remote
sveltekit_sync_bin_/sveltekit_sync_bin, remote
vite_build_bin_/vite_build_bin, proof artifact verifier success, and
Kubernetes restart evidence that stayed at 0. That promotes one private
SvelteKit/Vite production-build class only.
tinyland.dev run 25928429273 also checked out through the
repo-scoped deploy-key path, then failed before target analysis because
the private tinyland-schemas archive URL returned 404 Not Found to
Bazel’s unauthenticated external repository fetch. The v0.2.4 tag/release
exists; the missing piece is private external-input auth, verified distdir
placement, or a future approved mirror/repository-cache authority. PR #682
added the codeload handoff and forced remote-first proof lane, and tinyland.dev
PR #401 fixed the Grafana Vitest Kubernetes-environment assertion. Main proof
25935041748 then passed //packages/tinyland-grafana:test with
1531 processes: 468 remote cache hit, 1059 internal, 4 remote, verified
tummycrypt_tinyland_schemas:0.2.4 in the proof distdir, and remote
test-setup.sh packages/tinyland-grafana/test_/test evidence. That promotes
one private tinyland.dev Grafana package Vitest target class only; it is not
durable private mirror authority or broad tinyland.dev RBE.
The first root tinyland.dev app typecheck proof attempt after the source target
was cleaned is not a successful target-class proof. Main run 25969813133
checked out tinyland-inc/tinyland.dev@main through the GitHub App path,
materialized the private tinyland-schemas distdir input, analyzed
//:app_typecheck, and reported 5472 processes: 2346 remote cache hit, 2988
internal, 138 remote. It failed because the gf-reapi-cell pod was OOMKilled
under the old 4Gi memory limit while remote TsProject actions were running. That
run is capacity/observability evidence for the proof cell, not acceptance for
the //:app_typecheck target class.
After the proof-cell memory envelope was corrected, main run 25970619559
passed tinyland-inc/tinyland.dev //:app_typecheck with GitHub App checkout
authority, the verified private tummycrypt_tinyland_schemas:0.2.4 distdir
handoff, forced execution, proof nonce 20260516T191944Z-25970619559-1,
5578 processes: 1 action cache hit, 2567 remote cache hit, 2955 internal, 56 remote, remote TypeScript tsc, remote Svelte build tool, remote Vite build
tool, remote app_typecheck_tool, proof artifact verifier success, and
Kubernetes restart evidence that stayed at 0 before and after the proof.
That promotes one private tinyland.dev root app typecheck target class only;
it is not all tinyland.dev builds, all tinyland.dev tests, browser E2E, the
Vite production build class, durable private mirror/repository-cache authority,
broad/default web RBE, or CAS/action-cache backend suitability.
Run 25978934708 then passed tinyland-inc/tinyland.dev //:app_build with
GitHub App checkout authority, the same verified private
tummycrypt_tinyland_schemas:0.2.4 distdir handoff, forced execution, proof
nonce 20260517T021820Z-25978934708-1, 6146 processes: 3125 remote cache hit, 2959 internal, 62 remote, remote TypeScript package fanout, remote
JsRunBinary app_build.log, proof artifact verifier success, and Kubernetes
restart evidence that stayed at 0 before and after the proof. That promotes
one private tinyland.dev root Vite/SvelteKit production-build target class
only; it is not all tinyland.dev builds/tests, browser E2E, deployed app
behavior, durable private mirror/repository-cache authority, broad/default web
RBE, or CAS/action-cache backend suitability.
Run 25981546207 then passed
tinyland-inc/tinyland.dev //packages/tinyland-activitypub:test with GitHub
App checkout authority, the same verified private
tummycrypt_tinyland_schemas:0.2.4 distdir handoff,
workspace_path=consumer-workspace, forced execution, proof nonce
20260517T044208Z-25981546207-1, 728 processes: 1 action cache hit, 299 remote cache hit, 415 internal, 14 remote, remote esbuild lifecycle-hook
execution, remote TypeScript tsc for packages/tinyland-content-types,
remote test-setup.sh packages/tinyland-activitypub/test_/test, remote
generate-xml.sh, proof artifact verifier success, and Kubernetes restart
evidence that stayed at 0 before and after the proof. That promotes one
private tinyland.dev ActivityPub package Vitest target class only; it is not
all tinyland.dev package tests, browser E2E, deployed app behavior, durable
private mirror/repository-cache authority, broad/default web RBE, or
CAS/action-cache backend suitability.
Run 25984827370 then passed
tinyland-inc/tinyland.dev //packages/tinyland-a11y-engine:typecheck with
GitHub App checkout authority, the same verified private
tummycrypt_tinyland_schemas:0.2.4 distdir handoff,
workspace_path=consumer-workspace, forced execution, proof nonce
20260517T073751Z-25984827370-1, consumer checkout commit
3730c6966d5e069cff92abc7c606fca9db5b54af, 553 processes: 223 remote cache hit, 328 internal, 2 remote, remote esbuild lifecycle-hook
execution, remote TypeScript tsc for packages/tinyland-color-utils, proof
artifact verifier success, and Kubernetes restart evidence that stayed at 0
before and after the proof. That promotes one private tinyland.dev package
TypeScript typecheck target class only; it is not all tinyland.dev package
typechecks, all TypeScript, Vite/SvelteKit builds, durable private
mirror/repository-cache authority, broad/default web RBE, or CAS/action-cache
backend suitability.
Operators can render or dispatch the workflow through:
just gf-reapi-cell-proof-dispatch -- \
  --image-digest sha256:<published digest> \
  --target //packages/tinyland-grafana:test \
  --bazel-command test \
  --workspace-path consumer-workspace \
  --consumer-repository tinyland-inc/tinyland.dev \
  --consumer-ref main \
  --consumer-checkout-authority repo-scoped-deploy-key \
  --tinyland-schemas-private-handoff \
  --apply
Add --dry-run to print the gh workflow run command without dispatching.
The dispatch itself is not RBE evidence; only the uploaded proof artifact can
promote a target class.
--tinyland-schemas-private-handoff mints a GitHub App token scoped to
tinyland-inc/tinyland-schemas, downloads the GitHub codeload tag archive
that matches the BCR-recorded archive/refs/tags/v0.2.4.tar.gz sha256 and
tinyland-schemas-0.2.4/ prefix, verifies that sha256, and places it in
BAZEL_DISTDIR before Bazel starts. It is proof-run staging only: it is not
durable mirror authority, repository-cache retention, CAS/action-cache
authority, or broad/default RBE.
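The verify-then-stage step of the handoff can be sketched as follows. This is a minimal sketch under stated assumptions, not the handoff's real implementation: the function name stage_verified_archive is hypothetical, the archive is assumed to be downloaded already, and sha256sum is assumed to be available on the runner. Token minting and the codeload download are deliberately left out.

```shell
# Hypothetical helper: copy an already-downloaded archive into a distdir
# only when its sha256 matches the expected (BCR-recorded) value.
stage_verified_archive() {
  archive="$1"
  expected_sha256="$2"
  distdir="$3"
  # sha256sum prints "<hash>  <file>"; keep only the hash field.
  actual_sha256="$(sha256sum "$archive" | cut -d' ' -f1)"
  if [ "$actual_sha256" != "$expected_sha256" ]; then
    echo "sha256 mismatch for $archive" >&2
    return 1
  fi
  # Stage the file only after verification succeeds.
  mkdir -p "$distdir"
  cp "$archive" "$distdir/"
}
```

A call such as `stage_verified_archive ./v0.2.4.tar.gz "$EXPECTED_SHA256" "$BAZEL_DISTDIR"` would run before Bazel starts; EXPECTED_SHA256 is a placeholder for the BCR-recorded digest.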
Branch proof 25930423009 showed the first live gate for this path: the
gloriousflywheel GitHub App installation lacked contents: read, so token
minting for tinyland-inc/tinyland-schemas failed before Bazel started. After
that permission was approved, main proof 25932330729 minted the token but
showed that the github.com/.../archive/refs/tags/... web URL still returned
404 to the installation token. The handoff therefore fetches the equivalent
direct codeload tag archive and still verifies the BCR-recorded sha256 before
Bazel sees the file. Branch proof 25932703830 then reached the target and the
test passed, but it did not count as RBE because tinyland.dev’s cache-backed
.bazelrc selected sandboxed,worker,local spawn strategy and remote-local
fallback. Branch proof 25933145419 then forced remote-first spawn strategy
and disabled remote-local fallback. That run reached
//packages/tinyland-grafana:test, reported 1531 processes: 1 action cache hit, 468 remote cache hit, 1059 internal, 4 remote, and executed the Vitest
test action on gf-reapi-cell-ff5f7699f-2td2v, but the remote test failed with
exit_code=1 because tests/grafana-config.test.ts hit the error
"GRAFANA_SERVICE_ACCOUNT_TOKEN is not set" under kubernetes environment
semantics. tinyland.dev PR #401 fixed that test-environment assumption, and
main proof 25935041748 passed after the merge with the same codeload handoff
and forced remote-first execution. The codeload distdir handoff is still
proof-run staging, not durable mirror, repository-cache, CAS/action-cache, or
broad RBE authority.
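When reading the process summaries quoted above, the "remote" count (actions executed on the worker) is easy to confuse with "remote cache hit". The sketch below extracts one category from a summary string. The function name summary_count is hypothetical, and this is an illustration only: it is not the proof artifact verifier, and it assumes the summary uses the comma-separated layout shown in this document.

```shell
# Hypothetical helper: print the count for one category from a Bazel
# process summary, or 0 when the category is absent.
summary_count() {
  summary="$1"
  category="$2"
  # Drop the "<total> processes: " prefix and any trailing period.
  breakdown="${summary#*processes: }"
  breakdown="${breakdown%.}"
  count=0
  old_ifs="$IFS"
  IFS=','
  for part in $breakdown; do
    part="${part# }"   # strip the space that follows each comma
    # Each part is "<count> <category>"; match on the exact category text
    # so that "remote" never matches "remote cache hit".
    if [ "${part#* }" = "$category" ]; then
      count="${part%% *}"
    fi
  done
  IFS="$old_ifs"
  echo "$count"
}
```

For the branch proof summary quoted above, `summary_count "$line" "remote"` prints 4 and `summary_count "$line" "remote cache hit"` prints 468.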
This canary is public-community input evidence only. Private WAS-110 blobs still require a separate private input and worker trust boundary before they can count as product RBE evidence.
Operator Invocation
The proof wrapper is non-default and requires an explicit opt-in:
GF_RBE_PROOF_MODE=explicit \
GF_BAZEL_SUBSTRATE_MODE=executor-backed \
BAZEL_REMOTE_CACHE=grpc://bazel-cache.nix-cache.svc.cluster.local:9092 \
BAZEL_REMOTE_EXECUTOR=grpc://gf-reapi-cell.gf-rbe.svc.cluster.local:8980 \
bash ./scripts/bazel-rbe-proof.sh --target //app:build
For a checked-out consumer repo, keep the same proof wrapper and pass the consumer workspace explicitly:
GF_RBE_PROOF_MODE=explicit \
GF_BAZEL_SUBSTRATE_MODE=executor-backed \
GF_RBE_PROOF_BAZEL_CONFIG= \
BAZEL_REMOTE_CACHE=grpc://bazel-cache.nix-cache.svc.cluster.local:9092 \
BAZEL_REMOTE_EXECUTOR=grpc://gf-reapi-cell.gf-rbe.svc.cluster.local:8980 \
GF_BAZEL_INJECT_REPOSITORIES=was110_vendor_blobs=/absolute/vendor/repo \
bash ./scripts/bazel-rbe-proof.sh \
  --workspace /absolute/consumer/workspace \
  --target //:public_vendor_handoff_fixture
GF_RBE_PROOF_BAZEL_CONFIG= intentionally omits --config=ci-cached for
consumer workspaces that do not define GloriousFlywheel’s .bazelrc config.
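The difference between an unset and an explicitly empty GF_RBE_PROOF_BAZEL_CONFIG can be sketched with shell parameter expansion. This is an assumption about how such a wrapper could behave, not a copy of scripts/bazel-rbe-proof.sh: the function name proof_config_args is hypothetical, and the ci-cached default for the unset case is inferred from the sentence above.

```shell
# Hypothetical helper: print the --config argument for the proof run.
# ${VAR-default} applies the default only when VAR is unset, so an
# explicitly empty value survives and suppresses the flag. The more common
# ${VAR:-default} form would treat empty and unset the same.
proof_config_args() {
  config="${GF_RBE_PROOF_BAZEL_CONFIG-ci-cached}"
  if [ -n "$config" ]; then
    echo "--config=$config"
  fi
}
```

With the variable unset this prints `--config=ci-cached`; with `GF_RBE_PROOF_BAZEL_CONFIG=` it prints nothing, which fits consumer workspaces that have no matching .bazelrc config.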
The normal product path is still scripts/bazel-cache-backed.sh; without
BAZEL_REMOTE_EXECUTOR, it remains the shared cache-backed contract. With
BAZEL_REMOTE_EXECUTOR and GF_BAZEL_SUBSTRATE_MODE=executor-backed, the same
wrapper can exercise the opt-in executor-backed path. The landed proofs are
real RBE implementation evidence: //app:build, //app:unit_tests,
//:deployment_bundle, //docs-site:build, and
//docs-site:playwright_chromium_smoke; the public Puppeteer/SvelteKit and
Playwright consumer proofs, including the public omux Playwright static smoke;
the private MassageIthaca SvelteKit/svelte-check, TypeScript no-emit,
Playwright TMD smoke, and SvelteKit/Vite production-build proofs; and the
WAS-110 public-input proofs. They are still narrow target-class proofs, not a
product claim that GloriousFlywheel broadly provides Bazel remote execution.
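The selection between the cache-backed contract and the opt-in executor-backed path can be sketched as a small predicate. The function name select_substrate_mode is hypothetical, and falling back to cache-backed when an executor URL is set without GF_BAZEL_SUBSTRATE_MODE=executor-backed is an assumption, not confirmed behavior of scripts/bazel-cache-backed.sh.

```shell
# Hypothetical helper: report which path the wrapper would take.
# Both conditions must hold for the executor-backed path:
#   - BAZEL_REMOTE_EXECUTOR is set and non-empty
#   - GF_BAZEL_SUBSTRATE_MODE is exactly "executor-backed"
select_substrate_mode() {
  if [ -n "${BAZEL_REMOTE_EXECUTOR:-}" ] &&
     [ "${GF_BAZEL_SUBSTRATE_MODE:-}" = "executor-backed" ]; then
    echo "executor-backed"
  else
    # Assumption: any other combination stays on the shared cache-backed contract.
    echo "cache-backed"
  fi
}
```

Requiring both variables keeps the executor-backed path an explicit opt-in, so an executor URL on its own does not change the default contract.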
The latest MassageIthaca TypeScript no-emit proof is
Jesssullivan/MassageIthaca //:tsc_noemit_test. Run 25948484331 used
consumer_checkout_authority=repo-scoped-deploy-key, checked out the private
repo, used bazel_command=test, forced execution, proof nonce
20260516T005553Z-25948484331-1, and the browser-capable worker image
recorded in the manifest. Bazel reported 7662 processes: 4 action cache hit, 4343 internal, 3319 remote; worker logs show remote lifecycle-hook execution
for esbuild, sharp, @sparticuz/chromium, msw, and
@vercel/speed-insights, remote sveltekit_sync_bin_/sveltekit_sync_bin,
remote external/bazel_tools/tools/test/test-setup.sh tsc_noemit_test_/tsc_noemit_test, and remote generate-xml.sh. The test
passed in 24.2s. This proves one private TypeScript no-emit target class
only, not all MassageIthaca tests, Playwright/Puppeteer browser tests,
deployed flows, or broad/default web RBE.
The latest MassageIthaca Playwright TMD proof is
Jesssullivan/MassageIthaca //:playwright_tmd_smoke. Run 25953478878 used
consumer_checkout_authority=repo-scoped-deploy-key, checked out consumer
commit 08555e16b9ee0504b1b23e6373b5b6bbfb799f5f, used
bazel_command=test, forced execution, proof nonce
20260516T050753Z-25953478878-1, and the browser-capable worker image
recorded in the manifest. Bazel reported 7670 processes: 3 action cache hit, 4352 internal, 3318 remote; worker logs show remote
sveltekit_sync_bin_/sveltekit_sync_bin, remote
vite_build_bin_/vite_build_bin, remote
external/bazel_tools/tools/test/test-setup.sh playwright_tmd_smoke_/playwright_tmd_smoke, and remote generate-xml.sh. The
test passed in 4.5s. This proves one private Playwright TMD browser-smoke
target class only, not all MassageIthaca tests, deployed flows, or
broad/default web RBE.
The latest MassageIthaca production-build proof is
Jesssullivan/MassageIthaca //:sveltekit_node_build. Run 25983800544 used
consumer_checkout_authority=repo-scoped-deploy-key, checked out consumer
commit e06a70d12417f04568092a62e225b6c6595c3b39, used
bazel_command=build, forced execution, proof nonce
20260517T064447Z-25983800544-1, and the browser-capable worker image
recorded in the manifest. Bazel reported 7379 processes: 2 action cache hit, 4186 internal, 3193 remote; worker logs show remote lifecycle-hook execution
for esbuild, msw, and sharp, remote
sveltekit_sync_bin_/sveltekit_sync_bin, and remote
vite_build_bin_/vite_build_bin. The proof artifact verifier passed and
Kubernetes restart evidence stayed at 0. This proves one private
SvelteKit/Vite production-build target class only, not all MassageIthaca
builds/tests, deployed booking E2E, image publication, durable private mirror
authority, or broad/default web RBE.
The latest tinyland.dev package typecheck proof is
tinyland-inc/tinyland.dev //packages/tinyland-a11y-engine:typecheck. Run
25984827370 used consumer_checkout_authority=github-app,
workspace_path=consumer-workspace, checked out consumer commit
3730c6966d5e069cff92abc7c606fca9db5b54af, staged the verified private
tummycrypt_tinyland_schemas:0.2.4 distdir input, used
bazel_command=build, forced execution, proof nonce
20260517T073751Z-25984827370-1, and the browser-capable worker image
recorded in the manifest. Bazel reported 553 processes: 223 remote cache hit, 328 internal, 2 remote; worker logs show remote esbuild lifecycle-hook
execution and remote TypeScript tsc for packages/tinyland-color-utils. The
proof artifact verifier passed and Kubernetes restart evidence stayed at 0.
This proves one private package TypeScript typecheck target class only, not all
tinyland.dev packages, all TypeScript, durable private mirror authority, or
broad/default web RBE.
Main run 25608601158 proved //docs-site:build with the Bazel build
command, forced execution, 2529 processes: 1483 internal, 1046 remote, and
remote JsRunBinary evidence for docs-site/.svelte-kit and
docs-site/build. This proves static docs-site rendering only, not docs
publication or deployment. Its earlier default-branch proof attempt, run
25607350105, remains inventory evidence only because it failed before remote
execution on the old parent-package markdown glob.