SafeDisk AI

Worker File Spill Disk Full Silent Truncation

When a workflow worker spills large payloads to disk, a swallowed write error can be worse than a crash: downstream nodes may receive a short preview while operators believe the full value was persisted.

Agent workflow storage

Turn silent spill failures into visible, bounded fallback behavior.

Preserve the runtime fallback if that is intentional, but log the failed write with the key, spill path, exception, payload size, and session storage health. Then test ENOSPC, permission denial, and missing directory separately.

df -h "$SESSION_DATA_DIR"; df -i "$SESSION_DATA_DIR"; test -w "$SESSION_DATA_DIR"

Policy pilot

Make one silent spill failure testable.

$99 for one workflow or agent incident: failure taxonomy, logging contract, and regression checklist.

No payload contents, secrets, or private workflow logs. A public-safe summary is enough.

First Response Runbook

  1. Keep the fallback explicit: downstream nodes should know whether they received full payload, file reference, or truncated preview.
  2. Log spill-write failure with the buffer key, intended path, payload byte size, and exception details.
  3. Classify ENOSPC, EDQUOT, EACCES, EPERM, EIO, and missing session data directory separately.
  4. Emit a metric for spill failures so operators can alert before workflows silently degrade.
  5. Add tests for disk full, permission denied, and invalid path; assert the warning and fallback contract.
  6. Keep payload contents out of logs. Log size, key, path, and error, not the value itself.

Copy-ready issue reply

Use this when file-spill errors are swallowed.

The goal is to preserve operator visibility while keeping private payloads out of logs.

I would keep the fallback, but make the fallback contract visible. A swallowed spill-write error means downstream nodes may receive a truncated preview while operators think the full value was persisted.

Acceptance checks I would add:

- Inject ENOSPC/EACCES/EIO on the session data directory and assert a warning includes key name, intended spill path, payload size, and exception.
- Assert the downstream payload explicitly marks whether it is full inline data, a file reference, or a truncated preview.
- Add a metric/count for spill failures so operators can alert before many transitions degrade silently.
- Keep payload contents out of logs; log size/path/error only.
- Add one regression where disk fills mid-run and the workflow still exposes the degraded state.
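One way to implement the logging and fallback contract from the runbook and the acceptance checks above, as an added Python example that is not SafeDisk AI's own code: the errno taxonomy, the `spill` function, the `spill_failures` counter and the `fault_writer` injector are assumptions made here, and the fault injector stands in for the disk-full, permission-denied and missing-directory tests.

```python
import errno
import logging
import os
from dataclasses import dataclass
from typing import Callable, Optional

log = logging.getLogger("spill")
# Failure taxonomy from the runbook, keyed by errno (mapping constructed here).
FAILURE_CLASSES = {
    errno.ENOSPC: "disk_full", errno.EDQUOT: "quota_exceeded",
    errno.EACCES: "permission_denied", errno.EPERM: "not_permitted",
    errno.EIO: "io_error", errno.ENOENT: "missing_session_dir",
}
spill_failures: dict = {}  # stand-in for a metrics counter, labelled by class

@dataclass
class SpillResult:
    kind: str                 # "inline" | "file_ref" | "truncated_preview"
    value: str                # payload text, file path, or preview text
    failure: Optional[str] = None

def _write_file(path: str, data: bytes) -> None:
    # ENOSPC may only surface at flush/close; 'with' closes, so the error propagates.
    with open(path, "wb") as fh:
        fh.write(data)

def spill(key: str, payload: bytes, session_dir: str, inline_limit: int = 1024,
          preview_len: int = 64, writer: Callable = _write_file) -> SpillResult:
    """Persist a large payload to disk; on failure, fall back to a labelled preview."""
    if len(payload) <= inline_limit:
        return SpillResult("inline", payload.decode("utf-8", "replace"))
    path = os.path.join(session_dir, f"{key}.bin")
    try:
        writer(path, payload)
        return SpillResult("file_ref", path)
    except OSError as exc:
        cls = FAILURE_CLASSES.get(exc.errno, "unknown")
        spill_failures[cls] = spill_failures.get(cls, 0) + 1
        # Log key, path, size and error only; payload contents never reach the log.
        log.warning("spill failed key=%s path=%s bytes=%d class=%s errno=%s err=%s",
                    key, path, len(payload), cls, exc.errno, exc.strerror)
        preview = payload[:preview_len].decode("utf-8", "replace")
        return SpillResult("truncated_preview", preview, failure=cls)

def fault_writer(code: int) -> Callable:
    """Test helper: a writer that fails with the given errno instead of touching disk."""
    def _fail(path: str, data: bytes) -> None:
        raise OSError(code, os.strerror(code), path)
    return _fail
```

Downstream nodes branch on `SpillResult.kind`, so a truncated preview can never be mistaken for the full value, and `spill_failures` is what an alert rule would watch.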

Paid Scope

The $29 incident triage reviews one failure and returns the safest next diagnostic step. The $99 team pilot turns one representative silent spill incident into a logging contract, alert rule, and regression checklist for your workflow or agent runtime.