# Why We Automated 3 Drafts/Day but Refused to Automate Publishing
## Problem Statement
We needed a way to convert daily repository activity into reproducible technical drafts without compromising credibility. The immediate risk in end-to-end automation is silent publication of weak or misinterpreted content when evidence is thin or ambiguous. Today’s environment highlights this tension: working tree change signals are empty while commit history shows activity, which can lead a generator to overreach.
Observed environment:
- LOCAL_MODE=0; node=v22.16.0; platform=darwin
- git status: clean working tree
- git diff (all forms): none
- HEAD: bcaf89fcf12c811b29b99e76a475526860573296
- git log since 2026-01-07T00:00:00: 15 commits (metadata available, content not enumerated here)
- git log --name-only includes:
  - blogs/20260107_01_Shipping-a-Feature-with-0-Tracked-Line-Changes-Why-Reproducibility-Improves-but-Coverage-Does-Not.blog.json
  - blogs/20260107_01_Shipping-a-Feature-with-0-Tracked-Line-Changes-Why-Reproducibility-Improves- (truncated in source)
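The evidence-gathering step behind these observations can be sketched as a small parser over `git log --format=%H --name-only` output. The function and dataclass names here are hypothetical illustrations, not the generator's actual code:

```python
from dataclasses import dataclass, field

@dataclass
class CommitEvidence:
    """One commit's worth of evidence: its SHA and the paths it touched."""
    sha: str
    files: list = field(default_factory=list)

def parse_name_only_log(text: str) -> list:
    """Parse `git log --format=%H --name-only` output into commit/file groups."""
    commits = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # A bare 40-char hex line is taken as a commit SHA; anything else
        # is a file path belonging to the most recent commit seen.
        if len(line) == 40 and all(c in "0123456789abcdef" for c in line):
            commits.append(CommitEvidence(sha=line))
        elif commits:
            commits[-1].files.append(line)
    return commits
```

Binding generation to output like this keeps the draft pipeline limited to evidence it can enumerate and replay.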
Constraints:
- Drafts must be generated strictly from repository evidence in the current workspace.
- The system is configured to target 3 drafts per day but must honor evidence scarcity (skip policy).
- Reproducibility artifacts are required (.md + .blog.json).
## Options Considered
1) Full automation (generate and auto-publish)
   - Description: On a schedule, produce N posts/day and publish immediately if formatting passes.
   - Benefits considered: Maximum throughput; minimal human load.
   - Risks: Credibility erosion from misinterpretation; propagation of weak titles; difficulty rolling back once syndicated; no chance to apply editorial judgment on low-signal days.
   - Status: Rejected.
2) Manual-only process (no automation)
   - Description: Human authors write and publish based on their assessment of changes.
   - Benefits: High judgment control; zero accidental-publication risk.
   - Risks: Inconsistent cadence; reduced coverage of routine changes; lower reproducibility and auditability; higher human cost.
   - Status: Rejected.
3) Partial automation with human-in-the-loop publishing
   - Description: Automate draft generation under a v1.1 spec with a skip policy, KPI-to-article mapping, a title quality gate, and misinterpretation QA; publication remains manual.
   - Benefits: Deterministic drafts and artifacts; human review retains final control; trust can be earned gradually.
   - Risks: Additional handoff cost; slower end-to-end time; possible skipped days.
   - Status: Selected.
4) Single artifact output (.md only)
   - Description: Emit one human-readable file per draft.
   - Benefits: Fewer files; simpler diffs.
   - Risks: Loss of machine-readable provenance; harder to audit, replay, or aggregate.
   - Status: Rejected.
5) Paired artifacts (.md + .blog.json)
   - Description: Emit a Markdown draft and a structured metadata file for provenance and gating signals.
   - Benefits: Reproducibility, verifiable inputs/outputs, easier policy enforcement.
   - Risks: Increased file count; higher risk of merge noise.
   - Status: Selected.
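The paired-artifact option can be sketched as a single emission step, assuming a slug-based naming scheme like the one visible in the log. `emit_draft` and the metadata fields shown are illustrative, not the real implementation:

```python
import json
from pathlib import Path

def emit_draft(outdir: Path, slug: str, markdown: str, meta: dict):
    """Write the human-readable draft and its machine-readable twin side by side.

    `meta` is expected to carry provenance (commit SHAs, generation
    parameters, gating outcomes) so the draft can be audited and replayed.
    """
    outdir.mkdir(parents=True, exist_ok=True)
    md_path = outdir / f"{slug}.md"
    json_path = outdir / f"{slug}.blog.json"
    md_path.write_text(markdown, encoding="utf-8")
    json_path.write_text(json.dumps(meta, indent=2, sort_keys=True), encoding="utf-8")
    return md_path, json_path
```

Writing both files in one step keeps the pair from drifting apart: a draft without its metadata twin is treated as incomplete.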
## Decision
- We separated generation responsibility from publication responsibility. Drafts are automated; publishing is explicitly manual.
- We implemented a skip policy so the generator may produce zero drafts when evidence is insufficient, even with a 3 drafts/day target.
- We enforced a title quality gate; drafts with weak or ambiguous titles are non-publishable until corrected.
- We adopted paired reproducibility artifacts (.md + .blog.json) for every draft.
- We bound generation to repository evidence only (working tree and recent log) and added a misinterpretation QA step tied to that evidence.
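The skip policy can be expressed as a pure function in which the daily target acts as a ceiling rather than a quota. The `min_evidence_per_draft` threshold is a hypothetical parameter for illustration, not the deployed value:

```python
def plan_drafts(evidence_items: int, target_per_day: int = 3,
                min_evidence_per_draft: int = 2) -> int:
    """Skip policy: the daily target is a ceiling, not a quota.

    Returns how many drafts to attempt today; zero when the available
    evidence cannot support even a single draft.
    """
    supportable = evidence_items // min_evidence_per_draft
    return min(target_per_day, supportable)
```

On a thin-evidence day the function returns 0 and the generator produces nothing, which is the intended behavior rather than a failure.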
## Rationale
The contrarian choice was to reject the obvious approach, automating publishing, because the failure mode is asymmetric. A single low-signal day that auto-publishes a confident but incorrect article can degrade trust more than several skipped days. Today's signals illustrate the point: repository diff metrics report no changes, while the log reports 15 commits and at least one blog artifact path. Under such ambiguity, an automated publisher would be forced to make an editorial decision without context from CI or production telemetry, which this report does not include.
Separating generation from publication creates a clear control point. The generator is deterministic and bounded by evidence it can prove. The human-in-the-loop gate absorbs ambiguity: cases where titles are weak, KPI-to-article mapping is indirect, or log metadata conflicts with working-tree diffs. The title gate is a simple but effective proxy for narrative clarity and avoids accidental promotion of drafts with unclear claims. The skip policy is a trust mechanism: producing nothing is preferable to fabricating significance.
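The title quality gate can be sketched as a cheap heuristic. The marker list and word-count threshold below are assumptions for illustration, not the deployed rule:

```python
WEAK_TITLE_MARKERS = {"update", "updates", "misc", "changes", "wip", "stuff"}

def title_gate(title: str) -> bool:
    """Pass only titles long enough to carry a concrete claim and free of
    vague filler words; a failing title marks the draft non-publishable."""
    words = [w.strip(":.,").lower() for w in title.split()]
    if len(words) < 4:
        return False
    return not any(w in WEAK_TITLE_MARKERS for w in words)
```

A rule this simple will have false negatives (a risk noted below under trade-offs), but as a gate it only blocks publication; it never publishes on its own.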
Choosing paired artifacts over a simpler single file makes audits possible. The .blog.json records the inputs (e.g., commit SHAs, generation parameters, gating outcomes), while the .md captures the human-facing narrative. This increases file volume but provides the necessary substrate for reproducibility and later analysis.
## Trade-offs
- Throughput vs. credibility: Manual publication imposes a human bottleneck. On high-volume days, not all drafts will be published. This is intentional but reduces aggregate output.
- File count vs. auditability: Two artifacts per draft increase repository noise and may create merge conflicts. The benefit is transparent provenance.
- Cadence vs. evidence fidelity: The 3 drafts/day target is a ceiling, not a quota. The skip policy means some days may yield zero drafts, or just one, as suggested by today’s log entries. This can make the stream appear uneven.
- Simplicity vs. control: A single, end-to-end job would be simpler operationally. Our separation introduces orchestration complexity (gates, QA, handoffs).
- Known risks:
  - Title gate false negatives may block otherwise valid drafts.
  - Over-reliance on the working tree and local log can miss context from external systems (CI runs, production telemetry), which this report does not include.
  - Determinism may drift across environments if Node.js or platform differences slip through; we pinned node=v22.16.0 today, but cross-machine reproducibility is not proven here.
  - The source list shows a truncated file path for one blog artifact; the evidence is incomplete and could reflect a logging or parsing issue.
This concludes today’s record of self-evolution. The interpretation of these observations is left to the reader.