fix(aok): three critical fixes from run #24824838890 critic review by Copilot · Pull Request #28106 · github/gh-aw

Copilot · 2026-04-23T14:03:04Z

The critic review of consolidated Agentic Optimization Kit run #24824838890 identified three critical bugs. Each fix is a separate commit, isolated by root cause.

Fix 1 — Percentage formatting (`f2015a6`)

Bug: The escalation issue body contained "875%" where "87.5%" was meant (7 failures out of 8 runs). The agent computed 87.5 correctly but dropped the decimal point, producing a nonsensical value.

Root cause: Phase 6 had no formatting instruction for failure rate percentages.

Fix: Added an explicit formula to the "Issue must:" line in Phase 6:

"divide failures by total runs and multiply by 100, then format to one decimal place (e.g., 7/8 = 87.5%, not 875%). Verify every percentage is in [0%, 100%]."

Also added a matching Guardrails bullet that applies the same rule to the discussion body.
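
The formatting rule can be sketched in Python as follows. This is only an illustration of the arithmetic the prompt now spells out; the function name `format_failure_rate` is hypothetical and does not appear in the workflow, which instructs the agent in prose rather than running code.

```python
def format_failure_rate(failures: int, total_runs: int) -> str:
    """Return failures/total_runs as a percentage with one decimal place."""
    if total_runs <= 0:
        raise ValueError("total_runs must be positive")
    if not 0 <= failures <= total_runs:
        raise ValueError("failures must be between 0 and total_runs")
    rate = failures / total_runs * 100
    # Range check mirroring the guardrail: every percentage must be in [0%, 100%].
    if not 0.0 <= rate <= 100.0:
        raise ValueError(f"failure rate out of range: {rate}")
    return f"{rate:.1f}%"


print(format_failure_rate(7, 8))  # 87.5%
```

Rejecting `failures > total_runs` up front means a value like 875% can never be formatted silently; it surfaces as an error instead.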

Fix 2 — Period date accuracy (`85cfd7c`)

Bug: The executive summary stated "Period: 2026-04-22 to 2026-04-23 (7-day snapshot)" — a 1-day span incorrectly labelled as a 7-day window.

Root cause: The prompt template used YYYY-MM-DD to YYYY-MM-DD placeholders with no authoritative data source. The agent inferred start/end from the earliest/latest run timestamps in the dataset rather than computing from the current date. The --start-date -7d flag used in the download step was never captured anywhere the agent could read it.

Fix:

- At the top of the "Download Copilot workflow logs" step, compute `PERIOD_START=$(date -u -d "7 days ago" +%Y-%m-%d)` and `PERIOD_END=$(date -u +%Y-%m-%d)` and write them to `/tmp/gh-aw/token-audit/period.env`.
- Add `period.env` to the Data Inputs list so the agent knows the file exists.
- Update the discussion template line to: "PERIOD_START to PERIOD_END (7-day window) — read the exact start and end dates from /tmp/gh-aw/token-audit/period.env via cat; do not infer these dates from run timestamps".
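
For readers who want to see the date arithmetic outside the shell, here is a minimal Python sketch of the same idea. The workflow itself uses the `date` commands above; the helper names `write_period_env` and `read_period_env`, the `KEY=VALUE` file layout, and the temporary directory are assumptions made for illustration.

```python
import tempfile
from datetime import datetime, timedelta, timezone
from pathlib import Path


def write_period_env(path, now, window_days=7):
    """Write PERIOD_START and PERIOD_END (UTC dates) to path as KEY=VALUE lines."""
    # `now` must be timezone-aware; the window is computed from the clock,
    # never from run timestamps in the dataset.
    end = now.astimezone(timezone.utc).date()
    start = end - timedelta(days=window_days)
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(
        f"PERIOD_START={start.isoformat()}\nPERIOD_END={end.isoformat()}\n"
    )
    return start.isoformat(), end.isoformat()


def read_period_env(path):
    """Parse KEY=VALUE lines back into a dict."""
    lines = Path(path).read_text().splitlines()
    pairs = (line.split("=", 1) for line in lines if "=" in line)
    return {key: value for key, value in pairs}


# Demo with a fixed clock so the result is reproducible.
demo_path = Path(tempfile.mkdtemp()) / "token-audit" / "period.env"
write_period_env(demo_path, datetime(2026, 4, 23, 14, 3, tzinfo=timezone.utc))
period = read_period_env(demo_path)
print(period["PERIOD_START"], "to", period["PERIOD_END"], "(7-day window)")
# 2026-04-16 to 2026-04-23 (7-day window)
```

With the dates written to a file at download time, the report's period line no longer depends on which runs happen to be in the dataset.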

Fix 3 — Delivery status transparency (`98750f4`)

Bug: The agent's completion summary stated "Phase 6: Created escalation issue for Smoke Copilot (P1) and Architecture Diagram Generator (P2)" — but the safe_outputs job failed with a GitHub API rate limit (HTTP 403, retried 4×). No issue was ever posted. The agent reported false success.

Root cause: The safeoutputs MCP tools create_issue and create_discussion return success as soon as the action is queued, not when GitHub delivers it. Delivery happens in the downstream safe_outputs job, which runs after the agent session ends and can fail independently, so the agent had no way to know the outcome.

Fix:

- Added a Delivery note block at the end of Phase 6 explaining the async queue model and instructing the agent to say "submitted for delivery via safe-outputs" instead of "created" in completion summaries.
- Added a matching Guardrails bullet to reinforce this across the full session.
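
The queue-then-deliver model behind this fix can be illustrated with a toy Python sketch. Nothing here is the real safeoutputs MCP API: `SafeOutputQueue`, `deliver`, `summary_line`, and `rate_limited_post` are hypothetical names that only mimic the behavior described above, where queuing succeeds during the session and delivery may fail later.

```python
from dataclasses import dataclass, field


@dataclass
class SafeOutputQueue:
    """Hypothetical stand-in for the queue; not the real safeoutputs MCP."""
    pending: list = field(default_factory=list)

    def create_issue(self, title):
        # Queuing succeeds immediately; nothing has reached GitHub yet.
        self.pending.append(title)
        return {"status": "queued", "title": title}


def deliver(queue, post):
    """Runs after the agent session; each post can fail independently."""
    results = {}
    for title in queue.pending:
        try:
            post(title)
            results[title] = "delivered"
        except RuntimeError as err:
            results[title] = f"failed: {err}"
    return results


def summary_line(receipt):
    # The agent only knows the item was queued, so it avoids a stronger claim.
    return (
        f"Phase 6: escalation issue '{receipt['title']}' "
        "submitted for delivery via safe-outputs"
    )


def rate_limited_post(title):
    # Simulates the downstream job hitting a rate limit.
    raise RuntimeError("HTTP 403 rate limit")


queue = SafeOutputQueue()
receipt = queue.create_issue("Smoke Copilot (P1)")
print(summary_line(receipt))
print(deliver(queue, rate_limited_post))
```

The receipt says only "queued", and the failure shows up in `deliver`, which the agent never observes. That is why the summary wording has to stop at "submitted for delivery".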

All three changes are in .github/workflows/agentic-optimization-kit.md only. The workflow compiles cleanly (0 errors, 0 warnings). Code review passed with no comments.

🤖 Smoke CI run #24839692815 completed successfully on 2026-04-23T14:04:11Z.

Generated by Smoke CI · ● 397.9K · ◷

…ypos

The escalation issue generated in run #24824838890 contained "875%" where "87.5%" was meant (7/8 run failures). This happened because there was no explicit instruction to format failure rates correctly.

Two guardrails added to agentic-optimization-kit.md:
1. In Phase 6 "Issue must:" line: explicit formula (failures/runs×100, 1 decimal place, range-checked to [0%,100%]).
2. In the Guardrails section: a short restatement for the discussion.

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/7e192d3c-abd2-4d17-8f65-21635650eb32
Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>

Run #24824838890 published "Period: 2026-04-22 to 2026-04-23 (7-day snapshot)", a 1-day span labelled as 7 days. The agent inferred dates from the run dataset timestamps instead of computing them arithmetically.

Changes in agentic-optimization-kit.md:
- Add date computation at top of "Download logs" step: writes PERIOD_START (today-7d) and PERIOD_END (today) to period.env
- Add period.env to the Data Inputs inventory so the agent knows the file exists
- Update discussion template from placeholder "YYYY-MM-DD to YYYY-MM-DD" to an explicit instruction: read dates from period.env via cat and do not infer them from run timestamps

Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>

…ess claims

Run #24824838890's agent concluded "Created escalation issue for Smoke Copilot (P1)", but the safe_outputs job failed with a GitHub API rate limit, so no issue was ever posted. The agent saw the safeoutputs MCP return success (it only queues the action), so it had no way to know delivery had failed.

Changes in agentic-optimization-kit.md:
- Add a Delivery note block at the end of Phase 6: explains that create_issue/create_discussion via safeoutputs MCP only queues the item; the actual API call happens in the downstream safe_outputs job. Instructs the agent to say "submitted for delivery via safe-outputs" rather than "created" in completion summaries.
- Add a matching Guardrails bullet: "Safe-output delivery is asynchronous — use submitted for delivery instead of created."

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/7e192d3c-abd2-4d17-8f65-21635650eb32
Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>

Copilot AI and others added 3 commits April 23, 2026 13:53

Copilot AI assigned Copilot and mnkiefer Apr 23, 2026

Copilot created this pull request from a session on behalf of mnkiefer April 23, 2026 14:04 View session

Copilot finished work on behalf of mnkiefer April 23, 2026 14:05

Copilot AI requested a review from mnkiefer April 23, 2026 14:05

Copilot wants to merge 3 commits into main from copilot/compare-latest-runs-agents
