Skip to content

Add an option to rerun failed executions #17

@thymikee

Description

@thymikee

SkillGym suites can include nondeterministic agent runs where a single case x runner execution fails transiently. It would be useful to have a built-in way to retry only failed executions before the final result is reported.

Potential shape:

  • config: run.retryFailed?: number
  • CLI: skillgym run ... --retry-failed <n>
  • behavior: rerun only failed case x runner executions up to n additional attempts
  • reporting: preserve all attempt artifacts and show whether the final pass came from a retry

This would avoid custom wrapper scripts around --case / --runner, and would make broad benchmark runs easier to use when one agent response is flaky.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Ready for implementationImplementation plan is approved and implementation should follow.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions