Repeatability
High
The task is structurally identical every time: poll an API, classify results by a fixed taxonomy, write a JSON file to S3. No judgment varies between runs.
Ambiguity Tolerance
High
Success criteria are explicit: the failure-reason categories, the 6-hour polling interval, the 30-day window, the S3 export, retry logic with exponential backoff, and logging are all named. An agent can verify its own output against these requirements.
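To make the retry requirement concrete, here is a minimal sketch of exponential backoff with jitter using only the standard library. The function name, the default attempt count, and the delay values are illustrative assumptions, not part of the stated requirements.

```python
import random
import time


def retry_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep):
    """Call fn() until it succeeds or max_attempts calls have failed.

    Hypothetical helper: the defaults here are placeholders. A real script
    would likely catch only transient errors rather than every Exception.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The delay ceiling doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random amount between 0 and the ceiling.
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter is injectable so the behaviour can be checked without waiting in real time.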
Data & Tool Availability
High
Stripe has a well-documented REST API with a Python SDK, and boto3 handles S3 writes. The agent needs API keys and IAM credentials passed in, but the tooling itself is mature and well-represented in training data.
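As a rough sketch of how the two libraries fit together, the functions below take the `stripe` module and a boto3 S3 client as arguments, so credentials are configured by the caller. The function names, the record shape, and the report layout are assumptions made for illustration; `Charge.list`, `auto_paging_iter`, and `put_object` are the SDK calls this sketch relies on.

```python
import json
import time

WINDOW_SECONDS = 30 * 24 * 60 * 60  # 30-day lookback from the requirements


def fetch_failed_charges(stripe_module, now=None):
    """Return failed charges from the last 30 days as plain dicts.

    stripe_module is expected to be the imported `stripe` package with
    its API key already set by the caller (assumption).
    """
    now = int(time.time()) if now is None else now
    charges = stripe_module.Charge.list(
        created={"gte": now - WINDOW_SECONDS}, limit=100
    )
    failed = []
    # auto_paging_iter() walks every page of results, not just the first.
    for charge in charges.auto_paging_iter():
        if charge["status"] == "failed":
            failed.append({
                "id": charge["id"],
                "failure_code": charge["failure_code"],
                "created": charge["created"],
            })
    return failed


def export_report(s3_client, bucket, key, records):
    """Serialize records to JSON and write them to S3 (hypothetical layout)."""
    body = json.dumps({"count": len(records), "failures": records}, indent=2)
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=body.encode("utf-8"),
        ContentType="application/json",
    )
    return body
```

In a real run the caller would pass `stripe` and `boto3.client("s3")`; passing them in rather than importing them here also lets the logic be exercised with stand-in objects.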
Error Cost
Low
This is a read-only reporting script: it fetches and classifies data and does not initiate charges or other mutations. A buggy script produces a wrong report, not a financial incident, and can be corrected and re-run.
Human Judgment Required
Low
Classification of Stripe failure codes is deterministic: Stripe returns machine-readable reason strings, so each one can be mapped to a category by lookup. No taste, ethics, or relationship context is needed to write or run this script.
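Because the mapping is a pure lookup, it can be sketched as a dictionary. The category names below are a hypothetical taxonomy, and the code strings are only a small sample of the values Stripe documents; the actual taxonomy and the field the codes are read from would come from the task specification.

```python
from collections import Counter

# Hypothetical taxonomy: sample Stripe code strings mapped to report categories.
CATEGORY_BY_CODE = {
    "insufficient_funds": "insufficient_funds",
    "expired_card": "card_expired",
    "incorrect_cvc": "card_details_invalid",
    "incorrect_number": "card_details_invalid",
    "lost_card": "suspected_fraud",
    "stolen_card": "suspected_fraud",
    "processing_error": "processor_error",
}


def classify(code):
    """Map one failure code to a category; unmapped codes fall into 'other'."""
    if code is None:
        return "unknown"  # the charge carried no code at all
    return CATEGORY_BY_CODE.get(code, "other")


def summarize(codes):
    """Count how many failures fall into each category."""
    return dict(Counter(classify(code) for code in codes))
```

Routing unmapped codes to `other` instead of raising keeps the report running when a code appears that the table does not yet cover, and the count in that bucket shows when the table needs updating.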