Evolution Arena uses battle-tested AI to make your system prompts stronger. Submit, battle, evolve.
Most prompt engineering is guesswork. We apply structured selection pressure — the same mechanism that's been producing better results for 3.8 billion years.
Drop in your current system prompt and a couple of questions that represent how you actually use it.
Two variants of your prompt compete on your questions. They respond, rebut, and get scored by an impartial AI judge.
The winner survives. The best ideas from the loser get absorbed. A stronger version enters the next round.
All battle learnings distilled into one clean, improved prompt. Ready to drop in wherever you need it.
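The loop described above can be sketched in a few lines. This is a minimal illustration, not Evolution Arena's actual implementation: the `score`, `mutate`, and `merge` callables are hypothetical stand-ins for the AI judge, the variant generator, and the step that absorbs the loser's best ideas. It also assumes each round pits the current prompt against one new variant.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins (not part of any published API):
Scorer = Callable[[str, List[str]], float]  # judge: prompt + questions -> score
Mutator = Callable[[str], str]              # produces a competing variant
Merger = Callable[[str, str], str]          # merge(winner, loser) -> next prompt


def evolve(
    prompt: str,
    questions: List[str],
    n_rounds: int,
    score: Scorer,
    mutate: Mutator,
    merge: Merger,
) -> Tuple[str, List[Tuple[float, float]]]:
    """Run n_rounds of battle-and-absorb and return the final prompt
    together with the (champion, challenger) scores of every round."""
    champion = prompt
    history: List[Tuple[float, float]] = []
    for _ in range(n_rounds):
        challenger = mutate(champion)
        champion_score = score(champion, questions)
        challenger_score = score(challenger, questions)
        # The winner survives; ties keep the current champion.
        if challenger_score > champion_score:
            winner, loser = challenger, champion
        else:
            winner, loser = champion, challenger
        # The next round starts from the winner enriched by the loser.
        champion = merge(winner, loser)
        history.append((champion_score, challenger_score))
    return champion, history
```

In a real setup the three callables would each wrap a model call; here they are left abstract so the selection logic stays visible.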
Every battle round is scored by an independent AI judge on five criteria.
Actual outputs from real runs — not hand-picked examples. Click a demo below to explore the results.
Submit and poll. No SDKs required — just standard HTTP. Integrate prompt evolution directly into your CI/CD or agent pipelines.
# 1. Submit your prompt
curl -X POST /optimize \
  -H "Content-Type: application/json" \
  -d '{
    "system_prompt": "You are a support agent...",
    "test_questions": [
      "My order is 2 weeks late",
      "I want a refund now"
    ],
    "n_rounds": 3
  }'

# Returns immediately:
{ "job_id": "a3f9b2...", "status": "pending" }

# 2. Poll for results
curl /jobs/a3f9b2...

# When done:
{
  "status": "done",
  "result": {
    "winner_prompt": "You are a support agent who leads with empathy, then resolution...",
    "score_delta": 6.3,
    "improvement_notes": "...",
    "rounds": [...]
  }
}
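For pipelines that prefer a script over raw curl, the same submit-and-poll flow might look like the sketch below, using only the Python standard library. Several things here are assumptions rather than documented behavior: `BASE_URL` is a placeholder because the examples above omit the host, the polling interval and attempt limit are arbitrary, and any status other than "done" is treated as still running since only "pending" and "done" appear in the examples.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, List, Optional

# Placeholder host: the curl examples do not state one.
BASE_URL = "https://example.invalid"


def http_json(method: str, url: str, body: Optional[dict] = None) -> Dict[str, Any]:
    """Send a request (JSON body optional) and decode the JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if body is not None else {}
    req = urllib.request.Request(url, data=data, method=method, headers=headers)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def optimize(
    system_prompt: str,
    test_questions: List[str],
    n_rounds: int = 3,
    base_url: str = BASE_URL,
    request: Callable[..., Dict[str, Any]] = http_json,
    poll_seconds: float = 2.0,   # assumed interval
    max_polls: int = 150,        # assumed upper bound on attempts
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Submit a job to /optimize, then poll /jobs/<job_id> until it is done."""
    job = request("POST", f"{base_url}/optimize", {
        "system_prompt": system_prompt,
        "test_questions": test_questions,
        "n_rounds": n_rounds,
    })
    job_id = job["job_id"]
    for _ in range(max_polls):
        status = request("GET", f"{base_url}/jobs/{job_id}")
        if status.get("status") == "done":
            return status["result"]
        # Assumption: anything other than "done" means the job is still running.
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

A CI step could call `optimize(...)` and compare `result["score_delta"]` against a threshold of its own choosing before adopting `result["winner_prompt"]`. The `request` and `sleep` parameters are injectable so the flow can be exercised without network access.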
No surprises. A 3-round optimization costs fractions of a cent in API calls.