Competition math solver with adversarial verification.
Self-verification gets fooled: a verifier that sees the solver's reasoning is biased toward agreeing with it. arXiv:2503.21934 ("Proof or Bluff") showed that an 85.7% self-verified IMO success rate drops to <5% under human grading.
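One way to picture the idea is a loop in which the verifier only ever receives the problem statement and the final written proof, never the solver's scratch work. The sketch below is illustrative only and is not the skill's actual implementation: `Attempt`, `Verdict`, `solve_with_blind_verification`, the `solver`/`verifier` callables, and the retry loop with `max_rounds` are all hypothetical names and assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    scratch: str  # solver's private reasoning; deliberately never shown to the verifier
    proof: str    # final written proof, the only thing the verifier reads


@dataclass
class Verdict:
    accepted: bool
    objection: str = ""  # the verifier's strongest complaint when it rejects


def solve_with_blind_verification(
    problem: str,
    solver: Callable[[str, str], Attempt],     # hypothetical: (problem, feedback) -> Attempt
    verifier: Callable[[str, str], Verdict],   # hypothetical: (problem, proof) -> Verdict
    max_rounds: int = 3,                       # assumed retry budget
) -> Optional[str]:
    """Return a proof the verifier accepted, or None if every round was rejected."""
    feedback = ""
    for _ in range(max_rounds):
        attempt = solver(problem, feedback)
        # Pass only the statement and the proof text; attempt.scratch is withheld
        # so the verifier cannot be nudged into agreeing with the solver.
        verdict = verifier(problem, attempt.proof)
        if verdict.accepted:
            return attempt.proof
        feedback = verdict.objection
    return None
```

In this sketch the verifier's signature has no parameter through which the scratch reasoning could reach it, so the isolation is enforced by the interface instead of by convention.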
17/18 IMO+Putnam 2025 problems solved, 0 false positives, 2 novel proofs found. See the skill's eval data in the anthropic monorepo.
/plugin install math-olympiad@claude-plugins-official
> Solve this IMO problem: [statement]
The skill auto-triggers on "IMO", "Putnam", "olympiad", "verify this proof", etc.