Sakana Fugu vs Claude Opus 4.8 vs GPT-5.5 – Direct Coding Comparison
Sakana Fugu Ultra reports 73.7% on SWE-Bench Pro, beating both Claude Opus 4.8 and GPT-5.5. There is a structural reason to be skeptical of that number: Fugu routes queries to those exact models internally. Here is the full breakdown.





