A competition for the fastest move generator in the world. Entrants submit a UCI-compatible binary, which is validated against the standard perft corpus and ranked by throughput across four threading and cache configurations.
Fastest engine in each of the four benchmark configurations. Throughput is the mean of best NPS across the six canonical positions — each timed at the deepest depth it can complete inside a fixed per-position budget.
Single, no cache
2.30B NPS
gigantua 1.6
Single, with cache
6.39B NPS
tgct_engine 0.1.15-r2r
Multi, no cache
33.2B NPS
perft_cpu_2026 main
Multi, with cache
361.2B NPS
perft_cpu_2026 main
One row per engine version. Bold cells mark the fastest engine in each mode. A dash in a cell means the engine did not run that configuration.
| Engine | Language | Version | Single, no cache | Single, with cache | Multi, no cache | Multi, with cache |
|---|---|---|---|---|---|---|
| gigantua | C++ | 1.6 | **2,304,487,198 (2.30B)** | — | — | — |
| perft_cpu_2026 | C++ | main | 1,935,787,602 (1.94B) | 4,806,229,072 (4.81B) | **33,219,331,350 (33.2B)** | **361,240,881,390 (361.2B)** |
| chessbit | C++ | main | 1,895,360,906 (1.90B) | — | — | — |
| zerologic | C++ | 1 | 1,654,727,109 (1.65B) | 1,654,526,743 (1.65B) | — | — |
| mperft | C | 0be02ae | 1,541,075,983 (1.54B) | 4,751,986,393 (4.75B) | 26,822,399,079 (26.8B) | 171,925,899,207 (171.9B) |
| tgct_engine | C# | 0.1.15-r2r | 913,757,226 (913.8M) | **6,388,399,775 (6.39B)** | 31,743,545,721 (31.7B) | 174,302,933,252 (174.3B) |
| caps | Rust | main | 789,354,941 (789.4M) | — | 17,394,610,319 (17.4B) | — |
| stockdory | C++ | cd237fc32a | 715,360,169 (715.4M) | — | 8,801,601,199 (8.80B) | — |
| clover | C++ | 9.1 | 652,228,686 (652.2M) | — | 647,103,120 (647.1M) | — |
| xiphos | C | master | 621,677,116 (621.7M) | — | — | — |
| cozy-chess | Rust | 0.3.4 | 602,090,829 (602.1M) | — | — | — |
| rose | C++ | main | 593,455,467 (593.5M) | — | — | — |
| potential | C | 3.40.88 | 566,946,992 (566.9M) | — | — | — |
| pawnocchio | Zig | 2.0-dev-a8f09e7 | 488,658,466 (488.7M) | — | — | — |
| raphael | C++ | 4.2.0-dev-87999df | 470,541,469 (470.5M) | — | — | — |
| chessnut | C++ | master | 463,630,235 (463.6M) | — | — | — |
| berserk | C | 13 | 461,515,307 (461.5M) | — | — | — |
| halogen | C++ | 13 | 431,256,530 (431.3M) | — | — | — |
| maestro | C++ | 1.2 | 401,516,637 (401.5M) | — | — | — |
| stash-bot | C | master | 378,347,944 (378.3M) | — | — | — |
| shakmaty | Rust | 0.27.3 | 370,826,239 (370.8M) | — | — | — |
| prune | C++ | dev | 352,422,736 (352.4M) | — | — | — |
| stormphrax | C++ | 7.0.108 | 352,063,841 (352.1M) | — | — | — |
| stockfish | C++ | 18 | 295,489,879 (295.5M) | — | — | — |
| jordanbray-chess | Rust | 3.2.0 | 284,905,153 (284.9M) | — | — | — |
| horsie | C++ | 1.1.7 | 254,493,846 (254.5M) | — | — | — |
| viridithas | Rust | 20.0.0-dev | 193,644,727 (193.6M) | — | — | — |
| quanticade | C | main | 124,070,057 (124.1M) | — | — | — |
| rubichess | C++ | 20260308 | 52,487,313 (52.5M) | — | 657,069,262 (657.1M) | — |
| ethereal | C | 14.40 (PEXT) | 46,946,368 (46.9M) | — | — | — |
| plentychess | C++ | 7.0.66 | 18,064,943 (18.1M) | — | — | — |
Pick an engine, mode, and position to see how NPS scales with depth — and where subprocess startup amortizes into raw move-generation throughput.
Three steps from binary to ranked entry. The full harness, position set, and verification rules live in the PerftWar directory.
Any move generator that responds to the go perft N command over UCI is eligible. Get in touch on Discord to register and coordinate the upload.
Every entry runs through perftcheck against ~28,000 known-correct positions. Any wrong node count disqualifies the run; speed only ranks engines that pass.
Validated engines are run on each of six TGCT-canonical positions, walking depths from d1 upward under a per-position wall-clock budget. We record the NPS at the deepest depth that completes, then average those values across positions for each of the four threading / cache configurations.
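The scoring rule above can be sketched roughly as follows. This is an illustration only, not the harness's actual code: the DepthResult record, the function names, and the sample numbers are hypothetical, and treating the budget as a cumulative limit over the depth walk is an assumption.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class DepthResult:
    """One timed perft run at a single depth (hypothetical record)."""
    depth: int
    nodes: int
    seconds: float  # wall-clock time for this depth

    @property
    def nps(self) -> float:
        return self.nodes / self.seconds


def best_nps(results: List[DepthResult], budget_s: float) -> Optional[float]:
    """NPS at the deepest depth that completes inside the budget.

    Assumption: the budget is cumulative, so the walk stops at the
    first depth that pushes total elapsed time past it.
    """
    elapsed = 0.0
    best = None
    for r in sorted(results, key=lambda r: r.depth):
        elapsed += r.seconds
        if elapsed > budget_s:
            break
        best = r
    return best.nps if best is not None else None


def mode_score(per_position: Dict[str, List[DepthResult]],
               budget_s: float) -> Optional[float]:
    """Mean of per-position best NPS; None if any position has no result."""
    scores = [best_nps(rs, budget_s) for rs in per_position.values()]
    if not scores or any(s is None for s in scores):
        return None
    return mean(scores)


if __name__ == "__main__":
    # Made-up timings for two positions, purely for illustration.
    runs = {
        "pos_a": [DepthResult(1, 100, 0.125),
                  DepthResult(2, 1000, 0.5),
                  DepthResult(3, 10000, 2.0)],
        "pos_b": [DepthResult(1, 1000, 0.25)],
    }
    print(mode_score(runs, budget_s=1.0))
```

With a 1-second budget, pos_a stops after depth 2 because depth 3 would exceed the budget, so its depth-2 NPS is averaged with the single result for pos_b.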
A few things to keep in mind when reading these numbers:
Preview hardware
Until the bare-metal Ubuntu reference host is online, some entries are captured on developer hardware — look for version tags like -local. Expect absolute NPS to shift once the reference rig takes over.
Subprocess overhead
Each measurement spawns a fresh process so cache state is deterministic. At depth 4 the spawn cost can dominate — particularly for very small positions like pos3 (43k nodes). Depth 6 numbers are the more reliable cross-engine signal.
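To see why shallow depths understate throughput, suppose each run pays a fixed spawn cost on top of the time spent generating moves. The sketch below uses made-up figures: the 2 ms spawn cost and the raw rate of 1B NPS are assumptions, not measurements, and only the roughly 43k node count comes from the note above.

```python
def measured_nps(nodes: int, raw_nps: float, spawn_s: float) -> float:
    """NPS observed when wall-clock time includes a fixed spawn cost."""
    compute_s = nodes / raw_nps
    return nodes / (spawn_s + compute_s)


if __name__ == "__main__":
    RAW_NPS = 1e9    # assumed raw move-generation rate
    SPAWN_S = 0.002  # assumed process startup cost in seconds

    for nodes in (43_000, 43_000_000, 43_000_000_000):
        observed = measured_nps(nodes, RAW_NPS, SPAWN_S)
        share = observed / RAW_NPS
        print(f"{nodes:>14,} nodes -> {observed / 1e6:8.1f}M NPS "
              f"({share:.1%} of raw)")
```

Under these assumptions the 43k-node run reports only about 21M NPS, roughly 2% of the raw rate, while runs with thousands of times more nodes converge on it.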
Partial mode coverage
Engines opt in to the modes they support. The reference engine currently exposes only cached configurations; a future build will fill in the no-cache modes once that switch lands.
Strict correctness first
Every subprocess's stdout is scanned for the expected node count. A single mismatch fails the whole engine run — no NPS is recorded for a disqualified entry.
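A minimal sketch of that check, assuming the engine prints a Stockfish-style "Nodes searched: N" summary line. The output format, the regular expression, and the function names are assumptions for illustration; the real harness may match a different pattern.

```python
import re
from typing import List, Optional, Tuple

# Assumed summary line format, e.g. "Nodes searched: 197281".
NODES_RE = re.compile(r"^Nodes searched:\s*(\d+)\s*$", re.MULTILINE)


def reported_nodes(stdout: str) -> Optional[int]:
    """Return the last node count printed by the engine, or None if absent."""
    matches = NODES_RE.findall(stdout)
    return int(matches[-1]) if matches else None


def run_passes(outputs: List[Tuple[str, int]]) -> bool:
    """True only if every (stdout, expected_nodes) pair matches.

    A single mismatch, or a missing count, fails the whole run.
    """
    return all(reported_nodes(out) == expected for out, expected in outputs)


if __name__ == "__main__":
    # Start position: perft(3) = 8902 and perft(4) = 197281.
    good = [("Nodes searched: 8902\n", 8902),
            ("Nodes searched: 197281\n", 197281)]
    bad = good + [("Nodes searched: 197280\n", 197281)]
    print(run_passes(good), run_passes(bad))
```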
The board is still in preview. If you've got a move generator you'd like to put in the ring (in any language, any architecture, any style), drop into the project Discord and let us know.
The reference engine in MoveGen/ gives one starting point, but the leaderboard isn't tied to any one architecture. Magic bitboards, PEXT, classical ray-scan, 0x88: all welcome.