CPU Leaderboard

Preview — 31 engines

A competition for the fastest move generator in the world. Engines submit a UCI-compatible binary, get validated against the standard perft corpus, and are ranked by throughput across four threading and cache configurations.

Best in mode

Fastest engine in each of the four benchmark configurations. Throughput is the mean of best NPS across the six canonical positions — each timed at the deepest depth it can complete inside a fixed per-position budget.

Single, no cache

2.30B NPS

gigantua 1.6

Single, with cache

6.39B NPS

tgct_engine 0.1.15-r2r

Multi, no cache

33.2B NPS

perft_cpu_2026 main

Multi, with cache

361.2B NPS

perft_cpu_2026 main

Full results

One row per engine version. Bold cells mark the fastest engine in that mode. A dash (—) means the engine did not run that configuration. Click a column header to sort — missing values always sort last.

Engine Language Version Single, no cache Single, with cache Multi, no cache Multi, with cache
gigantua C++ 1.6 2,304,487,198 (2.30B)
perft_cpu_2026 C++ main 1,935,787,602 (1.94B) 4,806,229,072 (4.81B) 33,219,331,350 (33.2B) 361,240,881,390 (361.2B)
chessbit C++ main 1,895,360,906 (1.90B)
zerologic C++ 1 1,654,727,109 (1.65B) 1,654,526,743 (1.65B)
mperft C 0be02ae 1,541,075,983 (1.54B) 4,751,986,393 (4.75B) 26,822,399,079 (26.8B) 171,925,899,207 (171.9B)
tgct_engine C# 0.1.15-r2r 913,757,226 (913.8M) 6,388,399,775 (6.39B) 31,743,545,721 (31.7B) 174,302,933,252 (174.3B)
caps Rust main 789,354,941 (789.4M) 17,394,610,319 (17.4B)
stockdory C++ cd237fc32a 715,360,169 (715.4M) 8,801,601,199 (8.80B)
clover C++ 9.1 652,228,686 (652.2M) 647,103,120 (647.1M)
xiphos C master 621,677,116 (621.7M)
cozy-chess Rust 0.3.4 602,090,829 (602.1M)
rose C++ main 593,455,467 (593.5M)
potential C 3.40.88 566,946,992 (566.9M)
pawnocchio Zig 2.0-dev-a8f09e7 488,658,466 (488.7M)
raphael C++ 4.2.0-dev-87999df 470,541,469 (470.5M)
chessnut C++ master 463,630,235 (463.6M)
berserk C 13 461,515,307 (461.5M)
halogen C++ 13 431,256,530 (431.3M)
maestro C++ 1.2 401,516,637 (401.5M)
stash-bot C master 378,347,944 (378.3M)
shakmaty Rust 0.27.3 370,826,239 (370.8M)
prune C++ dev 352,422,736 (352.4M)
stormphrax C++ 7.0.108 352,063,841 (352.1M)
stockfish C++ 18 295,489,879 (295.5M)
jordanbray-chess Rust 3.2.0 284,905,153 (284.9M)
horsie C++ 1.1.7 254,493,846 (254.5M)
viridithas Rust 20.0.0-dev 193,644,727 (193.6M)
quanticade C main 124,070,057 (124.1M)
rubichess C++ 20260308 52,487,313 (52.5M) 657,069,262 (657.1M)
ethereal C 14.40 (PEXT) 46,946,368 (46.9M)
plentychess C++ 7.0.66 18,064,943 (18.1M)
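
The "missing values always sort last" rule described above the table can be sketched as follows. This is a minimal illustration, not the page's actual sorting code: the `sort_rows` helper and the row layout are hypothetical, and `None` stands in for a dash cell. The three NPS figures are the "Multi, with cache" values listed in the table.

```python
def sort_rows(rows, column, descending=True):
    """Sort row dicts by `column`, keeping rows with no value at the end.

    Rows whose value is None (a dash in the table) go last in both
    ascending and descending order.
    """
    present = [r for r in rows if r.get(column) is not None]
    missing = [r for r in rows if r.get(column) is None]
    present.sort(key=lambda r: r[column], reverse=descending)
    return present + missing


# Hypothetical row layout; values taken from the "Multi, with cache" column.
rows = [
    {"engine": "gigantua", "multi_cache": None},
    {"engine": "mperft", "multi_cache": 171_925_899_207},
    {"engine": "perft_cpu_2026", "multi_cache": 361_240_881_390},
    {"engine": "tgct_engine", "multi_cache": 174_302_933_252},
]

fastest_first = [r["engine"] for r in sort_rows(rows, "multi_cache")]
slowest_first = [r["engine"] for r in sort_rows(rows, "multi_cache", descending=False)]
```

Splitting the rows into "present" and "missing" before sorting avoids sentinel values such as infinity, which would flip position when the sort direction changes.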

Drill down by run

Pick an engine, mode, and position to see how NPS scales with depth — and where subprocess startup amortizes into raw move-generation throughput.

How it works

Three steps from binary to ranked entry. The full harness, position set, and verification rules live in the PerftWar directory.

1

Submit your engine

Any move generator that accepts the go perft N command over UCI is eligible. Note that go perft is a widely supported extension rather than part of the core UCI specification. Reach out on Discord to register and coordinate the upload.
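
As a rough illustration of what talking to an entry looks like, the sketch below sends go perft N to an engine over stdin and extracts the total from its output. The "Nodes searched: <n>" summary line is the Stockfish-style convention and is an assumption here; other engines may print the total differently. The `parse_nodes` and `run_perft` names, the timeout, and the engine path are hypothetical and not taken from the actual harness.

```python
import re
import subprocess

# Assumption: the engine ends its perft output with "Nodes searched: <n>".
NODES_RE = re.compile(r"^Nodes searched:\s*(\d+)\s*$", re.MULTILINE)


def parse_nodes(stdout: str) -> int:
    """Return the total node count reported in the engine's perft output."""
    match = NODES_RE.search(stdout)
    if match is None:
        raise ValueError("no 'Nodes searched' line in engine output")
    return int(match.group(1))


def run_perft(engine_path: str, fen: str, depth: int, timeout_s: float = 60.0) -> int:
    """Spawn the engine, request perft at `depth` from `fen`, return the node count."""
    commands = f"position fen {fen}\ngo perft {depth}\nquit\n"
    result = subprocess.run(
        [engine_path],
        input=commands,
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return parse_nodes(result.stdout)
```

A call such as `run_perft("./my_engine", fen, 5)` would then return the node count for that position, provided the binary follows the assumed output format.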

2

Compliance check

Every entry runs through perftcheck against ~28,000 positions with known-correct node counts. Any wrong node count disqualifies the run; speed only ranks engines that already pass.
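
The pass/fail rule above can be sketched as a plain comparison of reported counts against reference counts. This is not perftcheck itself, whose interface is not described here; the `first_mismatch` and `passes_compliance` helpers are hypothetical. The reference numbers are the well-known perft totals for the standard starting position.

```python
# Well-known perft totals for the standard starting position, depths 1 to 5.
STARTPOS_PERFT = {1: 20, 2: 400, 3: 8_902, 4: 197_281, 5: 4_865_609}


def first_mismatch(expected, reported):
    """Return (depth, expected, reported) for the first wrong or missing count.

    Returns None when every expected depth was reported correctly.
    """
    for depth in sorted(expected):
        got = reported.get(depth)
        if got != expected[depth]:
            return depth, expected[depth], got
    return None


def passes_compliance(expected, reported) -> bool:
    """A single wrong node count is enough to fail the whole run."""
    return first_mismatch(expected, reported) is None


good_run = dict(STARTPOS_PERFT)
bad_run = {**STARTPOS_PERFT, 4: 197_280}  # off by one node at depth 4
```

Reporting the first failing depth, rather than only a boolean, makes it easier for an entrant to locate the move-generation bug.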

3

Benchmark and rank

Validated engines are run on each of six TGCT-canonical positions, walking depths from d1 upward under a per-position wall-clock budget. We record the NPS at the deepest depth that completes, then average those values across positions for each of the four threading / cache configurations.
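
The scoring rule in this step can be sketched as below. It assumes the budget applies to each individual run and that a position with no completed depth is left out of the mean; the real harness may handle both differently. The function names and the timing tuples are hypothetical.

```python
from statistics import mean


def deepest_completed_nps(timings, budget_s):
    """Return NPS at the deepest depth that finished inside the budget.

    `timings` is a list of (depth, nodes, seconds) in increasing depth
    order, with seconds > 0. The walk stops at the first run over budget.
    Returns None if no depth completed.
    """
    best = None
    for _depth, nodes, seconds in timings:
        if seconds > budget_s:
            break
        best = nodes / seconds
    return best


def mode_score(per_position_timings, budget_s):
    """Mean of the per-position NPS values for one threading/cache mode."""
    scores = [
        deepest_completed_nps(timings, budget_s)
        for timings in per_position_timings.values()
    ]
    completed = [s for s in scores if s is not None]
    return mean(completed) if completed else None


# Made-up timings for two positions under a 1-second budget.
example = {
    "pos1": [(1, 20, 0.5), (2, 400, 0.25), (3, 8_902, 10.0)],
    "pos2": [(1, 48, 0.5)],
}
score = mode_score(example, budget_s=1.0)
```

Here pos1 stops at depth 2 because depth 3 exceeds the budget, so the mode score is the mean of 1600 and 96 NPS.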

Caveats & methodology notes

A few things to keep in mind when reading these numbers:

  • Preview hardware

    Until the bare-metal Ubuntu reference host is online, some entries are captured on developer hardware — look for version tags like -local. Expect absolute NPS to shift once the reference rig takes over.

  • Subprocess overhead

    Each measurement spawns a fresh process so cache state is deterministic. At depth 4 the spawn cost can dominate, particularly for very small positions like pos3 (about 43k nodes at that depth). Depth 6 numbers are the more reliable cross-engine signal.

  • Partial mode coverage

    Engines opt in to the modes they support. The reference engine currently exposes only cached configurations; a future build will fill in the no-cache modes once that switch lands.

  • Strict correctness first

    Every subprocess's stdout is scanned for the expected node count. A single mismatch fails the whole engine run — no NPS is recorded for a disqualified entry.
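
To make the subprocess-overhead caveat concrete, the sketch below models measured NPS when a fixed spawn cost is included in the timing. The 1 ms spawn cost and the 1 billion NPS engine speed are assumptions chosen for illustration, not measurements from the harness. It also assumes pos3 is the widely used "Position 3" perft test position, whose node counts are 43,238 at depth 4 and 11,030,083 at depth 6.

```python
def measured_nps(nodes, true_nps, spawn_s):
    """NPS observed when process spawn time is counted in the wall clock."""
    return nodes / (spawn_s + nodes / true_nps)


TRUE_NPS = 1e9      # assumed raw move-generation speed
SPAWN_S = 1e-3      # assumed process startup cost (1 ms)

depth4 = measured_nps(43_238, TRUE_NPS, SPAWN_S)
depth6 = measured_nps(11_030_083, TRUE_NPS, SPAWN_S)

# Fraction of the engine's raw speed that each measurement reflects.
depth4_fraction = depth4 / TRUE_NPS
depth6_fraction = depth6 / TRUE_NPS
```

Under these assumptions the depth 4 figure reflects only about 4% of the engine's raw speed, while the depth 6 figure reflects roughly 92%, which is why the deeper numbers are the better cross-engine signal.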

Interested in entering?

The board is in preview with 31 engines. If you've got a move generator you'd like to put in the ring (in any language, any architecture, any style), drop into the project Discord and let us know.

The reference engine in MoveGen/ gives one starting point, but the leaderboard isn't tied to any one architecture. Magic bitboards, PEXT, classical ray-scan, 0x88: all welcome.