From: merge-script
Date: Fri, 15 May 2026 19:55:48 +0000 (+0000)
Subject: Merge bitcoindevkit/bdk#2048: feat(core): add skiplist to CheckPoint for faster traversal

e095266afa2d80be0c6d2ef9d140ba2c67078a1d feat(core): Optimize `CheckPointIter` using pskip (志宇)
0257c549d0602bfcf016be713f281443a0d85a78 feat(core)!: Remove unused methods of `CheckPointEntry` (志宇)
86c052e5cb1096cb282b78b72e4df8fa9d76b4c6 test(core): add random-access skiplist benchmark (志宇)
d9ce149db413c44c31d8719580876f7bbc31d277 feat(core): add skiplist to CheckPoint for faster traversal (志宇)
08d1bdee2fae549610abed2db705586305231e0c feat(chain,electrum): Make clippy happy (志宇)

Pull request description:

### Description

#### Summary

Add a skiplist to `CheckPoint` using Bitcoin Core's `CBlockIndex::pskip` pattern, adapted to operate on checkpoint indices rather than block heights. Result: O(log n) lookups for `get`, `floor_at`, and `range`, with no tuning constant. Per-node memory is unchanged: `Option<Arc<_>>` is niche-optimized to 8 bytes whether or not it's populated.

The pskip approach was suggested by @ValuedMammal in https://github.com/bitcoindevkit/bdk/pull/2048#issuecomment-4328710770 and prototyped in https://github.com/ValuedMammal/block-graph/blob/master/block-graph/src/checkpoint.rs. Credit for the core idea goes there. This PR takes a different implementation path; the rationale is below.

#### Design

Every `CheckPoint` node carries one Arc skip pointer to a deterministically chosen ancestor at index `skip_index(i)`. The chosen targets give skip distances that grow exponentially as you walk back, yielding O(log n) traversal: ~17 hops at 1M blocks vs ~1M for a linear walk.
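
To make the deterministic skip target concrete, here is a minimal index-only sketch. It assumes `skip_index` follows the same bit-pattern rule as Bitcoin Core's `GetSkipHeight`, and that the walk follows Core's `GetAncestor` rule; the PR's actual `skip_index` and `ancestor_by_index` may differ in detail. `invert_lowest_one` and `hops_to_ancestor` are hypothetical names used only for illustration, and no real `CheckPoint` nodes are involved.

```rust
/// Clear the lowest set bit of `n`.
/// `wrapping_sub` keeps `n == 0` well-defined: `0 & u32::MAX == 0`.
fn invert_lowest_one(n: u32) -> u32 {
    n & n.wrapping_sub(1)
}

/// Index of the ancestor that the node at `index` skips to.
/// Assumption: same rule as Bitcoin Core's `GetSkipHeight`.
fn skip_index(index: u32) -> u32 {
    if index < 2 {
        return 0;
    }
    if index & 1 == 1 {
        // Odd index: derive the target from the even predecessor.
        invert_lowest_one(invert_lowest_one(index - 1)) + 1
    } else {
        // Even index: drop the lowest set bit.
        invert_lowest_one(index)
    }
}

/// Hypothetical helper: count the hops needed to walk from index `from`
/// back to index `target`. Because skip targets depend only on the index,
/// the walk can be simulated without any node structure.
fn hops_to_ancestor(from: u32, target: u32) -> Option<u32> {
    if target > from {
        return None; // an ancestor cannot have a higher index
    }
    let mut walk = from;
    let mut hops = 0u32;
    while walk > target {
        let skip = skip_index(walk);
        let skip_prev = skip_index(walk - 1);
        // The predecessor's skip is "better" if it is a strictly bigger
        // jump that still does not undershoot the target.
        let prev_is_better = skip_prev + 2 < skip && skip_prev >= target;
        if skip == target || (skip > target && !prev_is_better) {
            walk = skip; // follow the skip pointer
        } else {
            walk -= 1; // follow the `prev` pointer
        }
        hops += 1;
    }
    Some(hops)
}
```

Under these assumptions `skip_index(6)` is `4` and `skip_index(8)` is `0`, and a call such as `hops_to_ancestor(1_000_000, 0)` needs far fewer hops than the one-million-step linear walk.
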
`get`, `floor_at`, and `range` all reduce to "walk back to the highest checkpoint at or below `target_height`", factored into one private `walk_to_floor` helper. Each public method is a thin wrapper.

#### Bench numbers (criterion)

Absolute pskip times, with a linear-walk baseline at 1M blocks for scale:

```
get_1000_middle              227 ns
get_10000_near_end           397 ns
floor_at_10000               565 ns
random_access_skiplist_1m    1.36 µs   (vs 3.84 ms linear walk over the same chain)
```

#### Caveats

- Insert is ~50 ns slower per push than a non-skiplist `CheckPoint` because every push wires its skip pointer via an O(log n) `ancestor_by_index` walk. For real-world per-block chain extension this is negligible (well under 1 µs/block), but bulk-rebuild paths (`insert_sparse_1000` → ~3.5 µs for 1000 sequential pushes) pay it linearly.

### Notes to the reviewers

#### History

This PR went through three design iterations. Including the rationale here so reviewers don't have to reconstruct it from the commit graph:

**1. Fixed-stride skiplist (initial proposal).** First version added a single skip pointer every `CHECKPOINT_SKIP_INTERVAL` indices, each pointing exactly that many positions back. `INTERVAL` was originally `100`, then bumped to `1000` after benchmarking: at mainnet-scale chains (~1M blocks) the cost-optimal stride for an O(n/k + k) walk is `k ≈ √n ≈ 1000`. This delivered ~50–80× speedups over linear walks but still left lookups at O(√n) (~2,000 hops at 1M blocks) and forced a tuning constant.

**2. Bitcoin Core's `pskip` (current).** @ValuedMammal pointed out that Bitcoin Core's `CBlockIndex::pskip` pattern achieves O(log n) lookups with no tuning constant by giving every node a single skip pointer to a *deterministically chosen* ancestor (computed from the index's bit pattern).
Crucially, this comes at no extra per-node memory cost: `Option<Arc<_>>` was already 8 bytes per node thanks to niche-optimization, so populating the field on every node uses the same space as populating it on every 1000th. Switched to this approach; the synthetic skiplist benches went from ~80% improvement to ~94% over linear, and the tuning constant disappeared.

**3. Reuse the pskip walker.** Once `ancestor_by_index` (the O(log n) walk underlying pskip) existed, it became natural to plug it in anywhere `CheckPoint`/`CheckPointIter` was advancing through the chain by a known amount. `get`, `range`, `floor_at`, `floor_below`, and `Iterator::nth` / `last` all use it now; `count` and `size_hint` derive from the `index` counter directly. Unused traversal methods on `CheckPointEntry` were removed in the same pass since they were duplicating `CheckPoint`'s own surface and had no callers in the workspace.

#### Diffs from @ValuedMammal's prototype

The [block-graph prototype](https://github.com/ValuedMammal/block-graph/blob/master/block-graph/src/checkpoint.rs) gets the high-level idea right (every node carries one deterministic skip pointer; index-based formula so sparse chains work) but has a few choices I wanted to address differently:

**1. `n & (n - 1)` underflow.** The prototype handles it by converting `u32 → i32 → u32` with `try_into().expect(...)`, relying on signed two's-complement to give `0 & -1 == 0`. This PR uses `n & n.wrapping_sub(1)` directly on `u32`, which produces the same `0 & u32::MAX == 0` result without the round-trip conversion.

**2. Traversal heuristic.** The prototype's `get` uses simple greedy ("take skip if it doesn't undershoot, else prev"). That's correct but **not genuinely O(log n)**: for targets near genesis on a long chain, it degrades to many small geometric descents stacked end-to-end.
This PR uses Bitcoin Core's `GetAncestor` heuristic verbatim: take skip unless the *predecessor's* skip would have been a strictly bigger jump that still doesn't undershoot.

**3. Single internal helper for `get`/`range`/`floor_at`.** The public methods (`get`, `range`, `floor_at`) all reduce to "walk back to the highest checkpoint at or below `target_height`". This PR factors that into one private `walk_to_floor` function; each public method becomes a thin wrapper. The prototype duplicates the skip-walk logic.

**4. `Drop` cleanup.** With every node holding a skip Arc, the manual unwind loop in `Drop for CPInner` (existing fix for #1634) is extended with `node.skip.take()` so skip-pointer drops happen on the manual loop rather than triggering ancestor drops via the implicit recursive path.

### Changelog

```md
Added:
- `O(log n)` skiplist on `CheckPoint` (Bitcoin Core's pskip pattern). Speeds up `get`, `floor_at`, `range`, and full-chain iteration.
- `Iterator::nth` / `last` / `count` / `size_hint` overrides on `CheckPointIter` and `CheckPointEntryIter`, plus `ExactSizeIterator` impls. `nth` and `last` are now `O(log n)`; `count` and `size_hint` are `O(1)`.

Removed:
- Unused traversal methods on `CheckPointEntry`: `iter`, `get`, `range`, `floor_at`, `floor_below`. They had no callers in the workspace and duplicated `CheckPoint`'s own surface.
```

### Checklists

#### All Submissions:

* [x] I followed the [contribution guidelines](https://github.com/bitcoindevkit/bdk/blob/master/CONTRIBUTING.md)

#### New Features:

* [x] I've added tests for the new feature
* [x] I've added docs for the new feature

ACKs for top commit:
nymius: ACK e095266afa2d80be0c6d2ef9d140ba2c67078a1d

Tree-SHA512: 15d71abe5959d4928d44bd60cbd323f24197b3fb8de36f774f4f45e517fe67b470df7193f69b7d95e3854bf921c55659236009fd3569289e0063b227e66b0cbf

---

f9ea5e188c2e182543258c224d0a4e735fcc584d