Cache‑Optimized IPv6 LPM Leverages AVX‑512 and Linearized B‑Tree in Real BGP Tests


2026-04-20 | Source: HN | Original article

A new open-source library, planb-lpm, implements a cache-friendly IPv6 longest-prefix-match (LPM) engine built on Intel's AVX-512 SIMD extensions. At its core is a 9-ary linearized B+ tree packed into 64-byte, cache-line-aligned nodes. Each node holds eight 64-bit keys, which fill one cache line and one 512-bit register and give every internal node nine children. Lookup is a pure predecessor search: at each internal level, a single AVX-512 vpcmpuq comparison followed by a popcnt of the resulting mask selects the child node, and the same operation on the leaf pinpoints the matching prefix.

According to the project's GitHub README, the table is built by expanding each IPv6 prefix into a start-end interval over the upper 64 bits of the address, sorting the 2N interval boundaries, and resolving nesting with a stack so that every elementary interval records its active next hop.

The reported benchmarks run on real-world BGP tables with more than 800 k IPv6 prefixes and show lookup rates above 30 Mpps on a single Xeon Scalable processor, with latency under 30 ns. Compared with earlier CPU-only solutions, and even GPU-accelerated engines, the AVX-512 implementation is said to cut memory traffic by up to 40 % thanks to its cache-line-friendly layout.

This matters for two reasons. First, IPv6 traffic is climbing as carriers retire legacy IPv4 address pools, and high-speed routers must sustain line-rate lookups on ever-larger routing tables. Second, modern data-center CPUs now ship with AVX-512, turning a previously niche instruction set into a mainstream performance lever. A software router that can exploit those wide vectors without resorting to specialized ASICs or GPUs narrows the gap between commodity servers and carrier-grade gear.

The next thing to watch is integration with the DPDK and VPP ecosystems, where a plug-in could bring the engine into production-grade packet-processing pipelines. The community is also exploring a port of the algorithm to ARM's SVE vector extension, which would broaden its relevance to heterogeneous cloud environments.

If the early performance claims hold up under diverse workloads, planb-lpm could become a de facto reference for IPv6 LPM on general-purpose hardware.
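
To make the two mechanisms described above more concrete, here is a minimal Python sketch, not the planb-lpm code. It assumes prefixes no longer than /64, since only the upper 64 bits serve as the key, and it replaces the AVX-512 compare-and-popcount with a scalar `rank` helper. All function names, the `INF` padding sentinel, and the level-by-level list layout are illustrative choices rather than anything taken from the library.

```python
import ipaddress

B = 8            # keys per node: eight 64-bit keys fill one 64-byte cache line
INF = 1 << 64    # padding sentinel above any 64-bit key (illustrative choice)

def build_intervals(routes):
    """Turn (prefix, next_hop) pairs into sorted boundary keys and their next hops."""
    spans = []
    for prefix, hop in routes:
        net = ipaddress.ip_network(prefix)
        if net.version != 6 or net.prefixlen > 64:
            raise ValueError("sketch handles IPv6 prefixes up to /64 only")
        start = int(net.network_address) >> 64          # upper 64 bits as the key
        end = start | ((1 << (64 - net.prefixlen)) - 1)
        spans.append((start, net.prefixlen, end, hop))
    spans.sort(key=lambda s: (s[0], s[1]))  # enclosing prefixes sort before nested ones

    keys, hops, stack = [], [], []

    def emit(key, hop):
        if key >= INF:                      # interval runs to the end of the key space
            return
        if keys and keys[-1] == key:
            hops[-1] = hop                  # a later boundary at the same key wins
        else:
            keys.append(key)
            hops.append(hop)

    def close_top():
        end, _ = stack.pop()
        # After a prefix ends, the enclosing prefix (if any) becomes active again.
        emit(end + 1, stack[-1][1] if stack else None)

    for start, _, end, hop in spans:
        while stack and stack[-1][0] < start:
            close_top()
        emit(start, hop)
        stack.append((end, hop))
    while stack:
        close_top()
    return keys, hops

def build_tree(keys):
    """Lay the sorted keys out as levels of 8-key nodes; levels[0] is the leaf level."""
    leaves = keys + [INF] * (-len(keys) % B)
    levels = [leaves]
    mins = leaves[::B]                      # smallest key under each leaf node
    while len(mins) > 1:
        node_keys, next_mins = [], []
        for g in range(0, len(mins), B + 1):
            group = mins[g:g + B + 1]       # up to nine children per node
            node_keys += group[1:] + [INF] * (B + 1 - len(group))
            next_mins.append(group[0])
        levels.append(node_keys)
        mins = next_mins
    return levels

def rank(node_keys, q):
    """Count keys <= q: scalar stand-in for an 8-lane unsigned compare plus popcount."""
    return sum(k <= q for k in node_keys)

def predecessor(levels, q):
    """Index of the largest key <= q, or -1 if q precedes every key."""
    node = 0
    for level in reversed(levels[1:]):      # root first, leaf level excluded
        node = node * (B + 1) + rank(level[node * B:(node + 1) * B], q)
    return node * B + rank(levels[0][node * B:(node + 1) * B], q) - 1

def compile_table(routes):
    keys, hops = build_intervals(routes)
    return build_tree(keys), hops

def lookup(table, addr):
    levels, hops = table
    i = predecessor(levels, int(ipaddress.IPv6Address(addr)) >> 64)
    return hops[i] if i >= 0 else None

table = compile_table([("::/0", "default"), ("2001:db8::/32", "A"), ("2001:db8:1::/48", "B")])
print(lookup(table, "2001:db8:1::1"))   # B
print(lookup(table, "2001:db8:2::1"))   # A
print(lookup(table, "2001:db9::1"))     # default
```

Because every elementary interval already stores its resolved next hop, the lookup never backtracks to a shorter prefix: one `rank` per level is the whole search. In a SIMD version, each `rank` call would presumably map to a single 512-bit compare over one cache line.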
