
Industrial Bitmap Solutions in C++

Bitmaps are a fundamental data structure in industrial systems, especially for high-performance filtering, analytics, and large-scale set operations. Choosing the right bitmap implementation is critical for balancing memory efficiency, dynamic updates, SIMD acceleration, serialization, and rank/select capabilities. This article summarizes the most widely used C++ bitmap libraries and approaches, highlighting their strengths, limitations, and typical use cases.


CRoaring / Roaring Bitmaps

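CRoaring is the C/C++ implementation of Roaring bitmaps, a compressed, dynamic bitmap format with SIMD-optimized operations. Roaring bitmaps are widely used in production systems such as ClickHouse, Druid, Lucene, and Pinot, although the JVM-based systems among them rely on Java implementations of the format rather than on CRoaring itself. The format's core strength lies in efficiently handling sparse and mixed-density datasets while supporting dynamic updates (add/remove) and fast logical operations (AND/OR/XOR, plus negation over a range).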

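CRoaring also supports portable serialization, enabling cross-platform storage and fast cold-start initialization. Internally, values are grouped into chunks by their high 16 bits, and each chunk is stored in whichever container type (sorted array, bitmap, or run-length) suits its density; this is what gives Roaring its hybrid density handling. The library also provides rank and select queries. CRoaring is particularly suitable for large-scale bitsets in analytics or OLAP systems.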
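
The hybrid-density idea can be illustrated with a toy sketch. It follows the published Roaring design in broad strokes: 32-bit values are split by their high 16 bits into chunks, and each chunk keeps its low 16 bits either as a sorted array (up to 4096 entries) or as a 65536-bit bitmap. The class `ToyRoaring` and its members are hypothetical names used only for illustration; this is not the CRoaring API, and it leaves out run containers, removal, and set operations.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Toy illustration of Roaring-style hybrid containers (NOT the CRoaring API).
// All names here are hypothetical.
class ToyRoaring {
    static constexpr std::size_t kArrayLimit = 4096;   // array -> bitmap threshold
    static constexpr std::size_t kWords = 65536 / 64;  // 1024 words = 8 KiB

    struct Chunk {
        bool dense = false;
        std::vector<std::uint16_t> sorted;  // low 16 bits, used while sparse
        std::vector<std::uint64_t> words;   // 65536-bit bitmap, used once dense
        std::size_t count = 0;
    };
    std::map<std::uint16_t, Chunk> chunks_;  // keyed by the high 16 bits

public:
    void add(std::uint32_t v) {
        Chunk& c = chunks_[static_cast<std::uint16_t>(v >> 16)];
        const std::uint16_t low = static_cast<std::uint16_t>(v & 0xFFFFu);
        if (c.dense) {
            std::uint64_t& w = c.words[low >> 6];
            const std::uint64_t mask = std::uint64_t{1} << (low & 63);
            if ((w & mask) == 0) { w |= mask; ++c.count; }
            return;
        }
        auto it = std::lower_bound(c.sorted.begin(), c.sorted.end(), low);
        if (it != c.sorted.end() && *it == low) return;  // already present
        c.sorted.insert(it, low);
        ++c.count;
        if (c.sorted.size() > kArrayLimit) {  // too dense for an array: convert
            assert(c.count == c.sorted.size());
            c.words.assign(kWords, std::uint64_t{0});
            for (std::uint16_t x : c.sorted)
                c.words[x >> 6] |= std::uint64_t{1} << (x & 63);
            c.sorted.clear();
            c.sorted.shrink_to_fit();
            c.dense = true;
        }
    }

    bool contains(std::uint32_t v) const {
        auto it = chunks_.find(static_cast<std::uint16_t>(v >> 16));
        if (it == chunks_.end()) return false;
        const Chunk& c = it->second;
        const std::uint16_t low = static_cast<std::uint16_t>(v & 0xFFFFu);
        if (c.dense) return ((c.words[low >> 6] >> (low & 63)) & 1u) != 0;
        return std::binary_search(c.sorted.begin(), c.sorted.end(), low);
    }

    std::size_t cardinality() const {
        std::size_t n = 0;
        for (const auto& kv : chunks_) n += kv.second.count;
        return n;
    }

    // True if the chunk for this high-16-bit key currently uses a bitmap.
    bool is_dense(std::uint16_t key) const {
        auto it = chunks_.find(key);
        return it != chunks_.end() && it->second.dense;
    }
};
```

In this sketch, a chunk holding a handful of values costs two bytes per value, while a chunk that grows past 4096 values switches to a fixed 8 KiB bitmap, which is the point at which the bitmap becomes the smaller representation.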


BitMagic

BitMagic is a comprehensive bitmap engine in C++, designed for medium to large-scale dynamic sets. Unlike simple bitsets, BitMagic provides dense bitvectors with optional run/cluster compression and automatically adapts internal containers based on density. It supports dynamic updates, SIMD acceleration, binary serialization, and partial rank/select functionality.

BitMagic excels in scenarios requiring complex boolean combinations, medium-scale in-memory filtering, or high-performance industrial set operations. Its flexible API allows precise control over memory layout, making it suitable for performance-sensitive applications.


Boost Dynamic Bitset

Boost Dynamic Bitset offers a dynamic, dense bitset implementation in pure C++. It supports set/unset operations and moderate-size dynamic sets but does not provide compression, SIMD acceleration, or built-in serialization. Rank/select and hybrid density handling are also not supported.

Boost Dynamic Bitset is best suited for small to medium-scale dynamic dense sets, where the performance requirements are moderate, and advanced bitmap features are unnecessary.


std::bitset and vector<bool>

The standard C++ std::bitset and std::vector<bool> provide lightweight bitsets for fixed-size or small-scale dense datasets. std::bitset fixes its size at compile time and offers bitwise operators and count(); std::vector<bool> can be resized at run time but has no bulk bitwise operators. Neither supports compression, explicit SIMD optimization, serialization, rank/select, or hybrid density handling.

These implementations are suitable for low-scale filtering, fixed-size dense sets, and lightweight internal logic operations.
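
As a small illustration of the difference between the two, the sketch below intersects two sets with each type using only the standard library. The helper names `count_common` and `intersect` are made up for this example.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// std::bitset: the size (64 here) is a compile-time constant, and the
// whole-set operators (&, |, ^, ~) and count() are built in.
inline std::size_t count_common(const std::bitset<64>& a,
                                const std::bitset<64>& b) {
    return (a & b).count();
}

// std::vector<bool>: resizable at run time, but there is no operator&,
// so the intersection is written element by element.
inline std::vector<bool> intersect(const std::vector<bool>& a,
                                   const std::vector<bool>& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    std::vector<bool> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] && b[i];
    }
    assert(out.size() == n);
    return out;
}
```

The element-wise loop is one reason std::vector<bool> is usually kept to small sets: each access goes through a proxy object rather than operating on whole machine words.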


WAH / EWAH / Concise

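Older run-length-based bitmap formats such as WAH (Word-Aligned Hybrid), EWAH (Enhanced Word-Aligned Hybrid), and Concise provide compressed representations but lack dynamic update support, SIMD acceleration, and reliable serialization. Because long stretches of identical bits are stored as runs, changing a bit in the middle of the bitmap can require re-encoding the data around it. These legacy formats are rarely used in modern industrial systems.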
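
To show why run-based formats compress well yet resist in-place updates, here is a deliberately simplified word-level run-length coder. It is not the actual WAH, EWAH, or Concise bit layout; `Token`, `rle_encode`, and `rle_decode` are hypothetical names, and the sketch assumes the bitmap is already available as 64-bit words.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified word-level run-length coding. This is NOT the real WAH/EWAH
// bit layout; Token, rle_encode and rle_decode are hypothetical names.
struct Token {
    bool is_fill;         // true: a run of identical all-0 or all-1 words
    std::uint64_t word;   // the fill word (0 or ~0), or one literal word
    std::uint32_t count;  // run length in words; always 1 for a literal
};

inline std::vector<Token> rle_encode(const std::vector<std::uint64_t>& words) {
    const std::uint64_t ones = ~std::uint64_t{0};
    std::vector<Token> out;
    for (std::uint64_t w : words) {
        const bool fill = (w == 0 || w == ones);
        if (fill && !out.empty() && out.back().is_fill && out.back().word == w) {
            ++out.back().count;  // extend the current run
        } else {
            out.push_back(Token{fill, w, 1});
        }
    }
    return out;
}

inline std::vector<std::uint64_t> rle_decode(const std::vector<Token>& tokens) {
    std::vector<std::uint64_t> words;
    for (const Token& t : tokens) {
        assert(t.count >= 1);
        words.insert(words.end(), static_cast<std::size_t>(t.count), t.word);
    }
    return words;
}
```

Setting a single bit inside a long zero run would split one fill token into up to three tokens (fill, literal, fill) and shift everything after it, which is the kind of cost that makes these formats awkward for dynamic workloads.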


Custom Raw Memory Bitmaps

For ultra-high-performance scenarios, many industrial systems implement custom raw memory bitmaps. These bitmaps operate directly on contiguous memory blocks, enabling manual SIMD optimization, precise memory layout control, and custom serialization (including mmap-based zero-copy loading).

Custom raw memory bitmaps are ideal for billion-scale filtering, high-concurrency applications, and scenarios where maximum performance is required. However, they require careful implementation to handle memory management, thread safety, and rank/select operations.
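
A minimal sketch of such a bitmap, assuming a flat array of 64-bit words, might look like the following. `RawBitmap` is a hypothetical class written for this article; it is not thread-safe, uses no explicit SIMD intrinsics, and computes rank with a linear scan rather than a precomputed index.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal word-array bitmap; not thread-safe, no SIMD intrinsics.
class RawBitmap {
    std::vector<std::uint64_t> words_;  // contiguous storage, 64 bits per word
    std::size_t nbits_;

    static std::size_t popcount(std::uint64_t w) {
        return std::bitset<64>(w).count();  // portable; C++20 has std::popcount
    }

public:
    explicit RawBitmap(std::size_t nbits)
        : words_((nbits + 63) / 64, std::uint64_t{0}), nbits_(nbits) {}

    std::size_t size() const { return nbits_; }

    void set(std::size_t i) {
        assert(i < nbits_);
        words_[i >> 6] |= std::uint64_t{1} << (i & 63);
    }
    void reset(std::size_t i) {
        assert(i < nbits_);
        words_[i >> 6] &= ~(std::uint64_t{1} << (i & 63));
    }
    bool test(std::size_t i) const {
        assert(i < nbits_);
        return ((words_[i >> 6] >> (i & 63)) & 1u) != 0;
    }

    // In-place AND: a plain word loop that compilers can often auto-vectorize.
    void and_with(const RawBitmap& other) {
        assert(nbits_ == other.nbits_);
        for (std::size_t k = 0; k < words_.size(); ++k) {
            words_[k] &= other.words_[k];
        }
    }

    // rank(i): number of set bits in positions [0, i). Linear in i.
    std::size_t rank(std::size_t i) const {
        assert(i <= nbits_);
        std::size_t r = 0;
        const std::size_t full = i >> 6;
        for (std::size_t k = 0; k < full; ++k) r += popcount(words_[k]);
        const std::size_t rem = i & 63;
        if (rem != 0) {
            r += popcount(words_[full] & ((std::uint64_t{1} << rem) - 1));
        }
        return r;
    }

    // Raw view of the words, e.g. for writing them out to a file.
    const std::uint64_t* data() const { return words_.data(); }
};
```

A production version would typically replace the vector with an aligned or memory-mapped buffer and add a block-level popcount index so that rank no longer scans from the start.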


Industrial Comparison Table

| Library / Format | Compressed | Dynamic | SIMD | Serialization | Rank/Select | Hybrid Density | Notes |
|---|---|---|---|---|---|---|---|
| CRoaring | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | Large-scale sparse/mixed bitsets |
| BitMagic | ❌ (dense efficient, optional run/cluster) | ✔️ | ✔️ | ✔️ | Partial | ✔️ | Medium-scale dynamic filtering, complex operations |
| Boost Dynamic Bitset | ❌ | ✔️ | ❌ | ❌ | ❌ | ❌ | Small to medium dynamic dense sets |
| std::bitset / vector<bool> | ❌ | ❌ / partial | ❌ | ❌ | ❌ | ❌ | Fixed-size, small dense sets |
| WAH / EWAH / Concise | ✔️ | ❌ / limited | ❌ | ❌ | ❌ | ❌ | Legacy compressed formats |
| Custom Raw Memory | ❌ (manual) | ✔️ | ✔️ | Custom/mmap | Custom | Custom | Ultra high-performance, fully controlled memory |
