CRoaring

CRoaring is the C/C++ implementation of Roaring bitmaps, a compressed, dynamic bitmap format with SIMD-optimized operations. Roaring bitmaps are widely used in production systems such as ClickHouse, Druid, Lucene, and Pinot; the Java-based systems in that list use Java implementations rather than CRoaring itself. The format's core strength lies in efficiently handling sparse and mixed-density datasets while supporting dynamic updates (set / unset) and fast logical operations (AND / OR / XOR / NOT).

The library serializes bitmaps to a portable binary format, which enables cross-platform storage, fast cold-start initialization, and network transmission. Internally, values are grouped into blocks of 2^16 consecutive integers, and each block is stored in a container chosen to suit its density. The same structure supports rank and select queries, making the library suitable for large-scale analytical and real-time systems.
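
To make the block-based container idea concrete, here is a simplified model rather than CRoaring's actual code. It assumes the layout described for the Roaring format: the high 16 bits of a value select a block, the low 16 bits are the offset inside it, and a block with at most 4096 values is kept as a sorted array while denser blocks use a 65536-bit bitmap. The names ContainerKind, chunk_key, low_bits, choose_container, container_bytes, and chunk_histogram are hypothetical, and run containers are left out for brevity.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical names: a simplified model of the two-level Roaring layout.
enum class ContainerKind { Array, Bitmap };

// High 16 bits select the block (container); low 16 bits are the offset in it.
inline uint16_t chunk_key(uint32_t value) {
    return static_cast<uint16_t>(value >> 16);
}
inline uint16_t low_bits(uint32_t value) {
    return static_cast<uint16_t>(value & 0xFFFFu);
}

// Assumed threshold: up to 4096 offsets fit in a sorted array of 16-bit
// integers (at most 8 KiB); above that, a fixed 8 KiB bitmap is no larger.
constexpr std::size_t kArrayMaxCardinality = 4096;

inline ContainerKind choose_container(std::size_t cardinality) {
    return cardinality <= kArrayMaxCardinality ? ContainerKind::Array
                                               : ContainerKind::Bitmap;
}

// Approximate payload size in bytes for a block holding `cardinality` values.
inline std::size_t container_bytes(std::size_t cardinality) {
    return choose_container(cardinality) == ContainerKind::Array
               ? cardinality * sizeof(uint16_t)
               : 65536 / 8;
}

// Count how many values fall into each block (input values assumed distinct).
inline std::map<uint16_t, std::size_t> chunk_histogram(
    const std::vector<uint32_t>& values) {
    std::map<uint16_t, std::size_t> counts;
    for (uint32_t v : values) {
        ++counts[chunk_key(v)];
    }
    return counts;
}
```

Under this model, a block with 100 values costs about 200 bytes as an array, while a block with 5000 values costs a flat 8192 bytes as a bitmap. This is why a single bitmap can hold both sparse and dense regions cheaply.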

Key Features

  • Compressed storage: Efficiently handles sparse and dense regions within the same bitmap.
  • Dynamic operations: Supports set, unset, flip, and bulk operations.
  • SIMD acceleration: Optimized for modern CPUs for high-performance bitwise operations.
  • Serialization: Portable binary format for storage and transmission.
  • Rank/Select support: rank (count of values up to a given value) and select (value at a given position) queries, useful for counting and positional lookups.
  • Hybrid density handling: Automatically adapts internal containers based on density.

Typical Use Cases

  • High-performance filtering and analytics in OLAP systems
  • Large-scale event tracking or exposure logging
  • Sparse indexing in search engines or database engines
  • Bitset-based set operations in recommendation or CTR systems

C++ Example

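#include <roaring/roaring.hh>

#include <cstdint>
#include <iostream>
#include <vector>

// Recent CRoaring releases place the C++ wrapper in namespace `roaring`.
// Older releases declare Roaring globally; remove this line for those.
using namespace roaring;

int main() {
    // Create a Roaring bitmap
    Roaring r1;
    r1.add(1);
    r1.add(2);
    r1.add(1000);

    // Create another bitmap
    Roaring r2;
    r2.add(2);
    r2.add(3);
    r2.add(1000);

    // Union of two bitmaps: {1, 2, 3, 1000}
    Roaring r_union = r1 | r2;

    // Intersection: {2, 1000}
    Roaring r_intersection = r1 & r2;

    // Serialize to a buffer (portable format by default)
    std::vector<char> buffer(r_union.getSizeInBytes());
    r_union.write(buffer.data());

    // Deserialize from the buffer, which is trusted here because we just wrote it
    Roaring r_loaded = Roaring::read(buffer.data());

    // Iterate over set bits in ascending order
    std::cout << "Bits in union: ";
    for (uint32_t value : r_loaded) {
        std::cout << value << " ";
    }
    std::cout << "\n";

    std::cout << "Intersection size: " << r_intersection.cardinality() << "\n";

    return 0;
}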

Explanation:

  • Roaring::add sets individual bits.
  • Bitwise operations like | (union) and & (intersection) are provided.
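  • write and read enable serialization; read trusts the buffer it is given, so Roaring::readSafe, which also takes the buffer length, is the safer choice for untrusted input.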
  • The bitmap supports range-based iteration over set bits.