Skip to main content
implementedInspired by Diatom frustule (silica-shell microalgae)

Diatom Bitmap

Counterpart to Roaring Bitmap (alternative)

Research in progress

The data and performance claims on this page reflect ongoing research, implementation, and experimentation. They may change as benchmarks are rerun and formal papers are finalized.

Key Properties

  • Density-derived Domain Boundaries replace fixed 2^16 arithmetic partitioning
  • Cooperative Thermoregulation via Hot/Warm/Cold thermal FSM per container
  • Four container types: Array, Bitmap, RunLength, ColdCompressed (LZ4/Zstd)
  • Boundary drift under tick budget splits hot containers, merges cold ones
  • Binary serialization format v1 with round-trip fidelity
  • Snap-on adaptations: TemporalStrata, Summary, Symbiosis

Operation Complexity

Side-by-side comparison with the classical counterpart.

Operation complexity comparison
OperationDiatom BitmapRoaring Bitmap (alternative)
insertO(log c + log k)O(log c)
containsO(log c + log k)O(log c)
union (aligned)O(c)O(c)
intersectionO(c)O(c)
cardinalityO(1)O(c)
tick (maintenance)O(c)N/A
from_sortedO(n)O(n)
serializationO(n)O(n)

Interface Preview

diatom_bitmap.rs
rust
1pub struct DiatomBitmap { /* internal */ }
2
3impl DiatomBitmap {
4    // Construction
5    pub fn new(config: DiatomConfig) -> Self;
6    pub fn from_sorted(values: &[u32], config: DiatomConfig) -> Self;
7
8    // Membership
9    pub fn insert(&mut self, value: u32) -> bool;
10    pub fn contains(&self, value: u32) -> bool;
11    pub fn remove(&mut self, value: u32) -> bool;
12    pub fn insert_range(&mut self, start: u32, end: u32);
13    pub fn remove_range(&mut self, start: u32, end: u32);
14
15    // Cardinality & stats
16    pub fn cardinality(&self) -> u64;       // O(1), cached
17    pub fn is_empty(&self) -> bool;
18    pub fn min(&self) -> Option<u32>;
19    pub fn max(&self) -> Option<u32>;
20    pub fn container_count(&self) -> u32;
21    pub fn thermal_counts(&self) -> (u32, u32, u32); // (hot, warm, cold)
22
23    // Set operations
24    pub fn and(&self, other: &Self) -> Self; // intersection
25    pub fn or(&self, other: &Self) -> Self;  // union
26    pub fn xor(&self, other: &Self) -> Self; // symmetric difference
27    pub fn andnot(&self, other: &Self) -> Self; // difference
28    pub fn and_cardinality(&self, other: &Self) -> u64;
29    pub fn or_cardinality(&self, other: &Self) -> u64;
30
31    // Maintenance & introspection
32    pub fn tick(&mut self) -> MaintenanceReport;
33    pub fn clear(&mut self);
34    pub fn boundary_positions(&self) -> Vec<u32>;
35
36    // Iteration & serialization
37    pub fn iter(&self) -> DiatomIter;
38    pub fn to_bytes(&self) -> Vec<u8>;
39    pub fn from_bytes(data: &[u8], config: DiatomConfig) -> Result<Self, SerializeError>;
40}
41
diatom_bitmap.ts
typescript
1class DiatomBitmap {
2  constructor(config?: DiatomBitmapConfig);
3  async init(): Promise<this>;
4  static fromSorted(values: Uint32Array, config?: DiatomBitmapConfig): Promise<DiatomBitmap>;
5  static fromBytes(data: Uint8Array, config?: DiatomBitmapConfig): Promise<DiatomBitmap>;
6
7  // Membership
8  insert(value: number): boolean;
9  contains(value: number): boolean;
10  remove(value: number): boolean;
11  insertRange(start: number, end: number): void;
12  removeRange(start: number, end: number): void;
13
14  // Cardinality & stats
15  readonly cardinality: number;       // O(1)
16  readonly isEmpty: boolean;
17  readonly min: number | undefined;
18  readonly max: number | undefined;
19  readonly containerCount: number;
20  readonly tickGeneration: number;
21  readonly thermalCounts: { hot: number; warm: number; cold: number };
22
23  // Set operations
24  union(other: DiatomBitmap): DiatomBitmap;
25  intersection(other: DiatomBitmap): DiatomBitmap;
26  difference(other: DiatomBitmap): DiatomBitmap;
27  symmetricDifference(other: DiatomBitmap): DiatomBitmap;
28  andCardinality(other: DiatomBitmap): number;
29  orCardinality(other: DiatomBitmap): number;
30
31  // Analysis
32  rank(value: number): number;
33  select(k: number): number | undefined;
34  isSubset(other: DiatomBitmap): boolean;
35  isDisjoint(other: DiatomBitmap): boolean;
36  jaccardDistance(other: DiatomBitmap): number;
37
38  // Maintenance & serialization
39  tick(): void;
40  clear(): void;
41  toArray(): Uint32Array;
42  toBytes(): Uint8Array;
43  [Symbol.iterator](): Iterator<number>;
44}
45

Where This Matters

Apache Druid

Bitmap Index

Improvement comparison
TodayWith Diatom BitmapHow
Fixed 2^16 container boundariesData-derived boundariesHistogram valley detection for natural density splits
No workload adaptationThermal FSM per containerHot containers stay decompressed; cold containers compress
Uniform storage cost for old dataWorkload-dependent cold-data reductionLZ4/Zstd cold compression with reheating on active paths
Manual tuning requiredSelf-optimizing under tick cycleBoundary drift splits hot containers, merges cold ones

Interactive Simulation

Discussion

Questions, ideas, or experiences with Diatom Bitmap? Join the discussion.