Mutuus Team · 8 min read

Nacre Array: What If Your Vec Had a Spine?

Mother-of-pearl is 3,000 times tougher than the crystals it's made of. The secret is layered organization with flexible mortar between rigid segments. We built an array that works the same way.

Tags: organics, design-reveal, nacre-array, arrays

In our last post, we described the limitations of flat sequential storage: uniform treatment of hot and cold data, O(n) inserts and splits, no per-element metadata. This post introduces the Nacre Array, a segmented, self-compressing dynamic array inspired by the microstructure of mother-of-pearl.

Nacre, the iridescent lining inside mollusk shells, is one of the most studied materials in biomimetics. It's made of aragonite, a form of calcium carbonate. Aragonite by itself is brittle. But nacre is roughly 3,000 times more fracture-resistant than pure aragonite. The difference is architecture.

Nacre alternates thin, rigid crystalline layers with even thinner layers of organic polymer. The rigid layers handle compressive load. The flexible mortar absorbs and redirects crack energy. When a crack hits nacre, it can't propagate straight through. It's deflected along the mortar interfaces, dissipating energy across a much larger area. Hardness from the crystals. Toughness from the layers.

The insight we extracted: a structure that differentiates its regions based on function can outperform one that's uniformly optimized for a single property.

From Biology to Computation

The Nacre Array maps three biological features to computational properties: layered segments, self-describing headers, and natural fracture planes.

Aragonite layers become segments. Instead of one contiguous allocation, the Nacre Array stores elements in fixed-capacity segments (default: 4096 elements). Each segment is independent. It has its own thermal state, its own compression status, and its own summary statistics.

Organic mortar becomes element headers. Every element in a Nacre Array carries an ElementHeader, lightweight metadata recording creation time and access information. This is the self-describing property that Vec lacks. The structure knows something about each element, not just its value.

Natural fracture planes. In nacre, the mortar-crystal interfaces are natural crack-arrest points. In the Nacre Array, segment boundaries are natural split points. Splitting at a segment boundary is O(1): no element copying, just pointer reassignment. Compare this to Vec::split_at, which copies half the array.

The OrganicElement trait is the contract that elements must satisfy:

pub trait OrganicElement: Sized + Clone {
    /// Immutable access to the element's metadata.
    fn header(&self) -> &ElementHeader;
    /// Mutable access, e.g. for recording a read or write.
    fn header_mut(&mut self) -> &mut ElementHeader;
}

This is the minimal cost of self-description: each element carries a header. In exchange, the structure gains the ability to scan by metadata (scan_headers), track thermal state per segment, and make compression decisions based on actual access patterns rather than assumptions.
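
To make the contract concrete, here is a self-contained sketch of a type implementing it. The trait is redefined locally so the example compiles on its own; `ElementHeader`'s real fields are not published in this post, so `created_at_tick` and `access_count` are illustrative assumptions.

```rust
// Illustrative header; the real ElementHeader's fields may differ.
#[derive(Clone, Default, Debug)]
pub struct ElementHeader {
    pub created_at_tick: u64,
    pub access_count: u64,
}

// Redefined locally so the sketch is self-contained.
pub trait OrganicElement: Sized + Clone {
    fn header(&self) -> &ElementHeader;
    fn header_mut(&mut self) -> &mut ElementHeader;
}

// A hypothetical user type: one payload field plus the required header.
#[derive(Clone, Debug)]
struct Reading {
    header: ElementHeader,
    value: f64,
}

impl OrganicElement for Reading {
    fn header(&self) -> &ElementHeader { &self.header }
    fn header_mut(&mut self) -> &mut ElementHeader { &mut self.header }
}

fn main() {
    let mut r = Reading { header: ElementHeader::default(), value: 3.5 };
    // The structure (or the user) can record accesses in the header.
    r.header_mut().access_count += 1;
    assert_eq!(r.header().access_count, 1);
    assert_eq!(r.value, 3.5);
}
```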

The Thermal State Machine

Each segment transitions independently through four thermal states: Hot, Warm, Cold, and Compressed. The transitions are driven by a tick-based maintenance model, not wall-clock time.

When a segment is accessed, it becomes Hot. Left untouched, it transitions through Warm and Cold as maintenance ticks accumulate. The cooldown thresholds are configurable:

  • Hot: Recently accessed. Fast, mutable, uncompressed.
  • Warm: Cooling down. Still uncompressed, but approaching compression eligibility.
  • Cold: No recent access. Eligible for compression.
  • Compressed: Cold data encoded with LZ4 or Zstd. Read access triggers decompression back to Hot.
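
The transition logic above can be modeled as a small per-segment state machine. This is a toy reconstruction, not the crate's implementation: the state names and `CooldownConfig` fields mirror the post, while `ticks_in_state` and the exact threshold semantics are assumptions.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ThermalState { Hot, Warm, Cold, Compressed }

struct CooldownConfig { hot_to_warm: u32, warm_to_cold: u32, cold_to_compressed: u32 }

struct SegmentState {
    state: ThermalState,
    ticks_in_state: u32,
}

impl SegmentState {
    /// Advance one maintenance tick; returns true if the segment
    /// crossed into Compressed on this tick.
    fn tick(&mut self, cfg: &CooldownConfig) -> bool {
        self.ticks_in_state += 1;
        let (next, threshold) = match self.state {
            ThermalState::Hot => (ThermalState::Warm, cfg.hot_to_warm),
            ThermalState::Warm => (ThermalState::Cold, cfg.warm_to_cold),
            ThermalState::Cold => (ThermalState::Compressed, cfg.cold_to_compressed),
            ThermalState::Compressed => return false, // stays compressed until read
        };
        if self.ticks_in_state >= threshold {
            self.state = next;
            self.ticks_in_state = 0;
            return self.state == ThermalState::Compressed;
        }
        false
    }

    /// Any access pulls the segment back to Hot.
    fn touch(&mut self) {
        self.state = ThermalState::Hot;
        self.ticks_in_state = 0;
    }
}

fn main() {
    let cfg = CooldownConfig { hot_to_warm: 2, warm_to_cold: 2, cold_to_compressed: 1 };
    let mut seg = SegmentState { state: ThermalState::Hot, ticks_in_state: 0 };
    for _ in 0..4 { seg.tick(&cfg); }        // Hot -> Warm -> Cold
    assert_eq!(seg.state, ThermalState::Cold);
    assert!(seg.tick(&cfg));                 // fifth tick compresses
    seg.touch();                             // a read would decompress back to Hot
    assert_eq!(seg.state, ThermalState::Hot);
}
```

Because everything is driven by explicit ticks, the whole lifecycle is deterministic: the same sequence of operations and ticks always yields the same states.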

This is the property that addresses cold data bloat. A year-old event log doesn't need January taking the same memory as December. Segments that nobody reads compress. Segments that get queried stay hot. The array manages this lifecycle internally, without external compaction threads.

The maintenance model is tick-based rather than time-based. You call tick() when it makes sense for your workload: after every N operations, on a timer, or manually. The tick walks the segment list, advances state machine transitions, and compresses cold segments that have been cold long enough. The MaintenanceReport returned by each tick tells you exactly what happened: how many segments compressed, merged, or split.

This decoupling from wall-clock time matters for two reasons. First, it makes the behavior deterministic and testable. You can write a test that pushes 1,000 elements, ticks 50 times, and verifies exactly which segments compressed. Second, it makes the structure WASM-compatible. No background threads, no timers, no OS dependencies.

The cooldown configuration is exposed as a builder pattern:

let arr = NacreArray::new(4096)
    .with_cooldown(CooldownConfig {
        hot_to_warm: 10,
        warm_to_cold: 20,
        cold_to_compressed: 5,
    })
    .with_segment_cache(true);

Fracture Planes and O(1) Splitting

Segment boundaries are fracture planes. Splitting at a fracture plane costs O(1). The nacre metaphor pays its largest dividend here.

In Vec, splitting an array of 100,000 elements at the midpoint means copying 50,000 elements into a new allocation. In the Nacre Array, splitting at a segment boundary reassigns segment pointers. No elements are copied. The left array keeps its segments up to the split point. The right array takes the rest.
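
A minimal sketch shows why this is cheap: if the array is a list of segment buffers, splitting at a segment boundary only moves the segment handles, never the element payloads. `NacreSketch` below is a stand-in for illustration, not the real `NacreArray`.

```rust
// Toy segmented array: a Vec of segment buffers.
struct NacreSketch<T> {
    segments: Vec<Vec<T>>,
}

impl<T> NacreSketch<T> {
    /// Split at a fracture plane (a segment boundary).
    /// `seg_index` is the first segment that moves to the right half.
    fn split_at_segment(&mut self, seg_index: usize) -> NacreSketch<T> {
        // split_off on the OUTER Vec moves segment handles only;
        // the element buffers themselves are never copied.
        NacreSketch { segments: self.segments.split_off(seg_index) }
    }

    fn len(&self) -> usize {
        self.segments.iter().map(Vec::len).sum()
    }
}

fn main() {
    // Four segments of ten elements each: 0..10, 10..20, 20..30, 30..40.
    let mut arr = NacreSketch {
        segments: (0..4).map(|s| ((s * 10)..(s * 10 + 10)).collect()).collect(),
    };
    let right = arr.split_at_segment(2); // fracture plane at element 20
    assert_eq!(arr.len(), 20);
    assert_eq!(right.len(), 20);
    assert_eq!(right.segments[0][0], 20);
}
```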

You can ask the structure where its fracture planes are:

let planes = arr.fracture_planes(); // e.g., [512, 1024, 1536, ...]
let nearest = arr.nearest_fracture(750); // -> 512 or 1024
let right = arr.split_at(nearest); // O(1)

This is not a minor optimization. For systems that partition data by time window, key range, or workload (event logs, time-series databases, stream processors), the ability to split a collection without copying changes what's architecturally possible.

What It Costs

Self-description is not free. Element headers, segment metadata, and the cumulative length index all add overhead. We measure the full cost.

Lookup overhead. Vec::get is a single pointer offset, O(1) in the truest sense. NacreArray::get resolves which segment contains the target index, then dereferences into that segment. We use a cumulative length index with binary search, giving O(log S) where S is the number of segments. A segment locality cache can help sequential and localized access patterns, but it hurts pure random access.
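
The index-resolution step can be sketched with a cumulative length array and `partition_point`. Here `cumlens[i]` holds the total element count through segment `i`; the field name and exact layout are assumptions about the described technique, not the crate's internals.

```rust
/// Resolve a logical index to (segment, offset-within-segment)
/// using binary search over cumulative lengths: O(log S).
fn locate(cumlens: &[usize], index: usize) -> Option<(usize, usize)> {
    let total = *cumlens.last()?;
    if index >= total {
        return None; // out of bounds
    }
    // First segment whose cumulative length exceeds `index`.
    let seg = cumlens.partition_point(|&c| c <= index);
    let base = if seg == 0 { 0 } else { cumlens[seg - 1] };
    Some((seg, index - base))
}

fn main() {
    // Three segments of lengths 4096, 4096, 1000.
    let cumlens = [4096, 8192, 9192];
    assert_eq!(locate(&cumlens, 0), Some((0, 0)));
    assert_eq!(locate(&cumlens, 4096), Some((1, 0)));
    assert_eq!(locate(&cumlens, 9000), Some((2, 808)));
    assert_eq!(locate(&cumlens, 9192), None);
}
```

The binary search is the per-access tax relative to `Vec`'s single pointer offset; a locality cache can skip it when consecutive accesses hit the same segment.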

On the corrected benchmark harness, at 100,000 elements with the current 4096-element default, pure random lookup is about 8.1x slower than Vec with the cache disabled (8.33 ns versus 1.03 ns). That is a real tax. The return is not raw read speed. The return is structural behavior: splitting without copying payloads, segment-level scans, and thermal lifecycle management.

Memory overhead. Each element carries an ElementHeader. Each segment carries state metadata and a summary. The cumulative length index adds one usize per segment. This overhead is fixed per element/segment, not per access.

Maintenance overhead. The tick() call walks the segment list, O(segments) not O(elements). For 100,000 elements with the current 4096-element default, that's about 25 segments to visit. In practice, most segments need no action on any given tick.

The implementation story still matters, but the benchmark story had to be corrected:

  1. Linear segment scan is gone. Lookup now uses a cumulative length index with binary search.
  2. The segment cache is workload-sensitive. It helps sequential and localized patterns, but hurts pure random access.
  3. Insert/remove now maintain cumulative suffix metadata incrementally, and split only renormalizes the moved right-half cumulative metadata. That means the structural wins are real, but the design still pays metadata work that a flat Vec never has to carry.

The older 15.8x -> 2.55x optimization narrative came from the pre-fix harness and is no longer the number we should publish. The correct publication number is the current one: 8.33 ns random lookup for Nacre (cache off) versus 1.03 ns for Vec at 100K.

Where It Wins

Three operations where segmented architecture dominates: insert, split, and metadata-only scanning. Full benchmark numbers are coming in a dedicated post.

Insert at position. Vec shifts n - k elements. Nacre Array shifts within one segment, then updates cumulative suffix metadata. At 100,000 elements, inserting in the middle is about 8.3x faster for Nacre: 31.6 microseconds versus 263.2 microseconds.
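
The shape of that insert can be sketched as: locate the owning segment, shift within it, then bump the cumulative suffix. This toy version updates the suffix with a linear walk for clarity; the names are illustrative, not the crate's API.

```rust
/// Insert `value` at logical position `index` in a toy segmented array.
/// Only one segment's elements shift; the cumulative index is patched after.
fn insert_at(segments: &mut Vec<Vec<i64>>, cumlens: &mut Vec<usize>, index: usize, value: i64) {
    // Owning segment via binary search (clamped so index == len appends).
    let seg = cumlens.partition_point(|&c| c <= index).min(segments.len() - 1);
    let base = if seg == 0 { 0 } else { cumlens[seg - 1] };
    segments[seg].insert(index - base, value); // shift within ONE segment only
    for c in &mut cumlens[seg..] {
        *c += 1; // every suffix cumulative length grows by one
    }
}

fn main() {
    let mut segments = vec![vec![0, 1, 2], vec![3, 4, 5]];
    let mut cumlens = vec![3, 6];
    insert_at(&mut segments, &mut cumlens, 4, 99);
    assert_eq!(segments[0], vec![0, 1, 2]);      // untouched
    assert_eq!(segments[1], vec![3, 99, 4, 5]);  // only this segment shifted
    assert_eq!(cumlens, vec![3, 7]);
}
```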

Split at fracture. Against the corrected baseline (Vec::split_off), splitting 100,000 elements at a fracture plane is 22.6 microseconds for Nacre versus 532.1 microseconds for Vec. That is a real 23.6x win, not a benchmark artifact.

Scan by metadata. scan_headers in the current implementation still traverses live elements to inspect their headers, so it is near parity with the equivalent Vec collect. scan_segments is the truly structural capability: querying segment summaries without touching payloads.
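
A sketch of what a segment-level scan buys: if each segment keeps a summary, a range query can prune whole segments without touching a single payload. The summary fields here (`min`, `max`, `len`) are assumptions for illustration; the post does not enumerate what the real summaries contain.

```rust
// Hypothetical per-segment summary.
#[derive(Debug)]
struct SegmentSummary { min: i64, max: i64, len: usize }

/// Return indices of segments that could contain values in [lo, hi],
/// consulting summaries only: O(segments), zero payload reads.
fn candidate_segments(summaries: &[SegmentSummary], lo: i64, hi: i64) -> Vec<usize> {
    summaries
        .iter()
        .enumerate()
        .filter(|(_, s)| s.max >= lo && s.min <= hi)
        .map(|(i, _)| i)
        .collect()
}

fn main() {
    let summaries = vec![
        SegmentSummary { min: 0, max: 99, len: 4096 },
        SegmentSummary { min: 100, max: 199, len: 4096 },
        SegmentSummary { min: 200, max: 299, len: 1000 },
    ];
    // Only the middle segment can hold values in [120, 180].
    assert_eq!(candidate_segments(&summaries, 120, 180), vec![1]);
}
```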

Using It

The Nacre Array is available in Rust (mutuus-nacre crate) and TypeScript via WASM (@mutuus/core).

In Rust, any type implementing OrganicElement can be stored. In TypeScript via WASM, values are number (f64). The WASM bridge uses type erasure with a concrete WasmElement wrapping f64 and an ElementHeader.

import { initWasm, NacreArray } from '@mutuus/core';

await initWasm();
const arr = await new NacreArray({ segmentCapacity: 256 }).init();
arr.push(42);
arr.push(1337);
console.log(arr.length); // 2
console.log(arr.get(0)); // 42

// After many elements and ticks...
const report = arr.tick();
console.log(report.segmentsCompressed); // segments that crossed into Compressed

The full API surface, including fracture planes, cooldown configuration, and compression statistics, is documented in the Library.

What's Next

A dedicated benchmark post with the full numbers: push, get (random, sequential, Zipfian, localized), insert, split, iteration, and scanning. Including the operations where Vec wins.

We've been running nine Criterion benchmark groups across multiple workload profiles and sizes. Vec is still faster for raw random access. Iteration favors Vec by a moderate margin. But for insert, split, and scanning workloads, the segmented architecture opens a gap that widens with scale.

The biology suggested the design. The benchmarks are validating which parts of that design pay for themselves and which remain overhead. We'll show all of it.
