Mutuus Team · 8 min read

Nacre Array: What If Your Vec Had a Spine?

Mother-of-pearl is 3,000 times tougher than the crystals it's made of. The secret is layered organization with flexible mortar between rigid segments. We built an array that works the same way.

Tags: organics, design-reveal, nacre-array, arrays

In our last post, we described the limitations of flat sequential storage: uniform treatment of hot and cold data, O(n) inserts and splits, no per-element metadata. This post introduces the Nacre Array, a segmented, self-compressing dynamic array inspired by the microstructure of mother-of-pearl.

Nacre, the iridescent lining inside mollusk shells, is one of the most studied materials in biomimetics. It's made of aragonite, a form of calcium carbonate. Aragonite by itself is brittle. But nacre is roughly 3,000 times more fracture-resistant than pure aragonite. The difference is architecture.

Nacre alternates thin, rigid crystalline layers with even thinner layers of organic polymer. The rigid layers handle compressive load. The flexible mortar absorbs and redirects crack energy. When a crack hits nacre, it can't propagate straight through. It's deflected along the mortar interfaces, dissipating energy across a much larger area. Hardness from the crystals. Toughness from the layers.

The insight we extracted: a structure that differentiates its regions based on function can outperform one that's uniformly optimized for a single property.

From Biology to Computation

The Nacre Array maps three biological features to computational properties: layered segments, self-describing headers, and natural fracture planes.

Aragonite layers become segments. Instead of one contiguous allocation, the Nacre Array stores elements in fixed-capacity segments (default: 4096 elements). Each segment is independent. It has its own thermal state, its own compression status, and its own summary statistics.

Organic mortar becomes element headers. Every element in a Nacre Array carries an ElementHeader, lightweight metadata recording creation time and access information. This is the self-describing property that Vec lacks. The structure knows something about each element, not just its value.

Natural fracture planes. In nacre, the mortar-crystal interfaces are natural crack-arrest points. In the Nacre Array, segment boundaries are natural split points. Splitting at a segment boundary is O(1): no element copying, just pointer reassignment. Compare this to Vec::split_at, which copies half the array.

The OrganicElement trait is the contract that elements must satisfy:

pub trait OrganicElement: Sized + Clone {
    /// Immutable access to the element's metadata.
    fn header(&self) -> &ElementHeader;
    /// Mutable access, e.g. for recording a read or write.
    fn header_mut(&mut self) -> &mut ElementHeader;
}

This is the minimal cost of self-description: each element carries a header. In exchange, the structure gains the ability to scan by metadata (scan_headers), track thermal state per segment, and make compression decisions based on actual access patterns rather than assumptions.
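
To make the contract concrete, here is a self-contained sketch of a type implementing it. The trait is redefined locally so the example compiles on its own; `ElementHeader`'s real fields are not published in this post, so `created_at_tick` and `access_count` are illustrative assumptions.

```rust
// Illustrative header; the real ElementHeader's fields may differ.
#[derive(Clone, Default, Debug)]
pub struct ElementHeader {
    pub created_at_tick: u64,
    pub access_count: u64,
}

// Redefined locally so the sketch is self-contained.
pub trait OrganicElement: Sized + Clone {
    fn header(&self) -> &ElementHeader;
    fn header_mut(&mut self) -> &mut ElementHeader;
}

// A hypothetical user type: one payload field plus the required header.
#[derive(Clone, Debug)]
struct Reading {
    header: ElementHeader,
    value: f64,
}

impl OrganicElement for Reading {
    fn header(&self) -> &ElementHeader { &self.header }
    fn header_mut(&mut self) -> &mut ElementHeader { &mut self.header }
}

fn main() {
    let mut r = Reading { header: ElementHeader::default(), value: 3.5 };
    // The structure (or the user) can record accesses in the header.
    r.header_mut().access_count += 1;
    assert_eq!(r.header().access_count, 1);
    assert_eq!(r.value, 3.5);
}
```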

The Thermal State Machine

Each segment transitions independently through four thermal states: Hot, Warm, Cold, and Compressed. The transitions are driven by a tick-based maintenance model, not wall-clock time.

When a segment is accessed, it becomes Hot. Left untouched, it transitions through Warm and Cold as maintenance ticks accumulate. The cooldown thresholds are configurable:

  • Hot: Recently accessed. Fast, mutable, uncompressed.
  • Warm: Cooling down. Still uncompressed, but approaching compression eligibility.
  • Cold: No recent access. Eligible for compression.
  • Compressed: Cold data encoded with LZ4 or Zstd. Read access triggers decompression back to Hot.
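
The transition logic above can be modeled as a small per-segment state machine. This is a toy reconstruction, not the crate's implementation: the state names and `CooldownConfig` fields mirror the post, while `ticks_in_state` and the exact threshold semantics are assumptions.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ThermalState { Hot, Warm, Cold, Compressed }

struct CooldownConfig { hot_to_warm: u32, warm_to_cold: u32, cold_to_compressed: u32 }

struct SegmentState {
    state: ThermalState,
    ticks_in_state: u32,
}

impl SegmentState {
    /// Advance one maintenance tick; returns true if the segment
    /// crossed into Compressed on this tick.
    fn tick(&mut self, cfg: &CooldownConfig) -> bool {
        self.ticks_in_state += 1;
        let (next, threshold) = match self.state {
            ThermalState::Hot => (ThermalState::Warm, cfg.hot_to_warm),
            ThermalState::Warm => (ThermalState::Cold, cfg.warm_to_cold),
            ThermalState::Cold => (ThermalState::Compressed, cfg.cold_to_compressed),
            ThermalState::Compressed => return false, // stays compressed until read
        };
        if self.ticks_in_state >= threshold {
            self.state = next;
            self.ticks_in_state = 0;
            return self.state == ThermalState::Compressed;
        }
        false
    }

    /// Any access pulls the segment back to Hot.
    fn touch(&mut self) {
        self.state = ThermalState::Hot;
        self.ticks_in_state = 0;
    }
}

fn main() {
    let cfg = CooldownConfig { hot_to_warm: 2, warm_to_cold: 2, cold_to_compressed: 1 };
    let mut seg = SegmentState { state: ThermalState::Hot, ticks_in_state: 0 };
    for _ in 0..4 { seg.tick(&cfg); }        // Hot -> Warm -> Cold
    assert_eq!(seg.state, ThermalState::Cold);
    assert!(seg.tick(&cfg));                 // fifth tick compresses
    seg.touch();                             // a read would decompress back to Hot
    assert_eq!(seg.state, ThermalState::Hot);
}
```

Because everything is driven by explicit ticks, the whole lifecycle is deterministic: the same sequence of operations and ticks always yields the same states.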

This is the property that addresses cold data bloat. A year-old event log doesn't need January taking the same memory as December. Segments that nobody reads compress. Segments that get queried stay hot. The array manages this lifecycle internally, without external compaction threads.

The maintenance model is tick-based rather than time-based. You call tick() when it makes sense for your workload: after every N operations, on a timer, or manually. The tick walks the segment list, advances state machine transitions, and compresses cold segments that have been cold long enough. The MaintenanceReport returned by each tick tells you exactly what happened: how many segments compressed, merged, or split.

This decoupling from wall-clock time matters for two reasons. First, it makes the behavior deterministic and testable. You can write a test that pushes 1,000 elements, ticks 50 times, and verifies exactly which segments compressed. Second, it makes the structure WASM-compatible. No background threads, no timers, no OS dependencies.

The cooldown configuration is exposed as a builder pattern:

let arr = NacreArray::new(4096)
    .with_cooldown(CooldownConfig {
        hot_to_warm: 10,
        warm_to_cold: 20,
        cold_to_compressed: 5,
    })
    .with_segment_cache(true);

Fracture Planes and O(1) Splitting

Segment boundaries are fracture planes. Splitting at a fracture plane costs O(1). The nacre metaphor pays its largest dividend here.

In Vec, splitting an array of 100,000 elements at the midpoint means copying 50,000 elements into a new allocation. In the Nacre Array, splitting at a segment boundary reassigns segment pointers. No elements are copied. The left array keeps its segments up to the split point. The right array takes the rest.
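
A minimal sketch shows why this is cheap: if the array is a list of segment buffers, splitting at a segment boundary only moves the segment handles, never the element payloads. `NacreSketch` below is a stand-in for illustration, not the real `NacreArray`.

```rust
// Toy segmented array: a Vec of segment buffers.
struct NacreSketch<T> {
    segments: Vec<Vec<T>>,
}

impl<T> NacreSketch<T> {
    /// Split at a fracture plane (a segment boundary).
    /// `seg_index` is the first segment that moves to the right half.
    fn split_at_segment(&mut self, seg_index: usize) -> NacreSketch<T> {
        // split_off on the OUTER Vec moves segment handles only;
        // the element buffers themselves are never copied.
        NacreSketch { segments: self.segments.split_off(seg_index) }
    }

    fn len(&self) -> usize {
        self.segments.iter().map(Vec::len).sum()
    }
}

fn main() {
    // Four segments of ten elements each: 0..10, 10..20, 20..30, 30..40.
    let mut arr = NacreSketch {
        segments: (0..4).map(|s| ((s * 10)..(s * 10 + 10)).collect()).collect(),
    };
    let right = arr.split_at_segment(2); // fracture plane at element 20
    assert_eq!(arr.len(), 20);
    assert_eq!(right.len(), 20);
    assert_eq!(right.segments[0][0], 20);
}
```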

You can ask the structure where its fracture planes are:

let planes = arr.fracture_planes(); // e.g., [512, 1024, 1536, ...]
let nearest = arr.nearest_fracture(750); // -> 512 or 1024
let right = arr.split_at(nearest); // O(1)

This is not a minor optimization. For systems that partition data by time window, key range, or workload (event logs, time-series databases, stream processors), the ability to split a collection without copying changes what's architecturally possible.

What It Costs

Self-description is not free. Element headers, segment metadata, and the cumulative length index all add overhead. We measure the full cost.

Lookup overhead. Vec::get is a single pointer offset, O(1) in the truest sense. NacreArray::get resolves which segment contains the target index, then dereferences into that segment. We use a cumulative length index with binary search, giving O(log S) where S is the number of segments. A segment locality cache can help sequential and localized access patterns, but it hurts pure random access.
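
The index-resolution step can be sketched with a cumulative length array and `partition_point`. Here `cumlens[i]` holds the total element count through segment `i`; the field name and exact layout are assumptions about the described technique, not the crate's internals.

```rust
/// Resolve a logical index to (segment, offset-within-segment)
/// using binary search over cumulative lengths: O(log S).
fn locate(cumlens: &[usize], index: usize) -> Option<(usize, usize)> {
    let total = *cumlens.last()?;
    if index >= total {
        return None; // out of bounds
    }
    // First segment whose cumulative length exceeds `index`.
    let seg = cumlens.partition_point(|&c| c <= index);
    let base = if seg == 0 { 0 } else { cumlens[seg - 1] };
    Some((seg, index - base))
}

fn main() {
    // Three segments of lengths 4096, 4096, 1000.
    let cumlens = [4096, 8192, 9192];
    assert_eq!(locate(&cumlens, 0), Some((0, 0)));
    assert_eq!(locate(&cumlens, 4096), Some((1, 0)));
    assert_eq!(locate(&cumlens, 9000), Some((2, 808)));
    assert_eq!(locate(&cumlens, 9192), None);
}
```

The binary search is the per-access tax relative to `Vec`'s single pointer offset; a locality cache can skip it when consecutive accesses hit the same segment.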

On the corrected benchmark harness, at 100,000 elements with the current 4096-element default, pure random lookup is about 8.1x slower than Vec with the cache disabled (8.33 ns versus 1.03 ns). That is a real tax. The return is not raw read speed. The return is structural behavior: splitting without copying payloads, segment-level scans, and thermal lifecycle management.

Memory overhead. Each element carries an ElementHeader. Each segment carries state metadata and a summary. The cumulative length index adds one usize per segment. This overhead is fixed per element/segment, not per access.

Maintenance overhead. The tick() call walks the segment list, O(segments) not O(elements). For 100,000 elements with the current 4096-element default, that's about 25 segments to visit. In practice, most segments need no action on any given tick.

The implementation story still matters, but the benchmark story had to be corrected:

  1. Linear segment scan is gone. Lookup now uses a cumulative length index with binary search.
  2. The segment cache is workload-sensitive. It helps sequential and localized patterns, but hurts pure random access.
  3. Insert/remove now maintain cumulative suffix metadata incrementally, and split only renormalizes the moved right-half cumulative metadata. That means the structural wins are real, but the design still pays metadata work that a flat Vec never has to carry.

The older 15.8x -> 2.55x optimization narrative came from the pre-fix harness and is no longer the number we should publish. The correct publication number is the current one: 8.33 ns random lookup for Nacre (cache off) versus 1.03 ns for Vec at 100K.

Where It Wins

Three operations where segmented architecture dominates: insert, split, and metadata-only scanning. Full benchmark numbers are coming in a dedicated post.

Insert at position. Vec shifts n - k elements. Nacre Array shifts within one segment, then updates cumulative suffix metadata. At 100,000 elements, inserting in the middle is about 8.3x faster for Nacre: 31.6 microseconds versus 263.2 microseconds.
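
The shape of that insert can be sketched as: locate the owning segment, shift within it, then bump the cumulative suffix. This toy version updates the suffix with a linear walk for clarity; the names are illustrative, not the crate's API.

```rust
/// Insert `value` at logical position `index` in a toy segmented array.
/// Only one segment's elements shift; the cumulative index is patched after.
fn insert_at(segments: &mut Vec<Vec<i64>>, cumlens: &mut Vec<usize>, index: usize, value: i64) {
    // Owning segment via binary search (clamped so index == len appends).
    let seg = cumlens.partition_point(|&c| c <= index).min(segments.len() - 1);
    let base = if seg == 0 { 0 } else { cumlens[seg - 1] };
    segments[seg].insert(index - base, value); // shift within ONE segment only
    for c in &mut cumlens[seg..] {
        *c += 1; // every suffix cumulative length grows by one
    }
}

fn main() {
    let mut segments = vec![vec![0, 1, 2], vec![3, 4, 5]];
    let mut cumlens = vec![3, 6];
    insert_at(&mut segments, &mut cumlens, 4, 99);
    assert_eq!(segments[0], vec![0, 1, 2]);      // untouched
    assert_eq!(segments[1], vec![3, 99, 4, 5]);  // only this segment shifted
    assert_eq!(cumlens, vec![3, 7]);
}
```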

Split at fracture. Against the corrected baseline (Vec::split_off), splitting 100,000 elements at a fracture plane is 22.6 microseconds for Nacre versus 532.1 microseconds for Vec. That is a real 23.6x win, not a benchmark artifact.

Scan by metadata. scan_headers in the current implementation still traverses live elements to inspect their headers, so it is near parity with the equivalent Vec collect. scan_segments is the truly structural capability: querying segment summaries without touching payloads.
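
A sketch of what a segment-level scan buys: if each segment keeps a summary, a range query can prune whole segments without touching a single payload. The summary fields here (`min`, `max`, `len`) are assumptions for illustration; the post does not enumerate what the real summaries contain.

```rust
// Hypothetical per-segment summary.
#[derive(Debug)]
struct SegmentSummary { min: i64, max: i64, len: usize }

/// Return indices of segments that could contain values in [lo, hi],
/// consulting summaries only: O(segments), zero payload reads.
fn candidate_segments(summaries: &[SegmentSummary], lo: i64, hi: i64) -> Vec<usize> {
    summaries
        .iter()
        .enumerate()
        .filter(|(_, s)| s.max >= lo && s.min <= hi)
        .map(|(i, _)| i)
        .collect()
}

fn main() {
    let summaries = vec![
        SegmentSummary { min: 0, max: 99, len: 4096 },
        SegmentSummary { min: 100, max: 199, len: 4096 },
        SegmentSummary { min: 200, max: 299, len: 1000 },
    ];
    // Only the middle segment can hold values in [120, 180].
    assert_eq!(candidate_segments(&summaries, 120, 180), vec![1]);
}
```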

Using It

The Nacre Array is available in Rust (mutuus-nacre crate) and TypeScript via WASM (@mutuus/core).

In Rust, any type implementing OrganicElement can be stored. In TypeScript via WASM, values are number (f64). The WASM bridge uses type erasure with a concrete WasmElement wrapping f64 and an ElementHeader.

import { initWasm, NacreArray } from '@mutuus/core';

await initWasm();
const arr = await new NacreArray({ segmentCapacity: 256 }).init();
arr.push(42);
arr.push(1337);
console.log(arr.length); // 2
console.log(arr.get(0)); // 42

// After many elements and ticks...
const report = arr.tick();
console.log(report.segmentsCompressed); // segments that crossed into Compressed

The full API surface, including fracture planes, cooldown configuration, and compression statistics, is documented in the Library.

What's Next

A dedicated benchmark post with the full numbers: push, get (random, sequential, Zipfian, localized), insert, split, iteration, and scanning. Including the operations where Vec wins.

We've been running nine Criterion benchmark groups across multiple workload profiles and sizes. Vec is still faster for raw random access. Iteration favors Vec by a moderate margin. But for insert, split, and scanning workloads, the segmented architecture opens a gap that widens with scale.

The biology suggested the design. The benchmarks are validating which parts of that design pay for themselves and which remain overhead. We'll show all of it.
