SABLE: Selectivity-Aware Bi-Level Engine

The vector database that makes filtered queries the fast path — not the exception.

SABLE is the engine underneath VectorAmp: a filter-native, billion-scale vector index that treats the WHERE clause as a first-class part of the query plan. Built in Rust with AVX-512 kernels, and validated to Recall@10 ≥ 0.95 on a single 1B-vector node.

Filter-native query planning · Per-list micrographs · Billion-scale on commodity hardware · AVX-512 SIMD kernels

02 — Benchmarks

Numbers from the SABLE whitepaper.

Single-node results on stock AWS instances. Full methodology, query workloads, and ablations live in the whitepaper.

Filtered p95 vs DiskANN
28–101× faster
100M vectors, 10% selectivity · R@10 ≥ 0.96 maintained
Billion-scale recall
≥ 0.95 R@10
1B vectors, single node · commodity AWS hardware
RAM vs HNSW
300× smaller
1M @ d=1536: 80 MB vs 25 GB · billions paged from NVMe
Unfiltered p95 (1M)
5.1 ms
R@10 = 0.941 · m5.2xlarge, d=1536
03 — Why SABLE

Generic vector stores were retrofitted. SABLE wasn't.

Four design choices made early — and held to — that compound at scale.

[ 01 ]

Filter-native, not filter-aware.

Most vector stores treat metadata filters as a post-processing step that throws away precomputed work. SABLE plans the filter and the ANN graph traversal as a single query — selectivity-aware, with shard-local statistics. Selective filters get faster as your corpus grows, not slower.

Worst case: Equal to unfiltered · Common case: Less wasted work than post-filter
[ 02 ]

Shard-aware at the engine layer.

Sharding is a first-class concept in SABLE — not a wrapper around per-shard search. The query coordinator routes by metadata keys you specify, prunes shards by filter, and merges with bounded recall guarantees. Add a shard, capacity scales linearly.

Vectors per shard: Workload-sized · Pruning: Metadata + zone-map
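
The pruning and merge steps described above can be sketched in a few lines. This is a minimal illustration, not SABLE's coordinator: the `Shard` class, the `year_min`/`year_max` zone-map fields, and the `search` helper are hypothetical names, and each shard's `hits` list stands in for shard-local results that already satisfy the filter.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Shard:
    name: str
    year_min: int  # zone map: smallest "year" value stored in this shard
    year_max: int  # zone map: largest "year" value stored in this shard
    hits: list     # (distance, doc_id) pairs, sorted by distance (stand-in data)


def may_match(shard, lo, hi):
    """A shard is skipped when its [min, max] range cannot overlap the filter range."""
    return shard.year_max >= lo and shard.year_min <= hi


def search(shards, lo, hi, k):
    survivors = [s for s in shards if may_match(s, lo, hi)]
    # Each surviving shard contributes its local top-k; keep the k smallest distances.
    merged = heapq.nsmallest(k, (hit for s in survivors for hit in s.hits[:k]))
    return [s.name for s in survivors], merged


shards = [
    Shard("s0", 2018, 2021, [(0.10, "a"), (0.40, "b")]),
    Shard("s1", 2022, 2024, [(0.20, "c"), (0.30, "d")]),
    Shard("s2", 2025, 2026, [(0.15, "e"), (0.50, "f")]),
]
searched, top = search(shards, lo=2022, hi=2026, k=3)
print(searched)  # ['s1', 's2']
print(top)       # [(0.15, 'e'), (0.2, 'c'), (0.3, 'd')]
```

Shard `s0` is never queried because its zone map ends before the filter range begins, even though it holds the closest vector overall.
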
[ 03 ]

Tuned end-to-end with real workloads.

SABLE is tuned for the retrieval patterns VectorAmp itself runs on: filtered semantic search, RAG with citations, recommendation, and deduplication. Practical defaults for product workloads, not synthetic-benchmark theater.

Index core: OPQ · IVF-PQ · per-list micrographs · Defaults: SABLE by default in every dataset
[ 04 ]

Storage that respects your bill.

Hot paths should stay fast, warm data should be compact, and cold data should not dominate your bill. SABLE is designed around explicit storage tiers and predictable read amplification.

Compression: PQ-aware options · Cold-tier reads: Cache-aware paths
04 — Architecture

The engine, top to bottom.

SABLE is a compact cold-tier vector index with filter-native planning built into the storage layout, not bolted on after retrieval.

Query planner (Layer 04 · Selectivity)
  • Planner: Per-list selectivity
  • Mode: Linear ADC or graph beam
  • Filters: Equality + numeric ranges
  • Merge: Top-k candidate heap

SABLE index (Layer 03 · ANN)
  • OPQ: Rotated vectors
  • IVF-PQ: Compressed residual codes
  • Micrographs: Per-list NSG-like traversal
  • SIMD: AVX-512 · NEON paths

Filter metadata (Layer 02 · Pre-gating)
  • Bitmaps: Per-list categorical filters
  • Sketches: Numeric range pruning
  • Stats: Cardinality estimates
  • Planner input: Choose scan vs graph

Persistence (Layer 01 · Durable)
  • L0: Mutable write buffer
  • L1: Disk-backed SABLE index
  • DLST: Memory-mapped PQ codes
  • WAL: Durable inserts + deletes

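
For readers unfamiliar with the "Linear ADC" mode named in the planner layer, the following toy sketch shows asymmetric distance computation (ADC) over product-quantized codes. It is a generic textbook illustration under stated assumptions, not SABLE's kernel: the codebooks are random rather than trained, the sizes are tiny, and the OPQ rotation, IVF residuals, and SIMD paths are all omitted. Every name in it is hypothetical.

```python
import random

random.seed(0)
D, M, K = 8, 4, 16   # vector dim, subspaces, centroids per subspace (toy sizes)
SUB = D // M         # dimensions per subspace

# Toy codebooks: codebooks[m][k] is a SUB-dimensional centroid.
# Real codebooks are trained (e.g. with k-means); random ones are an assumption here.
codebooks = [[[random.gauss(0, 1) for _ in range(SUB)] for _ in range(K)]
             for _ in range(M)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(vec):
    """Replace each sub-vector with the index of its nearest centroid."""
    return [min(range(K),
                key=lambda k: sq_dist(vec[m * SUB:(m + 1) * SUB], codebooks[m][k]))
            for m in range(M)]


def decode(code):
    """Reconstruct the approximate vector a code stands for."""
    return [x for m, k in enumerate(code) for x in codebooks[m][k]]


def build_lut(query):
    """lut[m][k] = squared distance from the query's m-th sub-vector to centroid k."""
    return [[sq_dist(query[m * SUB:(m + 1) * SUB], codebooks[m][k]) for k in range(K)]
            for m in range(M)]


def adc_distance(lut, code):
    """Asymmetric distance: M table lookups and adds, no decompression."""
    return sum(lut[m][k] for m, k in enumerate(code))


vectors = [[random.gauss(0, 1) for _ in range(D)] for _ in range(100)]
codes = [encode(v) for v in vectors]
query = [random.gauss(0, 1) for _ in range(D)]

lut = build_lut(query)  # built once per query, reused for every code scanned
ranked = sorted(range(len(codes)), key=lambda i: adc_distance(lut, codes[i]))
print(ranked[:5])       # ids of the five closest codes under ADC
```

The lookup-table distance equals the exact squared distance between the uncompressed query and the decoded code, which is why a linear scan over compressed codes can stay cheap.
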
05 — SDK

Search a dataset with the SDKs.

Python and TypeScript clients use the same dataset-scoped search API as the platform. Pass text or vectors, top-k, metadata filters, and advanced range filters directly.

search.py · VectorAmp · Python SDK

# filtered semantic search
from vectoramp import VectorAmp

client = VectorAmp(api_key="va_...")
dataset = client.datasets.get("dataset_id")

results = dataset.search(
    text="migration plan risks",
    top_k=20,
    filters={"team": "platform", "region": "eu"},
    advanced_filters=[{"field": "year", "op": "gte", "value": 2025}],
    include_documents=True,
)

search.ts · VectorAmp · TS SDK

// same query, TypeScript
import { VectorAmp } from "@vectoramp/vectoramp";

const client = new VectorAmp({ apiKey: process.env.VECTORAMP_API_KEY });
const dataset = await client.datasets.get("dataset_id");

const results = await dataset.search({
  queryText: "migration plan risks",
  topK: 20,
  includeMetadata: true,
  filter: { team: "platform", region: "eu" }
});

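
To make the filter payload from the Python example concrete, here is a local sketch of how such a predicate could be evaluated against a record's metadata. It is not the SDK or the engine: `matches` is a hypothetical helper, combining the equality filters and range clauses with AND is an assumption, and only the `gte` operator appears in the example above (the other operator names are guesses for illustration).

```python
import operator

# Only "gte" appears in the SDK example; the other operator names are assumed.
OPS = {"gte": operator.ge, "gt": operator.gt, "lte": operator.le, "lt": operator.lt}


def matches(metadata, filters, advanced_filters):
    """True when a record passes every equality filter and every range clause."""
    for field, expected in filters.items():
        if metadata.get(field) != expected:
            return False
    for clause in advanced_filters:
        value = metadata.get(clause["field"])
        if value is None or not OPS[clause["op"]](value, clause["value"]):
            return False
    return True


filters = {"team": "platform", "region": "eu"}
advanced_filters = [{"field": "year", "op": "gte", "value": 2025}]

records = [
    {"id": 1, "team": "platform", "region": "eu", "year": 2025},
    {"id": 2, "team": "platform", "region": "us", "year": 2025},
    {"id": 3, "team": "platform", "region": "eu", "year": 2023},
]
passing = [r["id"] for r in records if matches(r, filters, advanced_filters)]
print(passing)  # [1]
```
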
06 — How SABLE compares

Same workload. Different math.

SABLE vs DiskANN on the published whitepaper benchmark: 1M vectors at d=1536 (DBpedia-OpenAI) and 100M vectors at d=96 (Deep-1B per shard). Single-node, p95 latency and Recall@10 against ground truth.

| Metric | SABLE (filter-native) | DiskANN (disk-resident graph) | HNSW (in-memory graph) |
| --- | --- | --- | --- |
| Unfiltered p95 (1M, R@10≈0.94) | 5.1 ms (np=16, rerank=4) | 175 ms (disk-resident) | Fast in-RAM, at huge memory cost |
| Filtered p95 (1M, ~11% selectivity) | 14–20 ms (predicate pre-gates lists) | 132.9 ms (filter as post-step) | Degrades sharply (filter masks during traversal) |
| Filtered p50 (100M, 10% selectivity) | 24–59 ms (28–101× faster than DiskANN) | 1.3–3.1 s (disk seeks dominate) | Does not fit (~640 GB RAM at d=96) |
| Memory @ 1M, d=1536 | 80 MB (PQ codes + micrographs) | Disk-resident | 25 GB (graph in RAM) |
| Memory @ 1B, d=96 | Page cache (~448 GB on NVMe, paged on demand) | Disk-resident | ~640 GB RAM required |
| Recall@10 ceiling | 0.994 → 0.95+ @ 1B | 0.897 (1M) | High, but memory-bound |

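
The 1M-vector memory figures can be checked with simple arithmetic. The sketch below uses only the 80 MB and 25 GB numbers quoted on this page; decimal units (1 MB = 10^6 bytes) and a float32 baseline for the raw vectors are assumptions, since the page does not state either.

```python
MB = 1000 ** 2  # decimal units assumed
GB = 1000 ** 3

n, d = 1_000_000, 1536
sable_bytes = 80 * MB   # SABLE figure quoted above
hnsw_bytes = 25 * GB    # HNSW figure quoted above
raw_bytes = n * d * 4   # uncompressed float32 vectors, no index (assumed baseline)

print(sable_bytes / n)           # 80.0 bytes of RAM per vector
print(raw_bytes / GB)            # 6.144 GB of raw float32 data
print(hnsw_bytes / sable_bytes)  # 312.5
```

Under these assumptions the ratio is 312.5, which the benchmark card rounds to 300×, and the resident footprint works out to about 80 bytes per 1536-dimensional vector.
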
07 — Engineered for scale

What SABLE is built to hold.

Validated single-node ceilings from the whitepaper. Fan out to multiple shards through the coordinator for higher throughput and larger corpora.

1 B
Vectors per node · validated
≥ 0.95
Recall@10 at 1B
50–100 ms
End-to-end p95 · billion-scale filtered
AVX-512
SIMD kernels · NEON for Graviton
08 — Get SABLE

Two ways to use it.

Start on managed Cloud today. Enterprise is for teams that need materially higher usage limits and commercial support while we build bring-your-own-cloud for a later release.

Enterprise · high utilization

SABLE Enterprise

For teams pushing beyond standard plan limits: larger workloads, higher query volume, heavier embedding generation, and commercial terms sized around real utilization. Bring-your-own-cloud is in development, but not available today.

  • Limitless datasets within negotiated fair-use terms
  • Higher embedding generation and query throughput
  • Custom usage limits, contracts, and billing terms
  • Same managed VectorAmp Cloud deployment model today
  • Named support and solutions engineering
Is SABLE the same engine VectorAmp uses?

Yes — bit-for-bit. The VectorAmp platform runs on SABLE. Buying SABLE standalone gets you the same engine, the same indexing pipeline, and the same query planner — minus the application layer (canvases, agents, intelligence UI). If you outgrow SABLE-only and want the platform on top, your indexes carry over.

What does "filter-native" actually mean?

Most vector stores retrieve candidates first, then apply metadata filters afterward. If your filter rejects 90% of candidates, the engine has to over-fetch and hope enough survive. SABLE stores filter metadata alongside each IVF list, estimates per-list selectivity, prunes impossible lists, then chooses the right path — linear ADC scan for selective filters or micrograph beam search for dense/unfiltered lists.
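
The per-list flow described above (estimate selectivity, prune, then pick a path) can be sketched as follows. This is an illustration of the idea only: the `IvfList` class, the use of Python integers as bitmaps, and the `SCAN_THRESHOLD` cutoff are all hypothetical, and the real planner's statistics and thresholds are not documented on this page.

```python
from dataclasses import dataclass


@dataclass
class IvfList:
    list_id: int
    size: int      # number of vectors stored in this list
    bitmaps: dict  # "field=value" -> int bitset; bit i is set when vector i has that value


SCAN_THRESHOLD = 0.2  # hypothetical cutoff between "selective" and "dense"


def matching_bits(lst, predicates):
    """AND the per-value bitmaps together; a missing bitmap means no vector matches."""
    bits = (1 << lst.size) - 1
    for p in predicates:
        bits &= lst.bitmaps.get(p, 0)
    return bits


def plan(lists, predicates):
    """Return (list_id, mode) per probed list, skipping lists with no possible match."""
    steps = []
    for lst in lists:
        matched = bin(matching_bits(lst, predicates)).count("1")
        if matched == 0:
            continue  # pruned: nothing in this list can pass the filter
        selectivity = matched / lst.size
        mode = "linear_adc" if selectivity < SCAN_THRESHOLD else "graph_beam"
        steps.append((lst.list_id, mode))
    return steps


lists = [
    IvfList(0, 8, {"team=platform": 0b00001111, "region=eu": 0b11110000}),
    IvfList(1, 8, {"team=platform": 0b11111111, "region=eu": 0b00000001}),
    IvfList(2, 8, {"team=platform": 0b11111111, "region=eu": 0b01111111}),
]
print(plan(lists, ["team=platform", "region=eu"]))
# [(1, 'linear_adc'), (2, 'graph_beam')]
```

List 0 is dropped because no vector satisfies both predicates, list 1 is scanned linearly because only one of eight vectors qualifies, and list 2 uses graph traversal because most of it passes the filter.
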

What index types are supported?

SABLE is the index VectorAmp creates for datasets: an OPQ-rotated IVF-PQ foundation, per-list micrographs for unfiltered traversal, and per-list bitmap plus quantile-sketch pre-gating for filters. We benchmark against systems like HNSW and DiskANN, but we do not expose those as customer-selectable index modes in the current SDKs.

Can I run SABLE in my own VPC?

Not today. Current Enterprise plans run on managed VectorAmp Cloud with higher usage limits and commercial support. Bring-your-own-cloud is in development for a later release, but custom Kubernetes or VPC-isolated deployments are not generally available right now.

What does pricing look like?

SABLE is included in the VectorAmp Cloud plans. The trial is free with hard-capped usage; paid plans include monthly vector-query and vector-write allowances with metered usage above them. See the pricing page for the current plan lineup, or contact sales for Enterprise terms.

The vector layer should be the fast layer.

Stop paying the post-filter tax. Run your filtered workload against the same engine that powers VectorAmp Intelligence.