SABLE · Selectivity-Aware Bi-Level Engine

The vector database that makes filtered queries the fast path — not the exception.

SABLE is the engine underneath VectorAmp: a filter-native, billion-scale vector index that treats the WHERE clause as a first-class part of the query plan. Built in Rust with AVX-512 kernels, and validated to Recall@10 ≥ 0.95 on a single 1B-vector node.

Filter-native query planning · Per-list micrographs · Billion-scale on commodity hardware · AVX-512 SIMD kernels

02 — Benchmarks

Numbers from the SABLE whitepaper.

Single-node results on stock AWS instances. Full methodology, query workloads, and ablations live in the whitepaper.

Filtered p95 vs DiskANN

28–101× faster

100M vectors, 10% selectivity · R@10 ≥ 0.96 maintained

Billion-scale recall

≥ 0.95 R@10

1B vectors, single node · commodity AWS hardware

RAM vs HNSW

300× smaller

1M @ d=1536: 80 MB vs 25 GB · billions paged from NVMe

Unfiltered p95 (1M)

5.1 ms

R@10 = 0.941 · m5.2xlarge, d=1536

03 — Why SABLE

Generic vector stores were retrofitted. SABLE wasn't.

Four design choices made early — and held to — that compound at scale.

[ 01 ]

Filter-native, not filter-aware.

Most vector stores treat metadata filters as a post-processing step that throws away precomputed work. SABLE plans the filter and the ANN graph traversal as a single query — selectivity-aware, with shard-local statistics. Selective filters get faster as your corpus grows, not slower.

Worst case: Equal to unfiltered
Common case: Less wasted work than post-filter
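
To make the post-filter cost concrete, here is a back-of-the-envelope sketch. It assumes matching rows are spread uniformly through the candidate stream, which real data often violates, and the function name `overfetch_needed` is purely illustrative rather than part of any SABLE or VectorAmp API.

```python
import math


def overfetch_needed(top_k: int, selectivity: float) -> int:
    """Expected number of unfiltered candidates a post-filter engine must
    retrieve so that about `top_k` of them survive the filter.

    Assumption (hypothetical): matches are uniformly distributed among the
    candidates, so the survival rate equals the filter selectivity.
    """
    if not 0.0 < selectivity <= 1.0:
        raise ValueError("selectivity must be in (0, 1]")
    return math.ceil(top_k / selectivity)


# The fetch size grows as the filter gets more selective.
for s in (1.0, 0.5, 0.1, 0.01):
    print(f"selectivity={s}: fetch ~{overfetch_needed(20, s)} candidates for top_k=20")
```

Under this simplified model the wasted distance work grows as 1/selectivity, which is the cost a planner that gates on the filter first tries to avoid.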

[ 02 ]

Shard-aware at the engine layer.

Sharding is a first-class concept in SABLE, not a wrapper around per-shard search. The query coordinator routes by metadata keys you specify, prunes shards by filter, and merges with bounded recall guarantees. Add a shard and capacity scales linearly.

Vectors per shard: Workload-sized
Pruning: Metadata + zone-map
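
As a rough illustration of zone-map pruning followed by a top-k merge, the sketch below keeps a min/max range and a set of distinct values per shard. The `ShardStats` class, its field names, and the two helper functions are assumptions made for this example; they do not describe the actual coordinator.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class ShardStats:
    """Hypothetical per-shard statistics a coordinator might keep."""
    name: str
    year_min: int  # zone map: smallest `year` stored in the shard
    year_max: int  # zone map: largest `year` stored in the shard
    regions: set = field(default_factory=set)  # distinct `region` values


def can_match(shard: ShardStats, region: str, year_gte: int) -> bool:
    """True unless the stats prove no row in the shard satisfies the filter."""
    return region in shard.regions and shard.year_max >= year_gte


def merge_top_k(per_shard_hits, k):
    """Merge (distance, doc_id) lists from each shard, keeping the k nearest."""
    return heapq.nsmallest(k, (hit for hits in per_shard_hits for hit in hits))


shards = [
    ShardStats("s0", 2019, 2022, {"eu", "us"}),
    ShardStats("s1", 2023, 2025, {"us"}),
    ShardStats("s2", 2024, 2026, {"eu"}),
]
# Only shards that could contain region == "eu" and year >= 2025 are searched.
survivors = [s.name for s in shards if can_match(s, region="eu", year_gte=2025)]
print(survivors)  # ['s2']
```

Pruning of this kind is conservative: a shard is skipped only when its statistics rule out every row, so it cannot lower recall on its own.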

[ 03 ]

Tuned end-to-end with real workloads.

SABLE is tuned for the retrieval patterns VectorAmp itself runs on: filtered semantic search, RAG with citations, recommendation, and deduplication. Practical defaults for product workloads, not synthetic-benchmark theater.

Index core: OPQ · IVF-PQ · per-list micrographs
Defaults: SABLE by default in every dataset

[ 04 ]

Storage that respects your bill.

Hot paths should stay fast, warm data should be compact, and cold data should not dominate your bill. SABLE is designed around explicit storage tiers and predictable read amplification.

Compression: PQ-aware options
Cold-tier reads: Cache-aware paths

04 — Architecture

The engine, top to bottom.

SABLE is a compact cold-tier vector index with filter-aware planning built into the storage layout, not bolted on after retrieval.

Query planner · Layer 04 · Selectivity

Planner: Per-list selectivity
Mode: Linear ADC or graph beam
Filters: Equality + numeric ranges
Merge: Top-k candidate heap

SABLE index · Layer 03 · ANN

OPQ: Rotated vectors
IVF-PQ: Compressed residual codes
Micrographs: Per-list NSG-like traversal
SIMD: AVX-512 · NEON paths

Filter metadata · Layer 02 · Pre-gating

Bitmaps: Per-list categorical filters
Sketches: Numeric range pruning
Stats: Cardinality estimates
Planner input: Choose scan vs graph

Persistence · Layer 01 · Durable

L0: Mutable write buffer
L1: Disk-backed SABLE index
DLST: Memory-mapped PQ codes
WAL: Durable inserts + deletes
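
The "Linear ADC" mode in the planner layer refers to asymmetric distance computation over product-quantized codes. The following is a minimal pure-Python sketch of the general technique, not SABLE's kernel: it omits the OPQ rotation, the IVF residual step, and all SIMD work, and the names `build_lookup_table`, `adc_scan`, and `allowed` are hypothetical.

```python
def build_lookup_table(query, codebooks):
    """table[m][c] = squared L2 distance between the query's m-th sub-vector
    and centroid c of sub-quantizer m."""
    sub_dim = len(query) // len(codebooks)
    table = []
    for m, centroids in enumerate(codebooks):
        q_sub = query[m * sub_dim:(m + 1) * sub_dim]
        table.append([
            sum((q - x) ** 2 for q, x in zip(q_sub, centroid))
            for centroid in centroids
        ])
    return table


def adc_scan(table, codes, allowed, k):
    """Linear scan over PQ codes, skipping rows the filter rejects.

    `allowed` is a set of row ids that pass the metadata filter.
    """
    scored = []
    for row_id, code in enumerate(codes):
        if row_id not in allowed:
            continue  # filtered rows cost no distance work
        dist = sum(table[m][c] for m, c in enumerate(code))
        scored.append((dist, row_id))
    return sorted(scored)[:k]


# Toy data: 4-d vectors, 2 sub-quantizers with 2 centroids each.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [1.0, 1.0]],
]
codes = [(0, 0), (1, 1), (0, 1), (1, 0)]
table = build_lookup_table([1.0, 1.0, 0.0, 0.0], codebooks)
print(adc_scan(table, codes, allowed={0, 1, 3}, k=2))  # [(0.0, 3), (2.0, 0)]
```

Each stored vector costs one table lookup per sub-quantizer, which is why a linear scan can be competitive when the filter leaves only a small fraction of a list to score.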

05 — SDK

Search a dataset with the SDKs.

Python and TypeScript clients use the same dataset-scoped search API as the platform. Pass text or vectors, top-k, metadata filters, and advanced range filters directly.

search.py · VectorAmp · Python SDK

# filtered semantic search
from vectoramp import VectorAmp

client = VectorAmp(api_key="va_...")
dataset = client.datasets.get("dataset_id")

results = dataset.search(
    text="migration plan risks",
    top_k=20,
    filters={"team": "platform", "region": "eu"},
    advanced_filters=[{"field": "year", "op": "gte", "value": 2025}],
    include_documents=True,
)

search.ts · VectorAmp · TS SDK

// same query, TypeScript
import { VectorAmp } from "@vectoramp/vectoramp";

const client = new VectorAmp({ apiKey: process.env.VECTORAMP_API_KEY });
const dataset = await client.datasets.get("dataset_id");

const results = await dataset.search({
  queryText: "migration plan risks",
  topK: 20,
  includeMetadata: true,
  filter: { team: "platform", region: "eu" }
});

06 — How SABLE compares

Same workload. Different math.

SABLE vs DiskANN on the published whitepaper benchmark: 1M vectors at d=1536 (DBpedia-OpenAI) and 100M vectors at d=96 (Deep-1B per shard). Single-node, p95 latency and Recall@10 against ground truth.

Metric	SABLE (filter-native)	DiskANN (disk-resident graph)	HNSW (in-memory graph)
Unfiltered p95 (1M, R@10≈0.94)	5.1 ms (np=16, rerank=4)	175 ms (disk-resident)	Fast in-RAM (at huge memory cost)
Filtered p95 (1M, ~11% selectivity)	14–20 ms (predicate pre-gates lists)	132.9 ms (filter as post-step)	Degrades sharply (filter masks during traversal)
Filtered p50 (100M, 10% selectivity)	24–59 ms (28–101× faster than DiskANN)	1.3–3.1 s (disk seeks dominate)	Does not fit (~640 GB RAM at d=96)
Memory @ 1M, d=1536	80 MB (PQ codes + micrographs)	Disk-resident	25 GB (graph in RAM)
Memory @ 1B, d=96	Page cache (~448 GB on NVMe, paged on demand)	Disk-resident	~640 GB RAM required
Recall@10 ceiling	0.994 → 0.95+ @ 1B	0.897 (1M)	High, but memory-bound
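
The 1M-vector memory row can be sanity-checked with a few lines of arithmetic. The 80 MB and 25 GB figures are taken from the table; the use of decimal units and of 32-bit floats for the uncompressed baseline are assumptions of this sketch.

```python
MB = 10**6  # decimal units assumed
GB = 10**9

n_vectors = 1_000_000
dim = 1536

raw_float32_bytes = n_vectors * dim * 4  # uncompressed vectors only (assumed float32)
sable_bytes = 80 * MB                    # figure quoted in the table
hnsw_bytes = 25 * GB                     # figure quoted in the table

print(f"raw float32 vectors: {raw_float32_bytes / GB:.2f} GB")
print(f"SABLE bytes per vector: {sable_bytes / n_vectors:.0f}")
print(f"HNSW / SABLE ratio: {hnsw_bytes / sable_bytes:.1f}x")
```

The quoted figures work out to about 80 bytes per vector for SABLE and a ratio of 312.5, which is consistent with the rounded "300× smaller" headline.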

07 — Engineered for scale

What SABLE is built to hold.

Validated single-node ceilings from the whitepaper. Fan out to multiple shards through the coordinator for higher throughput and larger corpora.

1B

Vectors per node · validated

≥ 0.95

Recall@10 at 1B

50–100 ms

End-to-end p95 · billion-scale filtered

AVX-512

SIMD kernels · NEON for Graviton

08 — Get SABLE

Two ways to use it.

Start on managed Cloud today. Enterprise is for teams that need materially higher usage limits and commercial support while we build bring-your-own-cloud for a later release.

Managed · available

SABLE Cloud

Fully managed on AWS. The same SABLE engine that powers the VectorAmp Intelligence Platform, surfaced through the VectorAmp API and SDKs. Trial available without a credit card.

Tenant-scoped datasets and API keys
Managed sharding and rebuild orchestration
Usage-based plans with hard-capped trial
SDKs in Python, TypeScript, Go, Rust, Java, Ruby
REST API and MCP server

Talk to sales →

Enterprise · high utilization

SABLE Enterprise

For teams pushing beyond standard plan limits: larger workloads, higher query volume, heavier embedding generation, and commercial terms sized around real utilization. Bring-your-own-cloud is in development, but not available today.

Limitless datasets within negotiated fair-use terms
Higher embedding generation and query throughput
Custom usage limits, contracts, and billing terms
Same managed VectorAmp Cloud deployment model today
Named support and solutions engineering

Contact sales →

Is SABLE the same engine VectorAmp uses?

Yes — bit-for-bit. The VectorAmp platform runs on SABLE. Buying SABLE standalone gets you the same engine, the same indexing pipeline, and the same query planner — minus the application layer (canvases, agents, intelligence UI). If you outgrow SABLE-only and want the platform on top, your indexes carry over.

What does "filter-native" actually mean?

Most vector stores retrieve candidates first, then apply metadata filters afterward. If your filter rejects 90% of candidates, the engine has to over-fetch and hope enough survive. SABLE stores filter metadata alongside each IVF list, estimates per-list selectivity, prunes impossible lists, then chooses the right path — linear ADC scan for selective filters or micrograph beam search for dense/unfiltered lists.
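
The planning steps described above (estimate per-list selectivity, prune impossible lists, choose a search mode) can be sketched as follows. This is an illustration under stated assumptions, not SABLE's planner: `ListStats`, `plan`, and the `dense_threshold` cutoff are all hypothetical, and the real engine's decision rule is not documented here.

```python
from dataclasses import dataclass


@dataclass
class ListStats:
    """Hypothetical per-IVF-list statistics."""
    list_id: int
    size: int         # vectors stored in the list
    est_matches: int  # estimated vectors passing the filter


def plan(lists, dense_threshold=0.5):
    """Return (list_id, mode) pairs for the lists worth visiting.

    `dense_threshold` is an illustrative cutoff, not a SABLE parameter.
    """
    steps = []
    for st in lists:
        if st.size == 0 or st.est_matches == 0:
            continue  # prune: no vector in this list can match
        selectivity = st.est_matches / st.size
        # Few matches: scan only the matching codes. Many matches: traverse the graph.
        mode = "graph_beam" if selectivity >= dense_threshold else "linear_adc"
        steps.append((st.list_id, mode))
    return steps


lists = [
    ListStats(0, 10_000, 0),      # filter excludes every vector
    ListStats(1, 10_000, 300),    # 3% of the list matches
    ListStats(2, 10_000, 9_500),  # 95% of the list matches
]
print(plan(lists))  # [(1, 'linear_adc'), (2, 'graph_beam')]
```

The decision is made per list, before any distance is computed, so a list the filter rules out contributes no work to the query.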

What index types are supported?

SABLE is the index VectorAmp creates for datasets: an OPQ-rotated IVF-PQ foundation, per-list micrographs for unfiltered traversal, and per-list bitmap plus quantile-sketch pre-gating for filters. We benchmark against systems like HNSW and DiskANN, but we do not expose those as customer-selectable index modes in the current SDKs.

Can I run SABLE in my own VPC?

Not today. Current Enterprise plans run on managed VectorAmp Cloud with higher usage limits and commercial support. Bring-your-own-cloud is in development for a later release, but custom Kubernetes or VPC-isolated deployments are not generally available right now.

What does pricing look like?

SABLE is included in the VectorAmp Cloud plans. The trial is free with hard-capped usage; paid plans include monthly vector-query and vector-write allowances with metered usage above them. See the pricing page for the current plan lineup, or contact sales for Enterprise terms.

The vector layer should be the fast layer.

Stop paying the post-filter tax. Run your filtered workload against the same engine that powers VectorAmp Intelligence.

Talk to engineering