LoraDB v0.2: vector values for connected AI context
LoraDB v0.2 adds first-class `VECTOR` values.
You can now construct vectors in Cypher, store them as node or relationship properties, pass them in as parameters through every binding, and run exhaustive similarity search against them. The value type, the wire format, the function surface, and the binding helpers all landed together so vectors behave like every other typed value in the engine.
What this release is not is a vector-index product. There is no approximate nearest-neighbour search, no built-in embedding generation, and no plugin compatibility layer. Those are deliberately out of scope for v0.2. The goal here is to make embeddings comfortable inside the graph model — to ship the foundation that an index-backed retrieval path will eventually sit on.
What Changed
The short list:
- `VECTOR` is a first-class value type, alongside scalars, lists, maps, temporal values, and spatial points.
- A new `vector(value, dimension, coordinateType)` constructor.
- Six supported coordinate types: `FLOAT`/`FLOAT64`, `FLOAT32`, `INTEGER`/`INT`/`INT64`/`INTEGER64`/`SIGNED INTEGER`, `INTEGER32`/`INT32`, `INTEGER16`/`INT16`, and `INTEGER8`/`INT8`.
- Storage as node and relationship properties.
- `toIntegerList(v)` and `toFloatList(v)` for converting coordinates back to lists.
- `vector_dimension_count(v)` and `size(v)` for introspection.
- `vector.similarity.cosine(a, b)` — bounded to `[0, 1]`.
- `vector.similarity.euclidean(a, b)` — bounded to `[0, 1]`.
- `vector_distance(a, b, metric)` — signed distance under one of six metrics.
- `vector_norm(v, metric)` — Euclidean or Manhattan norm.
- Parameter support through every binding.
- A canonical tagged wire shape:

  ```json
  {
    "kind": "vector",
    "dimension": 3,
    "coordinateType": "FLOAT32",
    "values": [0.1, 0.2, 0.3]
  }
  ```
Language support is included in this release for:
- the Rust core;
- Node.js / TypeScript (`crates/lora-node`);
- WebAssembly (`crates/lora-wasm`);
- Python (`crates/lora-python`);
- Go (`crates/lora-go`);
- Ruby (`crates/lora-ruby`).

Each binding ships a `vector(...)` helper and an `isVector` / `is_vector` / `IsVector` / `vector?` guard, so callers do not need to build the tagged object by hand.
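As a rough sketch of what these helpers do conceptually (illustrative Python only, not the actual binding code), a `vector(...)` constructor builds the tagged wire shape above and an `is_vector(...)` guard recognizes it:

```python
# Illustrative sketch of a binding's vector(...) helper and is_vector(...)
# guard. The field names follow the canonical tagged wire shape from this
# release; the validation details here are assumptions, not binding code.

def vector(values, dimension, coordinate_type):
    """Build the canonical tagged wire shape for a VECTOR parameter."""
    if len(values) != dimension:
        raise ValueError(f"expected {dimension} coordinates, got {len(values)}")
    return {
        "kind": "vector",
        "dimension": dimension,
        "coordinateType": coordinate_type,
        "values": list(values),
    }

def is_vector(value):
    """Guard: does this object carry the tagged vector shape?"""
    return isinstance(value, dict) and value.get("kind") == "vector"
```

The real helpers in each binding play the same role: construct the tagged object once, correctly, so application code never spells out `"kind": "vector"` by hand.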
Why Vectors In A Graph Database
A graph database and a vector store usually get treated as two products. The graph stores relationships; the vector store retrieves similar items; the application glues them together.
For most AI workloads that is the wrong shape. You want to retrieve candidates by similarity and use the graph to rank, filter, or explain them. When the embedding lives on a separate service, every query needs a round trip and every piece of context needs a join by hand.
Putting `VECTOR` into LoraDB as a value type collapses that separation. The embedding is a property on the same node that carries the label, the text, and the relationships. You score with `vector.similarity.cosine(...)` and walk with `MATCH` in the same Cypher.
That is the whole argument. Similarity finds candidates. The graph explains them.
Using VECTOR Values
Creating A Vector Property
```cypher
CREATE (d:Doc {
  id: 1,
  title: 'Onboarding checklist',
  embedding: vector([0.1, 0.2, 0.3], 3, FLOAT32)
})
```
The third argument can be a bare identifier (`INTEGER`, `FLOAT32`, `INT8`) or a string literal (`'INTEGER'`, `'SIGNED INTEGER'`). Bare identifiers are rewritten to string literals by the analyzer only in this specific argument position, so normal variable resolution is unaffected elsewhere in the query.
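Because several spellings name the same coordinate type (`INTEGER`, `INT`, `SIGNED INTEGER`, and so on), a normalization step is implied somewhere in the pipeline. Here is a hypothetical sketch of such a mapping in plain Python; the alias groups come from this release's list of six coordinate types, but treating one spelling per group as canonical is an assumption, not documented engine behavior:

```python
# Hypothetical alias table: six coordinate types, each with the spellings
# this release accepts. The choice of canonical name is an assumption.
_ALIASES = {
    "FLOAT64":   ["FLOAT", "FLOAT64"],
    "FLOAT32":   ["FLOAT32"],
    "INTEGER64": ["INTEGER", "INT", "INT64", "INTEGER64", "SIGNED INTEGER"],
    "INTEGER32": ["INTEGER32", "INT32"],
    "INTEGER16": ["INTEGER16", "INT16"],
    "INTEGER8":  ["INTEGER8", "INT8"],
}
_LOOKUP = {alias: canon for canon, aliases in _ALIASES.items() for alias in aliases}

def normalize_coordinate_type(name: str) -> str:
    """Collapse any accepted spelling to one canonical coordinate type."""
    canon = _LOOKUP.get(name.strip().upper())
    if canon is None:
        raise ValueError(f"unknown coordinate type: {name!r}")
    return canon
```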
Passing A Vector As A Parameter
Generate the embedding in your application, pass it in with the language helper, and use it like any other parameter:
```typescript
import { vector } from "@loradb/lora-node";

const query = vector(embedding, 384, "FLOAT32");

await db.execute(
  `MATCH (d:Doc)
   RETURN d.id AS id
   ORDER BY vector.similarity.cosine(d.embedding, $q) DESC
   LIMIT 10`,
  { q: query },
);
```
The same pattern works in Python (`vector(values, dimension, coordinate_type)`), Go (`lora.Vector(values, dimension, coordinateType)`), Ruby (`LoraRuby.vector(values, dimension, coordinate_type)`), and the raw FFI / JSON path used by the C ABI.
Bulk Insert
The canonical way to load many embeddings is a single `UNWIND` over a parameter list of rows. Each row is a map with scalar fields and a tagged vector. The engine fans the list into per-row `CREATE`s without round-tripping each one:
```typescript
import { vector } from "@loradb/lora-node";

const batch = docs.map((doc) => ({
  id: doc.id,
  title: doc.title,
  embedding: vector(doc.embedding, 384, "FLOAT32"),
}));

await db.execute(
  `UNWIND $batch AS row
   CREATE (:Doc {id: row.id, title: row.title, embedding: row.embedding})`,
  { batch },
);
```
The same shape in Python:
```python
from lora_python import Database, vector

db = Database.create()

batch = [
    {
        "id": doc["id"],
        "title": doc["title"],
        "embedding": vector(doc["embedding"], 384, "FLOAT32"),
    }
    for doc in docs
]

db.execute(
    """
    UNWIND $batch AS row
    CREATE (:Doc {id: row.id, title: row.title, embedding: row.embedding})
    """,
    {"batch": batch},
)
```
Two things are worth knowing about this pattern. First, each vector is stored as its own property on its own node — the batch is a list of maps, not a list of vectors, so the property rule (no lists of vectors) is satisfied by construction. Second, `UNWIND` runs the whole batch in one query, so the per-row overhead is a map extraction and a `CREATE`, not a full parse + plan + execute cycle per document.
Exhaustive kNN
```cypher
MATCH (d:Doc)
RETURN d.id AS id
ORDER BY vector.similarity.cosine(d.embedding, $query) DESC
LIMIT 10
```
Every MATCH candidate is scored. Cost is O(n) in the number of
matched nodes. That is fine for a local dataset, a test, a demo, or a
small internal tool. It is not how you would serve a corpus of
millions — that is what vector indexes are for, and they are not
implemented yet.
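To make the cost concrete, the exhaustive pattern looks like this in plain Python, outside the engine. Note this sketch uses the standard cosine in [-1, 1]; LoraDB's `vector.similarity.cosine` is documented as bounded to [0, 1], so the exact scores differ even though the O(n) shape is the same:

```python
# Exhaustive kNN sketch: score every candidate, sort, take k.
# This is illustrative Python over raw lists, not LoraDB's executor.
import math

def cosine(a, b):
    """Standard cosine similarity in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def exhaustive_knn(docs, query, k):
    """docs: list of (id, embedding). O(n) scoring, then a sort."""
    scored = [(doc_id, cosine(emb, query)) for doc_id, emb in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Every stored embedding is touched once per query, which is exactly why this is fine for thousands of nodes and wrong for millions.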
Graph-Filtered Retrieval
The version that actually motivated adding vectors to LoraDB looks like this:
```cypher
MATCH (d:Doc)
WITH d, vector.similarity.cosine(d.embedding, $query) AS score
MATCH (d)-[:MENTIONS]->(e:Entity)
WHERE e.type = $entity_type
RETURN d.id, d.title, score, collect(e.name) AS entities
ORDER BY score DESC
LIMIT 5
```
The similarity function supplies the candidates. The graph structure (`MENTIONS`, the entity type filter) explains and constrains them. Both live in one query and one engine.
Storing And Reading Back
Vectors round-trip through storage unchanged:
```cypher
CREATE (:Doc {id: 1, embedding: vector([1, 2, 3], 3, INTEGER)})
MATCH (d:Doc {id: 1}) SET d.embedding = vector([0.1, 0.2], 2, FLOAT32)
MATCH (d:Doc {id: 1}) RETURN d.embedding AS e
```
A vector is also a legal map value, so a property map containing a vector is stored intact. The one restriction is that a list containing vectors cannot be stored as a property — if you need many embeddings, hang them off separate nodes. The engine rejects the write at property-conversion time instead of silently storing a shape a future vector index could not support.
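The rule can be illustrated with a small Python check that mirrors, without reproducing, what property conversion does: a vector is fine on its own or inside a map, but any list containing a vector is rejected. This is an illustration of the documented rule, not engine code:

```python
# Illustration of the property rule: vectors are legal as properties and as
# map values, but a list containing a vector is rejected at conversion time.

def is_vector(value):
    return isinstance(value, dict) and value.get("kind") == "vector"

def check_property(value):
    """Return True if value is storable; raise on a list containing vectors."""
    if isinstance(value, list):
        for item in value:
            if is_vector(item):
                raise ValueError("lists containing vectors cannot be stored as a property")
            check_property(item)
    elif isinstance(value, dict) and not is_vector(value):
        # an ordinary map: check each value, which may itself be a vector
        for v in value.values():
            check_property(v)
    return True
```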
AI Use Cases
Vectors in LoraDB are aimed at the workloads that already sit awkwardly between "vector store" and "graph database":
- Agent memory. Embed documents, chunks, observations, or tool calls. Connect them with edges to the sessions, entities, and decisions that reference them. Retrieve by similarity, filter by recency, explain by provenance.
- Semantic document retrieval with context. Find the k closest docs, then expand to the entities, authors, or topics they connect to before returning anything to the application.
- Tool and context selection. Score candidate prompts, tools, or examples by similarity to the current state, then filter by graph constraints (same tenant, same permission scope, same task type).
- Knowledge graph enrichment. Attach embeddings to entities so fuzzy lookups can locate the right node before structural queries run.
- Recommendations with graph guardrails. Cosine similarity produces a long list of candidates; graph relationships (`BLOCKED`, `ALREADY_SEEN`, `OWNED_BY`) filter that list before ranking.
- Internal search over connected product, customer, or support data. Small corpora, high structure, and queries that want both "similar to this ticket" and "in the same account and escalation path."
Every one of those workloads has the same shape: similarity for candidates, structure for the rest.
What Is Not Included Yet
This release is deliberately narrow. It does not include:
- Vector indexes. All vector functions are exhaustive today.
- Approximate nearest-neighbour search. There is no ANN path.
- Built-in embedding generation. LoraDB does not produce embeddings; bring them in from your application code.
- Hardened public-internet database hosting. The HTTP server is useful for local development and controlled environments. It is not a managed service.
Those are not hidden. They are the roadmap.
Exhaustive similarity is fine for small datasets, tests, demos, local prototypes, and internal workflows. It is not yet a substitute for an index-backed vector search service at scale.
Why This Fits The LoraDB Journey
LoraDB is developer-first: the core model and ergonomics get right before scale and managed operations do.
That sequencing applies to vectors too. The first version needs the value model, the wire shape, the binding helpers, and the Cypher ergonomics to be right. Those are the parts that user code and tests depend on. A vector index that sits on a broken value model would inherit every problem. Shipping the foundation first, and being honest about the absence of an index, keeps the upgrade path clean.
It also matches how developers actually pick up a new capability. Clone the repo, generate a few embeddings, store them on nodes, write a similarity query, see whether the model fits the problem. Only after that do scale, persistence, and managed operations become worth talking about.
v0.2 makes that first loop work.
Try It
Get the repo and run the server:
```shell
cargo run --bin lora-server
```
Then try a vector query from curl or any binding:
```shell
curl -X POST http://127.0.0.1:4747/query \
  -H 'content-type: application/json' \
  -d '{"query":"RETURN vector([1,2,3], 3, INTEGER) AS v"}'
```
The docs site has a dedicated page covering the value type, the coordinate rules, the functions, the storage restrictions, and the exhaustive kNN pattern.
Internal notes on the value model and the Cypher support matrix have been updated to match.
What Comes Next
Three directions stand out after v0.2:
- A vector index. The Cypher shape stays the same (`ORDER BY vector.similarity.* LIMIT k`); the executor starts routing scored candidates through an index instead of a linear scan. The design depends on the workloads people actually bring.
- More metrics and norms as real usage demands them. The current set (`EUCLIDEAN`, `EUCLIDEAN_SQUARED`, `MANHATTAN`, `COSINE`, `DOT`, and `HAMMING` for distance; `EUCLIDEAN` and `MANHATTAN` for norm) covers the common cases. Extending is mechanical once a concrete need shows up.
- The hosted path. Vectors on stored nodes eventually need managed operations to match. That is a separate release, but the value-type work here is what makes it possible without a second data model.
If you try v0.2 with vectors, the most useful feedback is concrete:
- what graph did you load, and what embedding workflow did you pair with it;
- where did exhaustive scan stop being enough;
- what would a vector index need to support for your workload — a specific metric, a specific filter shape, a specific freshness guarantee;
- which binding did you use, and did the tagged shape round-trip cleanly through your application code.
That is the feedback that will shape v0.3.