Choosing a backend
Engrama runs on two interchangeable storage backends. This guide tells you which one to pick and why.
Engrama 0.9 introduces portable storage: SQLite plus the sqlite-vec
extension, with the graph and its vectors stored in a single file at
~/.engrama/engrama.db. The original Neo4j backend is still fully
supported; it is now an opt-in extra.
If you don't know which one to pick, start with SQLite. You can switch later with one environment variable.
At a glance
| | SQLite (default) | Neo4j (opt-in) |
|---|---|---|
| Install | git clone + uv sync | git clone + uv sync --extra neo4j + Docker |
| External services | none | Neo4j 5.26 LTS in Docker |
| First run | seconds | ~15s for the database to boot |
| Disk footprint | one .db file | Neo4j data directory + Docker image (~500 MB) |
| Portability | copy the .db file anywhere | dump/restore via neo4j-admin |
| Concurrency | one writer, many readers (WAL) | many readers and writers |
| Vector search | sqlite-vec (brute-force, fine to ~100k vectors) | Neo4j vector index (HNSW, scales further) |
| Multi-process write access | not recommended | yes |
| Cloud / remote access | local file only | bolt://host:7687 |
| Cypher query language | not available | yes |
| Memory profile | tiny (single SQLite process) | JVM heap (~1 GB minimum) |
| Operates without Docker | ✅ | ❌ |
The data model — labels, relationships, faceted classification — is identical on both backends. Anything you can store on SQLite you can later move to Neo4j (and vice versa) without restructuring your graph.
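The "one writer, many readers (WAL)" entry in the table refers to SQLite's write-ahead logging journal mode. The sketch below uses only Python's standard sqlite3 module to show how that mode is requested on a file-backed database. The path and the note table are hypothetical, and how Engrama configures its own connection is not shown here.

```python
import os
import sqlite3
import tempfile


def open_wal(path: str) -> tuple[sqlite3.Connection, str]:
    """Open a SQLite file and request write-ahead logging.

    Returns the connection and the journal mode SQLite reports back.
    """
    conn = sqlite3.connect(path)
    # The PRAGMA returns a single row holding the resulting journal mode.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode


# Hypothetical throwaway database, not Engrama's real file.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer, journal_mode = open_wal(demo_path)
writer.execute("CREATE TABLE note (body TEXT)")
writer.execute("INSERT INTO note VALUES ('hello')")
writer.commit()

# A second connection can read committed rows while the writer stays open.
reader = sqlite3.connect(demo_path)
rows = reader.execute("SELECT body FROM note").fetchall()
```

WAL lets readers proceed while one writer is active, but it still serialises writers, which is why the table marks multi-process write access as not recommended.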
The decision tree
Are you running Engrama on a single laptop / VM / container, for one user?
├─ Yes → SQLite. You're done.
│
└─ No → Multiple processes need to write at the same time?
        ├─ Yes → Neo4j.
        │
        └─ No → Do you need >100k embeddings, or expect to in <12 months?
                ├─ Yes → Neo4j.
                │
                └─ No → Do you need ad-hoc Cypher analytics or a graph viewer?
                        ├─ Yes → Neo4j.
                        │
                        └─ No → SQLite.
In practice the first branch covers ~90% of users.
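For readers who prefer code to diagrams, the same tree can be written as a small function. This is only an illustrative sketch: choose_backend and its parameter names are hypothetical and not part of Engrama.

```python
def choose_backend(
    single_user_single_host: bool,
    concurrent_writers: bool = False,
    needs_over_100k_vectors: bool = False,
    needs_cypher_or_viewer: bool = False,
) -> str:
    """Walk the decision tree top to bottom and return 'sqlite' or 'neo4j'."""
    # First branch: one user on one machine ends the walk immediately.
    if single_user_single_host:
        return "sqlite"
    # Any single "yes" further down the tree points to Neo4j.
    if concurrent_writers or needs_over_100k_vectors or needs_cypher_or_viewer:
        return "neo4j"
    return "sqlite"


laptop_choice = choose_backend(single_user_single_host=True)
team_choice = choose_backend(single_user_single_host=False, concurrent_writers=True)
```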
When to pick SQLite
- You're getting started. Zero install friction, no Docker, no JVM. git clone … && uv sync && uv run engrama init and you're querying the graph (Engrama is not yet on PyPI; install from source for now).
- Single-agent setups. One Claude Desktop, one MCP client, one long-running script. SQLite handles this perfectly.
- CI runs and tests. No external service to spin up; pytest works out of the box on a fresh checkout.
- Embedded distribution. Shipping a tool that includes Engrama as a library? Your users get a working memory layer with no Docker prereqs.
- Edge / resource-constrained hosts. No JVM means Engrama runs comfortably on a Raspberry Pi or a 512 MB VM.
- Portable research notebooks. Send a colleague your .db file and they have your full graph, with no schema migration needed.
When to pick Neo4j
- Production multi-user setups. Multiple agents (or humans via Bloom / Browser) writing concurrently to the same graph.
- Large-scale vector search. sqlite-vec does brute-force similarity; fine up to ~100k vectors but Neo4j's HNSW index will outperform it beyond that.
- You already have Cypher pipelines. If your team writes Cypher for analytics, business logic, or migrations, keep that investment.
- You need Bloom / Neo4j Browser for visual exploration. SQLite has no equivalent native UI.
- Cluster / high availability (Neo4j Enterprise). SQLite is a single-file database with no replication.
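To see why brute-force similarity stops scaling at some point, it helps to look at what such a search does: it scores every stored vector against the query, so the cost grows linearly with the number of vectors. The sketch below is a plain-Python illustration of that idea, not sqlite-vec's implementation. Cosine similarity is used only as an example metric, and the names are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def brute_force_top_k(
    query: list[float], vectors: dict[str, list[float]], k: int = 2
) -> list[str]:
    """Score every stored vector against the query: O(n) work per search."""
    ranked = sorted(
        vectors, key=lambda name: cosine(query, vectors[name]), reverse=True
    )
    return ranked[:k]


store = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [0.9, 0.1],
}
nearest = brute_force_top_k([1.0, 0.0], store, k=2)
```

An approximate index such as HNSW avoids scoring every vector, which is the trade-off the Neo4j column of the table refers to.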
Switching from one to the other
Swap one environment variable. The data model is identical, so any tool or skill that works on one backend works on the other.
From SQLite to Neo4j
# 1. Install the extra and start Neo4j
uv sync --extra neo4j
docker compose up -d # uses docker-compose.yml from the repo
# 2. Tell Engrama to use it
echo 'GRAPH_BACKEND=neo4j' >> .env
echo 'NEO4J_PASSWORD=...' >> .env
# 3. (Optional) re-create the vector index for hybrid search
uv run engrama init --profile developer
uv run engrama reindex
If GRAPH_BACKEND=neo4j is set but the Python extra is missing, both
the CLI and the MCP server now fail with an explicit install hint
instead of a generic import or startup error.
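One common way to produce this kind of explicit hint is to check for the optional package before importing it. The sketch below shows the general pattern with the standard library only. require_extra is a hypothetical helper and the error text is illustrative; Engrama's actual check and message may differ.

```python
import importlib.util


def require_extra(module_name: str, extra: str) -> None:
    """Raise a readable error if an optional dependency is not importable."""
    # find_spec returns None for a top-level module that is not installed.
    if importlib.util.find_spec(module_name) is None:
        raise RuntimeError(
            f"The selected backend needs the '{module_name}' package. "
            f"Install it with: uv sync --extra {extra}"
        )


# A standard-library module is always present, so this call passes silently.
require_extra("json", "unused")
```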
To carry data across, the simplest path today is to write a temporary
SDK script that opens two Engrama contexts, reads from SQLite, and
writes to Neo4j. A first-class export tool is on the roadmap.
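The shape of such a script is roughly "read everything from one store, write it to the other, nodes before edges". The sketch below shows only that shape. Every name in it (Node, Edge, Store, MemoryStore, copy_graph) is a hypothetical stand-in, not the Engrama SDK; a real script would replace the in-memory stores with the two SDK contexts.

```python
from dataclasses import dataclass, field
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Node:
    key: str
    label: str


@dataclass(frozen=True)
class Edge:
    source: str
    kind: str
    target: str


class Store(Protocol):
    # Hypothetical interface, not Engrama's real GraphStore protocol.
    def nodes(self) -> Iterable[Node]: ...
    def edges(self) -> Iterable[Edge]: ...
    def put_node(self, node: Node) -> None: ...
    def put_edge(self, edge: Edge) -> None: ...


@dataclass
class MemoryStore:
    """In-memory stand-in so the sketch runs without any backend."""
    _nodes: dict = field(default_factory=dict)
    _edges: set = field(default_factory=set)

    def nodes(self):
        return list(self._nodes.values())

    def edges(self):
        return list(self._edges)

    def put_node(self, node):
        self._nodes[node.key] = node

    def put_edge(self, edge):
        self._edges.add(edge)


def copy_graph(src: Store, dst: Store) -> tuple[int, int]:
    """Copy nodes first so every edge endpoint exists, then copy edges."""
    node_count = 0
    for node in src.nodes():
        dst.put_node(node)
        node_count += 1
    edge_count = 0
    for edge in src.edges():
        dst.put_edge(edge)
        edge_count += 1
    return node_count, edge_count


source = MemoryStore()
source.put_node(Node("engrama", "Project"))
source.put_node(Node("sqlite", "Technology"))
source.put_edge(Edge("engrama", "USES", "sqlite"))

target = MemoryStore()
copied = copy_graph(source, target)
```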
From Neo4j to SQLite
# 1. Tell Engrama to use SQLite
echo 'GRAPH_BACKEND=sqlite' >> .env
echo 'ENGRAMA_DB_PATH=~/.engrama/engrama.db' >> .env # optional, this is the default
# 2. Sync from your Obsidian vault — vault is portable by design (DDR-002)
uv run engrama-mcp # or use Claude Desktop
# then: engrama_sync_vault
Because relations are persisted in vault frontmatter (DDR-002), the
Obsidian vault is itself a portable backup of the full graph. Pointing a
fresh SQLite install at the same vault and running engrama_sync_vault
rebuilds the graph from scratch.
How does this work under the hood?
Both backends implement the same GraphStore, VectorStore, and
EmbeddingProvider protocols (engrama/core/protocols.py). A single
factory in engrama/backends/__init__.py reads GRAPH_BACKEND from the
environment and returns the right implementation.
Skills, the MCP server, the CLI, and the Python SDK are written against the protocols — they don't know which backend is underneath. That's why swapping is a one-variable change.
See architecture.md for the full layer diagram, and DDR-004 for the design rationale of the portable backend.
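The protocol-plus-factory pattern described above can be sketched in a few lines. Only the names GraphStore and GRAPH_BACKEND come from this guide; the backend_name method, the two store classes, and create_store are hypothetical, and the real protocols in engrama/core/protocols.py will look different.

```python
import os
from typing import Mapping, Optional, Protocol


class GraphStore(Protocol):
    # Illustrative single-method protocol; the real one is richer.
    def backend_name(self) -> str: ...


class SqliteStore:
    def backend_name(self) -> str:
        return "sqlite"


class Neo4jStore:
    def backend_name(self) -> str:
        return "neo4j"


_BACKENDS = {"sqlite": SqliteStore, "neo4j": Neo4jStore}


def create_store(env: Optional[Mapping[str, str]] = None) -> GraphStore:
    """Pick an implementation from GRAPH_BACKEND, defaulting to SQLite."""
    env = os.environ if env is None else env
    name = env.get("GRAPH_BACKEND", "sqlite").strip().lower()
    try:
        return _BACKENDS[name]()
    except KeyError:
        raise ValueError(f"Unknown GRAPH_BACKEND: {name!r}") from None


def describe(store: GraphStore) -> str:
    """Caller code depends only on the protocol, never on a concrete class."""
    return f"running on {store.backend_name()}"


default_message = describe(create_store({}))
neo4j_message = describe(create_store({"GRAPH_BACKEND": "neo4j"}))
```

Because describe only sees the protocol, changing the environment variable changes the backend without touching the caller.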
Frequently asked
Can I run both backends at the same time?
You can, but a single Engrama process binds to one. Different processes can target different backends, which is useful for testing or migrations.
Does SQLite support all the features Neo4j has?
For the public Engrama API (the 12 MCP tools, the SDK, the CLI), yes: they're feature-equivalent and exercised by the same parameterised contract suite. The only thing SQLite cannot do is execute raw Cypher patterns; it uses pre-translated SQL queries instead. If a future feature needs ad-hoc Cypher, the Neo4j backend will get it first.
What about embeddings?
Both backends support the full hybrid-search stack (vector + fulltext +
graph boost + temporal). SQLite stores vectors via sqlite-vec in the
same .db file; Neo4j uses its native vector index.
If the graph backend is healthy but the embedding service is down, Engrama degrades to fulltext search and reports that explicitly. This lets you distinguish "Neo4j is misconfigured" from "Ollama / embeddings are unavailable".
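The degradation behaviour can be pictured as a search function that catches the embedding failure, falls back to fulltext, and says so in its result. The sketch below is an assumption about the general shape only: EmbeddingUnavailable, SearchResult, and hybrid_search are hypothetical names, and the graph boost and temporal components are left out.

```python
from dataclasses import dataclass


class EmbeddingUnavailable(Exception):
    """Hypothetical error for an unreachable embedding service."""


@dataclass
class SearchResult:
    hits: list
    mode: str       # "hybrid" or "fulltext"
    degraded: bool  # True when the vector component was skipped


def hybrid_search(query, embed, vector_search, fulltext_search) -> SearchResult:
    """Combine vector and fulltext hits, degrading to fulltext on failure."""
    text_hits = fulltext_search(query)
    try:
        vector = embed(query)
    except EmbeddingUnavailable:
        # Report the degradation instead of failing the whole search.
        return SearchResult(hits=text_hits, mode="fulltext", degraded=True)
    # Merge both hit lists, keeping first-seen order and dropping duplicates.
    merged = list(dict.fromkeys(vector_search(vector) + text_hits))
    return SearchResult(hits=merged, mode="hybrid", degraded=False)


def broken_embed(_query):
    raise EmbeddingUnavailable("embedding service unreachable")


result = hybrid_search(
    "sqlite",
    embed=broken_embed,
    vector_search=lambda vec: ["n2", "n3"],
    fulltext_search=lambda q: ["n1", "n2"],
)
```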
Does the schema migration script (engrama init) work on SQLite?
SQLite's schema lives in engrama/backends/sqlite/schema.sql and is
applied automatically when the database file is created. engrama init
on a SQLite backend is a no-op for schema (the Cypher constraints are
ignored) but still seeds the domain nodes from your profile.
Where is my data on each backend?
- SQLite: ~/.engrama/engrama.db by default, or wherever ENGRAMA_DB_PATH points. Single file. Back it up with cp.
- Neo4j: inside the Docker volume engrama_neo4j_data (or wherever you mounted Neo4j's data directory). Back it up with neo4j-admin database dump.
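A plain cp is fine while nothing is writing to the file. If a process might be mid-write, one alternative is SQLite's online backup API, which Python exposes as sqlite3.Connection.backup. The sketch below uses only the standard library; backup_sqlite is a hypothetical helper and the paths and table are throwaway examples, not Engrama's real database.

```python
import os
import sqlite3
import tempfile


def backup_sqlite(src_path: str, dst_path: str) -> None:
    """Copy a SQLite database with the online backup API."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        # Copies all pages of src into dst as a consistent snapshot.
        src.backup(dst)
    finally:
        dst.close()
        src.close()


# Hypothetical throwaway files for demonstration.
workdir = tempfile.mkdtemp()
live_path = os.path.join(workdir, "live.db")
copy_path = os.path.join(workdir, "copy.db")

live = sqlite3.connect(live_path)
live.execute("CREATE TABLE note (body TEXT)")
live.execute("INSERT INTO note VALUES ('remember this')")
live.commit()
live.close()

backup_sqlite(live_path, copy_path)

restored = sqlite3.connect(copy_path)
restored_rows = restored.execute("SELECT body FROM note").fetchall()
restored.close()
```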