Skip to main content

CAD039: Convex SQL

Overview

Convex SQL is a SQL database layer built on the Lattice. It provides relational table storage with CRDT merge semantics, enabling SQL-like operations over distributed, replicated data structures. Tables support schema definitions, primary key indexing, and row-level merge replication via Lattice Nodes.

Convex SQL enables standard SQL query capabilities over lattice data through Apache Calcite, bridging familiar database paradigms with the convergent properties of lattice technology.

Motivation

While the KV Database provides excellent support for key-value workloads, many applications require:

  • Relational data models with defined schemas and typed columns
  • SQL query language for complex queries, joins, and aggregations
  • Table-based organisation for structured business data
  • Primary key indexing for efficient row lookups
  • Schema evolution with controlled migrations

Traditional distributed databases solve these problems with consensus protocols or conflict resolution callbacks. Convex SQL instead uses lattice merge properties to guarantee convergence while maintaining relational semantics.

Design Goals

  • Provide a familiar table-based API (CREATE TABLE, INSERT, SELECT, DELETE)
  • Support schema definitions with typed columns
  • Enable primary key indexing with ordered retrieval
  • Integrate with Apache Calcite for SQL parsing and query planning
  • Maintain compatibility with the lattice cursor system and owner authentication
  • Support per-database, per-table signed replicas for authentication

Specification

Lattice Path

Convex SQL occupies the :sql path in the standard lattice ROOT:

ROOT {
:data → DataLattice
:fs → OwnerLattice → MapLattice → DLFSLattice
:kv → OwnerLattice → MapLattice → KVStoreLattice
:sql → OwnerLattice → MapLattice → TableStoreLattice
}

The full path to a specific table is:

:sql / <owner-key> → Signed({<db-name> → {<table-name> → TableEntry, ...}, ...})

Where:

  • owner-key — the owner identity (see CAD038)
  • Signed(...) — the owner's signed map of database names to table stores
  • db-name — a string database name, scoped per owner
  • table-name — a string table name within the database

This structure mirrors the KV Database pattern: each owner has their own namespace of databases, and each database contains named tables.

Lattice Composition

The full lattice hierarchy for Convex SQL:

OwnerLattice                    ← per-owner merge with auth (CAD038)
└── SignedLattice ← Ed25519 signature verification
└── MapLattice ← per-database-name merge
└── TableStoreLattice
└── per-table-name merge
└── SQLTableLattice
└── schema + rows merge
└── TableLattice
└── per-row merge (by primary key)
└── SQLRowLattice (LWW)

Layer Descriptions

LayerLatticeMerge Behaviour
OwnerOwnerLatticePer-owner with authentication (CAD038)
SignatureSignedLatticeEd25519 signature verification
DatabaseMapLatticePer-database-name merge
Table StoreTableStoreLatticePer-table-name merge using SQLTableLattice
TableSQLTableLatticeSchema LWW + row-level merge via TableLattice
RowsTableLatticePer-primary-key merge using SQLRowLattice
RowSQLRowLatticeLWW by timestamp; equal timestamps favour deletions

Table Entries

Each table is stored as a Table Entry, a positional vector:

[schema, rows, utime]
IndexFieldTypeDescription
0schemavectorColumn definitions: [[name, type], ...]
1rowsIndexRow data: {primary-key → RowEntry, ...}
2utimeintegerSchema update timestamp (epoch millis)

Schema Format

The schema is a vector of column definitions, each a vector of [name, type]:

[[name₁, type₁], [name₂, type₂], ...]

Where:

  • name — column name (string)
  • type — column type (nil = any, or a type identifier)

Example:

[["id", nil], ["name", nil], ["email", nil]]

Column types are advisory for now; all values accept any CVM data type.

Table Tombstones

A tombstone is a table entry with nil schema and nil rows. The timestamp is preserved.

Tombstones are required for lattice-compatible DROP TABLE: a tombstone wins over older live tables during merge.

Row Entries

Each row within a table is stored as a Row Entry, a positional vector:

[values, utime, deleted]
IndexFieldTypeDescription
0valuesvectorColumn values for this row
1utimeintegerUpdate timestamp (epoch millis)
2deletedinteger / nilDeletion timestamp (nil = live)

Row Tombstones

A row tombstone has nil values and a non-nil deleted timestamp. The deletion timestamp represents when the row was deleted.

Primary Keys

Primary keys MUST be blob-like types (Blob, String, AccountKey, etc.) as required by the Index data structure for lexicographic ordering.

Implementations SHOULD provide automatic key conversion:

  • Integer → 8-byte big-endian encoding
  • String → UTF-8 byte encoding
  • Blob → direct use

This allows natural primary key usage while maintaining Index compatibility.

Merge Semantics

Merge operates at multiple levels:

Table Store Merge

Per-table-name merge using SQLTableLattice.merge() for each table entry.

Table Entry Merge

For each table:

  1. Schema: LWW by timestamp — latest schema wins
  2. Rows: Merge using TableLattice (per-primary-key merge)
  3. Tombstone handling: If the schema winner is a tombstone, the tombstone wins (table dropped)

This means schema changes from a later timestamp override earlier schemas, while row data merges independently regardless of schema changes.

Row Entry Merge

For each row (by primary key):

  1. Equal entries — return own (identity)
  2. One side nil — return the other
  3. Compare timestamps — newer wins (LWW)
  4. Equal timestamps — deletion wins (tombstone takes precedence)

These rules satisfy the lattice properties:

  • Commutative: merge(a, b) = merge(b, a)
  • Associative: merge(merge(a, b), c) = merge(a, merge(b, c))
  • Idempotent: merge(a, a) = a

Ownership and Authentication

Like KV Database, Convex SQL uses OwnerLattice for per-owner authentication. Each owner's data is signed with an Ed25519 key pair and verified during merge (CAD038).

The signed state per owner is:

Signed({db-name → {table-name → TableEntry, ...}, ...})

Authentication provides:

  • Data integrity — tampering invalidates the signature
  • Owner verification — only authorised signers can update an owner's tables
  • Multi-key support — address and DID owners may authorise multiple signing keys

Replication Model

Convex SQL uses the same merge-on-write replication model as KV Database:

  1. Each node maintains its own signed replica (databases containing tables)
  2. The node publishes its replica to the lattice at :sql
  3. Lattice Nodes propagate signed replicas to peers
  4. On receive, OwnerLattice merge combines entries from all owners
  5. Applications read the merged owner map and absorb remote data
┌──────────┐         ┌──────────┐         ┌──────────┐
│ Node A │ │ Node B │ │ Node C │
│ │ │ │ │ │
│ SQLDatabase │ SQLDatabase │ SQLDatabase│
│ tables │ │ tables │ │ tables │
│ │ │ │ │ │
│ export() │ │ export() │ │ export() │
│ ↓ │ │ ↓ │ │ ↓ │
│ :sql/A │◄───────►│ :sql/B │◄───────►│ :sql/C │
│ │ Lattice │ │ Lattice │ │
│ │ Merge │ │ Merge │ │
└──────────┘ └──────────┘ └──────────┘

Concurrent Schema Changes

When two nodes change a table's schema concurrently, LWW applies: the schema with the later timestamp wins. Row data merges independently, so rows from both nodes are preserved even if the schema changed.

Applications SHOULD coordinate schema changes through external mechanisms (e.g. governance) to avoid unexpected schema conflicts.

Row-Level Convergence

Row merges are deterministic and converge across all replicas:

  • Rows with different primary keys merge independently
  • Rows with the same primary key use LWW
  • Deletions propagate correctly via tombstones

Operations

Table Operations

OperationDescription
createTable(name, columns)Create table with column names
dropTable(name)Drop table (creates tombstone)
tableExists(name)Check if table exists
getSchema(name)Get column definitions
getColumnNames(name)Get column names as array
getRowCount(name)Get count of live rows
getTableNames()Get all table names

Row Operations

OperationDescription
insert(table, key, values)Insert or update row
selectByKey(table, key)Get row by primary key
selectAll(table)Get all live rows
deleteByKey(table, key)Delete row (creates tombstone)

Replication Operations

OperationDescription
exportReplica()Export signed replica for lattice publication
mergeReplicas(ownerMap)Merge remote replicas into local store
mergeReplicas(ownerMap, filter)Merge with owner filter predicate

SQL Integration

Convex SQL integrates with Apache Calcite to provide full SQL query capabilities over lattice data.

Query Execution Pipeline

SQL Query → Calcite Parser → Calcite Planner → Convex Relational Operators → Results

The Calcite integration provides:

  • SQL parsing — standard SQL syntax via Calcite's SQL parser
  • Query planning — Calcite planner with lattice-optimised rules
  • Schema discovery — automatic Calcite schema bridge from table definitions
  • DDL — CREATE TABLE and DROP TABLE via SQL
  • DML — SELECT, INSERT, UPDATE, DELETE
  • Joins — cross-table queries including merge joins
  • Aggregations — COUNT, SUM, AVG, etc. via Calcite aggregate operators
  • Sorting and projection — ORDER BY, column selection

Primary Key Filter Pushdown

When a query includes a primary key equality predicate (WHERE id = ?), the planner pushes the filter into the table scan, converting a full table scan into an O(log n) index lookup via selectByKey(). This is a critical optimisation for point queries.

Relational Operators

Custom Convex relational operators integrate with Calcite's planner:

OperatorDescription
ConvexTableScanFull or filtered table scan
ConvexFilterFilter with PK pushdown support
ConvexProjectColumn projection
ConvexSortOrdering
ConvexAggregateAggregation functions
ConvexJoinCross-table joins
ConvexMergeJoinMerge join for sorted data

Row Count Optimisation

Row count queries avoid materialising rows — the count is derived directly from the lattice index structure.

PreparedStatement Caching

Calcite query plans are cached for PreparedStatements, avoiding repeated parsing and planning for repeated queries with different parameters.

JDBC Driver

Convex SQL provides a standard JDBC driver for direct integration with any JDBC-compatible tool or application.

Connection URLs

jdbc:convex:mem:mydb              In-memory database (non-persistent)
jdbc:convex:file:/path/to/db.etch Persistent database backed by Etch store

The driver manages database instances automatically — no manual registration or server process required.

Transaction Support

The JDBC driver supports transaction isolation via the lattice fork/sync model:

OperationLattice Effect
setAutoCommit(false)Forks the cursor (snapshot isolation)
commit()Syncs the fork back to parent (lattice merge)
rollback()Discards the fork
setAutoCommit(true)Direct writes, no isolation

Because lattice merge is always well-defined, commits never fail due to conflicts — the merge function deterministically combines the transaction's writes with concurrent state.

Example: JDBC Usage

Connection conn = DriverManager.getConnection("jdbc:convex:mem:mydb");

// DDL
Statement stmt = conn.createStatement();
stmt.execute("CREATE TABLE users (id INTEGER, name VARCHAR, email VARCHAR)");

// DML
PreparedStatement ps = conn.prepareStatement("INSERT INTO users VALUES (?, ?, ?)");
ps.setInt(1, 1);
ps.setString(2, "Alice");
ps.setString(3, "alice@example.com");
ps.execute();

// Query
ResultSet rs = stmt.executeQuery("SELECT * FROM users WHERE id = 1");
while (rs.next()) {
System.out.println(rs.getString("name"));
}

// Transactions
conn.setAutoCommit(false);
stmt.execute("INSERT INTO users VALUES (2, 'Bob', 'bob@example.com')");
conn.commit(); // Lattice merge — always succeeds

conn.close();

PostgreSQL Wire Protocol

Convex SQL includes a PostgreSQL wire protocol server (PgServer), enabling any PostgreSQL-compatible client to connect:

psql -h localhost -p 5432 -d mydb

The PG server reuses the Calcite pipeline and ConvexSchema, providing the same SQL capabilities as the JDBC driver. A pg_catalog virtual schema provides compatibility with PostgreSQL client tooling.

Write operations (INSERT, UPDATE, DELETE) modify the local replica, which is then signed and propagated via the lattice.

Reference Implementation

A reference implementation is provided in the convex-db module (Java).

Classes

Specification ConceptJava ClassPackage
Root database managerConvexDBconvex.db
SQL Database wrapperSQLDatabaseconvex.db.lattice
Table operations facadeSQLSchemaconvex.db.lattice
Table store latticeTableStoreLatticeconvex.db.lattice
Table entry mergeSQLTableLatticeconvex.db.lattice
Table utilitiesSQLTableconvex.db.lattice
Row index latticeTableLatticeconvex.db.lattice
Row entry mergeSQLRowLatticeconvex.db.lattice
Row utilitiesSQLRowconvex.db.lattice
JDBC DriverConvexDriverconvex.db.jdbc
Calcite schema bridgeConvexSchemaconvex.db.calcite
Calcite tableConvexTableconvex.db.calcite
DDL executorConvexDdlExecutorconvex.db.calcite
PK filter pushdownConvexFilterconvex.db.calcite
PostgreSQL serverPgServerconvex.db.psql

The SQLDatabaseTest and ConvexDBTest classes provide test coverage for table operations, JDBC, replication, and merge semantics.

Example: Basic Table Operations

// Create a database with signing key
AKeyPair keyPair = AKeyPair.generate();
SQLDatabase db = SQLDatabase.create("mydb", keyPair);

// Create a table
db.tables().createTable("users", new String[]{"id", "name", "email"});

// Insert rows (primary key can be CVMLong, AString, or ABlob)
db.tables().insert("users", CVMLong.create(1),
Vectors.of(CVMLong.create(1),
Strings.create("Alice"),
Strings.create("alice@example.com")));

// Query by primary key
AVector<ACell> row = db.tables().selectByKey("users", CVMLong.create(1));

// Query all rows
Index<ABlob, AVector<ACell>> allRows = db.tables().selectAll("users");

// Delete a row
db.tables().deleteByKey("users", CVMLong.create(1));

// Drop a table
db.tables().dropTable("users");

Example: Multi-Node Replication

// Create two nodes with different keys
AKeyPair keyA = AKeyPair.generate();
AKeyPair keyB = AKeyPair.generate();

SQLDatabase dbA = SQLDatabase.create("shared", keyA);
SQLDatabase dbB = SQLDatabase.create("shared", keyB);

// Both create the same table
dbA.tables().createTable("data", new String[]{"id", "value"});
dbB.tables().createTable("data", new String[]{"id", "value"});

// Each writes different rows
dbA.tables().insert("data", CVMLong.create(1), Vectors.of(CVMLong.create(1), Strings.create("from-a")));
dbB.tables().insert("data", CVMLong.create(2), Vectors.of(CVMLong.create(2), Strings.create("from-b")));

// Exchange signed replicas
dbA.mergeReplicas(dbB.exportReplica());
dbB.mergeReplicas(dbA.exportReplica());

// Both now see all rows
dbA.tables().getRowCount("data"); // 2
dbB.tables().getRowCount("data"); // 2

Comparison with KV Database

AspectKV Database (CAD037)Convex SQL (CAD039)
Data ModelKey-valueRelational tables
SchemaImplicit (type tag per entry)Explicit column definitions
Key TypesString keysPrimary key (blob-like)
Value TypesMultiple (string, hash, set, counter, etc.)Row vectors
Merge GranularityPer-key, type-specificPer-row (LWW)
Query LanguageAPI methodsSQL (via Calcite)
Use CaseCaches, sessions, countersStructured business data

Both share the same OwnerLattice authentication model and lattice replication infrastructure.

See Also