Datomic is immutable by design. Retraction marks data as “not current” but preserves history. Excision exists but is operationally heavyweight: it requires a transactor restart and is intended for removing sensitive data that was stored by accident, not for routine GDPR requests.
GDPR Article 17 (“right to erasure”) requires the ability to delete personal data on request. We need a compliant approach that works with Datomic’s immutability.
Encrypt PII at rest with per-user keys. On deletion request, destroy the key. The encrypted data remains in Datomic history but is cryptographically unreadable — effectively “forgotten”.
┌─────────────────────────────────────────────────────────────┐
│ Datomic (datomic_kvs in PostgreSQL)                         │
│ - Immutable, history preserved                              │
│ - Stores encrypted PII (bytes)                              │
│ - Stores lookup hashes (UUIDs, not PII)                     │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│ PostgreSQL table: user_keys                                 │
│ - Mutable, no history                                       │
│ - DELETE = truly gone                                       │
│ - Stores per-user DEKs (encrypted with master KEK)          │
└─────────────────────────────────────────────────────────────┘
Key insight: Datomic data lives in PostgreSQL (datomic_kvs table), but a
separate user_keys table can use normal SQL DELETE — no history, no excision
needed.
- DEK (Data Encryption Key): Per-user AES-256-GCM key
- KEK (Key Encryption Key): Master key, encrypts all DEKs
- DEKs stored in PostgreSQL, encrypted with KEK
- KEK stored in secure location (environment variable, KMS, or 1Password)
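The hierarchy above is ordinary envelope encryption. A minimal sketch using only JDK classes through interop, assuming a random 12-byte nonce prepended to each ciphertext. The helper names (random-bytes, aes-gcm, encrypt, decrypt) are illustrative and not taken from bits.cryptex, and the KEK is generated in place only so the example runs:

```clojure
(import '(java.security SecureRandom)
        '(java.util Arrays)
        '(javax.crypto Cipher)
        '(javax.crypto.spec GCMParameterSpec SecretKeySpec))

(defn random-bytes
  "Returns n cryptographically random bytes."
  ^bytes [n]
  (let [b (byte-array n)]
    (.nextBytes (SecureRandom.) b)
    b))

(defn aes-gcm
  "Runs AES-GCM (128-bit tag) in the given Cipher mode."
  ^bytes [mode ^bytes k ^bytes nonce ^bytes data]
  (let [c (Cipher/getInstance "AES/GCM/NoPadding")]
    (.init c (int mode) (SecretKeySpec. k "AES") (GCMParameterSpec. 128 nonce))
    (.doFinal c data)))

(defn encrypt
  "Returns nonce || ciphertext || tag."
  ^bytes [^bytes k ^bytes plaintext]
  (let [nonce (random-bytes 12)]
    (byte-array (concat nonce (aes-gcm Cipher/ENCRYPT_MODE k nonce plaintext)))))

(defn decrypt
  "Reverses encrypt; throws if the key is wrong or the data was altered."
  ^bytes [^bytes k ^bytes blob]
  (aes-gcm Cipher/DECRYPT_MODE k
           (Arrays/copyOfRange blob 0 12)
           (Arrays/copyOfRange blob 12 (alength blob))))

(def kek (random-bytes 32))                    ; assumption: really loaded from env var or KMS
(def dek (random-bytes 32))                    ; per-user DEK
(def wrapped-dek (encrypt kek dek))            ; what the user_keys.dek column would hold
(def unwrapped-dek (decrypt kek wrapped-dek))  ; same bytes as dek
```

Only wrapped-dek would be persisted. The plaintext DEK exists in memory just long enough to encrypt or decrypt a user's fields.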
- Generate random DEK for user
- Encrypt DEK with KEK, store in user_keys table
- Hash email: (hasch/uuid "alice@example.com") → deterministic UUID
- Encrypt email with DEK → bytes
- Store hash (for lookup) and encrypted email (for display) in Datomic
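The registration steps above could be sketched as one pure function that returns both pieces of data to persist. This is an illustration under assumptions: registration-data and its return shape are hypothetical, email->uuid is a self-contained stand-in for hasch/uuid (it is not expected to produce the same values), and the :user/id attribute is assumed from the query shown elsewhere in this document.

```clojure
(import '(java.nio ByteBuffer)
        '(java.nio.charset StandardCharsets)
        '(java.security MessageDigest SecureRandom)
        '(java.util UUID)
        '(javax.crypto Cipher)
        '(javax.crypto.spec GCMParameterSpec SecretKeySpec))

(defn random-bytes ^bytes [n]
  (let [b (byte-array n)]
    (.nextBytes (SecureRandom.) b)
    b))

(defn encrypt
  "AES-256-GCM; returns a fresh 12-byte nonce followed by ciphertext and tag."
  ^bytes [^bytes k ^bytes plaintext]
  (let [nonce (random-bytes 12)
        c     (Cipher/getInstance "AES/GCM/NoPadding")]
    (.init c Cipher/ENCRYPT_MODE (SecretKeySpec. k "AES") (GCMParameterSpec. 128 nonce))
    (byte-array (concat nonce (.doFinal c plaintext)))))

(defn email->uuid
  "Stand-in for (hasch/uuid email): the first 128 bits of SHA-256 as a UUID."
  [^String email]
  (let [digest (.digest (MessageDigest/getInstance "SHA-256")
                        (.getBytes email StandardCharsets/UTF_8))
        buf    (ByteBuffer/wrap digest)]
    (UUID. (.getLong buf) (.getLong buf))))

(defn registration-data
  "Returns the user_keys row and the Datomic entity map for a new user."
  [^bytes kek user-id ^String email]
  (let [dek (random-bytes 32)]                        ; fresh per-user DEK
    {:key-row {:user_id user-id
               :dek     (encrypt kek dek)}            ; DEK wrapped with the KEK
     :tx-data [{:user/id              user-id         ; assumed attribute
                :user/email-hash      (email->uuid email)
                :user/email-encrypted (encrypt dek (.getBytes email StandardCharsets/UTF_8))}]}))

(def result
  (registration-data (random-bytes 32) (UUID/randomUUID) "alice@example.com"))
```

The caller would insert :key-row into user_keys and transact :tx-data into Datomic. Doing the key insert first avoids ever having ciphertext in Datomic without a key to read it.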
- Hash input email → UUID
- Index lookup on :user/email-hash (fast, indexed)
- Found user → verify password
- Fetch encrypted email from Datomic
- Fetch DEK from user_keys, decrypt with KEK
- Decrypt email with DEK
- Display to user
DELETE FROM user_keys WHERE user_id = ?;
Key is gone. Encrypted email in Datomic history is now unreadable garbage.
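The display and deletion flows above can be exercised together in a small sketch. An atom stands in for the user_keys table (an assumption made so the example runs without a database), and read-email is a hypothetical helper, not an existing function in the codebase:

```clojure
(import '(java.nio.charset StandardCharsets)
        '(java.security SecureRandom)
        '(java.util Arrays UUID)
        '(javax.crypto Cipher)
        '(javax.crypto.spec GCMParameterSpec SecretKeySpec))

(defn random-bytes ^bytes [n]
  (let [b (byte-array n)]
    (.nextBytes (SecureRandom.) b)
    b))

(defn aes-gcm ^bytes [mode ^bytes k ^bytes nonce ^bytes data]
  (let [c (Cipher/getInstance "AES/GCM/NoPadding")]
    (.init c (int mode) (SecretKeySpec. k "AES") (GCMParameterSpec. 128 nonce))
    (.doFinal c data)))

(defn encrypt ^bytes [^bytes k ^bytes plaintext]
  (let [nonce (random-bytes 12)]
    (byte-array (concat nonce (aes-gcm Cipher/ENCRYPT_MODE k nonce plaintext)))))

(defn decrypt ^bytes [^bytes k ^bytes blob]
  (aes-gcm Cipher/DECRYPT_MODE k
           (Arrays/copyOfRange blob 0 12)
           (Arrays/copyOfRange blob 12 (alength blob))))

;; Hypothetical in-memory stand-in for the user_keys table: user-id -> wrapped DEK.
(def user-keys (atom {}))

(defn read-email
  "Returns the plaintext email, or nil when the user's key row no longer exists."
  [kek user-id email-ciphertext]
  (when-let [wrapped (get @user-keys user-id)]
    (let [dek (decrypt kek wrapped)]
      (String. (decrypt dek email-ciphertext) StandardCharsets/UTF_8))))

(def kek (random-bytes 32))
(def user-id (UUID/randomUUID))
(def dek (random-bytes 32))
(swap! user-keys assoc user-id (encrypt kek dek))

;; This value plays the role of :user/email-encrypted in Datomic history.
(def stored-email
  (encrypt dek (.getBytes "alice@example.com" StandardCharsets/UTF_8)))

(def before (read-email kek user-id stored-email)) ; readable while the key row exists
(swap! user-keys dissoc user-id)                   ; DELETE FROM user_keys WHERE user_id = ?
(def after (read-email kek user-id stored-email))  ; nil: ciphertext remains, key does not
```

The ciphertext is never modified; only the key row disappears, which is why no Datomic transaction is involved in the deletion.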
;; Lookup hash: indexed, not PII, one-way (cannot be inverted, though a known email can be re-hashed and matched)
{:db/ident :user/email-hash
:db/valueType :db.type/uuid
:db/cardinality :db.cardinality/one
:db/unique :db.unique/identity}
;; Encrypted value — PII, marked for crypto-shredding awareness
{:db/ident :user/email-encrypted
:db/valueType :db.type/bytes
:db/cardinality :db.cardinality/one
:bits/pii true}
The :bits/pii attribute is custom schema metadata. Application code can query for all PII attributes and handle them appropriately.
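As an illustration of handling PII attributes generically, the sketch below collects flagged idents from schema data in plain Clojure. The pii-attributes helper is hypothetical. The Datalog form at the end is shown as data only and is not executed here; it assumes :bits/pii is itself installed as a boolean attribute.

```clojure
(def schema
  [{:db/ident       :user/email-hash
    :db/valueType   :db.type/uuid
    :db/cardinality :db.cardinality/one
    :db/unique      :db.unique/identity}
   {:db/ident       :user/email-encrypted
    :db/valueType   :db.type/bytes
    :db/cardinality :db.cardinality/one
    :bits/pii       true}])

(defn pii-attributes
  "Returns the idents of all attributes flagged with :bits/pii."
  [schema]
  (into #{} (comp (filter :bits/pii) (map :db/ident)) schema))

(def pii (pii-attributes schema)) ; => #{:user/email-encrypted}

;; Against a live database the same question could be asked in Datalog.
;; Assumption: :bits/pii is installed as an attribute of type boolean.
(def pii-query
  '[:find [?ident ...]
    :where
    [?attr :bits/pii true]
    [?attr :db/ident ?ident]])
```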
CREATE TABLE user_keys (
user_id UUID PRIMARY KEY REFERENCES ... ,
dek BYTEA NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
- Deterministic: same input always produces same UUID
- Native Datomic type, efficiently indexed
- 128-bit output is sufficient: by the birthday bound, accidental collisions only become likely after roughly 2^64 distinct emails
- Already in deps (io.replikativ/hasch)
- Cleaner than hex-encoded hash strings
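To make the determinism property concrete, here is a small sketch. It uses a self-contained stand-in (the first 128 bits of SHA-256) rather than hasch/uuid, so the values are not expected to match hasch, but the behaviour being shown is the same: byte-identical input gives an identical UUID, and any difference in the input gives a different one.

```clojure
(import '(java.nio ByteBuffer)
        '(java.nio.charset StandardCharsets)
        '(java.security MessageDigest)
        '(java.util UUID))

(defn email->uuid
  "Stand-in for (hasch/uuid email): the first 128 bits of SHA-256 as a UUID."
  [^String email]
  (let [digest (.digest (MessageDigest/getInstance "SHA-256")
                        (.getBytes email StandardCharsets/UTF_8))
        buf    (ByteBuffer/wrap digest)]
    (UUID. (.getLong buf) (.getLong buf))))

(def a (email->uuid "alice@example.com"))
(def b (email->uuid "alice@example.com")) ; equal to a
(def c (email->uuid "Alice@example.com")) ; differs from a: hashing is case-sensitive
```

Because the hash is sensitive to case and whitespace, registration and login need to agree on how the email is normalised before hashing. Whether and how to normalise is a decision this sketch leaves open.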
Datomic queries can call arbitrary Clojure functions:
[:find ?email
:in $ ?user-id decrypt-fn
:where
[?u :user/id ?user-id]
[?u :user/email-encrypted ?ciphertext]
[(decrypt-fn ?ciphertext) ?email]]
The function runs on the peer, where keys are available. However, it cannot use indexes: use hash lookups to find users, and decrypt for display only.
Where does the master key live?
- Environment variable (simple, works for single-node)
- AWS KMS / GCP KMS (HSM-backed, audit logs, key rotation)
- HashiCorp Vault (self-hosted option)
- 1Password (already in stack, but not designed for programmatic access at scale)
For self-hosters, an environment variable is pragmatic. Hosted Bits could use KMS.
How do we rotate the KEK?
- Generate new KEK
- Re-encrypt all DEKs with new KEK
- This is a PostgreSQL-only operation, no Datomic changes needed
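The rotation steps above amount to unwrapping each DEK with the old KEK and wrapping it again with the new one. A minimal sketch over in-memory rows, assuming each row is a map with :user_id and :dek as in the user_keys table; rotate-kek is a hypothetical helper, and a real implementation would presumably run inside a single SQL transaction:

```clojure
(import '(java.security SecureRandom)
        '(java.util Arrays UUID)
        '(javax.crypto Cipher)
        '(javax.crypto.spec GCMParameterSpec SecretKeySpec))

(defn random-bytes ^bytes [n]
  (let [b (byte-array n)]
    (.nextBytes (SecureRandom.) b)
    b))

(defn aes-gcm ^bytes [mode ^bytes k ^bytes nonce ^bytes data]
  (let [c (Cipher/getInstance "AES/GCM/NoPadding")]
    (.init c (int mode) (SecretKeySpec. k "AES") (GCMParameterSpec. 128 nonce))
    (.doFinal c data)))

(defn encrypt ^bytes [^bytes k ^bytes plaintext]
  (let [nonce (random-bytes 12)]
    (byte-array (concat nonce (aes-gcm Cipher/ENCRYPT_MODE k nonce plaintext)))))

(defn decrypt ^bytes [^bytes k ^bytes blob]
  (aes-gcm Cipher/DECRYPT_MODE k
           (Arrays/copyOfRange blob 0 12)
           (Arrays/copyOfRange blob 12 (alength blob))))

(defn rotate-kek
  "Re-wraps every row's DEK under new-kek. The DEKs themselves are unchanged,
  so nothing encrypted in Datomic needs to be touched."
  [old-kek new-kek rows]
  (mapv (fn [row]
          (update row :dek #(encrypt new-kek (decrypt old-kek %))))
        rows))

(def old-kek (random-bytes 32))
(def new-kek (random-bytes 32))
(def dek (random-bytes 32))
(def rows [{:user_id (UUID/randomUUID) :dek (encrypt old-kek dek)}])
(def rotated (rotate-kek old-kek new-kek rows))
```

After rotation the old KEK can no longer unwrap the stored DEKs, so it should only be retired once the rewritten rows are committed and backed up.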
- Datomic backups contain encrypted blobs (safe)
- user_keys table must be backed up separately
- Restore requires both to be in sync
- Consider: backup user_keys encrypted with offline key
- GDPR compliant without Datomic excision
- Defense in depth — breach of Datomic alone is insufficient
- Can leverage :bits/pii schema attribute for automated handling
- Existing bits.cryptex can be extended for this
- Two data stores to manage (Datomic + user_keys table)
- Cannot query encrypted values directly (hash lookup required)
- Key management complexity (KEK storage, rotation, backup)
- Slightly more complex registration/login flow
- PostgreSQL already in stack, no new infrastructure
- Pattern is well-established (envelope encryption + crypto-shredding)