ADR-002: UUID v4 as Primary Keys for All Entities
Date: 2026-02-19 Status: Accepted
Context
All 57 entities in the Curriculab data model require a primary key strategy. The choice affects:
- Security — sequential integer PKs expose row counts and allow enumeration attacks against REST API endpoints (e.g.,
GET /enrollments/1234). - API design — PKs appear directly in URLs and response bodies; they must be safe to expose to all client roles including students.
- Application-layer ID generation — generating an ID before the
INSERTsimplifies certain patterns (audit log references, idempotency keys, client-side optimistic UI). - Distributed compatibility — even if Curriculab runs on a single PostgreSQL instance today, the schema should not assume that IDs are generated by a single sequence.
- Cross-entity FK chains — the Student table uses a shared PK with User (see ADR-003); UUIDs make this pattern unambiguous.
Decision
All primary keys use UUID version 4 (randomly generated, 128-bit). PostgreSQL's gen_random_uuid() function (available without extensions from PG 13+) is the recommended generation method. Application-layer generation (e.g., via the uuid package in Node.js or Python's uuid.uuid4()) is also acceptable when an ID must be known before the INSERT.
Consequences
Positive
- Non-guessable: exposure of an ID in a URL or log line does not allow enumeration of adjacent records.
- Globally unique without coordination: IDs can be generated in the application, a queue, or a migration script without risk of collision.
- No information leakage: an ID does not reveal the creation order or total row count.
- Consistent: every FK across all 57 entities has the same type (
uuid), simplifying ORMs, query builders, and generated client SDKs.
Negative
- Storage: a UUID is 16 bytes (128 bits) vs. 4–8 bytes for a 32/64-bit integer PK. At scale this affects index and row sizes.
- Index fragmentation: random UUID v4 values insert at random B-tree positions, causing index page splits and bloat over time. This can be mitigated with
pg_repackor periodicVACUUM FULL. - Readability: UUIDs are harder to reference in logs and support conversations than short integers.
Neutral
- The PostgreSQL
uuidcolumn type stores values in 16 bytes (binary), not as the 36-character hyphenated string. Wire protocol and ORM mapping handle the conversion transparently.
Alternatives Considered
Auto-incrementing integer (SERIAL / BIGSERIAL) Simple and storage-efficient. However, sequential IDs are enumerable through the API, which creates an IDOR (Insecure Direct Object Reference) risk unless every endpoint enforces authorization checks. Even with authorization, leaking row count via max-ID is undesirable. Rejected in favour of UUID v4.
UUID v7 (time-ordered)
UUID v7 embeds a millisecond-precision timestamp in the most-significant bits, making values roughly sortable and reducing B-tree fragmentation significantly compared to v4. This is the most compelling alternative. Rejected for the initial schema because PostgreSQL does not include a native gen_uuid_v7() function in v15 (requires the pg_uuidv7 extension or application-layer generation), and UUID v7 library support across languages is less uniform than v4. This decision should be revisited as UUID v7 support matures; migrating from v4 to v7 requires only a schema and generation change, not a data-model restructure.
ULID (Universally Unique Lexicographically Sortable Identifier) Sortable and URL-friendly. Less standardised than UUID; adds a non-standard library dependency in every repo. Rejected in favour of the more widely supported UUID standard.
Composite natural keys
Several entities have strong natural uniqueness constraints (e.g., Enrollment(student_id, section_id)). Using these as PKs reduces one level of indirection but complicates FK chains (every referencing table must carry all columns of the composite key) and API design (URLs become multi-segment). These constraints are retained as UNIQUE constraints alongside a surrogate UUID PK.