Database Cleanup Audit
This document is the active entry point for schema and data cleanup work.
It is intentionally short: the goal is to identify likely cleanup targets before any destructive migration or data purge.
Current Approach
- Identify legacy tables no longer used by application code.
- Separate:
- schema candidates to drop,
- data candidates to purge,
- active tables that only need retention rules.
- Require a usage check before deletion:
- backend queries,
- frontend/API exposure,
- operational scripts,
- migration history and rollback expectations.
Confirmed Schema/Code Drift
These need cleanup or correction before any broad delete campaign.
Drift already cleaned
session_reminders- stale runtime reference removed from the insights rule
- backend runtime no longer references
session_reminders - table/type can now be treated as a true schema cleanup candidate instead of an active dependency
- legacy workflow compatibility references
- removed from
backend/src/db/database.types.ts - runtime fallbacks removed from:
progression/helpers/badge-checker.helper.tssession-info/session-client-info-prefill.service.ts
- orphan service removed:
client-access/client-access-tasks.service.ts
- this clears the remaining code dependency on:
user_workflowsuser_workflow_phasesuser_workflow_tasksdefault_workflowsdefault_workflow_phasesdefault_workflow_taskssession_workflow_tasks
- removed from
This means the next cleanup pass should include:
- removing stale runtime references,
- aligning Kysely database types with the real schema,
- only then dropping confirmed-dead schema leftovers.
Initial Candidate Tables
These are the first tables to review because they look legacy or replaced by newer flows.
High-confidence review candidates
projectsproject_tagsproject_contactsproject_vendorsproject_filescontact_projects
Rationale:
- the product pivot is now
sessions, notprojects, - recent code search shows no active backend table queries (
selectFrom,insertInto,updateTable,deleteFrom) for these legacy project tables in the main app code, - some remaining references are documentation, generated client types, or legacy E2E artifacts that need separate cleanup review.
Review carefully before touching
client_access_tokens- legacy
/clientUX was hard-cut, but the table still appears in active backend flows (client-accessservices, contracts/billing link generation, contacts merge)
- legacy
migration_logs- active operational role in migration observability and sync with Liquibase
duplicatesuser_duplicates- still active in backend, frontend, OpenAPI, navigation, and E2E
user_reminder_settings- still used by billing reminder services
session_workflow_action_runs- part of the workflow v2 runtime schema even if direct query usage is limited
Zero-usage or near-zero-usage review candidates
These currently appear to have no direct backend runtime references outside docs/types and should be reviewed table-by-table:
professional_type_profilesorganization_settingsprogression_settingsgdpr_processing_activities
Important:
- zero direct code references do not automatically mean safe to drop,
- but they are strong review candidates for the next schema cleanup batch.
Current code verdict after strict scan:
professional_type_profiles- no direct runtime query found in backend/frontend
- currently referenced only in DB types and documentation
organization_settings- no direct runtime query found in backend/frontend
- apparent false positive: insights “organization settings” flows use
insight_settings, notorganization_settings
progression_settings- no direct runtime query found in backend/frontend
- currently referenced only in DB types and documentation
gdpr_processing_activities- no direct runtime query found in backend/frontend
- currently referenced only in DB types and documentation
Now strong schema cleanup candidates after code cleanup
user_workflowsuser_workflow_phasesuser_workflow_tasksdefault_workflowsdefault_workflow_phasesdefault_workflow_taskssession_workflow_tasks
Rationale:
- no backend runtime references remain after the latest cleanup pass,
- the compatibility types and fallback code have been removed,
- these are now candidates for a dedicated Liquibase drop migration after a final DB-level existence check.
Data Cleanup Candidates
These are likely to need retention or purge rules rather than immediate table deletion.
- expired tokens in auth-related tables
- old webhook event logs
- old scheduled email execution artifacts
- worker/runtime tables with historical rows:
job_runsoutbox_eventsfile_objectsexport_jobsai_tasks
- old notification/event tracking rows:
email_trackingaudit_logsmigration_logs
Recommended Next Pass
- Export an authoritative inventory from the live database:
- row counts per table
- total relation size
- last write timestamps when available
- use
infra/db-cleanup-audit.sqlfor a non-destructive first pass
- Fix schema/code drift first:
- confirm the live schema still contains the legacy workflow/session reminder tables before dropping them
- Confirm no runtime references remain for the legacy project tables.
- Draft cleanup migrations in small batches:
- batch 1: dead legacy tables
- batch 2: retention policies and archival/purge jobs
- Validate backups and rollback before dropping anything in production.
Non-Goals
- No deletion based only on documentation.
- No dropping tables that still appear in code, OpenAPI, or operational workflows.
- No data purge without a retention decision and a production backup path.
Audit Command
Run the inventory audit before any destructive migration:
psql "$DATABASE_URL" -f infra/db-cleanup-audit.sql
Expected outputs:
- candidate table presence in
public - approximate live row counts
- total relation size
- insert/update/delete counters from
pg_stat_user_tables - max
created_at/updated_atwhen those columns exist