Database Cleanup Audit

This document is the active entry point for schema and data cleanup work.

It is intentionally short: the goal is to identify likely cleanup targets before any destructive migration or data purge.

Current Approach

Identify legacy tables no longer used by application code.
Separate:
- schema candidates to drop,
- data candidates to purge,
- active tables that only need retention rules.
Require a usage check before deletion:
- backend queries,
- frontend/API exposure,
- operational scripts,
- migration history and rollback expectations.

Confirmed Schema/Code Drift

These need cleanup or correction before any broad delete campaign.

Drift already cleaned

session_reminders
- stale runtime reference removed from the insights rule
- backend runtime no longer references session_reminders
- table/type can now be treated as a true schema cleanup candidate instead of an active dependency
legacy workflow compatibility references
- removed from backend/src/db/database.types.ts
- runtime fallbacks removed from:
  - progression/helpers/badge-checker.helper.ts
  - session-info/session-client-info-prefill.service.ts
- orphan service removed:
  - client-access/client-access-tasks.service.ts
- this clears the remaining code dependency on:
  - user_workflows
  - user_workflow_phases
  - user_workflow_tasks
  - default_workflows
  - default_workflow_phases
  - default_workflow_tasks
  - session_workflow_tasks

This means the next cleanup pass should include:

removing stale runtime references,
aligning Kysely database types with the real schema,
only then dropping confirmed-dead schema leftovers.

Initial Candidate Tables

These are the first tables to review because they look legacy or replaced by newer flows.

High-confidence review candidates

projects
project_tags
project_contacts
project_vendors
project_files
contact_projects

Rationale:

the product pivot is now sessions, not projects,
recent code search shows no active backend table queries (selectFrom, insertInto, updateTable, deleteFrom) for these legacy project tables in the main app code,
some remaining references are documentation, generated client types, or legacy E2E artifacts that need separate cleanup review.

Review carefully before touching

client_access_tokens
- legacy /client UX was hard-cut, but the table still appears in active backend flows (client-access services, contracts/billing link generation, contacts merge)
migration_logs
- active operational role in migration observability and sync with Liquibase
duplicates
user_duplicates
- still active in backend, frontend, OpenAPI, navigation, and E2E
user_reminder_settings
- still used by billing reminder services
session_workflow_action_runs
- part of the workflow v2 runtime schema even if direct query usage is limited

Zero-usage or near-zero-usage review candidates

These currently appear to have no direct backend runtime references outside docs/types and should be reviewed table-by-table:

professional_type_profiles
organization_settings
progression_settings
gdpr_processing_activities

Important:

zero direct code references do not automatically mean safe to drop,
but they are strong review candidates for the next schema cleanup batch.

Current code verdict after strict scan:

professional_type_profiles
- no direct runtime query found in backend/frontend
- currently referenced only in DB types and documentation
organization_settings
- no direct runtime query found in backend/frontend
- apparent false positive: insights “organization settings” flows use insight_settings, not organization_settings
progression_settings
- no direct runtime query found in backend/frontend
- currently referenced only in DB types and documentation
gdpr_processing_activities
- no direct runtime query found in backend/frontend
- currently referenced only in DB types and documentation

Now strong schema cleanup candidates after code cleanup

user_workflows
user_workflow_phases
user_workflow_tasks
default_workflows
default_workflow_phases
default_workflow_tasks
session_workflow_tasks

Rationale:

no backend runtime references remain after the latest cleanup pass,
the compatibility types and fallback code have been removed,
these are now candidates for a dedicated Liquibase drop migration after a final DB-level existence check.

Data Cleanup Candidates

These are likely to need retention or purge rules rather than immediate table deletion.

expired tokens in auth-related tables
old webhook event logs
old scheduled email execution artifacts
worker/runtime tables with historical rows:
- job_runs
- outbox_events
- file_objects
- export_jobs
- ai_tasks
old notification/event tracking rows:
- email_tracking
- audit_logs
- migration_logs

Recommended Next Pass

Export an authoritative inventory from the live database:
- row counts per table
- total relation size
- last write timestamps when available
- use infra/db-cleanup-audit.sql for a non-destructive first pass
Fix schema/code drift first:
- confirm the live schema still contains the legacy workflow/session reminder tables before dropping them
Confirm no runtime references remain for the legacy project tables.
Draft cleanup migrations in small batches:
- batch 1: dead legacy tables
- batch 2: retention policies and archival/purge jobs
Validate backups and rollback before dropping anything in production.

Non-Goals

No deletion based only on documentation.
No dropping tables that still appear in code, OpenAPI, or operational workflows.
No data purge without a retention decision and a production backup path.

Audit Command

Run the inventory audit before any destructive migration:

psql "$DATABASE_URL" -f infra/db-cleanup-audit.sql

Expected outputs:

candidate table presence in public
approximate live row counts
total relation size
insert/update/delete counters from pg_stat_user_tables
max created_at / updated_at when those columns exist

Current Approach​

Confirmed Schema/Code Drift​

Drift already cleaned​

Initial Candidate Tables​

High-confidence review candidates​

Review carefully before touching​

Zero-usage or near-zero-usage review candidates​

Now strong schema cleanup candidates after code cleanup​

Data Cleanup Candidates​

Recommended Next Pass​

Non-Goals​

Audit Command​