Export & Storage Architecture
Last Updated: 2026-01-29
Status: Production Ready
Overview
This document describes the architecture for file exports, storage management, and cleanup mechanisms in the aaperture platform.
Architecture Diagram
┌─────────────────────────────────────────────────────────────────────────────┐
│ FRONTEND │
├─────────────────────────────────────────────────────────────────────────────┤
│ EntityExportDialog │ ExportFormatDialog │ useEntityExport hook │
│ │ │ │ │
│ └────────────────────┴───────────────────────┘ │
│ │ │
│ API Call (GET /export or /export/async) │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ BACKEND │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ ExportController│───▶│ ExportService │───▶│ ExportCsvService│ │
│ │ (Rate Limited) │ │ │ │ ExportExcelSvc │ │
│ └─────────────────┘ └─────────────────┘ │ ExportPdfService│ │
│ │ │ └─────────────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ MetricsService │ │ StorageService │ │ PythonPdfService│ │
│ │ (Prometheus) │ │ │ │ (WeasyPrint) │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ StorageAccess │ │
│ │ Service │ │
│ │ (Access Control)│ │
│ └─────────────────┘ │
│ │ │
└────────────────────────────────┼────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ CLOUDFLARE R2 │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Temporary Files (TTL 24h) │ Permanent Files (No auto-delete) │
│ ───────────────────────── │ ──────────────────────────────── │
│ • exports/ │ • pdfs/quotes/ │
│ • pdfs/ai-reports/ │ • pdfs/invoices/ │
│ • temp/ │ • pdfs/contracts/ │
│ │ • documents/ │
│ │ • avatars/ │
│ │ • logos/ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Security
1. Access Control on Signed URLs
Service: StorageAccessService (backend/src/storage/storage-access.service.ts)
All signed URL requests are validated against user ownership:
// In StorageController.getSignedUrl()
await this.storageAccessService.verifyAccessOrThrow(decodedKey, userId);
const expiration = this.storageAccessService.normalizeExpiration(expirationSeconds);
Verification methods:
- Key pattern matching (userId in path)
- `file_objects` table lookup (org_id check)
- `session_files` table lookup (session owner check)
- `contact_files` table lookup (contact owner check)
- `documents` table lookup (uploaded_by check)
Max expiration: 1 hour (3600 seconds) for most files, 1 week for specific use cases.
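The cheapest check runs first: if the requesting user's id is embedded in the key path, no database lookup is needed. A minimal sketch of that first gate, assuming a key layout such as `exports/<userId>/...` (the helper names here are illustrative, not the actual implementation):

```typescript
import { ForbiddenException } from "@nestjs/common";

// Hypothetical sketch of the key-pattern check; the real StorageAccessService
// falls back to the database lookups listed above when the key does not
// embed the requesting user's id.
function matchesUserPath(key: string, userId: string): boolean {
  // Assumed key layout, e.g. "exports/<userId>/sessions_2026-01-29.csv"
  return key.split("/").includes(userId);
}

async function verifyAccessOrThrowSketch(key: string, userId: string): Promise<void> {
  if (matchesUserPath(key, userId)) return;
  // ...table lookups (file_objects, session_files, contact_files, documents) would go here
  throw new ForbiddenException(`Access denied for key ${key}`);
}
```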
2. Access Control on FileObjects
Service: FileObjectsService (backend/src/file-objects/file-objects.service.ts)
Organization-based access verification:
async getSignedUrl(id: string, requestingUserId: string, expirationSeconds?: number): Promise<string> {
const fileObject = await this.findById(id);
await this.verifyUserAccess(fileObject, requestingUserId);
return this.storageService.getSignedUrl(fileObject.key, expirationSeconds);
}
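The `verifyUserAccess` step reduces to comparing the file's `org_id` with the requesting user's organization; a minimal sketch under that assumption (role-based exceptions, if any, are not shown):

```typescript
import { ForbiddenException } from "@nestjs/common";

// Hypothetical org check; the real method may also handle admin roles or shared files.
function assertSameOrganization(
  fileObject: { org_id: string },
  requestingUser: { id: string; org_id: string },
): void {
  if (requestingUser.org_id !== fileObject.org_id) {
    throw new ForbiddenException("File does not belong to the requesting user's organization");
  }
}
```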
3. Rate Limiting on Exports
Decorator: @RateLimit (backend/src/rate-limiting/rate-limiting.guard.ts)
// Synchronous exports: 10 requests per minute
@RateLimit({ keyPrefix: "export:sync", maxRequests: 10, windowMs: 60000 })
// Asynchronous exports: 5 requests per minute
@RateLimit({ keyPrefix: "export:async", maxRequests: 5, windowMs: 60000 })
File Storage
Unified File Objects
Service: EntityFilesService (backend/src/file-objects/entity-files.service.ts)
All files are stored in the file_objects table with polymorphic references:
interface FileObjectTable {
id: string;
org_id: string;
provider: string; // 'R2'
bucket: string;
key: string;
content_type: string;
size_bytes: number;
checksum: string | null;
created_at: Date;
created_by: string | null;
// Unified columns (migration 0158)
file_name: string | null;
file_url: string | null;
is_image: boolean | null;
updated_at: Date | null;
entity_type: string | null; // 'session', 'contact', 'export', 'quote', etc.
entity_id: string | null; // Polymorphic reference
}
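The (`entity_type`, `entity_id`) pair is what makes the table polymorphic: one query shape serves every attachment type. A minimal read sketch, assuming a Kysely-style query builder (the actual data-access layer may differ):

```typescript
import { Kysely } from "kysely";

// Assumed minimal schema typing for the sketch; the real table has more columns.
interface DB {
  file_objects: {
    id: string;
    key: string;
    file_name: string | null;
    entity_type: string | null;
    entity_id: string | null;
  };
}

// Fetch every file attached to a given entity, regardless of entity type.
async function listEntityFiles(db: Kysely<DB>, entityType: string, entityId: string) {
  return db
    .selectFrom("file_objects")
    .selectAll()
    .where("entity_type", "=", entityType)
    .where("entity_id", "=", entityId)
    .execute();
}
```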
Entity types:
- `session` - Session attachments
- `contact` - Contact attachments
- `export` - Export files
- `quote` - Quote PDFs
- `invoice` - Invoice PDFs
- `contract` - Contract PDFs
- `document` - Uploaded documents
- `avatar` - User avatars
- `logo` - Company logos
- `ai-report` - AI-generated reports
R2 Key Prefixes
Constants: backend/src/export/export.constants.ts
// Temporary files (cleaned after 24h)
export const TEMPORARY_FILE_PREFIXES = [
"exports/",
"pdfs/ai-reports/",
"temp/",
] as const;
// Permanent files (never auto-deleted)
export const PERMANENT_FILE_PREFIXES = [
"pdfs/quotes/",
"pdfs/invoices/",
"pdfs/contracts/",
"documents/",
"avatars/",
"logos/",
] as const;
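A small helper can classify any R2 key against these lists before cleanup decisions are made; a sketch (this helper is illustrative and not part of export.constants.ts):

```typescript
// Hypothetical classification helpers built on the exported prefix lists.
function isTemporaryKey(key: string): boolean {
  return TEMPORARY_FILE_PREFIXES.some((prefix) => key.startsWith(prefix));
}

function isPermanentKey(key: string): boolean {
  return PERMANENT_FILE_PREFIXES.some((prefix) => key.startsWith(prefix));
}
```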
Cleanup Mechanisms
1. Export Files Cleanup (Daily)
Scheduler: ExportCleanupScheduler (backend/src/export/export-cleanup.scheduler.ts)
- Schedule: Every day at 4 AM (`0 4 * * *`)
- Retention: 24 hours (configurable via `EXPORT_RETENTION_HOURS`)
- Scope: Only `TEMPORARY_FILE_PREFIXES`
- Timeout: 30 minutes max
@Cron("0 4 * * *")
async cleanupOldExportFiles() {
const deleted = await this.storageService.deleteExportObjectsOlderThan(
[...TEMPORARY_FILE_PREFIXES],
maxAgeHours,
);
this.metricsService?.recordCleanupJob("export_files", "success", duration, undefined, deleted);
}
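The 30-minute ceiling mentioned above can be enforced in-process by racing the cleanup against a timer; a sketch under that assumption (the real scheduler may enforce the limit differently):

```typescript
const CLEANUP_TIMEOUT_MS = 30 * 60 * 1000;

// Hypothetical timeout guard around a long-running cleanup promise.
async function withTimeout<T>(work: Promise<T>, ms: number = CLEANUP_TIMEOUT_MS): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Cleanup timed out after ${ms} ms`)), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```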
2. Orphaned File Objects Cleanup (Weekly)
Scheduler: FileObjectsCleanupScheduler (backend/src/file-objects/file-objects-cleanup.scheduler.ts)
- Schedule: Every Sunday at 3 AM (`0 3 * * 0`)
- Scope: `file_objects` not referenced by `documents` or `export_jobs`
- Min age: 7 days (to avoid deleting files being uploaded)
@Cron("0 3 * * 0")
async cleanupOrphanedFileObjects() {
// Find orphaned file_objects (no reference in documents or export_jobs)
// Delete R2 file + database record
}
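The orphan detection amounts to an anti-join against the referencing tables. A sketch of the idea as a query string (the foreign-key column names are assumptions, not verified against the schema):

```typescript
// Hypothetical query: file_objects older than 7 days with no reference in
// documents or export_jobs. Column names are assumed.
const FIND_ORPHANED_FILE_OBJECTS = `
  SELECT fo.id, fo.key
  FROM file_objects fo
  LEFT JOIN documents d ON d.file_object_id = fo.id
  LEFT JOIN export_jobs ej ON ej.file_object_id = fo.id
  WHERE d.id IS NULL
    AND ej.id IS NULL
    AND fo.created_at < NOW() - INTERVAL '7 days'
`;
```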
3. Orphaned R2 Files Cleanup (Monthly)
Scheduler: FileObjectsCleanupScheduler
- Schedule: 1st of every month at 4 AM (`0 4 1 * *`)
- Scope: R2 files not referenced in any database table
- Prefixes scanned: `documents/`, `session-files/`, `contact-files/`
- Min age: 7 days
@Cron("0 4 1 * *")
async cleanupOrphanedR2Files() {
// Scan R2 prefixes
// Check if file exists in file_objects, session_files, or contact_files
// Delete if orphaned and older than 7 days
}
4. Avatar/Logo Replacement Cleanup
Location: StorageController (backend/src/storage/storage.controller.ts)
When a user uploads a new avatar or logo, the old file is automatically deleted:
async uploadAvatar(userId: string, file: Express.Multer.File) {
// Get old avatar URL
const user = await this.getUser(userId);
// Upload new avatar
const newUrl = await this.storageService.uploadFile({ ... });
// Delete old avatar if different
if (user.avatar_url && user.avatar_url !== newUrl) {
await this.deleteOldAvatarSafely(user.avatar_url);
}
return newUrl;
}
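`deleteOldAvatarSafely` only needs to recover the storage key from the stored URL and swallow deletion failures, so a failed cleanup never breaks the upload; a sketch (URL layout and method names are assumptions):

```typescript
// Hypothetical best-effort deletion of a replaced avatar or logo.
async function deleteOldAvatarSafely(
  storage: { deleteFile(key: string): Promise<void> },
  oldUrl: string,
): Promise<void> {
  try {
    // Assumed URL layout: https://<public-r2-host>/<key>
    const key = new URL(oldUrl).pathname.replace(/^\//, "");
    await storage.deleteFile(key);
  } catch (error) {
    // Never fail the upload because the old file could not be removed.
    console.warn(`Could not delete replaced file ${oldUrl}:`, error);
  }
}
```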
Performance Optimizations
1. Streaming for Large Exports
Service: ExportCsvService (backend/src/export/export-csv.service.ts)
For exports exceeding 5000 records, data is fetched and written in batches:
async exportToCSVWithPagination<T>(
  columns: CsvColumn[],
  fetchBatch: BatchFetcher<T>,
  batchSize: number = 1000,
): Promise<Buffer> {
  // Stream data in batches instead of loading everything into memory
  let offset = 0;
  while (true) {
    const batch = await fetchBatch(offset, batchSize);
    if (batch.length === 0) break;
    // Write the batch to the CSV stream, then advance the cursor
    offset += batchSize;
  }
  // ...finalize the stream and return the assembled Buffer
}
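Callers supply the batch fetcher, typically a paginated repository query; a usage sketch with structural typing so it stands alone (`findPage` and the exact column shape are assumptions):

```typescript
// Hypothetical caller: export sessions in 1000-row pages without holding the
// whole result set in memory.
interface PagedRepo<T> {
  findPage(opts: { offset: number; limit: number }): Promise<T[]>;
}

interface CsvExporter {
  exportToCSVWithPagination<T>(
    columns: { csvHeader: string; key: string }[],
    fetchBatch: (offset: number, limit: number) => Promise<T[]>,
    batchSize?: number,
  ): Promise<Buffer>;
}

async function exportSessionsCsv(csv: CsvExporter, sessions: PagedRepo<unknown>): Promise<Buffer> {
  const columns = [
    { csvHeader: "Date", key: "start_date" },
    { csvHeader: "Location", key: "location" },
  ];
  return csv.exportToCSVWithPagination(
    columns,
    (offset, limit) => sessions.findPage({ offset, limit }),
    1000,
  );
}
```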
2. Batch Deletion in Cleanup
Service: StorageFileManagementService (backend/src/storage/storage-file-management.service.ts)
Files are deleted in batches of 1000 to avoid memory issues:
async deleteExportObjectsOlderThan(prefixes: string[], maxAgeHours: number): Promise<number> {
  const cutoff = new Date(Date.now() - maxAgeHours * 60 * 60 * 1000);
  const batchSize = 1000;
  let deleted = 0;
  for (const prefix of prefixes) {
    let batch: string[] = [];
    for await (const obj of this.listObjectsByPrefix(prefix)) {
      if (obj.lastModified < cutoff) {
        batch.push(obj.key);
        if (batch.length >= batchSize) {
          await this.deleteBatch(batch);
          deleted += batchSize;
          batch = [];
        }
      }
    }
    // Delete the remaining keys for this prefix
    if (batch.length > 0) {
      await this.deleteBatch(batch);
      deleted += batch.length;
    }
  }
  return deleted;
}
Monitoring (Prometheus Metrics)
Service: MetricsService (backend/src/common/metrics/metrics.service.ts)
Export Metrics
| Metric | Type | Labels | Description |
|---|---|---|---|
| exports_total | Counter | entity, format, status | Total exports performed |
| export_duration_seconds | Histogram | entity, format | Export duration |
| export_size_bytes | Histogram | entity, format | Export file size |
| export_errors_total | Counter | entity, format, error_type | Export errors |
Cleanup Metrics
| Metric | Type | Labels | Description |
|---|---|---|---|
| cleanup_jobs_total | Counter | job_type, status | Cleanup jobs executed |
| cleanup_duration_seconds | Histogram | job_type | Cleanup duration |
| cleanup_files_scanned_total | Counter | job_type | Files scanned |
| cleanup_files_deleted_total | Counter | job_type | Files deleted |
| cleanup_errors_total | Counter | job_type, error_type | Cleanup errors |
Usage
// Record export metrics
this.metricsService.recordExport("sessions", "csv", "success", 2.5, 1024000);
// Record cleanup metrics
this.metricsService.recordCleanupJob("export_files", "success", 120, 5000, 150);
Column Definitions
Constants: backend/src/export/export-columns.constants.ts
Centralized column definitions for all export formats:
export const SESSION_COLUMNS: ColumnDefinition[] = [
{
csvHeader: "Date",
excelHeader: "Date",
key: "start_date",
pdfHeader: "Date",
},
{
csvHeader: "Location",
excelHeader: "Location",
key: "location",
pdfHeader: "Location",
},
// ...
];
export const CONTACT_COLUMNS: ColumnDefinition[] = [
{
csvHeader: "First Name",
excelHeader: "First Name",
key: "first_name",
pdfHeader: "First Name",
},
// ...
];
// Helper functions
export function getValidKeysForEntity(entity: string): string[];
export function getColumnsForEntity(
entity: string,
format: "csv" | "excel" | "pdf",
): Column[];
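The helpers simply project the shared definitions into whichever header set the target format expects; a minimal sketch (the `ColumnDefinition` and `Column` shapes are inferred from the examples above and are assumptions):

```typescript
interface ColumnDefinition {
  csvHeader: string;
  excelHeader: string;
  pdfHeader: string;
  key: string;
}

interface Column {
  header: string;
  key: string;
}

// Hypothetical projection from a shared definition to a format-specific column.
function projectColumns(definitions: ColumnDefinition[], format: "csv" | "excel" | "pdf"): Column[] {
  return definitions.map((def) => ({
    header: format === "csv" ? def.csvHeader : format === "excel" ? def.excelHeader : def.pdfHeader,
    key: def.key,
  }));
}
```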
Field Validation
Service: ExportService (backend/src/export/export.service.ts)
Selected fields are validated against allowed columns:
private validateSelectedFields(entityType: string, selectedFields: string[]): void {
const validKeys = getValidKeysForEntity(entityType);
const invalidFields = selectedFields.filter(f => !validKeys.includes(f));
if (invalidFields.length > 0) {
throw new BadRequestException(
`Invalid fields for ${entityType}: ${invalidFields.join(", ")}. Valid fields: ${validKeys.join(", ")}`
);
}
}
Migration: Unified file_objects
Migration: 0158_unify_file_objects (infra/liquibase/changes/0158_unify_file_objects/)
This migration:
- Adds new columns to `file_objects` (`file_name`, `file_url`, `is_image`, `updated_at`, `entity_type`, `entity_id`)
- Migrates data from `session_files` to `file_objects` with `entity_type = 'session'`
- Migrates data from `contact_files` to `file_objects` with `entity_type = 'contact'`
- Creates compatibility views (`session_files_view`, `contact_files_view`)
- Categorizes existing files by R2 key prefix
Backward compatibility: The original session_files and contact_files tables are preserved. New code should use EntityFilesService which reads/writes to file_objects.
API Endpoints
Synchronous Export
GET /api/export?entity={entity}&format={format}&fields={fields}
Rate limit: 10 requests per minute
Response:
{
"url": "https://r2.../exports/user123/sessions_2026-01-29.csv?...",
"key": "exports/user123/sessions_2026-01-29.csv",
"filename": "sessions-2026-01-29.csv",
"expiration": 3600
}
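A client only needs the signed `url` and `filename` from this response; a hedged fetch sketch of what `useEntityExport` ultimately does (endpoint base path and download mechanics are assumptions):

```typescript
// Hypothetical client-side call to the synchronous export endpoint.
interface ExportResponse {
  url: string;
  key: string;
  filename: string;
  expiration: number;
}

async function downloadExport(entity: string, format: string, fields: string[]): Promise<void> {
  const params = new URLSearchParams({ entity, format, fields: fields.join(",") });
  const res = await fetch(`/api/export?${params}`);
  if (!res.ok) throw new Error(`Export failed with status ${res.status}`);
  const { url, filename }: ExportResponse = await res.json();

  // Trigger a browser download from the signed R2 URL before it expires.
  const link = document.createElement("a");
  link.href = url;
  link.download = filename;
  link.click();
}
```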
Asynchronous Export
GET /api/export/async?entity={entity}&format={format}&fields={fields}
Rate limit: 5 requests per minute
Creates a BullMQ job and returns immediately. Result is delivered via WebSocket or can be polled.
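The producer side of the async path can be as small as adding a job to the export queue with the standard BullMQ API; a sketch (queue name, payload shape, and Redis connection are assumptions):

```typescript
import { Queue } from "bullmq";

// Hypothetical producer; the worker that performs the export is not shown.
const exportQueue = new Queue("export", {
  connection: { host: "localhost", port: 6379 },
});

async function enqueueExport(userId: string, entity: string, format: string, fields: string[]) {
  // Returns immediately; progress and the final signed URL are delivered via
  // WebSocket or polled, as described above.
  return exportQueue.add("export-entity", { userId, entity, format, fields });
}
```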
Signed URL Generation
GET /api/storage/signed-url/:key?expirationSeconds={seconds}
Access control: Verified via StorageAccessService
Max expiration: 3600 seconds (1 hour)
Best Practices
- Always use `TEMPORARY_FILE_PREFIXES` for exports - Files will be auto-cleaned after 24h
- Use `PERMANENT_FILE_PREFIXES` for business documents - Quotes, invoices, contracts are never auto-deleted
- Record metrics - Use `MetricsService` for all export and cleanup operations
- Validate fields - Always validate `selectedFields` against allowed columns
- Use `EntityFilesService` - For new file operations, use the unified service instead of direct table access
- Handle timeouts - Cleanup jobs have 30-minute timeout protection
Related Documentation
- EXPORT_SYSTEM.md - Export system overview and frontend integration
- RESILIENCE_METRICS.md - Prometheus metrics documentation