File Service Architecture (Draft)
Mission Summary
Provide a permission-gated file management subsystem with drag-and-drop uploads, secure sharing, and backend-driven downloads.
Enforce API-only access to binary content while supporting multiple storage providers and ExecModule-powered transformations.
Build with TBWEEP (tests before writing executable production code) so functionality is always covered by automated checks.
Domain Model
files – core metadata (UUID id, org_id, uploader_id, storage_key, driver, filename, size_bytes, mime_type, sha256, etag, status, virus_scan_state, created_at, updated_at, deleted_at, retention_expires_at).
file_acl – resource-level grants {file_id, subject_type, subject_id, permissions(read,write,share,delete), granted_by, granted_at}; mirrors but does not replace Spring Security ACL tables.
file_upload_sessions – multipart/resumable uploads (upload_id, file_id, storage_key, storage_driver, part_size, initiated_by, expires_at, completed_at, metadata_json).
spaces – logical data rooms (id, org_id, name, description, owner_id, created_at, updated_at, visibility, status).
space_members – memberships {space_id, subject_type(user|org|role), subject_id, role(owner|admin|editor|viewer), invited_by, invited_at}.
space_files – join table that allows associating files with zero or more spaces (primary context for breadcrumbs / navigation).
file_jobs – ExecModule requests and status (job_id, file_id, module_key, status, requested_by, payload_json, result_json, started_at, completed_at, error_reason).
file_tokens – short-lived download/view tokens (token_id, file_id, slug, jwt_id, expires_at, max_uses, remaining_uses, disposition, scope, created_by).
file_tags – normalized tagging (file_id, tag, created_by, created_at) to support filters.
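As an illustration of how file_acl rows might be evaluated, here is a minimal sketch of an access check. All class and method names are hypothetical, and it assumes file_acl uses the same subject types as space_members (user, org, role), which this draft does not state. It does not model the owner's default rights.

```java
import java.util.List;
import java.util.Set;

// Hypothetical names throughout; only the column names come from the file_acl table.
enum FilePermission { READ, WRITE, SHARE, DELETE }

// Assumption: file_acl reuses the subject types listed for space_members.
enum SubjectType { USER, ORG, ROLE }

// One file_acl row, reduced to the fields needed for an access check.
record FileAclGrant(String fileId, SubjectType subjectType, String subjectId,
                    Set<FilePermission> permissions) {}

// One (type, id) pair the caller can be matched under, e.g. their user id or a role.
record Subject(SubjectType type, String id) {}

final class AclEvaluator {
    // Returns true when any grant on the file matches one of the caller's
    // subjects and includes the requested permission.
    static boolean isAllowed(List<FileAclGrant> grants, String fileId,
                             List<Subject> callerSubjects, FilePermission wanted) {
        for (FileAclGrant grant : grants) {
            if (!grant.fileId().equals(fileId) || !grant.permissions().contains(wanted)) {
                continue;
            }
            for (Subject subject : callerSubjects) {
                if (subject.type() == grant.subjectType()
                        && subject.id().equals(grant.subjectId())) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

A caller holding several identities (a user id plus one or more roles) passes all of them in callerSubjects, so a single role grant is enough to allow the action.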
Storage Drivers
Interface StorageDriver with multipart init/complete, presign (GET/PUT), head, copy, move, delete.
Drivers shipped in the MVP: S3StorageDriver, LocalFSStorageDriver, GCSStorageDriver, AzureBlobStorageDriver.
Driver selection via Spring configuration (valkyrai.file-storage.driver=s3|local|gcs|azure).
ExecModule copy/move jobs use the driver factory to instantiate the source and target providers at runtime.
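The driver factory and a cross-driver copy could look roughly like the sketch below. The interface is deliberately reduced: it omits the multipart, presign, head, and move operations, and the method signatures are assumptions rather than the actual StorageDriver contract. InMemoryStorageDriver is a stand-in so the example runs without any cloud SDK; it is not one of the shipped drivers.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Reduced, hypothetical view of the StorageDriver contract.
interface StorageDriver {
    String name();
    void put(String key, byte[] content);
    Optional<byte[]> get(String key);
    void delete(String key);
}

// In-memory stand-in used only for illustration.
final class InMemoryStorageDriver implements StorageDriver {
    private final String name;
    private final Map<String, byte[]> objects = new ConcurrentHashMap<>();

    InMemoryStorageDriver(String name) { this.name = name; }

    @Override public String name() { return name; }

    @Override public void put(String key, byte[] content) {
        objects.put(key, content.clone()); // defensive copy
    }

    @Override public Optional<byte[]> get(String key) {
        return Optional.ofNullable(objects.get(key)).map(bytes -> bytes.clone());
    }

    @Override public void delete(String key) { objects.remove(key); }
}

// Resolves drivers by the configured name (s3 | local | gcs | azure) and caches instances.
final class StorageDriverFactory {
    private final Map<String, Supplier<StorageDriver>> suppliers;
    private final Map<String, StorageDriver> cache = new ConcurrentHashMap<>();

    StorageDriverFactory(Map<String, Supplier<StorageDriver>> suppliers) {
        this.suppliers = Map.copyOf(suppliers);
    }

    StorageDriver forName(String driverName) {
        Supplier<StorageDriver> supplier = suppliers.get(driverName);
        if (supplier == null) {
            throw new IllegalArgumentException("Unknown storage driver: " + driverName);
        }
        return cache.computeIfAbsent(driverName, name -> supplier.get());
    }

    // Copies one object between two drivers, as an ExecModule copy job might.
    void copy(String sourceDriver, String sourceKey, String targetDriver, String targetKey) {
        byte[] content = forName(sourceDriver).get(sourceKey)
            .orElseThrow(() -> new IllegalStateException("Missing object: " + sourceKey));
        forName(targetDriver).put(targetKey, content);
    }
}
```

In a Spring application the driver name passed to forName would typically come from the valkyrai.file-storage.driver property, while ExecModule jobs would pass the source and target names from their payload.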
UI Explorer (Current)
web/typescript/valkyr_labs_com/src/components/FileManager delivers a React-based tree with drag-and-drop uploads backed by the new file endpoints.
Folder selection persists through the directory_path column; uploads call POST /files/uploads/init followed by /files/uploads/{sessionId}/direct.
The Files experience now surfaces as a dedicated LCARS dashboard tab and reuses the enhanced FileUploader dropzone for manual uploads.
API Surface (OpenAPI 3.1)
POST /files/uploads/init – create upload session, return provider-specific multipart fields and upload_id.
POST /files/uploads/complete – verify parts, compute SHA-256, finalize record, trigger virus scan workflow.
GET /files – paginated list with filters: prefix, search, mime_type, tag, owner, space, status.
GET /files/{id}/meta – metadata with ACL summary and audit details.
PATCH /files/{id} – rename, move (prefix/space), tag updates (add/remove), update retention.
DELETE /files/{id} – soft delete; enqueues purge job respecting retention.
POST /files/{id}/actions/presign – create API-gated download/view token; returns a single-use or TTL-bound JWT.
GET /files/{id}/view?token= – stream file via backend; enforces virus scan status, ACL, and token scope.
POST /files/{id}/acl/grant & /revoke – manage resource ACL entries.
POST /spaces, GET /spaces, POST /spaces/{id}/share, POST /spaces/{id}/add-file – manage Data Rooms.
ExecModule integration: POST /files/{id}/exec/{module} (new job), GET /jobs/{id}, POST /jobs/{id}/cancel.
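To make the single-use and TTL-bound token behavior concrete, the following sketch shows how the view endpoint might check and consume a token before streaming. The class is a hypothetical in-memory view of a file_tokens row. JWT signing, persistence, and the ACL and virus-scan checks are out of scope, and the scope values "view" and "download" are assumed.

```java
import java.time.Instant;

// Hypothetical in-memory view of a file_tokens row.
final class DownloadToken {
    private final String fileId;
    private final String scope;      // assumed values: "view" or "download"
    private final Instant expiresAt;
    private int remainingUses;

    DownloadToken(String fileId, String scope, Instant expiresAt, int maxUses) {
        if (maxUses < 1) {
            throw new IllegalArgumentException("maxUses must be at least 1");
        }
        this.fileId = fileId;
        this.scope = scope;
        this.expiresAt = expiresAt;
        this.remainingUses = maxUses;
    }

    // Consumes one use only if the token matches the file and scope,
    // has not expired, and still has uses left. Returns false otherwise.
    synchronized boolean tryConsume(String requestedFileId, String requestedScope, Instant now) {
        if (!fileId.equals(requestedFileId) || !scope.equals(requestedScope)) {
            return false;
        }
        if (!now.isBefore(expiresAt)) {
            return false;
        }
        if (remainingUses <= 0) {
            return false;
        }
        remainingUses--;
        return true;
    }

    synchronized int remainingUses() {
        return remainingUses;
    }
}
```

A single-use token is simply one created with maxUses set to 1. In a real deployment the decrement would need to be an atomic database update rather than an in-process counter.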
Security & Compliance
JWT auth reused from ValkyrAI; roles owner/admin/editor/viewer enforce baseline RBAC.
Resource ACLs add subject-scoped permissions; by default, the owner gets full rights.
Virus/DLP scan hook: file states UPLOADING → SCANNING → AVAILABLE; downloads blocked until AVAILABLE.
Soft delete retains data until retention_expires_at; purge worker deletes storage object and metadata.
Tokenized links always resolve through the /files/{id}/view route; raw provider URLs are never exposed.
Events emitted on upload lifecycle, sharing actions, and ExecModule job transitions to the existing event bus.
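The scan-gated lifecycle above can be sketched as a small state machine. The names are hypothetical, and only the three states named in this draft are modeled; a failure or quarantine state is not described here and is therefore left out.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// The three file states named in this document.
enum FileStatus { UPLOADING, SCANNING, AVAILABLE }

// Hypothetical helper enforcing UPLOADING -> SCANNING -> AVAILABLE.
final class FileLifecycle {
    private static final Map<FileStatus, Set<FileStatus>> ALLOWED =
        new EnumMap<>(FileStatus.class);

    static {
        ALLOWED.put(FileStatus.UPLOADING, EnumSet.of(FileStatus.SCANNING));
        ALLOWED.put(FileStatus.SCANNING, EnumSet.of(FileStatus.AVAILABLE));
        ALLOWED.put(FileStatus.AVAILABLE, EnumSet.noneOf(FileStatus.class));
    }

    // Returns the new state, or throws if the transition skips a step.
    static FileStatus transition(FileStatus from, FileStatus to) {
        if (!ALLOWED.get(from).contains(to)) {
            throw new IllegalStateException("Illegal transition: " + from + " -> " + to);
        }
        return to;
    }

    // Downloads are blocked until the scan has marked the file AVAILABLE.
    static boolean canDownload(FileStatus status) {
        return status == FileStatus.AVAILABLE;
    }
}
```

Keeping the allowed transitions in one table means a file cannot jump from UPLOADING straight to AVAILABLE and bypass the scan.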
TBWEEP Test Strategy
Unit tests first for storage driver factory, ACL evaluator, and upload session validator.
Service-level tests (Spring @DataJpaTest + @SpringBootTest) for upload finalize flow, ACL grant/revoke, token issuance.
Integration tests using LocalFS + Testcontainers MinIO (for S3 driver) verifying multipart init/complete + download gating.
End-to-end tests (Playwright or Cypress) for UI package once backend endpoints land; mocked API for CI.
Contract tests ensure OpenAPI spec stays aligned (use Schemathesis or OpenAPI snapshot validation).
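As a small illustration of the test-first approach for the upload session validator, the sketch below pairs a validator with the checks that would be written before it. The validation rules (not completed, not expired, same initiator, positive part size) are assumptions, as are all names. The checks use plain Java to stay self-contained; a real suite would express them as JUnit tests.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Subset of file_upload_sessions columns needed for validation.
record UploadSession(String uploadId, String initiatedBy, long partSize,
                     Instant expiresAt, Instant completedAt) {}

// Hypothetical validator; the rules below are assumed, not taken from a spec.
final class UploadSessionValidator {
    static List<String> validateForCompletion(UploadSession session, String callerId, Instant now) {
        List<String> errors = new ArrayList<>();
        if (session.completedAt() != null) errors.add("already completed");
        if (!now.isBefore(session.expiresAt())) errors.add("expired");
        if (!session.initiatedBy().equals(callerId)) errors.add("caller did not initiate session");
        if (session.partSize() <= 0) errors.add("invalid part size");
        return errors;
    }
}

// Checks written first under TBWEEP; each one fails until the validator handles the case.
final class UploadSessionValidatorChecks {
    static void run() {
        Instant now = Instant.parse("2024-01-01T00:00:00Z");
        long partSize = 5L * 1024 * 1024;

        UploadSession open = new UploadSession("up-1", "u1", partSize, now.plusSeconds(3600), null);
        check(UploadSessionValidator.validateForCompletion(open, "u1", now).isEmpty(),
              "open session should validate");
        check(UploadSessionValidator.validateForCompletion(open, "u2", now)
                  .contains("caller did not initiate session"),
              "other callers should be rejected");

        UploadSession expired = new UploadSession("up-2", "u1", partSize, now.minusSeconds(1), null);
        check(UploadSessionValidator.validateForCompletion(expired, "u1", now).contains("expired"),
              "expired session should be rejected");

        UploadSession done = new UploadSession("up-3", "u1", partSize,
                                               now.plusSeconds(3600), now.minusSeconds(10));
        check(UploadSessionValidator.validateForCompletion(done, "u1", now)
                  .contains("already completed"),
              "completed session should be rejected");
    }

    private static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }
}
```

Returning a list of errors rather than throwing on the first problem lets the API report every reason a completion request was rejected.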
Next Steps
Wire remaining Liquibase change sets for ACL/tokens and expand metadata indexes.
Add service/controller integration tests (MockMvc) and repository slices once DB fixtures land.
Flesh out advanced driver features (multipart persistence, resumable state) and ExecModule job orchestration.
Build UI explorer package with drag-and-drop upload, permissions-aware actions, and Storybook examples.