🚧 These docs are a work in progress and may contain inaccuracies. Content is being actively reviewed and validated.

Architecture

Dubby is a TypeScript monorepo managed with Yarn workspaces and Nx. This page documents the system architecture from high-level structure down to the key subsystems.

| App | Description |
| --- | --- |
| Server | Hono HTTP server with tRPC and REST APIs |
| Web | Expo web client (browser) |
| TV | React Native TV app (tvOS + Android TV) |
| Website | Marketing site and documentation |

Packages are organized in dependency layers. Lower layers have no internal dependencies; higher layers compose them.

| Package | Purpose |
| --- | --- |
| @dubby/core | Types, Result monad, domain errors, constants, nanoid ID generation |
| @dubby/config | Zod schemas for env vars and input validation |
| @dubby/db | Drizzle ORM schema (75+ tables), migrations, SQLite client |
| @dubby/logger | Pino logger abstraction with secret redaction |

| Package | Dependencies | Purpose |
| --- | --- | --- |
| @dubby/auth | core, db | better-auth instance, Argon2id password hashing, stream tokens |
| @dubby/catalog | core, db, scanner | Movie/show queries, credits, search |
| @dubby/library | core, db | Library CRUD, permissions, path validation |
| @dubby/parser | — | Filename parsing (movie/TV/anime/date-based), zero dependencies |
| @dubby/scanner | core, parser | Filesystem walking, folder-context enrichment |
| @dubby/probe | core, db, ffmpeg-utils | FFprobe media file analysis |
| @dubby/metadata | core, http-client | TMDB/TVDb API clients with rate limiting |
| @dubby/enrichment | core, db, metadata | MetadataManager — match, enrich, artwork download |
| @dubby/subtitles | core, db, http-client | OpenSubtitles integration |
| @dubby/watch-progress | core, db | Play position tracking, continue watching |
| @dubby/media-analysis | core, ffmpeg-utils | Codec detection, device capability matching |
| @dubby/optimize | core, config, ffmpeg-utils, media-analysis | Ahead-of-time transcoding |

| Package | Dependencies | Purpose |
| --- | --- | --- |
| @dubby/jobs | core, db | Durable workflow engine on BullMQ + SQLite |
| @dubby/streaming | core, probe, media-analysis, telemetry | Playback planning, FFmpeg orchestration, HLS |
| @dubby/ingest | core, db, enrichment, jobs, parser, probe, scanner, subtitles | Scan/ingest workflow orchestration |
| @dubby/config-service | config, db, jobs | Runtime config with hot-reload |

| Package | Purpose |
| --- | --- |
| @dubby/api-client | TanStack Query hooks + tRPC client for web/TV |
| @dubby/ui | Shared React Native UI components (NativeWind) |
| @dubby/media-utils | Image URL builders, trickplay URL helpers |
| @dubby/i18n | i18next internationalization (10 namespaces) |

  • @dubby/db has zero internal dependencies — it’s a pure schema and query layer
  • @dubby/core has zero internal dependencies — shared primitives only
  • @dubby/streaming does not depend on @dubby/db directly — it uses port interfaces to avoid that coupling
  • @dubby/ingest is the largest leaf package — it orchestrates enrichment, optimization, subtitle downloads, and file watching

On boot, the server runs through a strict initialization order:

  1. Load and validate environment variables (fail-fast on missing/invalid values)
  2. Check database migrations are up to date (exits if pending)
  3. Initialize database connection, recover interrupted workflow runs
  4. Initialize probe cache, metadata manager, config service
  5. Detect server capabilities (CPU cores, GPU availability, FFmpeg codecs)
  6. Initialize streaming subsystem (session manager, process manager, seek handler, reaper)
  7. Create DI container (Valkey connection, job queues, event bus, file watchers, workers)
  8. Wire SSE event bus, start cron jobs
  9. Mount all route handlers, build Hono app, start HTTP server
  10. Register graceful shutdown handlers

Every HTTP request passes through these layers in order:

| Layer | Purpose |
| --- | --- |
| Request logging | Method, path, status, duration |
| Security headers | X-Content-Type-Options, X-Frame-Options, Referrer-Policy, Permissions-Policy |
| CORS | Permissive in development, strict allowlist in production |
| Health checks | /health, /health/live, /health/ready — served before auth |
| Auth rate limiting | Fixed-window limiter on sign-in/sign-up endpoints |
| better-auth | Session/token resolution for /api/auth/* |
| Streaming routes | HLS playlists, segments, direct file serving |
| Image and trickplay | Artwork proxy, seek thumbnail sprites |
| SSE | Long-lived event streams |
| REST API (/v1/*) | OpenAPIHono sub-app with per-route auth |
| tRPC (/api/*) | @hono/trpc-server adapter with superjson |
| Static files | SPA serving in standalone mode |
| 404 / error handler | Catches unmatched routes and unhandled errors |

On SIGINT/SIGTERM, the server shuts down in reverse order: stop session reaper, drain active transcodes, shut down streaming service, stop telemetry, drain job workers, disconnect Valkey, close HTTP server. A configurable grace period allows in-flight transcodes to finish before process exit.

tRPC routers are thin shells. All business logic lives in domain packages:

Domain functions accept db: Database as their first parameter — never a singleton. This enables test isolation with in-memory databases.
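A minimal illustration of this convention — the `Db` shape, `Library` type, and `getLibrary` function below are hypothetical stand-ins, not Dubby's actual API:

```typescript
// Hypothetical domain function: the db handle is always the first parameter,
// never an imported singleton. A Map stands in for the Drizzle client here.
type Library = { id: string; name: string };
type Db = { libraries: Map<string, Library> };

function getLibrary(db: Db, id: string): Library | undefined {
  return db.libraries.get(id);
}

// Test isolation: each test constructs its own independent "database".
const db: Db = { libraries: new Map([["lib_1", { id: "lib_1", name: "Movies" }]]) };
console.log(getLibrary(db, "lib_1")?.name); // → Movies
```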

REST routes at /v1/* mirror the same domain package calls as tRPC. The OpenAPIHono framework provides Zod-validated request bodies and auto-generated OpenAPI 3.1 spec at /v1/openapi.json.

All tRPC procedures build on a composable middleware chain.

Each layer adds auth checks and narrows the context types, so downstream handlers can safely access ctx.userId and ctx.userRole without null checks.
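The narrowing can be sketched outside tRPC; `requireAuth` and the context shapes below are illustrative assumptions, not Dubby's actual middleware:

```typescript
// Base context: auth fields may be null for anonymous requests.
type BaseCtx = { userId: string | null; userRole: string | null };
// Narrowed context: guaranteed non-null after the auth layer runs.
type AuthedCtx = { userId: string; userRole: string };

// A "protected" layer rejects anonymous requests and narrows the type,
// so downstream handlers never need null checks.
function requireAuth(ctx: BaseCtx): AuthedCtx {
  if (ctx.userId === null || ctx.userRole === null) throw new Error("UNAUTHORIZED");
  return { userId: ctx.userId, userRole: ctx.userRole };
}
```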

The streaming subsystem is the most complex part of Dubby. It’s built from several cooperating components.

A pure function that takes a media probe result, client capabilities, and server capabilities, then produces a complete playback plan. The plan specifies exactly which of the 7 playback tiers to use, which streams to select, and what FFmpeg command to build.

7-tier playback hierarchy (cheapest to most expensive):

| Tier | Mode | Description |
| --- | --- | --- |
| 1 | Direct play | Source file served as-is via HTTP Range |
| 2 | Direct play + audio transcode | Video copied, audio re-encoded |
| 3 | Remux | Container repackage (e.g., MKV to MP4), streams copied |
| 4 | Remux + audio transcode | Repackage container and re-encode audio |
| 5 | Remux to HLS | Streams copied into HLS segments |
| 6 | Remux to HLS + audio transcode | HLS segments with re-encoded audio |
| 7 | Full transcode to HLS | Video and audio re-encoded into HLS |

The planner considers: codec compatibility, HDR format (tone-mapping vs passthrough), keyframe intervals (>15s forces full transcode), and container format.
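The tier choice can be sketched as a pure function. The `Compat` flags below are a simplification for illustration — the real planner weighs many more signals (HDR handling, keyframe intervals, container specifics):

```typescript
// Simplified compatibility summary (assumed shape, not the real planner input).
type Compat = { videoOk: boolean; audioOk: boolean; containerOk: boolean; hlsRequired: boolean };

// Pick the cheapest of the 7 tiers that satisfies the constraints.
function pickTier(c: Compat): number {
  if (!c.videoOk) return 7;                    // full transcode to HLS
  if (c.hlsRequired) return c.audioOk ? 5 : 6; // remux to HLS (± audio transcode)
  if (c.containerOk) return c.audioOk ? 1 : 2; // direct play (± audio transcode)
  return c.audioOk ? 3 : 4;                    // remux (± audio transcode)
}
```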

Tracks all active playback sessions with an explicit state machine.

Maintains two indexes: by session ID (O(1) lookup) and by user ID (per-user queries). Emits typed events on state transitions.
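A minimal sketch of that dual-index bookkeeping, with hypothetical names and a simplified state field:

```typescript
type Session = { id: string; userId: string; state: "starting" | "playing" | "idle" };

// Two indexes kept in sync: session-ID lookup is O(1), and per-user
// queries avoid scanning every session.
class SessionIndex {
  private byId = new Map<string, Session>();
  private byUser = new Map<string, Set<string>>();

  add(s: Session): void {
    this.byId.set(s.id, s);
    if (!this.byUser.has(s.userId)) this.byUser.set(s.userId, new Set());
    this.byUser.get(s.userId)!.add(s.id);
  }

  get(id: string): Session | undefined {
    return this.byId.get(id); // O(1)
  }

  forUser(userId: string): Session[] {
    return [...(this.byUser.get(userId) ?? [])].map((id) => this.byId.get(id)!);
  }
}
```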

Centralized FFmpeg process controller. Enforces per-user and global transcode limits — when a limit is exceeded, the oldest process is killed to make room. Handles graceful shutdown with SIGTERM first, then SIGKILL after a timeout.

Debounced seek processor (150ms default). When a seek arrives, it evaluates three paths:

  • Cache hit — Target segment is already transcoded, return immediately
  • Wait — Target is close to the transcode frontier, poll until ready
  • Restart — Target is beyond the transcoded range, restart FFmpeg at the new position (new “epoch”)

Background service running two intervals: a reap cycle that terminates idle sessions and cleans orphaned transcode directories, and a stats cycle that logs session/transcode counts.

HLS segment URLs carry a short-lived HMAC token (userId:sessionId:expiresAt:hmac) signed with BETTER_AUTH_SECRET. This is separate from the session Bearer token — if a segment URL leaks, it has limited scope and expiry. TV/mobile clients pass it as a ?token= query parameter since native video players can’t set HTTP headers on segment requests.
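A sketch of that token scheme using Node's `crypto` module — the `userId:sessionId:expiresAt:hmac` format is from this page, but the helper names and the SHA-256 choice are assumptions:

```typescript
import { createHmac } from "node:crypto";

// Sign a short-lived segment token. In Dubby the secret is BETTER_AUTH_SECRET.
function signSegmentToken(userId: string, sessionId: string, expiresAt: number, secret: string): string {
  const payload = `${userId}:${sessionId}:${expiresAt}`;
  const hmac = createHmac("sha256", secret).update(payload).digest("hex");
  return `${payload}:${hmac}`;
}

// Verify the signature and check expiry.
function verifySegmentToken(token: string, secret: string, now = Date.now()): boolean {
  const idx = token.lastIndexOf(":");
  const payload = token.slice(0, idx);
  const [, , expiresAt] = payload.split(":");
  const expected = createHmac("sha256", secret).update(payload).digest("hex");
  return token.slice(idx + 1) === expected && Number(expiresAt) > now;
}
```

Because the token is self-contained, segment requests can be validated without a session lookup, and a leaked URL dies with its `expiresAt`.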

Background jobs use a durable execution engine built on BullMQ (Valkey/Redis) with SQLite as the source-of-truth persistence layer.

The core primitive is ctx.step(name, fn):

  1. Check if this step has already completed (from the workflow_steps table)
  2. If yes, return the stored result without re-executing the function
  3. If no, execute the function, persist the result, and return it

This means: after any crash, when a workflow is re-run, all previously-completed steps replay from the database instantly. Only the first incomplete step re-executes. This makes workflows crash-safe and resumable.
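The memoization at the heart of `ctx.step` can be sketched in a few lines; here a `Map` stands in for the `workflow_steps` table, and the function is synchronous for brevity:

```typescript
// A Map stands in for the SQLite-backed workflow_steps table.
type StepStore = Map<string, unknown>;

function step<T>(store: StepStore, name: string, fn: () => T): T {
  if (store.has(name)) return store.get(name) as T; // replay: stored result, no re-execution
  const result = fn();      // first run: execute the step
  store.set(name, result);  // persist before returning
  return result;
}
```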

  • ctx.fanOut(name, items, fn) — Runs a function for each item with bounded concurrency (semaphore). The entire result array is persisted as a single step.
  • ctx.spawnChildren(name, items) — Enqueues independent child workflows (potentially processed by different workers), then polls until all complete.

Workflows can be cancelled via JobQueue.cancel(runId). This publishes a cancel signal via Redis pub/sub. Before each step, the workflow checks its cancellation token and throws WorkflowCancelledError if cancelled.

On every server start, interrupted (running) workflow runs are detected:

  • Runs with retries remaining are reset to pending and re-enqueued
  • Runs with no retries left are marked failed
  • Incomplete step rows from interrupted runs are deleted

| Queue | Workflows | Purpose |
| --- | --- | --- |
| scan | Library scan | Discover files, spawn per-item ingest children |
| ingest-prepare | Ingest Prepare | Quick probe, persist record, chapter intro detection |
| ingest-enrich | Ingest Enrich | TMDB quick-match, ratings from MDBList |
| ingest-enhance | Ingest Enhance | Subtitle download, finalize as ready |
| analyze | Analyze Keyframes | Deferred keyframe analysis for playback optimization |
| metadata | Refresh, retry | Re-fetch metadata from providers |
| subtitle | Download | Search and download subtitles from OpenSubtitles |
| detect-intros | Detect Intro (Season) | Chromaprint audio fingerprint intro detection |
| optimize | Item optimization | Ahead-of-time transcoding for target devices |
| optimize-orchestrator | Library optimization | Fan-out optimization across a library |
| system | Orphan recovery | Re-enqueue stuck pipeline items |
| backup | Create, restore | Database backup and restore |
| migration | Migration import | Plex/Jellyfin data import |

Dubby uses better-auth with SQLite-backed sessions:

  • Password hashing: Argon2id (64 MB memory, 3 iterations, 4 parallelism)
  • Session lifetime: 7 days, auto-extended every 24 hours on activity
  • Auth methods: Email/password with Bearer token plugin for API clients

Sessions are resolved from cookies (web) or Authorization: Bearer headers (TV/mobile/API). Deactivated user accounts are rejected even with valid sessions.

Four roles with capability-based access control:

| Capability | Owner | Admin | Member | Guest |
| --- | --- | --- | --- | --- |
| Manage server (core settings) | Yes | No | No | No |
| Manage settings (UI, transcoding) | Yes | Yes | No | No |
| Invite / manage users | Yes | Yes | No | No |
| Create / delete libraries | Yes | Yes | No | No |
| Scan libraries | Yes | Yes | No | No |
| Manage metadata | Yes | Yes | No | No |
| View audit logs | Yes | Yes | No | No |
| Access all libraries | Yes | Yes | No | No |
| Browse / play media | Yes | Yes | Yes | Yes |

Security-relevant events (login attempts, role changes, feature flag modifications, library operations) are recorded to an audit log table. Retention is configurable, and expired entries are cleaned up automatically by a periodic cron job.

The server publishes workflow events via Redis pub/sub. Connected clients receive them as Server-Sent Events:

GET /sse/workflows — Authenticated via cookie (web) or ?token= query parameter (TV/mobile).

The connection lifecycle:

  1. Authenticate the request
  2. Subscribe to the Redis event channel
  3. Send connected event
  4. Stream workflow events as they arrive
  5. Send ping keepalives every 30 seconds
  6. Clean up subscription on disconnect

Event types include: workflow.started, workflow.completed, workflow.failed, workflow.progress, step.started, step.completed, and fanout.progress. The admin dashboard uses these to show real-time scan progress, active sessions, and job status.
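For illustration, here is a minimal parser for one SSE frame (`event:`/`data:` lines, the wire format the endpoint emits) — a hypothetical helper, not Dubby's client code:

```typescript
// Parse a single SSE frame into its event name and data payload.
// Per the SSE spec, the event name defaults to "message" and
// multiple data: lines are joined with newlines.
function parseSseFrame(frame: string): { event: string; data: string } {
  let event = "message";
  const data: string[] = [];
  for (const line of frame.split("\n")) {
    if (line.startsWith("event:")) event = line.slice(6).trim();
    else if (line.startsWith("data:")) data.push(line.slice(5).trim());
  }
  return { event, data: data.join("\n") };
}
```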

The server supports three deployment modes via DUBBY_MODE:

| Mode | HTTP | Workers | Use case |
| --- | --- | --- | --- |
| standalone | Yes | Yes (in-process) | Home servers, NAS, single-node |
| api | Yes | No (enqueue only) | Horizontal API scaling |
| worker | No | Yes | Dedicated worker nodes |

In the api + worker split mode, an init container runs migrations before the API pods start. Rate limits are per-process, so with multiple API pods the effective limit is multiplied by the pod count.

When you add a library and trigger a scan, the ingest pipeline takes media files from raw filesystem entries to fully enriched, playable library items. The pipeline is a 3-stage cascade designed for maximum throughput — CPU-bound work (probing) runs on a separate queue from network-bound work (metadata fetching), and items are pipelined individually rather than batched.

The scanner recursively walks the library path, filtering for video file extensions (mkv, mp4, m4v, avi, mov, wmv, ts, m2ts, webm, flv, ogv). It applies ignore patterns to skip system files and special features (trailers, samples, featurettes) with smart pattern matching to avoid false positives on titles like “Extraordinary” or “The Long Long Trailer”.

For TV libraries, the discovery step pre-creates show and season records serially before spawning child jobs. This prevents race conditions where concurrent episode ingest jobs would try to create the same show record simultaneously.

Phase 2: Per-item ingest (3-stage cascade)


Each discovered file becomes an independent child workflow. Items appear in the app incrementally as each stage completes — you don’t wait for the whole scan to finish.

Stage 1: Prepare (ingest-prepare queue)

| Step | Purpose | Fatal? |
| --- | --- | --- |
| Quick probe | FFprobe the file — codecs, duration, resolution, chapters, embedded streams (skips keyframe analysis) | Yes |
| Persist | Insert or update the movie/episode record, scan subtitle tracks and audio tracks | Yes |
| Detect intro chapters | Detect intro boundaries from chapter markers (TV episodes only, instant) | No |
| Cascade | Set ingest status to persisted, enqueue Ingest Enrich | Yes |
| Enqueue keyframes | Enqueue deferred keyframe analysis on the analyze queue | No |

Stage 2: Enrich (ingest-enrich queue)

| Step | Purpose | Fatal? |
| --- | --- | --- |
| Enrich | TMDB quick-match by title/year; on match, fetch metadata, artwork, credits, genres | No |
| Ratings | Fill missing ratings from MDBList (only runs if enrich matched) | No |
| Cascade | Set ingest status to enriched, enqueue Ingest Enhance | Yes |

Both the enrich and ratings steps are protected by circuit breakers (5 failures → 30s cooldown) to avoid hammering external APIs during outages.

Stage 3: Enhance (ingest-enhance queue)

| Step | Purpose | Fatal? |
| --- | --- | --- |
| Subtitles | Download missing subtitles from OpenSubtitles (if configured); protected by circuit breaker | No |
| Finalize | Set ingest status to ready; for TV, update season/show episode counts and mark show ready when all episodes done | Yes |

Non-fatal steps are wrapped in try/catch — if subtitle downloads or intro detection fail, the item still completes successfully. The durable workflow engine ensures that if the server crashes mid-ingest, completed steps are not re-executed on restart.

After all items are ingested, the library scan runs additional steps:

| Step | Purpose |
| --- | --- |
| Auto-optimize | If enabled, enqueue ahead-of-time transcoding for the library |
| Auto-detect intros | For TV libraries, enqueue Chromaprint audio fingerprint analysis per season (enabled by default) |
| Metadata refresh | If requested (full scan), re-fetch metadata from providers |
| Backfill keyframes | Enqueue keyframe analysis for any items that haven’t been analyzed yet |

Keyframe analysis determines whether a file can be remuxed to HLS (fast, low CPU) or must be fully transcoded (slow, high CPU). It’s accurate but expensive — 30–180 seconds per file compared to 1–5 seconds for a basic probe. To keep items appearing in the library quickly, keyframe analysis is deferred to a background analyze queue. The playback planner handles missing keyframe data gracefully, assuming remux will work — which is correct for roughly 90% of content.

Two complementary approaches detect intro sequences:

  1. Chapter-based (inline, per-episode) — During the prepare stage, if the file has chapter markers labeled as “intro” or “recap”, intro boundaries are extracted instantly from the existing probe data.

  2. Audio fingerprint (deferred, per-season) — After a TV library scan completes, Chromaprint generates audio fingerprints for the first ~5 minutes of each episode in a season, then identifies the longest common audio sequence. This is CPU-intensive but highly accurate for shows without chapter markers. Enabled by default.

A file watcher monitors enabled library paths for changes using chokidar. When a new file appears, a debouncer polls stat() to check that the file size and modification time are stable across 3 consecutive checks (default 5-second interval). This prevents partial file ingestion during slow network copies or active downloads. Once the file stabilizes, a scan workflow is enqueued automatically.
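The stability check reduces to a small pure core. `makeStabilityTracker` is a hypothetical name, and the exact streak semantics (how many unchanged comparisons count as "stable") are an assumption:

```typescript
// A snapshot of the fields the watcher compares between polls.
type Stat = { size: number; mtimeMs: number };

// Feed successive stat() snapshots; returns true once `required`
// consecutive unchanged snapshots have been observed.
function makeStabilityTracker(required = 3) {
  let last: Stat | null = null;
  let streak = 0;
  return (s: Stat): boolean => {
    streak = last !== null && s.size === last.size && s.mtimeMs === last.mtimeMs ? streak + 1 : 0;
    last = s;
    return streak >= required;
  };
}
```

The real watcher drives this from a timer (5-second default interval) and only enqueues the scan workflow once the tracker reports stability.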

Dubby provides four privacy presets that control how the server interacts with external services and what data appears in logs.

| Preset | External services | TMDB images | Log masking |
| --- | --- | --- | --- |
| Maximum | Blocked | Blocked | All fields |
| Private | Allowed | Proxied | All fields |
| Balanced | Allowed | Proxied | Paths, user info, IPs |
| Open | Allowed | Direct CDN | None |

When image proxying is enabled (the Proxied setting above), TMDB poster and backdrop URLs are rewritten to route through the Dubby server. Clients never make direct requests to the TMDB CDN, so their IP addresses are never exposed to external services. The server fetches the image and serves it locally.

Four independent masking flags control what appears in logs and audit entries:

  • Mask file paths — Filesystem paths redacted from API responses and logs
  • Mask media titles — Titles redacted in external request logs
  • Mask user info — User details redacted in logs
  • Mask IP addresses — Last octet replaced with xxx in audit logs

A master allowExternalConnections flag acts as a kill switch for all outbound requests. When disabled, TMDB metadata fetching, OpenSubtitles downloads, and analytics are all blocked — the server operates in fully offline mode. Individual services can also be toggled independently.

All privacy setting changes are recorded in the audit log.

Domain packages return Result<T, E> instead of throwing exceptions. A Result is either { ok: true, value: T } or { ok: false, error: E }, forcing callers to handle both paths explicitly. Errors are semantic domain types (NotFoundError, ForbiddenError, ValidationError) that carry structured context rather than raw strings.

A shared classifyDomainError() function maps any domain error to a canonical error code, which is then translated to the appropriate HTTP status or tRPC error code by framework-specific middleware. This keeps domain logic completely decoupled from the transport layer — both REST and tRPC routes use the same classification, so error behavior is always consistent.

The streaming package — one of the most complex parts of the system — has zero dependency on the database. Instead, it defines port interfaces (MediaInfoPort, TrackLookupPort, VariantLookupPort) as plain TypeScript types. Concrete adapters that talk to the database are wired at the server’s composition root.

This means the entire streaming engine can be tested with mock ports, and the database schema can evolve without touching streaming code. The same pattern applies to other infrastructure boundaries.
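A sketch of the ports-and-adapters split — the port's method signature is an assumption based on the names above, not the real interface:

```typescript
// Port: a plain TypeScript interface the streaming package defines and owns.
interface MediaInfoPort {
  getDurationSeconds(mediaId: string): number | undefined;
}

// Streaming logic depends only on the port, never on the database...
function canSeekTo(port: MediaInfoPort, mediaId: string, position: number): boolean {
  const duration = port.getDurationSeconds(mediaId);
  return duration !== undefined && position >= 0 && position <= duration;
}

// ...while tests (or the server's composition root) supply an adapter.
const fakePort: MediaInfoPort = { getDurationSeconds: () => 7200 };
```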

Background jobs (library scanning, metadata enrichment, subtitle downloads) run on a custom durable execution engine inspired by Temporal. The core primitive is ctx.step(name, fn):

  1. Check if this step has already completed (from the database)
  2. If yes, return the stored result without re-executing
  3. If no, execute the function, persist the result, return it

After any crash, the workflow resumes from where it left off — completed steps replay instantly from the database, and only the first incomplete step re-executes. This makes every workflow crash-safe and resumable without any special error handling.

The engine also supports:

  • Fan-out — Process items in parallel with semaphore-based concurrency control. Progress is published to the event bus in real time.
  • Child workflows — Bulk-enqueue independent workflows and poll for completion. A library scan spawns thousands of per-item ingest children this way.
  • Cancellation — Cancel signals propagate via Redis pub/sub. Before each step, the context checks its cancellation token and throws WorkflowCancelledError.
  • Startup recovery — On every boot, interrupted runs are detected. Those with retries remaining are re-enqueued; those without are marked failed.

Seeking during HLS transcoding is surprisingly complex. The seek handler debounces rapid scrubbing (150ms) to prevent FFmpeg process explosion, then evaluates three paths:

  • Cache hit — The target segment is already transcoded. Return immediately.
  • Wait — The target is close to the transcode frontier. Poll until FFmpeg catches up.
  • Restart — The target is beyond the transcoded range. Kill FFmpeg and restart at the new position with a new “epoch.”

The epoch counter ensures HLS segment filenames never collide across seeks — each restart writes to a new namespace. The result is a discriminated union (cache_hit | debounced | waited | transcode_restarted | direct_play | error) that gives clients detailed information about what happened and how long it took.
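The discriminated union can be sketched as a TypeScript type; the variant names are from this page, but the fields beyond the discriminant are assumptions:

```typescript
// One variant per seek outcome; the `type` field is the discriminant.
type SeekResult =
  | { type: "cache_hit"; segment: number }
  | { type: "debounced" }
  | { type: "waited"; waitedMs: number }
  | { type: "transcode_restarted"; epoch: number }
  | { type: "direct_play" }
  | { type: "error"; message: string };

// Exhaustive switch: the compiler flags any variant left unhandled.
function describeSeek(r: SeekResult): string {
  switch (r.type) {
    case "cache_hit": return `served segment ${r.segment} from cache`;
    case "debounced": return "coalesced into a later seek";
    case "waited": return `waited ${r.waitedMs}ms for the frontier`;
    case "transcode_restarted": return `restarted FFmpeg (epoch ${r.epoch})`;
    case "direct_play": return "no transcode needed";
    case "error": return `seek failed: ${r.message}`;
  }
}
```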

Four roles (owner, admin, member, guest) map to an immutable capability table. Rather than checking role names, the system checks capabilities (canScanLibraries, canManageUsers, canAccessAllLibraries). tRPC procedures are built in a composable middleware hierarchy:

```
publicProcedure
→ protectedProcedure (requires session)
→ adminProcedure (requires owner or admin)
→ capabilityProcedure(cap) (requires specific capability)
```

Each layer narrows the TypeScript context types, so downstream handlers have compile-time guarantees about what’s available. This approach is future-proof — new roles can be added by mixing capabilities without modifying middleware.

Runtime configuration follows a strict priority order:

  1. Schema defaults — Hardcoded baseline values
  2. Auto-detected values — Hardware capabilities (GPU, CPU cores, available codecs)
  3. Database-persisted values — Admin settings from the UI (sensitive values encrypted with AES-256-GCM)
  4. Environment variables — Highest priority, for container orchestration overrides

The config service supports hot-reload: when an admin changes a setting in the UI, it takes effect immediately without restarting the server. Listeners can subscribe to specific config keys for real-time updates.
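Because later sources win, the resolution is essentially a left-to-right object spread. `resolveConfig` below is an illustrative sketch, not the config service's real API:

```typescript
type ConfigSource = Record<string, unknown>;

// Merge in priority order: each later source overrides the earlier ones.
function resolveConfig(
  defaults: ConfigSource,  // 1. schema defaults
  detected: ConfigSource,  // 2. auto-detected hardware values
  persisted: ConfigSource, // 3. admin settings from the database
  env: ConfigSource,       // 4. environment variables (highest priority)
): ConfigSource {
  return { ...defaults, ...detected, ...persisted, ...env };
}
```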

The per-request context object intentionally excludes secrets (BETTER_AUTH_SECRET, REDIS_URL, DATABASE_AUTH_TOKEN, DUBBY_ENCRYPTION_KEY). Error serialization in both tRPC and REST often includes context for debugging — this exclusion prevents accidental secret leakage in error payloads. Procedures that actually need secrets read from process.env directly rather than the context.

Playback sessions send a heartbeat every 10 seconds with the current position and play state. If the server restarts and the session no longer exists, the client automatically:

  1. Captures the current playing position
  2. Recreates the session at that position
  3. Resumes playback seamlessly

Users experience uninterrupted playback across server restarts — the player picks up exactly where it left off without manual intervention.

Client state management follows a strict pattern: Zustand for local state (auth, preferences, player), TanStack Query for server state (via tRPC). Mutations use optimistic updates — the UI updates immediately, then the server confirms. On success, the cache is invalidated to pick up any server-side side effects. On failure, the optimistic update is rolled back and the previous state is restored.

tRPC is the internal API (used by web and TV clients); REST is the public API (for third-party integrations). Every tRPC endpoint has a corresponding REST endpoint with identical behavior, validation, and error handling. Both share the same Zod schemas and domain functions — the only difference is the transport layer. The REST API auto-generates an OpenAPI 3.1 spec at /v1/openapi.json.

  • IDs: nanoid strings (21 characters), never UUID or autoincrement
  • ORM: Drizzle with SQLite (libsql), WAL mode and foreign keys always enabled
  • Timestamps: ISO 8601 strings, not Date objects
  • Booleans: integer({ mode: 'boolean' }) (SQLite has no native boolean)
  • Test isolation: Domain functions accept db: Database as their first parameter, enabling each test to run against an independent in-memory database
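The timestamp and boolean conventions above can be illustrated with plain converters (hypothetical helpers; in the actual schema Drizzle's column modes handle these mappings):

```typescript
// SQLite has no native boolean: store 1/0 integers.
const toDbBool = (b: boolean): number => (b ? 1 : 0);
const fromDbBool = (n: number): boolean => n === 1;

// Timestamps are stored as ISO 8601 strings, never Date objects.
const toDbTimestamp = (d: Date): string => d.toISOString();
```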