🚧 These docs are a work in progress and may contain inaccuracies. Content is being actively reviewed and validated.

Architecture

Dubby is a TypeScript monorepo managed with Yarn workspaces and Nx. This page documents the system architecture from high-level structure down to the key subsystems.

| App | Description |
| --- | --- |
| Server | Hono HTTP server with tRPC and REST APIs |
| Web | Expo web client (browser) |
| TV | React Native TV app (tvOS + Android TV) |
| Website | Marketing site and documentation |

Packages are organized in dependency layers. Lower layers have no internal dependencies; higher layers compose them.

| Package | Purpose |
| --- | --- |
| @dubby/core | Types, Result monad, domain errors, constants, nanoid ID generation |
| @dubby/config | Zod schemas for env vars and input validation |
| @dubby/db | Drizzle ORM schema (75+ tables), migrations, SQLite client |
| @dubby/logger | Pino logger abstraction with secret redaction |

| Package | Dependencies | Purpose |
| --- | --- | --- |
| @dubby/auth | core, db | better-auth instance, Argon2id password hashing, stream tokens |
| @dubby/catalog | core, db, scanner | Movie/show queries, credits, search |
| @dubby/library | core, db | Library CRUD, permissions, path validation |
| @dubby/parser | — | Filename parsing (movie/TV/anime/date-based), zero dependencies |
| @dubby/scanner | core, parser | Filesystem walking, folder-context enrichment |
| @dubby/probe | core, db, ffmpeg-utils | FFprobe media file analysis |
| @dubby/metadata | core, http-client | TMDB/TVDb API clients with rate limiting |
| @dubby/enrichment | core, db, metadata | MetadataManager — match, enrich, artwork download |
| @dubby/subtitles | core, db, http-client | OpenSubtitles integration |
| @dubby/watch-progress | core, db | Play position tracking, continue watching |
| @dubby/media-analysis | core, ffmpeg-utils | Codec detection, device capability matching |
| @dubby/optimize | core, config, ffmpeg-utils, media-analysis | Ahead-of-time transcoding |

| Package | Dependencies | Purpose |
| --- | --- | --- |
| @dubby/jobs | core, db | Durable workflow engine on BullMQ + SQLite |
| @dubby/streaming | core, probe, media-analysis, telemetry | Playback planning, FFmpeg orchestration, HLS |
| @dubby/ingest | core, db, enrichment, jobs, parser, probe, scanner, subtitles | Scan/ingest workflow orchestration |
| @dubby/config-service | config, db, jobs | Runtime config with hot-reload |

| Package | Purpose |
| --- | --- |
| @dubby/api-client | TanStack Query hooks + tRPC client for web/TV |
| @dubby/ui | Shared React Native UI components (NativeWind) |
| @dubby/media-utils | Image URL builders, trickplay URL helpers |
| @dubby/i18n | i18next internationalization (10 namespaces) |

  • @dubby/db has zero internal dependencies — it’s a pure schema and query layer
  • @dubby/core has zero internal dependencies — shared primitives only
  • @dubby/streaming does not depend on @dubby/db directly — it uses port interfaces to avoid that coupling
  • @dubby/ingest is the largest leaf package — it orchestrates enrichment, optimization, subtitle downloads, and file watching

On boot, the server runs through a strict initialization order:

  1. Load and validate environment variables (fail-fast on missing/invalid values)
  2. Check database migrations are up to date (exits if pending)
  3. Initialize database connection, recover interrupted workflow runs
  4. Initialize probe cache, metadata manager, config service
  5. Detect server capabilities (CPU cores, GPU availability, FFmpeg codecs)
  6. Initialize streaming subsystem (session manager, process manager, seek handler, reaper)
  7. Create DI container (Valkey connection, job queues, event bus, file watchers, workers)
  8. Wire SSE event bus, start cron jobs
  9. Mount all route handlers, build Hono app, start HTTP server
  10. Register graceful shutdown handlers

Every HTTP request passes through these layers in order:

| Layer | Purpose |
| --- | --- |
| Request logging | Method, path, status, duration |
| Security headers | X-Content-Type-Options, X-Frame-Options, Referrer-Policy, Permissions-Policy |
| CORS | Permissive in development, strict allowlist in production |
| Health checks | /health, /health/live, /health/ready — served before auth |
| Auth rate limiting | Fixed-window limiter on sign-in/sign-up endpoints |
| better-auth | Session/token resolution for /api/auth/* |
| Streaming routes | HLS playlists, segments, direct file serving |
| Image and trickplay | Artwork proxy, seek thumbnail sprites |
| SSE | Long-lived event streams |
| REST API (/v1/*) | OpenAPIHono sub-app with per-route auth |
| tRPC (/api/*) | @hono/trpc-server adapter with superjson |
| Static files | SPA serving in standalone mode |
| 404 / error handler | Catches unmatched routes and unhandled errors |

On SIGINT/SIGTERM, the server shuts down in reverse order: stop session reaper, drain active transcodes, shut down streaming service, stop telemetry, drain job workers, disconnect Valkey, close HTTP server. A configurable grace period allows in-flight transcodes to finish before process exit.

tRPC routers are thin shells. All business logic lives in domain packages:

Domain functions accept db: Database as their first parameter — never a singleton. This enables test isolation with in-memory databases.
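A minimal illustration of this convention — the `Db` shape, `Library` type, and `getLibrary` function below are hypothetical stand-ins, not Dubby's actual API:

```typescript
// Hypothetical domain function: the db handle is always the first parameter,
// never an imported singleton. A Map stands in for the Drizzle client here.
type Library = { id: string; name: string };
type Db = { libraries: Map<string, Library> };

function getLibrary(db: Db, id: string): Library | undefined {
  return db.libraries.get(id);
}

// Test isolation: each test constructs its own independent "database".
const db: Db = { libraries: new Map([["lib_1", { id: "lib_1", name: "Movies" }]]) };
console.log(getLibrary(db, "lib_1")?.name); // → Movies
```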

REST routes at /v1/* mirror the same domain package calls as tRPC. The OpenAPIHono framework provides Zod-validated request bodies and auto-generated OpenAPI 3.1 spec at /v1/openapi.json.

All tRPC procedures build on a composable middleware chain.

Each layer adds auth checks and narrows the context types, so downstream handlers can safely access ctx.userId and ctx.userRole without null checks.
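The narrowing can be sketched outside tRPC; `requireAuth` and the context shapes below are illustrative assumptions, not Dubby's actual middleware:

```typescript
// Base context: auth fields may be null for anonymous requests.
type BaseCtx = { userId: string | null; userRole: string | null };
// Narrowed context: guaranteed non-null after the auth layer runs.
type AuthedCtx = { userId: string; userRole: string };

// A "protected" layer rejects anonymous requests and narrows the type,
// so downstream handlers never need null checks.
function requireAuth(ctx: BaseCtx): AuthedCtx {
  if (ctx.userId === null || ctx.userRole === null) throw new Error("UNAUTHORIZED");
  return { userId: ctx.userId, userRole: ctx.userRole };
}
```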

The streaming subsystem is the most complex part of Dubby. It’s built from several cooperating components.

A pure function that takes a media probe result, client capabilities, and server capabilities, then produces a complete playback plan. The plan specifies exactly which of the 7 playback tiers to use, which streams to select, and what FFmpeg command to build.

7-tier playback hierarchy (cheapest to most expensive):

| Tier | Mode | Description |
| --- | --- | --- |
| 1 | Direct play | Source file served as-is via HTTP Range |
| 2 | Direct play + audio transcode | Video copied, audio re-encoded |
| 3 | Remux | Container repackage (e.g., MKV to MP4), streams copied |
| 4 | Remux + audio transcode | Repackage container and re-encode audio |
| 5 | Remux to HLS | Streams copied into HLS segments |
| 6 | Remux to HLS + audio transcode | HLS segments with re-encoded audio |
| 7 | Full transcode to HLS | Video and audio re-encoded into HLS |

The planner considers: codec compatibility, HDR format (tone-mapping vs passthrough), keyframe intervals (>15s forces full transcode), and container format.
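The tier choice can be sketched as a pure function. The `Compat` flags below are a simplification for illustration — the real planner weighs many more signals (HDR handling, keyframe intervals, container specifics):

```typescript
// Simplified compatibility summary (assumed shape, not the real planner input).
type Compat = { videoOk: boolean; audioOk: boolean; containerOk: boolean; hlsRequired: boolean };

// Pick the cheapest of the 7 tiers that satisfies the constraints.
function pickTier(c: Compat): number {
  if (!c.videoOk) return 7;                    // full transcode to HLS
  if (c.hlsRequired) return c.audioOk ? 5 : 6; // remux to HLS (± audio transcode)
  if (c.containerOk) return c.audioOk ? 1 : 2; // direct play (± audio transcode)
  return c.audioOk ? 3 : 4;                    // remux (± audio transcode)
}
```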

Tracks all active playback sessions with an explicit state machine.

Maintains two indexes: by session ID (O(1) lookup) and by user ID (per-user queries). Emits typed events on state transitions.
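A minimal sketch of that dual-index bookkeeping, with hypothetical names and a simplified state field:

```typescript
type Session = { id: string; userId: string; state: "starting" | "playing" | "idle" };

// Two indexes kept in sync: session-ID lookup is O(1), and per-user
// queries avoid scanning every session.
class SessionIndex {
  private byId = new Map<string, Session>();
  private byUser = new Map<string, Set<string>>();

  add(s: Session): void {
    this.byId.set(s.id, s);
    if (!this.byUser.has(s.userId)) this.byUser.set(s.userId, new Set());
    this.byUser.get(s.userId)!.add(s.id);
  }

  get(id: string): Session | undefined {
    return this.byId.get(id); // O(1)
  }

  forUser(userId: string): Session[] {
    return [...(this.byUser.get(userId) ?? [])].map((id) => this.byId.get(id)!);
  }
}
```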

Centralized FFmpeg process controller. Enforces per-user and global transcode limits — when a limit is exceeded, the oldest process is killed to make room. Handles graceful shutdown with SIGTERM first, then SIGKILL after a timeout.

Debounced seek processor (150ms default). When a seek arrives, it evaluates three paths:

  • Cache hit — Target segment is already transcoded, return immediately
  • Wait — Target is close to the transcode frontier, poll until ready
  • Restart — Target is beyond the transcoded range, restart FFmpeg at the new position (new “epoch”)

Background service running two intervals: a reap cycle that terminates idle sessions and cleans orphaned transcode directories, and a stats cycle that logs session/transcode counts.

HLS segment URLs carry a short-lived HMAC token (userId:sessionId:expiresAt:hmac) signed with BETTER_AUTH_SECRET. This is separate from the session Bearer token — if a segment URL leaks, it has limited scope and expiry. TV/mobile clients pass it as a ?token= query parameter since native video players can’t set HTTP headers on segment requests.
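A sketch of that token scheme using Node's `crypto` module — the `userId:sessionId:expiresAt:hmac` format is from this page, but the helper names and the SHA-256 choice are assumptions:

```typescript
import { createHmac } from "node:crypto";

// Sign a short-lived segment token. In Dubby the secret is BETTER_AUTH_SECRET.
function signSegmentToken(userId: string, sessionId: string, expiresAt: number, secret: string): string {
  const payload = `${userId}:${sessionId}:${expiresAt}`;
  const hmac = createHmac("sha256", secret).update(payload).digest("hex");
  return `${payload}:${hmac}`;
}

// Verify the signature and check expiry.
function verifySegmentToken(token: string, secret: string, now = Date.now()): boolean {
  const idx = token.lastIndexOf(":");
  const payload = token.slice(0, idx);
  const [, , expiresAt] = payload.split(":");
  const expected = createHmac("sha256", secret).update(payload).digest("hex");
  return token.slice(idx + 1) === expected && Number(expiresAt) > now;
}
```

Because the token is self-contained, segment requests can be validated without a session lookup, and a leaked URL dies with its `expiresAt`.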

Background jobs use a durable execution engine built on BullMQ (Valkey/Redis) with SQLite as the source-of-truth persistence layer.

The core primitive is ctx.step(name, fn):

  1. Check if this step has already completed (from the workflow_steps table)
  2. If yes, return the stored result without re-executing the function
  3. If no, execute the function, persist the result, and return it

This means: after any crash, when a workflow is re-run, all previously-completed steps replay from the database instantly. Only the first incomplete step re-executes. This makes workflows crash-safe and resumable.
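The memoization at the heart of `ctx.step` can be sketched in a few lines; here a `Map` stands in for the `workflow_steps` table, and the function is synchronous for brevity:

```typescript
// A Map stands in for the SQLite-backed workflow_steps table.
type StepStore = Map<string, unknown>;

function step<T>(store: StepStore, name: string, fn: () => T): T {
  if (store.has(name)) return store.get(name) as T; // replay: stored result, no re-execution
  const result = fn();      // first run: execute the step
  store.set(name, result);  // persist before returning
  return result;
}
```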

  • ctx.fanOut(name, items, fn) — Runs a function for each item with bounded concurrency (semaphore). The entire result array is persisted as a single step.
  • ctx.spawnChildren(name, items) — Enqueues independent child workflows (potentially processed by different workers), then polls until all complete.

Workflows can be cancelled via JobQueue.cancel(runId). This publishes a cancel signal via Redis pub/sub. Before each step, the workflow checks its cancellation token and throws WorkflowCancelledError if cancelled.

On every server start, interrupted (running) workflow runs are detected:

  • Runs with retries remaining are reset to pending and re-enqueued
  • Runs with no retries left are marked failed
  • Incomplete step rows from interrupted runs are deleted

| Queue | Workflows | Purpose |
| --- | --- | --- |
| scan | Library scan | Discover files, spawn per-item ingest children |
| ingest-prepare | Ingest Prepare | Quick probe, persist record, chapter intro detection |
| ingest-enrich | Ingest Enrich | TMDB quick-match, ratings from MDBList |
| ingest-enhance | Ingest Enhance | Subtitle download, finalize as ready |
| analyze | Analyze Keyframes | Deferred keyframe analysis for playback optimization |
| metadata | Refresh, retry | Re-fetch metadata from providers |
| subtitle | Download | Search and download subtitles from OpenSubtitles |
| detect-intros | Detect Intro (Season) | Chromaprint audio fingerprint intro detection |
| optimize | Item optimization | Ahead-of-time transcoding for target devices |
| optimize-orchestrator | Library optimization | Fan-out optimization across a library |
| system | Orphan recovery | Re-enqueue stuck pipeline items |
| backup | Create, restore | Database backup and restore |
| migration | Migration import | Plex/Jellyfin data import |

Dubby uses better-auth with SQLite-backed sessions:

  • Password hashing: Argon2id (64 MB memory, 3 iterations, 4 parallelism)
  • Session lifetime: 7 days, auto-extended every 24 hours on activity
  • Auth methods: Email/password with Bearer token plugin for API clients

Sessions are resolved from cookies (web) or Authorization: Bearer headers (TV/mobile/API). Deactivated user accounts are rejected even with valid sessions.

Four roles with capability-based access control:

| Capability | Owner | Admin | Member | Guest |
| --- | --- | --- | --- | --- |
| Manage server (core settings) | Yes | No | No | No |
| Manage settings (UI, transcoding) | Yes | Yes | No | No |
| Invite / manage users | Yes | Yes | No | No |
| Create / delete libraries | Yes | Yes | No | No |
| Scan libraries | Yes | Yes | No | No |
| Manage metadata | Yes | Yes | No | No |
| View audit logs | Yes | Yes | No | No |
| Access all libraries | Yes | Yes | No | No |
| Browse / play media | Yes | Yes | Yes | Yes |

Security-relevant events (login attempts, role changes, feature flag modifications, library operations) are recorded to an audit log table. Retention is configurable, and expired entries are cleaned up automatically by a periodic cron job.

The server publishes workflow events via Redis pub/sub. Connected clients receive them as Server-Sent Events:

GET /sse/workflows — Authenticated via cookie (web) or ?token= query parameter (TV/mobile).

The connection lifecycle:

  1. Authenticate the request
  2. Subscribe to the Redis event channel
  3. Send connected event
  4. Stream workflow events as they arrive
  5. Send ping keepalives every 30 seconds
  6. Clean up subscription on disconnect

Event types include: workflow.started, workflow.completed, workflow.failed, workflow.progress, step.started, step.completed, and fanout.progress. The admin dashboard uses these to show real-time scan progress, active sessions, and job status.
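For illustration, here is a minimal parser for one SSE frame (`event:`/`data:` lines, the wire format the endpoint emits) — a hypothetical helper, not Dubby's client code:

```typescript
// Parse a single SSE frame into its event name and data payload.
// Per the SSE spec, the event name defaults to "message" and
// multiple data: lines are joined with newlines.
function parseSseFrame(frame: string): { event: string; data: string } {
  let event = "message";
  const data: string[] = [];
  for (const line of frame.split("\n")) {
    if (line.startsWith("event:")) event = line.slice(6).trim();
    else if (line.startsWith("data:")) data.push(line.slice(5).trim());
  }
  return { event, data: data.join("\n") };
}
```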

The server supports three deployment modes via DUBBY_MODE:

| Mode | HTTP | Workers | Use case |
| --- | --- | --- | --- |
| standalone | Yes | Yes (in-process) | Home servers, NAS, single-node |
| api | Yes | No (enqueue only) | Horizontal API scaling |
| worker | No | Yes | Dedicated worker nodes |

In the api + worker split mode, an init container runs migrations before the API pods start. Rate limits are per-process, so with multiple API pods the effective limit is multiplied by the pod count.

When you add a library and trigger a scan, the ingest pipeline takes media files from raw filesystem entries to fully enriched, playable library items. The pipeline is a 3-stage cascade designed for maximum throughput — CPU-bound work (probing) runs on a separate queue from network-bound work (metadata fetching), and items are pipelined individually rather than batched.

The scanner recursively walks the library path, filtering for video file extensions (mkv, mp4, m4v, avi, mov, wmv, ts, m2ts, webm, flv, ogv). It applies ignore patterns to skip system files and special features (trailers, samples, featurettes) with smart pattern matching to avoid false positives on titles like “Extraordinary” or “The Long Long Trailer”.

For TV libraries, the discovery step pre-creates show and season records serially before spawning child jobs. This prevents race conditions where concurrent episode ingest jobs would try to create the same show record simultaneously.

Phase 2: Per-item ingest (3-stage cascade)


Each discovered file becomes an independent child workflow. Items appear in the app incrementally as each stage completes — you don’t wait for the whole scan to finish.

Stage 1: Prepare (ingest-prepare queue)

| Step | Purpose | Fatal? |
| --- | --- | --- |
| Quick probe | FFprobe the file — codecs, duration, resolution, chapters, embedded streams (skips keyframe analysis) | Yes |
| Persist | Insert or update the movie/episode record, scan subtitle tracks and audio tracks | Yes |
| Detect intro chapters | Detect intro boundaries from chapter markers (TV episodes only, instant) | No |
| Cascade | Set ingest status to persisted, enqueue Ingest Enrich | Yes |
| Enqueue keyframes | Enqueue deferred keyframe analysis on the analyze queue | No |

Stage 2: Enrich (ingest-enrich queue)

| Step | Purpose | Fatal? |
| --- | --- | --- |
| Enrich | TMDB quick-match by title/year; on match, fetch metadata, artwork, credits, genres | No |
| Ratings | Fill missing ratings from MDBList (only runs if enrich matched) | No |
| Cascade | Set ingest status to enriched, enqueue Ingest Enhance | Yes |

Both the enrich and ratings steps are protected by circuit breakers (5 failures → 30s cooldown) to avoid hammering external APIs during outages.

Stage 3: Enhance (ingest-enhance queue)

| Step | Purpose | Fatal? |
| --- | --- | --- |
| Subtitles | Download missing subtitles from OpenSubtitles (if configured); protected by circuit breaker | No |
| Finalize | Set ingest status to ready; for TV, update season/show episode counts and mark show ready when all episodes done | Yes |

Non-fatal steps are wrapped in try/catch — if subtitle downloads or intro detection fail, the item still completes successfully. The durable workflow engine ensures that if the server crashes mid-ingest, completed steps are not re-executed on restart.

After all items are ingested, the library scan runs additional steps:

| Step | Purpose |
| --- | --- |
| Auto-optimize | If enabled, enqueue ahead-of-time transcoding for the library |
| Auto-detect intros | For TV libraries, enqueue Chromaprint audio fingerprint analysis per season (enabled by default) |
| Metadata refresh | If requested (full scan), re-fetch metadata from providers |
| Backfill keyframes | Enqueue keyframe analysis for any items that haven’t been analyzed yet |

Keyframe analysis determines whether a file can be remuxed to HLS (fast, low CPU) or must be fully transcoded (slow, high CPU). It’s accurate but expensive — 30–180 seconds per file compared to 1–5 seconds for a basic probe. To keep items appearing in the library quickly, keyframe analysis is deferred to a background analyze queue. The playback planner handles missing keyframe data gracefully, assuming remux will work — which is correct for roughly 90% of content.

Two complementary approaches detect intro sequences:

  1. Chapter-based (inline, per-episode) — During the prepare stage, if the file has chapter markers labeled as “intro” or “recap”, intro boundaries are extracted instantly from the existing probe data.

  2. Audio fingerprint (deferred, per-season) — After a TV library scan completes, Chromaprint generates audio fingerprints for the first ~5 minutes of each episode in a season, then identifies the longest common audio sequence. This is CPU-intensive but highly accurate for shows without chapter markers. Enabled by default.

A file watcher monitors enabled library paths for changes using chokidar. When a new file appears, a debouncer polls stat() to check that the file size and modification time are stable across 3 consecutive checks (default 5-second interval). This prevents partial file ingestion during slow network copies or active downloads. Once the file stabilizes, a scan workflow is enqueued automatically.
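The stability check reduces to a small pure core. `makeStabilityTracker` is a hypothetical name, and the exact streak semantics (how many unchanged comparisons count as "stable") are an assumption:

```typescript
// A snapshot of the fields the watcher compares between polls.
type Stat = { size: number; mtimeMs: number };

// Feed successive stat() snapshots; returns true once `required`
// consecutive unchanged snapshots have been observed.
function makeStabilityTracker(required = 3) {
  let last: Stat | null = null;
  let streak = 0;
  return (s: Stat): boolean => {
    streak = last !== null && s.size === last.size && s.mtimeMs === last.mtimeMs ? streak + 1 : 0;
    last = s;
    return streak >= required;
  };
}
```

The real watcher drives this from a timer (5-second default interval) and only enqueues the scan workflow once the tracker reports stability.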

Dubby provides four privacy presets that control how the server interacts with external services and what data appears in logs.

| Preset | External services | TMDB images | Log masking |
| --- | --- | --- | --- |
| Maximum | Blocked | Blocked | All fields |
| Private | Allowed | Proxied | All fields |
| Balanced | Allowed | Proxied | Paths, user info, IPs |
| Open | Allowed | Direct CDN | None |

When image proxying is enabled (the Proxied setting above), TMDB poster and backdrop URLs are rewritten to route through the Dubby server. Clients never make direct requests to the TMDB CDN, so their IP addresses are never exposed to external services. The server fetches the image and serves it locally.

Four independent masking flags control what appears in logs and audit entries:

  • Mask file paths — Filesystem paths redacted from API responses and logs
  • Mask media titles — Titles redacted in external request logs
  • Mask user info — User details redacted in logs
  • Mask IP addresses — Last octet replaced with xxx in audit logs

A master allowExternalConnections flag acts as a kill switch for all outbound requests. When disabled, TMDB metadata fetching, OpenSubtitles downloads, and analytics are all blocked — the server operates in fully offline mode. Individual services can also be toggled independently.

All privacy setting changes are recorded in the audit log.

Domain packages return Result<T, E> instead of throwing exceptions. A Result is either { ok: true, value: T } or { ok: false, error: E }, forcing callers to handle both paths explicitly. Errors are semantic domain types (NotFoundError, ForbiddenError, ValidationError) that carry structured context rather than raw strings.

A shared classifyDomainError() function maps any domain error to a canonical error code, which is then translated to the appropriate HTTP status or tRPC error code by framework-specific middleware. This keeps domain logic completely decoupled from the transport layer — both REST and tRPC routes use the same classification, so error behavior is always consistent.

The streaming package — one of the most complex parts of the system — has zero dependency on the database. Instead, it defines port interfaces (MediaInfoPort, TrackLookupPort, VariantLookupPort) as plain TypeScript types. Concrete adapters that talk to the database are wired at the server’s composition root.

This means the entire streaming engine can be tested with mock ports, and the database schema can evolve without touching streaming code. The same pattern applies to other infrastructure boundaries.
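A sketch of the ports-and-adapters split — the port's method signature is an assumption based on the names above, not the real interface:

```typescript
// Port: a plain TypeScript interface the streaming package defines and owns.
interface MediaInfoPort {
  getDurationSeconds(mediaId: string): number | undefined;
}

// Streaming logic depends only on the port, never on the database...
function canSeekTo(port: MediaInfoPort, mediaId: string, position: number): boolean {
  const duration = port.getDurationSeconds(mediaId);
  return duration !== undefined && position >= 0 && position <= duration;
}

// ...while tests (or the server's composition root) supply an adapter.
const fakePort: MediaInfoPort = { getDurationSeconds: () => 7200 };
```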

Background jobs (library scanning, metadata enrichment, subtitle downloads) run on a custom durable execution engine inspired by Temporal. The core primitive is ctx.step(name, fn):

  1. Check if this step has already completed (from the database)
  2. If yes, return the stored result without re-executing
  3. If no, execute the function, persist the result, return it

After any crash, the workflow resumes from where it left off — completed steps replay instantly from the database, and only the first incomplete step re-executes. This makes every workflow crash-safe and resumable without any special error handling.

The engine also supports:

  • Fan-out — Process items in parallel with semaphore-based concurrency control. Progress is published to the event bus in real time.
  • Child workflows — Bulk-enqueue independent workflows and poll for completion. A library scan spawns thousands of per-item ingest children this way.
  • Cancellation — Cancel signals propagate via Redis pub/sub. Before each step, the context checks its cancellation token and throws WorkflowCancelledError.
  • Startup recovery — On every boot, interrupted runs are detected. Those with retries remaining are re-enqueued; those without are marked failed.

Seeking during HLS transcoding is surprisingly complex. The seek handler debounces rapid scrubbing (150ms) to prevent FFmpeg process explosion, then evaluates three paths:

  • Cache hit — The target segment is already transcoded. Return immediately.
  • Wait — The target is close to the transcode frontier. Poll until FFmpeg catches up.
  • Restart — The target is beyond the transcoded range. Kill FFmpeg and restart at the new position with a new “epoch.”

The epoch counter ensures HLS segment filenames never collide across seeks — each restart writes to a new namespace. The result is a discriminated union (cache_hit | debounced | waited | transcode_restarted | direct_play | error) that gives clients detailed information about what happened and how long it took.
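The discriminated union can be sketched as a TypeScript type; the variant names are from this page, but the fields beyond the discriminant are assumptions:

```typescript
// One variant per seek outcome; the `type` field is the discriminant.
type SeekResult =
  | { type: "cache_hit"; segment: number }
  | { type: "debounced" }
  | { type: "waited"; waitedMs: number }
  | { type: "transcode_restarted"; epoch: number }
  | { type: "direct_play" }
  | { type: "error"; message: string };

// Exhaustive switch: the compiler flags any variant left unhandled.
function describeSeek(r: SeekResult): string {
  switch (r.type) {
    case "cache_hit": return `served segment ${r.segment} from cache`;
    case "debounced": return "coalesced into a later seek";
    case "waited": return `waited ${r.waitedMs}ms for the frontier`;
    case "transcode_restarted": return `restarted FFmpeg (epoch ${r.epoch})`;
    case "direct_play": return "no transcode needed";
    case "error": return `seek failed: ${r.message}`;
  }
}
```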

Four roles (owner, admin, member, guest) map to an immutable capability table. Rather than checking role names, the system checks capabilities (canScanLibraries, canManageUsers, canAccessAllLibraries). tRPC procedures are built in a composable middleware hierarchy:

```
publicProcedure
→ protectedProcedure (requires session)
→ adminProcedure (requires owner or admin)
→ capabilityProcedure(cap) (requires specific capability)
```

Each layer narrows the TypeScript context types, so downstream handlers have compile-time guarantees about what’s available. This approach is future-proof — new roles can be added by mixing capabilities without modifying middleware.

Runtime configuration follows a strict priority order:

  1. Schema defaults — Hardcoded baseline values
  2. Auto-detected values — Hardware capabilities (GPU, CPU cores, available codecs)
  3. Database-persisted values — Admin settings from the UI (sensitive values encrypted with AES-256-GCM)
  4. Environment variables — Highest priority, for container orchestration overrides

The config service supports hot-reload: when an admin changes a setting in the UI, it takes effect immediately without restarting the server. Listeners can subscribe to specific config keys for real-time updates.
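Because later sources win, the resolution is essentially a left-to-right object spread. `resolveConfig` below is an illustrative sketch, not the config service's real API:

```typescript
type ConfigSource = Record<string, unknown>;

// Merge in priority order: each later source overrides the earlier ones.
function resolveConfig(
  defaults: ConfigSource,  // 1. schema defaults
  detected: ConfigSource,  // 2. auto-detected hardware values
  persisted: ConfigSource, // 3. admin settings from the database
  env: ConfigSource,       // 4. environment variables (highest priority)
): ConfigSource {
  return { ...defaults, ...detected, ...persisted, ...env };
}
```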

The per-request context object intentionally excludes secrets (BETTER_AUTH_SECRET, REDIS_URL, DATABASE_AUTH_TOKEN, DUBBY_ENCRYPTION_KEY). Error serialization in both tRPC and REST often includes context for debugging — this exclusion prevents accidental secret leakage in error payloads. Procedures that actually need secrets read from process.env directly rather than the context.

Playback sessions send a heartbeat every 10 seconds with the current position and play state. If the server restarts and the session no longer exists, the client automatically:

  1. Captures the current playing position
  2. Recreates the session at that position
  3. Resumes playback seamlessly

Users experience uninterrupted playback across server restarts — the player picks up exactly where it left off without manual intervention.

Client state management follows a strict pattern: Zustand for local state (auth, preferences, player), TanStack Query for server state (via tRPC). Mutations use optimistic updates — the UI updates immediately, then the server confirms. On success, the cache is invalidated to pick up any server-side side effects. On failure, the optimistic update is rolled back and the previous state is restored.

tRPC is the internal API (used by web and TV clients); REST is the public API (for third-party integrations). Every tRPC endpoint has a corresponding REST endpoint with identical behavior, validation, and error handling. Both share the same Zod schemas and domain functions — the only difference is the transport layer. The REST API auto-generates an OpenAPI 3.1 spec at /v1/openapi.json.

  • IDs: nanoid strings (21 characters), never UUID or autoincrement
  • ORM: Drizzle with SQLite (libsql), WAL mode and foreign keys always enabled
  • Timestamps: ISO 8601 strings, not Date objects
  • Booleans: integer({ mode: 'boolean' }) (SQLite has no native boolean)
  • Test isolation: Domain functions accept db: Database as their first parameter, enabling each test to run against an independent in-memory database
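The timestamp and boolean conventions above can be illustrated with plain converters (hypothetical helpers; in the actual schema Drizzle's column modes handle these mappings):

```typescript
// SQLite has no native boolean: store 1/0 integers.
const toDbBool = (b: boolean): number => (b ? 1 : 0);
const fromDbBool = (n: number): boolean => n === 1;

// Timestamps are stored as ISO 8601 strings, never Date objects.
const toDbTimestamp = (d: Date): string => d.toISOString();
```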