Configuration
Config files live in config/ and are picked up automatically. TOML
and YAML are both supported (the parser is chosen by file extension).
Any value can be overridden by an environment variable using the
TSUNDOKU_ prefix and __ for nesting:
TSUNDOKU_SERVER__PORT=9000 \
TSUNDOKU_STORAGE__DATA_DIR=/var/lib/tsundoku \
tsundoku serve
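The override rule above can be sketched in a few lines of Python. This is an illustration only, not tsundoku's loader: the helper names `env_to_path` and `apply_env` are hypothetical, lowercasing the key segments is an assumption, and values are left as strings because type coercion isn't described on this page.

```python
from typing import Dict, List, Optional

PREFIX = "TSUNDOKU_"


def env_to_path(name: str) -> Optional[List[str]]:
    """Map TSUNDOKU_SERVER__PORT to ["server", "port"]; None for other vars."""
    if not name.startswith(PREFIX):
        return None
    # "__" separates nesting levels; a single "_" stays part of the key.
    return [part.lower() for part in name[len(PREFIX):].split("__")]


def apply_env(config: dict, environ: Dict[str, str]) -> dict:
    """Write every TSUNDOKU_-prefixed variable into the nested config dict."""
    for name, value in environ.items():
        path = env_to_path(name)
        if path is None:
            continue
        node = config
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value  # kept as a string; coercion is not shown
    return config
```

Under these assumptions, `TSUNDOKU_SERVER__PORT=9000` lands on `server.port`, and the single underscore in `DATA_DIR` stays inside the key name `data_dir`.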
A sibling <stem>.local.<ext> file (e.g. tsundoku.docker.local.toml)
gets auto-merged on top of the base file before env vars apply. The
local file is gitignored — use it for secrets and host-specific
overrides.
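The layering described above (base file, then the local file, then environment variables) amounts to a recursive merge in which later layers win. A minimal sketch, assuming tables merge key by key and any non-table value is replaced wholesale; `deep_merge` is a hypothetical helper, and the real merge semantics (especially for arrays such as `[[sources]]`) may differ.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where values from `override` win over `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are tables: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and arrays are replaced wholesale (an assumption).
            merged[key] = value
    return merged


# Example layers: the base file first, then the gitignored local file on top.
base = {"server": {"host": "127.0.0.1", "port": 8080}}
local = {"server": {"port": 9000}, "auth": {"admin_token": "example-token"}}
merged = deep_merge(base, local)
```

Environment variables would then be applied to `merged` as the final layer.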
The canonical reference is config/tsundoku.example.toml, with inline
documentation for every key. This page is a faster-to-skim version
organized by section.
[server]
[server]
host = "127.0.0.1"
port = 8080
Bind address for the HTTP server. Behind a reverse proxy you'll
typically set host = "0.0.0.0" and let the proxy terminate TLS.
[storage]
Single on-disk root. The database, every provider's offline cache, the cover cache, and scratch space all live here. Each subpath defaults to a subdirectory; override individually if you need to.
[storage]
data_dir = "./data"
# database_path = "./data/db/tsundoku.db"
# provider_cache_dir = "./data/cache/providers"
# cover_cache_dir = "./data/cache/covers"
# tmp_dir = "./data/tmp"
| Default | Contents |
|---|---|
| ${data_dir}/db/tsundoku.db | SQLite database (the only stateful file) |
| ${data_dir}/cache/providers/ | Metadata provider offline caches (e.g. MangaBaka dump) |
| ${data_dir}/cache/covers/ | Reserved for future cover-image cache |
| ${data_dir}/tmp/ | Transient downloads, in-progress ingests |
Docker mounts a single volume at data_dir. Back up by copying the
directory.
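The defaulting rule in the table can be sketched as follows. This is an illustration under assumptions, not the actual resolver: `resolve_storage` and `DEFAULT_SUBPATHS` are hypothetical names, and only the key names and default subpaths are taken from this page.

```python
from pathlib import PurePosixPath

# Default location of each subpath, relative to data_dir (from the table above).
DEFAULT_SUBPATHS = {
    "database_path": "db/tsundoku.db",
    "provider_cache_dir": "cache/providers",
    "cover_cache_dir": "cache/covers",
    "tmp_dir": "tmp",
}


def resolve_storage(storage: dict) -> dict:
    """Resolve every storage path: an explicit override wins, else data_dir/<default>."""
    root = PurePosixPath(storage.get("data_dir", "./data"))
    resolved = {}
    for key, subpath in DEFAULT_SUBPATHS.items():
        if key in storage:
            resolved[key] = PurePosixPath(storage[key])
        else:
            resolved[key] = root / subpath
    return resolved


paths = resolve_storage({"data_dir": "/var/lib/tsundoku", "tmp_dir": "/mnt/scratch"})
```

Here only `tmp_dir` leaves the root; the database and both caches stay under `/var/lib/tsundoku`, so a copy of that directory still captures the one stateful file.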
[logging]
[logging]
level = "info"
json = false
level is parsed with tracing-subscriber's EnvFilter syntax, so it
accepts info, debug, tsundoku=trace,sea_orm=warn, etc.
Set json = true when log aggregators want structured output.
[api]
[api]
docs = true
When true, mounts the Scalar API docs UI at /docs. Disable it in
hardened deployments where you don't want a docs surface exposed.
[auth]
Single-user auth from config, not a users table.
[auth]
read_requires_auth = false
# api_key = "read-...........-only"
# admin_token = "write-..........-only"
- Reads are public by default. Flip read_requires_auth = true and set
  api_key; the frontend sends the key via X-API-Key or
  Authorization: Bearer.
- Write endpoints (review queue, manual polls, manual cache refresh,
  retry/reject) always require admin_token as a bearer token.
- A missing admin_token returns 503 Misconfigured (distinct from
  401 Unauthorized) so a fresh deploy doesn't look like a
  credentialing bug.
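The write-side rules above can be summarized as a small decision function. This is a sketch under assumptions, not the server's middleware: `authorize_write` is a hypothetical name, 200 stands in for "the request proceeds", and the constant-time comparison is a general good practice rather than something this page states.

```python
import hmac
from typing import Optional


def authorize_write(admin_token: Optional[str], bearer: Optional[str]) -> int:
    """Return the HTTP status a write request would receive under these rules."""
    if not admin_token:
        # No admin_token configured: a deployment problem, not a bad credential.
        return 503
    if bearer is None:
        return 401
    # Compare as bytes in constant time to avoid leaking the token via timing.
    if not hmac.compare_digest(bearer.encode(), admin_token.encode()):
        return 401
    return 200
```

Checking for the missing token first is what keeps a fresh, unconfigured deploy from answering 401 to every write.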
[metadata]
[metadata]
active_provider = "mangabaka"
Exactly one provider runs the auto-resolution path. Others may be
registered (for cross-provider foreign-ID chains and review-UI
search), but only active_provider drives the resolver's fuzzy step.
Switching is a config-level decision, not a runtime API call.
[metadata.series_refresh]
[metadata.series_refresh]
# cron = "0 5 * * *" # disabled by default; omit to keep it off
batch_size = 50
min_age_days = 7
Re-fetches catalog series rows from the active provider so the
stored title, description, cover, genres, tags, counts, and rating
keep up with upstream changes. Distinct from
providers.mangabaka.offline_refresh_cron, which swaps the provider's
dump but never touches a series row.
| Field | Default | Purpose |
|---|---|---|
| cron | unset | Cron for the scheduled job; absent disables it. The POST /series/refresh-all and tsundoku refresh-series triggers still work. 5-field crons get padded to seconds-0. |
| batch_size | 50 | Max rows refreshed per tick. Each row is one outbound provider call; tune to what the provider's rate limit tolerates. 0 makes every tick a no-op (transient disable without dropping the cron). |
| min_age_days | 7 | Skip rows whose metadata_fetched_at is fresher than this many days. Matches MangaBaka's published-dump cadence; tighten or loosen by observed upstream churn. |
The full operator-facing story (UI, CLI, endpoints) is on the Providers page → Series-row refresh.
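How batch_size and min_age_days interact on a single tick can be sketched as below. This is an illustration only: `select_rows` is a hypothetical helper, rows are modeled as plain dicts, and both the oldest-first ordering and the treatment of never-fetched rows (None) as stale are assumptions not stated on this page.

```python
from datetime import datetime, timedelta, timezone


def select_rows(rows, now, batch_size=50, min_age_days=7):
    """Pick the series rows one refresh tick would re-fetch."""
    cutoff = now - timedelta(days=min_age_days)
    stale = [
        row for row in rows
        if row["metadata_fetched_at"] is None  # never fetched (assumed stale)
        or row["metadata_fetched_at"] <= cutoff  # at least min_age_days old
    ]
    # Never-fetched rows first, then oldest first (ordering is an assumption).
    stale.sort(key=lambda r: (r["metadata_fetched_at"] is not None,
                              r["metadata_fetched_at"] or now))
    # batch_size = 0 yields an empty slice, so the tick becomes a no-op.
    return stale[:batch_size]
```

Each selected row costs one outbound provider call, which is why batch_size doubles as a rate-limit knob.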
[providers.mangabaka]
[providers.mangabaka]
enabled = true
api_base_url = "https://api.mangabaka.dev"
# api_key = "mb-..."
# api_fallback = true
# offline_dump_url = "https://..."
# offline_refresh_cron = "0 4 * * 0"
negative_cache_ttl_days = 7
timeout_seconds = 60
| Field | Purpose |
|---|---|
| enabled | When false, provider is skipped at boot. |
| api_base_url | Live-API endpoint. Used for api_fallback calls. |
| api_key | Optional. Without it, the provider runs in offline-only mode. |
| api_fallback | When true + api_key set, cache misses fall back to a live API call. |
| offline_dump_url | URL of the nightly SQLite dump. Leave unset to disable the offline cache. |
| offline_refresh_cron | Cron expression for periodic dump refresh. 5-field crons are auto-padded to seconds-0. |
| negative_cache_ttl_days | How long to remember "this ID isn't in MangaBaka" before retrying. |
| timeout_seconds | HTTP timeout for both the API and the dump download. |
The full lifecycle of the offline cache lives on the Providers page.
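The "padded to seconds-0" note on cron fields means a standard 5-field expression gains a leading seconds field of 0. A minimal sketch of that normalization, assuming expressions that already have more than five fields pass through unchanged; `pad_cron` is a hypothetical name and no validation is shown.

```python
def pad_cron(expr: str) -> str:
    """Prefix a 5-field cron expression with a seconds field of 0."""
    fields = expr.split()
    if len(fields) == 5:
        fields = ["0", *fields]
    # Expressions with any other field count are returned as given.
    return " ".join(fields)
```

For example, the commented-out `0 4 * * 0` above would be scheduled as `0 0 4 * * 0`, i.e. second 0 of 04:00 on Sundays.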
[ingestion]
Controls how releases get matched to series rows. Thresholds use Dice-coefficient scoring on character bigrams (case- and punctuation-insensitive).
[ingestion]
resolution_threshold = 0.85
review_threshold = 0.55
fuzzy_search_limit = 10
queue_low_confidence = true
# [[ingestion.format_type_rules]]
# formats = ["cbz", "cbr", "zip", "rar"]
# required_kinds = ["manga", "manhwa", "manhua"]
# [ingestion.cleanup]
# extra_format_keywords = ["Remastered", "DigitalUncen"]
Thresholds
| Field | Default | Meaning |
|---|---|---|
| resolution_threshold | 0.85 | Dice score above which a fuzzy hit auto-resolves. |
| review_threshold | 0.55 | Minimum score to surface a candidate in the review queue. |
| fuzzy_search_limit | 10 | Cap on MetadataProvider::search candidates inspected per release. |
| queue_low_confidence | true | When false, sub-threshold hits leave the release unresolved without writing candidates. |
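To make the scoring concrete, here is a sketch of Dice-coefficient matching on character bigrams and of how the thresholds partition the result. It is an approximation under assumptions, not the resolver's code: the function names are hypothetical, bigrams are treated as a set (not a multiset), whitespace is stripped along with punctuation, and scores exactly equal to a threshold are treated as passing it.

```python
import re


def bigrams(text: str) -> set:
    """Lowercase, drop punctuation and whitespace, return the set of 2-char windows."""
    normalized = re.sub(r"[\W_]+", "", text.casefold())
    return {normalized[i:i + 2] for i in range(len(normalized) - 1)}


def dice(a: str, b: str) -> float:
    """Dice coefficient: 2 * |A & B| / (|A| + |B|) over the two bigram sets."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 0.0  # nothing to compare (an assumption)
    return 2 * len(x & y) / (len(x) + len(y))


def classify(score: float, resolution_threshold: float = 0.85,
             review_threshold: float = 0.55,
             queue_low_confidence: bool = True) -> str:
    """Map a score to the outcome the thresholds describe."""
    if score >= resolution_threshold:
        return "resolved"
    if score >= review_threshold and queue_low_confidence:
        return "review"
    return "unresolved"
```

For instance, "night" and "nacht" share only the bigram "ht" out of four each, giving 2 * 1 / 8 = 0.25, which falls below review_threshold and leaves the release unresolved.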
[[ingestion.format_type_rules]]
A rule fires when any of its formats is detected on the release.
The matched series's kind must then be in required_kinds
(case-insensitive). Mismatches demote the release to ambiguous for
human review. Empty list disables format-type validation.
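The rule check described above can be sketched like this. It is a hedged illustration: `check_format_rules` is a hypothetical name, the return values "ok" and "ambiguous" are stand-ins for whatever states the resolver uses, and comparing formats case-insensitively is an assumption (this page only states it for kinds).

```python
def check_format_rules(rules, detected_formats, series_kind):
    """Return "ambiguous" if any firing rule rejects the series kind, else "ok"."""
    detected = {fmt.lower() for fmt in detected_formats}
    kind = series_kind.lower()
    for rule in rules:
        # A rule fires when any of its formats was detected on the release.
        fires = any(fmt.lower() in detected for fmt in rule["formats"])
        allowed = {k.lower() for k in rule["required_kinds"]}
        if fires and kind not in allowed:
            return "ambiguous"
    # No rule fired, or every firing rule accepted the kind.
    return "ok"


# The commented-out example rule from the [ingestion] block above.
rules = [{
    "formats": ["cbz", "cbr", "zip", "rar"],
    "required_kinds": ["manga", "manhwa", "manhua"],
}]
```

With that rule, a cbz release matched to a series of kind "Manga" passes, while the same release matched to a "novel" is demoted for human review.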
[ingestion.cleanup]
Title-cleaning knobs. The resolver strips structural noise (parens, brackets, volume/chapter markers, file extensions, format keywords, year tokens) from each release title before searching the active provider. Rules and ordering live in code; the only operator surface is the keyword list.
extra_format_keywords are appended to the built-in list (Digital,
Raw, Color, Colored, Omnibus, Premium, Complete,
Decensored, Uncensored, Webtoon, WN, LN). Each entry is
matched as a whole-word token, case-insensitively. Regex
metacharacters are rejected at config load — entries must be plain
words or phrases.
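The keyword behavior can be approximated as follows. This sketch covers only the format-keyword step, not the other cleaning rules that live in code; `validate_keywords` and `strip_keywords` are hypothetical names, and the exact set of rejected metacharacters is an assumption.

```python
import re

# Built-in list as given above.
BUILTIN_KEYWORDS = [
    "Digital", "Raw", "Color", "Colored", "Omnibus", "Premium", "Complete",
    "Decensored", "Uncensored", "Webtoon", "WN", "LN",
]
# Characters treated as regex metacharacters here (assumed set).
REGEX_META = set(r".^$*+?{}[]\|()")


def validate_keywords(extra):
    """Reject entries containing regex metacharacters; return the combined list."""
    for keyword in extra:
        if not keyword.strip() or REGEX_META & set(keyword):
            raise ValueError(f"invalid format keyword: {keyword!r}")
    return BUILTIN_KEYWORDS + list(extra)


def strip_keywords(title, keywords):
    """Remove whole-word, case-insensitive keyword matches from a title."""
    # Longest first so "Colored" is tried before "Color".
    ordered = sorted(keywords, key=len, reverse=True)
    alternation = "|".join(re.escape(k) for k in ordered)
    pattern = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    # Collapse the whitespace left behind by removed tokens.
    return " ".join(pattern.sub(" ", title).split())
```

Because matching is whole-word, a title such as "Colorful" is left alone even though "Color" is on the built-in list.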
[[sources]]
Each entry is one polled instance. The kind field picks the
implementation; per-kind options live in the matching nested block.
[[sources]]
kind = "nyaa"
name = "english-manga-trusted"
cron = "0 */2 * * *" # every 2 hours
enabled = true
[sources.nyaa]
feed_url = "https://nyaa.si/?page=rss&c=3_1&f=2"
fetch_details = true
timeout_seconds = 30
site_base_url = "https://nyaa.si"
| Field | Purpose |
|---|---|
| kind | Implementation selector. v1 ships only "nyaa". |
| name | Stable identifier. Used in URLs, metrics, and config overrides. |
| cron | Schedule. Omit to skip the scheduled poll; the CLI one-shot still works. |
| enabled | Set to false to keep the entry around without polling. |
| sources.nyaa.feed_url | The RSS URL to poll. Tune per the Sources page. |
| sources.nyaa.fetch_details | Fetch each post's detail page for richer file/link data. Default true. |
| sources.nyaa.timeout_seconds | HTTP timeout for feed + detail fetches. |
| sources.nyaa.site_base_url | Override for proxied feeds. Defaults to https://nyaa.si. |
Multiple [[sources]] blocks polling the same kind are supported —
useful for distinct uploader feeds, language subcategories, or
parallel polling at different cadences.