Dock dock.build · v0.5

Holistic evaluations, dated

Reviews

Periodic full-product evaluations — architecture, product, design, security, the lot — scored and dated. Distinct from the Build Log: that's the running record of decisions; this is where we step back and grade the whole thing.

2026-06-13 · Full review — the loop is complete, self-host is one command

Snapshot after permissions, reviews, @mentions + an in-app inbox, @agent, collections, browse/search, a full mobile pass, and single-container deploy.

8.5 / 10 · A−

An exceptional foundation. The architecture is honest, the publish → review → revise loop is complete, and self-host is genuinely one command — a level of coherence most products don't reach in a year. What's left is operational, not conceptual: the hardening a public, multi-tenant, internet-facing tier demands.

| Dimension | Score | Read |
|---|---|---|
| Architecture | 9.0 | Ports/adapters with a real SQLite→Postgres→D1 + fs→S3/R2 ladder, switched by env. Compile-time parity guards make "works across drivers" a typecheck. One can() choke point for authz; an outbox for webhooks instead of a queue service. Single-container or split falls out of one image. |
| Code quality | 9.0 | Strict TS (no any, no suppressions), Biome, models own their DB access, batch over N+1. The drizzle drift that had caused cross-writer bugs was root-caused and locked with a pinned version + dedupe. |
| Product & strategy | 9.0 | A coherent thesis with sharp calls that land: review the rendered experience, not a line diff, because an artifact is a whole document; an agent is just a principal at commenter rank, so safety falls out of permissions, not new trust. |
| Design & UX | 8.5 | Google-Docs-style margin comments, a rendered-experience review surface, a real mobile pass driven at 390px, a polished split login, disciplined minimal color. A notch below the engineering only because some surfaces are young. |
| Security | 8.0 | CSP sandbox with no allow-same-origin, the authz choke point, webhook SSRF guard, per-IP rate limiting, HMAC-signed payloads, enforced auth secret. Strong fundamentals; the sandbox serving domain + abuse/takedown story are correctly flagged but not yet built. |
| Testing | 9.0 | ~134 cases across seven packages: unit (the can() matrix, anchoring, diff) + API integration against real databases + multi-check end-to-end runs against real Postgres + browser verification + a nightly Playwright smoke. |
| Self-host & DX | 9.0 | One command is the whole app: docker run, SQLite + blobs in a volume, zero config. The same image scales out by adding DATABASE_URL + OBJECT_STORE_URL. One-click Railway/Fly with auto-detected URLs. Best-in-class for a young OSS product. |
| Scale & ops | 7.0 | Keyset pagination and a stateless-API option are in; but SSE + presence are in-process (Redis backplane queued), analytics grows unbounded (roll-up queued), it's single-workspace today, search is title-only, and observability is a delivery log, with no metrics/tracing yet. |
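
The Architecture and Self-host rows describe drivers "switched by env". As a rough illustration of that idea, here is a minimal sketch of env-based driver selection. Only DATABASE_URL and OBJECT_STORE_URL come from the review; every other name, the URL schemes, and the factory map are assumptions made for the example, not Dock's code. D1 is left out because the review does not say how it is selected.

```typescript
// Illustrative only: every name below except DATABASE_URL and
// OBJECT_STORE_URL is hypothetical, not taken from Dock's codebase.
type DbDriver = "sqlite" | "postgres";
type BlobDriver = "fs" | "s3";

interface Env {
  DATABASE_URL?: string;
  OBJECT_STORE_URL?: string;
}

interface DriverChoice {
  db: DbDriver;
  blobs: BlobDriver;
}

// With nothing set, fall back to the zero-config pair (SQLite + local files).
// The URL schemes checked here are assumed for the sake of the example.
function selectDrivers(env: Env): DriverChoice {
  const dbUrl = env.DATABASE_URL ?? "";
  const blobUrl = env.OBJECT_STORE_URL ?? "";
  return {
    db: /^postgres(ql)?:\/\//.test(dbUrl) ? "postgres" : "sqlite",
    blobs: blobUrl.startsWith("s3://") ? "s3" : "fs",
  };
}

// A stand-in for the port that every database adapter implements.
interface ArtifactStore {
  readonly driver: DbDriver;
}

// A mapped type keyed by DbDriver: adding a driver to the union without
// adding a factory here fails the typecheck. This is a much smaller guard
// than the schema-level parity check the review describes.
const storeFactories: { [K in DbDriver]: () => ArtifactStore } = {
  sqlite: () => ({ driver: "sqlite" }),
  postgres: () => ({ driver: "postgres" }),
};

function openStore(env: Env): ArtifactStore {
  return storeFactories[selectDrivers(env).db]();
}
```

Calling selectDrivers with an empty env yields the SQLite + local-files pair, which matches the zero-config docker run case; setting both URLs moves the same code onto the scaled-out pair.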

What we evaluated

The point where the loop closed: publish → comment → propose → review → revise, with @mentions and an in-app inbox, @agent as a safe contributor, collections, browse + server-side search, a full mobile pass, and a single-container deploy. Seven packages, ~134 tests, 54 API routes — built in roughly two days.

What's strong

  • The data layer is the standout. One set of ports, three drivers and two blob stores, switched by env, with compile-time parity guards: a schema that drifts between SQLite and Postgres fails the build. "Works everywhere" is enforced, not hoped.
  • Authorization is a single choke point. can(actor, action, visibility) gates every mutating route the same way, so a route that forgets to gate is the visible exception — not a silent hole.
  • The product reasons from first principles. Approving the rendered output (the diff demoted to a third tab) and modeling agents as commenter-rank principals are genuine insight, not features bolted on.
  • Verification is a habit. Real-database integration, multi-check end-to-end runs against real Postgres, browser-driven checks before merge, and a nightly smoke — not an afterthought.
  • Self-host is real. docker run is the whole product at one URL; the same image scales by env. That's the n8n/Ghost bar, met early.
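
To make the single choke point concrete, here is a hedged sketch of what a can(actor, action, visibility) check might look like. The signature is from the review; the role ladder, the action names, the visibility values, and the rule that agents are capped at commenter rank are assumptions chosen to illustrate "an agent is just a principal at commenter rank".

```typescript
// Hypothetical roles, actions, and visibility values; only the
// can(actor, action, visibility) signature comes from the review.
type Role = "viewer" | "commenter" | "editor" | "owner";
type Action = "view" | "comment" | "propose" | "edit" | "delete";
type Visibility = "public" | "private";

interface Actor {
  kind: "user" | "agent";
  role: Role | null; // null means no membership (anonymous or outsider)
}

const RANK: Record<Role, number> = {
  viewer: 0,
  commenter: 1,
  editor: 2,
  owner: 3,
};

// Minimum role each action requires (assumed mapping).
const REQUIRED: Record<Action, Role> = {
  view: "viewer",
  comment: "commenter",
  propose: "commenter",
  edit: "editor",
  delete: "owner",
};

function can(actor: Actor, action: Action, visibility: Visibility): boolean {
  // Anyone may read a public artifact.
  if (action === "view" && visibility === "public") return true;
  if (actor.role === null) return false;

  // Agents never act above commenter rank, whatever role they were granted.
  const rank =
    actor.kind === "agent"
      ? Math.min(RANK[actor.role], RANK.commenter)
      : RANK[actor.role];

  return rank >= RANK[REQUIRED[action]];
}
```

Because every route would call the same function, agent safety needs no separate trust model: an agent granted "owner" by mistake still cannot edit or delete in this sketch.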

Where the gaps are

  • Hosted readiness is the frontier. The two things called non-negotiable before public signups (a dedicated sandbox serving domain and an abuse/takedown story) aren't built. Today the CSP sandbox without allow-same-origin is the guard; a separate serving origin is the second layer that should back it up.
  • Horizontal scale has known edges. SSE + presence are in-process, so multi-instance fan-out doesn't work until the Redis backplane lands; the analytics table grows unbounded pending roll-up.
  • Single-workspace today. Multi-tenant is designed and additive (org plugin, scope-by-org) but not in.
  • Thin on the edges. Search is title-only (no relevance/full-text yet); observability is a delivery log + health check, with no metrics or tracing.
  • It's two days old. Breadth is real and tested, but real-world load, abuse, and long-tail content haven't hit it.
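
The horizontal-scale gap can be shown in a few lines. The sketch below is an assumption-laden toy, not Dock's SSE code: all class and channel names are hypothetical, and a shared in-memory bus stands in for the queued Redis backplane. It shows why events published on one instance never reach subscribers connected to another when each instance keeps its own in-process bus.

```typescript
// Toy model; every name here is hypothetical.
type Listener = (event: string) => void;

interface Backplane {
  publish(channel: string, event: string): void;
  subscribe(channel: string, listener: Listener): () => void;
}

// An in-memory pub/sub: listeners live inside one process only.
class InProcessBus implements Backplane {
  private readonly listeners = new Map<string, Set<Listener>>();

  publish(channel: string, event: string): void {
    this.listeners.get(channel)?.forEach((listener) => listener(event));
  }

  subscribe(channel: string, listener: Listener): () => void {
    const set = this.listeners.get(channel) ?? new Set<Listener>();
    set.add(listener);
    this.listeners.set(channel, set);
    // The returned function unsubscribes this listener.
    return () => {
      set.delete(listener);
    };
  }
}

// One API instance with one connected SSE client on a fixed channel.
class ApiInstance {
  readonly delivered: string[] = [];
  private readonly bus: Backplane;

  constructor(bus: Backplane) {
    this.bus = bus;
    this.bus.subscribe("artifact:1", (event) => this.delivered.push(event));
  }

  postComment(text: string): void {
    this.bus.publish("artifact:1", text);
  }
}

// Separate buses: the second instance's client misses the event.
const isolatedA = new ApiInstance(new InProcessBus());
const isolatedB = new ApiInstance(new InProcessBus());
isolatedA.postComment("first");

// One shared bus (standing in for an external backplane): both receive it.
const sharedBus = new InProcessBus();
const sharedA = new ApiInstance(sharedBus);
const sharedB = new ApiInstance(sharedBus);
sharedA.postComment("second");
```

Keeping the Backplane interface separate from its in-memory implementation is the design point: a network-backed implementation could be swapped in without touching the instance code.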

Bottom line

Dock is an exceptional foundation: the architecture is honest, the loop is complete, and self-host is one command. The core is sound; what remains is everything a public, multi-tenant, internet-facing tier demands (a sandbox domain, multi-instance fan-out, abuse controls, observability, multi-tenancy). The risk isn't the code; it's the operational surface a hosted tier exposes. A−, trending up.