# Why nextrs over Next.js

> Same app, same UI, same database — measured head to head, twice: a minimal app and a real production app converted end-to-end

<div class="not-prose my-8 grid grid-cols-1 sm:grid-cols-2 gap-4">
  <div class="stats shadow w-full">
    <div class="stat">
      <div class="stat-title">Authed, DB-backed page</div>
      <div class="stat-value text-primary">99×</div>
      <div class="stat-desc">throughput vs Next.js — real app, same Postgres</div>
    </div>
  </div>
  <div class="stats shadow w-full">
    <div class="stat">
      <div class="stat-title">Public page render</div>
      <div class="stat-value text-primary">522×</div>
      <div class="stat-desc">340k req/s vs 652 req/s</div>
    </div>
  </div>
  <div class="stats shadow w-full">
    <div class="stat">
      <div class="stat-title">Cold start, real app</div>
      <div class="stat-value text-primary">20×</div>
      <div class="stat-desc">215 ms vs 4.3 s — and nextrs cold ≈ its warm</div>
    </div>
  </div>
  <div class="stats shadow w-full">
    <div class="stat">
      <div class="stat-title">Memory serving</div>
      <div class="stat-value text-primary">2.6×</div>
      <div class="stat-desc">92 MB vs 236 MB — real app, RSS</div>
    </div>
  </div>
</div>

Benchmark blog posts usually compare a hello-world. We did that too — but then we took a **real production app** (a bookings/admin platform: better-auth, Postgres, S3, shadcn/radix, 23 pages, 68 server actions) and converted it to nextrs with **byte-identical frontends** — same React components, same flows, verified route-by-route and flow-by-flow against the original before any benchmark ran. Only the backend changed: the Node/RSC runtime became a single compiled Rust binary.

Everything below is measured, reproducible from [`benchmarks/`](https://github.com/drewhirschi/nextrs/tree/main/benchmarks), and reported with its caveats. The conversion itself is documented down to per-slice timings in [`docs/hhh-migration-timelog.md`](https://github.com/drewhirschi/nextrs/blob/main/docs/hhh-migration-timelog.md).

## The real app, head to head

Local, matched profiles (release Rust vs production `next build`), same machine, same Postgres, `hey` with 50 concurrent connections:

| Metric | nextrs | Next.js | gap |
|---|---|---|---|
| **Page `/` (public landing)** | 340,589 req/s | 652 req/s | **~522×** |
| **Page `/app` (authed: cookie HMAC + session row + user query)** | 38,351 req/s | 389 req/s | **~99×** |
| `/app` latency p50 / p99 | 1.3 / 1.8 ms | 123 / 206 ms | ~95× |
| **Memory (RSS, serving)** | 91.7 MB | 235.8 MB | **~2.6×** |

The authed row is the one to stare at: both sides validate the session cookie and hit the same Postgres on every request. That's not a static-file trick — it's the per-request cost of the framework runtime, and it's two orders of magnitude.
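The gap column is plain division of the two measurements. As a quick check, this sketch recomputes it from the raw numbers in the table above; the helper names are ours and are not part of the benchmark scripts.

```rust
/// How many times larger `nextrs_side` is than `next_side`
/// (or the reverse, for latency and memory where smaller is better).
fn gap(larger: f64, smaller: f64) -> f64 {
    larger / smaller
}

/// Round to one decimal place for display.
fn round1(x: f64) -> f64 {
    (x * 10.0).round() / 10.0
}

/// The gap column recomputed from the table's raw numbers.
fn table_gaps() -> [(&'static str, f64); 4] {
    [
        ("public page throughput (req/s)", gap(340_589.0, 652.0)),
        ("authed page throughput (req/s)", gap(38_351.0, 389.0)),
        ("authed p50 latency (ms)", gap(123.0, 1.3)),
        ("serving memory (MB RSS)", gap(235.8, 91.7)),
    ]
}
```

Rounded, the four ratios come out to 522, 99, 95, and 2.6, matching the table.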

The minimal-app numbers (same todos app, both client-rendered, in-memory store) are the ceiling: **~423×** page throughput, **~132×** API throughput, **~43×** memory (5.7 MB vs 247 MB). Details in [`benchmarks/results/results.md`](https://github.com/drewhirschi/nextrs/blob/main/benchmarks/results/results.md).

## Why it's this lopsided

A nextrs request is a **compiled Rust function** — the handler runs in well under a millisecond, with no per-request runtime to spin up. A Next.js request, even for a client-rendered page, runs through the **Node + React Server Components pipeline** every time: serialize the flight payload, resolve the dynamic import, run the framework's request machinery. The per-request cost is the *runtime*, not the rendering — which is why the gap holds even when both pages render in the browser.

## Cold starts: latency *and* frequency

Vercel exposes no cold/warm signal, so both apps self-report (`x-cold`, `x-instance` headers) and we count instances directly. Same region, both apps loaded **simultaneously**.

**Latency:** this is where app size decides everything. On the minimal app the gap is modest: cold p50 **648 ms vs 830 ms**, a ~180 ms difference that comes down to booting the Node runtime versus loading a static binary. On the **real app**, that boot cost explodes with the dependency tree:

| Cold start, real app (`iad1`) | nextrs | Next.js |
|---|---|---|
| cold p50 | **215 ms** | **4,323 ms** |
| cold p95 | 582 ms | 4,812 ms |
| warm p50 | 209 ms | 342 ms |

nextrs's cold start is statistically indistinguishable from its warm requests (215 ms vs 209 ms at p50): loading the binary costs nothing your users can see. Next.js's cold start grew ~5× from the todo app (830 ms) to **4.3 seconds**, because every cold instance re-boots the framework plus the app's module graph. One line grows with your app; the other doesn't.

**Frequency:** how often users actually *hit* a cold start. At low concurrency it's a tie, and we say so: Vercel scales per concurrent connection regardless of framework. Under 150-way sustained load on the **real app**, **Next.js needed 100 instances (89 cold boots); nextrs served the same load on 43 instances (32 cold boots)**. That is 57% fewer instances and half the cold starts per request, and instance-time is what Fluid compute bills. The two effects compound: Next.js's cold starts are both ~2× more frequent *and* ~20× more expensive, which is why its p95 TTFB under that load was 5.5 s while nextrs's p50 sat at ~200 ms.

## The conversion is real — and repeatable

The real-app comparison only counts because the two frontends are identical. The conversion that got us there is codified in an agent-followable guide ([`docs/migrating-nextjs-to-nextrs.md`](https://github.com/drewhirschi/nextrs/blob/main/docs/migrating-nextjs-to-nextrs.md)): server actions become same-signature fetch shims (call sites unchanged), server-component pages become seeded client pages, and even better-auth moved into the binary — a native Rust implementation of its wire protocol (scrypt, signed session cookies, Google OAuth), oracle-diffed 48/48 against the real thing and locked in by 111 tests, with the unchanged better-auth React client none the wiser. The whole conversion was verified route-by-route, three roles, money flows step-by-step, plus a byte-level wire audit that caught two serialization drifts before they could ship.

The deployed nextrs app is **one Rust binary and a folder of static files**. No Node runtime anywhere.

Scaffold to fully-verified conversion: **~4.5 hours wall clock**, mostly parallel agents. The timelog has every slice.

## Reading it honestly

- **Warm latency over the network is a tie.** ~260 ms round-trips bury a sub-millisecond handler. nextrs wins throughput, memory, cold start, and instance count — not warm wall-clock latency.
- **nextrs's memory advantage shrinks as the app grows** — 43× on the todo app, 2.6× on the real one (5.7 → 92 MB; the sqlx pool and a 31 MB binary are real). Node's footprint barely moved (247 → 236 MB): it's dominated by the runtime floor, nextrs's by what your app actually uses.
- **Throughput numbers are floors.** At 340k req/s the load generator is the bottleneck, not the server.
- **This isn't "Next.js is bad."** Next.js ships HMR, a vast ecosystem, RSC streaming, image optimization — far more than these apps exercise. The claim is narrow: for the same user-visible app, nextrs serves it with a fraction of the per-request cost, memory, and cold-start exposure.

## Reproduce it

```sh
# Minimal app: throughput + memory (local)
benchmarks/scripts/bench-local.sh
# Real app: throughput + memory (local, DB-backed)
benchmarks/scripts/bench-hhh-local.sh
# Cold start latency + frequency (against deployed URLs)
benchmarks/scripts/bench-cold.sh      https://your-app.vercel.app/api/health
benchmarks/scripts/bench-cold-freq.sh https://your-app.vercel.app/api/health 300 40
```

Fairness controls — matched build profiles, both pages client-rendered, per-request fresh data, same-region simultaneous cold-start runs — are documented in [`benchmarks/methodology.md`](https://github.com/drewhirschi/nextrs/blob/main/benchmarks/methodology.md).

---

# Getting Started

> Set up a nextrs app: the app/ tree, build-time codegen, and the dev loop

nextrs is a Next.js-style routing framework for Rust. You write convention files (`page.rs`, `layout.rs`, `loading.html`, `middleware.rs`, `route.rs`) in an `app/` directory; a build step discovers them and wires the router. No client-side framework — pages are server-rendered HTML, streamed when a route has a loading state.

## The pieces

A nextrs app is a normal Rust binary crate plus three things:

```
mysite/
├── Cargo.toml          # depends on nextrs; build-dep on nextrs with "build" feature
├── build.rs            # one call: emit_registry
├── askama.toml         # points Askama at app/ for templates
├── app/                # your routes (the convention tree)
│   ├── layout.rs       # root layout (+ layout.html Askama template)
│   ├── page.rs         # /
│   └── hello/
│       └── page.html   # /hello — static HTML needs no Rust at all
├── public/             # static assets, served at the root URL path
└── src/
    └── main.rs         # ~15 lines: include the registry, serve it
```

`Cargo.toml`:

```toml
[dependencies]
nextrs = "0.1"
axum = "0.8"
http = "1"    # pages and middleware name `http::Request` directly
tokio = { version = "1", features = ["full"] }
askama = "0.15"

[build-dependencies]
nextrs = { version = "0.1", features = ["build"] }
```

`build.rs`:

```rust
fn main() {
    nextrs::build::emit_registry("app", "src/main.rs", "nextrs_routes.rs")
        .expect("nextrs::build::emit_registry failed");
}
```

`emit_registry` scans `app/` and writes a `generated_registry()` function into `$OUT_DIR/nextrs_routes.rs`. It also tells cargo to rerun whenever anything under `app/` changes, so adding a file is enough, with no manual wiring. (A copy of the generated code is dumped to `target/nextrs/` if you want to read it.)
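For readers new to build scripts: Cargo reruns `build.rs` when the script prints a `cargo:rerun-if-changed=<path>` line, and a directory path covers everything inside it. The sketch below shows only that standard Cargo mechanism. The helper name is hypothetical, and how `emit_registry` does this internally is not shown here.

```rust
/// Build-script directive asking Cargo to rerun `build.rs` when `path` changes.
/// If `path` is a directory, Cargo watches the whole directory for modifications.
fn rerun_if_changed(path: &str) -> String {
    format!("cargo:rerun-if-changed={path}")
}
```

A hand-written build script would emit it with `println!("{}", rerun_if_changed("app"));`. With nextrs you don't need to, since `emit_registry` already handles it.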

`src/main.rs`:

```rust
include!(concat!(env!("OUT_DIR"), "/nextrs_routes.rs"));

#[tokio::main]
async fn main() {
    let app = nextrs::router::build_router_with_public(
        generated_registry(),
        concat!(env!("CARGO_MANIFEST_DIR"), "/public"),
    );
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}
```

You own `main.rs` — pick the address, attach tower layers (the demo site adds `tower-livereload` in debug builds), read env vars. The framework only owns the router.

`askama.toml`:

```toml
[general]
dirs = ["app"]
```

## Your first page

`app/page.rs` plus an Askama template `app/page.html`:

```rust
use askama::Template;

#[derive(Template)]
#[template(path = "page.html")]
pub struct HomePage;

pub async fn render(_req: http::Request<axum::body::Body>) -> String {
    HomePage.render().unwrap()
}
```

Pages are async functions from a request to an HTML string. They can await anything — database calls, upstream APIs — and read headers, the URI, and extensions set by middleware from the request. If a page doesn't need Rust at all, skip the `.rs` file and write just `page.html`; the build step serves it statically.

Run it:

```bash
cargo run
# serves on 0.0.0.0:3000 (the main.rs above prints nothing on startup)
```

## The dev loop

The repo ships a file watcher that restarts the server when anything relevant changes (source, templates, content, assets):

```bash
cargo run --bin nextrs-dev
```

It polls for changes, debounces, SIGTERMs the server cleanly, and restarts. Combined with `tower-livereload`, the browser refreshes itself after the rebuild.
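The debounce step is the part that keeps a burst of saves from triggering a burst of restarts. A minimal sketch of that logic, assuming a poller that reports change times; the `Debouncer` type and its quiet period are illustrative and are not taken from the `nextrs-dev` source.

```rust
use std::time::{Duration, Instant};

/// Tracks the most recent change and reports when the tree has been quiet
/// long enough to restart. Hypothetical helper for illustration.
struct Debouncer {
    quiet: Duration,
    last_change: Option<Instant>,
}

impl Debouncer {
    fn new(quiet: Duration) -> Self {
        Self { quiet, last_change: None }
    }

    /// Record that the poller saw a change at `now`.
    fn record_change(&mut self, now: Instant) {
        self.last_change = Some(now);
    }

    /// Returns true exactly once per burst, after `quiet` has elapsed
    /// since the last recorded change.
    fn should_restart(&mut self, now: Instant) -> bool {
        match self.last_change {
            Some(t) if now.duration_since(t) >= self.quiet => {
                self.last_change = None; // consume the pending restart
                true
            }
            _ => false,
        }
    }
}
```

Each new change pushes the restart back, so ten quick saves produce one restart once the edits stop.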

## Where to go next

- [Routing Conventions](/docs/conventions) — every file type the framework understands.
- [Streaming](/docs/streaming) — how `loading` slots stream the shell before the page resolves.
- [Deploy to Vercel](/docs/deploy-vercel) or [Deploy with Docker](/docs/deploy-docker).

---

# Routing Conventions

> page, layout, loading, middleware, and route files — what each does and how they compose

Every directory under `app/` is a URL segment. Five file names have meaning inside a segment:

| File | Role | Signature |
|---|---|---|
| `page.{rs,html}` | The content for this URL | `pub async fn render(Request<Body>) -> String` |
| `layout.{rs,html}` | Wraps this segment's children (and nested routes) | `pub fn render(children: &str) -> String` |
| `loading.{rs,html}` | Skeleton streamed while the page computes | `pub fn render() -> String` |
| `middleware.rs` | Guard that runs before anything renders | `pub async fn handle(Request<Body>) -> MiddlewareResult` |
| `route.rs` | API handlers (JSON etc.) | `pub async fn get/post/put/patch/delete/...` |

For `page`, `layout`, and `loading`, both `.rs` and `.html` are accepted; if both exist, **`.rs` wins**. An `.html` file is served as-is (for layouts, `{{ children }}` is substituted literally) — zero Rust required for static segments.
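The precedence rule and the literal substitution can be modeled in a few lines. This is a sketch of the behavior described above, not the build step's code: both function names are hypothetical, and it assumes the placeholder is matched as the exact text `{{ children }}`.

```rust
/// Pick the file that fills `role` ("page", "layout", or "loading") in a
/// segment, preferring `.rs` over `.html` when both exist.
fn resolve_slot<'a>(files: &[&'a str], role: &str) -> Option<&'a str> {
    let rs = format!("{role}.rs");
    let html = format!("{role}.html");
    files
        .iter()
        .copied()
        .find(|f| *f == rs)
        .or_else(|| files.iter().copied().find(|f| *f == html))
}

/// Literal substitution as used for static `.html` layouts:
/// no escaping, no template engine.
fn fill_static_layout(layout_html: &str, children: &str) -> String {
    layout_html.replace("{{ children }}", children)
}
```

With both `page.html` and `page.rs` in a segment, `resolve_slot` returns `page.rs`; with only `layout.html`, it returns that file.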

## Pages

```rust
use askama::Template;

#[derive(Template)]
#[template(path = "users/page.html")]
pub struct UsersPage { pub names: Vec<String> }

pub async fn render(_req: http::Request<axum::body::Body>) -> String {
    // `fetch_users` stands in for your own async data access (database, upstream API).
    let names = fetch_users().await;
    UsersPage { names }.render().unwrap()
}
```

Pages receive the full request: headers, URI, and any extensions middleware inserted. They return the rendered HTML string; the framework wraps it in the layout chain and the HTTP response.

## Layouts

Layouts nest: a request to `/a/b` renders `app/layout` around `app/a/layout` around `app/a/b/page`, root to leaf.

```rust
use askama::Template;

#[derive(Template)]
#[template(path = "layout.html")]
pub struct RootLayout<'a> { pub children: &'a str }

pub fn render(children: &str) -> String {
    RootLayout { children }.render().unwrap()
}
```

**Askama layouts must use `{{ children|safe }}`.** Without `|safe`, Askama HTML-escapes the children — which breaks both your page markup and the framework's internal content marker (see [Streaming](/docs/streaming) for why that marker exists). This is the most common first-run mistake.
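To make the nesting order concrete, here is a small model of root-to-leaf composition using plain functions with the same `render(children: &str) -> String` shape. The two layouts and `wrap_in_layouts` are illustrative; the framework's own composition code is not shown here.

```rust
/// A layout is modeled as a function from rendered children to HTML.
type Layout = fn(&str) -> String;

fn root_layout(children: &str) -> String {
    format!("<html><body>{children}</body></html>")
}

fn section_layout(children: &str) -> String {
    format!("<section>{children}</section>")
}

/// Wrap `page` in `layouts`, given in root-to-leaf order.
/// The leaf layout is applied first so the root ends up outermost.
fn wrap_in_layouts(layouts: &[Layout], page: &str) -> String {
    layouts
        .iter()
        .rev()
        .fold(page.to_string(), |inner, layout| layout(&inner))
}
```

Calling `wrap_in_layouts(&[root_layout, section_layout], "<h1>b</h1>")` yields the page inside `<section>`, inside `<body>`, which is the `/a/b` ordering described above.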

## Loading

A `loading.{rs,html}` file opts the route into streaming: the loading skeleton is sent immediately, the page handler runs concurrently, and the resolved page is swapped in on the same response. Routes without a loading slot return one synchronous response. Details in [Streaming](/docs/streaming).

## Middleware

`middleware.rs` files compose root-to-leaf along the matched path and run **before** layouts, loading, pages, and API handlers:

```rust
use axum::body::Body;
use http::Request;
use nextrs::conventions::MiddlewareResult;

pub async fn handle(mut req: Request<Body>) -> MiddlewareResult {
    // `authenticate` is your own async lookup; it returns `Some(user)` for a valid session.
    let Some(user) = authenticate(&req).await else {
        return MiddlewareResult::response((
            http::StatusCode::SEE_OTHER,
            [("location", "/login")],
        ));
    };
    req.extensions_mut().insert(user);
    MiddlewareResult::next(req)
}
```

`MiddlewareResult::next(req)` continues (pass the request along — you may have mutated it); `MiddlewareResult::response(...)` short-circuits with a real HTTP response. Because middleware runs before the loading shell is sent, redirects and auth failures get correct status codes and headers even on streaming routes. Downstream pages read what middleware inserted via `req.extensions().get::<User>()`.
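The root-to-leaf composition with short-circuiting can be sketched with simplified stand-in types. `Req`, `Outcome`, and `run_chain` below are illustrative models, not the `nextrs` or `http` definitions, and the real handlers are async.

```rust
/// Simplified request: just the bits this example needs.
#[derive(Debug, Clone, PartialEq)]
struct Req {
    cookie: Option<String>,
    user: Option<String>,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    /// Continue with the (possibly mutated) request.
    Next(Req),
    /// Stop here and answer with this status and Location header.
    Redirect(u16, &'static str),
}

type Middleware = fn(Req) -> Outcome;

/// Run middleware in root-to-leaf order; the first short-circuit wins.
fn run_chain(chain: &[Middleware], mut req: Req) -> Outcome {
    for mw in chain {
        match mw(req) {
            Outcome::Next(next) => req = next,
            stop => return stop,
        }
    }
    Outcome::Next(req)
}

/// Root middleware: derive a user from the cookie, if any.
fn attach_user(mut req: Req) -> Outcome {
    req.user = req.cookie.clone();
    Outcome::Next(req)
}

/// Leaf middleware: redirect anonymous requests with 303 See Other.
fn require_user(req: Req) -> Outcome {
    if req.user.is_some() {
        Outcome::Next(req)
    } else {
        Outcome::Redirect(303, "/login")
    }
}
```

An anonymous request through `[attach_user, require_user]` stops at the redirect; a request with a cookie reaches the page carrying the mutated request.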

## API routes

`route.rs` exports one public async function per HTTP method. Handlers are ordinary Axum handlers — extractors in, `impl IntoResponse` out:

```rust
use axum::Json;
use serde::Serialize;

#[derive(Serialize)]
pub struct Pong { pub message: String }

pub async fn get() -> Json<Pong> {
    Json(Pong { message: "pong".into() })
}
```

The build step detects which methods a `route.rs` exports by name. A segment can have both `page.rs` and `route.rs` — the page owns GET, the route file handles the rest. **Exporting `get()` from a `route.rs` next to a page is a compile error** (the build emits `compile_error!` with the conflicting path), so the conflict can't ship.
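As a rough model of that check: scan the source for exported method names, then reject `get` when a page sits in the same segment. This is a naive textual scan for illustration only. The real build step's detection, its full method list, and its error text are not shown here, and both function names are hypothetical.

```rust
/// Find which HTTP-method handlers a `route.rs` source exports by looking
/// for `pub async fn <method>(`. Covers a subset of methods for brevity.
fn exported_methods(source: &str) -> Vec<&'static str> {
    const METHODS: [&str; 5] = ["get", "post", "put", "patch", "delete"];
    METHODS
        .iter()
        .copied()
        .filter(|m| source.contains(&format!("pub async fn {m}(")))
        .collect()
}

/// A page owns GET, so a sibling `route.rs` must not export `get`.
fn check_segment(has_page: bool, route_source: &str) -> Result<Vec<&'static str>, String> {
    let methods = exported_methods(route_source);
    if has_page && methods.contains(&"get") {
        return Err("page and route.rs both handle GET".to_string());
    }
    Ok(methods)
}
```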

To generate a typesafe TypeScript client from your `route.rs` handlers, see [Typesafe Client Generation](/docs/typesafe-client).

## Dynamic segments

A directory named `[param]` matches one path segment:

```
app/users/[id]/page.rs   →  /users/{id}
```

Inside the handler, extract the parameter with Axum's `Path` extractor:

```rust
use axum::extract::Path;
use axum::RequestPartsExt;

pub async fn render(req: http::Request<axum::body::Body>) -> String {
    let (mut parts, _body) = req.into_parts();
    let Path(id): Path<String> = parts.extract().await.unwrap();
    // `id` is user-controlled: HTML-escape it before interpolating in real code.
    format!("<h1>user {}</h1>", id)
}
```
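The directory-to-URL mapping is mechanical enough to sketch. The function below is a hypothetical model of it, assuming paths are given relative to `app/` with `/` separators; it is not the framework's discovery code.

```rust
/// Map a convention file path (relative to `app/`) to a route pattern.
/// `[name]` directories become `{name}` captures.
fn route_pattern(relative_file: &str) -> String {
    let mut segments: Vec<&str> = relative_file.split('/').collect();
    segments.pop(); // drop the file name (page.rs, route.rs, ...)
    let mapped: Vec<String> = segments
        .iter()
        .map(|seg| match seg.strip_prefix('[').and_then(|s| s.strip_suffix(']')) {
            Some(name) => format!("{{{name}}}"),
            None => seg.to_string(),
        })
        .collect();
    format!("/{}", mapped.join("/"))
}
```

Under these assumptions `users/[id]/page.rs` maps to `/users/{id}`, and a top-level `page.rs` maps to `/`.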

## Static assets

Files in `public/` (sibling of `app/`) are served at the root URL path: `public/style.css` → `/style.css`. Locally they're a router fallback (routes win over files); on Vercel the CDN serves them before the function is invoked (files win over routes). As long as no route and file share a name, the asymmetry never matters.
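The asymmetry fits in one small decision table. The types below are illustrative, written only to show that the two hosts disagree in exactly one case.

```rust
#[derive(Debug, PartialEq)]
enum Served {
    Route,
    File,
    NotFound,
}

#[derive(Clone, Copy)]
enum Host {
    Local,
    Vercel,
}

/// Which source answers a URL, given what exists under that name.
fn resolve(host: Host, route_exists: bool, file_exists: bool) -> Served {
    match (host, route_exists, file_exists) {
        (_, false, false) => Served::NotFound,
        (_, true, false) => Served::Route,
        (_, false, true) => Served::File,
        // Only a name collision exposes the difference between hosts.
        (Host::Local, true, true) => Served::Route,
        (Host::Vercel, true, true) => Served::File,
    }
}
```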

---

# Streaming

> How loading slots stream the shell before the page resolves — and how to verify it

Streaming is the framework's central UX feature. When a route has a `loading.{rs,html}` slot, the server sends the loading shell to the browser **before** the page handler has finished computing — then sends the page content as a second chunk on the same response, swapping the shell out with a tiny inline script. One HTTP request, no client-side framework, no htmx.

## The model

A request to a route with a `loading` slot produces a chunked response shaped like this:

```
[layout-open]
<div id="__nx_slot__">
  …loading content…
</div>
                                    ← server awaits the page handler here
                                      (could be 100ms, could be 2s)
<template id="__nx_page__">
  …page content…
</template>
<script>
  // ~200 bytes inline
  var s = document.getElementById('__nx_slot__');
  var t = document.getElementById('__nx_page__');
  if (s && t) { s.replaceWith(t.content); t.remove(); }
</script>
[layout-close]
```

The browser parses incrementally as bytes arrive: the user sees the loading shell as soon as it paints (typically under 300ms TTFB). When the page handler resolves, its content arrives inside a `<template>` (parsed but not rendered), and the swap script replaces the slot with it.

Routes **without** a `loading` slot skip the streaming machinery and return one synchronous response.

## How the layout splits

The layout's closing half (`</body></html>`) has to arrive *after* the page swap. The framework composes the layout chain around an internal sentinel comment, then splits the rendered shell on it into `(before, after)` halves. The streamed order is `before + loading slot + (await page) + page template + swap script + after`.

This is why **Askama layouts must use `{{ children|safe }}`**: with plain `{{ children }}`, Askama escapes the sentinel, the split fails to find it, and your page renders outside the layout. (Static `.html` layouts do literal substitution and aren't affected.)
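A small model shows why escaping breaks the split. The sentinel string here is made up, since the real marker is internal to the framework, and `escape_html` is a minimal stand-in for what a template engine does to values not marked safe.

```rust
/// Hypothetical sentinel comment used only in this sketch.
const SENTINEL: &str = "<!--nx-children-->";

/// Split a rendered layout shell into the halves sent before and after the page.
fn split_shell(shell: &str) -> Option<(&str, &str)> {
    shell.split_once(SENTINEL)
}

/// Minimal HTML escaping of the three characters that matter here.
fn escape_html(s: &str) -> String {
    s.replace('&', "&amp;").replace('<', "&lt;").replace('>', "&gt;")
}

/// The streamed order described above, as a list of chunks.
fn stream_order(before: &str, loading: &str, page: &str, after: &str) -> Vec<String> {
    vec![
        before.to_string(),
        format!("<div id=\"__nx_slot__\">{loading}</div>"),
        // The server awaits the page handler between these two chunks.
        format!("<template id=\"__nx_page__\">{page}</template>"),
        "<script>/* swap script */</script>".to_string(),
        after.to_string(),
    ]
}
```

A shell rendered with the sentinel intact splits into `(before, after)`. The same shell with the sentinel escaped to `&lt;!--nx-children--&gt;` returns `None`, which is the failure mode a missing `|safe` produces.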

## Middleware runs first

All matching `middleware.rs` handlers run root-to-leaf **before** the loading shell is sent — once the first chunk ships, the status and headers are committed. That ordering means auth checks and redirects in middleware return real HTTP status codes even on streaming routes. Put fast request guards in middleware; put slow data work in the page and let the loading shell cover it.

## Verifying streaming works

The smoke test that catches buffering anywhere in the stack:

```bash
curl -o /dev/null -w "TTFB=%{time_starttransfer}s total=%{time_total}s\n" \
  http://localhost:3000/with-loading
```

If `TTFB ≈ total`, streaming is broken (or the route has no loading slot). If `TTFB << total` and the gap matches the page's work time, it's streaming.
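If you want to automate that check in CI, the rule reduces to parsing the `curl` line and comparing the two times. A sketch under stated assumptions: the line format is the one produced by the `-w` string above, and `min_gap_secs` is a tolerance you choose yourself, not a framework constant.

```rust
#[derive(Debug, PartialEq)]
enum Verdict {
    Streaming,
    Buffered,
}

/// Parse a line like `TTFB=0.005123s total=0.841000s` into (ttfb, total) seconds.
fn parse_line(line: &str) -> Option<(f64, f64)> {
    let mut ttfb: Option<f64> = None;
    let mut total: Option<f64> = None;
    for field in line.split_whitespace() {
        if let Some(v) = field.strip_prefix("TTFB=") {
            ttfb = v.trim_end_matches('s').parse().ok();
        } else if let Some(v) = field.strip_prefix("total=") {
            total = v.trim_end_matches('s').parse().ok();
        }
    }
    Some((ttfb?, total?))
}

/// Streaming if the response finished at least `min_gap_secs` after the first byte.
fn classify(ttfb_secs: f64, total_secs: f64, min_gap_secs: f64) -> Verdict {
    if total_secs - ttfb_secs >= min_gap_secs {
        Verdict::Streaming
    } else {
        Verdict::Buffered
    }
}
```

Pick `min_gap_secs` somewhat below the page's expected work time, so ordinary network jitter on a buffered response does not read as streaming.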

To see the individual chunks:

```bash
curl --no-buffer --trace-time --trace - http://localhost:3000/with-loading 2>&1 \
  | grep "<= Recv data"
```

Two or more `Recv data` events, separated by roughly the page handler's duration, mean it's working. A real deploy of the demo's `/with-loading` route (800ms simulated work) shows the first frame at T+0.000s and the page frame at T+0.84s.

## Deploy targets

Locally, axum's `Body::from_stream` streams over chunked transfer encoding with no extra setup. On Vercel, the stock adapter buffers `text/html` responses — the framework ships a drop-in fix. See [Deploy to Vercel](/docs/deploy-vercel#streaming-through-the-vercel-adapter).

## Current limits

- **One swap per route.** No Suspense-style nested boundaries (yet) — one loading slot, one page swap.
- **No error frames.** If the page handler panics after the shell shipped, the browser keeps the loading state. An `error.{rs,html}` convention is on the roadmap.

---

# Typesafe Client Generation

> Generate a typed TypeScript / React Query client from your route.rs handlers

nextrs can generate a fully-typed TypeScript client — TanStack (React) Query hooks with typed request and response shapes — directly from your `route.rs` handlers. Rename a field in Rust and the TypeScript call sites stop compiling. The pipeline is OpenAPI-based:

```
route.rs (#[nextrs::api])  ─codegen→  generated_openapi()
        │                                     │
        │                       cargo run --bin dump-openapi
        ▼                                     ▼
   served at /openapi.json            client/openapi.json
                                              │
                                            orval
                                              ▼
                            src/generated/**  (hooks + types)
```

## Annotate a handler

Handlers stay ordinary Axum handlers — typed extractors in, concrete return types out. Add `#[nextrs::api]` to the ones you want in the client:

```rust
use axum::Json;
use serde::{Deserialize, Serialize};
use utoipa::ToSchema;

#[derive(Serialize, Deserialize, ToSchema)]
pub struct PingResponse {
    pub message: String,
    pub pong: bool,
}

#[derive(Serialize, Deserialize, ToSchema)]
pub struct PingRequest {
    pub message: String,
}

#[nextrs::api(
    get,
    responses((status = 200, description = "Pong", body = PingResponse)),
)]
pub async fn get() -> Json<PingResponse> {
    Json(PingResponse { message: "pong".into(), pong: true })
}

#[nextrs::api(
    post,
    operation_id = "sendPing",
    responses((status = 200, description = "Echoes the message", body = PingResponse)),
)]
pub async fn post(Json(req): Json<PingRequest>) -> Json<PingResponse> {
    Json(PingResponse { message: req.message, pong: true })
}
```

`#[nextrs::api]` is a thin wrapper over `#[utoipa::path]` that derives the URL from the file's location (`app/api/ping/route.rs` → `/api/ping`), so the path is never restated and can't drift from the file convention. You write the method, `responses(...)` (response types aren't inferred from the return type), and optionally `operation_id` / `tag` for nicer hook names. The request body **is** inferred from the `Json<T>` extractor.

Annotation is **opt-in per handler**: an un-annotated handler still routes and serves normally — it just doesn't appear in the spec or the generated client.

## The spec

The same build-time discovery that wires your routes collects the annotated handlers into a `generated_openapi()` function. The app serves the document at `/openapi.json`, and a `dump-openapi` binary writes the identical spec to `client/openapi.json` so the client can be generated offline.

## Generate the client

The client directory holds the orval config and the committed generated output:

```bash
cd site/client
npm install      # first time only
npm run gen      # dump openapi.json from Rust, then run orval
npm run typecheck
```

Both `openapi.json` and `src/generated/**` are committed, so contract changes show up in the diff. Rerun `npm run gen` whenever an annotated `route.rs` changes.

## Use the hooks

Each annotated handler becomes a hook named from its `operation_id` — GETs become query hooks, anything with a body becomes a mutation hook:

```tsx
import { QueryClient, QueryClientProvider } from "@tanstack/react-query";
import { useGetApiPing, useSendPing } from "@site/client";

function Ping() {
  const { data } = useGetApiPing();          // GET  /api/ping → typed PingResponse
  const send = useSendPing();                // POST /api/ping → typed PingRequest in

  return (
    <button onClick={() => send.mutate({ data: { message: "hi" } })}>
      {data?.data.message ?? "…"}
    </button>
  );
}

const queryClient = new QueryClient();
export const App = () => (
  <QueryClientProvider client={queryClient}>
    <Ping />
  </QueryClientProvider>
);
```

The generated client uses the platform `fetch` (no HTTP-library dependency) and same-origin URLs — the nextrs app serves both the pages and the API, so there's no CORS story to manage.

## Why OpenAPI

Direct Rust→TS type generation (`ts-rs`, `specta`) only produces *types* — you'd still hand-write the fetch layer and hooks. Going through OpenAPI lets orval generate the entire client (hooks, types, fetchers), keeps the door open to Swagger UI and non-TypeScript consumers, and the file-convention discovery removes utoipa's usual hand-maintained path list.

---

# Deploy to Vercel

> Single Rust binary on Vercel functions, static assets on the CDN, streaming preserved

A nextrs app deploys to Vercel as **one Rust binary** behind a catch-all rewrite. Vercel's Fluid compute runs the function and supports HTTP response streaming, so the loading→page swap works in production. This is the officially supported way to host an Axum app on Vercel — nextrs just needs one adapter (below) to keep HTML streaming intact.

## Project layout

Vercel auto-detects `Cargo.toml` and builds Rust functions from `api/`. Three pieces:

**1. A `[[bin]]` entry pointing at `api/index.rs`:**

```toml
[[bin]]
name = "index"
path = "api/index.rs"

[dependencies]
nextrs = { version = "0.1", features = ["vercel"] }
vercel_runtime = { version = "2", features = ["axum"] }
```

**2. `api/index.rs` — the entire function:**

```rust
use nextrs::vercel::StreamingVercelLayer;
use tower::ServiceBuilder;
use vercel_runtime::Error;

include!(concat!(env!("OUT_DIR"), "/nextrs_routes.rs"));

#[tokio::main]
async fn main() -> Result<(), Error> {
    let router = nextrs::router::build_router(generated_registry());
    let app = ServiceBuilder::new()
        .layer(StreamingVercelLayer::new())
        .service(router);
    vercel_runtime::run(app).await
}
```

The registry is the same one your local `main.rs` consumes — generated by `build.rs` from the `app/` tree, so both entry points always serve identical routes.

**3. `vercel.json` — route everything to the function:**

```json
{
  "rewrites": [{ "source": "/(.*)", "destination": "/api/index" }]
}
```

Dynamic segments need no Vercel-side configuration — the catch-all passes the full path through and Axum matches it.

Deploy with `vercel deploy`.

## Streaming through the Vercel adapter

The stock `vercel_runtime::axum::VercelLayer` only streams responses whose content-type is `text/event-stream` or `application/json`. nextrs streams `text/html`, so the stock layer silently buffers the whole response — TTFB equals total time and the loading shell is pointless.

`nextrs::vercel::StreamingVercelLayer` (behind the `vercel` cargo feature) is a drop-in replacement that streams unconditionally. Non-streaming responses are unaffected. If you ever see `TTFB ≈ total` on a deployed streaming route, check that you're using it.
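The difference between the two layers comes down to one predicate. The functions below model the rules as this page describes them. They are not the `vercel_runtime` or `nextrs` source, and the parameter handling (`; charset=...`, case-insensitivity) is our assumption.

```rust
/// Media type without parameters, lowercased:
/// "text/html; charset=utf-8" becomes "text/html".
fn media_type(content_type: &str) -> String {
    content_type
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase()
}

/// Models the stock layer's rule: stream only these two content types.
fn stock_layer_streams(content_type: &str) -> bool {
    matches!(
        media_type(content_type).as_str(),
        "text/event-stream" | "application/json"
    )
}

/// Models the replacement layer: stream regardless of content type.
fn streaming_layer_streams(_content_type: &str) -> bool {
    true
}
```

Under this model a `text/html; charset=utf-8` response is buffered by the stock rule and streamed by the replacement, which is the whole reason the adapter exists.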

## Static assets on the CDN

Vercel serves files from a root-level `public/` directory at root URL paths **before** applying rewrites, with edge caching. Since your assets live next to your app at `site/public/`, mirror them at build time from the workspace root's `build.rs`:

```rust
nextrs::build::sync_public_dir("site/public", "public")
    .expect("sync_public_dir failed");
```

The root `public/` is a generated artifact — gitignore it. Deployed assets come back with `x-vercel-cache: HIT` at ~145ms; the function never sees those requests.
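If you're curious what mirroring a directory involves, a plain `std` version is short. This is a sketch of the general technique, not the body of `sync_public_dir`: `mirror_dir` is a hypothetical name, and unlike a true sync it does not delete stale files in the destination.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively copy `src` into `dst`, creating directories as needed.
/// Returns the number of files copied.
fn mirror_dir(src: &Path, dst: &Path) -> io::Result<u64> {
    fs::create_dir_all(dst)?;
    let mut copied = 0;
    for entry in fs::read_dir(src)? {
        let entry = entry?;
        let target = dst.join(entry.file_name());
        if entry.file_type()?.is_dir() {
            // Descend into subdirectories and keep the same relative layout.
            copied += mirror_dir(&entry.path(), &target)?;
        } else {
            fs::copy(entry.path(), &target)?;
            copied += 1;
        }
    }
    Ok(copied)
}
```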

## What to expect on latency

Measured on a real deploy (warm, p50): non-streaming pages ~220–250ms TTFB; streaming routes show the shell at ~230ms with the page following whenever its data resolves; CDN-cached assets ~145ms. Cold starts add roughly 250–330ms, paid once per warm cycle — Fluid compute keeps Rust functions warm aggressively.

## Verify after deploying

```bash
curl -o /dev/null -w "TTFB=%{time_starttransfer}s total=%{time_total}s\n" \
  https://your-deployment.vercel.app/with-loading
```

`TTFB << total` means streaming survived the trip. (Preview URLs behind Vercel SSO need an `x-vercel-protection-bypass` header.) More verification recipes in [Streaming](/docs/streaming#verifying-streaming-works).

---

# Deploy with Docker

> Run a nextrs app on any container host — Fly.io, Railway, ECS, or a VPS

A nextrs app is a plain Axum binary, so serverful deployment is the boring kind: build a release binary, ship it with the `public/` directory, run it behind a reverse proxy. A container works on any host — Fly.io, Railway, Render, ECS, Cloud Run, or a VPS with Docker installed.

## The Dockerfile

A standard two-stage build (the repo ships this at the workspace root):

```dockerfile
FROM rust:1-bookworm AS builder
WORKDIR /build
COPY . .
RUN cargo build --release -p site

FROM debian:bookworm-slim
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY --from=builder /build/target/release/site /app/site
COPY site/public /app/public
ENV NEXTRS_PUBLIC_DIR=/app/public
EXPOSE 3000
CMD ["/app/site"]
```

One detail worth knowing: **`NEXTRS_PUBLIC_DIR` points the binary at the shipped assets.** The default asset path is compiled in via `CARGO_MANIFEST_DIR`, which only exists on the build machine. The env var overrides it at runtime — set it anywhere the binary runs away from its source tree.
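The override pattern looks roughly like this in your own `main.rs`. It is a sketch, assuming you resolve the directory yourself before building the router. `resolve_public_dir` is a hypothetical helper, and treating an empty value as unset is our choice, not documented framework behavior.

```rust
use std::path::PathBuf;

/// Prefer the runtime override; fall back to the path baked in at compile time.
fn resolve_public_dir(env_override: Option<String>, compiled_default: &str) -> PathBuf {
    match env_override {
        // An empty variable is treated the same as an unset one.
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(compiled_default),
    }
}
```

In a Cargo build you would call it as `resolve_public_dir(std::env::var("NEXTRS_PUBLIC_DIR").ok(), concat!(env!("CARGO_MANIFEST_DIR"), "/public"))`.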

Add a `.dockerignore` with at least `target/` and `node_modules/` so the build context stays small.

## Build and run

```bash
docker build -t mysite .
docker run --rm -p 3000:3000 mysite
curl -i http://localhost:3000/
```

The server binds `0.0.0.0:3000`. Map whatever host port you like.

## Streaming and the reverse proxy

There's no Vercel adapter in this picture — axum streams chunked `text/html` natively, so loading shells work out of the box. The one thing that can break streaming is a **buffering reverse proxy** in front of the container. If you put nginx in front, disable response buffering for the app:

```nginx
location / {
    proxy_pass http://127.0.0.1:3000;
    proxy_http_version 1.1;
    proxy_buffering off;
}
```

Caddy and Traefik stream by default. After deploying, run the smoke test from [Streaming](/docs/streaming#verifying-streaming-works) — if TTFB equals total time on a loading route, something in the path is buffering.

## Static assets

Serverful, the binary serves `public/` itself via a router fallback (`tower-http` `ServeDir`) — same URLs as the Vercel CDN path, no extra configuration. If you want a CDN in front, point it at the same root URLs; everything under `public/` is safe to cache.

## Logs and environment

The binary reads `.env` if present (via `dotenvy`) and respects `RUST_LOG` for tracing verbosity (`RUST_LOG=info` is the default). Container hosts that capture stdout get structured logs with no extra setup.

---

# React Pages & Server Props (Preview)

> Roadmap preview: page.tsx in the app tree, with the React Query cache warmed by the server before your bundle runs

> **Status: implemented, pre-release.** Everything on this page runs in the nextrs repo today — the runnable [`examples/react-todos`](https://github.com/drewhirschi/nextrs/tree/main/examples/react-todos) crate is exactly this code. APIs may still shift before a release. The typed-client pipeline it builds on is documented at [Typesafe Client Generation](/docs/typesafe-client).

## The idea

In the pre-release tree, you can drop `page.tsx` files into the `app/` tree next to `page.rs` and `page.html`:

```
app/
├── layout.rs           # Rust layouts wrap React pages like any other page
└── todos/
    ├── page.tsx        # React page — discovered and routed by the same codegen
    └── props.rs        # optional: Rust warms your React Query cache
```

`.tsx` pages are **client-rendered by default**. The server streams the layout shell and a script tag; your component renders in the browser and talks to the backend through the generated typed hooks. One Rust binary serves the APIs, the Rust pages, and the React pages. There is no Node server and no JS runtime inside the binary — that's a permanent constraint, not a phase. If a page needs request-time server rendering, that's what `page.rs` is for.

The interesting part is what replaces server-side rendering's data story.

## The waterfall, and `props.rs`

A client-rendered page normally pays: stream shell → download bundle → mount React → hook fires a fetch → round-trip *back to the server that just streamed the shell*. The server had the data the whole time.

`props.rs` is a Rust file beside your page that runs per request, calls the same handler that serves the API endpoint, and injects the result into the streamed HTML — keyed exactly the way the generated client keys its queries:

```rust
// app/todos/props.rs
include!(concat!(env!("OUT_DIR"), "/nextrs_seeds.rs"));

pub async fn props(req: http::Request<axum::body::Body>) -> nextrs::QuerySeed {
    nextrs::QuerySeed::new()
        // A plain typed function call (no HTTP): runs the GET /api/todos
        // handler and pairs the result with its canonical query key.
        .seed(get_api_todos(
            api_todos::TodosFilter { status: Some("open".into()) },
            req.extensions(),
        ))
        .await
}
```

The `get_api_todos` companion (and the `api_todos` module alias that makes the filter type reachable) is generated by the build from the `#[nextrs::api]` annotation on the handler. Seedable handlers are GETs that take at most one `Query<T>` extractor and return `Json<...>`.

By the time your bundle executes, the JSON is already in the DOM, loaded into the React Query cache before mount.
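To make the "loaded into the cache before mount" step concrete, here is a minimal sketch of what that client-side step could look like. The wire format (a JSON array of `{ queryKey, data }` pairs), the `hydrateSeeds` function, and the `MapCache` stand-in are all assumptions made for illustration, not the format or code nextrs actually emits. TanStack's `QueryClient` has a `setQueryData(queryKey, data)` method that would play the role of `SeedableCache` here.

```typescript
type QueryKey = readonly unknown[];

// Assumed wire format: one { queryKey, data } pair per seeded query.
interface SeedEntry {
  queryKey: QueryKey;
  data: unknown;
}

// The single cache method the sketch needs.
interface SeedableCache {
  setQueryData(queryKey: QueryKey, data: unknown): unknown;
}

// Load every seeded entry into the cache; returns how many were loaded.
function hydrateSeeds(payload: string, cache: SeedableCache): number {
  const entries = JSON.parse(payload) as SeedEntry[];
  for (const { queryKey, data } of entries) {
    cache.setQueryData(queryKey, data);
  }
  return entries.length;
}

// Stand-in cache so the sketch runs without React Query installed.
class MapCache implements SeedableCache {
  readonly entries = new Map<string, unknown>();
  setQueryData(queryKey: QueryKey, data: unknown): unknown {
    this.entries.set(JSON.stringify(queryKey), data);
    return data;
  }
}

// Made-up payload, shaped like a seed for GET /api/todos?status=open.
const payload = JSON.stringify([
  { queryKey: ["/api/todos", { status: "open" }], data: [{ id: 1, title: "ship nextrs" }] },
]);
const cache = new MapCache();
const loaded = hydrateSeeds(payload, cache);
```

Because the entry is written under the key the hook will ask for, the first render finds data in the cache and no mount fetch is needed.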

## What the page looks like

The payoff: **the component has no idea any of this happened.** It's vanilla React Query — except the data is just there on first paint:

```tsx
// app/todos/page.tsx
import { useQueryClient } from "@tanstack/react-query";
import { useGetTodos, useAddTodo, getGetTodosQueryKey } from "@site/client";

export default function Todos() {
  const queryClient = useQueryClient();

  // Warmed from the stream: defined on first render, no spinner, no mount
  // fetch. Goes stale and refetches like any query afterward.
  const { data: todos, refetch, isFetching } = useGetTodos({ status: "open" });

  const addTodo = useAddTodo({
    mutation: {
      onSuccess: () => {
        // Prefix invalidation refetches every /api/todos variant — including
        // the server-seeded entry, because the seed used the same canonical
        // key the hooks use.
        queryClient.invalidateQueries({ queryKey: getGetTodosQueryKey() });
      },
    },
  });

  return (
    <section>
      <button onClick={() => refetch()} disabled={isFetching}>Refresh</button>
      <ul>{todos?.data.map((t) => <li key={t.id}>{t.title}</li>)}</ul>
      <button onClick={() => addTodo.mutate({ data: { title: "ship nextrs" } })}>
        Add
      </button>
    </section>
  );
}
```

Three properties worth noticing:

1. **Seeding is a pure progressive enhancement.** Delete `props.rs` and this file works unchanged — it just fetches on mount instead of rendering instantly.
2. **Mutations invalidate seeded data.** The seed lives under the same `[url, params]` key the hooks use, so your `invalidateQueries` call refreshes streamed data and fetched data alike.
3. **Refetching, staleness, and optimistic updates are untouched.** The seed is an ordinary cache entry; everything React Query does applies to it.
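The second property depends on prefix matching of query keys. The sketch below is a simplified re-implementation of that matching rule, written for illustration rather than copied from React Query; `todosKey` is a hypothetical stand-in for the generated `getGetTodosQueryKey`, assuming it returns `[url]` with no arguments and `[url, params]` otherwise.

```typescript
// Hypothetical stand-in for the generated getGetTodosQueryKey helper.
function todosKey(params?: { status?: string }): readonly unknown[] {
  return params ? ["/api/todos", params] : ["/api/todos"];
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null;
}

// True when every part of `filter` is matched at the same position in `key`.
// Arrays are compared index by index, so a shorter filter acts as a prefix.
function partialMatch(key: unknown, filter: unknown): boolean {
  if (key === filter) return true;
  if (isRecord(key) && isRecord(filter)) {
    return Object.keys(filter).every((k) => partialMatch(key[k], filter[k]));
  }
  return false;
}

const seeded = todosKey({ status: "open" }); // the entry props.rs warmed
const done = todosKey({ status: "done" }); // another variant, fetched later
const other = ["/api/users"]; // an unrelated endpoint

// The bare key invalidates both /api/todos variants and nothing else.
const hitsSeed = partialMatch(seeded, todosKey());
const hitsDone = partialMatch(done, todosKey());
const hitsOther = partialMatch(other, todosKey());
```

The seeded entry gets no special treatment: it matches because its key starts with the same URL segment as every other `/api/todos` query.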

## Thin handlers, and why seeds go through them

nextrs's conventions are deliberately just the adapter layer — `route.rs`, `page.rs`, `middleware.rs`, and `props.rs` all translate between the web and your domain logic, which lives wherever you keep it (a lib crate, a `core` module). Handlers stay thin:

```rust
// app/api/todos/route.rs — adapter only: extract, delegate, map
use axum::{extract::Query, Json};

#[nextrs::api(get, responses((status = 200, body = Vec<Todo>)))]
pub async fn get(Query(f): Query<TodosFilter>) -> Json<Vec<Todo>> {
    Json(core::todos::list(f.into()).await)
}
```

`props.rs` runs on the server, so it *could* call `core::todos::list` directly. It calls the handler instead, on purpose: the seed is a cache entry **keyed by URL** — it impersonates a response from `GET /api/todos`, and the client will refetch that endpoint later and overwrite it. The wire shape (the DTO mapping, serde casing, the response envelope) belongs to the HTTP adapter, so producing a cache entry for that endpoint has to go through the adapter — or risk drifting from it and flickering from seed-shape to handler-shape on the first refetch. With a thin handler, calling it costs exactly one DTO mapping more than calling the service, and that mapping is the part the seed can't safely skip.
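The drift is easiest to see from the client's side. In the sketch below every name and field is hypothetical: `TodoRow` stands for what a service might return, `TodoDto` for what the endpoint sends after its mapping, and `toDto` for the adapter step a seed must not skip.

```typescript
// Hypothetical shapes: the service's row vs. the endpoint's wire format.
interface TodoRow {
  id: number;
  title: string;
  done_at: string | null;
}
interface TodoDto {
  id: number;
  title: string;
  doneAt: string | null;
}

// The adapter's DTO mapping (here, just a casing change).
function toDto(row: TodoRow): TodoDto {
  return { id: row.id, title: row.title, doneAt: row.done_at };
}

// What a component would do with a cache entry it believes is a TodoDto.
function render(todo: TodoDto): string {
  return `${todo.title}${todo.doneAt ? " (done)" : ""}`;
}

const row: TodoRow = { id: 1, title: "ship nextrs", done_at: "2024-01-01" };

// Seed built straight from the service: wrong shape, typed as if it were right.
const seededFromService = row as unknown as TodoDto;
// Seed built through the adapter: same shape a refetch would return.
const seededFromHandler = toDto(row);

const firstPaint = render(seededFromService); // doneAt is missing here
const afterRefetch = render(seededFromHandler);
```

With the service-shaped seed, the first paint omits the "(done)" marker and the first refetch adds it: the flicker described above. Seeding through the handler yields the refetch shape from the start.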

Server data that *isn't* endpoint-shaped (session user, feature flags, a precomputed view model) impersonates nothing, so it skips the HTTP adapter entirely. That is the design's second mode: plain typed initial props (`usePageProps<T>()`), with the TypeScript type generated from the Rust struct through the same OpenAPI pipeline. One rule, two lanes: data that belongs to an endpoint goes through the endpoint's adapter; data that belongs to the page goes through the page's.

## End-to-end type safety

The same property the typed client has, extended to seeds and props: the Rust structs derive `ToSchema`, the schema flows into the OpenAPI document, and orval generates the TypeScript. Rename a field in Rust and the `.tsx` stops compiling.

## Where this is headed

1. **Phase 1** — client-rendered `page.tsx`: discovery, routing, and bundling (Rust toolchain via swc) wired into `cargo build`; a dev watcher that rebuilds bundles in milliseconds without restarting the server.
2. **Phase 2** — `props.rs` as shown above: typed initial props, then React Query cache seeding.
3. **Phase 3** — build-time prerendering: `loading.tsx` skeletons and static `.tsx` pages rendered to HTML during the build (Node at build time only) and hydrated in the browser.

Follow along or argue with us: [github.com/drewhirschi/nextrs](https://github.com/drewhirschi/nextrs).