Stream HTML to the browser before the full response is ready
Use HTTP chunked transfer encoding and React renderToPipeableStream (or ReadableStream) to begin delivering HTML to the browser as soon as the first bytes are available, reducing Time to First Byte and First Contentful Paint.
- Streaming sends HTML in chunks as it is generated instead of waiting for the full response
- React renderToPipeableStream and Next.js App Router stream by default when you use Suspense
- Place Suspense boundaries around slow data dependencies to flush above-fold HTML immediately
- Monitor server response time (TTFB) in Lighthouse; values above 600 ms often indicate that the server is buffering the response instead of streaming it
Rule Details
Over HTTP/1.1, HTML streaming relies on chunked transfer encoding (Transfer-Encoding: chunked), which allows a server to send a response in multiple pieces without knowing the total content length upfront. HTTP/2 and HTTP/3 do not use this header; they deliver the body as a sequence of DATA frames, with the same effect. In both cases the browser begins parsing each chunk as it arrives, discovering and fetching subresources (CSS, fonts, scripts, images) long before the last chunk is sent.
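To make the wire format concrete, here is a minimal sketch of how an HTTP/1.1 chunked body is framed: each chunk is its byte length in hexadecimal, a CRLF, the data, and another CRLF, and a zero-length chunk ends the body. The names `encodeChunk`, `LAST_CHUNK`, and `wireBody` are illustrative only; in practice the server runtime (for example Node's `http` module) performs this framing for you.

```typescript
// Illustrative only: real servers apply this framing automatically.
const chunkEncoder = new TextEncoder();

// Frame one piece of the body as an HTTP/1.1 chunk:
// <size in hex>\r\n<data>\r\n, where size counts bytes, not characters.
function encodeChunk(data: string): string {
  const byteLength = chunkEncoder.encode(data).length;
  return `${byteLength.toString(16)}\r\n${data}\r\n`;
}

// A zero-length chunk tells the client that the body is complete.
const LAST_CHUNK = "0\r\n\r\n";

// Two HTML fragments sent as separate chunks, followed by the terminator.
const wireBody =
  encodeChunk("<h1>Hello</h1>") + encodeChunk("<p>café</p>") + LAST_CHUNK;

console.log(JSON.stringify(wireBody));
```

Because the size is sent per chunk, the server never has to know the total length of the page before it starts writing.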
Code Examples
With traditional buffered SSR:
0ms Server starts executing
800ms All database queries complete
800ms HTML generation begins
820ms HTML generation completes
820ms TTFB — first byte arrives in browser
850ms CSS parsed
900ms FCP
With streaming:
0ms Server starts executing
10ms TTFB — shell HTML flushed immediately (head, above-fold skeleton)
10ms Browser fetches CSS, discovers LCP image
300ms Slow data query completes; component HTML streamed
380ms FCP — above-fold content painted
The gain is that the browser starts work 810 ms earlier in this example (first byte at 10 ms instead of 820 ms).
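The improvement at each milestone can be read straight off the two timelines. As a small worked calculation (the object and variable names are only for illustration):

```typescript
// Milestones in milliseconds, copied from the two timelines.
const bufferedTimeline = { ttfb: 820, fcp: 900 };
const streamedTimeline = { ttfb: 10, fcp: 380 };

// How much earlier each milestone occurs when streaming.
const ttfbGainMs = bufferedTimeline.ttfb - streamedTimeline.ttfb;
const fcpGainMs = bufferedTimeline.fcp - streamedTimeline.fcp;

console.log(`TTFB gain: ${ttfbGainMs} ms, FCP gain: ${fcpGainMs} ms`);
```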
Why It Matters
Traditional SSR waits for all data fetches to complete before sending a single byte of HTML, which means the browser cannot parse, discover subresources, or render anything until the slowest data query finishes. Streaming decouples the delivery of above-fold content from slow database queries and third-party API calls, dramatically improving perceived performance and LCP.
React renderToPipeableStream
renderToPipeableStream is React's streaming SSR API. It accepts a React tree and renders it to a Node.js writable stream in chunks. The key option is onShellReady — called when the application shell (everything outside <Suspense> boundaries) is ready to send:
// server/render.tsx (custom Express / Node.js setup)
import { renderToPipeableStream } from 'react-dom/server';
import { IncomingMessage, ServerResponse } from 'http';
import App from './App';
export function handleRequest(req: IncomingMessage, res: ServerResponse) {
  let didError = false;
  const { pipe, abort } = renderToPipeableStream(
    <App url={req.url} />,
    {
      // bootstrapScripts is optional; use it for client hydration
      bootstrapScripts: ['/static/js/main.js'],
      onShellReady() {
        // The shell is the minimal HTML that does not depend on Suspense
        // boundaries; flush it immediately for a low TTFB
        res.statusCode = didError ? 500 : 200;
        res.setHeader('Content-Type', 'text/html; charset=utf-8');
        // Chunked transfer is automatic when the length is unknown
        pipe(res);
      },
      onShellError(error) {
        // Shell rendering failed: log it and fall back to a basic error page
        console.error(error);
        res.statusCode = 500;
        res.setHeader('Content-Type', 'text/html; charset=utf-8');
        res.end('<h1>Something went wrong</h1>');
      },
      onError(error) {
        didError = true;
        console.error(error);
      },
    }
  );
  // Abort streaming after 10 seconds to avoid hanging connections
  const timeout = setTimeout(abort, 10_000);
  // Cancel the timer once the response finishes or the client disconnects
  res.on('close', () => clearTimeout(timeout));
}
Next.js App Router: Streaming by Default
The App Router streams automatically. An async Server Component that awaits a data fetch suspends up to the nearest <Suspense> boundary (a route-level loading.tsx file also creates one). Wrap slow components in explicit <Suspense> boundaries to control what is flushed first:
// app/dashboard/page.tsx
import { Suspense } from 'react';
import { DashboardSkeleton, RecentOrdersSkeleton } from '@/components/skeletons';
// KPICard, OrderList, fetchKPIStats, and fetchRecentOrders are assumed to be
// imported from elsewhere in the app.
export default function DashboardPage() {
  return (
    <main>
      {/*
        The page title and navigation are part of the shell;
        they render immediately without waiting for any data.
      */}
      <h1>Dashboard</h1>
      <nav>{/* ... */}</nav>
      {/*
        KPICards fetches summary stats. If it is slow, show a skeleton
        rather than blocking the page shell.
      */}
      <Suspense fallback={<DashboardSkeleton />}>
        <KPICards />
      </Suspense>
      {/*
        RecentOrders fetches a paginated list. It is isolated in its own
        boundary so it does not block KPICards.
      */}
      <Suspense fallback={<RecentOrdersSkeleton />}>
        <RecentOrders />
      </Suspense>
    </main>
  );
}
// KPICards is a Server Component that fetches its own data
async function KPICards() {
  const stats = await fetchKPIStats(); // potentially slow
  return <ul>{stats.map((s) => <KPICard key={s.id} stat={s} />)}</ul>;
}
async function RecentOrders() {
  const orders = await fetchRecentOrders(); // independent query
  return <OrderList orders={orders} />;
}
Next.js will flush the HTML for <h1>Dashboard</h1> and the nav immediately, then stream in KPICards and RecentOrders as their data resolves, potentially out of order, with React swapping each resolved section into the place of its fallback.
Wrapping the entire page in a single Suspense boundary means no real content is flushed until all of the data inside it is ready; the user sees only the fallback, which is barely better than buffered SSR. Use granular boundaries around each independently fetched section so above-fold content is never blocked by below-fold data.
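To illustrate why independent boundaries matter, the following sketch imitates out-of-order streaming without React. It is a simplified model, not React's actual implementation: the `streamSections` function, the `Section` type, and the `data-slot` / `data-for` markup are all hypothetical. The shell, with one placeholder per section, is enqueued first; each section's HTML is then enqueued as soon as its own promise resolves, regardless of its position on the page.

```typescript
// A page section whose HTML depends on slow data (hypothetical shape).
type Section = { id: string; load: () => Promise<string> };

// Simplified model of out-of-order streaming; not React's implementation.
function streamSections(sections: Section[]): ReadableStream<string> {
  return new ReadableStream<string>({
    async start(controller) {
      // 1. Flush the shell immediately, with one placeholder per section.
      const placeholders = sections
        .map((s) => `<div data-slot="${s.id}">Loading...</div>`)
        .join("");
      controller.enqueue(`<main>${placeholders}</main>`);

      // 2. Start every load in parallel; enqueue each result as it resolves.
      await Promise.all(
        sections.map(async (s) => {
          const html = await s.load();
          controller.enqueue(`<template data-for="${s.id}">${html}</template>`);
        })
      );

      // 3. Every section has been sent; end the response.
      controller.close();
    },
  });
}

// Collect all chunks in arrival order (handy for tests and debugging).
async function collectChunks(stream: ReadableStream<string>): Promise<string[]> {
  const chunks: string[] = [];
  const reader = stream.getReader();
  for (;;) {
    const result = await reader.read();
    if (result.done) return chunks;
    chunks.push(result.value);
  }
}

// Helper that resolves with the given HTML after a delay.
const delayedHtml = (ms: number, html: string) => () =>
  new Promise<string>((resolve) => setTimeout(() => resolve(html), ms));

// "kpis" comes first on the page but resolves last.
const demoChunks = collectChunks(
  streamSections([
    { id: "kpis", load: delayedHtml(60, "<ul>KPIs</ul>") },
    { id: "orders", load: delayedHtml(10, "<ol>Orders</ol>") },
  ])
);
```

In this model the slow "kpis" section never delays "orders". If both loads sat behind a single await, nothing after the shell could be sent until the slower one finished.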
Node.js ReadableStream (Non-React)
For non-React server environments, use the Web Streams API to stream HTML chunks:
// Plain Node.js (18+) / Deno / Cloudflare Workers example
export function createStreamingResponse(
  getSlowData: () => Promise<string>
): Response {
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      // Flush the page shell immediately
      controller.enqueue(
        encoder.encode(`<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <title>My App</title>
    <link rel="stylesheet" href="/styles.css">
  </head>
  <body>
    <header><nav><!-- navigation --></nav></header>
    <main id="content">
      <div class="loading-skeleton" aria-busy="true" aria-label="Loading content...">
      </div>
`)
      );
      // Fetch slow data while the browser is already parsing the shell
      const data = await getSlowData();
      // Escape "<" so the data cannot close the <script> element early
      const safeData = JSON.stringify(data).replace(/</g, '\\u003c');
      // Stream the content section once data is ready
      controller.enqueue(
        encoder.encode(`
      <script>
        // Replace skeleton with real content
        document.getElementById('content').innerHTML = ${safeData};
      </script>
    </main>
  </body>
</html>`)
      );
      controller.close();
    },
  });
  return new Response(stream, {
    headers: {
      'Content-Type': 'text/html; charset=utf-8',
      // Ask Nginx not to buffer this response
      'X-Accel-Buffering': 'no',
    },
  });
}
Flushing Resource Hints Early
One of the most valuable streaming patterns is to include <link rel="preload"> hints in the first chunk, even before the full page is determined:
// First chunk always includes critical resource hints
const shellChunk = `<!DOCTYPE html>
<html>
  <head>
    <!-- Preload the LCP image so the browser starts fetching immediately -->
    <link rel="preload" as="image" href="/hero.webp" fetchpriority="high">
    <!-- Preload critical fonts -->
    <link rel="preload" as="font" type="font/woff2" href="/fonts/inter.woff2" crossorigin>
    <link rel="stylesheet" href="/critical.css">
  </head>
  <body>
    <div id="app">
`;
Nginx, Cloudflare, and other proxies often buffer responses before forwarding them downstream, which nullifies streaming. Set X-Accel-Buffering: no (Nginx) and ensure your CDN is configured for streaming pass-through on HTML responses.
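When the set of hints varies by route, the preload tags in the first chunk can be generated from a small list instead of being hard-coded. A minimal sketch, assuming a hypothetical `PreloadHint` type and `renderPreloadLinks` helper (neither belongs to any library):

```typescript
// Hypothetical description of one resource to preload.
type PreloadHint = {
  href: string;
  as: "image" | "font" | "style" | "script";
  type?: string; // MIME type, e.g. "font/woff2"
  highPriority?: boolean; // emits fetchpriority="high"
};

// Escape characters that could break out of a double-quoted attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Render one <link rel="preload"> tag per hint, one tag per line.
function renderPreloadLinks(hints: PreloadHint[]): string {
  return hints
    .map((hint) => {
      const attrs = [
        `rel="preload"`,
        `as="${hint.as}"`,
        `href="${escapeAttr(hint.href)}"`,
      ];
      if (hint.type) attrs.push(`type="${escapeAttr(hint.type)}"`);
      // Fonts are fetched in CORS mode, so the preload needs crossorigin
      // even for same-origin files.
      if (hint.as === "font") attrs.push("crossorigin");
      if (hint.highPriority) attrs.push(`fetchpriority="high"`);
      return `<link ${attrs.join(" ")}>`;
    })
    .join("\n");
}

const preloadLinks = renderPreloadLinks([
  { href: "/hero.webp", as: "image", highPriority: true },
  { href: "/fonts/inter.woff2", as: "font", type: "font/woff2" },
]);
console.log(preloadLinks);
```

The resulting string can be interpolated into the head of the shell chunk before it is flushed.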
Measuring the Impact
Use WebPageTest's waterfall view to verify that HTML chunks arrive before the server finishes rendering:
- The first green bar (Time to First Byte) should be under 200 ms
- The CSS and LCP image requests should start while the HTML bar is still active
- The "Layout" event in DevTools Performance should occur before the response completes
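The same check can be scripted. The sketch below assumes a runtime with the Web Streams API and `performance.now()` (modern browsers, Node 18+, Deno) and records when each chunk of a response body arrives; `measureChunkTimings` is a hypothetical helper name. A streamed response shows a first chunk well before the last one, whereas a buffered response typically delivers all of its chunks close together.

```typescript
type ChunkTiming = { elapsedMs: number; bytes: number };

// Read a response body to the end, noting when each chunk arrives
// relative to the moment reading started.
async function measureChunkTimings(
  body: ReadableStream<Uint8Array>
): Promise<ChunkTiming[]> {
  const timings: ChunkTiming[] = [];
  const startedAt = performance.now();
  const reader = body.getReader();
  for (;;) {
    const result = await reader.read();
    if (result.done) return timings;
    timings.push({
      elapsedMs: performance.now() - startedAt,
      bytes: result.value.byteLength,
    });
  }
}

// Example usage against a real page (the URL is a placeholder):
//   const res = await fetch("https://example.com/dashboard");
//   if (res.body) console.table(await measureChunkTimings(res.body));

// Local demonstration: a body that emits a shell, then content 50 ms later.
const timingEncoder = new TextEncoder();
const demoBody = new ReadableStream<Uint8Array>({
  async start(controller) {
    controller.enqueue(timingEncoder.encode("<header>shell</header>"));
    await new Promise((resolve) => setTimeout(resolve, 50));
    controller.enqueue(timingEncoder.encode("<main>content</main>"));
    controller.close();
  },
});
const demoTimings = measureChunkTimings(demoBody);
```

If every chunk of a real page reports nearly the same elapsedMs, the response is probably being buffered somewhere between the renderer and the client.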
Verification
Use WebPageTest or an equivalent waterfall trace to confirm the shell HTML actually arrives before the slow data completes, because streaming only helps when the browser receives and starts parsing that first chunk early.
Automated Checks
- Open the DevTools Network panel and select the document request. The Timing tab should show a short "Waiting for server response" phase followed by a longer "Content Download" phase, because later chunks keep arriving for several hundred milliseconds after the first byte.
- Run Lighthouse and confirm TTFB is under 600 ms (green) and FCP is under 1.8 s.
- Test with a simulated 3G connection in DevTools — the skeleton UI should appear quickly even if data takes 3–5 seconds.
Manual Checks
- In Next.js, add console.log('rendering KPICards') after the await inside an async Server Component wrapped in Suspense, then confirm that the browser displays the shell and the fallback before this log appears in the server output.
Use with AI
Copy these prompts to use with your AI assistant, or install the MCP server to use directly from Claude, Cursor, or Windsurf.
Check
Verify implementation
Examine this server-side rendering implementation to determine whether HTML is streamed in chunks or buffered until fully generated before sending to the client.
Fix
Auto-fix issues
Refactor the SSR implementation to use renderToPipeableStream with Suspense boundaries around slow data fetches, or use ReadableStream in a Node.js handler, so that above-fold HTML is flushed to the client immediately.
Explain
Learn more
Explain how streaming HTML improves TTFB and FCP by decoupling the delivery of fast content from slow server-side data dependencies.
Review
Code review
Review server entry points and page components for renderToString calls, missing Suspense boundaries around data-fetching components, and use of await on slow queries before returning a response.