Stream HTML to the browser before the full response is ready

Use HTTP chunked transfer encoding and React renderToPipeableStream (or ReadableStream) to begin delivering HTML to the browser as soon as the first bytes are available, reducing Time to First Byte and First Contentful Paint.

Quick take
Typical fix time 45 min
  • Streaming sends HTML in chunks as it is generated instead of waiting for the full response
  • React renderToPipeableStream and Next.js App Router stream by default when you use Suspense
  • Place Suspense boundaries around slow data dependencies to flush above-fold HTML immediately
  • Monitor TTFB in Lighthouse: values above 600 ms suggest the server is slow to respond and that streaming may not be in use
Why it matters: Traditional SSR waits for all data fetches to complete before sending a single byte of HTML, which means the browser cannot parse, discover subresources, or render anything until the slowest data query finishes. Streaming decouples the delivery of above-fold content from slow database queries and third-party API calls, dramatically improving perceived performance and LCP.

Rule Details

HTML streaming lets a server send a response in multiple pieces without knowing the total content length upfront. Over HTTP/1.1 this uses chunked transfer encoding (Transfer-Encoding: chunked); HTTP/2 and HTTP/3 do not use that header and instead stream the body as a sequence of DATA frames. In both cases the browser begins parsing each chunk as it arrives, discovering and fetching subresources (CSS, fonts, scripts, images) long before the last chunk is sent.
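To make the HTTP/1.1 wire format concrete, here is a minimal sketch of how a body is framed as chunks: each chunk is its size in hexadecimal, a CRLF, the data, and another CRLF, and a zero-length chunk ends the body. The function names are illustrative only; Node.js and other servers apply this framing for you, so you would not normally write it yourself.

```typescript
// Illustrative only: servers such as Node.js add this framing automatically.
const chunkEncoder = new TextEncoder();

/** Frame one piece of the body as an HTTP/1.1 chunk: hex size, CRLF, data, CRLF. */
function encodeChunk(data: string): string {
  // The size counts bytes, not characters, so encode to UTF-8 first.
  const byteLength = chunkEncoder.encode(data).length;
  return `${byteLength.toString(16)}\r\n${data}\r\n`;
}

/** The zero-length chunk that tells the client the body is complete. */
const LAST_CHUNK = "0\r\n\r\n";

/** Frame a whole body that is produced as a sequence of pieces. */
function encodeChunkedBody(pieces: string[]): string {
  // Skip empty pieces: a zero-length chunk would end the body early.
  const framed = pieces.filter((piece) => piece.length > 0).map((piece) => encodeChunk(piece));
  return framed.join("") + LAST_CHUNK;
}

// The shell can be framed and sent before the rest of the page exists.
console.log(JSON.stringify(encodeChunk("<h1>Hi</h1>")));
```

Because each chunk carries its own length, the server never needs a Content-Length header for the whole response, which is what allows the first chunk to leave before the last one has been generated.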

Code Examples

With traditional buffered SSR:

0ms    Server starts executing
800ms  All database queries complete
800ms  HTML generation begins
820ms  HTML generation completes
820ms  TTFB — first byte arrives in browser
850ms  CSS parsed
900ms  FCP

With streaming:

0ms    Server starts executing
10ms   TTFB — shell HTML flushed immediately (head, above-fold skeleton)
10ms   Browser fetches CSS, discovers LCP image
300ms  Slow data query completes; component HTML streamed
380ms  FCP — above-fold content painted

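The gain in this example is that the first byte arrives 810 ms earlier (10 ms instead of 820 ms), so the browser starts parsing and fetching subresources that much sooner, and FCP drops from 900 ms to 380 ms.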

React renderToPipeableStream

renderToPipeableStream is React's streaming SSR API. It accepts a React tree and renders it to a Node.js writable stream in chunks. The key option is onShellReady — called when the application shell (everything outside <Suspense> boundaries) is ready to send:

// server/render.tsx (custom Express / Node.js setup; .tsx because the file contains JSX)
import { renderToPipeableStream } from 'react-dom/server';
import { IncomingMessage, ServerResponse } from 'http';
import App from './App';
 
export function handleRequest(req: IncomingMessage, res: ServerResponse) {
  let didError = false;
 
  const { pipe, abort } = renderToPipeableStream(
    <App url={req.url} />,
    {
      // bootstrapScripts is optional — use for client hydration
      bootstrapScripts: ['/static/js/main.js'],
 
      onShellReady() {
        // The shell is the minimal HTML that does not depend on Suspense
        // boundaries — flush it immediately for a low TTFB
        res.statusCode = didError ? 500 : 200;
        res.setHeader('Content-Type', 'text/html; charset=utf-8');
        // chunked transfer is automatic when the length is unknown
        pipe(res);
      },
 
      onShellError(error) {
        // Shell rendering failed: log the cause and fall back to a basic error page
        console.error(error);
        res.statusCode = 500;
        res.setHeader('Content-Type', 'text/html; charset=utf-8');
        res.end('<h1>Something went wrong</h1>');
      },
 
      onError(error) {
        didError = true;
        console.error(error);
      },
    }
  );
 
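  // Abort streaming after 10 seconds to avoid hanging connections, and
  // clear the timer once the response ends so it does not fire needlessly
  const timeout = setTimeout(() => abort(), 10_000);
  res.on('close', () => clearTimeout(timeout));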
}

Next.js App Router: Streaming by Default

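The App Router streams automatically, but only at Suspense boundaries. An async Server Component that awaits data holds back everything up to its nearest enclosing boundary. A loading.tsx file wraps the whole page in one boundary; explicit <Suspense> around slow components lets you control what is flushed first: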

// app/dashboard/page.tsx
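import { Suspense } from 'react';
import { DashboardSkeleton, RecentOrdersSkeleton } from '@/components/skeletons';
// Assumed locations for the helpers used below; adjust to your project
import { KPICard, OrderList } from '@/components/dashboard';
import { fetchKPIStats, fetchRecentOrders } from '@/lib/data';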
 
export default function DashboardPage() {
  return (
    <main>
      {/*
        The page title and navigation are part of the shell —
        they render immediately without waiting for any data.
      */}
      <h1>Dashboard</h1>
      <nav>{/* ... */}</nav>
 
      {/*
        KPICards fetches summary stats. If it is slow, show a skeleton
        rather than blocking the page shell.
      */}
      <Suspense fallback={<DashboardSkeleton />}>
        <KPICards />
      </Suspense>
 
      {/*
        RecentOrders fetches a paginated list — isolated in its own boundary
        so it does not block KPICards.
      */}
      <Suspense fallback={<RecentOrdersSkeleton />}>
        <RecentOrders />
      </Suspense>
    </main>
  );
}
 
// KPICards is a Server Component that fetches its own data
async function KPICards() {
  const stats = await fetchKPIStats(); // potentially slow
  return <ul>{stats.map((s) => <KPICard key={s.id} stat={s} />)}</ul>;
}
 
async function RecentOrders() {
  const orders = await fetchRecentOrders(); // independent query
  return <OrderList orders={orders} />;
}

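Next.js will flush the HTML for <h1>Dashboard</h1>, the nav, and both skeleton fallbacks immediately, then stream in KPICards and RecentOrders as their data resolves. The two sections can arrive in either order: React sends each one as a hidden HTML fragment followed by a small inline script that swaps it into the position of its fallback.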

Suspense boundaries that are too coarse defeat streaming

Wrapping the entire page in a single Suspense boundary means only the fallback is flushed early; no real content appears until all data is ready, which is little better than buffered SSR. Use granular boundaries around each independently fetched section so above-fold content is never blocked by below-fold data.
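The effect of boundary placement can be sketched with a small model. This is not React code; it only computes when each section's real content could appear, assuming all fetches start at the same time. The section names and latencies are hypothetical.

```typescript
// Hypothetical section names and latencies, chosen only for illustration.
interface Section {
  name: string;
  dataLatencyMs: number; // how long the section's data fetch takes
}

/** One boundary around everything: content appears only when the slowest fetch ends. */
function revealTimesCoarse(sections: Section[]): Record<string, number> {
  const slowest = Math.max(0, ...sections.map((s) => s.dataLatencyMs));
  const result: Record<string, number> = {};
  for (const s of sections) result[s.name] = slowest;
  return result;
}

/** One boundary per section: each section appears as soon as its own fetch ends. */
function revealTimesGranular(sections: Section[]): Record<string, number> {
  const result: Record<string, number> = {};
  for (const s of sections) result[s.name] = s.dataLatencyMs;
  return result;
}

const dashboardSections: Section[] = [
  { name: "hero", dataLatencyMs: 20 },
  { name: "kpis", dataLatencyMs: 300 },
  { name: "orders", dataLatencyMs: 800 },
];

console.log(revealTimesCoarse(dashboardSections));   // { hero: 800, kpis: 800, orders: 800 }
console.log(revealTimesGranular(dashboardSections)); // { hero: 20, kpis: 300, orders: 800 }
```

In this model the slowest section is no faster with granular boundaries, but everything else stops waiting for it.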

Node.js ReadableStream (Non-React)

For non-React server environments, use the Web Streams API to stream HTML chunks:

// Plain Node.js / Deno / Cloudflare Workers example
export function createStreamingResponse(
  getSlowData: () => Promise<string>
): Response {
  const encoder = new TextEncoder();
 
  const stream = new ReadableStream({
    async start(controller) {
      // Flush the page shell immediately
      controller.enqueue(
        encoder.encode(`<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>My App</title>
  <link rel="stylesheet" href="/styles.css">
</head>
<body>
  <header><nav><!-- navigation --></nav></header>
  <main id="content">
    <div class="loading-skeleton" aria-busy="true" aria-label="Loading content...">
    </div>
`)
      );
 
      // Fetch slow data while the browser is already parsing the shell
      const data = await getSlowData();
 
      // Stream the content section once data is ready
      controller.enqueue(
        encoder.encode(`
    <script>
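      // Replace skeleton with real content. Escaping "<" keeps markup in the
      // data from closing this script element early.
      document.getElementById('content').innerHTML = ${JSON.stringify(data).replace(/</g, '\\u003c')};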
    </script>
  </main>
</body>
</html>`)
      );
 
      controller.close();
    },
  });
 
  return new Response(stream, {
    headers: {
      'Content-Type': 'text/html; charset=utf-8',
      // Ask Nginx not to buffer this response (other proxies and CDNs need their own settings)
      'X-Accel-Buffering': 'no',
    },
  });
}
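The example above interpolates data into an inline script with JSON.stringify. JSON.stringify leaves "<" intact, so a value containing a closing script tag would end the script element early. One way to guard against that is a small serializer like the sketch below; the function name is hypothetical and not part of any library.

```typescript
/**
 * Serialize a value for interpolation inside an inline <script> element.
 * Escaping "<" means the output can never contain a closing script tag
 * or an HTML comment opener, while still parsing to the same value.
 */
function serializeForInlineScript(value: unknown): string {
  // JSON.stringify(undefined) returns undefined at runtime, so fall back to "null".
  const json: string = JSON.stringify(value) ?? "null";
  return json
    .replace(/</g, "\\u003c")
    // Line and paragraph separators are rejected inside string literals by older engines.
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029");
}

// The escaped text is still valid JavaScript and JSON for the same string.
console.log(serializeForInlineScript("<b>Total</b>"));
```

The escaped output is a drop-in replacement for JSON.stringify(data) in the inline script. It protects the script element only; the data is still inserted as HTML by innerHTML, so it must already be trusted or sanitized.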

Flushing Resource Hints Early

One of the most valuable streaming patterns is to include <link rel="preload"> hints in the first chunk, even before the full page is determined:

// First chunk always includes critical resource hints
const shellChunk = `<!DOCTYPE html>
<html>
<head>
  <!-- Preload the LCP image — browser starts fetching immediately -->
  <link rel="preload" as="image" href="/hero.webp" fetchpriority="high">
  <!-- Preload critical fonts -->
  <link rel="preload" as="font" type="font/woff2" href="/fonts/inter.woff2" crossorigin>
  <link rel="stylesheet" href="/critical.css">
</head>
<body>
  <div id="app">
`;
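If the set of critical resources varies by route, the first chunk can be assembled from a list of hints. The sketch below is one possible shape for that; the interface, the function names, and the URLs in the usage lines are assumptions for illustration.

```typescript
// All names here are illustrative; URLs are supplied by the caller.
interface PreloadHint {
  href: string;
  as: "image" | "font" | "style" | "script";
  type?: string; // MIME type, for example "font/woff2"
  fetchPriority?: "high" | "low";
}

/** Escape a value for use inside a double-quoted HTML attribute. */
function escapeAttr(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");
}

/** Render one <link rel="preload"> element. */
function renderPreloadLink(hint: PreloadHint): string {
  const attrs = ['rel="preload"', `as="${hint.as}"`, `href="${escapeAttr(hint.href)}"`];
  if (hint.type) attrs.push(`type="${escapeAttr(hint.type)}"`);
  if (hint.fetchPriority) attrs.push(`fetchpriority="${hint.fetchPriority}"`);
  // Fonts are fetched in CORS mode, so the preload needs crossorigin to be reused.
  if (hint.as === "font") attrs.push("crossorigin");
  return `<link ${attrs.join(" ")}>`;
}

/** Build the first chunk: doctype, head with hints, and the opening of the body. */
function buildShellChunk(hints: PreloadHint[]): string {
  return [
    "<!DOCTYPE html>",
    "<html>",
    "<head>",
    '  <meta charset="utf-8">',
    ...hints.map((hint) => "  " + renderPreloadLink(hint)),
    "</head>",
    "<body>",
    '  <div id="app">',
    "",
  ].join("\n");
}

console.log(
  buildShellChunk([
    { href: "/hero.webp", as: "image", fetchPriority: "high" },
    { href: "/fonts/inter.woff2", as: "font", type: "font/woff2" },
  ])
);
```

The returned string is meant to be written to the response before any data fetching starts, with the rest of the body and the closing tags following in later chunks.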

Disable CDN and proxy response buffering

Nginx, Cloudflare, and other proxies often buffer responses before forwarding them downstream, which nullifies streaming. For Nginx, send the X-Accel-Buffering: no response header from your application or set proxy_buffering off for the relevant location. For a CDN, check that HTML responses are passed through without buffering.

Measuring the Impact

Use WebPageTest's waterfall view to verify that HTML chunks arrive before the server finishes rendering:

  1. The first green bar (Time to First Byte) should be under 200 ms
  2. The CSS and LCP image requests should start while the HTML bar is still active
  3. The "Layout" event in DevTools Performance should occur before the response completes

Verification

Use WebPageTest or an equivalent waterfall trace to confirm the shell HTML actually arrives before the slow data completes, because streaming only helps when the browser receives and starts parsing that first chunk early.

Automated Checks

  • Open the DevTools Network panel and click the page request. In the Timing tab, "Waiting for server response" should be short and "Content Download" should last several hundred milliseconds while later chunks arrive.
  • Run Lighthouse and confirm TTFB is under 600 ms (green) and FCP is under 1.8 s.
  • Test with a simulated 3G connection in DevTools — the skeleton UI should appear quickly even if data takes 3–5 seconds.
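A scripted version of the first check can record when each chunk of a response arrives. The sketch below assumes a runtime with the Fetch and Web Streams APIs (for example Node.js 18 or later); the function name and the URL in the usage comment are hypothetical.

```typescript
interface ChunkTiming {
  elapsedMs: number; // time since reading started
  bytes: number;     // size of this chunk
}

/** Read a streamed response body and record when each chunk arrives. */
async function collectChunkTimings(response: Response): Promise<ChunkTiming[]> {
  if (!response.body) return [];
  const reader = response.body.getReader();
  const start = Date.now();
  const timings: ChunkTiming[] = [];
  for (;;) {
    const { done, value } = await reader.read();
    if (done || !value) break;
    timings.push({ elapsedMs: Date.now() - start, bytes: value.byteLength });
  }
  return timings;
}

// Usage against a locally running app (the URL is an assumption):
// const timings = await collectChunkTimings(await fetch("http://localhost:3000/dashboard"));
// console.table(timings);
```

A streamed page should show an early first chunk followed by later ones; a buffered page tends to deliver everything at roughly the same time. Chunk boundaries seen by the client do not have to match the server's writes, since the network and any proxies may split or merge them, so compare timings rather than exact counts.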

Manual Checks

  • In Next.js, add console.log('KPICards data ready') after the await inside an async Server Component wrapped in Suspense. Reload the page and confirm that the browser shows the shell and skeleton before this log appears in the server terminal.

Use with AI

Copy these prompts to use with your AI assistant, or install the MCP server to use directly from Claude, Cursor, or Windsurf.

Check

Verify implementation

Examine this server-side rendering implementation to determine whether HTML is streamed in chunks or buffered until fully generated before sending to the client.

Fix

Auto-fix issues

Refactor the SSR implementation to use renderToPipeableStream with Suspense boundaries around slow data fetches, or use ReadableStream in a Node.js handler, so that above-fold HTML is flushed to the client immediately.

Explain

Learn more

Explain how streaming HTML improves TTFB and FCP by decoupling the delivery of fast content from slow server-side data dependencies.

Review

Code review

Review server entry points and page components for renderToString calls, missing Suspense boundaries around data-fetching components, and use of await on slow queries before returning a response.

Further Reading

Tools and supplementary material for exploring the topic in more depth.

  • Chrome DevTools Network panel (developer.chrome.com), tool
  • WebPageTest (webpagetest.org), tool

Rules that often go hand-in-hand with this one.

  • Reduce Time to First Byte (TTFB) (Performance): Measures and optimizes server response time (TTFB) to ensure a fast initial response
  • Optimize largest contentful paint (Performance): The largest content element loads within 2.5 seconds for a good user experience.
  • Eliminate render-blocking resources (Performance): Checks for render-blocking CSS and JavaScript that prevent the initial page render
  • Virtualize long lists and tables (Performance): Render only the visible subset of rows or cards in large collections to reduce DOM size, memory usage, and scroll-time rendering work.
