For a page to appear in Google Search, it must be crawlable and indexable. Google's indexing controls guide and your broader robots meta strategy both matter here, because three separate mechanisms can keep a page out of search results: noindex meta tags and X-Robots-Tag HTTP headers block indexing, while robots.txt rules block crawling.

Code Examples

❌ Avoid — important page accidentally noindexed

<!-- Product page that should rank in search -->
<meta name="robots" content="noindex, nofollow">
<!-- This page will never appear in Google Search -->

❌ Avoid — Next.js metadata with indexing disabled

// app/products/[slug]/page.tsx — accidental noindex
export const metadata = {
  robots: {
    index: false,   // This blocks indexing for ALL product pages!
    follow: false,
  },
}

✅ Correct — indexable page (no noindex)

<!-- Default: no meta robots tag = index, follow -->
<!-- Or be explicit: -->
<meta name="robots" content="index, follow">

✅ Correct — Next.js metadata enabling indexing

export const metadata = {
  // For pages you want indexed, either omit 'robots' or set explicitly:
  robots: {
    index: true,
    follow: true,
  },
}

✅ Limiting noindex to genuinely private content

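// app/posts/[slug]/page.tsx (illustrative path): noindex only preview/draft pages
// generateMetadata is an App Router API, so it belongs in a page or layout file, not under pages/api.
// getPost is assumed to be your own data-fetching helper.
export async function generateMetadata({ params }) {
  const post = await getPost(params.slug)

  return {
    robots: post.published
      ? { index: true, follow: true }
      : { index: false, follow: false },  // Draft only
  }
}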

✅ Testing indexability

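# Check X-Robots-Tag header (-s hides the progress meter, -I requests headers only)
curl -sI https://yoursite.com/important-page | grep -i 'x-robots'

# Check meta robots in HTML (matches only <meta> tags that mention robots)
curl -s https://yoursite.com/important-page | grep -io '<meta[^>]*robots[^>]*>'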
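
The two curl checks above can also be scripted. The TypeScript below is a minimal sketch, not a full HTML or header parser. The function names (parseDirectives, extractMetaRobots, hasNoindex) are hypothetical, the regex assumes the name attribute comes before content, and user-agent-scoped header values such as "googlebot: noindex" are not handled.

```typescript
// Hypothetical helpers for illustration; not part of any library.

// Split a directive string such as "noindex, nofollow" into lowercase tokens.
function parseDirectives(value: string): string[] {
  return value
    .split(",")
    .map((d) => d.trim().toLowerCase())
    .filter((d) => d.length > 0);
}

// Naive extraction of <meta name="robots" content="..."> from an HTML string.
// Assumes name precedes content; a real audit should use an HTML parser.
function extractMetaRobots(html: string): string[] {
  const match = html.match(
    /<meta\s+name=["']robots["']\s+content=["']([^"']*)["']/i
  );
  return parseDirectives(match?.[1] ?? "");
}

// True when either the meta tag or the X-Robots-Tag value asks for noindex.
// "none" is treated as noindex because it is shorthand for "noindex, nofollow".
function hasNoindex(html: string, xRobotsTag: string | null): boolean {
  const directives = extractMetaRobots(html).concat(
    parseDirectives(xRobotsTag ?? "")
  );
  return directives.some((d) => d === "noindex" || d === "none");
}
```

Pass the response body and the X-Robots-Tag header value (or null when the header is absent) fetched by whatever HTTP client you already use.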

Why It Matters

No index = no ranking: Once Google crawls a page and sees noindex, it drops the page from search results entirely, regardless of content quality or inbound links.
Common accidents: noindex often slips in through CMS and deployment defaults, e.g., draft mode left on, staging settings accidentally promoted to production, or category pages noindexed in WordPress.
Silent failure: Indexing blocks cause no visible errors on the frontend. Only Google Search Console or log analysis reliably surfaces the problem, and those same checks help uncover indexability conflicts.

The Three Indexing Mechanisms

Mechanism	Scope	Who reads it
`<meta name="robots" content="noindex">`	Single page	Crawlers that fetched the page
`X-Robots-Tag: noindex` HTTP header	Single resource	Crawlers that fetched the resource
`robots.txt Disallow`	URL pattern	Crawlers before fetching
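
The "Who reads it" column implies an ordering: robots.txt is consulted before the fetch, so a disallowed URL never gets its page-level directives read. A minimal sketch of that precedence, assuming the three signals have already been collected (the Signals and Verdict types and the classify function are hypothetical names):

```typescript
// Hypothetical types for illustration; not part of any library.
type Signals = {
  disallowedByRobotsTxt: boolean; // a robots.txt rule matches the URL
  metaNoindex: boolean; // <meta name="robots" content="noindex">
  headerNoindex: boolean; // X-Robots-Tag: noindex
};

type Verdict = "indexable" | "noindex" | "crawl-blocked";

function classify(s: Signals): Verdict {
  // robots.txt is read before fetching, so any noindex on the page is never seen.
  if (s.disallowedByRobotsTxt) return "crawl-blocked";
  // The page was fetched, so either page-level directive takes effect.
  if (s.metaNoindex || s.headerNoindex) return "noindex";
  return "indexable";
}
```

A "crawl-blocked" verdict on a page that also carries noindex is the paradox case: the directive exists but is never read, so the robots.txt rule is the signal to fix first.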

Exceptions

Staging, utility, login, account, or internal search pages may intentionally use different crawl or index signals if they are not meant to rank.
Temporary migration states can produce noisy intermediate signals; flag the live production URL pattern, not one-off transition artifacts.
When redirects, canonicals, robots directives, or indexability signals conflict, fix the strongest final signal first instead of reporting every downstream symptom as a separate blocker.

Standards

Use these references as the standard for the final search-facing HTML, metadata, and crawl behavior.
Check the implementation against Google Search Central: "Prevent Google from indexing pages" before treating the rule as satisfied.
Check the implementation against Google Search Central: "Robots meta tag" before treating the rule as satisfied.

Verification

Automated Checks

Use the Page indexing report in Google Search Console (formerly the Coverage report) to find pages listed under reasons such as "Excluded by 'noindex' tag", "Blocked by robots.txt", or "Indexed, though blocked by robots.txt".
Use URL Inspection in Search Console for specific URLs.

Manual Checks

Export your top 50–100 pages by importance (traffic, revenue, links).
For each, check: meta robots, X-Robots-Tag header, robots.txt.
Fix blocks and request re-indexing.
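
The manual pass above can be recorded as data and summarized. The sketch below assumes you have already noted the three signals for each exported URL. The PageCheck and BlockedPage shapes and the findBlockedPages function are hypothetical, as is any sample data you feed it.

```typescript
// Hypothetical shapes for illustration; not part of any library.
interface PageCheck {
  url: string;
  metaNoindex: boolean;
  headerNoindex: boolean;
  robotsTxtDisallowed: boolean;
}

interface BlockedPage {
  url: string;
  reasons: string[];
}

// Return only the pages with at least one block, listing every reason found.
function findBlockedPages(checks: PageCheck[]): BlockedPage[] {
  const blocked: BlockedPage[] = [];
  for (const check of checks) {
    const reasons: string[] = [];
    if (check.metaNoindex) reasons.push("meta robots noindex");
    if (check.headerNoindex) reasons.push("X-Robots-Tag noindex");
    if (check.robotsTxtDisallowed) reasons.push("robots.txt disallow");
    if (reasons.length > 0) blocked.push({ url: check.url, reasons });
  }
  return blocked;
}
```

Listing every reason per URL, rather than stopping at the first, helps avoid fixing one block only to find the page still held back by another.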

Further Reading

Tools and supplementary material for exploring the topic in more depth.

Create and submit a robots.txt file | Google crawling infrastructure | Crawling infrastructure | Google for Developers

A robots.txt file lives at the root of your site. Learn how to create a robots.txt file, see examples, and explore robots.txt rules.

Google for Developers, Guide

Block Search indexing with "noindex" | Google Search Central | Documentation | Google for Developers

A noindex tag can prevent Google from indexing a page so that it does not appear in search results. Learn in this guide how to implement…

Google for Developers, Guide

Related rules

Rules that often go hand-in-hand with this one.

Avoid conflicting indexability signals

Detects conflicting signals between robots.txt, meta robots, X-Robots-Tag headers, and canonical tags

SEO

Set robots meta directives correctly

Checks robots meta tag for valid indexing directives in the page head.

SEO

Robots Meta Conflict

Detects pages blocked by robots.txt that also carry noindex meta tags, creating a paradox where the directive is never read.

SEO

Schema + Noindex Conflict

Detects pages that carry rich result schema markup but are blocked from indexing via noindex or robots.txt.

SEO
