Vercel doesn't give you traditional server config access, but you've got options. Here's how to block AI crawlers on Next.js projects hosted on Vercel.

Method 1: robots.txt (simplest)

Create public/robots.txt:

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: Anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: *
Allow: /

This only works for crawlers that fetch robots.txt and choose to honor it. Compliance is voluntary, and some bots (Bytespider is the one most often reported) ignore it.
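
If you'd rather not maintain that file by hand, here's a minimal sketch that builds the same kind of rules from a list. The names BLOCKED_CRAWLERS and buildRobotsTxt are made up for this example, and the list is shortened; how you write the result to public/robots.txt (a build script, for instance) is up to you.

```typescript
// Hypothetical helper: builds robots.txt text from a list of crawler tokens.
// The list is a shortened copy of the rules above.
const BLOCKED_CRAWLERS: string[] = ['GPTBot', 'ClaudeBot', 'CCBot', 'Bytespider'];

function buildRobotsTxt(blocked: string[]): string {
  // One "User-agent / Disallow" group per blocked crawler.
  const groups = blocked.map(bot => `User-agent: ${bot}\nDisallow: /`);
  // Everyone else stays allowed.
  groups.push('User-agent: *\nAllow: /');
  // Groups are separated by a blank line; the file ends with a newline.
  return groups.join('\n\n') + '\n';
}

const robotsTxt: string = buildRobotsTxt(BLOCKED_CRAWLERS);
```

Keeping the list in one place also makes it easier to reuse the same names in middleware later.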

Method 2: Next.js Middleware (recommended)

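Middleware lets you intercept requests before they reach your pages. This is actual enforcement, not just a suggestion, at least for crawlers that identify themselves honestly: the check relies on the User-Agent header, which a bot can spoof.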

Create middleware.ts in your project root (or src/ if you use that structure):

import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

const AI_BOTS = [
  'GPTBot',
  'ChatGPT-User',
  'ChatGPT',
  'OAI-SearchBot',
  'ClaudeBot',
  'Claude-Web',
  'Anthropic',
  'Google-Extended', // robots.txt-only token: Google doesn't send it as a User-Agent, so this entry never matches
  'CCBot',
  'Bytespider',
  'PerplexityBot',
  'Omgilibot',
  'FacebookBot',
  'Meta-ExternalAgent',
];

export function middleware(request: NextRequest) {
  const userAgent = request.headers.get('user-agent') || '';

  // Check if User-Agent matches any AI bot
  const isAIBot = AI_BOTS.some(bot =>
    userAgent.toLowerCase().includes(bot.toLowerCase())
  );

  if (isAIBot) {
    return new NextResponse('Forbidden', { status: 403 });
  }

  return NextResponse.next();
}

// Apply to all routes
export const config = {
  matcher: '/((?!_next/static|_next/image|favicon.ico).*)',
};

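This returns a 403 to matching crawlers on every route the matcher covers, while requests for Next.js static assets skip the middleware entirely. Files in public/, including robots.txt, are still covered by this matcher, so a blocked bot also gets a 403 when it asks for robots.txt.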
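
The User-Agent check is easier to test if you pull it out of the middleware into a plain function. Here's a minimal sketch, assuming a shortened bot list; AI_BOT_TOKENS, LOWERCASE_TOKENS, and isAIBot are names invented for this example.

```typescript
// Hypothetical helper; the token list is a shortened copy of AI_BOTS above.
const AI_BOT_TOKENS: string[] = ['GPTBot', 'ClaudeBot', 'CCBot', 'Bytespider'];

// Lowercase the tokens once at module load instead of on every request.
const LOWERCASE_TOKENS: string[] = AI_BOT_TOKENS.map(token => token.toLowerCase());

function isAIBot(userAgent: string | null): boolean {
  // A missing header is treated as "not a known bot".
  if (!userAgent) return false;
  const ua = userAgent.toLowerCase();
  // Case-insensitive substring match, same logic as the middleware above.
  return LOWERCASE_TOKENS.some(token => ua.includes(token));
}
```

Inside the middleware you would call it as isAIBot(request.headers.get('user-agent')) and return the 403 when it's true.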

Middleware with logging

Want to see what's being blocked?

import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

const AI_BOTS = [
  'GPTBot',
  'ClaudeBot',
  'Google-Extended', // robots.txt-only token: never sent as a User-Agent, so it won't show up in these logs
  'CCBot',
  'Bytespider',
];

export function middleware(request: NextRequest) {
  const userAgent = request.headers.get('user-agent') || '';
  const isAIBot = AI_BOTS.some(bot =>
    userAgent.toLowerCase().includes(bot.toLowerCase())
  );

  if (isAIBot) {
    // Log blocked requests (visible in Vercel function logs)
    console.log(`Blocked AI bot: ${userAgent} from ${request.url}`);
    return new NextResponse('Forbidden', { status: 403 });
  }

  return NextResponse.next();
}

export const config = {
  matcher: '/((?!_next/static|_next/image|favicon.ico).*)',
};

Check Vercel Dashboard → Logs to see blocked requests.
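
Plain-text log lines get hard to filter once there are a lot of them. One option is to log a small JSON object that names the bot that matched. This is only a sketch: LOGGED_BOTS, findBot, and blockedLogLine are hypothetical names, and the field names in the JSON are arbitrary.

```typescript
// Hypothetical helpers for structured logging; names and fields are illustrative.
const LOGGED_BOTS: string[] = ['GPTBot', 'ClaudeBot', 'CCBot', 'Bytespider'];

// Returns the first bot name found in the User-Agent, or null if none match.
function findBot(userAgent: string): string | null {
  const ua = userAgent.toLowerCase();
  const match = LOGGED_BOTS.find(bot => ua.includes(bot.toLowerCase()));
  return match === undefined ? null : match;
}

// Builds one JSON log line for a blocked request, or null if nothing was blocked.
function blockedLogLine(userAgent: string, pathname: string): string | null {
  const bot = findBot(userAgent);
  if (bot === null) return null;
  return JSON.stringify({ event: 'ai_bot_blocked', bot, path: pathname, userAgent });
}
```

In the middleware you'd pass request.nextUrl.pathname as the second argument and console.log the result when it isn't null.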

Method 3: vercel.json headers

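You can't directly block with vercel.json, but you can attach response headers. The X-Robots-Tag header is read by the crawlers themselves (not by reverse proxies or CDNs), so like robots.txt it's a request, not enforcement. Limited use case, but here's an example: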

{
  "headers": [
    {
      "source": "/(.*)",
      "headers": [
        {
          "key": "X-Robots-Tag",
          "value": "noai, noimageai"
        }
      ]
    }
  ]
}

The noai and noimageai values are non-standard and not widely respected yet. Stick with middleware for actual blocking.

Rate limiting (Edge Runtime)

You can also implement simple rate limiting in middleware, which runs on Vercel's Edge Runtime. This slows AI bots down instead of blocking them outright:

import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

// Simple in-memory rate limiting (resets on cold start)
const requestCounts = new Map<string, { count: number; timestamp: number }>();
const WINDOW_MS = 60000; // 1 minute
const MAX_REQUESTS = 10;

export function middleware(request: NextRequest) {
  const userAgent = request.headers.get('user-agent') || '';

  const isAIBot = ['GPTBot', 'ClaudeBot', 'Bytespider'].some(bot =>
    userAgent.toLowerCase().includes(bot.toLowerCase())
  );

  if (isAIBot) {
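    // request.ip was removed in Next.js 15; the x-forwarded-for header works across versions
    const ip = request.headers.get('x-forwarded-for')?.split(',')[0]?.trim() || 'unknown';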
    const now = Date.now();
    const record = requestCounts.get(ip);

    if (record && now - record.timestamp < WINDOW_MS) {
      record.count++;
      if (record.count > MAX_REQUESTS) {
        return new NextResponse('Rate limited', { status: 429 });
      }
    } else {
      requestCounts.set(ip, { count: 1, timestamp: now });
    }
  }

  return NextResponse.next();
}

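Note: The counters live in the memory of a single function instance. They reset on cold start and aren't shared between instances or regions, so a bot whose requests land on different instances can exceed the limit. For production rate limiting, use a shared store such as Vercel KV or an external service.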
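
The counting logic is easier to reason about (and test) when it's separated from the request handling. Below is a minimal sketch of the same fixed-window idea, with the current time passed in so it can be tested without waiting. createRateLimiter and WindowRecord are names invented here, and like the version above it keeps state in memory and never evicts old keys.

```typescript
interface WindowRecord {
  count: number;
  windowStart: number;
}

// Hypothetical helper: a fixed-window counter keyed by client (for example, IP).
function createRateLimiter(maxRequests: number, windowMs: number) {
  const records = new Map<string, WindowRecord>();

  // Returns true if the request is allowed, false if it should get a 429.
  return function allow(key: string, now: number): boolean {
    const record = records.get(key);
    if (!record || now - record.windowStart >= windowMs) {
      // First request from this key, or the previous window has expired.
      records.set(key, { count: 1, windowStart: now });
      return true;
    }
    record.count += 1;
    return record.count <= maxRequests;
  };
}

// 10 requests per minute, matching MAX_REQUESTS and WINDOW_MS above.
const allowBotRequest = createRateLimiter(10, 60000);
```

In the middleware you'd call allowBotRequest(ip, Date.now()) and return the 429 response when it comes back false.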

Partial blocking

Block AI from specific routes only:

import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

const AI_BOTS = ['GPTBot', 'ClaudeBot', 'Bytespider'];
const PROTECTED_PATHS = ['/premium', '/members', '/api'];

export function middleware(request: NextRequest) {
  const userAgent = request.headers.get('user-agent') || '';
  const pathname = request.nextUrl.pathname;

  const isAIBot = AI_BOTS.some(bot =>
    userAgent.toLowerCase().includes(bot.toLowerCase())
  );

  const isProtectedPath = PROTECTED_PATHS.some(path =>
    pathname.startsWith(path)
  );

  if (isAIBot && isProtectedPath) {
    return new NextResponse('Forbidden', { status: 403 });
  }

  return NextResponse.next();
}

export const config = {
  matcher: '/((?!_next/static|_next/image|favicon.ico).*)',
};

This blocks AI bots from /premium, /members, and /api while allowing them to access public content.
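
One thing to watch: a bare startsWith check also matches paths that merely share a prefix, so '/api' would catch '/apiary' too. If that matters for your routes, here's a sketch of a stricter check that only matches whole path segments. PROTECTED_PREFIXES and isProtectedPath are names made up for this example.

```typescript
// Hypothetical helper: same prefixes as PROTECTED_PATHS above.
const PROTECTED_PREFIXES: string[] = ['/premium', '/members', '/api'];

function isProtectedPath(pathname: string): boolean {
  return PROTECTED_PREFIXES.some(
    // Match the prefix itself or anything nested under it, but not
    // unrelated paths that only start with the same characters.
    prefix => pathname === prefix || pathname.startsWith(prefix + '/')
  );
}
```

With this version, /api and /api/users are protected, while /apiary and /premium-preview are not.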

Testing locally

Run your Next.js dev server and test with curl:

curl -A "GPTBot/1.0" -I http://localhost:3000/

This should return 403 Forbidden if your middleware is working. Run it again without the -A flag to confirm that normal requests still get through.

Testing on Vercel

After deploying:

curl -A "GPTBot/1.0" -I https://yoursite.vercel.app/

Check Vercel Dashboard → Logs for the blocked request.

Common issues

Middleware not running

Check that the file is named middleware.ts or middleware.js (not middleware.tsx)
Check that it's in the right location (project root, or inside src/ if your app/ or pages/ directory lives there)
Check the matcher config isn't too restrictive

Static files being blocked

Make sure your matcher excludes static paths:

export const config = {
  matcher: '/((?!_next/static|_next/image|favicon.ico|.*\\.(?:svg|png|jpg|jpeg|gif|webp)$).*)',
};
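
If you're not sure which paths a matcher like this covers, you can sanity-check the pattern locally. The sketch below rewrites it as a plain RegExp. That's an approximation, since Next.js compiles matcher strings itself, so treat the results as a rough guide. MATCHER_APPROX and runsMiddleware are names invented for this example.

```typescript
// Approximation of the matcher above as a plain RegExp, for local checks only.
// Next.js parses matcher strings itself, so edge cases may differ.
const MATCHER_APPROX =
  /^\/(?!_next\/static|_next\/image|favicon\.ico|.*\.(?:svg|png|jpg|jpeg|gif|webp)$).*$/;

// Returns true if the middleware would (approximately) run for this path.
function runsMiddleware(pathname: string): boolean {
  return MATCHER_APPROX.test(pathname);
}
```

Under this approximation, page routes like / and /premium/post run the middleware, while /_next/static/chunk.js, /favicon.ico, and /logo.png skip it.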

Cold start logging gaps

Edge functions have cold starts, and anything you keep in memory (counters, rate-limit maps, buffered log entries) is lost when a new instance starts. For persistent logging, use Vercel's built-in function logs or integrate with an external service.

App Router vs Pages Router

The middleware above works with both. Just make sure:

App Router: middleware.ts in root or src/
Pages Router: Same location

The config export works identically.

Using with Cloudflare

If you front Vercel with Cloudflare, you can do bot blocking there instead (see our Cloudflare guide). But the middleware approach works fine on its own; you don't need both unless you want defense in depth.

My recommendation

Add robots.txt in public/ for documentation and compliant bots
Add middleware for actual enforcement
Check Vercel logs to verify it's working

How to Block AI Crawlers on Vercel (Next.js)