How to Block AI Crawlers on Netlify
Netlify doesn't give you a traditional server config, but between robots.txt and Edge Functions you can block AI crawlers effectively. Here's how.
Method 1: robots.txt (simplest)
Create public/robots.txt (or robots.txt in the repo root for plain static sites) so the file ends up at /robots.txt on the published site:
User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: Claude-Web
Disallow: /
User-agent: Anthropic-ai
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: Googlebot
Allow: /
User-agent: Bingbot
Allow: /
User-agent: *
Allow: /
This is a polite request. Well-behaved bots honor it; others don't.
Use our robots.txt generator to create these rules automatically.
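If you would rather generate the file from a list in a build script than maintain it by hand, the rules above reduce to a small template. The following is a minimal sketch, not part of Netlify or any library; `BLOCKED_BOTS` and `buildRobotsTxt` are hypothetical names chosen for illustration.

```typescript
// Hypothetical helper: renders robots.txt rules from a list of crawler tokens.
const BLOCKED_BOTS: string[] = [
  "GPTBot",
  "ChatGPT-User",
  "ClaudeBot",
  "Claude-Web",
  "Anthropic-ai",
  "Google-Extended",
  "CCBot",
  "Bytespider",
  "PerplexityBot",
];

function buildRobotsTxt(blocked: string[]): string {
  // One "User-agent / Disallow" group per blocked crawler.
  const groups = blocked.map((bot) => `User-agent: ${bot}\nDisallow: /`);
  // Every other crawler stays allowed.
  groups.push("User-agent: *\nAllow: /");
  // Blank lines between groups keep the file readable.
  return groups.join("\n\n") + "\n";
}

console.log(buildRobotsTxt(BLOCKED_BOTS));
```

Write the returned string to robots.txt as part of your build, so the list lives in one place.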
Method 2: Edge Functions (recommended)
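Netlify Edge Functions run at the edge, intercepting requests before they hit your site. Unlike robots.txt, this is real enforcement: the bot gets a 403 whether or not it chooses to comply. It only catches crawlers that identify themselves, though; a bot that sends a browser User-Agent will still get through.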
Setup
Create netlify/edge-functions/block-ai-bots.ts:
import type { Config, Context } from "@netlify/edge-functions";
const AI_BOTS = [
'GPTBot',
'ChatGPT-User',
'ChatGPT',
'OAI-SearchBot',
'ClaudeBot',
'Claude-Web',
'Anthropic',
'Google-Extended', // robots.txt token only; Google does not send it as a User-Agent, so this entry never matches
'CCBot',
'Bytespider',
'PerplexityBot',
'Omgilibot',
];
export default async (request: Request, context: Context) => {
const userAgent = request.headers.get('user-agent') || '';
const isAIBot = AI_BOTS.some(bot =>
userAgent.toLowerCase().includes(bot.toLowerCase())
);
if (isAIBot) {
return new Response('Forbidden', { status: 403 });
}
// Continue to the next handler (your site)
return context.next();
};
export const config: Config = {
path: "/*",
};
Configuration in netlify.toml
Alternatively, declare the route in netlify.toml instead of the inline config export (one of the two is enough):
[[edge_functions]]
path = "/*"
function = "block-ai-bots"
Excluding static assets
You probably don't want to run the function for every static asset. Refine the path:
[[edge_functions]]
path = "/*"
function = "block-ai-bots"
excludedPath = ["/_next/*", "/images/*", "/fonts/*", "/*.ico"]
Or in the function itself:
export default async (request: Request, context: Context) => {
const url = new URL(request.url);
// Skip static assets
if (url.pathname.match(/\.(js|css|png|jpg|svg|ico|woff|woff2)$/)) {
return context.next();
}
const userAgent = request.headers.get('user-agent') || '';
const isAIBot = AI_BOTS.some(bot =>
userAgent.toLowerCase().includes(bot.toLowerCase())
);
if (isAIBot) {
return new Response('Forbidden', { status: 403 });
}
return context.next();
};
Method 3: Netlify redirects (limited)
You can't block by User-Agent with _redirects or netlify.toml redirects, because redirect rules can't match on the User-Agent header. Use Edge Functions instead.
However, you can use redirects to serve different robots.txt files:
# netlify.toml - not useful for User-Agent blocking
[[redirects]]
from = "/robots.txt"
to = "/robots-blocking.txt"
status = 200
force = true
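As written, this rewrite applies to every request. It only becomes conditional if you add one of the conditions that redirects do support (country, language, role, or cookie), so the use case is limited.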
Logging blocked requests
import type { Config, Context } from "@netlify/edge-functions";
const AI_BOTS = ['GPTBot', 'ClaudeBot', 'Bytespider'];
export default async (request: Request, context: Context) => {
const userAgent = request.headers.get('user-agent') || '';
const isAIBot = AI_BOTS.some(bot =>
userAgent.toLowerCase().includes(bot.toLowerCase())
);
if (isAIBot) {
console.log(`Blocked: ${userAgent} from ${request.url}`);
return new Response('Forbidden', { status: 403 });
}
return context.next();
};
export const config: Config = {
path: "/*",
};
Check Netlify Dashboard → Functions → Logs to see blocked requests.
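Raw User-Agent strings are hard to aggregate once the log grows. One option is to log a short JSON record that names the matched bot and the path. This is a sketch only; `matchBot` and `blockedLogEntry` are hypothetical helper names, and the field names in the record are an assumption rather than any Netlify convention.

```typescript
const AI_BOTS: string[] = ["GPTBot", "ClaudeBot", "Bytespider"];

// Returns the first list entry found in the User-Agent, or null if none match.
function matchBot(userAgent: string): string | null {
  const lowered = userAgent.toLowerCase();
  return AI_BOTS.find((bot) => lowered.includes(bot.toLowerCase())) ?? null;
}

// Builds one JSON line per blocked request (query string is dropped).
function blockedLogEntry(bot: string, requestUrl: string, at: Date): string {
  const { pathname } = new URL(requestUrl);
  return JSON.stringify({
    event: "ai_bot_blocked",
    bot,
    path: pathname,
    at: at.toISOString(),
  });
}

const sampleUserAgent = "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)";
const matchedBot = matchBot(sampleUserAgent);
if (matchedBot !== null) {
  console.log(blockedLogEntry(matchedBot, "https://example.com/blog/post?x=1", new Date()));
}
```

Inside the Edge Function, you would pass `request.url` and `new Date()` and hand the result to `console.log` before returning the 403.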
Rate limiting
Netlify Edge Functions can implement rate limiting, though it's a bit more involved:
import type { Config, Context } from "@netlify/edge-functions";
// Simple rate limiting with edge KV (Blob store)
// Note: This is pseudocode - actual implementation depends on your KV setup
const AI_BOTS = ['GPTBot', 'ClaudeBot', 'Bytespider'];
const MAX_REQUESTS = 10;
const WINDOW_SECONDS = 60;
export default async (request: Request, context: Context) => {
const userAgent = request.headers.get('user-agent') || '';
const isAIBot = AI_BOTS.some(bot =>
userAgent.toLowerCase().includes(bot.toLowerCase())
);
if (isAIBot) {
// For proper rate limiting, use Netlify Blobs or external KV
// This is a simplified example
const ip = context.ip;
console.log(`AI bot request from ${ip}: ${userAgent}`);
// Implement your rate limiting logic here
// For now, just block outright
return new Response('Forbidden', { status: 403 });
}
return context.next();
};
For production rate limiting, consider using Netlify Blobs for state or a service like Upstash Redis.
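To make the "rate limiting logic" placeholder concrete, here is a minimal fixed-window counter. It is a sketch under a significant assumption: the counts live in memory, so each running instance of the function keeps its own tally and the limit is approximate at best. `allowRequest` and `counters` are hypothetical names; a shared store such as those mentioned above is still what you would want for strict limits.

```typescript
const MAX_REQUESTS = 10;
const WINDOW_SECONDS = 60;

interface WindowState {
  windowStart: number; // ms timestamp at which the current window began
  count: number; // requests seen so far in the current window
}

// Assumption: in-memory state, not shared between instances and never pruned.
const counters = new Map<string, WindowState>();

// Returns true if the request is within the limit, false if it should be rejected.
function allowRequest(key: string, nowMs: number): boolean {
  const state = counters.get(key);
  if (!state || nowMs - state.windowStart >= WINDOW_SECONDS * 1000) {
    // First request from this key, or the previous window has expired.
    counters.set(key, { windowStart: nowMs, count: 1 });
    return true;
  }
  state.count += 1;
  return state.count <= MAX_REQUESTS;
}

// Example: eleven requests at the same instant from one address.
for (let i = 1; i <= 11; i++) {
  console.log(i, allowRequest("203.0.113.7", 1000));
}
```

In the handler, you could call `allowRequest(context.ip, Date.now())` for detected bots and return a 429 response instead of a 403 when it reports false.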
Partial blocking
Block AI from specific paths only:
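import type { Context } from "@netlify/edge-functions";
const AI_BOTS = ['GPTBot', 'ClaudeBot', 'Bytespider'];
const PROTECTED_PATHS = ['/premium', '/members', '/api'];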
export default async (request: Request, context: Context) => {
const userAgent = request.headers.get('user-agent') || '';
const url = new URL(request.url);
const isAIBot = AI_BOTS.some(bot =>
userAgent.toLowerCase().includes(bot.toLowerCase())
);
const isProtectedPath = PROTECTED_PATHS.some(path =>
url.pathname.startsWith(path)
);
if (isAIBot && isProtectedPath) {
return new Response('Forbidden', { status: 403 });
}
return context.next();
};
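One thing to watch with a plain `startsWith` check: the prefix `/api` also matches unrelated paths such as `/apiary`. If that matters for your site, a segment-aware comparison avoids it. A small sketch, with `isProtectedPath` as a hypothetical helper name:

```typescript
const PROTECTED_PATHS: string[] = ["/premium", "/members", "/api"];

// True when pathname is a protected prefix itself or sits below one.
function isProtectedPath(pathname: string): boolean {
  return PROTECTED_PATHS.some(
    (prefix) => pathname === prefix || pathname.startsWith(prefix + "/")
  );
}

console.log(isProtectedPath("/api/users")); // true
console.log(isProtectedPath("/apiary")); // false
```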
Testing locally
Use Netlify CLI:
netlify dev
Then test:
curl -A "GPTBot/1.0" -I http://localhost:8888/
Testing on Netlify
After deploying:
curl -A "GPTBot/1.0" -I https://yoursite.netlify.app/
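This should return 403 Forbidden. Repeat the request with a normal browser User-Agent to confirm that regular visitors still get through.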
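If you want to repeat these checks after every deploy, the curl commands can be wrapped in a short script. The sketch below is illustrative: `checkBlocking`, `CASES`, and the `Fetcher` type are hypothetical names, and the expected 200 for the browser case assumes your home page normally responds with 200.

```typescript
// Minimal shape of the fetch call this script needs, so a fake can be swapped in.
type Fetcher = (
  url: string,
  init: { method: string; headers: Record<string, string> }
) => Promise<{ status: number }>;

const CASES: Array<{ userAgent: string; expected: number }> = [
  { userAgent: "GPTBot/1.0", expected: 403 },
  { userAgent: "Mozilla/5.0 (compatible; ClaudeBot/1.0)", expected: 403 },
  { userAgent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0", expected: 200 },
];

// Sends one HEAD request per case and returns a description of each mismatch.
async function checkBlocking(baseUrl: string, fetcher: Fetcher): Promise<string[]> {
  const failures: string[] = [];
  for (const { userAgent, expected } of CASES) {
    const res = await fetcher(baseUrl, {
      method: "HEAD",
      headers: { "user-agent": userAgent },
    });
    if (res.status !== expected) {
      failures.push(`${userAgent}: expected ${expected}, got ${res.status}`);
    }
  }
  return failures;
}
```

On a runtime with a global `fetch` (recent Node.js or Deno), calling `checkBlocking("http://localhost:8888/", fetch)` against `netlify dev` should resolve to an empty array when the function behaves as intended.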
Framework-specific notes
Next.js on Netlify
You can use Next.js middleware instead of Netlify Edge Functions. Either works, but Next.js middleware might feel more natural if you're already using Next.js features.
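For reference, the same check written as Next.js middleware might look like the sketch below. It assumes Next.js 13.1 or later, where middleware may return a response directly, and it is typed against the standard `Request` so the snippet needs no imports; in a real project you would more commonly use `NextRequest` and `NextResponse` from `next/server`. The shortened bot list is for illustration only.

```typescript
// middleware.ts (project root, or src/ if your project uses a src directory)
const AI_BOTS: string[] = ["GPTBot", "ClaudeBot", "Bytespider"];

export function middleware(request: Request): Response | undefined {
  const userAgent = (request.headers.get("user-agent") || "").toLowerCase();
  const isAIBot = AI_BOTS.some((bot) => userAgent.includes(bot.toLowerCase()));
  if (isAIBot) {
    return new Response("Forbidden", { status: 403 });
  }
  // Returning nothing lets the request continue to the matched route.
  return undefined;
}

// Skip Next.js build assets and the favicon.
export const config = {
  matcher: ["/((?!_next/static|_next/image|favicon.ico).*)"],
};
```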
Gatsby
Gatsby generates static files. Use Netlify Edge Functions as shown above.
Hugo / Jekyll / other static
Same approach—Edge Functions work regardless of your static site generator.
Common issues
Edge Function not running
- Check netlify.toml configuration
- Verify function file is in netlify/edge-functions/
- Check Netlify Dashboard → Functions for errors
Wrong response
- Check your logic isn't too broad (accidentally blocking real users)
- Test with specific User-Agent strings
Performance concerns
Edge Functions are fast, but if you're worried:
- Exclude static asset paths
- Keep the bot list short
- Use early return for non-bot requests
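The last point can be sketched as a single case-insensitive pattern built once at module load, with an early exit for requests that carry no User-Agent. Whether this is measurably faster than the `some(...)` loop for a list this small is not something to assume; treat it mainly as a tidiness choice. `escapeRegExp`, `AI_BOT_PATTERN`, and `isAIBot` are hypothetical names.

```typescript
const AI_BOTS: string[] = [
  "GPTBot",
  "ChatGPT-User",
  "ClaudeBot",
  "CCBot",
  "Bytespider",
  "PerplexityBot",
];

// Escape regex metacharacters so list entries are matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Built once when the module loads and reused for every request.
const AI_BOT_PATTERN = new RegExp(AI_BOTS.map(escapeRegExp).join("|"), "i");

function isAIBot(userAgent: string | null): boolean {
  // Early return: a missing or empty User-Agent skips the pattern test.
  if (!userAgent) return false;
  return AI_BOT_PATTERN.test(userAgent);
}

console.log(isAIBot("Mozilla/5.0 (compatible; GPTBot/1.0)")); // true
console.log(isAIBot("Mozilla/5.0 (Macintosh) Safari/605.1.15")); // false
```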
Using with Cloudflare
If you front Netlify with Cloudflare (via CNAME), you can do bot blocking in Cloudflare instead. See our Cloudflare guide. But Netlify Edge Functions work fine alone.
My recommendation
- Add robots.txt for documentation and compliant bots
- Add Edge Function for actual enforcement
- Check Netlify function logs to verify