Tags: wordpress, robots.txt, .htaccess, plugins

How to Block AI Crawlers on WordPress

December 1, 2024 (Updated: December 5, 2024) · 5 min read

Running WordPress? You've got several options for blocking AI crawlers, from simple plugins to manual configuration. Here's what actually works.

Method 1: Edit robots.txt (easiest)

WordPress generates a virtual robots.txt by default. You can override it with a physical file or use a plugin.

Using a plugin

The easiest approach is using a plugin like Yoast SEO or Rank Math:

Yoast SEO:

  1. Go to Yoast SEO → Tools → File Editor
  2. Edit your robots.txt
  3. Add the blocking rules

Rank Math:

  1. Go to Rank Math → General Settings → Edit robots.txt
  2. Add your rules

Add these lines:

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: PerplexityBot
Disallow: /
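Per the Robots Exclusion Protocol, several User-agent lines can also share a single rule group, which keeps the file shorter and is equivalent to the per-bot version above:

```
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: Google-Extended
User-agent: anthropic-ai
User-agent: CCBot
User-agent: Bytespider
User-agent: PerplexityBot
Disallow: /
```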

Manual robots.txt

Alternatively, create a physical robots.txt file in your WordPress root directory (where wp-config.php lives). This overrides WordPress's virtual one.
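If you have SSH access, a minimal sketch of creating that file, run from the WordPress root (the same directory as wp-config.php) — note it overwrites any existing robots.txt:

```shell
# Create a physical robots.txt; WordPress serves its virtual one
# only when no real file exists, so this takes precedence.
cat > robots.txt <<'EOF'
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
EOF

# Confirm the rules were written (counts the User-agent lines)
grep -c 'User-agent' robots.txt
```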

Want to skip the copy-paste?

Use our robots.txt generator to create these rules automatically.

Try robots.txt Generator

Method 2: .htaccess rules

WordPress already uses .htaccess for permalinks. You can add bot-blocking rules there.

Finding your .htaccess

Your .htaccess file is in your WordPress root directory. You'll need FTP access or a file manager in your hosting control panel.

Adding the rules

Add this BEFORE the WordPress section (before # BEGIN WordPress):

# Block AI Crawlers
<IfModule mod_rewrite.c>
RewriteEngine On

# OpenAI
RewriteCond %{HTTP_USER_AGENT} GPTBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ChatGPT [NC,OR]

# Anthropic
RewriteCond %{HTTP_USER_AGENT} ClaudeBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Claude-Web [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Anthropic [NC,OR]

# Google AI
RewriteCond %{HTTP_USER_AGENT} Google-Extended [NC,OR]

# Others
RewriteCond %{HTTP_USER_AGENT} CCBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Bytespider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} PerplexityBot [NC]

RewriteRule .* - [F,L]
</IfModule>
# END Block AI Crawlers

Want to skip the copy-paste?

Use our .htaccess generator to create these rules automatically.

Try .htaccess Generator

Always back up your .htaccess before editing. A syntax error can take down your site.
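A dated backup is one shell command. The sketch below demonstrates it in a scratch directory with a sample file; on a live site, run just the cp line from your WordPress root:

```shell
# Demo setup in a scratch directory (on a real site, skip this and
# run the cp from your WordPress root instead)
mkdir -p /tmp/wp-demo && cd /tmp/wp-demo
printf '# BEGIN WordPress\n# END WordPress\n' > .htaccess

# Dated backup; restore later with: cp .htaccess.bak-DATE .htaccess
cp .htaccess ".htaccess.bak-$(date +%F)"
ls .htaccess*
```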

Method 3: Security plugins

Several WordPress security plugins can block bots by User-Agent.

Wordfence

  1. Go to Wordfence → Firewall → Blocking
  2. Create a new block rule
  3. Select "Block if User-Agent contains"
  4. Add: GPTBot (repeat for each bot)

Sucuri

  1. Go to Sucuri → Firewall (WAF)
  2. Access blocking rules
  3. Add User-Agent patterns

All In One WP Security

  1. Go to WP Security → Blacklist Manager
  2. Add User-Agents to block

WordPress-specific considerations

Cache plugins

If you use caching plugins (WP Super Cache, W3 Total Cache, LiteSpeed Cache), they might serve cached pages before .htaccess rules fire. Make sure your caching plugin doesn't bypass security rules.

Cloudflare + WordPress

If you're using Cloudflare with WordPress, set up blocking rules in Cloudflare instead of .htaccess. It's faster and more reliable:

  1. Cloudflare Dashboard → Security → WAF
  2. Create custom rule
  3. Block based on User-Agent
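For step 3, the custom rule's expression can look roughly like this (a sketch in Cloudflare's Rules expression language, with the action set to Block; extend the list with any other bots you block):

```
(http.user_agent contains "GPTBot") or
(http.user_agent contains "ClaudeBot") or
(http.user_agent contains "Bytespider") or
(http.user_agent contains "PerplexityBot")
```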

CDN considerations

If you serve content from a CDN, the CDN might cache and serve pages to bots before your origin server sees the request. Check your CDN's bot management options.

Which method should you use?

| Situation | Recommendation |
|-----------|----------------|
| Simple blocking, respects robots.txt | robots.txt (Method 1) |
| Need hard blocking | .htaccess (Method 2) |
| Already using security plugin | Use plugin's blocking (Method 3) |
| Using Cloudflare | Cloudflare WAF rules |
| Maximum protection | All of the above |

I typically recommend robots.txt + .htaccess. The robots.txt catches well-behaved bots, and .htaccess catches the rest, like Bytespider, which ignores robots.txt.

Testing

After implementing your blocks, test them:

robots.txt check

Visit https://yoursite.com/robots.txt and verify your rules are visible.

.htaccess check

curl -A "GPTBot/1.0" -I https://yoursite.com/

It should return 403 Forbidden. A 200 means the rules aren't firing.

Monitor your logs

If you have access to server logs (many WordPress hosts provide this), check for bot requests:

grep -i "gptbot\|claudebot\|bytespider" access.log | tail -20

Look for 403 responses.
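The grep above can be extended to show which status each bot actually received. A self-contained sketch with a sample log line (on a real server, point it at your access.log instead):

```shell
# Create a sample combined-format log entry for the demo;
# on a live server, use your real access.log instead.
cat > /tmp/sample-access.log <<'EOF'
1.2.3.4 - - [01/Dec/2024:10:00:00 +0000] "GET / HTTP/1.1" 403 153 "-" "GPTBot/1.0"
5.6.7.8 - - [01/Dec/2024:10:01:00 +0000] "GET /blog HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
EOF

# Print the status code and user agent for each AI-bot request
grep -iE 'gptbot|claudebot|bytespider' /tmp/sample-access.log \
  | awk '{print $9, $NF}'
# → 403 "GPTBot/1.0"
```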

Common issues

"My robots.txt changes aren't showing"

WordPress caches robots.txt. Clear your cache or wait a few minutes. Also check if a plugin is overriding your changes.

".htaccess changes break my site"

Undo your changes immediately. Common causes:

  • Syntax error in the rewrite rules
  • Conflicting rules
  • mod_rewrite not enabled on your server

"Bots still getting through"

  • Check that .htaccess is actually being processed
  • Verify you're blocking the correct User-Agent strings
  • Some bots use multiple User-Agent variants

Plugin recommendations

While not specifically for AI crawlers, these plugins help with bot management:

  • Blackhole for Bad Bots — Creates a honeypot trap for malicious bots
  • Limit Login Attempts — Helps with bot-based attacks
  • Wordfence — Comprehensive security including bot blocking

Most people don't need a dedicated AI blocker plugin. The robots.txt + .htaccess approach works fine and doesn't add plugin overhead.

My setup

On my WordPress sites, I use:

  1. robots.txt via Rank Math (for documentation and compliant bots)
  2. .htaccess rules (for enforcement, especially Bytespider)
  3. Cloudflare free tier (for additional protection)

This three-layer approach catches virtually everything without requiring premium plugins or complex configurations.



Ready to block AI crawlers?

Use our free generators to create your blocking rules in seconds.