How to Check if Your Squarespace Site Blocks AI Bots — and How to Fix It

As AI search tools like ChatGPT, Perplexity, and Claude become mainstream discovery platforms, website owners are beginning to notice a subtle but important problem: Squarespace blocks most AI crawlers by default.

That means your blog posts, portfolio pages, and creative content might be invisible to large language models (LLMs) — even if your site looks perfect to Google.

Here’s how to check if that’s happening, why it matters, and what you can do to fix it.

1. Check your Squarespace robots.txt

Every Squarespace site automatically generates a robots.txt file — the file that tells web crawlers what they can and can’t access.

To view yours, go to:

https://www.yourdomain.com/robots.txt

If your site is on Squarespace, you’ll likely see something like this:
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: *
Disallow: /config
Disallow: /search
Allow: /

This looks harmless, but those first blocks are the problem.
They tell AI crawlers such as GPTBot (OpenAI’s crawler, which feeds ChatGPT) and ClaudeBot (Anthropic’s crawler) that they may not crawl any part of your site.

In other words, Squarespace’s default configuration stops AI bots from seeing your content — even if you want them to.
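
If you prefer to check this programmatically, here’s a minimal sketch using Python’s standard-library urllib.robotparser. The www.yourdomain.com address is a placeholder for your own domain, and the script simply reports whatever your live robots.txt says:

from urllib import robotparser

# Placeholder: replace with your own domain.
SITE = "https://www.yourdomain.com"

parser = robotparser.RobotFileParser()
parser.set_url(SITE + "/robots.txt")
parser.read()  # fetch and parse the live robots.txt

# Ask whether each crawler may fetch the homepage under those rules.
for bot in ("GPTBot", "ClaudeBot", "CCBot", "Googlebot"):
    allowed = parser.can_fetch(bot, SITE + "/")
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")

With Squarespace’s default file, GPTBot and ClaudeBot should come back as blocked while Googlebot stays allowed.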

2. Why this matters

Blocking AI bots doesn’t affect traditional SEO — Googlebot and Bingbot are still allowed.

But it does affect visibility in new AI-driven search and content ecosystems.

That includes:

  • ChatGPT’s web results (via GPTBot)

  • Perplexity AI’s citations

  • Anthropic Claude’s retrieval

  • AI-powered assistants embedded in browsers and search tools

If your content is blocked, these systems can’t reference or cite your site — meaning your brand might disappear from AI search entirely.

3. How to test AI crawler access

You can test whether your site is publicly readable to crawlers in two quick ways:

a. Use an HTTP status checker

Go to https://httpstatus.io and enter your domain:

https://www.yourdomain.com/

If you get a 200 OK response, your site is publicly accessible.
But that only means humans can reach it — not that AI bots are allowed.
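
The same check can be scripted. A rough Python equivalent (the URL is a placeholder; urlopen raises an HTTPError for 4xx/5xx responses, so a clean exit means the page is reachable):

import urllib.request

# Placeholder: replace with your own domain.
URL = "https://www.yourdomain.com/"

req = urllib.request.Request(URL, method="HEAD")
with urllib.request.urlopen(req) as resp:
    # 200 OK means the page is publicly reachable by ordinary HTTP clients.
    print(resp.status, resp.reason)

As with httpstatus.io, a 200 only proves the page is reachable, not that AI bots are permitted by robots.txt.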

b. Use OpenAI’s GPTBot checker

Visit https://openai.com/gptbot.
Scroll down to the “Check if your site allows GPTBot” section and enter your URL.
If the checker reports your site as blocked, Squarespace’s robots.txt is the cause.

4. Why you can’t fix it directly on Squarespace

Squarespace doesn’t currently let you edit or override robots.txt — it’s a system route that ignores URL Mappings and Code Injection.

That means even if you try to redirect /robots.txt to a custom file, Squarespace’s built-in version still takes priority.

So, to regain control, you’ll need to host your own robots.txt externally.

5. How to fix it (and allow AI bots)

There are three reliable approaches, depending on how technical you want to get.

Option 1: Create a canonical robots.txt on GitHub Pages (simple, permanent)

  1. Create a free GitHub account at github.com.

  2. Create a repository named yourusername.github.io.

  3. Add a file called robots.txt with this content:

    User-agent: GPTBot
    Allow: /
    
    User-agent: ClaudeBot
    Allow: /
    
    User-agent: *
    Disallow: /config
    Disallow: /search
    Allow: /
    
    Sitemap: https://www.yourdomain.com/sitemap.xml
    Host: www.yourdomain.com
    
    
    
  4. Your file will now be accessible at:
    https://yourusername.github.io/robots.txt

  5. In Squarespace, go to Settings → Advanced → URL Mappings and add:

    /robots.txt -> https://yourusername.github.io/robots.txt 301
    
    
    

    (Note: because Squarespace serves /robots.txt as a system route, this mapping may be ignored or applied inconsistently; if your live /robots.txt still shows Squarespace’s version, use the Cloudflare approach in Option 3. A quick way to check is sketched below.)
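
To sanity-check this setup, the sketch below (with placeholder GitHub username and domain) first confirms the GitHub-hosted file allows GPTBot, then looks at whether requests to your live /robots.txt end up at the canonical file:

import urllib.request
from urllib import robotparser

# Placeholders: substitute your GitHub Pages username and your domain.
CANONICAL = "https://yourusername.github.io/robots.txt"
LIVE = "https://www.yourdomain.com/robots.txt"

# 1. Does the canonical file allow the AI crawlers you care about?
parser = robotparser.RobotFileParser()
parser.set_url(CANONICAL)
parser.read()
print("GPTBot allowed by canonical file:",
      parser.can_fetch("GPTBot", "https://www.yourdomain.com/"))

# 2. Did the Squarespace URL Mapping take effect? urlopen follows
#    redirects, so compare the final URL with the canonical one.
with urllib.request.urlopen(LIVE) as resp:
    print("Final URL after redirects:", resp.geturl())
    print("Redirect applied:", resp.geturl() == CANONICAL)

If the redirect is not applied, the live /robots.txt is still Squarespace’s version and you’ll want Option 3.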

Option 2: Add meta tags to your header (for HTML-level control)

Even if you can’t modify robots.txt, crawlers that can reach your pages also respect page-level meta directives.
Go to Settings → Advanced → Code Injection → Header and add:

<meta name="robots" content="index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1">

This tells crawlers that load your pages they may index and follow your content. One caveat: a crawler that is disallowed in robots.txt never requests the page in the first place, so it never sees this tag. Treat the meta directive as a helpful page-level signal, not as an override of a robots.txt block.
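
To confirm the injected tag is actually being served, fetch one of your pages and look for it in the raw HTML. A crude substring check is enough here (the URL is a placeholder):

import urllib.request

# Placeholder: replace with one of your own page URLs.
PAGE = "https://www.yourdomain.com/"

with urllib.request.urlopen(PAGE) as resp:
    html = resp.read().decode("utf-8", errors="replace")

# Rough check that the injected tag made it into the served markup.
print('meta robots tag present:', 'name="robots"' in html)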

Option 3: Use Cloudflare to override Squarespace’s robots.txt (advanced)

If your DNS is managed through Cloudflare and the record for your site is proxied:

  1. Add a Redirect Rule (or a legacy Page Rule with a Forwarding URL) for:

    https://www.yourdomain.com/robots.txt
  2. Set it to return a 301 redirect to your external GitHub-hosted robots.txt file.

This proxy-level rule intercepts requests before Squarespace serves its own file — giving you total control.
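
You can verify the rule is intercepting requests by asking for /robots.txt without following redirects; if it is in place, you should see a 301 with a Location header pointing at your GitHub-hosted file. A minimal sketch (the domain is a placeholder):

import urllib.error
import urllib.request

# Placeholder: replace with your own domain.
LIVE = "https://www.yourdomain.com/robots.txt"

class NoRedirect(urllib.request.HTTPRedirectHandler):
    # Returning None tells urllib not to follow the redirect.
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None

opener = urllib.request.build_opener(NoRedirect)
try:
    resp = opener.open(LIVE)
    print("Status:", resp.status)  # 200 means Squarespace's own file is still served
except urllib.error.HTTPError as err:
    # With redirects suppressed, a 301/302 surfaces as an HTTPError.
    print("Status:", err.code)
    print("Location:", err.headers.get("Location"))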

6. Confirm that it’s fixed

Once your new setup is live:

  • Visit your canonical file (e.g., https://yourusername.github.io/robots.txt)

  • Check that it displays your custom content.

  • Test again at https://openai.com/gptbot — it should now show Allowed.

You can also check from the command line that your server accepts requests identifying themselves as AI crawlers (GPTBot, ClaudeBot, CCBot):

curl -I -A "GPTBot" https://www.yourdomain.com/

If the response is 200 OK, the server isn’t turning those user agents away at the HTTP level. Note that this doesn’t test robots.txt compliance: well-behaved crawlers fetch robots.txt separately and obey whatever it says, which is why the canonical file above is the part that matters.

7. The future of AI visibility

AI visibility is the new SEO.
Just as webmasters once optimized for Googlebot, forward-thinking creators now optimize for AI crawlers — ensuring their work is included, cited, and correctly attributed inside conversational search tools.

Squarespace’s design-first simplicity comes at the cost of transparency, but with a few simple steps, you can make your work discoverable by both humans and machines.

TL;DR — Quick Fix Summary

  • Squarespace’s default robots.txt blocks GPTBot, ClaudeBot, and other AI crawlers, and it can’t be edited from within Squarespace.

  • Host a canonical robots.txt you control (for example on GitHub Pages) that allows the AI bots you want.

  • Point /robots.txt at it with a Squarespace URL Mapping or, more reliably, a Cloudflare redirect rule.

  • Add a robots meta tag via Code Injection for page-level signals, then verify access with curl or OpenAI’s GPTBot page.

If you publish regularly and want your blog posts, resources, and lessons to be accessible to AI tools and search systems, separating your human-facing website (Squarespace) from your machine-readable corpus (GitHub Pages or a subdomain) is the most reliable long-term strategy.
