Resolving The Most Common Website Scan Errors
When AccessibleWebBot attempts to crawl your website, it behaves differently than a standard web browser. Occasionally, security configurations or server errors will block our bot from successfully scanning your pages.
When this happens, you will receive an email notification with the subject line: "Action Required: Scan Error for [Your Website Name]".
This guide breaks down why these scan errors happen, how to fix them, and answers the most common questions regarding automated bot security.
In this Article:
- Quick Diagnostics: The 5 Most Common Scan Errors
- FAQ: Why does my site load in a browser, but your bot cannot access it?
- Deep Dive: SSL/TLS Certificate Requirements & Fixes
- How to Verify Your Fixes
Quick Diagnostics: The 5 Most Common Scan Errors
When our bot has trouble crawling your website, it is almost always caused by one of five things. They are listed below in order of how frequently they occur:
1. Your website’s robots.txt file is blocking our bot
A robots.txt file tells automated crawlers which parts of your site they are allowed to visit. If your file contains a rule that blocks our bot, our bot will respect it and halt the scan.
- The Fix: You need to grant our bot permission to scan your pages. Review our quick guide to Allowing AccessibleWebBot to add the proper exception rules to your file.
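If your team wants to confirm what a robots.txt file permits before the next scan, the check can be sketched with Python's standard library. This is an illustration only: the sample rules below are assumptions made for the demo, not the exact exception rules from our guide, and the function name bot_is_allowed is hypothetical.

```python
from urllib.robotparser import RobotFileParser


def bot_is_allowed(robots_txt: str, url: str,
                   user_agent: str = "AccessibleWebBot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Assumed example: a file that blocks every crawler.
BLOCKS_EVERYONE = """\
User-agent: *
Disallow: /
"""

# Assumed example: the same file with an exception for AccessibleWebBot.
ALLOWS_OUR_BOT = """\
User-agent: AccessibleWebBot
Allow: /

User-agent: *
Disallow: /
"""
```

Calling bot_is_allowed(BLOCKS_EVERYONE, "https://example.com/about") returns False, while the same call with ALLOWS_OUR_BOT returns True. To test a live file, paste its contents into the function as a string.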
2. Bot blocking at the CDN level
Content Delivery Networks (CDNs) like Cloudflare, Akamai, or AWS CloudFront use Web Application Firewalls (WAFs) to block malicious automated traffic. Even though AccessibleWebBot is a verified, secure bot, strict or custom CDN security settings can mistakenly group it with malicious scrapers.
- The Fix: Have your IT or development team add an exception rule for AccessibleWebBot in your CDN/WAF dashboard. Since we are a Cloudflare-verified bot, simply ensuring that your rule allows "Verified Bots" often resolves this immediately.
3. Firewalls (often interpreted as a robots.txt issue)
Network-level firewalls or security plugins (like Wordfence on WordPress) may look at our crawling pattern and mistake it for a brute-force attack, dropping our connection entirely. Because the connection is dropped abruptly, our system may sometimes interpret this initial connection failure as a robots.txt retrieval issue.
- The Fix: Your infrastructure team will need to whitelist our bot’s IP addresses or our User-Agent string (AccessibleWebBot) in your server or hosting firewall configuration.
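To see whether a CDN, WAF, or security plugin treats requests differently based on the User-Agent header, your developers can compare responses for two different values. The sketch below uses only Python's standard library. It is a rough diagnostic under stated assumptions: the helper name probe is hypothetical, the bare token "AccessibleWebBot" stands in for the bot's full User-Agent string, and the request comes from your own IP address, so IP-based blocking will not show up here.

```python
import urllib.error
import urllib.request


def probe(url: str, user_agent: str, timeout: float = 10.0) -> str:
    """Describe how the server answers a GET sent with this User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return f"HTTP {response.status}"
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 403.
        return f"HTTP {err.code}"
    except OSError as err:
        # Covers refused, reset, or timed-out connections (URLError included).
        return f"connection failed: {err}"
```

If probe(url, "Mozilla/5.0") reports HTTP 200 but probe(url, "AccessibleWebBot") reports HTTP 403 or a connection failure, a User-Agent rule is the likely cause.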
4. The website is temporarily down or has permanently moved
If our scanner hits a severe server error (like a 500 Internal Server Error) or a broken redirect chain, it cannot read your page content.
- The Fix: If it’s a temporary outage, no action is needed! Our system will automatically attempt to resume scanning once your site is back online. However, if your website has permanently moved to a new domain name, make sure to [update your domain settings inside RAMP](website RAMP settings page).
5. There is an issue with your website’s SSL/TLS certificate
If your website’s security handshake fails standard validation checks, our scanner will safely abort the connection rather than risking a compromised network path.
- The Fix: This requires your IT department or web development team to update, renew, or reconfigure the certificate on your web server (detailed in the deep dive below).
FAQ: Why does my site load in a browser, but your bot cannot access it?
This is the most common question our support team receives. A client will often say, "I can open my website perfectly fine on my phone and desktop, so why is your scanner throwing an error?"
The Answer: Web browsers are designed to be highly forgiving; secure bots are not.
Standard web browsers (like Chrome, Safari, or Edge) are built for everyday consumer convenience. If a website has an underlying security misconfiguration, a browser will often silently "fix" it behind the scenes, ignore minor server omissions, or at most show the user a warning screen with an option to "proceed anyway."
By contrast, AccessibleWebBot is a verified, secure bot. Because we process data programmatically, our crawler must operate on a strict zero-trust model. If your web server fails the standard security handshake, our bot cannot guarantee the integrity of the data stream and will safely abort the scan rather than risking data exposure.
Deep Dive: SSL/TLS Certificate Requirements & Fixes
When our scanner attempts a connection, it looks for a completely valid, publicly verifiable HTTPS connection. Here are some specific certificate issues that will cause our bot to stop scanning, alongside instructions for your development team to fix them:
Incomplete Certificate Chain (Most Common)
Web servers are supposed to send your specific website certificate along with the "intermediate" certificates that link your site back to a globally trusted Root Certificate Authority (CA). Modern browsers often cache intermediate certificates and piece the chain together on their own, but automated backend tools generally do not. If your server sends an incomplete chain, our bot will abort the crawl.
- The Fix: Ask your developer or IT team to ensure all intermediate certificates are bundled, concatenated in the correct order, and installed in your web server configuration (e.g., Nginx, Apache, or IIS).
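As a rough illustration of what "bundled and concatenated correctly" means, the sketch below builds a single PEM bundle with the site certificate first and the intermediates after it, which is the order Nginx expects in its certificate file. The function names are hypothetical, and counting PEM blocks is only a quick heuristic: a bundle with a single certificate usually means the intermediates are missing, but it does not prove the chain is otherwise valid.

```python
from typing import Sequence

PEM_BEGIN = "-----BEGIN CERTIFICATE-----"


def count_certificates(bundle: str) -> int:
    """Count the PEM certificate blocks in a bundle."""
    return bundle.count(PEM_BEGIN)


def build_full_chain(leaf_pem: str, intermediate_pems: Sequence[str]) -> str:
    """Join the site (leaf) certificate and its intermediates, leaf first."""
    parts = [leaf_pem, *intermediate_pems]
    # Normalize spacing so each certificate ends with exactly one newline.
    return "".join(part.strip() + "\n" for part in parts)
```

Your certificate authority normally supplies the intermediate certificates alongside your site certificate, often as a ready-made "full chain" file.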
Expired Certificates
If your certificate's validity period has ended, the certificate can no longer be trusted to prove your site's identity, and the connection will fail validation.
- The Fix: Renew your SSL/TLS certificate through your certificate authority (such as Let's Encrypt, DigiCert, etc.) and deploy the updated certificate to your production server.
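To catch an upcoming expiry before it causes a scan error, your team can compute the days remaining from the certificate's "notAfter" date. The sketch below assumes the date is in the format Python's ssl module reports (for example "Jan  5 09:34:43 2018 GMT"). The function name days_until_expiry is hypothetical.

```python
import ssl
import time
from typing import Optional

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before the certificate expires; negative if already expired."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires_at - now) / SECONDS_PER_DAY
```

A negative result means the certificate has already expired. A small positive result is a good prompt to renew before the next scheduled scan.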
Hostname Mismatch
This happens if your SSL certificate is issued for one domain (e.g., example.com), but the site you are trying to scan is a different subdomain (e.g., staging.example.com) that isn't covered by the certificate's listed names or wildcard entries.
- The Fix: Update your certificate to include the specific subdomain as a Subject Alternative Name (SAN), or issue a separate certificate specifically covering the exact domain URL added to RAMP.
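The matching rule can be sketched as follows. This is a simplified version of standard hostname matching, assuming a wildcard is only valid as the entire leftmost label and stands in for exactly one label. The function names are hypothetical, and real TLS libraries apply additional checks.

```python
from typing import Iterable


def name_matches(pattern: str, hostname: str) -> bool:
    """Check one certificate name (possibly a wildcard) against a hostname."""
    pattern_labels = pattern.lower().rstrip(".").split(".")
    host_labels = hostname.lower().rstrip(".").split(".")
    if len(pattern_labels) != len(host_labels):
        return False
    if pattern_labels[0] == "*":
        # "*.example.com" covers one extra label; reject overly broad "*.com".
        return len(pattern_labels) > 2 and pattern_labels[1:] == host_labels[1:]
    return pattern_labels == host_labels


def hostname_covered(hostname: str, san_names: Iterable[str]) -> bool:
    """True if any Subject Alternative Name on the certificate covers hostname."""
    return any(name_matches(name, hostname) for name in san_names)
```

For example, a certificate listing only example.com does not cover staging.example.com, while one that also lists *.example.com does. Under the same rule, *.example.com covers neither example.com itself nor a.staging.example.com.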
Self-Signed or Untrusted Certificates
If your certificate was self-generated rather than issued by a recognized public certificate authority, our system will reject it. This is very common in internal testing environments or pre-production staging sites.
- The Fix: Replace the self-signed certificate with a free, valid public certificate (like Let's Encrypt) for the staging environment, or update your RAMP settings to point to your live, publicly accessible production site.
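Your developers can approximate a strict, non-browser connection with Python's standard ssl module, which validates the chain and hostname by default and does not fetch missing intermediates. The sketch below maps OpenSSL's verification codes to the four categories above. It shows how a strict client behaves in general, not AccessibleWebBot's actual validation logic, and the function names and category labels are assumptions made for this example.

```python
import socket
import ssl

# OpenSSL verification result codes, grouped by the issues described above.
VERIFY_CODE_CATEGORIES = {
    10: "expired certificate",
    18: "self-signed certificate",
    19: "self-signed or untrusted root in chain",
    20: "incomplete chain or untrusted issuer",
    21: "incomplete chain or untrusted issuer",
    62: "hostname mismatch",
}


def classify_verify_error(verify_code: int) -> str:
    """Translate an OpenSSL verification code into a readable category."""
    return VERIFY_CODE_CATEGORIES.get(
        verify_code, f"other verification failure (code {verify_code})"
    )


def check_certificate(hostname: str, port: int = 443,
                      timeout: float = 10.0) -> str:
    """Attempt a strictly validated TLS handshake and report the outcome."""
    context = ssl.create_default_context()  # verifies chain and hostname
    try:
        with socket.create_connection((hostname, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=hostname):
                return "certificate OK"
    except ssl.SSLCertVerificationError as err:
        return classify_verify_error(err.verify_code)
    except OSError as err:
        return f"connection failed: {err}"
```

Running check_certificate with the exact hostname added to RAMP gives a quick first reading. A result of "incomplete chain or untrusted issuer" on a site that loads fine in a browser usually points to missing intermediate certificates.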
How to Verify Your Fixes
Before reaching out to support or waiting for your next scheduled RAMP scan, you can check your website’s certificate health using third-party verification tools.
We highly recommend using the free Qualys SSL Labs Server Test.
Simply enter your website URL and run the test. If the results flag an "Incomplete Certificate Chain" or any trust issues, pass those results directly to your web development or IT infrastructure team. Once they resolve the flagged issues, RAMP will be able to successfully scan your site again.
Need Extra Help?
If your development team has verified that your certificates, firewall rules, and robots.txt files are completely clear, but the error persists, please reach out to us at support@accessibleweb.com. We are more than happy to dive into details with you!