Why are there no records, or pages missing, in my website collection's index?
Problem
Your collection's index contains no records, or some records are missing from it.
Investigation steps for no records:
Use our page debug tool to see if our crawler encounters an error on your homepage
Has the domain been added correctly to the Domains section of our platform?
If the entered domain has a www prefix, the crawler will not visit any subdomains. See here
Is the site password protected? See here
Are your servers blocking our crawler? See here
Our crawler will only crawl pages under 10 MB in file size. If your webpages exceed this, our page debug tool will throw the following error:
error cleaning HTML: content too large
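The size limit above can be checked locally before running the page debug tool. The following is a minimal sketch, not the crawler's own logic: the function name `check_page_size` is hypothetical, and it assumes "10 MB" means 10 * 1024 * 1024 bytes measured on the downloaded HTML body, which the text above does not specify.

```python
# Assumption: the limit is 10 * 1024 * 1024 bytes of downloaded HTML.
# The exact byte count and what the crawler measures are not documented here.
ASSUMED_LIMIT_BYTES = 10 * 1024 * 1024


def check_page_size(body: bytes, limit: int = ASSUMED_LIMIT_BYTES) -> dict:
    """Report the size of a downloaded page body against the assumed limit."""
    size = len(body)
    return {
        "size_bytes": size,
        "limit_bytes": limit,
        # The article says pages must be *under* the limit, so use a strict comparison.
        "within_limit": size < limit,
    }


if __name__ == "__main__":
    # Replace this sample with the bytes of your own page, for example the
    # result of urllib.request.urlopen("https://www.yourwebsite.com/").read()
    sample = b"<html><body>Hello</body></html>"
    print(check_page_size(sample))
```

If `within_limit` is false for a page, reducing inline content (embedded data, large inline scripts or styles) is a likely first step, since only the HTML document itself is measured in this sketch.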
Investigation steps for records missing from your collection:
Use our page debug tool to see if our crawler encounters an error on any of the URLs that are missing
Does your HTML contain absolute or relative URLs? The crawler will only follow relative URLs if there is a <base> tag on the page:
<base href="https://www.yourwebsite.com/">
If the content was added or altered recently, our crawler may not have visited your site yet according to its schedule. See here
If you’re using our ping-back / instant indexing code, have you met the conditions documented here?
Do you have any exclusion rules set up in the Crawler rules section preventing the pages from being crawled?
Do your pages have the noindex or data-sj-noindex tags preventing crawling?
Is any of your content only accessible via query parameters, e.g. www.mysite/news-articles/press-releases?page=2? If so, please contact our support team here.
Do the pages you expect to be indexed have canonical tags pointing our crawler to a different page?
Do you have any canonical loops, i.e. the homepage redirects to a different page, which redirects back to the homepage?
Do you have a sitemap? This page describes the benefits
Our crawler will only index sitemaps named sitemap.xml
Our crawler will only crawl pages under 10 MB in file size. If your webpages exceed this, our page debug tool will throw the following error:
error cleaning HTML: content too large
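Several of the checks in this list (relative URLs without a <base> tag, noindex markers, canonical tags pointing elsewhere) can be spotted by inspecting a page's HTML. The sketch below uses only the Python standard library. The names `IndexingHints` and `inspect_page` are hypothetical, and it assumes that a robots meta tag containing noindex, or any element carrying a data-sj-noindex attribute, marks the whole page as excluded. It approximates the checks described above and does not reproduce the crawler's actual behaviour.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class IndexingHints(HTMLParser):
    """Collect markup that may affect whether a page is crawled or indexed."""

    def __init__(self):
        super().__init__()
        self.base_href = None
        self.canonical = None
        self.noindex = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Assumption: data-sj-noindex on any element excludes the page.
        if "data-sj-noindex" in attrs:
            self.noindex = True
        if tag == "base" and attrs.get("href") and self.base_href is None:
            self.base_href = attrs["href"]
        elif tag == "link" and "canonical" in (attrs.get("rel") or "").lower().split():
            self.canonical = attrs.get("href")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            if "noindex" in (attrs.get("content") or "").lower():
                self.noindex = True
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])


def inspect_page(html: str, page_url: str) -> dict:
    """Summarise the indexing-related hints found in one page's HTML."""
    parser = IndexingHints()
    parser.feed(html)
    parser.close()
    # A link is treated as relative if it has no scheme and is not protocol-relative.
    relative = [h for h in parser.links
                if not urlparse(h).scheme and not h.startswith("//")]
    base = urljoin(page_url, parser.base_href) if parser.base_href else None
    canonical = urljoin(page_url, parser.canonical) if parser.canonical else None
    return {
        "has_base_tag": base is not None,
        "relative_links": relative,
        # Only resolved when a <base> tag exists, mirroring the rule described above.
        "resolved_links": [urljoin(base, h) for h in relative] if base else [],
        "noindex": parser.noindex,
        "canonical_points_elsewhere": (
            canonical is not None
            and canonical.rstrip("/") != page_url.rstrip("/")
        ),
    }


if __name__ == "__main__":
    sample = (
        '<html><head><base href="https://www.yourwebsite.com/">'
        '<link rel="canonical" href="https://www.yourwebsite.com/other">'
        '</head><body><a href="news/page-2">More</a></body></html>'
    )
    print(inspect_page(sample, "https://www.yourwebsite.com/news/"))
```

A page that reports relative links but no <base> tag, a noindex flag, or a canonical URL different from its own address is a reasonable candidate for closer inspection with the page debug tool.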