How do I prevent pages from being crawled?
Problem
I want to prevent certain pages from being indexed and appearing in search results.
Solution
You can add the data-sj-noindex attribute to an HTML element anywhere in a page and that page will not be indexed. Most commonly it is added to a robots meta tag in the <head> of the page, as follows:
1. Locate the <head> tag of the page you want to prevent from being crawled.
2. Add the following code within the <head>:
   <meta name="robots" content="noindex" data-sj-noindex />
3. Save the changes. The crawler will ignore this page the next time it comes across it.
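For context, here is a minimal sketch of a full page carrying the tag; the title and body content are placeholders:

   <!DOCTYPE html>
   <html>
     <head>
       <!-- noindex tells crawlers not to index this page;
            the data-sj-noindex attribute applies the same exclusion
            to the site-search crawler -->
       <meta name="robots" content="noindex" data-sj-noindex />
       <title>Example page</title>
     </head>
     <body>
       <p>Page content that should stay out of search results.</p>
     </body>
   </html>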
Additionally, you can use crawling rules to programmatically exclude whole sections or specific pages of your website (for example, every URL under a given path) without editing each page individually.