A canonical via HTTP header is a canonical URL declared in the response headers rather than in the page HTML. It is usually sent through an HTTP Link header and can tell search engines which URL should be treated as the preferred version of the resource. This is an important check because canonical signals do […]
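As a rough illustration of how a canonical declared in an HTTP Link header could be read, here is a minimal Python sketch. The function name `canonical_from_link_header` and the example URLs are hypothetical, and the parsing is deliberately simple: it assumes URLs are wrapped in angle brackets and does not handle every edge case of the header syntax.

```python
import re

# Matches "<url>" followed by its parameters, up to the next "<".
_LINK_RE = re.compile(r"<([^>]*)>([^<]*)")


def canonical_from_link_header(value):
    """Return the first URL whose rel parameter includes "canonical", else None."""
    for url, params in _LINK_RE.findall(value):
        for param in params.split(";"):
            name, _, val = param.strip().partition("=")
            if name.strip().lower() != "rel":
                continue
            # Remove a trailing list comma and surrounding quotes.
            rels = val.strip().rstrip(",").strip().strip('"').lower().split()
            if "canonical" in rels:
                return url
    return None


header = '<https://example.com/next>; rel="next", <https://example.com/a>; rel="canonical"'
print(canonical_from_link_header(header))
```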
Canonical target status code
The canonical target status code shows the HTTP response returned by the URL named in a page’s canonical tag. This is an important check because a canonical only works well when it points to a URL that is accessible and valid. If the canonical target stops returning 200, search engines may receive a mixed or […]
Self-canonical
A self-canonical check shows whether a page’s canonical tag points to that page’s own final resolved URL. In other words, it checks whether the page is declaring itself as the preferred version. This is a strong health check because self-canonicalisation is often the expected setup for indexable pages that do not need to consolidate signals […]
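A self-canonical comparison could be sketched as follows. This is an assumption-laden example, not a description of any particular tool: the helper names `normalize` and `is_self_canonical` are hypothetical, and the normalisation rules chosen here (lowercase scheme and host, empty path treated as `/`, fragment dropped) are one reasonable option among several.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def normalize(url):
    """Lowercase scheme and host, treat an empty path as "/", drop the fragment."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def is_self_canonical(final_url, canonical_href):
    """True when the canonical href resolves to the page's own final URL."""
    # Resolve relative hrefs such as "/a/b" against the final URL.
    resolved = urljoin(final_url, canonical_href)
    return normalize(resolved) == normalize(final_url)


print(is_self_canonical("https://Example.com/page", "https://example.com/page"))
print(is_self_canonical("https://example.com/page?ref=x", "https://example.com/page"))
```

Query strings are compared as-is in this sketch, so a page reached with a tracking parameter would not count as self-canonical if its canonical omits that parameter.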
Canonical present
The canonical present check shows whether a page includes a rel="canonical" tag at all. This may seem like a simple yes-or-no signal, but it can still be important, especially on sites where duplicate or near-duplicate URLs are common. A missing canonical is not always a problem. Many pages can function perfectly well without one. But […]
Canonical href
The canonical href is the exact URL declared in a page's rel="canonical" tag. It tells search engines which version of a page should be treated as the preferred one when similar or duplicate URLs exist. Because this check stores the precise canonical URL from the HTML, it is a high-value signal. Even a small change […]
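Both the presence check and the href value can be derived from the same pass over the HTML. The sketch below uses Python's standard `html.parser`; the class name `CanonicalFinder` and the function `canonical_href` are hypothetical names for illustration, and the example assumes that the first canonical link found is the one of interest.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects href values from <link rel="canonical"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.hrefs.append(attr["href"])


def canonical_href(html):
    """Return the first canonical href in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.hrefs[0] if finder.hrefs else None


page = '<html><head><link rel="canonical" href="https://example.com/a"></head></html>'
href = canonical_href(page)
print(href is not None, href)
```

A return value of `None` corresponds to "canonical not present"; any other value is the stored href.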
Robots.txt allowed for CSS
The robots.txt allowed for CSS check shows whether the CSS files needed by a page can be crawled under the site’s robots.txt rules. This is important because search engines often need access to CSS to render the page properly, understand its layout, and assess how the content is presented. A page can be technically accessible […]
Robots.txt allowed for page URL
The robots.txt allowed for page URL check shows whether a page’s URL is currently permitted to be crawled under the site’s robots.txt rules. This is a critical signal because robots.txt can block search engines from accessing a page before they even reach its content. When this value changes, it deserves immediate attention. A page may […]
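A robots.txt permission check, for a page URL or for a CSS file, could be approximated with Python's standard `urllib.robotparser`. The function name `is_allowed`, the sample rules, and the URLs below are hypothetical. Note that this parser does not necessarily match how a given search engine interprets robots.txt (wildcard handling in particular may differ), so treat the result as an approximation.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Check a URL against robots.txt content supplied as a string."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


rules = "User-agent: *\nDisallow: /private/\n"
print(is_allowed(rules, "https://example.com/private/page"))
print(is_allowed(rules, "https://example.com/assets/site.css"))
```

The same call works for page URLs and for the CSS files a page depends on; only the URL passed in changes.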
X-Robots-Tag noindex flag
The X-Robots-Tag noindex flag shows whether a page or file includes a noindex directive in its HTTP response headers. This is a high-impact signal because it tells search engines not to index that resource, even if the content itself looks perfectly normal. Unlike a meta robots tag in the page HTML, this instruction is delivered […]
X-Robots-Tag raw value
The X-Robots-Tag raw value is the exact indexing and crawling instruction sent in the HTTP response header rather than in the page HTML. It can control whether search engines index a page, follow its links, or apply other restrictions before they even process the visible content. Because this is a header-level signal, it is especially […]
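One way the noindex flag might be derived from the raw X-Robots-Tag value is sketched below. The function name `has_noindex` is hypothetical. The sketch treats `none` as implying noindex and ignores which crawler a prefixed directive such as `googlebot: noindex` applies to, which is a simplifying assumption.

```python
def has_noindex(raw_value):
    """Return True if the raw X-Robots-Tag value contains noindex (or none)."""
    for token in raw_value.lower().split(","):
        directive = token.strip()
        # Drop an optional "botname:" prefix, e.g. "googlebot: noindex".
        if ":" in directive:
            directive = directive.split(":", 1)[1].strip()
        if directive in ("noindex", "none"):
            return True
    return False


print(has_noindex("noindex, nofollow"))
print(has_noindex("index, follow"))
```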
Meta robots nofollow flag
The meta robots nofollow flag shows whether a page includes a nofollow directive in its robots meta tag. This directive tells search engines not to follow links found on that page in the normal way. That makes it an important signal to monitor. A page may still be live, indexable, and visible to users, yet […]
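Detecting a nofollow directive in the robots meta tag could look like the following sketch, again using Python's standard `html.parser`. The names `MetaRobotsParser` and `has_meta_nofollow` are hypothetical, and the example only inspects `<meta name="robots">`, not crawler-specific tags such as `<meta name="googlebot">`.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").strip().lower() != "robots":
            return
        content = attr.get("content") or ""
        for directive in content.split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def has_meta_nofollow(html):
    """True if the robots meta tag contains nofollow (or none, which implies it)."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return bool(parser.directives & {"nofollow", "none"})


print(has_meta_nofollow('<meta name="robots" content="index, nofollow">'))
```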
