Google SEO Update 2023 October 18

Structured data for subscription and paywalled content (CreativeWork)

This page describes how to use schema.org JSON-LD to indicate paywalled content on your site with CreativeWork properties. This structured data helps Google differentiate paywalled content from the practice of cloaking, which violates spam policies. Learn more about subscription and paywalled content.

Here’s an example of NewsArticle structured data with paywalled content.

<html>
  <head>
    <title>Article headline</title>
    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "NewsArticle",
      "headline": "Article headline",
      "image": "https://example.org/thumbnail1.jpg",
      "datePublished": "2025-02-05T08:00:00+08:00",
      "dateModified": "2025-02-05T09:20:00+08:00",
      "author": {
        "@type": "Person",
        "name": "John Doe",
        "url": "https://example.com/profile/johndoe123"
      },
      "description": "A most wonderful article",
      "isAccessibleForFree": "False",
      "hasPart":
        {
        "@type": "WebPageElement",
        "isAccessibleForFree": "False",
        "cssSelector" : ".paywall"
        }
    }
    </script>
  </head>
  <body>
    <div class="non-paywall">
      Non-Paywalled Content
    </div>
    <div class="paywall">
      Paywalled Content
    </div>
  </body>
</html>

You must follow the general structured data guidelines and technical guidelines for your page to be eligible to appear in search results. In addition, the following guidelines apply to paywalled content:

  • JSON-LD and microdata formats are accepted methods for specifying structured data for paywalled content.
  • Don’t nest content sections.
  • Only use .class selectors for the cssSelector property.

If you offer any subscription-based access to your website content, or if users must register for access to any content you want to be indexed, follow these steps. The following example applies to NewsArticle structured data. Make sure to follow these steps for all versions of your page (including AMP and non-AMP).

  1. Add a class name around each paywalled section of your page. For example:
    <body>
    <p>This content is outside a paywall and is visible to all.</p>
    <div class="paywall">This content is inside a paywall, and requires a subscription or registration.</div>
    </body>
  2. Add NewsArticle structured data.
  3. Add the isAccessibleForFree and hasPart properties to your NewsArticle structured data, as in the following example:
    {
      "@context": "https://schema.org",
      "@type": "NewsArticle",
      "mainEntityOfPage": {
        "@type": "WebPage",
        "@id": "https://example.org/article"
      },
      (...)
      "isAccessibleForFree": "False",
      "hasPart": {
        "@type": "WebPageElement",
        "isAccessibleForFree": "False",
        "cssSelector": ".paywall"
      }
    }
  4. Validate your code using the Rich Results Test and fix any critical errors.

If you have multiple paywalled sections on a page, add the class names as an array.

Here’s an example of the paywalled sections on a page:

<body>
<div class="section1">This content is inside a paywall, and requires a subscription or registration.</div>
<p>This content is outside a paywall and is visible to all.</p>
<div class="section2">This is another section that's inside a paywall, or requires a subscription or registration.</div>
</body>

Here’s an example of NewsArticle structured data with multiple paywalled sections.

{
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://example.org/article"
  },
  (...)
  "isAccessibleForFree": "False",
  "hasPart": [
    {
      "@type": "WebPageElement",
      "isAccessibleForFree": "False",
      "cssSelector": ".section1"
    }, {
      "@type": "WebPageElement",
      "isAccessibleForFree": "False",
      "cssSelector": ".section2"
    }
  ]
}

This markup is supported for the CreativeWork type and for its more specific subtypes (for example, Article, NewsArticle, and LearningResource).

Multiple schema.org types can be used, such as the following:

"@type": ["Article", "LearningResource"]

You must include the required properties for Google to understand that your article has paywalled content. You can add the recommended properties for more granularity about which sections of a page are behind a paywall (or require a subscription or registration).

Required properties

isAccessibleForFree (Boolean)
Whether the article is accessible to everyone, or if it’s behind a paywall (or requires a subscription or registration). Set the isAccessibleForFree property to False to specify that this section is behind a paywall.

Recommended properties

hasPart.cssSelector (CssSelectorType)
A CSS selector that references the class name that you set in the HTML to specify the paywalled section.

hasPart.@type (Text)
Set the @type to WebPageElement.

hasPart.isAccessibleForFree (Boolean)
Whether this section of the article is behind a paywall (or requires a subscription or registration). Set the isAccessibleForFree property to False to specify that this section is behind a paywall.

Here’s a list of considerations to keep in mind if you use AMP pages:

  • If you have an AMP page with paywalled content, use amp-subscriptions where appropriate.
  • Make sure that your authorization endpoint grants access to content to the appropriate bots from Google and others. This is different per publisher.
  • Ensure that your bot access policy is the same for AMP and non-AMP pages; otherwise, the mismatch can result in content mismatch errors that appear in Search Console.

SGE (Search Generative Experience) overviews are generated with the help of AI. They are supported by info from across the web and Google’s Knowledge Graph, a collection of info about people, places, and things. Content blocked using snippet controls will not be shown in overviews.

SGE is designed to help people discover helpful information on the web that supports the information in the overview and provides a jumping off point for people to explore further. As in Search more broadly, SGE overviews may include links to paywalled content as a way for people to discover those pages.

SGE while browsing, a separate feature from SGE in Search, will not show key points for paywalled articles if paywall structured data is on the page.

If you want Google to crawl and index your content, including the paywalled sections, make sure Googlebot, and Googlebot-News if applicable, can access your page.

Use the URL Inspection tool to test how Google crawls and renders a URL on your site.

To prevent Google from showing a cached link for your page, use the noarchive robots meta tag.

To exclude certain sections of your content from appearing in search result snippets, use the data-nosnippet HTML attribute. You can also limit how many characters a search result snippet may have by using the max-snippet robots meta tag.
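
For example, here’s a hedged sketch that combines these controls on a paywalled page; the class name, snippet length, and teaser text are placeholders:

<head>
  <!-- Don't show a cached link, and cap text snippets at 50 characters -->
  <meta name="robots" content="noarchive, max-snippet:50">
</head>
<body>
  <p>Teaser text that may appear in search result snippets.</p>
  <div class="paywall" data-nosnippet>
    Paywalled text that should not appear in snippets.
  </div>
</body>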

If you’re having trouble implementing or debugging structured data, here are some resources that may help you.

  • If you’re using a content management system (CMS) or someone else is taking care of your site, ask them to help you. Make sure to forward any Search Console message that details the issue to them.
  • Google does not guarantee that features that consume structured data will show up in search results. For a list of common reasons why Google may not show your content in a rich result, see the General Structured Data Guidelines.
  • You might have an error in your structured data. Check the list of structured data errors.
  • If you received a structured data manual action against your page, the structured data on the page will be ignored (although the page can still appear in Google Search results). To fix structured data issues, use the Manual Actions report.
  • Review the guidelines again to identify if your content isn’t compliant with the guidelines. The problem can be caused by either spammy content or spammy markup usage. However, the issue may not be a syntax issue, and so the Rich Results Test won’t be able to identify these issues.
  • Troubleshoot missing rich results / drop in total rich results.
  • Allow time for re-crawling and re-indexing. Remember that it may take several days after publishing a page for Google to find and crawl it. For general questions about crawling and indexing, check the Google Search crawling and indexing FAQ.
  • Post a question in the Google Search Central forum.

 

Robots meta tag, data-nosnippet, and X-Robots-Tag specifications

This document details how the page- and text-level settings can be used to adjust how Google presents your content in search results. You can specify page-level settings by including a meta tag on HTML pages or in an HTTP header. You can specify text-level settings with the data-nosnippet attribute on HTML elements within a page.

Keep in mind that these settings can be read and followed only if crawlers are allowed to access the pages that include these settings.

The <meta name="robots" content="noindex"> rule applies to search engine crawlers. To block non-search crawlers, such as AdsBot-Google, you might need to add rules targeted to the specific crawler (for example, <meta name="AdsBot-Google" content="noindex">).

The robots meta tag lets you utilize a granular, page-specific approach to controlling how an individual page should be indexed and served to users in Google Search results. Place the robots meta tag in the <head> section of a given page, like this:

<!DOCTYPE html>
<html><head>
<meta name="robots" content="noindex">
(…)
</head>
<body>(…)</body>
</html>

In this example, the robots meta tag instructs search engines not to show the page in search results. The value of the name attribute (robots) specifies that the rule applies to all crawlers. Both the name and the content attributes are case-insensitive. To address a specific crawler, replace the robots value of the name attribute with the user agent token of the crawler that you are addressing. Google supports two user agent tokens in the robots meta tag; other values are ignored:

  1. googlebot: for all text results.
  2. googlebot-news: for news results.

For example, to instruct Google specifically not to show a page in its search results, you can specify googlebot as the name of the meta tag:

<meta name="googlebot" content="noindex">

To show a page in Google’s web search results, but not in Google News, use the googlebot-news meta tag:

<meta name="googlebot-news" content="noindex">

To specify multiple crawlers individually, use multiple robots meta tags:

<meta name="googlebot" content="noindex">
<meta name="googlebot-news" content="nosnippet">

To block indexing of non-HTML resources, such as PDF files, video files, or image files, use the X-Robots-Tag response header instead.

The X-Robots-Tag can be used as an element of the HTTP header response for a given URL. Any rule that can be used in a robots meta tag can also be specified as an X-Robots-Tag. Here’s an example of an HTTP response with an X-Robots-Tag instructing crawlers not to index a page:

HTTP/1.1 200 OK
Date: Tue, 25 May 2010 21:42:43 GMT
(…)
X-Robots-Tag: noindex
(…)

Multiple X-Robots-Tag headers can be combined within the HTTP response, or you can specify a comma-separated list of rules. Here’s an example of an HTTP header response which has a noarchive X-Robots-Tag combined with an unavailable_after X-Robots-Tag.

HTTP/1.1 200 OK
Date: Tue, 25 May 2010 21:42:43 GMT
(…)
X-Robots-Tag: noarchive
X-Robots-Tag: unavailable_after: 25 Jun 2010 15:00:00 PST
(…)

The X-Robots-Tag may optionally specify a user agent before the rules. For instance, the following set of X-Robots-Tag HTTP headers can be used to conditionally allow showing of a page in search results for different search engines:

HTTP/1.1 200 OK
Date: Tue, 25 May 2010 21:42:43 GMT
(…)
X-Robots-Tag: googlebot: nofollow
X-Robots-Tag: otherbot: noindex, nofollow
(…)

Rules specified without a user agent are valid for all crawlers. The HTTP header, the user agent name, and the specified values are not case sensitive.

The following rules, also available in machine-readable format, can be used to control indexing and serving of a snippet with the robots meta tag and the X-Robots-Tag. Each value represents a specific rule. Multiple rules may be combined in a comma-separated list or in separate meta tags. These rules are case-insensitive.

Rules

all

There are no restrictions for indexing or serving. This rule is the default value and has no effect if explicitly listed.

noindex

Do not show this page, media, or resource in search results. If you don’t specify this rule, the page, media, or resource may be indexed and shown in search results. To remove information from Google, follow our step-by-step guide.

nofollow

Do not follow the links on this page. If you don’t specify this rule, Google may use the links on the page to discover those linked pages. Learn more about nofollow.

none

Equivalent to noindex, nofollow.

noarchive

Do not show a cached link in search results. If you don’t specify this rule, Google may generate a cached page and users may access it through the search results.

nositelinkssearchbox

Do not show a sitelinks search box in the search results for this page. If you don’t specify this rule, Google may generate a search box specific to your site in search results, along with other direct links to your site.

nosnippet

Do not show a text snippet or video preview in the search results for this page. A static image thumbnail (if available) may still be visible, when it results in a better user experience. This applies to all forms of search results (at Google: web search, Google Images, Discover). Google SGE overviews also will not show content blocked using nosnippet. If you don’t specify this rule, Google may generate a text snippet and video preview based on information found on the page.

indexifembedded

Google is allowed to index the content of a page if it’s embedded in another page through iframes or similar HTML tags, in spite of a noindex rule. indexifembedded only has an effect if it’s accompanied by noindex.
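
For example, a minimal sketch: to keep a page out of search results on its own, but allow its content to be indexed when it’s embedded in another page, you might combine the two rules like this:

<meta name="googlebot" content="noindex">
<meta name="googlebot" content="indexifembedded">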

max-snippet: [number]

Use a maximum of [number] characters as a textual snippet for this search result. (Note that a URL may appear as multiple search results within a search results page.) This does not affect image or video previews. This applies to all forms of search results (such as Google web search, Google Images, Discover, Assistant). Google SGE overviews also will not show content beyond the specified limit. However, this limit does not apply in cases where a publisher has separately granted permission for use of content. For instance, if the publisher supplies content in the form of in-page structured data or has a license agreement with Google, this setting does not interrupt those more specific permitted uses. This rule is ignored if no parseable [number] is specified. If you don’t specify this rule, Google will choose the length of the snippet.

Special values:

  • 0: No snippet is to be shown. Equivalent to nosnippet.
  • -1: Google will choose the snippet length that it believes is most effective to help users discover your content and direct users to your site.

Examples:

To stop a snippet from displaying in search results:

<meta name="robots" content="max-snippet:0">

To allow up to 20 characters to be shown in the snippet:

<meta name="robots" content="max-snippet:20">

To specify that there’s no limit on the number of characters that can be shown in the snippet:

<meta name="robots" content="max-snippet:-1">

max-image-preview: [setting]

Set the maximum size of an image preview for this page in search results. If you don’t specify the max-image-preview rule, Google may show an image preview of the default size.

Accepted [setting] values:

  • none: No image preview is to be shown.
  • standard: A default image preview may be shown.
  • large: A larger image preview, up to the width of the viewport, may be shown.

This applies to all forms of search results (such as Google web search, Google Images, Discover, Assistant). However, this limit does not apply in cases where a publisher has separately granted permission for use of content. For instance, if the publisher supplies content in the form of in-page structured data (such as AMP and canonical versions of an article) or has a license agreement with Google, this setting will not interrupt those more specific permitted uses.

If you don’t want Google to use larger thumbnail images when your AMP pages and the canonical version of an article are shown in Search or Discover, specify a max-image-preview value of standard or none.

Example:

<meta name="robots" content="max-image-preview:standard">

max-video-preview: [number]

Use a maximum of [number] seconds as a video snippet for videos on this page in search results. If you don’t specify the max-video-preview rule, Google may show a video snippet in search results, and Google decides how long the preview may be.

Special values:

  • 0: At most, a static image may be used, in accordance with the max-image-preview setting.
  • -1: There is no limit.

This applies to all forms of search results (at Google: web search, Google Images, Google Videos, Discover, Assistant). This rule is ignored if no parseable [number] is specified.

Example:

<meta name="robots" content="max-video-preview:-1">

notranslate

Don’t offer translation of this page in search results. If you don’t specify this rule, Google may provide a translation of the title link and snippet of a search result for results that aren’t in the language of the search query. If the user clicks the translated title link, all further user interaction with the page is through Google Translate, which will automatically translate any links followed.
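
For example, a minimal sketch applying the rule to all crawlers:

<meta name="robots" content="notranslate">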

noimageindex

Do not index images on this page. If you don’t specify this value, images on the page may be indexed and shown in search results.
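
For example, a minimal sketch applying the rule to all crawlers:

<meta name="robots" content="noimageindex">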

unavailable_after: [date/time]

Do not show this page in search results after the specified date/time. The date/time must be specified in a widely adopted format including, but not limited to, RFC 822, RFC 850, and ISO 8601. The rule is ignored if no valid date/time is specified. By default there is no expiration date for content. If you don’t specify this rule, this page may be shown in search results indefinitely. Googlebot will decrease the crawl rate of the URL considerably after the specified date and time.

Example:

<meta name="robots" content="unavailable_after: 2020-09-21">

You can create a multi-rule instruction by combining robots meta tag rules with commas or by using multiple meta tags. Here is an example of a robots meta tag that instructs web crawlers to not index the page and to not crawl any of the links on the page:

<meta name="robots" content="noindex, nofollow">

Here is an example that limits the text snippet to 20 characters, and allows a large image preview:

<meta name="robots" content="max-snippet:20, max-image-preview:large">

For situations where multiple crawlers are specified along with different rules, the search engine will use the sum of the negative rules. For example:

<meta name="robots" content="nofollow">
<meta name="googlebot" content="noindex">

The page containing these meta tags will be interpreted as having a noindex, nofollow rule when crawled by Googlebot.

You can designate textual parts of an HTML page not to be used as a snippet. This can be done on an HTML-element level with the data-nosnippet HTML attribute on span, div, and section elements. The data-nosnippet attribute is considered a boolean attribute. As with all boolean attributes, any value specified is ignored. To ensure machine-readability, the HTML section must be valid HTML and all appropriate tags must be closed accordingly.

Examples:

<p>This text can be shown in a snippet
<span data-nosnippet>and this part would not be shown</span>.</p>

<div data-nosnippet>not in snippet</div>
<div data-nosnippet="true">also not in snippet</div>
<div data-nosnippet="false">also not in snippet</div>
<!-- all values are ignored -->

<div data-nosnippet>some text</html>
<!-- unclosed "div" will include all content afterwards -->

<mytag data-nosnippet>some text</mytag>
<!-- NOT VALID: not a span, div, or section -->

Google typically renders pages in order to index them; however, rendering is not guaranteed. Because of this, extraction of data-nosnippet may happen both before and after rendering. To avoid uncertainty from rendering, do not add or remove the data-nosnippet attribute of existing nodes through JavaScript. When adding DOM elements through JavaScript, include the data-nosnippet attribute as necessary when initially adding the element to the page’s DOM. If custom elements are used, wrap or render them with div, span, or section elements if you need to use data-nosnippet.
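
As a minimal sketch of that last point (the #article-body selector is a placeholder for wherever your content lives):

<script>
  // Create the element with data-nosnippet already set, rather than
  // toggling the attribute on an existing node after the page loads.
  const note = document.createElement('span');
  note.setAttribute('data-nosnippet', '');
  note.textContent = 'Text that should stay out of snippets';
  document.querySelector('#article-body').appendChild(note);
</script>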

Robots meta tags govern the amount of content that Google extracts automatically from web pages for display as search results. But many publishers also use schema.org structured data to make specific information available for search presentation. Robots meta tag limitations don’t affect the use of that structured data, with the exception of article.description and the description values for structured data specified for other creative works. To specify the maximum length of a preview based on these description values, use the max-snippet rule. For example, recipe structured data on a page is eligible for inclusion in the recipe carousel, even if the text preview would otherwise be limited. You can limit the length of a text preview with max-snippet, but that robots meta tag doesn’t apply when the information is provided using structured data for rich results.

To manage the use of structured data for your web pages, modify the structured data types and values themselves, adding or removing information in order to provide only the data you want to make available. Also note that structured data remains usable for search results when declared within a data-nosnippet element.

You can add the X-Robots-Tag to a site’s HTTP responses through the configuration files of your site’s web server software. For example, on Apache-based web servers you can use .htaccess and httpd.conf files. The benefit of using an X-Robots-Tag with HTTP responses is that you can specify crawling rules that are applied globally across a site. The support of regular expressions allows a high level of flexibility.

For example, to add a noindex, nofollow X-Robots-Tag to the HTTP response for all .pdf files across an entire site, add the following snippet to the site’s root .htaccess file or httpd.conf file on Apache; an equivalent NGINX sketch follows the Apache example.

<Files ~ "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</Files>
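
On NGINX, a roughly equivalent configuration (a hedged sketch; adjust the location block to your server layout) could be added to the site’s .conf file:

location ~* \.pdf$ {
  add_header X-Robots-Tag "noindex, nofollow";
}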

You can use the X-Robots-Tag for non-HTML files like image files where the usage of robots meta tags in HTML is not possible. Here’s an example of adding a noindex X-Robots-Tag rule for image files (.png, .jpeg, .jpg, .gif) across an entire site:

<Files ~ "\.(png|jpe?g|gif)$">
  Header set X-Robots-Tag "noindex"
</Files>

You can also set the X-Robots-Tag headers for individual static files:

# the htaccess file must be placed in the directory of the matched file.
<Files "unicorn.pdf">
  Header set X-Robots-Tag "noindex, nofollow"
</Files>

robots meta tags and X-Robots-Tag HTTP headers are discovered when a URL is crawled. If a page is disallowed from crawling through the robots.txt file, then any information about indexing or serving rules will not be found and will therefore be ignored. If indexing or serving rules must be followed, the URLs containing those rules cannot be disallowed from crawling.
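
As a minimal sketch of that interaction (the /members/ path is a placeholder): any indexing rules on pages under a disallowed path are never fetched, and therefore never applied.

# Pages under /members/ are never crawled, so a noindex meta tag or
# X-Robots-Tag on those pages is never seen and never applied.
User-agent: *
Disallow: /members/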


Define a favicon to show in search results

If your site has a favicon, it can be included in Google Search results for your site.


Here’s how to make your site eligible for a favicon in Google Search results:

  1. Create a favicon that follows the guidelines.
  2. Add a <link> tag to the header of your home page with the following syntax:
    <link rel="icon" href="/path/to/favicon.ico">

    To extract the favicon information, Google relies on the following attributes of the link element:

    Attributes
    rel
    Set the rel attribute to one of the following strings:

    • icon
    • apple-touch-icon
    • apple-touch-icon-precomposed
    • shortcut icon

    href
    The URL of the favicon. The URL can be a relative path (/smile.ico) or absolute path (https://example.com/smile.ico).
  3. Google looks for and updates your favicon whenever it crawls your home page. If you make changes to your favicon and want to inform Google about the changes, you can request indexing of your site’s home page. Updates can take a few days or longer to appear in search results.

You must follow these guidelines to be eligible for a favicon in Google Search results.

  • Google Search only supports one favicon per site, where a site is defined by the hostname. For example, https://www.example.com/ and https://code.example.com/ are two different hostnames, and therefore can have two different favicons. However, https://www.example.com/sub-site is a subdirectory of a site, and you can only set one favicon for https://www.example.com/, which applies to the site and its subdirectories.
    Supported: https://example.com (this is a domain-level home page)
    Supported: https://news.example.com (this is a subdomain-level home page)
    Not supported: https://example.com/news (this is a subdirectory-level home page)
  • The favicon file must be crawlable by Googlebot-Image and the home page by Googlebot; they cannot be blocked for crawling.
  • To help people quickly identify your site when they scan through search results, make sure your favicon is visually representative of your website’s brand.
  • Your favicon must be a multiple of 48px square, for example: 48x48px, 96x96px, 144x144px and so on. SVG files don’t have a specific size. Any valid favicon format is supported.
  • The favicon URL must be stable (don’t change the URL frequently).
  • Google won’t show any favicon that it deems inappropriate, including pornography or hate symbols (for example, swastikas). If this type of imagery is discovered within a favicon, Google replaces it with a default icon.