How to Test SEO Meta Tag Requirements in RSpec

Kane Jamison

Recently I have been testing various SEO requirements in a Rails app that uses RSpec.

After copying and pasting expect(page).to have_css "meta[name='robots'][content='noindex']", visible: :hidden for the fifth time, it was clear I needed some helper methods that were simpler and clearer.

I came up with the following SEO Helper that lives at spec/support/seo_helper.rb.

These tests primarily check for meta tags. If you want to verify other requirements, such as presence in sitemaps, this helper doesn't currently cover them.

If you use this in your own app, you may need to modify some values. For instance, the noindex helper expects a meta robots value of noindex, follow. If your app sets only noindex, change the expected value to match.

# spec/support/seo_helper.rb

require 'uri'

module SEOHelper
  # Check if the page has indexable meta robots tag
  def expect_that_page_is_indexable
    expect(page).to have_css "meta[name='robots'][content='index, follow']", visible: :hidden
  end

  # Check if the page has non-indexable meta robots tag
  def expect_that_page_is_noindexed
    expect(page).to have_css "meta[name='robots'][content='noindex, follow']", visible: :hidden
  end

  # Check that canonical URL path exactly matches the expected path
  # Include the full path after domain
  # eg expect_canonical_url_path_matches("/about-us")
  def expect_canonical_url_path_matches(expected_path)
    # Get the actual canonical URL from the page
    canonical_element = find("link[rel='canonical']", visible: :hidden)
    actual_url = canonical_element[:href]

    # Parse the URL to get just the path
    actual_uri = URI.parse(actual_url)
    actual_path = actual_uri.path

    # Ensure expected_path starts with a slash
    expected_path = "/#{expected_path}" unless expected_path.start_with?('/')

    # Check if the path exactly matches the expected path
    expect(actual_path).to eq(expected_path),
                           "Expected canonical path to be '#{expected_path}', but got '#{actual_path}'"
  end

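  # Check for a meta description; pass expected_content to also require an exact match
  def expect_meta_description(expected_content = nil)
    description = find("meta[name='description']", visible: :hidden)
    return if expected_content.nil?

    # Compare the attribute directly so quotes in the description can't break a CSS selector
    expect(description[:content]).to eq(expected_content),
                                     "Expected meta description to be '#{expected_content}', but got '#{description[:content]}'"
  end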

  # Check for exact page title (have_title matches substrings unless exact: true is passed)
  def expect_page_title_matches(expected_title)
    expect(page).to have_title(expected_title, exact: true)
  end

  # Check that page title contains the expected text
  def expect_page_title_contains(expected_text)
    expect(page.title).to include(expected_text),
                          "Expected page title to contain '#{expected_text}', but got '#{page.title}'"
  end
end

RSpec.configure do |config|
  config.include SEOHelper
end
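
The two robots helpers above match one exact content string, so a page that emits only noindex would fail expect_that_page_is_noindexed. One possible way to loosen this is to parse the directives and check only for the one you care about. The sketch below is an assumption and not part of the original helper: RobotsDirectives, parse, noindex?, and expect_robots_noindex are hypothetical names.

```ruby
# Hypothetical extension, not part of the original helper.
module RobotsDirectives
  module_function

  # Split a meta robots content value into normalized directives.
  # "NOINDEX , follow" becomes ["noindex", "follow"].
  def parse(content)
    content.to_s.downcase.split(",").map(&:strip).reject(&:empty?)
  end

  # True when the directives block indexing.
  # "none" is shorthand for "noindex, nofollow".
  def noindex?(content)
    (parse(content) & %w[noindex none]).any?
  end
end

module SEOHelper
  # Passes for "noindex", "noindex, follow", "noindex,nofollow", and so on.
  def expect_robots_noindex
    content = find("meta[name='robots']", visible: :hidden)[:content]
    expect(RobotsDirectives.noindex?(content)).to be(true),
      "Expected meta robots to include noindex, but got '#{content}'"
  end
end
```

The trade-off is that this version no longer verifies the follow half of the directive, so keep the exact-match helpers if that matters for your pages.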

Here's an example of how we use the helper in a feature spec:

require 'rails_helper'

# Note - most tests in this file confirm that landing pages on the website are renderable and indexable,
# and that the canonical tag is generated as expected.
# If you need to test specific functionality on these pages, it probably deserves a separate spec.
# Pages are sorted alphabetically by URL.

RSpec.describe 'Visit Core Pages', type: :feature do
  it "renders the home page as expected" do
    visit "/"
    expect(page).to have_text("Here is a paragraph that would only appear on the home page and definitely not in our footer or other pages of the site that would cause false positives.")
    expect_that_page_is_indexable
    expect_canonical_url_path_matches("/")
  end

  it "renders the /about-us page as expected" do
    visit "/about-us"
    expect(page).to have_text("Here is a paragraph that would only appear on the About Us page and definitely not in our footer or other pages of the site that would cause false positives.")
    expect_that_page_is_indexable
    expect_canonical_url_path_matches("/about-us")
  end

  # etc. for other critical pages

end

I don't need to check the following types of values currently, but here are some likely ways to expand this helper in the future:

  1. Testing for Open Graph meta tags (e.g. og:url matches the canonical tag)
  2. Verifying redirect paths are 301 and "single hop"
  3. Verifying JSON-LD structured data presence and validity
  4. Checking hreflang tags for multilingual sites
  5. Validating meta viewport settings
  6. Testing for proper heading hierarchy (H1, H2, etc.)
  7. Checking for rel="next" and rel="prev" pagination tags
  8. Verifying image alt attributes on critical images
  9. Checking for sitemap.xml references in robots.txt
  10. Validating proper URL trailing slash consistency (this is partially inherent in our canonical path checker, but you might also need to render the page with a trailing slash and confirm the canonical does not include it)
  11. Checking for rel="nofollow" on specific links
  12. Testing breadcrumb markup
  13. Verifying schema.org markup for specific page types
  14. Testing for proper canonical links on taxonomy pages, paginated pages, pages with parameters, etc.
  15. Checking for proper sitemap formatting
  16. Testing for valid RSS feeds
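
As a rough illustration of how a few of these could look, here is a minimal sketch covering items 1, 3, and 10. None of it comes from the original helper: SEOChecks and the three expect_ methods are hypothetical names, the JSON-LD check only confirms that each script body parses as JSON (it does not validate schema.org vocabulary), and it assumes your Capybara driver exposes script contents through text(:all).

```ruby
require "json"
require "uri"

# Hypothetical pure-Ruby checks; none of these names exist in the original helper.
module SEOChecks
  module_function

  # Returns the parsed JSON-LD data, or nil when the script body is not valid JSON.
  def parse_json_ld(raw)
    JSON.parse(raw.to_s)
  rescue JSON::ParserError
    nil
  end

  # True when the URL's path ends in a slash (the root path "/" does not count).
  def trailing_slash?(url)
    path = URI.parse(url).path
    path.length > 1 && path.end_with?("/")
  end
end

module SEOHelper
  # Item 1: og:url should equal the canonical URL.
  def expect_og_url_matches_canonical
    canonical = find("link[rel='canonical']", visible: :hidden)[:href]
    og_url = find("meta[property='og:url']", visible: :hidden)[:content]
    expect(og_url).to eq(canonical),
      "Expected og:url to match canonical '#{canonical}', but got '#{og_url}'"
  end

  # Item 3: at least one JSON-LD script exists and each one parses as JSON.
  # This checks syntax only, not schema.org vocabulary.
  def expect_parseable_json_ld
    scripts = all("script[type='application/ld+json']", visible: :hidden, minimum: 1)
    scripts.each do |script|
      expect(SEOChecks.parse_json_ld(script.text(:all))).not_to be_nil,
        "Expected JSON-LD script to contain valid JSON"
    end
  end

  # Item 10: the canonical URL should not end in a trailing slash.
  def expect_canonical_without_trailing_slash
    canonical = find("link[rel='canonical']", visible: :hidden)[:href]
    expect(SEOChecks.trailing_slash?(canonical)).to be(false),
      "Expected canonical '#{canonical}' to have no trailing slash"
  end
end
```

Keeping the parsing logic in plain module functions means it can be unit tested without visiting a page, while the expect_ wrappers stay as thin as the existing helpers.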