Methodology & Dataset Schema

State of Small Business Websites 2026 — Dataset

292 rows. 41 columns. Open data under CC BY 4.0.

Source

A DeepAudit AI scan of small business websites sourced from a ZoomInfo prospect list, collected in Q1 2026 through the Cold Call Engine pipeline at Axion Deep Digital. Each site was rendered in a headless Chromium browser (Puppeteer) and evaluated across 100+ technical SEO, performance, accessibility, and security checks.

Anonymization

Personally identifying fields (contact name, email, phone, URL, domain, company name, city) are removed. Each site is identified by site_id, the first 16 hexadecimal characters of the SHA-256 hash of the original domain.

To verify your own site's row:

import hashlib
# site_id = first 16 hex characters of the SHA-256 digest of the domain string
site_id = hashlib.sha256("yourdomain.com".encode()).hexdigest()[:16]
print(site_id)

Location is retained only at US-state granularity. The 292 rows are not re-identifiable from state alone.
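
The hashing rule above does not say how the domain was normalized before hashing (letter case, a leading "www."). As a minimal sketch under that uncertainty, the hypothetical helper candidate_site_ids below hashes a few plausible spellings so each can be searched for in the site_id column. The helper names and the choice of variants are assumptions, not part of the published pipeline.

```python
import hashlib


def site_id_for(domain: str) -> str:
    """Return the first 16 hex characters of the SHA-256 digest of `domain`."""
    return hashlib.sha256(domain.encode()).hexdigest()[:16]


def candidate_site_ids(domain: str) -> dict[str, str]:
    """Hash a few plausible spellings of a domain (hypothetical helper).

    Assumption: the normalization applied before hashing is not documented,
    so the domain is tried as typed, lowercased without "www.", and
    lowercased with "www.".
    """
    bare = domain.strip().lower().removeprefix("www.")
    variants = [domain, bare, "www." + bare]
    # dict.fromkeys drops duplicate spellings while keeping their order.
    return {v: site_id_for(v) for v in dict.fromkeys(variants)}


if __name__ == "__main__":
    for spelling, site_id in candidate_site_ids("yourdomain.com").items():
        print(f"{spelling:24s} {site_id}")
```

If none of the printed values appears in the dataset, the site was most likely not part of the scan, or was hashed in a form not covered by these variants.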

Columns

Identity

  • site_id — SHA-256 prefix of original domain (16 chars)
  • state — US state code

Overall score

  • overall_score — DeepAudit composite score (0-100)
  • overall_grade — letter grade (A+ through F)

Mobile Lighthouse (PageSpeed Insights, mobile form factor)

  • mobile_lh_valid — 1 if PageSpeed returned useful mobile data, 0 if not
  • mobile_performance, mobile_seo, mobile_accessibility, mobile_best_practices — Lighthouse category scores (0-100)
  • mobile_lcp_seconds — Largest Contentful Paint, seconds
  • mobile_fcp_seconds — First Contentful Paint, seconds
  • mobile_cls — Cumulative Layout Shift (unitless)
  • mobile_lcp_bucket, mobile_fcp_bucket, mobile_cls_bucket — Google's good / needs_improvement / poor thresholds
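  • mobile_cwv_all_three — pass if LCP ≤ 2.5 s AND FCP ≤ 1.8 s AND CLS ≤ 0.1, else fail. Note that FCP is not one of Google's Core Web Vitals (LCP, INP, CLS); the column name refers to this dataset's three-metric test.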
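
The bucket and pass/fail columns above can be reproduced from the raw metric columns. The sketch below takes the "good" limits from the mobile_cwv_all_three definition; the upper limits separating needs_improvement from poor (4.0 s, 3.0 s, 0.25) are Google's published thresholds, and the assumption is that the dataset applied them unchanged. The function names are illustrative.

```python
# (good_max, needs_improvement_max) per metric.
# Good limits come from the column definition; the second value in each
# pair is Google's published "poor" boundary, assumed to apply here.
THRESHOLDS = {
    "lcp": (2.5, 4.0),   # seconds
    "fcp": (1.8, 3.0),   # seconds
    "cls": (0.1, 0.25),  # unitless
}


def bucket(metric: str, value: float) -> str:
    """Map a raw metric value to good / needs_improvement / poor."""
    good_max, needs_improvement_max = THRESHOLDS[metric]
    if value <= good_max:
        return "good"
    if value <= needs_improvement_max:
        return "needs_improvement"
    return "poor"


def cwv_all_three(lcp_seconds: float, fcp_seconds: float, cls: float) -> str:
    """Return "pass" only when all three metrics fall in the good bucket."""
    values = {"lcp": lcp_seconds, "fcp": fcp_seconds, "cls": cls}
    all_good = all(bucket(name, v) == "good" for name, v in values.items())
    return "pass" if all_good else "fail"


if __name__ == "__main__":
    print(bucket("lcp", 3.1))              # needs_improvement
    print(cwv_all_three(2.0, 1.5, 0.05))   # pass
    print(cwv_all_three(2.0, 1.9, 0.05))   # fail (FCP above 1.8 s)
```

The same functions apply to the desktop_ columns, since they share the mobile shape.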

Desktop Lighthouse

Same shape as mobile, with desktop_ prefix.

Authority

  • open_pagerank — Open PageRank value (0-10 scale, proxy for domain authority)

Specific high-signal checks (pass / warn / fail / empty)

  • check_link_labels — axe-core Link Labels accessibility check
  • check_focus_indicators — axe-core focus visibility check
  • check_form_labels — axe-core form labeling check
  • check_html_validation — W3C HTML validator
  • check_h1_tag — proper H1 tag presence
  • check_json_ld — JSON-LD structured data presence
  • check_sitemap_xml — sitemap.xml discoverable
  • check_hsts — HTTP Strict Transport Security header
  • check_primary_keyword — primary keyword placement in visible content

Category rollups

  • a11y_checks_total / a11y_checks_failed — accessibility
  • structured_data_checks_total / structured_data_checks_failed — schema.org markup
  • security_checks_total / security_checks_failed — security headers
  • technical_checks_total / technical_checks_failed — technical SEO
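
Each rollup pair can be turned into a per-site failure rate, which is easier to compare across categories than raw counts. This is a minimal sketch, assuming rows are read as dictionaries of strings (as csv.DictReader would produce); failure_rate is a hypothetical helper and the sample row values are made up for illustration.

```python
from typing import Optional

CATEGORIES = ("a11y", "structured_data", "security", "technical")


def failure_rate(row: dict, category: str) -> Optional[float]:
    """Return failed / total for one category, or None when no checks ran."""
    total = int(row[f"{category}_checks_total"])
    failed = int(row[f"{category}_checks_failed"])
    if total == 0:
        return None  # avoid dividing by zero when a category has no checks
    return failed / total


if __name__ == "__main__":
    # Illustrative values only; not taken from the dataset.
    sample_row = {
        "a11y_checks_total": "8", "a11y_checks_failed": "2",
        "structured_data_checks_total": "4", "structured_data_checks_failed": "4",
        "security_checks_total": "0", "security_checks_failed": "0",
        "technical_checks_total": "10", "technical_checks_failed": "1",
    }
    for category in CATEGORIES:
        print(category, failure_rate(sample_row, category))
```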

Caveats

  • Sample selection. Sites were sourced from a B2B prospect list, so the sample skews toward US small businesses in sales-visible industries. It is not a random sample of the web.
  • Single scan per site. Performance numbers are a single snapshot. Page conditions, CDN caching, time of day, and measurement variance can shift LCP/FCP by roughly 10-20%.
  • PSI partial failures. Some sites returned zero across all Lighthouse categories (PageSpeed rendering error). These rows have mobile_lh_valid = 0. Filter them out before computing mobile performance statistics.
  • Open PageRank is a proxy. It is not Google's PageRank, whose public Toolbar scores were retired in 2016. Useful for rank-ordering domains, not for absolute authority claims.
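
The PSI caveat above matters in practice: rows with mobile_lh_valid = 0 carry zeros that drag averages down. The sketch below shows the filter using only the standard library. It assumes the dataset is distributed as a CSV with the column names listed on this page and that the flag is stored as the text 1 or 0; the three sample rows are made up for illustration.

```python
import csv
import io
import statistics

# Illustrative rows only; not taken from the dataset.
SAMPLE_CSV = """site_id,mobile_lh_valid,mobile_performance
aaaaaaaaaaaaaaaa,1,62
bbbbbbbbbbbbbbbb,0,0
cccccccccccccccc,1,48
"""


def valid_mobile_scores(csv_file, column="mobile_performance"):
    """Return `column` as floats, keeping only rows with mobile_lh_valid == 1."""
    reader = csv.DictReader(csv_file)
    return [float(row[column]) for row in reader if row["mobile_lh_valid"] == "1"]


scores = valid_mobile_scores(io.StringIO(SAMPLE_CSV))
median_score = statistics.median(scores)

if __name__ == "__main__":
    print(scores)        # [62.0, 48.0]
    print(median_score)  # 55.0
```

With the invalid row left in, the mean of these three sample scores would be about 36.7 instead of 55, which is the distortion the filter avoids. To run this against the real file, pass an open file handle in place of the io.StringIO wrapper.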

License

Creative Commons Attribution 4.0 International (CC BY 4.0).

You are free to share and adapt this dataset for any purpose, including commercial, provided you attribute:

Axion Deep Digital (2026). State of Small Business Websites 2026. DeepAudit AI scan (n=292). axiondeepdigital.com/research/state-of-small-business-websites-2026

Citation (BibTeX)

@dataset{axiondeepdigital_sbw_2026,
  author    = {Gutierrez, Joshua R. and Gutierrez, Crystal A.},
  title     = {State of Small Business Websites 2026},
  year      = {2026},
  publisher = {Axion Deep Digital},
  version   = {1.0},
  url       = {https://axiondeepdigital.com/research/state-of-small-business-websites-2026}
}