# State of Small Business Websites 2026 — Dataset

**292 rows. 41 columns.**

## Source

DeepAudit AI scan of small business websites sourced from a ZoomInfo prospect list,
collected in Q1 2026 through the Cold Call Engine pipeline at Axion Deep Digital.
Each site was rendered in a headless Chromium browser (Puppeteer) and evaluated
across 100+ technical SEO, performance, accessibility, and security checks.

## Anonymization

Personally identifying fields (contact name, email, phone, URL, domain, company
name, city) are **removed**. Each site is identified by `site_id`, the first 16
hexadecimal characters of the SHA-256 digest of the original domain.

To verify your own site's row:

```python
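import hashlib
# First 16 hex characters of the SHA-256 digest of the UTF-8 encoded domain.
site_id = hashlib.sha256("yourdomain.com".encode()).hexdigest()[:16]
print(site_id)  # compare against the site_id column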
```

Location is retained at US-state granularity only. The 292 rows are not
re-identifiable from state alone.

## Columns

### Identity
- `site_id` — SHA-256 prefix of original domain (16 chars)
- `state` — US state code

### Overall score
- `overall_score` — DeepAudit composite score (0-100)
- `overall_grade` — letter grade (A+ through F)

### Mobile Lighthouse (PageSpeed Insights, mobile form factor)
- `mobile_lh_valid` — 1 if PageSpeed returned useful mobile data, 0 if not
- `mobile_performance`, `mobile_seo`, `mobile_accessibility`, `mobile_best_practices` — Lighthouse category scores (0-100)
- `mobile_lcp_seconds` — Largest Contentful Paint, seconds
- `mobile_fcp_seconds` — First Contentful Paint, seconds
- `mobile_cls` — Cumulative Layout Shift (unitless)
- `mobile_lcp_bucket`, `mobile_fcp_bucket`, `mobile_cls_bucket` — Google's
  good / needs_improvement / poor thresholds
- `mobile_cwv_all_three` — pass if LCP<=2.5s AND FCP<=1.8s AND CLS<=0.1, else fail
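
The bucket and pass/fail columns can be recomputed from the raw metrics. The following is a minimal sketch, assuming Google's published cut-offs (LCP 2.5 s / 4.0 s, FCP 1.8 s / 3.0 s, CLS 0.1 / 0.25) and assuming that a value exactly on a boundary falls in the better bucket. The names `THRESHOLDS`, `bucket`, and `cwv_all_three` are illustrative and are not part of DeepAudit. The flag follows the column definition above, which uses FCP, so it is not the same as Google's official Core Web Vitals set (LCP, INP, CLS).

```python
# Illustrative thresholds: (good upper bound, needs_improvement upper bound).
THRESHOLDS = {
    "lcp": (2.5, 4.0),   # seconds
    "fcp": (1.8, 3.0),   # seconds
    "cls": (0.1, 0.25),  # unitless
}


def bucket(metric: str, value: float) -> str:
    """Map a raw metric value to good / needs_improvement / poor."""
    good, needs_improvement = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= needs_improvement:
        return "needs_improvement"
    return "poor"


def cwv_all_three(lcp: float, fcp: float, cls: float) -> str:
    """Return "pass" only when all three metrics fall in the good bucket."""
    metrics = {"lcp": lcp, "fcp": fcp, "cls": cls}
    all_good = all(bucket(name, value) == "good" for name, value in metrics.items())
    return "pass" if all_good else "fail"


print(bucket("lcp", 3.1))             # needs_improvement
print(cwv_all_three(2.1, 1.5, 0.05))  # pass
print(cwv_all_three(2.1, 2.0, 0.05))  # fail (FCP above 1.8 s)
```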

### Desktop Lighthouse
- Same shape as mobile, with `desktop_` prefix

### Authority
- `open_pagerank` — Open PageRank value (0-10 scale, proxy for domain authority)

### Specific high-signal checks (pass / warn / fail / empty)
- `check_link_labels` — axe-core Link Labels accessibility check
- `check_focus_indicators` — axe-core focus visibility check
- `check_form_labels` — axe-core form labeling check
- `check_html_validation` — W3C HTML validator
- `check_h1_tag` — proper H1 tag presence
- `check_json_ld` — JSON-LD structured data presence
- `check_sitemap_xml` — sitemap.xml discoverable
- `check_hsts` — HTTP Strict Transport Security header
- `check_primary_keyword` — primary keyword placement in visible content
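
Because each check column holds one of four states, a failure share depends on how empty cells are treated. The sketch below tallies one column and computes the share of failures among non-empty results only. The values are made up for illustration, and excluding empty cells from the denominator is an assumption, not a documented DeepAudit convention.

```python
from collections import Counter

# Hypothetical values of one check column; "" stands for an empty cell.
values = ["pass", "fail", "", "warn", "fail", "pass", "fail"]

# Count each state, relabelling empty cells so they stay visible in the tally.
counts = Counter(v if v else "empty" for v in values)

# Share of failures among checks that returned a result (assumption: empty excluded).
scored = counts["pass"] + counts["warn"] + counts["fail"]
fail_share = counts["fail"] / scored if scored else None

print(dict(counts))
print(fail_share)
```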

### Category rollups
- `a11y_checks_total` / `a11y_checks_failed` — accessibility
- `structured_data_checks_total` / `structured_data_checks_failed` — schema.org markup
- `security_checks_total` / `security_checks_failed` — security headers
- `technical_checks_total` / `technical_checks_failed` — technical SEO
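
Each rollup pair can be turned into a per-site failure rate. This is a minimal sketch, assuming `*_checks_total` counts the checks that ran and `*_checks_failed` counts those that failed, which is inferred from the column names. The helper `failure_rate` and the example row are hypothetical.

```python
from typing import Optional


def failure_rate(failed: int, total: int) -> Optional[float]:
    """Return failed / total, or None when no checks were counted."""
    if total == 0:
        return None
    return failed / total


# Hypothetical row, using the column names listed above.
row = {
    "a11y_checks_total": 12,
    "a11y_checks_failed": 3,
    "security_checks_total": 0,
    "security_checks_failed": 0,
}

a11y_rate = failure_rate(row["a11y_checks_failed"], row["a11y_checks_total"])
security_rate = failure_rate(row["security_checks_failed"], row["security_checks_total"])

print(a11y_rate)
print(security_rate)
```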

## Caveats

- **Sample selection.** Sites were sourced from a B2B prospect list, so the sample
  skews toward US small businesses in sales-visible industries. It is not a random
  sample of the web.
- **Single scan per site.** Performance numbers are a single snapshot. Page conditions,
  CDN caching, time of day, and measurement variance can shift LCP/FCP by roughly 10-20%.
- **PSI partial failures.** Some sites returned zero across all Lighthouse categories
  (PageSpeed rendering error). These rows have `mobile_lh_valid = 0`. Filter them
  out before computing mobile performance statistics.
- **Open PageRank is a proxy.** It is not Google's PageRank (whose public Toolbar
  score was retired in 2016). It is useful for rank-ordering domains, not for
  absolute authority claims.
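
The effect of the `mobile_lh_valid` filter can be seen in a small sketch using only the standard library. The CSV content below is made up for illustration and the `site_id` values are placeholders. Only the column names come from this README.

```python
import csv
import io
import statistics

# Hypothetical sample rows; the real file has 292 rows and more columns.
SAMPLE_CSV = """site_id,mobile_lh_valid,mobile_performance
0000000000000001,1,62
0000000000000002,0,0
0000000000000003,1,48
0000000000000004,1,91
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# csv yields strings, so compare against "1" rather than the integer 1.
valid = [r for r in rows if r["mobile_lh_valid"] == "1"]

median_all = statistics.median(float(r["mobile_performance"]) for r in rows)
median_valid = statistics.median(float(r["mobile_performance"]) for r in valid)

print(len(rows), len(valid))
print(median_all, median_valid)
```

In this toy sample the zero-score row pulls the unfiltered median down, which is the distortion the filter is meant to avoid.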

## License

Creative Commons Attribution 4.0 International (CC BY 4.0).

You are free to share and adapt this dataset for any purpose, including commercial,
provided you attribute:

> Axion Deep Digital (2026). State of Small Business Websites 2026. DeepAudit AI
> scan (n=292). axiondeepdigital.com/research/state-of-small-business-websites-2026

## Citation (BibTeX)

```bibtex
@dataset{axiondeepdigital_sbw_2026,
  author  = {Gutierrez, Joshua R. and Gutierrez, Crystal A.},
  title   = {State of Small Business Websites 2026},
  year    = {2026},
  publisher = {Axion Deep Digital},
  version = {1.0},
  url     = {https://axiondeepdigital.com/research/state-of-small-business-websites-2026}
}
```
