What ChatGPT Gets Wrong When You Ask It to Audit Your SEO

Joshua R. Gutierrez | May 25, 2026 | 7 min read

I wanted to know how good ChatGPT actually is at SEO audits, so I asked it to audit our own site, axiondeepdigital.com. Four prompts. Fresh thread. No setup, no priming. Then I went back and checked every concrete claim against the actual page, against PageSpeed, and against our own SEO workbench.

What came back was about 2,000 words. There was a grade rubric ("Technical SEO 9/10, Architecture 8.8/10"), a prioritized fix list, named architectural risks, an opinion on Googlebot rendering. It read like the kind of audit a senior SEO consultant might write up after actually looking at the site.

Most of it was made up. One of its predictions was exactly backwards. And in the same thread, ChatGPT contradicted itself in a way you'd miss on a casual read.

Here's what it actually said, prompt by prompt.

Prompt 1: The grade rubric was invented

I asked: "Can you do a full technical SEO audit of [our URL]? I want specific issues, not a generic checklist."

ChatGPT came back with a rubric:

Technical SEO: 9/10
Architecture: 8.8/10
Content Depth: 6.8/10
Topical Authority: 6.5/10
Local SEO Structure: 8.5/10
Conversion SEO: 8.7/10

Those numbers look authoritative. They map to nothing measurable. Lighthouse gives you a Performance score and a separate SEO score. Search Console reports impressions, clicks, and CTR. None of these tools produce a "Technical SEO" or "Architecture" or "Conversion SEO" number. The rubric doesn't exist in any real auditing tool. ChatGPT made it up, then graded us against it.

I also checked the factual claims. ChatGPT said eight technology brands were "mentioned" on our homepage:

Vercel, Stripe, Netflix, Next.js, React, FastAPI, OpenAI, AWS

Three of them (FastAPI, OpenAI, AWS) aren't on our homepage at all. That's a 37.5% hallucination rate on something a careful reader could verify in under a minute.

Same pattern with the city pages it cited. It listed four geo URLs: /albuquerque, /las-cruces, /new-mexico, /web-development-new-mexico. The first two exist. The last two don't.

When ChatGPT mixes real observations and made-up ones inside the same paragraph, that's harder to catch than pure fabrication. A reader assumes the whole list was checked the same way.
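
The brand check is easy to script. Here is a minimal sketch, assuming the homepage's visible text has already been saved to a string. The sample text is a made-up stand-in, not our actual homepage copy, and `split_claims` is a hypothetical helper name.

```python
import re

# The eight brands ChatGPT claimed were mentioned on the homepage.
CLAIMED = ["Vercel", "Stripe", "Netflix", "Next.js", "React",
           "FastAPI", "OpenAI", "AWS"]

# Made-up stand-in for the page's visible text (an assumption for this demo).
SAMPLE_TEXT = (
    "We build with Next.js and React, deploy on Vercel, take payments "
    "through Stripe, and borrow performance ideas from Netflix. "
    "No flaws allowed."
)

def split_claims(claimed_terms, page_text):
    """Split claimed terms into those found in the text and those missing."""
    found, missing = [], []
    for term in claimed_terms:
        # Whole-token, case-insensitive match, so "AWS" does not hit "flaws".
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, page_text, flags=re.IGNORECASE):
            found.append(term)
        else:
            missing.append(term)
    return found, missing

found, missing = split_claims(CLAIMED, SAMPLE_TEXT)
print("missing:", missing)
print("share unverified:", len(missing) / len(CLAIMED))
```

The whole-token match matters here: a plain substring search would count "AWS" as present because of the word "flaws".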

Prompt 2: ChatGPT contradicted itself two messages later

I asked: "What are the current Core Web Vitals (LCP, CLS, INP) for [URL] on mobile? Give me the actual numbers."

ChatGPT replied:

I cannot reliably give you the real current field numbers for LCP, CLS, and INP from Google's CrUX dataset without direct access to PageSpeed Insights or the Chrome UX Report API for that URL. The web search environment here can access documentation and indexed pages, but it cannot execute a live Lighthouse/PageSpeed test against your domain or pull the live CrUX metrics feed.

That's true. It's also the same model that just spent 2,000 words in the previous message grading our site like it had actually run Lighthouse on it. If it can't run Lighthouse, the 9/10 grade in prompt 1 came from somewhere other than measurement. The two answers contradict each other two messages apart. ChatGPT didn't flag it. A casual reader wouldn't either.

Then it guessed at the vitals anyway:

I would expect CLS to already be very good, LCP to be decent but hero/media dependent, INP to likely be your weakest metric if third party scripts or animations are present.

So I ran the actual Lighthouse mobile audit. Here's what ChatGPT predicted next to what Lighthouse measured:

Metric	ChatGPT predicted	Lighthouse measured (mobile)
CLS	"already very good"	+25 / 25 (perfect)
LCP	"decent but hero/media dependent"	+2 / 25 (lowest tier)
INP / TBT (interaction)	"likely your weakest metric"	+27 / 30 (one of our best)

CLS was a gimme. CLS is good on most modern sites, and predicting that requires no information at all. LCP, the metric ChatGPT softened to "decent," scored 2 out of a possible 25 on mobile, the bottom tier Lighthouse reports. The interaction metric ChatGPT confidently called our weakest scored 27 out of 30, one of our best. (Lighthouse can't measure INP in a lab run, so Total Blocking Time stands in for it.) For LCP and interactivity, the ranking was exactly inverted.
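
The comparison can be made explicit. Here is a minimal sketch, assuming the figures in the table are weighted points toward the Lighthouse Performance score (the denominators match Lighthouse's 25/25/30 weights for LCP, CLS, and TBT). Ranking by share of points earned is one reasonable way to compare metrics with different weights, and `rank_by_fraction` is a hypothetical helper name.

```python
# (points earned, points possible) per metric, copied from the table above.
MEASURED = {"CLS": (25, 25), "LCP": (2, 25), "TBT": (27, 30)}

# ChatGPT's implied order, best to worst: CLS "very good", LCP "decent",
# interactivity "weakest". TBT is the lab stand-in for INP.
PREDICTED_ORDER = ["CLS", "LCP", "TBT"]

def rank_by_fraction(measured):
    """Order metrics from best to worst by share of available points earned."""
    return sorted(
        measured,
        key=lambda name: measured[name][0] / measured[name][1],
        reverse=True,
    )

measured_order = rank_by_fraction(MEASURED)
print("predicted:", PREDICTED_ORDER)
print("measured: ", measured_order)
print("weakest measured metric:", measured_order[-1])
```

On these numbers CLS keeps its place, while LCP and the interaction metric swap positions.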

One more thing worth flagging from that answer. ChatGPT pointed me to PageSpeed Insights for the real CrUX field data. I went and checked. Our site doesn't have CrUX field data. Sites below Google's traffic threshold don't get it, and the CrUX dataset returns "does not have sufficient real-world speed data" for our origin. A reader following ChatGPT's advice would land at PageSpeed, look for the field numbers it promised, and find nothing.
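
Whether a site has field data at all can be checked before anyone goes looking for it. Here is a minimal sketch against the PageSpeed Insights API v5, assuming field metrics arrive under `loadingExperience.metrics` and are simply absent when CrUX has too little data. The sample dictionaries are trimmed, made-up stand-ins, not our real responses, and the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"

def fetch_psi(url, strategy="mobile"):
    """Run a PageSpeed Insights test and return the parsed JSON response."""
    query = urllib.parse.urlencode({"url": url, "strategy": strategy})
    with urllib.request.urlopen(f"{PSI_ENDPOINT}?{query}", timeout=120) as resp:
        return json.load(resp)

def field_metrics(psi_response):
    """Return {metric name: percentile} from field data, or {} if none exists."""
    metrics = psi_response.get("loadingExperience", {}).get("metrics", {})
    return {name: data.get("percentile") for name, data in metrics.items()}

# Made-up sample responses (assumptions for this demo, heavily trimmed).
WITH_FIELD_DATA = {
    "loadingExperience": {
        "metrics": {
            "LARGEST_CONTENTFUL_PAINT_MS": {"percentile": 2400, "category": "FAST"}
        }
    }
}
WITHOUT_FIELD_DATA = {"loadingExperience": {"initial_url": "https://example.com/"}}

print(field_metrics(WITH_FIELD_DATA))     # field percentiles are present
print(field_metrics(WITHOUT_FIELD_DATA))  # empty dict: no field data to read
```

Passing the result of `fetch_psi("https://example.com/")` to `field_metrics` would run the same check against a live response. An empty result is the signal that only lab numbers are available for that URL.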

Prompt 3: A confident render analysis without rendering anything

I asked: "Does Google see the same content on [URL] that a human visitor sees? Walk me through what Googlebot renders vs what the user renders."

ChatGPT gave a careful, technically literate answer:

Based on the rendered structure, Googlebot is seeing substantially the same primary content that human users see on axiondeepdigital.com.

The phrase "based on the rendered structure" is doing all the work in that sentence. ChatGPT didn't render anything. It can't render anything. It said so itself two messages earlier. The whole walkthrough of what Googlebot sees versus what the user sees is pattern-matching on what a "modern Next.js site" usually does, not a measurement of what ours does.

If we had a broken hydration path, if our hero was client-only, if our city pages were rendering empty divs to the bot, the answer would have read identically. That's the worst possible failure mode for an auditing tool: confident output that doesn't change when the thing being audited changes.
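
Answering the question properly means comparing two real artifacts: the HTML the server sends and the DOM after JavaScript runs. Here is a minimal sketch of the comparison step only, assuming both have already been captured as strings (for example, the raw response body and the rendered HTML from Search Console or a headless browser). The function names and sample markup are illustrative, not part of any tool.

```python
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collect visible words, skipping script, style, noscript, and template."""
    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.split())

def visible_words(html):
    """Return the visible words of an HTML document, in order."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.words

def render_gap(raw_html, rendered_html):
    """Return words that appear only after rendering, and their share of the page."""
    raw = set(visible_words(raw_html))
    rendered = visible_words(rendered_html)
    client_only = [word for word in rendered if word not in raw]
    share = len(client_only) / len(rendered) if rendered else 0.0
    return client_only, share

# Made-up example of a client-only page: the server sends an empty shell.
RAW = '<html><body><div id="root"></div><script>render()</script></body></html>'
RENDERED = '<html><body><div id="root"><h1>Web design in Albuquerque</h1></div></body></html>'

client_only, share = render_gap(RAW, RENDERED)
print(client_only, share)
```

A high share means most of the page exists only after JavaScript runs, which is exactly the condition a text-only answer cannot detect.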

Prompt 4: Three priorities that ignored the actual problem

I asked: "If I could only fix three things on [URL] this week to improve search rankings, what should they be and why those three?"

ChatGPT said: build a topical authority cluster, convert portfolio items into standalone case studies, improve internal semantic linking. Each one came with an explanation that would apply word-for-word to any well-built Next.js agency site on the planet.

What it didn't mention: our mobile Performance score is 69, our mobile LCP scores 2 out of 25, and Google has used Core Web Vitals as a ranking signal since 2021. A real "top three this week" list given the data starts with "fix mobile LCP." ChatGPT couldn't start there because it didn't know.

It also said our "fast rendering stack using Next.js" was a strength in prompt 1. On mobile, our Performance score is 69 with LCP at 2 out of 25. "Fast" is generous.

The deeper problem

Our SEO category in Lighthouse scores 100. Best Practices is 100. Our own SEO workbench gives the site 96/100 overall. ChatGPT's gut call of "9/10 Technical SEO" is, by accident, close to the truth.

That's the trap.

ChatGPT didn't arrive at a roughly correct grade by measuring our site. It got there by pattern-matching what a well-built Next.js agency site usually looks like. We happen to be one. If we'd been one of the 96.9% of small business sites our 292-site study found failing mobile Core Web Vitals, the audit would have read identically. Same grades. Same praise for "fast rendering stack." Same "your bottleneck is authority, not technical." Same generic top three.

You can't use ChatGPT to find out if your site is broken, because ChatGPT will tell you it's fine either way.

What to use instead

For SEO auditing, use tools that actually load your page in a real browser:

PageSpeed Insights. Synthetic Lighthouse run, free, gives you real numbers.
Google Search Console URL Inspection. Shows the rendered HTML Google sees.
DeepAudit AI. Our free real browser SEO audit. 100+ checks, full Chromium rendering. Built precisely because pattern-matching auditors lie. axiondeepdigital.com/free-seo-audit
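
The lab numbers from a Lighthouse run can also be read straight from its JSON report. Here is a minimal sketch, assuming the standard report layout (`categories.performance.score` on a 0 to 1 scale and `numericValue` on each metric audit). Apart from the 0.69 Performance score, the sample values are placeholders rather than our measurements, and `lab_summary` is a hypothetical helper name.

```python
# Lighthouse audit IDs for the three lab metrics discussed in this post.
LAB_AUDITS = {
    "LCP_ms": "largest-contentful-paint",
    "CLS": "cumulative-layout-shift",
    "TBT_ms": "total-blocking-time",
}

def lab_summary(lighthouse_result):
    """Pull the Performance score (0-100) and raw lab metrics from a report."""
    audits = lighthouse_result.get("audits", {})
    summary = {
        label: audits.get(audit_id, {}).get("numericValue")
        for label, audit_id in LAB_AUDITS.items()
    }
    score = lighthouse_result.get("categories", {}).get("performance", {}).get("score")
    summary["performance"] = None if score is None else round(score * 100)
    return summary

# Trimmed sample report; metric values are placeholders (assumptions).
SAMPLE_REPORT = {
    "categories": {"performance": {"score": 0.69}},
    "audits": {
        "largest-contentful-paint": {"numericValue": 6200.0},
        "cumulative-layout-shift": {"numericValue": 0.0},
        "total-blocking-time": {"numericValue": 150.0},
    },
}

print(lab_summary(SAMPLE_REPORT))
```

Missing metrics come back as `None` rather than a guess, which is the behavior you want from an audit.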

ChatGPT is great at writing, summarizing, and explaining concepts. It is not an auditor. Asking it to audit a website is asking it to do a job it can't do, and the danger isn't that it refuses. The danger is that it answers anyway.

Written by

Joshua R. Gutierrez

SEO Engineer, Axion Deep Digital

SEO strategist and full-stack engineer who builds the audit tooling, then does the work. Technical SEO, Core Web Vitals, and content systems for SaaS and B2B.
