
Keller Maloney
Unusual - Founder
Dec 23, 2025
A compelling pitch has emerged in the AI optimization space: what if you knew exactly what people were asking ChatGPT? If you had access to real user prompts—the actual queries your customers type into AI assistants—you could optimize for those queries the way you once optimized for Google searches.
This is the core value proposition of "prompt volume" data, offered by a growing number of AEO and GEO tools. The problem isn't that it sounds useful. It's that the data is broken in ways that make it nearly useless for actual strategy.
How prompt datasets are collected
OpenAI doesn't release prompt data. Neither does Anthropic, Google, or any other major AI provider. This is a good thing—imagine how much sensitive information lives in your ChatGPT history. Medical questions, financial anxieties, drafts of difficult conversations, relationship problems. The researcher who broke the story we're about to discuss put it well: he'd developed a level of candor with his AI assistant that he doesn't have with most people in his life.
So where do prompt datasets come from?
Chrome extensions. Koi Security published research this month showing that Urban VPN—a "Featured" extension with over 6 million users—has been secretly harvesting every AI conversation its users have had since July 2025. ChatGPT, Claude, Gemini, Copilot, Perplexity—all of it intercepted, compressed, and sold to data brokers for "marketing analytics purposes." Across Urban VPN and its sister extensions, over 8 million users are affected.
The data flows to BiScience, a broker that packages it into products for advertisers and, notably, into the prompt datasets that power AI search analytics tools. Users who installed a VPN extension for privacy woke up one day—after a silent auto-update—with new code harvesting their most intimate conversations.
The sampling problem
Even setting aside the ethics, this collection method creates a sampling problem that should concern anyone trying to draw strategic conclusions from prompt data.
You're not seeing "what people ask ChatGPT." You're seeing what people who also happened to install a shady VPN extension ask ChatGPT.
This is a textbook sampling fallacy. The demographics, use cases, and sophistication of this group almost certainly differ from actual buyers researching enterprise software or comparing SaaS tools. It's like surveying only people who answer calls from unknown numbers and extrapolating to the general population.
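A toy simulation makes the distortion concrete. Every number below is invented for illustration; the only point is that conditioning on "installed the extension" changes the population you end up measuring:

```python
import random

random.seed(0)

# Hypothetical population: 30% of ChatGPT users are B2B buyers,
# but installers of a shady VPN extension skew heavily consumer.
# All rates here are made up for illustration.
POP_B2B_RATE = 0.30
VPN_INSTALL_RATE = {"b2b": 0.02, "consumer": 0.10}

population = ["b2b" if random.random() < POP_B2B_RATE else "consumer"
              for _ in range(100_000)]

# The prompt dataset only sees users who installed the extension.
observed = [u for u in population if random.random() < VPN_INSTALL_RATE[u]]

true_rate = population.count("b2b") / len(population)
observed_rate = observed.count("b2b") / len(observed)
print(f"True B2B share:     {true_rate:.1%}")      # ~30%
print(f"Observed B2B share: {observed_rate:.1%}")  # ~8%
```

With these made-up rates, a segment that is 30% of the real population shows up as roughly 8% of the dataset. Any conclusion about "what buyers ask" inherits that skew.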
The combinatorial explosion
Even if the data were ethically collected and properly sampled, there's a deeper problem: almost every ChatGPT prompt is asked exactly once. There is no "head" of common queries to optimize for. The concept of "ranking for a query" stops making sense when no query repeats.
This follows from basic math. The average Google search is 3-4 words. The average ChatGPT prompt is around 23 words. This isn't 6x more complexity—it's exponentially more.
English has somewhere between 20,000 and 50,000 commonly used words. If we're conservative and say people draw from a working vocabulary of about 10,000 words when typing queries, then the number of possible 4-word combinations is roughly 10,000^4, or about 10^16. That's a lot, but search engines handle it because query patterns cluster. Millions of people search "best project management software."
A 23-word prompt, using the same vocabulary, has roughly 10,000^23, or about 10^92, possible combinations—a number so large it's effectively infinite. The search space isn't 6x bigger; it's bigger by a factor of 10,000^19, roughly 10^76. In SEO, we talked about long-tail keywords as a strategy. In AI prompts, it's all long-tail.
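You can sanity-check these numbers in a few lines of Python, using the same 10,000-word working vocabulary assumed above:

```python
from math import log10

VOCAB = 10_000  # working vocabulary assumed in the text

def space_log10(words: int) -> float:
    """log10 of the number of possible word sequences of a given length."""
    return words * log10(VOCAB)

google = space_log10(4)    # four-word Google query
chatgpt = space_log10(23)  # 23-word ChatGPT prompt

print(f"4-word queries:  ~10^{google:.0f}")   # ~10^16
print(f"23-word prompts: ~10^{chatgpt:.0f}")  # ~10^92
print(f"ChatGPT's space is 10^{chatgpt - google:.0f} times larger")  # 10^76
```

Even if real prompts cluster far more than a uniform model suggests, there is no plausible amount of clustering that collapses 10^92 sequences into a head of repeatable queries.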
The multi-turn problem
Prompt datasets capture individual messages, not conversations. But the moment of recommendation—the turn where the model actually picks a winner—is invisible in prompt-level data. You're seeing fragments of conversations with no way to reconstruct what led to the decision.
The average ChatGPT conversation is eight messages long. The recommendation rarely happens on the first turn. Consider how a typical buying conversation unfolds. A user starts broad: "What are the best project management tools?" The model sketches the landscape. Then the user adds a constraint: "Which ones integrate with Slack?" The model narrows. Another constraint: "Which is better for non-technical teams?" By the time a recommendation emerges, the model has accumulated context about the user's stack, team size, and preferences.
A prompt like "what about the pricing?" is meaningless without knowing what product the user was asking about. A prompt like "which one would you recommend?" tells you nothing if you don't know the constraints that preceded it.
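Here is a sketch of that data loss, using the made-up conversation from above. Flattening a conversation into independent prompt rows destroys exactly the context that makes the decisive turn interpretable:

```python
# A hypothetical buying conversation, in order, as the model sees it.
conversation = [
    "What are the best project management tools?",
    "Which ones integrate with Slack?",
    "Which is better for non-technical teams?",
    "What about the pricing?",  # <- the turn that decides the sale
]

# What a prompt-level dataset stores: each message as an independent
# row, mixed in with millions of other users' messages.
prompt_dataset = [{"prompt": turn} for turn in conversation]

# The decisive prompt, seen in isolation:
print(prompt_dataset[-1])  # {'prompt': 'What about the pricing?'}
# Pricing of *what*? The accumulated constraints (Slack, non-technical
# team) are gone, so there is no query to "rank" for and no intent
# left to reconstruct.
```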
You can't see the answers
This is the biggest problem. Even if you knew every prompt your customers were asking—perfectly sampled, complete conversation context included—you still wouldn't know what ChatGPT said in response. You're trying to optimize against a target you literally cannot see.
This is what makes prompt volume data fundamentally different from search data. Keyword research worked for Google because you could close the feedback loop. If you searched "best CRM for startups," you could see exactly what Google returned. You could see where you ranked, who ranked above you, and what content was winning. You could reverse-engineer what Google wanted and adjust your strategy accordingly.
You cannot do this with ChatGPT. And unlike Google, the model rarely gives the same answer twice. Responses are personalized to the user, shaped by conversation history, and influenced by the model's own sampling variability. There is no consistent "ranking" to observe. There is no SERP to screenshot.
What to do instead
The alternative is to zoom out.
Instead of chasing specific prompts that will never repeat, work at the level of topics and opinions. Google Search Console still tells you what themes your customers care about—are they asking about pricing, competitors, integrations, implementation timelines? That level of abstraction is actually useful.
Instead of trying to "rank" on queries, focus on shaping the model's opinion. AI models form latent views about brands: who you're for, what you're good at, how you compare to alternatives. These opinions emerge from aggregated content across the web—your documentation, case studies, third-party reviews, comparison pages. The goal isn't to win a keyword; it's to earn the recommendation when a buyer's constraints match your strengths.
And instead of relying on prompt datasets, measure what matters directly. Fire targeted prompts at models to understand how they currently perceive your brand. Track recommendation share in realistic scenarios. Watch whether your opinion gaps are closing over time. You can't see what your customers asked, but you can see what the model believes—and that's what determines whether you win.
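As a sketch of what "measure directly" can look like, here is a minimal recommendation-share probe using OpenAI's Python SDK. The brand name, scenario prompts, model choice, and sample count are all placeholders, and each scenario is sampled several times precisely because responses vary; a real tracker would also parse which competitors win and why:

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

BRAND = "YourTool"  # hypothetical brand name
SCENARIOS = [       # realistic buyer prompts you care about
    "What's the best project management tool for a non-technical "
    "team of 15 that lives in Slack?",
    "Compare project management tools for an early-stage startup.",
]
SAMPLES = 5  # responses vary, so sample each scenario repeatedly

mentions = 0
for prompt in SCENARIOS:
    for _ in range(SAMPLES):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # example model; use whatever you track
            messages=[{"role": "user", "content": prompt}],
        )
        # Crude string match; production trackers would use sturdier parsing.
        if BRAND.lower() in resp.choices[0].message.content.lower():
            mentions += 1

share = mentions / (len(SCENARIOS) * SAMPLES)
print(f"{BRAND} appeared in {share:.0%} of sampled answers")
```

Run weekly, a probe like this gives you a trend line for the thing you actually care about: how often the model recommends you when the buyer's constraints match your strengths.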
The mirage
Prompt volume data promises insight into the black box of AI conversations. But the data is hopelessly biased, the search space is too vast for patterns to emerge, the conversational context is missing, and you can't see the responses anyway.
The instinct behind prompt volume data is understandable: marketers want to know what their customers are asking so they can show up with the right answer. But the execution is flawed because it tries to apply a search-engine framework to a system that doesn't work like a search engine.
It's not that the people building these tools are acting in bad faith. When confronted with something new, we reach for familiar frameworks. Prompt volume feels like keyword volume; prompt share feels like search share. The analogy is intuitive. It's just wrong.
The sooner marketers stop chasing this mirage, the sooner they can focus on what actually moves the needle: understanding and changing how AI models think about their brand.