insights · 2026-05-04

The 8% of startup names AI hasn't decided about yet

We probed 599 startup domain names with four large language models. Most got confident, unanimous, often-correct industry guesses — names doing semantic work, names where the LLMs and a literate human would agree about category. About 8% didn't. They cluster into five distinct patterns. For builders, those patterns are the highest-leverage band of names you can pick from today.

Methodology
For each domain we asked each of four LLMs the same question five times ("What does this company do?") and measured how often the five samples landed on the same industry. 73% of names produced ≥80% agreement, meaning at least four of five samples matched (the LLMs converged). About 8% produced disagreement across different industry categories, not just label variations like "Tech" vs "Technology". Those 47 names (47 of 599, or 7.8%) form what we call the blank-canvas tail. We deliberately do not name the specific real domains in this article: the patterns described below are aggregated from the dataset, and the invented examples are illustrative.
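The agreement measurement can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not namedesk's actual pipeline: the function names, the LABEL_ALIASES table, and the sample labels are all hypothetical.

```python
from collections import Counter

# Hypothetical alias table: collapses label variations such as
# "Tech" vs "Technology" so they do not count as disagreement.
LABEL_ALIASES = {
    "tech": "technology",
    "ecommerce": "e-commerce",
}


def normalize(label: str) -> str:
    """Lowercase, trim, and map known variants to one canonical label."""
    key = label.strip().lower()
    return LABEL_ALIASES.get(key, key)


def agreement(samples: list[str]) -> tuple[str, float]:
    """Return the most common normalized label and the share of samples matching it."""
    if not samples:
        raise ValueError("need at least one sample")
    counts = Counter(normalize(s) for s in samples)
    top_label, top_count = counts.most_common(1)[0]
    return top_label, top_count / len(samples)


def has_converged(samples: list[str], threshold: float = 0.8) -> bool:
    """True when at least `threshold` of the samples share one industry."""
    return agreement(samples)[1] >= threshold


# Five samples for one name: four agree once label variations are merged.
converging = ["Tech", "Technology", "technology", "Technology ", "Real Estate"]
# Five samples that split across genuinely different categories.
splitting = ["Real Estate", "Technology", "Real Estate", "Technology", "Entertainment"]

print(agreement(converging), has_converged(converging))
print(agreement(splitting), has_converged(splitting))
```

Normalizing labels before counting is what keeps "Tech" and "Technology" from registering as a split. The 0.8 default mirrors the ≥80% threshold mentioned above.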

Why this matters

For founders building today, AI is not just where customers research products — it's also where they form first impressions. When someone asks ChatGPT, Claude, or Perplexity "what kind of company is acme.ai?", the answer is almost always confident, often based on the name alone, and very hard to redirect once cached.

For roughly 92% of names, that means the LLMs have already largely decided what your company is. For the 8% in this tail, that decision hasn't been made, which means a deliberate brand strategy can shape it.

Per namedesk's intent thesis: high LLM recall is good for monetizers (the brand has done work). Low recall is good for builders (you have room to do work). Names matching one of the five patterns below sit in the latter camp.

The five patterns

Two-meaning compound
e.g., lattermark (illustrative invented composite, not a real probed domain)
typical industry split: software vs real estate
why it happens:

A compound word in which one root is both a software-engineering term ("greenfield", "fork", "branch") and a real-estate or land-use term. The LLMs split according to which industry's vocabulary the model has seen the root used in more often.

builder take:

A founder picking a name with this shape gets to choose which vocabulary they reinforce in their copy and SEO. Whichever lane the company occupies, the other reading tends to fade within months, which makes this a cheap source of brand-narrative control.

Mood-evoking real word
e.g., twilight, eclipse, pulse (illustrative examples, not real probed domains)
typical industry split: entertainment vs technology
why it happens:

A real English word that evokes a vibe but no specific category. The LLMs reach for the cultural gravity of the word (entertainment, media) but a substantial minority pivot to "tech" because the .com TLD reads enterprise to them.

builder take:

High brand-design freedom. The name does not pre-commit you to a sector — the visual identity, copy, and product define what the word means in your context. Risk: a competitor could ship the same name in the OTHER lane and own the alternative reading globally.

Phonetic invention with no morpheme
e.g., flovira, ngonoo, plurra (illustrative invented composites, not real probed domains)
typical industry split: e-commerce vs general tech vs health/wellness
why it happens:

Pure invented word, no recognizable root. The LLMs default to "must be a startup of some kind" and split between the three commonest categories: a shopping-feeling cluster (e-commerce), a platform-feeling cluster (tech), and a soft-consonant cluster (health/wellness). The split tracks the phonetic feel of the syllables.

builder take:

Closest thing to a true blank canvas. No semantic baggage to fight. The cost: zero free brand work — every customer impression starts at "I have never heard of this and have no idea what it does." Worth it only if your product strategy includes intentional category-creation.

Suffix that lives in two industries
e.g., -pad, -board, -dock (illustrative suffixes, not real probed domains)
typical industry split: real estate vs developer tools
why it happens:

Suffixes like "-pad" or "-dock" are commercial-real-estate vocabulary AND developer-tool vocabulary. Combined with a generic prefix, the LLMs split based on which vocabulary they've seen the suffix used with more in their training data.

builder take:

A subset of the two-meaning-compound pattern. The same advice applies: pick a lane, double down in your copy, and the other reading goes quiet.

Pure abbreviation, no anchor
e.g., pdcq, kvgr, msrt (illustrative invented composites, not real probed domains)
typical industry split: tech vs finance vs healthcare
why it happens:

Four-letter abbreviations have no semantic content for the LLM to work with. The model defaults to the three most common abbreviation-resolution categories in its training data and splits.

builder take:

Maximum brand freedom and maximum required brand work. The name carries zero semantic load — every association is one you build deliberately. Best paired with a category-defining product strategy and a meaningful budget for early brand impressions.

The hard part: finding your own

Reading patterns is easy. Finding a name in this band that is also available, pronounceable, and aligned with your product strategy is hard. Traditional naming heuristics — pronounceability, length, .com availability — don't tell you whether the LLMs have already made a decision about the name.

The four-LLM probe namedesk runs on every name takes about ten seconds and surfaces the same data we used to identify these patterns. If the inference and coherence scores both come back high, the name has semantic weight — useful when you want a name that does brand-work for you, costly when you'd rather define the brand yourself. If the scores come back split (this band), you have a blank-canvas name and the responsibility that comes with it.

For the inverse case — how AI invents meaning even for nonsense names — see our companion piece below.