Key Question: Can we tell what types of content LLMs prefer? For example, are LLMs more likely to cite content that combines video, images, reviews, and similar elements? We analyzed over 1.2 million citations from 8 different LLMs to find out.

Methodology

This analysis is based on data from Spotlight’s database, which tracks how different LLMs cite content in their responses. We analyzed:

1,684 source analyses from Gemini 2.0 Flash, examining detailed content characteristics
1.2+ million response links from 8 different LLMs (ChatGPT, Gemini, Perplexity, Claude, Copilot, Grok, AIO, and AIMode)
Content preferences across visual elements, structure, depth, and source types
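
As a rough illustration of the tallying step behind figures like these, the sketch below counts how often each domain appears in a list of response links. It is not Spotlight's actual pipeline: the function name `top_cited_domains` and the sample URLs are assumptions made up for this example.

```python
from collections import Counter
from urllib.parse import urlparse


def top_cited_domains(urls, n=3):
    """Count how often each domain appears in a list of cited URLs."""
    counts = Counter()
    for url in urls:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]  # treat www.example.com and example.com as one domain
        if host:
            counts[host] += 1
    return counts.most_common(n)


# Hypothetical sample of response links, not taken from the dataset.
sample_links = [
    "https://en.wikipedia.org/wiki/Large_language_model",
    "https://www.reddit.com/r/MachineLearning/",
    "https://en.wikipedia.org/wiki/Search_engine",
    "https://www.youtube.com/watch?v=example",
]
print(top_cited_domains(sample_links))
```

Run per LLM, a tally like this would give the "top preference" counts reported in the sections that follow.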

The Universal Content Preferences

Our analysis reveals that LLMs have remarkably consistent preferences when it comes to content types. Here’s what we found across all models:

95.13% of analyzed content contains images

90.62% of content uses bullet points or lists

78.80% of content includes visual data (images/videos)

74.76% of content shows author credentials

LLM-Specific Content Preferences

ChatGPT: The Wikipedia Champion

Total Citations: 290,493

Top Preference: Wikipedia dominates with 20,309 citations (7% of all ChatGPT citations)

Key Insight:

ChatGPT has the highest share of .org domains (10.29%) and leans on academic sources, suggesting a preference for authoritative, well-sourced content.

Content Type Breakdown:

Guide/Tutorial content: 12.45%
Blog content: 11.23%
Listicle format: 12.19%

Perplexity: The Social Media Enthusiast

Total Citations: 445,176 (highest among all LLMs)

Top Preference: Reddit dominates with 13,614 citations

Key Insight:

Perplexity shows the strongest preference for user-generated content and social platforms, with Reddit, YouTube, and Google Play Store being top sources.

Content Type Breakdown:

Blog content: 17.95%
Guide/Tutorial content: 14.66%
Listicle format: 9.10%

Gemini: The Google Ecosystem Expert

Total Citations: 328,134

Top Preference: Google Play Store with 3,745 citations

Key Insight:

Gemini heavily favors Google’s own properties and services, with Google Play, YouTube, and Google’s AI search being top sources.

Content Type Breakdown:

Guide/Tutorial content: 14.89%
Blog content: 16.87%
Listicle format: 9.31%

Claude: The UK-Focused Specialist

Total Citations: 460 (smallest dataset)

Top Preference: Wise.com with 26 citations

Key Insight:

Claude shows a strong preference for UK-based financial services and consumer advice sites, with 37.61% of citations from .co.uk domains.

Content Type Breakdown:

Guide/Tutorial content: 23.70%
Blog content: 22.17%
Listicle format: 15.22%

Copilot: The E-commerce Expert

Total Citations: 10,450

Top Preference: Amazon with 568 citations

Key Insight:

Copilot shows the strongest preference for e-commerce platforms, with Amazon, Walmart, and Target being top sources.

Content Type Breakdown:

Listicle format: 14.99%
Blog content: 13.07%
Guide/Tutorial content: 11.03%

Grok: The X (Twitter) Native

Total Citations: 2,566

Top Preference: X.com (formerly Twitter) with 732 citations

Key Insight:

Grok shows the highest preference for .com domains (81.49%) and heavily favors its parent company’s platform, X.com.

Content Type Breakdown:

Blog content: 12.98%
Guide/Tutorial content: 10.68%
Listicle format: 5.07%
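
The domain-suffix figures quoted above (.org for ChatGPT, .co.uk for Claude, .com for Grok) are shares of citations whose domain ends in a given suffix. A minimal sketch of that calculation follows, assuming one domain string per citation; the helper `suffix_share` and the sample domains are hypothetical and not drawn from the dataset.

```python
def suffix_share(domains, suffix):
    """Percentage of domains ending with the given suffix, e.g. '.org' or '.co.uk'."""
    if not domains:
        return 0.0
    hits = sum(1 for d in domains if d.lower().rstrip(".").endswith(suffix))
    return round(100 * hits / len(domains), 2)


# Hypothetical cited domains, one entry per citation.
cited = [
    "example.co.uk",
    "shop.example.co.uk",
    "example.com",
    "news.example.com",
    "example.org",
]
for suffix in (".co.uk", ".com", ".org"):
    print(suffix, suffix_share(cited, suffix))
```

A plain suffix match is the simplest choice here. A production version would more likely rely on the Public Suffix List so that multi-part suffixes such as .co.uk are handled consistently.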

Content Characteristics That Matter Most

Based on our analysis of 1,684 source analyses from Gemini 2.0 Flash, here are the content characteristics that appear most frequently in LLM-cited content:

Characteristic	Percentage	What This Means
Images Present	95.13%	Visual content is nearly universal in cited content
Uses Bullet Points	90.62%	Structured, scannable content is preferred
Visual Data (Images/Videos)	78.80%	Multimedia content is highly valued
Author Credentials	74.76%	Credibility and expertise matter
Uses Opinions	64.85%	Subjective insights are valued alongside facts
Corporate Website	61.28%	Official brand sources are heavily cited
Signs of Agenda	60.27%	Content with clear purpose/intent is preferred
Fresh Content	57.78%	Recent information is valued
Highlighted Keywords	48.34%	SEO-optimized content performs well
FAQ Sections	35.39%	Question-and-answer format is effective

The Content Depth Sweet Spot

Our analysis reveals that LLMs prefer content that’s neither too shallow nor too deep:

71.08% of cited content is “moderate” depth

Only 4.28% of cited content is classified as “in-depth,” while 5.29% is “surface-level.” This suggests that LLMs prefer content that provides substantial information without being overwhelming.

Visual Content: The Universal Language

Visual content appears to be the most consistent preference across all LLMs:

95.13% of cited content contains images
10.45% contains videos
78.80% has some form of visual data

On average, a cited piece of content contains 9.3 sections and 83 paragraphs and runs 2,820 characters.

Domain Preferences by LLM

Each LLM shows distinct domain preferences that reflect its training and purpose:

LLM	Top Domain Preference	% of Citations	Characteristic
ChatGPT	en.wikipedia.org	7.0%	Academic, authoritative
Perplexity	reddit.com	3.1%	User-generated, social
Gemini	play.google.com	1.1%	Google ecosystem
Claude	wise.com	5.7%	UK financial services
Copilot	amazon.com	5.4%	E-commerce focused
Grok	x.com	28.5%	Social media native
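
The percentages in this table follow directly from the citation counts given in the per-LLM sections: top-domain citations divided by total citations. The short check below recomputes them. The counts are the article's own figures, while the dictionary layout and the helper name `share_percent` are just one way to arrange the arithmetic.

```python
# (top domain, top-domain citations, total citations) as reported per LLM above
reported = {
    "ChatGPT": ("en.wikipedia.org", 20_309, 290_493),
    "Perplexity": ("reddit.com", 13_614, 445_176),
    "Gemini": ("play.google.com", 3_745, 328_134),
    "Claude": ("wise.com", 26, 460),
    "Copilot": ("amazon.com", 568, 10_450),
    "Grok": ("x.com", 732, 2_566),
}


def share_percent(top, total):
    """Share of total citations going to the top domain, to one decimal place."""
    return round(100 * top / total, 1)


shares = {llm: share_percent(top, total) for llm, (_, top, total) in reported.items()}
for llm, pct in shares.items():
    print(f"{llm}: {reported[llm][0]} {pct}%")
```

The recomputed shares match the table. They also show how much the totals differ: Grok's 28.5% rests on 2,566 citations and Claude's 5.7% on 460, against hundreds of thousands for ChatGPT, Perplexity, and Gemini.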

Key Takeaways

Visual content is essential: 95% of cited content contains images, making visual elements nearly universal in LLM-preferred content.
Structure matters: Over 90% of cited content uses bullet points or lists, indicating a strong preference for scannable, organized information.
Moderate depth wins: 71% of cited content is “moderate” depth – not too shallow, not too deep.
Credibility counts: 75% of cited content shows author credentials, emphasizing the importance of expertise.
LLMs have distinct personalities: Each LLM shows unique preferences reflecting its training and purpose (ChatGPT loves Wikipedia, Perplexity favors Reddit, etc.).
Corporate content dominates: 61% of cited content comes from corporate websites, suggesting official brand sources are highly valued.

Practical Implications for Content Creators

Based on this analysis, here’s what content creators should focus on to improve their chances of being cited by LLMs:

1. Visual Content Strategy

Include images in nearly all of your content (95.13% of cited content has them)
Consider adding video where it helps (10.45% of cited content includes it)
Ensure visual elements support and enhance the text

2. Content Structure

Use bullet points and lists extensively (90%+ of content)
Organize content into clear sections (average 9.3 sections)
Keep paragraphs manageable (average 83 paragraphs per piece)
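
One way to check a page against the visual and structural points above is a small audit script. The sketch below uses Python's standard `html.parser` to count images, videos, lists, headings, and paragraphs. The class name `StructureAudit`, the choice to treat `h2` and `h3` headings as "sections", and the sample page are all assumptions for illustration, not how the cited content was classified in this analysis.

```python
from html.parser import HTMLParser


class StructureAudit(HTMLParser):
    """Tally images, videos, lists, headings, and paragraphs in an HTML page."""

    def __init__(self):
        super().__init__()
        self.counts = {"images": 0, "videos": 0, "lists": 0, "sections": 0, "paragraphs": 0}

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.counts["images"] += 1
        elif tag == "video":
            self.counts["videos"] += 1
        elif tag in ("ul", "ol"):
            self.counts["lists"] += 1
        elif tag in ("h2", "h3"):
            # Assumption: each h2/h3 heading marks one section.
            self.counts["sections"] += 1
        elif tag == "p":
            self.counts["paragraphs"] += 1


def audit(html):
    """Return structure counts for an HTML string."""
    parser = StructureAudit()
    parser.feed(html)
    parser.close()
    return parser.counts


# Hypothetical page fragment.
page = """
<h2>Setup</h2><p>Intro text.</p><img src="a.png" alt="diagram">
<ul><li>One</li><li>Two</li></ul>
<h2>Usage</h2><p>More text.</p>
"""
print(audit(page))
```

Comparing these counts with the averages reported above (images present, lists used, roughly 9 sections) gives a quick, if crude, read on how a page lines up with typical cited content.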

3. Authority and Credibility

Showcase author credentials and expertise
Include empirical evidence when possible
Cite the sources behind your claims

4. Content Depth

Aim for “moderate” depth – comprehensive but not overwhelming
Target 2,000-3,000 characters per piece
Balance thoroughness with accessibility

5. Platform-Specific Optimization

For ChatGPT: Focus on authoritative, well-sourced content similar to Wikipedia
For Perplexity: Create engaging, social-friendly content that sparks discussion
For Gemini: Optimize for Google’s ecosystem and services
For Claude: Consider UK-focused content and financial services
For Copilot: Focus on e-commerce and product-related content

Final Thoughts

While LLMs show distinct preferences based on their training and purpose, there are universal content characteristics that improve citation likelihood across all models. Visual content, structured presentation, moderate depth, and clear authority signals appear to be the most important factors for LLM citation success.

As AI continues to evolve and new models emerge, understanding these preferences becomes crucial for content creators looking to optimize for AI visibility. The data shows that the future of content optimization isn’t just about search engines; it’s also about understanding how AI models consume and cite information.

This analysis is based on data from Spotlight’s database, which tracks LLM citations across multiple AI models. The data represents real-world citation patterns from over 1.2 million analyzed links.

What Content Types Do LLMs Prefer? A Data-Driven Analysis

Michael Hermon