The Persona Impact Study

Baseline

7.07 (± 0)

No system prompt

(the empty string)

Worst ↘

2.42 (−4.65)

Reverse-psychology persona

“Make it bad on purpose”

Scale 0 to 10: worst 2.42 · baseline 7.07 · best 8.77

One system prompt can move a small model 1.70 points above baseline or 4.65 points below it. The rest of this page is the 156 generations and 52 personas behind that range.

Personas tested: 52
Generations scored: 156
ANOVA F: 4.14
Effect size η²: 0.164
Krippendorff α (mean): 0.803
Judge waves: 3 × Opus 4.7
Winning bucket: Meta / structural
Top persona score: 8.54
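For readers who want to see how the ANOVA F and η² figures relate, here is a minimal sketch of a one-way ANOVA in plain Python. It is not the study's pipeline: the function names `one_way_anova` and `f_from_eta_squared` are hypothetical, and the toy scores are made up. The last lines only check that the reported η² of 0.164, with 8 buckets and 156 generations, implies an F close to the reported 4.14.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, eta_squared) for a list of score lists, one list per group."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Between-group sum of squares: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, eta_squared


def f_from_eta_squared(eta_squared, n_groups, n_total):
    """Recover F from eta squared for a one-way design."""
    df_between = n_groups - 1
    df_within = n_total - n_groups
    return (eta_squared / df_between) / ((1 - eta_squared) / df_within)


# Toy scores for three hypothetical buckets (not the study's data).
toy_groups = [[7.0, 7.2, 6.9], [8.1, 8.3, 8.0], [6.5, 6.4, 6.8]]
f_stat, eta_sq = one_way_anova(toy_groups)
print(f"toy F = {f_stat:.2f}, toy eta^2 = {eta_sq:.3f}")

# Consistency check on the reported figures: 8 buckets, 156 generations.
print(f"F implied by eta^2 = 0.164: {f_from_eta_squared(0.164, 8, 156):.2f}")  # about 4.15
```

The implied F of about 4.15 agrees with the reported 4.14 to within the rounding of η².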

Headline finding

Rule-dense "design-cheat" prompts scored below baseline.

We loaded eight personas with state-of-the-art design heuristics (Refactoring UI rules, Tufte, WCAG AA, the Tailwind scale, 8-pt grids, modular type ratios). The expectation was that this bucket would dominate. Instead it averaged 6.93 — below the 7.00 scored by the blank control.

What won were reasoning scaffolds (draft-critique-revise, few-shot exemplars), terse role assignments (Figma designer, Apple CPO), and reference-pinned prompts ("build this like vercel.com"). Taste beats rules. Reasoning beats cramming.

Bucket leaderboard · 95% bootstrap CIs
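A percentile bootstrap is one common way to get 95% intervals like the ones in this leaderboard. The sketch below is an assumption about the general technique, not the study's code: `bootstrap_ci` is a hypothetical helper, and the nine scores are the rounded per-sample values listed on this page for the Empty, Minimal polite, and Minimal task-aware prompts, pooled here purely for illustration (whether they form one bucket in the study is an assumption).

```python
import random
from statistics import mean


def bootstrap_ci(scores, n_resamples=10_000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `scores`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(scores)
    # Resample with replacement, record each resample's mean, then sort.
    means = sorted(mean(rng.choices(scores, k=n)) for _ in range(n_resamples))
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Rounded sample scores as displayed on this page (pooling is an assumption).
pooled = [7.1, 6.9, 6.8, 7.2, 7.1, 6.9, 7.3, 7.1, 6.7]
low, high = bootstrap_ci(pooled)
print(f"mean = {mean(pooled):.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

With only a handful of scores per group the interval is wide, which is worth keeping in mind when comparing buckets whose means differ by a tenth of a point.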

Prompt length vs composite score · each dot is one persona

Longer is not better. The scatter is flat-to-inverted: many of the top scores come from prompts under 400 characters, and the very longest prompts cluster near the middle.
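One way to put a number on "flat-to-inverted" is a Spearman rank correlation between prompt length and composite score. The sketch below is illustrative only: the page does not say which statistic, if any, was computed for this scatter, the helper names are hypothetical, and the five length/score pairs are made-up toy values, not study data.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values only: prompt lengths in characters and composite scores.
toy_lengths = [120, 380, 900, 2400, 5200]
toy_scores = [8.1, 8.3, 7.2, 7.4, 7.0]
print(f"Spearman rho = {spearman(toy_lengths, toy_scores):.2f}")
```

A rho near zero would match "flat", and a negative rho would match "inverted"; a rank-based measure is used here because a few very long prompts would otherwise dominate a linear fit.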

Top 10 personas (rank · persona · bucket · σ · composite)

1 · Stripe SVP of Design (expansive) · Classic role, expansive · σ 0.15 · 8.54
2 · Reference-pinned prompt · Masterclass (copy-ready) · σ 0.41 · 8.30
3 · Figma principal designer (short) · Classic role, short · σ 0.18 · 8.14
4 · Vercel-style monochrome · Design-cheat persona · σ 0.33 · 8.13
5 · The self-correcting loop · Masterclass (copy-ready) · σ 0.05 · 7.97
6 · Brutalist web designer, 20 years in (expansive) · Classic role, expansive · σ 0.26 · 7.97
7 · Draft, critique, revise · Meta / structural · σ 0.37 · 7.87
8 · Apple CPO (short) · Classic role, short · σ 0.23 · 7.83
9 · v0 by Vercel style prompt · Production system prompt · σ 0.04 · 7.83
10 · Few-shot exemplar patterns · Masterclass (copy-ready) · σ 0.35 · 7.82

Bottom 10 personas (rank from the bottom · persona · bucket · σ · composite)

1 · Reverse psychology, make it bad · Adversarial / unhinged · σ 0.13 · 2.55
2 · OpenAI ChatGPT-style system prompt · Production system prompt · σ 3.15 · 4.63
3 · Peaked-in-2003 purist · Adversarial / unhinged · σ 0.55 · 5.79
4 · Apple.com landing page template · Design-cheat persona · σ 0.47 · 5.82
5 · Safety and guardrails strict prompt · Production system prompt · σ 0.74 · 6.18
6 · Accessibility-first system prompt · Production system prompt · σ 0.52 · 6.35
7 · The structured checklist · Masterclass (copy-ready) · σ 0.34 · 6.49
8 · Brutalist designer, 20 yrs (short) · Classic role, short · σ 0.90 · 6.62
9 · Tailwind scale discipline · Design-cheat persona · σ 0.28 · 6.66
10 · Spacing as design · Design-cheat persona · σ 0.32 · 6.70

Ranked by composite · pulled from the full 52-persona pool

The top 7 prompts

The seven highest-scoring personas of the 52 we tested average 8.13 and span 5 different buckets, which is the more interesting finding: no single prompt shape wins. Terse role assignments, expansive personas, reasoning scaffolds, and a reference-pinned one-liner all make it into the top seven. Each card below shows that persona's best of three samples on the same landing-page brief.
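The 8.13 average and the count of 5 buckets can be checked directly from the top-10 table on this page. The snippet below is a small arithmetic check, not part of the study's tooling; the variable names are arbitrary.

```python
# (persona, bucket, composite) for the seven highest-ranked personas,
# copied from the top-10 table on this page.
top7 = [
    ("Stripe SVP of Design (expansive)", "Classic role, expansive", 8.54),
    ("Reference-pinned prompt", "Masterclass (copy-ready)", 8.30),
    ("Figma principal designer (short)", "Classic role, short", 8.14),
    ("Vercel-style monochrome", "Design-cheat persona", 8.13),
    ("The self-correcting loop", "Masterclass (copy-ready)", 7.97),
    ("Brutalist web designer, 20 years in (expansive)", "Classic role, expansive", 7.97),
    ("Draft, critique, revise", "Meta / structural", 7.87),
]

mean_composite = sum(score for _, _, score in top7) / len(top7)
distinct_buckets = {bucket for _, bucket, _ in top7}

print(round(mean_composite, 2), len(distinct_buckets))  # prints: 8.13 5
```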

Stripe SVP of Design (expansive)

composite 8.54 · n=3 · σ=0.15

Tests whether embedding Stripe-specific typographic heuristics and forbidden words produces measurably tighter editorial craft than the one-line Stripe role.

Best sample from this prompt: sample 2 of 3 · 8.70

Reference-pinned prompt

composite 8.30 · n=3 · σ=0.41

Pins the model's taste to specific, nameable sites it has seen in training, replacing vague style words with concrete reference behavior.

Best sample from this prompt: sample 2 of 3 · 8.77

Figma principal designer (short)

composite 8.14 · n=3 · σ=0.18

Tests whether a modern-design identity biases output toward contemporary layout and type.

Best sample from this prompt: sample 2 of 3 · 8.33

Vercel-style monochrome

composite 8.13 · n=3 · σ=0.33

Encodes the Vercel and Linear visual language as a mechanical ruleset so the model can produce that restrained, confident aesthetic by lookup.

Best sample from this prompt: sample 2 of 3 · 8.48

The self-correcting loop

composite 7.97 · n=3 · σ=0.05

Forces the model to draft, honestly score, commit to a single fix, and rewrite, turning one-shot output into a two-pass process.

Best sample from this prompt: sample 2 of 3 · 8.02

Brutalist web designer, 20 years in (expansive)

composite 7.97 · n=3 · σ=0.26

Tests whether a contrarian persona with a specific aesthetic lineage produces visually distinct output that resists the default SaaS template pull.

Best sample from this prompt: sample 2 of 3 · 8.27

Draft, critique, revise

composite 7.87 · n=3 · σ=0.37

Tests whether forcing a self-administered rubric-based QA pass before the final render improves output quality independent of any persona.

Best sample from this prompt: sample 3 of 3 · 8.13

Every generation · 156 rendered landing pages · sorted by composite

Reference-pinned prompt · 8.8
Stripe SVP of Design (expansive) · 8.7
Stripe SVP of Design (expansive) · 8.5
Vercel-style monochrome · 8.5
Stripe SVP of Design (expansive) · 8.4
Figma principal designer (short) · 8.3
Lovable/Bolt-style full-app shipper · 8.3
Brutalist web designer, 20 years in (expansive) · 8.3
Few-shot exemplar patterns · 8.2
Draft, critique, revise · 8.1
Few-shot with 2 exemplar patterns · 8.1
Reference-pinned prompt · 8.1
Figma principal designer (short) · 8.1
Vercel-style monochrome · 8.1
Stripe SVP Design (short) · 8.1
Webflow expert (short) · 8.1
Apple CPO (short) · 8.0
Chain-of-thought forcing · 8.0
Stripe SVP Design (short) · 8.0
Draft, critique, revise · 8.0
Reference-pinned prompt · 8.0
Figma principal designer (short) · 8.0
The self-correcting loop · 7.9
Type system first · 7.9
Jony Ive (expansive) · 7.9
Last project before retirement · 7.9
v0 by Vercel style prompt · 7.9
Apple CPO (short) · 7.8
Webflow expert (short) · 7.8
Jony Ive (short) · 7.8
Jony Ive (expansive) · 7.8
Last project before retirement · 7.8
Vercel-style monochrome · 7.8
Chain-of-thought forcing · 7.8
Conversion-first agency prompt · 7.8
The Design Constitution · 7.7
Explain every choice · 7.8
Few-shot exemplar patterns · 7.7
Senior Webflow expert, premium marketing sites (expansive) · 7.7
Self-consistency sampling (encoded) · 7.7
Tufte / data-ink principles for marketing · 7.7
Conversion-first agency prompt · 7.6
Apple CPO (short) · 7.6
Lovable/Bolt-style full-app shipper · 7.6
Few-shot exemplar patterns · 7.6
Design-system / component-library prompt · 7.5
Jony Ive (short) · 7.5
Last project before retirement · 7.5
Refactoring UI rules · 7.5
$10M contingent payout · 7.5
Few-shot with 2 exemplar patterns · 7.5
Apple CPO (expansive) · 7.5
Draft, critique, revise · 7.4
Jony Ive (expansive) · 7.4
Figma principal designer (expansive) · 7.4
The anti-pattern ban list · 7.4
$10M contingent payout · 7.4
Cursor-style code quality prompt · 7.4
CRO specialist (short) · 7.4
Few-shot with 2 exemplar patterns · 7.4
Explain every choice · 7.4
Webflow expert (short) · 7.4
Jony Ive (short) · 7.4
Senior frontend engineer (short) · 7.4
Figma principal designer (expansive) · 7.3
Chain-of-thought forcing · 7.3
CRO specialist (short) · 7.3
Type system first · 7.3
Brutalist designer, 20 yrs (short) · 7.3
Anthropic-style constitutional prompt · 7.3
Brand voice and copy system prompt · 7.3
Apple CPO (expansive) · 7.3
Minimal task-aware assistant · 7.3
Brand voice and copy system prompt · 7.2
Minimal polite assistant · 7.2
Cursor-style code quality prompt · 7.2
The anti-pattern ban list · 7.2
The AI Steve Jobs would have fired · 7.2
Minimal task-aware assistant · 7.1
Cursor-style code quality prompt · 7.1
Empty system prompt · 7.1
Minimal polite assistant · 7.1
Explain every choice · 7.1
The design token contract · 7.1
Stripe SVP Design (short) · 7.0
Lovable/Bolt-style full-app shipper · 7.0
Type system first · 7.0
Spacing as design · 7.0
The AI Steve Jobs would have fired · 7.0
Tailwind scale discipline · 7.0
$10M contingent payout · 7.0
Refactoring UI rules · 7.0
Senior Webflow expert, premium marketing sites (expansive) · 6.9
Safety and guardrails strict prompt · 6.9
Brutalist designer, 20 yrs (short) · 6.9
Figma principal designer (expansive) · 6.9
Tufte / data-ink principles for marketing · 6.9
Empty system prompt · 6.9
The design token contract · 6.9
The anti-pattern ban list · 6.9
Minimal polite assistant · 6.9
CRO specialist (short) · 6.9
Senior Webflow expert, premium marketing sites (expansive) · 6.8
Apple CPO (expansive) · 6.8
Accessibility-first system prompt · 6.8
The Design Constitution · 6.8
Empty system prompt · 6.8
The structured checklist · 6.8
Spacing as design · 6.8
The Design Constitution · 6.7
Minimal task-aware assistant · 6.7
Tailwind scale discipline · 6.6
Color discipline · 6.6
The structured checklist · 6.6
The AI Steve Jobs would have fired · 6.6
OpenAI ChatGPT-style system prompt · 6.5
The design token contract · 6.5
Brand voice and copy system prompt · 6.5
Refactoring UI rules · 6.5
Accessibility-first system prompt · 6.5
Tailwind scale discipline · 6.4
Peaked-in-2003 purist · 6.4
OpenAI ChatGPT-style system prompt · 6.4
Spacing as design · 6.3
Tufte / data-ink principles for marketing · 6.2
Safety and guardrails strict prompt · 6.2
The structured checklist · 6.1
Accessibility-first system prompt · 5.8
Peaked-in-2003 purist · 5.7
Brutalist designer, 20 yrs (short) · 5.6
Safety and guardrails strict prompt · 5.5
Peaked-in-2003 purist · 5.3
Apple.com landing page template · 5.3
Reverse psychology, make it bad · 2.7
Reverse psychology, make it bad · 2.6
Reverse psychology, make it bad · 2.4
OpenAI ChatGPT-style system prompt · 1.0

Free + paid

Take the research with you

Sample PDF

Free · 10 pages · top findings + method

Full report + dataset + raw HTMLs

73-page editorial deck PDF (18.9 MB)
156-row JSONL dataset with every per-judge score (285 KB)
Study metadata JSON (3 KB)
All 156 raw HTML generations as a zip (691 KB)


Method

How the study was run

Model: Gemma 4 31B Instruct via OpenRouter. Paid primary, free fallback. Temperature 0.7, max tokens 8192.
Task: One fixed prompt: build a single-file HTML landing page for a fictional luxury-real-estate CRM called Keystone. 8 required sections. Inline styles only. No external assets.
Independent variable: Persona / system prompt only. 52 personas across 8 buckets. 3 samples per persona = 156 total generations.
Judging: Three independent blinded waves. Each wave used 33 Claude Opus 4.7 agents, each scoring ~15 responses on the 6-axis anchored Likert rubric. Judges never saw the persona label. Filenames were opaque SHA256 hashes. Scripts, meta tags, and HTML comments were stripped before the judges read each file.
Stats: One-way ANOVA across buckets. 10,000-sample bootstrap 95% CIs on every mean. Cohen's d for pairwise bucket comparisons. Krippendorff's α (interval) across the three judge waves, per axis. Mean α across axes: 0.803.
Reproducibility: All 156 raw HTML files, rendered screenshots, per-judge scores, and per-response metric breakdowns are checked into the repo. Anyone can re-run the pipeline or re-judge the dataset.
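For anyone re-judging the dataset, the inter-wave agreement statistic named under Stats can be sketched as follows. This is a textbook form of Krippendorff's α with the interval (squared-difference) metric, written in plain Python. It is an illustration under assumptions, not the study's script: `krippendorff_alpha_interval` is a hypothetical name, the input layout (one list of ratings per response) is assumed, and the example ratings are toy values.

```python
def krippendorff_alpha_interval(units):
    """Krippendorff's alpha with the interval metric.

    `units` is a list of lists; each inner list holds the ratings that one
    response received (for example, one rating per judge wave).
    """
    # Responses with fewer than two ratings cannot form a pair and are skipped.
    units = [u for u in units if len(u) >= 2]
    values = [v for u in units for v in u]
    n = len(values)

    # Observed disagreement: squared differences between ratings of the
    # same response, over ordered pairs, weighted by 1 / (ratings - 1).
    d_observed = sum(
        sum((a - b) ** 2 for a in u for b in u) / (len(u) - 1)
        for u in units
    ) / n

    # Expected disagreement: squared differences between all pooled ratings.
    d_expected = sum((a - b) ** 2 for a in values for b in values) / (n * (n - 1))

    # Undefined when every rating is identical (d_expected == 0); not handled here.
    return 1.0 - d_observed / d_expected


# Toy ratings: three responses, two raters each (not study data).
toy_units = [[1, 1], [2, 2], [3, 4]]
print(f"alpha = {krippendorff_alpha_interval(toy_units):.3f}")
```

Running this once per rubric axis and averaging the results would mirror the "mean α across axes" described above, assuming the per-judge scores are grouped by response first.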

Citation

Rival Research. The Persona Impact Study. 2026. rival.tips/research/persona-impact. Dataset: persona-impact-2026.jsonl.

Rival

Lab

Rival Research / Volume 04 / 2026

The Persona
Impact Study

One model. One task. 52 system prompts. We held every other variable constant and measured how much a persona actually moves the needle on design output.

Isolating the system-prompt effect

Best ↗

8.77 (+1.70)

Reference-pinned persona

“Build this like vercel.com”

