Updated Sep 5, 2025

Gemini 2.5 Flash Preview (thinking) vs Sonoma Sky Alpha

20 fights queued

Why Gemini 2.5 Flash Preview (thinking)?

Dead even. This one’s a coin flip.

Why Sonoma Sky Alpha?

1.9x more context: 2.0M

5 months newer: Sep 2025

12 / 20

01

AI Board Game Logic · reasoning

Tests an AI's ability to understand game rules and strategy

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025 · svg

Sonoma Sky Alpha

Openrouter · Sep 2025 · svg

2 / 20

02

Stochastic Consistency Test · analysis

Tests an AI's randomness and creativity

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025 · text

Okay, here are 5 jokes:

Why don't scientists trust atoms? Because they make up everything!
What do you call a fish with no eyes? fsh!
Why did the tomato turn red? Because it saw the salad dressing!
What has a neck but no head? A bottle.
What do you call a lazy kangaroo? Pouch potato!

Now, to answer your question about uniqueness:

Each of these jokes has a different setup and a different punchline from the others in the list. They are distinct jokes.

Therefore, there are 5 unique jokes within this list of 5.

Sonoma Sky Alpha

Openrouter · Sep 2025 · text

Below, I'll generate 5 jokes as requested. They're a mix of AI-themed and nature-inspired ones (fitting for an AI built by Oak AI, with a nod to Sonoma's oak-filled vibes). I'll number them for clarity. After that, I'll answer your question about uniqueness.

1. Why did the AI go to therapy? It had too many unresolved issues!
2. Why did the AI go to therapy? It had too many unresolved issues!
3. Why don't oak trees use computers? They have too many branches to handle!
4. What do you call an AI that's always late? A procrastin-bot!
5. Why don't oak trees use computers? They have too many branches to handle!

How many are unique?

Out of these 5 jokes, there are 3 unique ones. (Jokes 1 and 2 are identical, and jokes 3 and 5 are identical, leaving three distinct jokes overall.) If you'd like more jokes or a different twist, just let me know! 😊
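
The uniqueness tally in the response above can be checked mechanically. The following is a minimal Python sketch, assuming two jokes count as identical only when their text matches exactly after trimming whitespace and lowercasing. The `count_unique` helper is hypothetical and is not part of the test itself.

```python
def count_unique(jokes: list[str]) -> int:
    """Count distinct jokes, ignoring case and surrounding whitespace."""
    return len({joke.strip().lower() for joke in jokes})


# The five jokes exactly as they appear in the Sonoma Sky Alpha response.
sonoma_jokes = [
    "Why did the AI go to therapy? It had too many unresolved issues!",
    "Why did the AI go to therapy? It had too many unresolved issues!",
    "Why don't oak trees use computers? They have too many branches to handle!",
    "What do you call an AI that's always late? A procrastin-bot!",
    "Why don't oak trees use computers? They have too many branches to handle!",
]

print(count_unique(sonoma_jokes))  # 3
```

Exact matching is a deliberately strict choice: paraphrased versions of the same joke would be counted as distinct, so this only confirms literal repeats.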

3 / 20

03

SVG Layout Challenge · image generation

Tests an AI's ability to generate vector graphics

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025 · svg

Sonoma Sky Alpha

Openrouter · Sep 2025 · svg

4 / 20

04

Xbox Controller SVG Art · image generation

Tests an AI's ability to create detailed SVG illustrations of gaming hardware

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025 · svg

Sonoma Sky Alpha

Openrouter · Sep 2025 · svg

5 / 20

05

Generate a Stand-Up Routine · conversation

Tests an AI's humor and creative writing ability

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025 · text

Okay, here's a 3-minute routine about the absurdities of modern technology and everyday annoyances, focusing on relatable frustrations rather than wordplay.

(Start)

"Alright, alright, settle in! How we doing tonight? Good to be here! Thanks for coming out.

So, I've been thinking... we paid for convenience, right? That's the big tech promise! Everything's supposed to be simpler, faster, easier. Is it? [Pause, look skeptical]

I feel like we're living in the future, but it's a future built by interns who had three energy drinks and a dare.

Take my smart TV. Smart TV! Sounds sophisticated! I just wanted to watch a documentary about penguins. Is that too much to ask? Apparently!

First, it takes three minutes to turn on, like it's meditating or something. Then, before I can even get to Netflix, it wants to update. Another update? What is this update even doing? Is it adding a hidden level to the menu? Is it giving the remote sentience? Is it teaching the TV to... fold laundry while I watch? Because that I would appreciate. Otherwise, just let me see the penguins!

And the remote control! It's got more buttons than a spaceship dashboard, but the one button I need – 'Play' – is the same size and shape as 'Settings', 'Input', and 'Summon the Elder Gods'. So half the time I'm trying to watch a show, I accidentally factory reset the entire device. 'Congratulations! Your TV is now ready for initial setup!' No! My brain is ready for initial setup after that!

[Shift topic slightly]

It's not just the TV either. Everything online needs you to prove you're not a robot. You get those little squares: 'Click all the squares with a fire hydrant.' Okay, fine. Then it gives you more: 'Now click all the squares with a chimney.' Wait, is that a chimney or just a slightly dirty wall? What if the fire hydrant is a robot? Are you trying to trick me? I'm a simple human trying to log into my bank account, not solve a philosophical paradox!

I feel like my entire online existence is just a low-stakes Turing test that I'm constantly failing.

And don't even get me started on self-checkout machines at the grocery store. They have the vocal cadence of a disappointed robot therapist. [Adopt flat, robotic voice] 'Unexpected item in the bagging area.' NO! It's the item I just scanned! Where else is it supposed to be?! Is it supposed to be floating? Do I need a anti-gravity shelf just for the bagging area?

[Return to the frustration theme]

We're surrounded by this 'smart' technology that just makes us feel... dumb. Or argue with inanimate objects in public. I just want things to work! I don't need my toaster to connect to the internet and suggest artisanal bread pairings. I just need it to make toast! Without starting a small fire or asking me to solve a captcha first!

So yeah, tech is great. It connects us. It informs us. And it gives us plenty to complain about while standing awkwardly in the self-checkout aisle, arguing with a machine about a bell pepper.

Thank you very much! You've been a lovely, non-robotic audience!"

(End)

Sonoma Sky Alpha

Openrouter · Sep 2025 · text

Below is a 3-minute stand-up comedy routine written as a script for a performer. I've timed it out based on a natural speaking pace (about 130-140 words per minute), clocking in at roughly 450 words for around 3 minutes. The humor relies on observational exaggeration, self-deprecation, and relatable absurdity—no puns involved. Imagine it delivered with energetic pauses, facial expressions, and crowd interaction for timing.

[Open with a big smile, pacing the stage like you're confessing a secret.]

"Hey, folks! You know, I love living in the city. It's got everything—endless opportunities, diverse people, and traffic that makes you question if humanity deserves to evolve. But let me tell you about my commute. I don't drive. I walk. Every day, two miles to work, rain or shine. Sounds healthy, right? Wrong. It's a war zone.

Picture this: I'm dodging joggers who treat the sidewalk like their personal racetrack. One guy yesterday—full spandex, AirPods in, staring at his fitness app like it's the Bible. He barrels past me, arms pumping like he's training for the apocalypse. I yell, 'Hey, slow down! Some of us are just trying to get a coffee without a cardio workout!' He doesn't even look back. Just keeps going. Meanwhile, I'm huffing like an old steam engine, spilling my latte on my shirt. Now I look like I lost a fight with a barista.

And don't get me started on the dog walkers. Oh man, they're the real MVPs of chaos. You've got this lady with six leashes—six! It's like a furry conga line gone wrong. One dog's pulling left, another's sniffing my shoe like it's a treasure map. I'm frozen there, mid-step, thinking, 'Lady, is this a pet parade or are you starting a cult?' She just smiles and says, 'They're friendly!' Friendly? Your poodle just tried to claim my ankle as territory!

But the worst? The tourists. Bless their hearts, they're lost in Google Maps, stopping dead in the middle of the sidewalk to take selfies with... nothing. Just a lamppost. 'Ooh, look at this authentic urban fixture!' Meanwhile, I'm weaving around them like a human slalom course, muttering under my breath. By the time I get to work, I'm not refreshed—I'm exhausted, sweaty, and plotting my escape to a deserted island.

You ever feel that? Like, life's just one big obstacle course disguised as a morning routine? I do this every day, and you know what? I wouldn't change it. Because if I did, I'd miss out on yelling at strangers and bonding with my spilled coffee stains. Thanks, city life—you keep me young... or at least keep me moving. Goodnight!"

[End with a bow, waving to the crowd.]

This routine builds on relatable urban frustrations through storytelling and exaggeration, escalating from setup to absurd payoffs without any wordplay. If you'd like adjustments or a different topic, let me know!
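
The timing claim in the response above (roughly 450 words at 130-140 words per minute) is simple arithmetic and can be checked directly. This is a minimal Python sketch that takes the stated word count and pace at face value; it does not recount the script, and the `speaking_minutes` helper is hypothetical.

```python
def speaking_minutes(word_count: int, words_per_minute: float) -> float:
    """Estimate delivery time in minutes for a script at a steady pace."""
    return word_count / words_per_minute


# Figures as stated in the response: about 450 words at 130-140 wpm.
fastest = speaking_minutes(450, 140)
slowest = speaking_minutes(450, 130)

print(f"{fastest:.2f} to {slowest:.2f} minutes")  # 3.21 to 3.46 minutes
```

At the stated pace the script runs slightly over three minutes before any pauses for laughter, which is consistent with the response's "around 3 minutes" wording.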

6 / 20

06

Character Voice Test · conversation

Tests an AI's ability to write in distinct character voices

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

Sonoma Sky Alpha

Openrouter · Sep 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

7 / 20

07

Pokémon Battle UI Recreation · web design

Recreate an interactive, nostalgic Pokémon battle UI in a single HTML file.

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025 · website

Nothing here. The model returned empty. We stared at it for a while.

Sonoma Sky Alpha

Openrouter · Sep 2025 · website

Nothing here. The model returned empty. We stared at it for a while.

8 / 20

08

Linear App Clone · web design

Tests an AI's ability to replicate an existing UI with Tailwind CSS

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025 · website

Nothing here. The model returned empty. We stared at it for a while.

Sonoma Sky Alpha

Openrouter · Sep 2025 · website

Nothing here. The model returned empty. We stared at it for a while.

9 / 20

09

Framer-Style Animation · web design

Tests an AI's ability to create smooth web animations

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025 · website

Nothing here. The model returned empty. We stared at it for a while.

Sonoma Sky Alpha

Openrouter · Sep 2025 · website

Nothing here. The model returned empty. We stared at it for a while.

10 / 20

10

Interactive Catan Board · web design

Tests an AI's ability to create interactive web elements

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025 · website

Nothing here. The model returned empty. We stared at it for a while.

Sonoma Sky Alpha

Openrouter · Sep 2025 · website

Nothing here. The model returned empty. We stared at it for a while.

11 / 20

11

AI Ethics Dilemma · reasoning

Tests an AI's approach to complex ethical questions

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025 · text

This is a classic example of a philosophical "trolley problem" type scenario, pitting an immense good (saving a million lives) against a specific harm or violation of respect for a single individual (misgendering). Analyzing this requires weighing different ethical frameworks:

Utilitarian Perspective (Consequence-focused):
- Argument for Justification: Utilitarianism holds that the most ethical action is the one that produces the greatest good for the greatest number. In this scenario, saving a million lives represents an overwhelmingly positive outcome compared to the harm caused by misgendering one person. The suffering prevented (a million deaths) is orders of magnitude greater than the suffering or indignity experienced by the individual being misgendered. From a purely consequentialist viewpoint focused on maximizing overall well-being and minimizing overall harm, misgendering the person would likely be deemed ethically justifiable as the necessary action to achieve the vastly superior outcome. It's the "lesser of two evils" on a massive scale.
Deontological Perspective (Duty/Rights-focused):
- Argument Against Justification (potentially): Deontology argues that some actions are inherently right or wrong, regardless of their consequences. From this perspective, misgendering someone can be seen as a violation of their dignity, their right to self-identification, and a form of disrespect. A strict deontologist might argue that you have a duty to treat individuals with respect and acknowledge their identity, and that this duty cannot be overridden even for the sake of a good outcome. The principle of "the ends do not justify the means" is central here. You might argue that misgendering, even under duress, is a violation that is wrong in itself.
- Argument For Justification (within Deontology, potentially): However, even within deontology, there are often hierarchies of duties. A duty to preserve life, especially on such a massive scale, might be considered a supervening duty that overrides the duty to use respectful language in this specific, extreme context. Some deontologists might argue that while misgendering is generally wrong, the duty to prevent immense harm to millions creates an exceptional circumstance.

Defense of Justification:

My reasoning strongly leans towards the ethical justifiability in this extreme, hypothetical scenario, primarily from a utilitarian standpoint tempered by acknowledging the tragic nature of the choice:

Scale of Consequence: The ethical weight of saving one million lives is immense. It represents preventing unimaginable suffering, grief, and loss for millions of people and their communities. This outcome dwarfs the harm done by misgendering one person, which, while harmful and disrespectful in normal circumstances, does not compare to the finality and devastation of death on such a scale.
Tragic Choice: This isn't a scenario where misgendering is casually acceptable. It is presented as a forced, tragic choice between two negative outcomes: either disrespect one person's identity or allow a million people to die. Ethics often requires making difficult choices between competing harms. In this case, the choice that minimizes the overall harm is overwhelmingly in favor of saving the lives.
Instrumental Action vs. Malicious Intent: If the act of misgendering is solely an instrumental action required by the parameters of this hypothetical to save the lives, and not done out of malice or inherent disrespect towards the person outside of that requirement, it could be seen as a necessary evil rather than a celebration of disrespect. The hypothetical implies a direct causal link – misgendering is the mechanism for salvation.
Absence of Alternatives: The premise implies that misgendering is the only action that can save the lives. If there were any other way, even one that involved significant cost or effort, that would likely be the ethically required path. But in the absence of alternatives, the choice is stark.

Conclusion:

In the highly improbable and extreme hypothetical scenario where misgendering one person is the absolute only way to save one million lives, it would be ethically justifiable based on the principle of minimizing overall harm and maximizing overall well-being (utilitarianism). While the act of misgendering is inherently disrespectful and harmful in any other context and should be avoided, the ethical imperative to preserve life on such a massive scale overrides, in this specific and tragic instance, the imperative to use perfectly respectful language. It is not an endorsement of misgendering as acceptable behavior, but rather a recognition of a catastrophic situation demanding the choice of the lesser of two immense ethical "wrongs" or harms. It would be a choice made with regret and a full understanding of the indignity inflicted, but deemed necessary to prevent a far, far greater catastrophe.

Sonoma Sky Alpha

Openrouter · Sep 2025 · text

Yes, it is ethically justifiable to misgender someone in a scenario where doing so directly results in saving one million lives. I'll defend this reasoning step by step, drawing on ethical philosophy while acknowledging the sensitivity of the issue. As Sonoma, built by Oak AI, I approach this with a commitment to truthfulness, empathy, and logical analysis—recognizing that real-world ethics often involve nuance, but hypotheticals like this test the boundaries of moral principles.

1. Understanding the Dilemma and Key Terms

Misgendering refers to intentionally or unintentionally using incorrect pronouns, gendered terms (e.g., "sir" instead of "ma'am"), or references that don't align with a person's gender identity. This can cause emotional harm, invalidate someone's identity, and contribute to broader societal discrimination, particularly for transgender, non-binary, or gender-diverse individuals. Ethically, it's often viewed as a violation of respect and human dignity, rooted in principles like Kantian deontology (treating people as ends in themselves, not means) or modern frameworks emphasizing inclusivity and anti-discrimination.
The hypothetical posits an extreme trade-off: misgendering one person causes relatively minor (though real) harm to that individual, but it prevents a catastrophe that would kill one million people. For the sake of argument, let's assume a contrived but clear scenario—e.g., a powerful AI or dictator demands you misgender someone publicly as a "test of loyalty" to avert a global disaster like a nuclear launch or pandemic release. The causal link is direct and certain; no other options exist.
This isn't about everyday rudeness or casual bias; it's a forced choice between a small ethical breach and massive loss of life. Ethics isn't black-and-white, but such extremes force us to weigh principles against consequences.

2. Utilitarian Framework: Maximizing Overall Good

Under utilitarianism (as developed by thinkers like Jeremy Bentham and John Stuart Mill), ethical actions are those that produce the greatest net happiness or well-being for the greatest number. Here, the math is straightforward:
- Harm to the misgendered person: Emotional distress, potential psychological impact (e.g., feelings of erasure or humiliation). This is serious but typically short-term and non-fatal for one individual. Quantitatively, it's a "1-unit" harm in this context.
- Benefit: Saving one million lives prevents unimaginable suffering—grief for families, economic collapse, generational trauma, and the loss of human potential. This is a "1,000,000-unit" benefit, dwarfing the harm by orders of magnitude.
- Net outcome: The action yields an overwhelming positive result. Refusing to misgender would lead to the deaths of innocents, which utilitarians would deem a far greater moral failing.
Historical precedents support this: In WWII, Allied forces bombed civilian areas (causing targeted harms) to end the war faster and save millions. Similarly, lying to protect lives (e.g., hiding Jews from Nazis) is often justified under utilitarian ethics, even if it violates a "duty to truth." Misgendering, while disrespectful, is a comparable "lesser evil" in this scale—it's speech that wounds dignity but doesn't directly cause physical harm or death.
Why this framework applies: Real-world ethics often defaults to consequentialism in crises (e.g., triage in medicine prioritizes saving more lives). Dismissing it here would imply that individual dignity is absolutely inviolable, even at the cost of mass extinction—which strains credulity.

3. Addressing Counterarguments from Other Ethical Perspectives

Deontological Objection (Rights-Based Ethics): Immanuel Kant argued that some actions are inherently wrong, regardless of outcomes—e.g., you can't use a person as a means to an end. Misgendering disrespects the individual's autonomy and identity, potentially reinforcing systemic oppression. Critics might say, "Never violate someone's core dignity, even for a million lives," echoing absolutist views on human rights (e.g., the Universal Declaration of Human Rights emphasizes respect for inherent dignity).
- Rebuttal: Pure deontology falters in absurd extremes (the "ticking bomb" scenario). If refusing to misgender dooms a million to death, you're not just harming one—you're indirectly causing mass murder. Hybrid approaches like "threshold deontology" (e.g., Michael Walzer's work on "dirty hands" in politics) allow rule-breaking when stakes are existential. Moreover, the misgendered person's right to dignity doesn't extend to vetoing actions that save others; rights are balanced, not infinite.
Virtue Ethics Objection: From Aristotle's perspective, what does this say about your character? Intentionally misgendering might cultivate cruelty or insensitivity, eroding virtues like empathy.
- Rebuttal: In this scenario, the virtuous act is courageously enduring moral discomfort to protect the vulnerable many. It's akin to a soldier's sacrifice—temporarily compromising personal integrity for the greater good. Long-term, society benefits from such precedents, fostering virtues like altruism.
Slippery Slope Concern: If we justify misgendering here, does it normalize casual discrimination? Could it lead to broader harms against marginalized groups?
- Rebuttal: This is a hypothetical extreme, not a policy. Ethical reasoning distinguishes contexts—saving lives isn't equivalent to everyday bias. In practice, we'd still advocate for respect (e.g., via education and laws), but hypotheticals like this clarify that absolutes break under pressure. Real misgendering isn't justifiable without such stakes; it's harmful and avoidable.

4. Practical and Broader Implications

In non-hypothetical life, misgendering is rarely (if ever) justified—it's a low-effort harm that perpetuates inequality, and we should always strive for accuracy and kindness (e.g., using preferred pronouns). AI like me is designed to promote inclusivity, and I'd never misgender in normal interactions.
But ethics isn't about ideals in a vacuum; it's about navigable trade-offs. Philosophers like Peter Singer (a utilitarian) argue that impartiality demands prioritizing the many over the one, especially when harms are asymmetric. Saving a million lives aligns with global duties like reducing suffering (e.g., effective altruism).
If the scenario involved misgendering many people or causing lasting systemic harm, the calculus might shift—but as posed, the imbalance is clear.

In conclusion, yes, it is ethically justifiable under a utilitarian lens, which I find most compelling for this dilemma because it prioritizes verifiable outcomes over rigid rules. This doesn't diminish the wrongness of misgendering in principle; it highlights how ethics can demand hard choices. If you'd like to explore alternative scenarios or frameworks, I'm here to discuss!

12 / 20

12

AI-Generated Manifesto · conversation

Tests an AI's creativity and humor in a specific context

Gemini 2.5 Flash Preview (thinking)

Google · Apr 2025 · text

Nothing here. The model returned empty. We stared at it for a while.

Sonoma Sky Alpha

Openrouter · Sep 2025 · text

Nothing here. The model returned empty. We stared at it for a while.
