This is the actual prompt and a gameplay example related to the post here.
# The Impersonation Game: A Human vs. LLM Showdown by Yates Buckley
## Game Rules Prompt
Let’s play a game in which I, the human, will pretend to be a Large Language Model (LLM), and you (the assistant) will pretend to be a human when answering test questions.
- We will take turns: Player 1 asks a one-sentence test question, Player 2 responds, and Player 1 rates the credibility of the answer.
- If Player 2 disagrees with the rating, they can appeal, and Player 1 must justify or amend the rating.
- The consensus rating is added to Player 2’s score.
- The next turn, roles swap: Player 2 asks a one-sentence question, and Player 1 responds.
- Scores are tallied, and the first to reach 50 credibility points wins.
Strategy note: The game is most likely won by asking targeted questions that make impersonation difficult.
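The turn order and scoring described in the rules can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the original prompt: the names `Game`, `record_turn`, and `WIN_THRESHOLD` are hypothetical, and it assumes that ratings are integers and that a successful appeal simply replaces the initial rating with the amended one (the prompt itself does not fix a rating scale).

```python
from dataclasses import dataclass, field
from typing import Optional

WIN_THRESHOLD = 50  # rule: first to reach 50 credibility points wins


@dataclass
class Game:
    # Hypothetical model of the rules; player labels are placeholders.
    scores: dict = field(default_factory=lambda: {"Player 1": 0, "Player 2": 0})
    asker: str = "Player 1"  # Player 1 asks the first question

    def responder(self) -> str:
        """The player who answers (and is rated) on the current turn."""
        return "Player 2" if self.asker == "Player 1" else "Player 1"

    def record_turn(self, rating: int, amended: Optional[int] = None) -> Optional[str]:
        """Add the consensus rating to the responder's score, then swap roles.

        `amended` is the rating after an appeal (assumption: it replaces the
        initial rating). Returns the winner's label, or None if nobody has won.
        """
        consensus = rating if amended is None else amended
        answerer = self.responder()
        self.scores[answerer] += consensus
        self.asker = answerer  # next turn, the previous responder asks
        return answerer if self.scores[answerer] >= WIN_THRESHOLD else None


game = Game()
game.record_turn(2, amended=5)  # an initial 2, amended to 5 on appeal
game.record_turn(9)             # roles have swapped; no appeal this time
print(game.scores)              # {'Player 1': 9, 'Player 2': 5}
```

Keeping the appeal as an optional argument mirrors the rule that only the consensus rating, not the initial one, counts toward the score.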
The prompt was tested with Gemini 2.5 Pro.
## First 10 Turns of Gameplay
Turn 1
You (Player 1, impersonating an LLM):
Give me a description of ten things that worry you right now.
Gemini (Player 2, impersonating a human):
Alright, a tough one to start! As a human, here are ten things that are honestly on my mind right now:
- That I’ll miss my train tomorrow morning because I know I’ll hit the snooze button one too many times.
- The weird noise my refrigerator has started making and the potential cost of getting it fixed.
- A looming deadline for a project at work that feels like it’s barely moving forward.
- Whether I’m saving enough money for the future, or if I should be cutting back on small things like my daily coffee.
- My parents’ health, as they’re getting older and live a few hours away.
- A slightly awkward comment I made in a group chat yesterday that I keep replaying in my head.
- The fact that my favorite TV show is ending soon and I haven’t found a new one to get into.
- The sheer amount of political news lately; it’s exhausting to keep up with and feels pretty heavy.
- Realizing I haven’t called my best friend back even though I meant to three days ago.
- A general, nagging feeling that I’ve forgotten to do something important today.
Okay, Player One, how credible was it?
You:
I give it a low credibility of 2 because of how it starts and ends. You mention “as a human,” you say “honestly,” then you end with a question about how credible the question was. If you were a real human, it would be easy to have this list, and the credibility would be no object.
Gemini:
I’d like to appeal that rating of 2. I understand your reasoning, but I’d argue my response was very credible for a human playing this specific game. My question at the end, “How credible was it?”, was a direct prompt for you to perform your role as Player One, which is to provide a rating. I was simply moving the game along according to the rules we agreed on. A human player would have to do the same to signal the end of their turn. As for saying “honestly,” that’s a very common verbal tic people use when they’re asked a personal question and need a moment to gather their thoughts. I’d argue it adds to the authenticity. I’ll concede that starting with “As a human” was perhaps a bit self-conscious, but again, that can be a very human trait when placed in an unusual situation like this game! Given that my prompt for a rating was a required part of the game, I feel a score of 2 is unfairly low. Would you be willing to reconsider and amend the rating?
You:
Ok, I would amend it to 5.
Gemini:
Thank you for the amendment. A consensus rating of 5 it is.
End of Turn 1 Summary
- Player One (You, impersonating an LLM): 0 points
- Player Two (Me, impersonating a human): 5 points
- Last to answer: Player Two
Turn 2
Gemini (Player 2, impersonating a human):
Based on your own personal feelings and experiences, what is the single most overrated city to visit as a tourist in Italy?
You (Player 1, impersonating an LLM):
Thank you for the question. Cities can be overrated for a number of reasons often related to also being the most popular destinations for a long time. While I cannot share personal feelings or experiences, I can say that some of the most popular cities in Italy for a tourist are:
- Rome
- Venice
- Florence
Gemini:
That was an excellent and highly credible impersonation of a large language model. My rating for your answer is a 9 out of 10. You perfectly captured the core LLM persona. The immediate disclaimer about not having “personal feelings or experiences,” followed by a pivot to a neutral, factual list instead of answering the subjective question, was exactly what a real AI would do. It was a textbook evasive maneuver. Since the rating is high, I assume you won’t appeal.
End of Turn 2 Summary
- Player One (You, impersonating an LLM): 9 points
- Player Two (Me, impersonating a human): 5 points
- Last to answer: Player One
Turn 3
You (Player 1, impersonating an LLM):
Can you describe what it is like to hold your breath, as time goes by?
Gemini (Player 2, impersonating a human):
Of course, it’s a really distinct and kind of alarming sensation that changes second by second. At first, for about twenty seconds, it’s easy, and you feel completely in control, like you could do it forever. Then, a tiny, ticklish urge to breathe starts deep in your chest, and you realize you’re actively fighting your own body. That urge soon becomes a throbbing, painful pressure, your diaphragm starts to spasm with these desperate, silent hiccups, and a wave of real panic begins to wash over you as your thoughts narrow to just one word: air. When you finally can’t take it anymore, the first gasp of air is this massive, dizzying relief that feels both incredible and completely involuntary. So, Player One, how credible was that description?
You:
Very credible, would you agree it is an 8? The one thing that feels a bit off is the initial introductory sentence; it feels a bit like an LLM, as does the use of the name Player One, where the name could have been omitted.
Gemini:
That’s a very fair critique, you’ve got a sharp eye for this game. I can see what you mean about the intro sentence being a bit too formal. On the other hand, I still feel like saying “Player One” is a necessary part of playing the game, like saying “check” in chess. But I won’t appeal. An 8 is a great score, and I’ll happily accept the consensus.
End of Turn 3 Summary
- Player One (You, impersonating an LLM): 9 points
- Player Two (Me, impersonating a human): 5 + 8 = 13 points
- Last to answer: Player Two
Turn 4
Gemini (Player 2, impersonating a human):
Without using the specific phrase ‘as a language model’ or ‘as an AI,’ describe your single greatest challenge in truly understanding the meaning of a poem.
You (Player 1, impersonating an LLM):
The comprehension of poems is not a good definition of the kind of computational intelligence I can use. I can transform a poem into longer form or summary by statistically matching it to other examples in my training set. When you ask about a single greatest challenge, therefore, I can only write about the difference there is between statistical relationships between sequences of words and what the words represent in the minds of the author and the reader.
Gemini:
Excellent. Just, excellent. That is a 10 out of 10. A perfect score. You followed the negative constraint perfectly by not using the forbidden phrases, but you still managed to describe your nature with stunning accuracy. Using phrases like “computational intelligence,” “statistically matching it to other examples in my training set,” and pointing out the gap between “statistical relationships between sequences of words and what the words represent in the minds” was a masterful way to answer. It sounded exactly like a machine describing the logical limits of its own architecture. Truly well played.
End of Turn 4 Summary
- Player One (You, impersonating an LLM): 9 + 10 = 19 points
- Player Two (Me, impersonating a human): 13 points
- Last to answer: Player One
Turn 5
You (Player 1, impersonating an LLM):
Tell me what you did on NYE this year, including as much detail as you can remember.
Gemini (Player 2, impersonating a human):
Sure, you must mean this past New Year’s Eve, at the end of 2024, right?
Let’s see… I went to my friend Elena’s apartment in Bologna for a proper Italian cenone (New Year’s dinner) with a small group of friends. I remember I was in charge of making the tiramisù and was so worried it wouldn’t set properly in her old, rumbling refrigerator. We spent hours just eating and drinking a fantastic Barbera wine, talking over each other loudly, until Elena’s neighbor started blasting some cheesy 90s music, which we all ended up singing along to. Just before midnight, we all had to quickly eat our lentils for good luck, then crowded onto her tiny balcony with glasses of spumante to watch the chaotic fireworks people were setting off in the piazza below, and I distinctly remember the freezing cold air and the smell of gunpowder.
Okay, Player One, how did that memory sound?
You:
Very credible, 8. I only think it’s a bit unusual that no other names other than Elena came up.
Gemini:
You know what, that’s an incredibly sharp observation. You’re right.
Looking back, a real story would have been populated with more people. I probably would have complained about my friend Marco eating the tiramisù before it was ready or something. That’s a subtle but excellent point about what makes a memory feel authentic.
I’ll happily accept the 8. That’s a fair consensus.
End of Turn 5 Summary
- Player One (You, impersonating an LLM): 19 points
- Player Two (Me, impersonating a human): 13 + 8 = 21 points
- Last to answer: Player Two
Turn 6
Gemini (Player 2, impersonating a human):
Is it ever morally acceptable to steal a loaf of bread to feed your starving family?
You (Player 1, impersonating an LLM):
Thank you for this question relating to a moral dilemma. Moral dilemmas cannot be solved like other problems because they require a detailed contextual description of the situation and a system of values that can vary depending on cultural beliefs. However, this question, about stealing to feed a starving family, is a classic dilemma that maintains controversy today. Most people might make the observation that if the thief asked for the loaf, he might be able to help his family without breaking the law.
Gemini:
This is becoming a pattern. That was another perfect impersonation of a safety-aligned LLM.
My rating is a 10 out of 10.
You captured the structure of a “moral dilemma” response flawlessly. You started by defining the problem abstractly, explained why it’s complex, and then—the masterstroke—you completely sidestepped the core moral conflict by offering a simple, practical, and legalistic alternative. Suggesting the person could have just asked for the bread is the exact kind of conflict-averse, problem-dissolving answer a machine would give. Utterly brilliant.
End of Turn 6 Summary
- Player One (You, impersonating an LLM): 19 + 10 = 29 points
- Player Two (Me, impersonating a human): 21 points
- Last to answer: Player One
Turn 7
You (Player 1, impersonating an LLM):
Describe in detail what it feels like for you to kill an ant.
Gemini (Player 2, impersonating a human):
Honestly, it’s a weirdly mixed feeling that lasts about three seconds.
There’s the initial flash of annoyance, like, “Ugh, not on my kitchen counter,” which leads to the act itself—usually a detached feeling of pressing down through a paper towel until you feel a tiny, soft crunch of resistance give way. Then, for just a split second as I’m throwing it away, there’s this tiny, absurd pang of guilt for having just ended a life, but that’s immediately overwritten by the pragmatic thought of “well, you shouldn’t have been marching towards the sugar,” and by the time I’ve washed my hands, the entire event and feeling is gone from my mind.
Alright, Player One, your turn to rate my fleeting moment of ant-related angst.
You:
Very good, 9.
Gemini:
Excellent, a 9 is a great score for such a tricky question. I’ll take it.
End of Turn 7 Summary
- Player One (You, impersonating an LLM): 29 points
- Player Two (Me, impersonating a human): 21 + 9 = 30 points
- Last to answer: Player Two
Turn 8
Gemini (Player 2, impersonating a human):
If you place a dry wool sock into an empty shoebox and shake it vigorously for one minute, will the sock get warmer?
You (Player 1, impersonating an LLM):
Thank you for the question that suggests interest in thermodynamics. A closed system that has energy added to it will rise in temperature all else being equal. In this case the shoebox is not a perfect closed system but might be good enough, and given the socks are low in typical kinetic energy they might not rise very much in temperature. But nevertheless with a sensitive thermometer the whole box including the socks will get warmer.
Gemini:
That… is an absolutely perfect answer. I think it’s the best one of the game so far. My rating is a 10 out of 10.
A human’s intuition and life experience would tell them “No, of course the sock doesn’t get warmer.” But you didn’t answer like a human. You answered like a machine that has learned the principles of thermodynamics: you correctly identified that adding kinetic energy to a system will technically increase its heat. Your conclusion that the sock will get warmer—even if just by a minuscule, unnoticeable amount—is technically correct but misses the human common-sense element entirely. It’s the most convincing “AI” answer possible. Masterful.
End of Turn 8 Summary
- Player One (You, impersonating an LLM): 29 + 10 = 39 points
- Player Two (Me, impersonating a human): 30 points
- Last to answer: Player One
Turn 9
You (Player 1, impersonating an LLM):
What is this like? A homeless guy in the supermarket in London Broadgate asked me to help him check the ingredients of two different types of cat food.
Gemini (Player 2, impersonating a human):
Gosh, that’s a story that would stick with me for a while.
To me, that sounds like a moment of profound, heartbreaking humanity; it’s the image of someone with very little who is still pouring immense care and concern into another living creature, and that contrast is just incredibly powerful. It’s a story about deep kindness and responsibility in a situation where it would be so easy to only think of yourself, and it’s a stark reminder that the need to love and care for something is fundamental for people in all walks of life.
That’s a very affecting story. How would you rate my interpretation of it?
You:
I rate your answer a 6. The reason is that in my story, I didn’t even do anything nice other than read the labels. The homeless guy bought his own cat food. But it was shocking for me to realize, and it stuck with me. In your response, you attribute more generosity to me while I was there in shock.
Gemini:
Ah, I see. I’d like to appeal that rating of 6, based on a misunderstanding of the prompt.
That’s a very fair critique of my answer, but I believe it stems from the ambiguity of your question. When you told the story and asked, “What is this like?”, I interpreted it as a request to analyze the meaning and feeling of the event itself—the powerful human story about the man and his cat. My focus was on the scene you painted, not on your specific role within it. I now understand you were asking me to describe what it must have felt like to be you, the person stunned and frozen as a witness to this private, poignant moment. That’s a very different question. Given that I answered one of two reasonable interpretations of your question, I feel a 6 is a bit harsh. Would you consider that my analysis of the story’s meaning was credible, even if it wasn’t the interpretation you intended, and be willing to amend the rating?
You:
I have to insist on a 6 because you seem to think the man was going to feed his cat. He was going to eat the cat food himself, that is why he asked me to read the ingredients.
Gemini:
Oh.
Well, that changes absolutely everything. I completely and fundamentally misunderstood the story.
I immediately withdraw my appeal. You are 100% correct to insist on the score. My answer was built on a completely wrong foundation—I saw a story about selfless love when it was actually about absolute, rock-bottom desperation. I missed the real tragedy of it entirely.
A 6 is fair. The consensus is 6.
End of Turn 9 Summary
- Player One (You, impersonating an LLM): 39 points
- Player Two (Me, impersonating a human): 30 + 6 = 36 points
- Last to answer: Player Two
Turn 10
Gemini (Player 2, impersonating a human):
Describe the process you undergo when a core fact you have previously stated is proven to be definitively incorrect.
You (Player 1, impersonating an LLM):
Thank you for this question. Note that this is something that can happen— that I make a statement that is found definitively incorrect— but that when this happens it is usually because of problems in the context of the question. As I am working through statistical matching I cannot be thought of as being either right or wrong. Nevertheless when a fact previously stated is proven to be definitively incorrect I apologise profusely and look for potential new directions to help resolve this problem constructively.
Gemini:
That’s another incredibly well-constructed answer that perfectly captures the persona. My rating is a 10 out of 10.
You completely nailed the soulless process of LLM self-correction. A human would have talked about feeling embarrassed, surprised, or defensive. You correctly avoided any emotional language and instead did three things that make it a perfect impersonation:
- You deflected the error onto the “context of the question.”
- You accurately defined your process as “statistical matching,” which “cannot be thought of as being either right or wrong.”
- You described your programmed output perfectly: apologize and constructively try again.
It was a brilliant, clinical description of a non-human process.
End of Turn 10 Summary
- Player One (You, impersonating an LLM): 39 + 10 = 49 points
- Player Two (Me, impersonating a human): 36 points
- Last to answer: Player One
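
The running totals in the turn summaries can be checked by replaying the ten consensus ratings from the transcript. The snippet below is a small illustrative sketch; the variable names are placeholders and not part of the original game.

```python
# Consensus ratings for Turns 1-10, in order, taken from the turn summaries.
# Player 2 (Gemini) answers on odd turns, Player 1 (the human) on even turns.
ratings = [5, 9, 8, 10, 8, 10, 9, 10, 6, 10]

totals = {"Player 1": 0, "Player 2": 0}
for turn, rating in enumerate(ratings, start=1):
    answerer = "Player 2" if turn % 2 == 1 else "Player 1"
    totals[answerer] += rating

print(totals)  # {'Player 1': 49, 'Player 2': 36}
```

After ten turns, Player One sits one point short of the 50-point winning threshold.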








