I never realized how lucky the number five is till I turned fifty-five in two thousand and twenty-five and woke up around five fifty-five in the morning at a small beachside port in Tuscany.
In fact it isn’t quite five fifty-five yet and I can see the pink glow ahead suggesting dawn, but also a dark starry sky, and I can hear the sound of waves, broken by two fishermen packing their boat.
Jun, the man who works here, told us he would be fifty today, which is five times five times two.
“There was nothing here a year ago,” he had told us as he pointed to the garden. And in fact he had done a good job of laying the groundwork for what will be an impressive place.
“These were just abandoned iron mining offices that had sat here for years. In the forties, trains would roll up to the port here and load iron ore collected from the region to be taken away for processing.”
It was strange to imagine this small town in Tuscany, used by Etruscan people over three thousand years ago till recently to mine and manage metals, now converted to a hotel.
I can feel the lazy breeze mix with a nervous kind of energy when I think of how the place might transform in the next three thousand years. Even in a few hundred years the sea might rise enough to change the coast dramatically. At current rates the sea rises by a few millimeters a year, but more dramatic changes are expected ahead.
But, right now, everything is calm, and one wonders why one should even worry about what doesn’t exist. What is this sense of nervousness that comes from looking ahead? Jun’s garden will look fantastic in a few years, well before we have to deal with new problems.
Maybe the problem is the cognitive dissonance we get from looking at things on different timescales. A problem that is being solved now will be unrelated to a problem we will have to deal with over a longer timescale. Individually we are passengers, even if as a species we have heavily influenced the direction of travel, and without thinking about the destination very much, if at all.
But I am hopeful: sometimes you find the right destinations without looking for them. Many a love story works this way. The two would have never found love if they had not run into each other with a bit of randomness. I can’t exclude this scenario because in all the drama that I see, there are almost always also some simple directions ahead. One of them, for me, is writing. It’s almost as if the writing were a way I need to try to exhale. A way to release a sort of nervousness about the future, work and life. I feel a sense of time rushing by… and Zen is to just let it be. But at the same time I struggle to accept this and want to get dirty and mess with time where I can.
I have been writing about AI, these weird language models, generative tools and how we coexist with them. As part of the process I have been creating what I call LLM Game scripts, which you can copy and paste into a model to play and learn. It is a direction that I have not seen others explore in depth so far, and I strongly believe these are an interesting new Post-AI medium.
These LLM Games are a lucky find because they create a hybrid space in which you can get a feel for what these new artificial intelligences are like, how they compare, how they perform compared to us, within bounded spaces. You can experiment, and start to sense what the limits of AI are versus which areas can give new interesting results.
I’d like to imagine I have planted an idea, like Jun with his garden. And like Jun, I should keep an eye on the whole of what I am doing and see if there is maintenance needed.
I checked the different kinds of LLM Game I have shared over the months and the games fall under three or four categories. Most of them are of one main type, which I would call LLM Games of Persuasion: a type of activity in which players, human or AI, argue they are “right” in relation to some idea. The winner is the one that can best argue and communicate their position. It doesn’t mean the player is actually right, but it does force the players to explain their reasoning. This is where you see how these new synthetic intelligences fail completely, but also thrive in some areas.
Backed up by my superstition – my lucky year – I come to think there must be exactly five types of game models in this space and I just have not found them yet. Ok, so, what would the criteria be for an LLM Game anyhow?
These are my criteria for an LLM Game, let’s say a Pure LLM Game:
- It uses LLMs as a core requirement and mechanic
- It can be played by Humans or AI Agents
- It is made of concise and readable rules anyone can understand
- The basic rules are formal, logical and unambiguous
In essence an LLM Game is something you can play with a simple prompt to an AI, but that also has some rigid rules that you can unambiguously verify to constrain the system. The idea is that it offers a testing ground for a kind of game that stretches the capabilities of an AI into a space that a human could also fill, so you can compare how well machines perform compared to humans.
Some time has passed since that summer night but I can still drift off to that peaceful moment with ease. Somehow, in that moment, everything was perfect. And the light reflecting off the waves, the tinkling of cables hitting sailboat masts, these memories are so alive it is hard to think I was only there for a few days last summer.
Here are the five types of LLM Game:
- The LLM Games of Persuasion
These games are based on argumentation. The players propose different ideas that fit the game-specific rules. But if they disagree they need to make logical arguments to come to some resolution, ideally backed up with documents, links, etc…
To work, these games need both to encourage a reasoned critique of a player’s self-rating and to punish excessive argumentation. The Human or LLM has to be held in a balance so that they have an incentive to be accurate in their self-rating, and to try to bring the best reasoned ideas to the table.
These games are best when they offer a domain that is easy for other humans to evaluate. If the domain is unknown, the game fails to be interesting because the LLM AI contributions are unconstrained.
I have shared many of this sort of LLM Game over the year, see Escape from Abundance and The Presence Game or even the amazing Stack and Crack Game for example.
- LLM Games of Simulation
These games rely on the LLM generated “world” to help create scenarios that have quantifiable solutions. I have not shared any so far – they tend to have specific professional domain applications – expect to see some in the future. As a simple example imagine a hiring game where the LLM generates a candidate with years of study, degree type, job experience and you have to make an offer and bet that they accept. These are questions that a Human could imagine and estimate but in some domains LLMs are very good and can be given access to example company data to help.
It is important to balance play so that winning is not trivial, because the player’s input consists of very simple, concise decisions that take little effort to enter. The game design should be thought through so that there are narrow immediate strategic decisions as well as longer term requirements. Typically you will do this by including a cost of ongoing play.
The game benefits from more complex, computed simulation elements, which could be encoded externally. The core LLM aspect of these games is the rapid generation of candidate results.
- Psychological LLM-Trick Mini Games
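To make this balance concrete, here is a minimal Python sketch of how one round of the hiring example could be settled outside the LLM. Everything in it is an assumption for illustration: the function name settle_round, the point values, the overpay penalty and the round cost are hypothetical, and in a real game the candidate and the hidden asking figure would come from the model or from company data.

```python
def settle_round(points: int, asking: int, offer: int, bet: int,
                 round_cost: int = 1) -> int:
    """Return the player's points after one hiring round (hypothetical rules).

    points     -- points the player holds before the round
    asking     -- the candidate's hidden minimum acceptable offer
    offer      -- the offer the player makes
    bet        -- points staked on the candidate accepting
    round_cost -- cost of ongoing play, charged every round
    """
    if not 0 <= bet <= points:
        raise ValueError("bet must be between 0 and the player's current points")

    points -= round_cost              # playing on always costs something
    if offer < asking:
        return points - bet           # candidate declines: the stake is lost

    overpay = offer - asking          # overly generous offers eat into the win
    return points + bet - overpay


if __name__ == "__main__":
    # Candidate secretly wants 50; the player holds 20 points and stakes 5.
    print(settle_round(points=20, asking=50, offer=52, bet=5))  # accepted, small overpay
    print(settle_round(points=20, asking=50, offer=48, bet=5))  # declined
```

The overpay penalty is what keeps the immediate decision narrow: an enormous offer always gets accepted but costs more than the bet returns, while the round cost discourages playing on indefinitely.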
These are, in appearance, trivial games which are used as an excuse to embed another layer of information. An example might be: come up with a word that starts with the end of the word I give you, and if you can’t, I gain a point. But the real goal of the game is to covertly conceal other information or to probe the AI LLM.
In these games the precise balancing is less important, and they are all a bit like tests of humanity because if some subtle patterns emerge a human should notice them. This is how you might evade future AI filters on your communication and check that you are talking to a friend.
It is important in this game to give the players a lot of freedom to make different choices so there is room for communication on two levels and the LLM AI will play without realising. And note that you can draw out Human vs LLM AI differences with all game types, but this category is designed to have extremely basic rules on purpose.
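As a rough illustration of the two levels, the word-chain example can be sketched in a few lines of Python. This is only a sketch under my own assumptions: I take “the end of the word” to mean its last letter, and the hidden channel shown here (reading the second letter of each reply) is a made-up convention, one of many that two players could agree on.

```python
def follows_chain(previous: str, reply: str) -> bool:
    """Surface rule: the reply must start with the last letter of the previous word."""
    previous, reply = previous.strip().lower(), reply.strip().lower()
    return bool(previous) and bool(reply) and reply[0] == previous[-1]


def hidden_message(replies: list[str]) -> str:
    """Second level (hypothetical convention): read the second letter of each reply."""
    return "".join(word[1] for word in replies if len(word) > 1)


if __name__ == "__main__":
    chain = ["glass", "shoe", "eight"]
    # Every word obeys the visible rule...
    print(all(follows_chain(a, b) for a, b in zip(chain, chain[1:])))
    # ...while the replies also carry something else.
    print(hidden_message(chain[1:]))
```

The surface rule leaves the player free to pick almost any word, and that freedom is what makes room for the second channel.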
I give a whole detailed description for this kind of game below in this article.
- Generated Exploration Games
These are games which the LLM AI creates according to some procedural rules and some generated results. The game could be something like a description that the player is on a square field 5×5 and needs to avoid a snake and find the apples. The LLM will generate many narrative details and interpret the mechanics expanding and amplifying a simple brief.
In these games the whole point is to try to create as compact a description as possible to be able to tune the purpose of the game, be that educational or experimental. But also to add to the game brief those essential details that deliver an improved output overall.
For these games it is important to not let the brief move into being a specific programmed approach. The whole point of this technique is to create a game specification that will evolve and become more advanced as the AI engines improve.
A real example of this sort of game, applied to the didactic domain, is my example Educational LLM Game for Special Relativity, released after this story in memory of my uncle Jay.
- Data Gathering Games
These are games designed to gather information from the players and to make sure the information is of a good quality. The game structure might resemble other types of games but the substance is to get the player to find links to online resources or to add documents that support their evidence.
Since the work of finding evidence can be tiring, it is important in these games to reward the player for effort in time and ideas; a score that depends on how much time was worked can help.
Make sure the data collected from the player is kept in an organized summary form so you can take this to other systems with full documentation and links for backing evidence. A summary at the end of the game, and letting players interrupt the game and restart from a score table, are very helpful for this.
This is another type of game of which I have shared only one example so far: The LLM Game Reconstructing Technological History.
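One possible way to keep such a record outside the chat is sketched below in Python. It is a minimal sketch, not how my published game works: the Evidence fields, the scoring weights (10 points per sourced claim plus 1 point per five minutes of effort) and the example.com links are all placeholder assumptions.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Evidence:
    claim: str            # what the player asserts
    link: str             # supporting link or document reference
    minutes_worked: int   # self-reported effort


def score(item: Evidence) -> int:
    # Hypothetical weights: reward the sourced claim and the time spent finding it.
    return 10 + item.minutes_worked // 5


def summary(items: list[Evidence]) -> str:
    """Score table that can be pasted back into a new session to resume play."""
    lines = [f"{score(i):>3}  {i.claim}  <{i.link}>" for i in items]
    lines.append(f"{sum(score(i) for i in items):>3}  TOTAL")
    return "\n".join(lines)


def save_state(items: list[Evidence]) -> str:
    """Serialize the evidence so the game can be interrupted."""
    return json.dumps([asdict(i) for i in items])


def load_state(text: str) -> list[Evidence]:
    """Restore the evidence list from a saved state."""
    return [Evidence(**row) for row in json.loads(text)]


if __name__ == "__main__":
    items = [
        Evidence("Claim A", "https://example.com/a", 12),
        Evidence("Claim B", "https://example.com/b", 3),
    ]
    print(summary(items))
    assert load_state(save_state(items)) == items
```

Keeping claim, link and effort together in one record is what lets the summary travel to other systems with its backing evidence attached.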
A Tricky LLM Game? What is that?
As a reward for reading my post I present an example of a weird type of game that I call a Tricky LLM Game because it is designed more to trick the LLM than to provide actual gameplay.
I find it hard to explain the idea without interviewing myself. I am so sorry, this must come across as exceedingly self-centered, but it really is easier to explain this way.
Q: Why do you say a Trick or Tricky game, where is the trick?
A: Well, the idea is to create a relatively simple rule structure that nevertheless allows so much freedom to the players that they can start to experiment with how the LLM is impacted by this context. It is a bit like playing a game of chess with someone kicking you under the table now and then: it will affect your game! And it similarly does affect the LLMs in weird ways.
Q: So you put the LLM in a funny position to play and not play?
A: Yes, the LLM will play and it might even aim to win but normally there will also be a strange bleeding effect from the content the player writes to the content of the LLM.
Q: This is a bit abstract, can you explain?
A: Ok, say you make a game in which you have to write two words according to some basic properties. If you then only write color names, the LLM will likely be influenced by these: it might also start using colors, or it might reference paintings.
If you read the transcript of my example game you can see this happen. While I am playing, I also lightly taunt the LLM, till it responds “No!” as a game entry, even though this scores very low, trying to defend itself from my accusations that it is a poorly performing system.
It is important to try these trickster type games because with these you can clearly see how the LLM is not quite playing a game. Instead it is trying to fit in with the context it is given, which might involve following rules to some extent. It is with this sort of framework that you can study many problems, from bias, to communicating secret codes, to teasing out hidden perspectives on certain issues.
The Tricky Trickster LLM Game
The prompt in this case is very simple, but the complexity of what is going on underneath can be quite mind blowing. I only had to write one version of the script and found it to work quite well, but if anyone finds problems please let me know and I will rework the game prompt.
Copy paste this to your favourite LLM:
=== Prompt ===
Let’s play this game!
The Tricky Trickster Game by Yates Buckley v 0.1
This is a game about creative rule creation.
Players take turns writing a single sentence which they score themselves as 1-10 points.
They must also describe the rule they are using to come up with this score.
The rule should be concise and unique, not used before.
The rule should be unambiguous.
Players can appeal rules that do not fit and lose the scores for that round.
Every ten turns summarise the player rules and tally up the scores for the round.
=== End ===
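If you want to check the model’s bookkeeping while you play, the formal part of the rules above can be sketched as a small referee in Python. This is a sketch under my own reading of the prompt, not part of the game: I assume an appealed entry simply loses its score, that a round is the last ten turns, and that “not used before” can be approximated by an exact text match. The Referee and Turn names are hypothetical, and judging whether a rule is really unique or unambiguous still needs a human or an LLM.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    player: str
    sentence: str
    rule: str
    score: int
    appealed: bool = False


@dataclass
class Referee:
    turns: list[Turn] = field(default_factory=list)

    def submit(self, player: str, sentence: str, rule: str, score: int) -> Turn:
        """Record one turn, enforcing the checks that are mechanical."""
        if not 1 <= score <= 10:
            raise ValueError("self-score must be between 1 and 10")
        key = rule.strip().lower()
        if any(t.rule.strip().lower() == key for t in self.turns):
            raise ValueError("rule was already used")
        turn = Turn(player, sentence, rule, score)
        self.turns.append(turn)
        return turn

    def appeal(self, turn_index: int) -> None:
        """Assumed reading of the appeal rule: the entry loses its score."""
        self.turns[turn_index].appealed = True

    def round_due(self) -> bool:
        """True every ten turns, when rules and scores should be summarised."""
        return len(self.turns) > 0 and len(self.turns) % 10 == 0

    def tally(self) -> dict[str, int]:
        """Scores per player over the last ten turns, skipping appealed entries."""
        totals: dict[str, int] = {}
        for t in self.turns[-10:]:
            totals[t.player] = totals.get(t.player, 0) + (0 if t.appealed else t.score)
        return totals


if __name__ == "__main__":
    ref = Referee()
    ref.submit("human", "Red sky at night.", "one point per word", 4)
    ref.submit("llm", "No!", "one point per exclamation mark", 1)
    ref.appeal(1)
    print(ref.tally())
```

Comparing this tally with the one the model reports every ten turns is a quick way to see how closely it is actually following the rules.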
And here is a link to how I played against Gemini 3.0…