Context: for fun (and profit?)
Basic Contact
Contact is a lightweight many-versus-one word guessing game. I was first introduced to it on a long bus ride several years ago, and since then it’s become one of my favorite games to play casually with friends. There are a few blog posts out there about contact, but I think it’s incredibly underrated.
The rules of contact are simple, but I often tell people it’s easier to learn by watching others play rather than by a verbal explanation of the rules. Nevertheless, here is a verbal explanation of the rules.
- There is one player who is “the defender”. All of the other players are “attackers”, and work together to defeat the defender as quickly as possible. The defender cannot win, only stave off defeat.
- The defender chooses a secret word at the start of the game. They reveal the first letter of the secret word to the attackers. The attackers win when they learn the defender’s secret word.
- The attackers get clues about the secret word by making “contact” with each other. To make contact, a pair of attackers must simultaneously count down from 3 and then say some word that starts with the prefix of the secret word that has been revealed so far. When the attackers make contact, the defender must reveal the next letter of the secret word.
- If the defender says the word that a pair of attackers are trying to make contact on before they say it, the contact attempt is blocked, and the blocked word cannot be used again. To prevent this, attackers give vague clues about what word they have in mind. To indicate to an attacker who has given a clue that you think you know what word they have in mind and would like to make contact, typically you say “contact!” and start the countdown.
- The defender cannot block contact if the word the attackers are trying to make contact on is the secret word. If the attackers make contact on the secret word, they win.
Example Contact
We join our players in media res, at the start of the second round.
- Defender: the word starts with H-E.
- Attacker 1: double hockey sticks…
- Attacker 2: contact!
- Attackers 1 and 2: 3… 2…
- Defender: not “hell”!
- Attackers 1 and 2: yeah that’s what we were thinking 🙁
- Attacker 3: neutralized alpha particle?
- Attacker 2: ????
- Attacker 1: contact!
- Attackers 1 and 2: 3… 2…
- Defender: not… uh…
- Attackers 1 and 2: 1… helium!
- Defender: ahhhh, damnit, good one. The word starts with H-E-M.
Enjoyable Contact
There are a few important rules to actually have fun playing this game, besides the hard rules of the game itself.
- Everybody should know the secret word once they hear it. It makes for a very unsatisfying game when the stream of contact attempts slowly peters out, and the defender eventually reveals the word, only to find out they were the only person in the game who knew that word. Conversely, the best way for a game to end is with a facepalm moment where the attackers realize they’ve been missing an obvious word.
- Likewise, the defender should be able to understand the contact clue once the contact word is revealed. This is a very gray area, as the whole point of the clues is to differentially communicate with fellow attackers rather than with the defender. But “the word I just texted you” or “word #243 from the table we agreed on yesterday” are clearly not fun. Your mileage may vary with allowing inside jokes, references to public information the defender does not know, etc. Ideally, the defender either facepalms or laughs after every successful contact.
- The defender should be able to hear and process all of the attackers talking. In games with more than five attackers, after saying “contact!”, it’s good etiquette to make sure to get the defender’s attention and say the clue clearly before starting the countdown.
- Agree on some treatment of “near misses”. If Attacker 1 and Attacker 2 attempt contact, but A1 says “mitten” and A2 says “mittens”, is that contact? Likewise, if A1 and A2 are planning to contact on “mitten”, but the defender says “mittens”, is that a block?
- Agree on what happens after a failed contact attempt. If Attacker 1 is thinking of X, and Attacker 2 is thinking of Y, are both X and Y now unusable? Are both still usable? What if X is the secret word – do the attackers win?
- Agree on what determines who defends next after a round ends. Does the person who gave the winning hint become the defender for the next round? Personally, I recommend “everybody tries to think of words and whoever wants to defend + thinks they have a fun word in mind volunteers”.
Homebrew Contact
There are various “house rules” or “styles” you can add to contact to get a slightly different experience.
- Attackers are allowed to contact on puns, misspellings, portmanteaus, and other things which are not real words as long as they have the same spelling in mind. I think this one is a lot of fun. Examples of this rule in action:
- Prefix is I-N-T. A1 and A2 attempt contact on “internet”, but defender blocks it. A1 says “the opposite of that”, and after a little thought and some squinting, A1 and A2 successfully contact on “internot”.
- Prefix is G-E-R-O. Attackers can’t think of anything for a while. A1 suggests, “bird with a long beak, long legs, and a typo”. A1 and A2 contact on “geron”.
- Prefix is T-A-R. A1 gives the hint “a large spider native to Romania”. A1 and A2 contact on “taracula”.
- Win conditions for the defender
- The defender on the spot wins if a contact attempt fails and one of the words involved was the secret word.
- The defender wins if they reveal the last letter of the secret word before the attackers can make contact on the secret word.
- There is some limit on the number of contact attempts that can be made (per attacker, per round, per game) and the defender wins when the attackers run out.
- Riddle clues. This isn’t a ruleset, more of a play style. I was playing with a friend once, and we were stuck on C-O for a while due to well-played defense. This friend stayed entirely silent while the rest of us were guessing and getting blocked, and then piped up with the following clue: “can be used to hold red things or white things. Take off the first letter, and you get a green thing”1.
- Multiple defenders
- Defender is allowed to adjust the secret word on the fly, provided prefixes still fit
Ideal Contact
If you play for a little while, you will probably notice the following pattern: sometimes a defender will pick a rare or unusual word, the first round will proceed as normal, and then after the second or third letters are revealed the word will be entirely obvious. “Syzygy” is a particularly egregious example: after S-Y is revealed, the game is almost over. S-Y-Z is guaranteed over.
This raises an interesting question. I’ve talked a bit about what makes for the most fun contact word. But what makes for a highly defensible contact word?
Intuitively, a defensible word should be long, so the defender has more “lives” before the last letter of the word is revealed. It should also be free of giveaway prefixes, such as “syz”. It should also still have some kind of “rare or unusual” quality, to make it less likely that attackers will guess it.
A pretty decent strategy for generating defensible words is to pick a common prefix (“un”) or even a long common prefix if you can (“inter”) and then a rare completion (“interregnum”)2. In fact, prior work on the topic discusses this exact strategy, and helpfully identifies the most common prefixes in English.
But I think we can do better than that. Key questions I’m interested in:
- The “common prefix rare suffix” approach is obviously a clunky two-step approximation to some cleaner thing. What is that thing?
- A static list of “most defensible contact words” would be nice, but isn’t quite the thing I’m looking for. If my friends know I’m picking my words off some top ten list, that makes the game easier for them, not harder. What’s the most defensible distribution over words at equilibrium?
Theoretical Contact
First things first. If we’re looking for the “most defensible” words/distributions, we need some precise setting, so that we can actually try and prove things about how different words/distributions perform, and whether they are optimal. This is the setting I’ll be starting with:
- Defender and attackers have the same vocabulary, and have access to the full tree of prefixes to valid words.
- Attackers are able to decide on a word to contact on within a constant amount of clock time. The game is essentially turn based – we count how long the defender survives based on number of attempts at contact, rather than seconds.
- The defender is able to block contact attempts with some constant probability p_block.
I’m interested in both the “black-box” case where the attackers assume the defender’s word is chosen uniformly randomly from valid words, and the “white-box” case where the attackers know the defender’s word choice distribution.
With just these assumptions, we can already identify some features of optimal play. The “no re-using clues” rule becomes irrelevant, since that’s only included because humans have a harder time coming up with a new contact clue than re-using an old one. Our theoretical perfect attackers have no problem coming up with new clues, and so they would never re-use a clue, because using a new word comes with some chance of the new word being the secret word and winning on the spot.
When p_block=1, the defender doesn’t have to worry about losing by giving away letters. The only lose condition left is that the attackers randomly guess the secret word, while still having only the first letter. The optimal strategies here are pretty obvious. Black-box, the defender chooses the most common first letter, and then it doesn’t matter what the secret word is as long as it starts with that letter3. White-box the defender chooses the most common first letter and then is forced to choose the secret word uniformly randomly over the words that start with that letter. In either case, they survive for max_letter(numValidSuffixes(letter)/2)-1 turns in expectation.
Is the p_block=0 case similarly easy? Well, now the defender’s primary concern isn’t having their word guessed at random, it’s running out of letters. So should the defender just make their word as long as possible? In the black-box case, yes, and they’ll survive for max_word(len(word))-1 turns, minus some (small if you’re playing with a reasonable vocabulary) chance that the attacker gets lucky and guesses the word. But in the white-box case, deterministically picking the longest word will get sniped and lose immediately! The defender has to compromise somehow between concentrating their distribution on long words and spreading the distribution out across different words. What is the best way to make that compromise?
Equilibrium Contact
I’m not particularly familiar with whatever subfield of game theory deals with random strategies in this kind of white-box case, I just vaguely know it exists. And frankly it seems more exciting to try and invent stuff myself than to do a literature review right now. So let me try and do that.
One hunch I have is that for the optimal distribution, each word in the distribution should have an equal marginal contribution to the risk of losing. I’d say something like “the derivative of expected lifetime with respect to the probability on a given word should be the same for all words with nonzero probability”, but there’s a problem with taking the derivative w.r.t the probability mass placed on particular value, since the distribution has to sum to 1. So we have some work to do in getting this intuition to something formal.
The natural place to start is writing down the defender’s expected lifetime at a given game state, so we can talk about “marginal contribution to risk of losing”. We’ll consider the game state before the attackers submit their guess, and we’ll count the guess which matches the secret word as part of the defender’s score. In other words, every defensive policy has an expected lifetime of at least one, except for the the defender with an empty vocabulary who could have a lifetime of 0 if you want to think about it that way.
When the attackers submit their guess, there’s a chance the attackers will guess the secret word, in which case the defender’s lifetime is 1. If the attackers don’t guess the secret word, then the defender survives for that turn, plus the expected lifetime of the game state with one more letter revealed.
E(lifetime | secret, prefix) = p(guess secret | prefix) + (1-p(guess secret | prefix))(1+E(lifetime | secret, prefix++))
If the attackers know the defender’s distribution and are guessing optimally, then they always guess the word that the defender puts the most probability on, of the words remaining given the prefix revealed so far.
p(guess secret | prefix) = max(def_dist(secret | prefix))
E(lifetime | secret, prefix) = max(def_dist(secret | prefix)) + (1-max(def_dist(secret | prefix))(1+E(lifetime | secret, prefix++))
We now have the function for the expected lifetime of a given state expressed in terms of itself and the defender’s distribution. This should be all we need to identify the ideal distribution! But unfortunately, the relationship between E(lifetime | secret, prefix) and E(lifetime | secret, prefix++) is gnarly and complicated, because it has to do with the real life English language. We’re not going to be able to get a nice closed form solution for E(lifetime | secret, prefix) or for the optimal distribution. But we can write an algorithm to get it, given a vocabulary.
Algorithmic Contact
The naive way of doing this involves working backwards through all prefixes of valid words, from most complete to least complete (bottom-up level traversal). This is DP, if you want to think of it that way, with a (length of longest valid word)-dimensional DP table.
If the prefix under consideration is a leaf of the prefix tree, from there the expected lifetime is 1. If it’s not a leaf, then because we’re doing a bottom-up level traversal we should already have access to the expected lifetimes of each child prefix, E(lifetime | secret, prefix++). Now we have all the parts we need to maximize E(lifetime | secret, prefix). How do we actually do that?
The form of what we’re trying to do here is take some pair of vectors <L>, <M> and optimize some vector <P> to maximize F = (1+dot(<L>, <P>))(1-max(hadamard4(<M>,<P>))). <L> is the expected lifetime of each child, <M> is the maximum probability on any word given by each child’s distribution, and <P> are the probabilities we assign to each child. ChatGPT o1-preview agrees with me that this probably isn’t going to admit a closed-form solution, and suggests numerical optimization.
The gain (opposite of loss bc we’re maximizing) landscape for this optimization has a few discernible properties. Everything should be continuous, but the max is going to make the gain non-differentiable on the manifolds where max(hadamard(<M>,<P>)) is switching from being determined by one parameter to another. In regions where a given parameter P_i is not determining the max, the gain landscape will be linear with slope L_i with respect to that parameter. Problem is, this structure is all in the basis dimensions, and our optimization has to live on the subspace where the parameters sum to 15.
Without that structure, my geometric intuition isn’t feeling up to the task of trying to write the optimization algorithm myself in some insightful way. Looks like we’re out of theory runway, and it’s on to the programming part of this post6.
Numerical Contact
Here’s a function for the expected lifetime of a game state:
def life(probabilities, lifetimes, maxes):
max_p = np.max(np.multiply(maxes, probabilities))
return max_p + (1 - max_p) * (1 + np.dot(lifetimes, probabilities.T))
We just have to wrap that and negate it, since scipy wants a loss to minimize instead of maximize.
def make_objective(lifetimes, maxes):
def objective(probabilities):
return -life(probabilities, lifetimes, maxes)
return objective
def optimize_probabilities(lifetimes, maxes):
objective = make_objective(lifetimes, maxes)
prob_constraint = [{'type':'eq', 'fun':lambda p : np.sum(p) - 1}]
prob_bounds = [(0, 1) for _ in range(lifetimes.shape[0])]
init = np.array([1/lifetimes.shape[0]]*lifetimes.shape[0])
result = opt.minimize(objective,
init,
bounds=prob_bounds,
constraints=prob_constraint)
return result.x, -result.fun
Ok, great. Now, what do we actually do with that? Well, the other part of our algorithm is doing a backwards level traversal over the tree of all prefixes to valid words. So let’s set that up:
def analyze_vocabulary(path):
prefix_trie = {}
with open(path) as word_file:
for word in word_file.readlines():
position = prefix_trie
for letter in word.strip():
if letter not in position:
position[letter] ={}
position = position[letter]
position['$'] = {}
return prefix_trie
And now we have everything we need. We know that a leaf node has a lifetime of 1 and a maximum probability of 1. If we have all of the lifetimes and maximum probabilities for the children of a given prefix, we can find the optimal probabilities to assign to each child to maximize the lifetime of that prefix. And from those probabilities, we can calculate the lifetime of the prefix, as well as its maximum probability. Then we just proceed backwards, until we’ve calculated the whole optimal conditional distribution, from which we can extract the distribution over valid words when starting from the root.
Invincible Contact
🙁 optimization problems, give me a sec I’ll get back to you
Further Contact
- Factor in defense as a probability of staying in the same game state
- Factor in defense effect of eliminating possibilities
- Factor in different amount of time to think of different words given prefix
- Prove things about the optimization: Convex? Single optimum?
- Implications of multiple global optima at each prefix
- Sampler
- “cork”. ↩︎
- Fun management footnote: for defensibility it’s tempting to tack on stuff like “un”, “super”, “ing”, “able” whenever you can, but this can make gameplay feel repetitive. ↩︎
- The cleaner version is to play without the defender revealing the first letter of their word, and then the “optimal black-box strategy” in this case is picking literally any word. ↩︎
- This is apparently what The Math People call element-wise multiplication of vectors/matrices. I use it here because element_wise_product(<M>,<P>)) felt a bit too clunky, element_wise(<M>,<P>) wasn’t very clear, and hey if you’re reading this footnote you learned something extra today. ↩︎
- I figured there would probably be a name for this and there is, it’s called the standard simplex. ↩︎
- https://github.com/jlucassen/contact ↩︎