On Contact, Part 1 - James Lucassen's Blog

Context: for fun (and profit?)

Basic Contact

Contact is a lightweight many-versus-one word guessing game. I was first introduced to it on a long bus ride several years ago, and since then it’s become one of my favorite games to play casually with friends. There are a few blog posts out there about contact, but I think it’s incredibly underrated.

The rules of contact are simple, but I often tell people it’s easier to learn by watching others play rather than by a verbal explanation of the rules. Nevertheless, here is a verbal explanation of the rules.

There is one player who is “the defender”. All of the other players are “attackers”, and work together to defeat the defender as quickly as possible. The defender cannot win, only stave off defeat.
The defender chooses a secret word at the start of the game. They reveal the first letter of the secret word to the attackers. The attackers win when they learn the defender’s secret word.
The attackers get information about the secret word by making “contact” with each other. To make contact, a pair of attackers must simultaneously count down from 3 and then say some word that starts with the prefix of the secret word that has been revealed so far. When attackers make contact, the defender must reveal the next letter of the secret word.
To prevent contact, the defender can try to guess the word that a pair of attackers are trying to make contact on before they say it. If the defender guesses the contact word, that word is blocked. Blocked words cannot be used to make contact.
To prevent blocking, attackers give each other vague clues about what word they have in mind. To indicate to an attacker who has given a clue that you think you know what word they have in mind and would like to make contact, say “contact!” and start the countdown.
The defender cannot block contact if the word the attackers are trying to make contact on is the secret word. If the attackers make contact on the secret word, they win.

Example Contact

We join our players in media res, at the start of the second round.

Defender: the word starts with H-E.
Attacker 1: double hockey sticks…
Attacker 2: contact!
Attackers 1 and 2 together: 3… 2…
Defender: hell!
Attackers 1: yeah that’s what I was thinking 🙁
Attacker 3: neutralized alpha particle?
Attacker 2: ????
Attacker 1: contact!
Attackers 1 and 3 together: 3… 2…
Defender: uh…
Attackers 1 and 3 together: 1… helium!
Defender: ahhhh, damnit, good one. The word starts with H-E-M.

Enjoyable Contact

There are a few important rules to actually have fun playing this game, besides the hard rules of the game itself.

Everybody should recognize the secret word once they hear it. It makes for a very unsatisfying game when the stream of contact attempts slowly peters out, and the defender eventually reveals the word, only to find out they were the only person in the game who knew that word (looking at you, hydroxyanthraquinone). Conversely, the best way for a game to end is with a facepalm moment where the attackers realize they’ve been missing an obvious word.
Likewise, the defender should be able to understand the contact clue once the contact word is revealed. This is a very gray area, as the whole point of the clues is to differentially communicate with fellow attackers rather than with the defender. But “the word I just texted you” or “word #243 from the lookup table we agreed on yesterday” are clearly not fun. Your mileage may vary with allowing inside jokes, references to public information the defender does not know, etc. Ideally, the defender either facepalms or laughs after every successful contact.
The defender should be able to hear and process all of the attackers talking. In games with more than five attackers, after saying “contact!”, it’s good etiquette to make sure to get the defender’s attention and say the clue clearly before starting the countdown.
Agree on some treatment of “near misses”. If Attacker 1 and Attacker 2 attempt contact, but A1 says “mitten” and A2 says “mittens”, is that contact? Likewise, if A1 and A2 are planning to contact on “mitten”, but the defender says “mittens”, is that a block?
Agree on what happens after a failed contact attempt. If Attacker 1 is thinking of X, and Attacker 2 is thinking of Y, are both X and Y now unusable? Are both still usable? What if X is the secret word – do the attackers win?
Discuss when to end the game. For example the prefix has grown so long that attackers are just guessing different variations of the same word until one sticks, clues can stop being interesting and the game starts to feel like brute force search.
Agree on what determines who defends next after a round ends. Does the person who gave the winning hint become the defender for the next round? Personally, I recommend “everybody tries to think of words and whoever wants to defend + thinks they have a fun word in mind volunteers”.

Homebrew Contact

There are various “house rules” you can add to contact to get a slightly different experience.

Attackers are allowed to contact on puns, misspellings, portmanteaus, and other things which are not real words as long as they have the same spelling in mind. I think this one is a lot of fun. Examples of this rule in action:
- Prefix is I-N-T. A1 and A2 attempt contact on “internet”, but defender blocks it. A1 says “the opposite of that”, and after a little thought and some squinting, A1 and A2 successfully contact on “internot”.
- Prefix is G-E-R-O. Attackers can’t think of anything for a while. A1 suggests, “bird with a long beak, long legs, and a typo”. A1 and A2 contact on “geron”.
- Prefix is T-A-R. A1 gives the hint “a large spider native to Romania”. A1 and A2 contact on “taracula”.
Possible win conditions for the defender:
- The defender wins on the spot if a contact attempt fails and one of the words involved was the secret word.
- The defender wins on the spot if the secret word is said by any player. The secret word cannot be any of “contact”, “three”, “two”, “one”.
- The defender wins if they reveal the last letter of the secret word before the attackers can make contact on the secret word.
- There is some limit on the number of contact attempts that can be made (per attacker, per round, or per game) and the defender wins when the attackers run out.
Riddle clues. This isn’t a ruleset, more of a play style. I was playing with a friend once, and we were stuck on C-O for a while due to well-played defense. This friend stayed entirely silent while the rest of us were guessing and getting blocked, and then piped up with the following clue: “can be used to hold red things or white things. Take off the first letter, and you get a green thing”¹.
Multiple defenders share a secret word.
Defender is allowed to adjust the secret word on the fly, provided the revealed prefix still fits.

Ideal Contact

If you play for a little while, you will probably notice the following pattern: sometimes a defender will pick a rare or unusual word, the first round will proceed as normal, and then after the second or third letters are revealed the word will be entirely obvious. “Syzygy” is a particularly egregious example: after S-Y is revealed, the game is almost over. S-Y-Z is guaranteed over.

This raises an interesting question. I’ve talked a bit about what makes for the most fun contact word. But what makes for a highly defensible contact word?

Intuitively, a defensible word should be long, so the defender has more “lives” before the last letter of the word is revealed. It should also be free of giveaway prefixes, such as “syz”. It should also still have some kind of “rare or unusual” quality, to make it less likely that attackers will guess it.

A pretty good strategy for generating defensible words is to pick a common prefix (“un”) or even a long common prefix if you can (“inter”) and then a rare completion (“interregnum”)². In fact, prior work on the topic discusses this exact strategy, and helpfully identifies the most common prefixes in English.

But I think we can do better than that. Key questions I’m interested in:

The “common prefix + rare suffix” approach is obviously a clunky two-step approximation to some cleaner thing. What is that thing?
A static list of “most defensible contact words” would be nice, but isn’t quite the thing I’m looking for. If my friends know I always pick one of the same ten words when defending, that makes the game easier for them, not harder. What’s the most defensible distribution when the attackers know the distribution?

Theoretical Contact

First things first. If we’re looking for the “most defensible” words/distributions, we need some precise setting, so that we can actually try and prove things about how different words/distributions perform, and whether they are optimal. This is the setting I’ll be starting with:

Defender and attackers have the same vocabulary, and have instantaneous access to the full tree of prefixes to valid words.
Attackers are able to decide on a word to contact on within a constant amount of clock time. This makes the game essentially turn based – we count how long the defender survives based on number of attempts at contact, rather than seconds³.
The defender is able to block contact attempts with some constant probability p_block.

I’m interested in both the “black-box” case where the attackers don’t know the defender’s word choice distribution, and the “white-box” case where they do.

With just these assumptions, we can already identify some features of optimal play. The “no re-using clues” rule becomes irrelevant, since that’s only included because humans have a harder time coming up with a new contact clue than re-using an old one. Our theoretical perfect attackers have no problem coming up with new clues, and so they would never re-use a clue, because using a new word comes with some chance of the new word being the secret word and winning on the spot.

When p_block=1, the defender doesn’t have to worry about losing by giving away letters. The only lose condition left is that the attackers randomly guess the secret word, while still having only the first letter. The optimal strategies here are pretty obvious. Black-box, the defender chooses the most common first letter⁴, and then chooses any distribution they want over words starting with that letter. White-box, the defender chooses the most common first letter and then is forced to choose the secret word uniformly randomly over the words that start with that letter.

Is the p_block=0 case similarly easy? Let’s examine the black-box case first. Assuming the vocabulary is reasonably large relative to its maximum depth, running out of letters becomes a much more pressing concern than the attackers guessing the word at random. So should the defender just choose the longest word in the vocabulary? If we’re willing to assume that every subtree in the vocabulary has a number of leaf nodes much greater than its depth, then yes, pretty much. If we’re not willing to assume that there are problems⁵, but not problems I’m particularly interested in right now.

I’m mainly interested in the white-box case, where the defender can’t afford to concentrate their distribution so much for fear of the attackers guessing the secret word. For example, deterministically picking the longest word will get sniped and lose immediately! The defender has to compromise somehow between concentrating their distribution on long words and spreading the distribution out across different words. What is the best way to make that compromise?

Equilibrium Contact

I’m not particularly familiar with whatever subfield of game theory deals with random strategies in this kind of white-box case, I just vaguely know it exists. And frankly it seems more exciting to try and invent stuff myself than to do a literature review right now. So let me try and do that.

One hunch I have is that for the optimal distribution, each word in the distribution should have an equal marginal contribution to the risk of losing. I’d say something like “the derivative of expected lifetime with respect to the probability on a given word should be the same for all words with nonzero probability”, but there’s a problem with taking the derivative w.r.t the probability mass placed on particular value, since the distribution has to sum to 1. So we have some work to do in getting this intuition to something formal.

The natural place to start is writing down the defender’s expected lifetime at a given game state, so we can talk about “marginal contribution to risk of losing”. We’ll consider the game state before the attackers submit their guess, and we’ll count the guess which matches the secret word as part of the defender’s score. In other words, every defensive policy has an expected lifetime of at least one, except for the the defender with an empty vocabulary who could have a lifetime of 0 if you want to think about it that way.

When the attackers submit their guess, there’s a chance the attackers will guess the secret word, in which case the defender’s lifetime is 1. If the attackers don’t guess the secret word, then the defender survives for that turn, plus the expected lifetime of the game state with one more letter revealed.

E(lifetime | secret, prefix) = p(guess secret | prefix) + (1-p(guess secret | prefix))(1+E(lifetime | secret, prefix++))

If the attackers know the defender’s distribution and are guessing optimally, then they always guess the word that the defender puts the most probability on, of the words remaining given the prefix revealed so far.

p(guess secret | prefix) = max(def_dist(secret | prefix))

E(lifetime | secret, prefix) = max(def_dist(secret | prefix)) + (1-max(def_dist(secret | prefix))(1+E(lifetime | secret, prefix++))

We now have the function for the expected lifetime of a given state expressed in terms of itself and the defender’s distribution. This should be all we need to identify the ideal distribution! But unfortunately, the relationship between E(lifetime | secret, prefix) and E(lifetime | secret, prefix++) is gnarly and complicated, because it has to do with the real life English language. We’re not going to be able to get a nice closed form solution for E(lifetime | secret, prefix) or for the optimal distribution. But we can write an algorithm to get it, given a vocabulary.

Algorithmic Contact

The naive way of doing this involves working backwards through all prefixes of valid words, from most complete to least complete (bottom-up level traversal). This is DP, if you want to think of it that way, with a (length of longest valid word)-dimensional DP table.

If the prefix under consideration is a leaf of the prefix tree, from there the expected lifetime is 1. If it’s not a leaf, then because we’re doing a bottom-up level traversal we should already have access to the expected lifetimes of each child prefix, E(lifetime | secret, prefix++). Now we have all the parts we need to maximize E(lifetime | secret, prefix). How do we actually do that?

The form of what we’re trying to do here is take some pair of vectors <L>, <M> and optimize some vector to maximize F = (1+dot(<L>, ))(1-max(hadamard⁶(<M>,))). <L> is the expected lifetime of each child, <M> is the maximum probability on any word given by each child’s distribution, and are the probabilities we assign to each child. ChatGPT o1-preview agrees with me that this probably isn’t going to admit a closed-form solution, and suggests numerical optimization.

The gain (opposite of loss bc we’re maximizing) landscape for this optimization has a few discernible properties. Everything should be continuous, but the max is going to make the gain non-differentiable on the manifolds where max(hadamard(<M>,)) is switching from being determined by one parameter to another. In regions where a given parameter P_i is not determining the max, the gain landscape will be linear with slope L_i with respect to that parameter. Problem is, this structure is all in the basis dimensions, and our optimization has to live on the subspace where the parameters sum to 1⁷.

Without that structure, my geometric intuition isn’t feeling up to the task of trying to write the optimization algorithm myself in some insightful way. Looks like we’re out of theory runway, and it’s on to the programming part of this post⁸.

Numerical Contact

Here’s a function for the expected lifetime of a game state:

def life(probabilities, lifetimes, maxes):
    max_p = np.max(np.multiply(maxes, probabilities))
    return max_p + (1 - max_p) * (1 + np.dot(lifetimes, probabilities.T))

We just have to wrap that and negate it, since scipy wants a loss to minimize instead of maximize.

def make_objective(lifetimes, maxes):
    def objective(probabilities):
        return -life(probabilities, lifetimes, maxes)
    return objective

def optimize_probabilities(lifetimes, maxes):
    objective = make_objective(lifetimes, maxes)
    prob_constraint = [{'type':'eq', 'fun':lambda p : np.sum(p) - 1}]
    prob_bounds = [(0, 1) for _ in range(lifetimes.shape[0])]
    init = np.array([1/lifetimes.shape[0]]*lifetimes.shape[0])
    result = opt.minimize(objective,
                          init,
                          bounds=prob_bounds,
                          constraints=prob_constraint)
    return result.x, -result.fun

Ok, great. Now, what do we actually do with that? Well, the other part of our algorithm is doing a backwards level traversal over the tree of all prefixes to valid words. So let’s set that up:

def analyze_vocabulary(path):
    prefix_trie = {}
    with open(path) as word_file:
        for word in word_file.readlines():
            position = prefix_trie
            for letter in word.strip():
                if letter not in position:
                    position[letter] ={}
                position = position[letter]
            position['$'] = {}
    return prefix_trie

And now we have everything we need. We know that a leaf node has a lifetime of 1 and a maximum probability of 1. If we have all of the lifetimes and maximum probabilities for the children of a given prefix, we can find the optimal probabilities to assign to each child to maximize the lifetime of that prefix. And from those probabilities, we can calculate the lifetime of the prefix, as well as its maximum probability. Then we just proceed backwards, until we’ve calculated the whole optimal conditional distribution, from which we can extract the distribution over valid words when starting from the root.

Invincible Contact

I’m having problems with my optimization, and I want to get this post out while it still feels fun and motivating. Hopefully I’ll be back sometime with a Part 2, complete with a script for sampling from the Theoretically Perfect Defense Distribution for the n most common English words.

For now, I’ll leave off with a list of open problems in Contact Studies:

Optimal black-box defense without vocabulary density assumption
Factor in defense probability
Factor in effects of guesses eliminating possibilities
Approximate effects of more obscure words on guess time / guess probability

“cork”. ↩︎
Fun management footnote: for defensibility it’s tempting to tack on stuff like “un”, “super”, “ing”, “able” whenever you can, but this can make gameplay feel repetitive. ↩︎
Note that this throws out any effects from the secret word being obscure or hard to think of, which is an important effect when playing contact against humans. ↩︎
The cleaner version is to play without the defender revealing the first letter of their word, and then the “optimal black-box strategy” in this case is picking literally any word. ↩︎
Consider the vocabulary [AA, AB, CCCCC]. If the defender survives the guess and the secret word is CCCCC, they will have to reveal a C, which tells the attackers exactly what the secret word is, so they can just guess it next turn, and the extra length doesn’t really matter. This only happens because the in the prefix tree “CCCCC” is a stick, allowing the attackers to get lots of information about the leaves of that subtree before the prefix has reached that point. If we add in the assumption that every subtree has a large number of leaves relative to its depth, then the attackers can never get much information lead over the revealed prefix, and picking one of the longest words is optimal (there can’t be a single longest word, by assumption). ↩︎
This is apparently what The Math People call element-wise multiplication of vectors/matrices. I use it here because element_wise_product(<M>,)) felt a bit too clunky, element_wise(<M>,) wasn’t very clear, and hey if you’re reading this footnote you learned something extra today. ↩︎
I figured there would probably be a name for this and there is, it’s called the standard simplex. ↩︎
https://github.com/jlucassen/contact ↩︎