On Contact [UNDER CONSTRUCTION]

Context: for fun (and profit?). Basic Contact: Contact is a lightweight many-versus-one word-guessing game. I was first introduced to it on a long bus ride several years ago, and since then it's become one of my favorite games to play casually with friends. There are a few blog posts out there about Contact, but I think it's incredibly underrated. The rules of Contact are simple, but I often tell…

Evaluating Stability of Unreflective Alignment

This post has an accompanying SPAR project! Apply here if you're interested in working on this with me. Huge thanks to Mikita Balesni for helping me implement the MVP. Regular-sized thanks to Aryan Bhatt, Rudolph Laine, Clem von Stengel, Aaron Scher, Jeremy Gillen, Peter Barnett, Stephen Casper, and David Manheim for helpful comments. 0. Key Claims: Most alignment work today doesn’t aim for alignment that is stable under value-reflection. I…

In Search of Strategic Clarity

Context: quickly written up, less original than I expected it to be, but hey that's a good sign. It all adds up to normality. The concept of "strategic clarity" has recently become increasingly important to how I think. It doesn't really have a precise definition that I've seen - as far as I can tell it's mostly just used to point to something roughly like "knowing what the fuck is…

DIY Asymmetric Weapons With Symmetric Weapons And Bayescraft

Epistemic status: Follow-up to this post. Fairly well considered, few hours total epistemic effort. Substantially more confident than before that this is correct, but still feel very ick about it. An asymmetric weapon is any strategy that has a higher probability of winning p(Win) if it is aligned with one side of its axis of asymmetry than with the other. In other words, p(Win | X) > p(Win | ~X). This…

Discuss the Substance, Not the Symbol

Epistemic status: content summarized and synthesized (0-1 steps of reasoning) from the Sequences by Eliezer Yudkowsky, specifically A Human's Guide to Words. Guiding Puzzle: Is X a Y? Questions of the form "is X a Y" are all over the place, and a huge amount of cognitive power goes into trying to answer them. Some of them are fun or trivial, like "is water wet" or "is cereal a soup".…