Context: I keep wanting one place to refer to the research I did in Summer 2022, and the two LessWrong links are kind of big and clunky. So here we go! Figured I’d add some brief commentary while I’m at it, mostly just so this isn’t a totally empty linkpost.
In Summer 2022 I did AI Alignment research at MIRI under Evan Hubinger’s mentorship. It was a lot like SERI MATS, but less structured and more “direct”, largely because I had already read all of the readings Evan usually assigns MATS scholars during their first few weeks. The resulting write-ups can be found here:
- https://www.lesswrong.com/posts/HAz7apopTzozrqW2k/strategy-for-conditioning-generative-models
- https://www.lesswrong.com/posts/bzkCWEHG2tprB3eq2/attempts-at-forwarding-speed-priors
In retrospect, I’m not particularly optimistic that either of these posts will be directly useful to an eventual solution to alignment. They’re fairly deep in a big, gnarly DAG of problems, sub-problems, sub-sub-problems, etc.
I do think the summer was very productive in terms of learning how to do pre-paradigmatic theoretical research. I don’t expect any of these lessons to be particularly ground-breaking, and a lot of the important stuff is probably illegible, but briefly:
- Strategic clarity is valuable.
- Tracking all your “open” research threads is super helpful, but the tree gets unwieldy very quickly.
- Writing is extremely helpful for clarifying things and finding new threads, but it is very time-consuming.
- Builder/breaker is great.
- A research partner with a *slightly* different set of heuristics from your own can be tremendously helpful. But beware: as “styles” differ, communication overhead builds up.
- The more framings you have for a given system/behavior/problem/thing, the better.