How important is visualization w.r.t. the interpretability (or broader alignment) problem? Specifically, is there both a need and an opportunity for frontend engineers to have an impact in that space?

Additional context:

I’ve got 10 years of experience in software engineering, most of which has been on frontend data visualization, currently at Google (previously at Microsoft). I looked around at some different teams within Google and saw TensorBoard and the Learning Interpretability Tool, but it’s unclear to me how much those teams are bottlenecked by visualization implementation problems vs. research problems of knowing where/how to even look, and I’d like to have more background before I cold-call them directly.

I've started to get burned out by the earning-to-give path and am currently considering semi-retirement to focus on other pursuits, but if there’s somewhere I can contribute to alignment without needing to go back for a PhD, that would be perfect (I have been eagerly studying ML on the side, though).






Visualization is pretty important in exploratory mechanistic interp work, but there it mostly takes the form of fast research code: see any of Neel's exploratory notebooks.

When Redwood had a big interpretability team, they were also developing their own data viz tooling. It never got open-sourced, possibly because the people who wrote it lacked experience with that kind of tooling. Anthropic has their own libraries too, TransformerLens could use more visualization, and I hear David Bau's lab is developing a better open-source interpretability library. My guess is there is more impact if you're willing to participate in interp research yourself, but there are probably still some opportunities to mostly do data viz at some interp shop.

With regard to bottlenecks being on knowing where/how to look, the important thing is to work with the right team. From a quick glance, the Learning Interpretability Tool is not focused on mech interp, and the field of interp is so much larger than the subset targeted at alignment that you'd likely have more impact working on something more targeted. In your position I'd likely talk to a bunch of empirical alignment researchers about their frontend / data viz needs, see if a top-tier team like Superalignment is hiring, and have an 80k call while developing a good inside view on the problem.