Sonia Joseph

Featured in: dovepress.com

Articles

Multimodal interpretability in 2024

Nov 29, 2024 | soniajoseph.ai | Sonia Joseph

I'm writing this post to clarify my thoughts and update my collaborators on multimodal interpretability in 2024. Having spent part of the summer in the AI safety sphere in Berkeley, and then joining the video understanding team at FAIR as a visiting researcher, I'm bridging two communities: the language mechanistic interpretability efforts in AI safety, and the efficiency-focused Vision-Language Model (VLM) community in industry. Some content may be more familiar to one community than the other.

Bridging the VLM and SAE communities for multimodal interpretability — LessWrong

Oct 28, 2024 | lesswrong.com | Sonia Joseph

I wrote this post after spending my summer in the MATS community looking at sparse autoencoders on vision models, and then joining FAIR/Meta as a visiting researcher on their video generation team. It has been absolutely fascinating navigating both the cultural and research differences, but that is a subject for another post. (During my time at MATS, I got into several lively debates with other cohort members about going to Meta after the program.

Laying the Foundations for Vision and Multimodal Mechanistic Interpretability & Open Problems — LessWrong

Mar 13, 2024 | lesswrong.com | Sonia Joseph | Neel Nanda | Charlie Steiner | Praneet Neuro

I think working on mechanistic interpretability in a variety of domains, architectures, and modalities seems like a reasonable research diversification bet. However, it feels pretty odd to me to describe branching out into other modalities as crucial when we haven't yet really done anything useful with mechanistic interpretability in any domain or for any task.
