Research
I am interested in the structure and interpretation of natural and artificial systems. In the past, I studied mechanisms of multi-step reasoning in transformers, cross-lingual feature sharing in language models, and sparse autoencoders as a tool for interpreting language model representations. Beyond understanding models themselves, I am excited about the prospect of discovering useful knowledge in them and teaching it to humans.

Selected Publications
- Large Language Models Share Representations of Latent Grammatical Concepts across Typologically Diverse Languages. Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL), 2025. (Oral)
- A Mechanistic Analysis of a Transformer Trained on a Symbolic Multi-Step Reasoning Task. Association for Computational Linguistics (ACL), 2024.