Patrick Leask

Postgraduate Student

Affiliations
Postgraduate Student in the Department of Computer Science

Conference Paper

Sparse Autoencoders Do Not Find Canonical Units of Analysis

Leask, P., Bussmann, B., Pearce, M., Bloom, J., Tigges, C., Al Moubayed, N., Sharkey, L., & Nanda, N. (2025, April 25). Sparse Autoencoders Do Not Find Canonical Units of Analysis. Presented at ICLR 2025: The Thirteenth International Conference on Learning Representations, Singapore.
Sparse Autoencoders Do Not Find Canonical Units of Analysis

Leask, P., Bussmann, B., Pearce, M. T., Isaac Bloom, J., Tigges, C., Al Moubayed, N., Sharkey, L., & Nanda, N. (2025, January 22). Sparse Autoencoders Do Not Find Canonical Units of Analysis. Presented at The Thirteenth International Conference on Learning Representations, Singapore.
Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models

Leask, P., & Al Moubayed, N. (2025). Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models. In A. Singh, M. Fazel, D. Hsu, S. Lacoste-Julien, F. Berkenkamp, T. Maharaj, K. Wagstaff, & J. Zhu (Eds.), International Conference on Machine Learning, 13-19 July 2025, Vancouver Convention Center, Vancouver, Canada (pp. 32803-32829).
